Ray Solomonoff and The Dartmouth Summer Research Project in Artificial Intelligence, 1956
Oxbridge Research
Mailing Address: P.O.B. 400404, Cambridge, MA 02140, U.S.A.
[email protected] http://raysolomonoff.com
Abstract
This is the story of Ray Solomonoff and the Dartmouth Summer Work-
shop in 1956, often called the beginning of Artificial Intelligence. His notes
and those of his friends show the events that led to it, what that amazing
summer was like, and what happened after.
“Now tell me, just what have you and Marv been up to — Gloria has received
just as much information as I have. . . ”
What indeed? Gloria is the wife of the famous mathematician Marvin Min-
sky, then a Harvard Junior Fellow in Math and Neurology. Ray Solomonoff was
a graduate of the University of Chicago, then working at Technical Research Group in New York. Marvin and Ray were spending the summer at Dartmouth College with a group of scientists, and what they were up to was research on an unusual new science. People didn’t agree on what it was, how to do it, or even what to call it:

∗ Thanks to Alex Solomonoff, Donald Rosenthal, Sally Williamson, and Julie Sussman for their invaluable help. Thanks to Professors Carey Heckman and James Moor for sharing copies of the Rockefeller Foundation file on the Dartmouth conference; to Trenchard More and Marvin Minsky for sharing files and interviews; and to John McCarthy for interviews.
systems in a way not well known by the AI community. Second, Ray’s own
ideas about probability were influenced by John McCarthy.
Ray, Marvin, and McCarthy were the only three who spent the whole time at the Workshop, and I have more notes about them than about the other participants, except for Trenchard More, who also left many notes. My main source is notes by Ray, so this particular report has a “Ray’s eye view” of that summer.
But all of the participants made wonderful contributions to the scientific
world. The summer reaffirmed something: uncensored thought shared in a
free environment inspires the creative process. Recently uncovered notes and
recorded conversations by participants at the Dartmouth Summer Workshop
will show how.
Left to Right: Ray and his father. Ray (on the left), with his brother George.
Ray on a horsie.
Ray loved math and science from his earliest years, which his parents en-
couraged. He graduated in 1944 from the progressive Glenville High School,
and after a two-year stint in the Navy, entered the University of Chicago.
Those were brilliant years at the University of Chicago. Enrico Fermi’s stu-
dents taught Ray from their compendium of Fermi’s notes (some titled “Unclear
Notes” rather than “Nuclear Notes”). Anatol Rapoport was his teacher, and
Rudolf Carnap, who was trying to find a way to describe the universe by a
digital string, was his professor of Philosophy.[2]
A fellow Navy man, Kirby Smith, was Ray’s roommate in 1950. Kirby wrote
in a letter:
“When I first met Ray he was sitting in an armchair surrounded by piles of
manuscript covered in symbolic logic, to which he was continuing to add at a
furious rate. It was some time before I took in the significance of all that.”[7]
In a letter written in 1950, Ray tells a girlfriend that cybernetics (a common
name for thinking machines) has been his chief interest for the past 4-5 years.
Here’s what Ray was reading:
By the mid-1940s the electronic computer had been introduced to the public. By the 1950s, scientists were debating how to use computers as thinking machines.
Both government and private companies funded new technology. After World War II, money shifted from direct war needs to consumer and other products. Nuclear power came on in a big way, outperforming British, French, and Italian efforts. With the rise of the Cold War, competition with Russia led to more space research,[48] while the Department of Defense gave grants for broadly defined research in computers and robotics, funding many civilian projects.[26, p. 203]
These were some of the elements that came together, making the time right
for thinking machines.
After graduation Ray went to work in New York. At a conference, he met
Marvin Minsky and they became lifelong friends.
Marvin was one of the four who organized the Dartmouth conference. Due to Marvin, and to his own interests, Ray was one of the 11 originally planned attendees of the Dartmouth Summer Workshop.
An Inspired Decision
In the 1950s, those early days of computing and robotics, there was much con-
fusion and some rancor about what thinking machines and robots would be like.
Some, like Norbert Wiener, believed they would be humanoid, some did brain
modeling, some focused on semantic-based systems, others on mathematical
logic.
Among the confusion and politics, early in 1955 John McCarthy, a young
Assistant Professor of Mathematics at Dartmouth, recognized the incredible
potential of this — whatever it was — and the possibilities for some kind of
coherence about it.
He boldly picked the name “Artificial Intelligence.” He chose the name partly for its neutrality: it avoided a focus on narrow automata theory, and it avoided cybernetics, which was too heavily focused on analog feedback and would have meant accepting the assertive Norbert Wiener as guru, or arguing with him.[26, p. 53] He began pushing to get a group together, hoping a special conference would clarify direction and forge ahead.
The Proposal
In February, 1955, McCarthy first approached the Rockefeller Foundation to
request funding for a summer seminar at Dartmouth for about 10 participants.
In June, he and Claude Shannon, a founder of Information Theory then at Bell
Labs, met with Robert Morison, Director of Biological and Medical Research at
the foundation, and discussed what kind of proposal might get funded.
At the meeting with Morison, participants were suggested, such as Nat Rochester, head of Information Research at I.B.M. and a designer of the early IBM 701 computer; Marvin Minsky; John von Neumann; and others. Morison, however, was very unsure that money would be made available for such a visionary project.[24]
McCarthy pressed ahead to prepare a proposal for the summer. He was
enthusiastically joined by Shannon, Marvin Minsky, and Nat Rochester. On
September 2, 1955, the four sent a proposal to the Rockefeller Foundation:
“We propose that a 2 month, 10 man study of artificial intelligence be carried
out during the summer of 1956 at Dartmouth College in Hanover, New Hamp-
shire. The study is to proceed on the basis of the conjecture that every aspect
of learning or any other feature of intelligence can in principle be so precisely
described that a machine can be made to simulate it . . . We think that a signif-
icant advance can be made . . . [on components of intelligence] . . . if a carefully
selected group of scientists work on it together for a summer. . . ”[15, p. 2] and
“For the present purpose the artificial intelligence problem is taken to be that
of making a machine behave in ways that would be called intelligent if a human
were so behaving.”[26, p. 53]1
Topics to study: automatic computers and programs for them; programming
computers to use a language; neuron nets; machine self-improvement; classifying
abstractions; and two areas particularly relevant to Solomonoff’s work, though
they would not be part of AI for many years — how to measure the complexity
of calculations, and randomness and creativity.[15, pp. 2–5]
of the British school of information theory. Oliver Selfridge, who was connected with Norbert Wiener’s cybernetics group, was also proposed;[24] Minsky knew Selfridge well and strongly supported his inclusion. McCarthy met with Herb Simon and Allen Newell at Carnegie Tech and discussed their work developing a program to find proofs for theorems in Principia Mathematica. They both agreed to attend.[11]
Among the scientists and mathematicians, Minsky suggested Ray Solomonoff, calling him a “best buy.” He wrote the other organizers: “It has come to my
attention that R. Solomonoff incessantly fills notebooks with resumes and obser-
vations on his and other peoples ideas, including those with which he may dis-
agree. I cannot begin to describe the extraordinary depth and completel(sic)ess
of these notes. On this ground alone he would be a great asset when it comes
to writing a report of the project; if Ray has been present, the bulk of the work
may be done automatically.”[18]
John McCarthy hoped the summer would produce some specific result. The
group of scientists was motley, however, and so were their proposals. They
included Shannon’s idea for application of information theory to computing
machines; Minsky’s plan relating successful machine output activity to its ability to assemble “motor abstractions” as the environment changes; Rochester’s
program ideas for originality which included use of randomness; More’s work on
developing a programming technique for handling axiomatic systems in algebraic
form.[15, pp. 2–5]
Everyone had a different idea, a hearty ego, and much enthusiasm for their
own plan. Forging ahead together on anything at all would require some fancy
footwork! But everyone was hopeful: Marvin wrote
“. . . by the time the project starts, the whole bunch of us will, I bet, have an
unprecedented agreement on philosophical and language matters so that there
will be little time wasted on such trivialities.” [18]
Here is the beginning of Ray’s outline for work in thinking machines at the
Dartmouth Summer, sent in March 1956:
What is probability?
Probability and its relation to prediction became Ray’s main focus for the
rest of his life. But at that time, it wasn’t known how probability could be
useful in AI.[41, p. 1]
Ray was invited. On his invitation he noted some of the others who planned to attend.
The Summer of 1956
When Did It Happen?
The Dartmouth Workshop is often said to have run for six weeks in the summer
of 1956.[26, p.53] Ray’s notes, however, say the workshop ran for “roughly eight
weeks, from about June 18 to August 17.”[34] Marvin’s and Trenchard’s notes agree.[22] Ray’s Dartmouth notes start on June 22; June 28 mentions
Minsky, June 30 mentions Hanover, N.H., July 1 mentions Tom Etter. On
August 17, Ray gave a final talk, so the event ran about eight weeks.
Ray’s list is smaller, but accurate for actual attendance:
Trenchard said “. . . they photographed me walking up the stairs and I ex-
plained where the rooms were where McCarthy was and said there was a dic-
tionary in the room where we looked up the word heuristic and Wendy looked
around and said ‘Oh there’s a dictionary in a stand over there.’ I said ‘that’s the
dictionary.’ And so that’s how even though the math department had moved
out and another department had moved in, the dictionary was still there and
on the stand. It was a big fat dictionary.”[23]
Window looking out from the math room, and the dictionary, still open in
2014 at the word “heuristic.”
would change one or more of the uniselectors, reorganizing the in-
put connections, and the homeostat would search the new field for
stability.
Ray’s notes: Problem 1). No memory of previous solutions. i.e.
learns to handle A, then learns B -> then going back to A takes as
much time as before.
Trenchard’s notes: Problem 2). time for problem-solving goes up
exponentially with the number of units in the homeostat.
Both: More questions and ideas about these and other AI problems.
Ray’s notes: Ashby is interested in bringing some rather important
simple ideas to the psychologists... Ashby “is fascinated by the fact
that a table of random numbers can give one all one wants — i.e.
that selection is the only problem.”
McCarthy’s view, that the only real problem is the search problem
— how to speed it up. Ashby feels the same but has fewer ideas.
Ashby wants to start slowly, like organic evolution, and work up to more complex phenomena.
the good evaluation method wasn’t so important, as it played well when the evaluator was
‘seriously wrong’!”
have been made. Looks at losing opponent’s play to see in which way his moves
appear to be weak.”
Selfridge, no date given: Interested in the learning process. What sen-
tences are close to a question sentence. What is reasonable logic for a game;
examples of useful words in the game.
Julian Bigelow, Aug 15: “Very wary of speaking vaguely with the hope of
being able to translate this vague talk into machine language (McCarthy also).”4
(The foregoing are sampled from Ray’s more detailed notes.[37])
More discussions — McCarthy, Minsky, Simon: led discussions, some
referring to recent and current papers such as McCarthy’s “On the Meaning of
‘The Reader will Easily Verify’,” Minsky’s “A Framework for Artificial Intelli-
gence,” and Newell, Shaw and Simon’s “Logic Theorist.”
Minsky in particular is not afraid to describe things at a fairly intuitive level, then become more and more concrete.”
They had afternoon tea; a note written in 2002 by Trenchard describes an incident with the mathematician John Nash: “He and I were following Herb Simon, Allen Newell, and others up the stairs to an apartment for late afternoon tea, when Nash commanded sharply: ‘Pick that up!’ I glanced at the odds and ends on the landing and said: ‘Pick it up yourself!’ He replied that he was testing me.”[22]
During these weeks Ray wrote 175 pages on induction, describing matrices of symbols, which could be program code, that predicted future matrices. He wrote it up in a paper, “An Inductive Inference Machine,” Ray’s first report on induction for intelligent machines, and gave two talks on it.[34]
Trenchard and his wife Kit in 2011 and Trenchard at work on his Lattice
Array Theory in 2011.
The summer project had other practical difficulties: Later John McCarthy
said “You see my intention when I first thought . . . about organizing it — The
whole group would meet for the whole summer. Two things prevented that;
one was that Rockefeller didn’t give us anywhere near enough money for that;
and the people had other commitments and so they came for short times and
different times.”[14] Many participants only showed up for a day or even less.
Was It A Success?
Glowing words have called the Dartmouth Summer “The Birth of Artificial Intelligence,” the event that introduced the major AI figures to each other.[28, p. 17]
Both Trenchard and Ray gave a summary in their notes. Trenchard was positive; he detailed the specific interests of each member and felt he had been influenced by many of them.[21, pp. 2–19]
Ray’s comment was austere:
Th. research project wasn’t very suggestive. The main things of
value:
1) wrote and got report reproduced (very important)
2) Met some interesting people in this field
3) Got idea of how poor most thought in this field is
4) Some ideas:
a) Search problem may be imp.
b) These guys may eventually invent a T.M. simply by working more
and more interesting special problems.[39, p. 2]
So the project never created a clear direction toward the goal of AI. But
several results came out of the summer that had lasting effects.
other participants, though I wish I knew enough to do so. But Ray can serve
as a focus to describe these three aspects.
Symbolic Methods
During the summer Marvin had many discussions with Ray about his inductive inference paper on how machines could be made to improve themselves using symbolic methods, with associated utility values measuring how successful they were at prediction.[40] Even at Ray’s first talk, he discussed the advantages of using
symbols for induction.[32, p. 3]5
Bill Shutz. Some reactions were: Shannon: do a hand simulation of this; McCarthy: worried
about length of some of the potential search processes; Minsky: happy to add many ad-hoc
mechanisms if useful (which Ray much disagreed with).[34, pp. 8–10] On the last day of the
workshop, Ray gave another talk on the paper, one or more versions of which he had circulated
at the summer session.
Transitions toward Symbols
Others were also interested in symbolic language. An example is Trenchard’s
work. In his June 12, 1956 paper, “Computer Decisions in Deductive Logic,”[20] he discusses how the use of symbols is more general than arithmetical methods, and favors deductive methods rather than lookup decision trees. Trenchard
later expanded his ideas into the creation of Array Theory, which led to the
understanding and better use of arrays. (Arrays can even be empty. And an
empty array, empty of 2, is different from an empty array, empty of 3! A
mystery!)
Later Developments
Newell and Simon were by far the most influential Dartmouth Summer attendees in the AI community’s shift to symbolic information processing: they helped originate it, continued with “Logic Theorist,” and later developed new symbol-manipulation strategies in “General Problem Solver;” eventually they received the prestigious Turing Award for their work.[26, pp. 81, 88] Primarily through Marvin, Ray also was influential. But since no version of his Dartmouth paper was published until 1957, since he was not affiliated with any university, and since he was focused on induction for prediction rather than on deduction strategies, his contribution wasn’t noticed — not even by Ray himself!
Limited-Domain Projects
However, most participants — Ashby, More, Samuel, McCarthy, Bernstein, Minsky, Bigelow — used more limited and well-defined domains and problems.7
Some examples:
6 e.g., “[In a board game] there is an optimal constant time for deciding how good a position
is. Therefore if we play very deep, its time should be negligible compared to the amt. of time
spent updating positions.” McCarthy’s updater, the “plausible move program,” has a special
operator with complex procedures.
7 When Marvin Minsky wrote about building a logic-proving machine (July 6, 1956): “Ac-
tually, the machine is intended to operate over much broader classes of problems, including the
making of empirical perceptions and judgements.” Ray noted in the margin that it sounded
“suitable mainly for well-defined problems” [17].
Marvin Minsky’s geometry machine: Quoted from Ray’s notes on Mar-
vin’s discussion, August 13 —
Make machine to prove theorems in ordinary plane geometry. State-
ments of theorem are in modified English. The machine makes up
proofs by drawing diagrams to suggest various sub-goal theorems.
. . . Basically, I think his primary idea is to study heuristic devices
— Eventually, to make generalizations about them, so that one can
get methods of inventing new ones. (boxa, August 13)
Rochester probably attended that session; certainly he talked a lot with
Minsky about Marvin’s geometry machine ideas and became very enthusias-
tic, describing these ideas afterward to Herb Gelernter, who went on to re-
search a geometry-theorem-proving machine, presenting a major paper on this
in 1959.[26, p. 85]
Arthur Samuel’s checkers program: He had a continuing research project
on a checkers-playing program using several different representations for check-
ers knowledge, and different training methods. Eventually he developed a pro-
gram that played better than he did! In 1965, however, the world champion, W.F. Hellman, beat it in all four correspondence games; the program managed only a draw in one other game, which Hellman explained was a “hurriedly played cross-board game.”[28, p. 18][3, pp. 457–463]
And as the cold war got frigid, in 1969 Congress passed the “Mansfield
Amendment” to the Defense Procurement Act, demanding that basic research
funded by ARPA (soon renamed DARPA to stress defense) had to have “a
direct and apparent relationship to a specific military function or operation.”
Wonderful civilian research, such as that done at BBN, got choked off! In the
following years, a continuing series of Mansfield Amendments further narrowing appropriations from the military channeled research still more.[26, p. 20]
Eugene Higgins Professor of Math at Princeton University. In one section relevant to Ray’s work, it derives formulas for updating the likelihood of events as the sample spaces containing them are updated (Bayes’ rule); it states that “If the events . . . are called causes, then [the formula] becomes Bayes’ rule for the probability of causes.”[4, p. 85]
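In modern notation (not Feller’s own), Bayes’ rule for the probability of causes reads:

$$P(A_i \mid B) \;=\; \frac{P(B \mid A_i)\,P(A_i)}{\sum_j P(B \mid A_j)\,P(A_j)},$$

where the $A_i$ are the mutually exclusive “causes” and $B$ is the observed event.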
Participants also felt that induction would eventually be very important, and that deductive procedures could be the gateway toward this goal. For example, Trenchard noted that “concepts of this program [in his work proposed] may then be extended to . . . mathematical induction.”
He planned ordered steps and a strategy, keeping away from statistics, since success/failure decisions would get too large too fast and the criteria for success too removed from the elementary act.[12]9
Turing machines. Specifically, consider a machine M whose inputs are strings of symbols. Given a desired output string s, we are required to find a string p such that M(p) = s. McCarthy gave many examples to show how problems could be put into this form.[45]
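McCarthy’s inversion problem lends itself to a minimal brute-force sketch (the toy machine M below is a hypothetical stand-in, not anything from the workshop): enumerate candidate inputs in order of length until one produces the desired output.

from itertools import product

def invert(M, s, alphabet, max_len):
    # Search for an input p with M(p) == s, trying shortest candidates first.
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            p = ''.join(chars)
            if M(p) == s:
                return p
    return None  # no solution within the length bound

# Toy "machine": doubles every symbol of its input.
M = lambda p: ''.join(ch * 2 for ch in p)
print(invert(M, 'aabb', alphabet='ab', max_len=4))  # prints 'ab'

Trying shorter inputs first already hints at the length-based weighting that became central to Ray’s later work.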
of the sequence and was about to print the next symbol. Wouldn’t you bet that
it would be correct?’ ... I remembered the idea because it seemed intuitively
reasonable.”[45]11
In part, McCarthy’s ideas at Dartmouth helped Ray reach his major work, which became a foundation for algorithmic information theory: algorithmic probability and his general theory of inductive inference. This is a mathematical system for predicting future data based on weights related to the lengths of the present theories describing the data, coded as strings, with the shortest having the most weight.[42][43]
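In modern notation (standard in algorithmic information theory, though not Ray’s 1956 notation), the idea is the universal prior

$$\mathbf{M}(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-|p|},$$

where U is a universal prefix machine, p ranges over the programs whose output begins with the observed string x, and |p| is the length of p in bits; a theory (program) one bit shorter thus carries twice the weight.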
Since that time, probabilistic methods have become well established in AI.
But the early focus on deductive methods, especially the popularity of rule-based expert systems, led to probability being scorned for a while. What is now called AI often focuses on logic-based and “deep” systems. Inductive and probabilistic methods are now often categorized as Machine Learning, with its own conferences.
Logic-based systems still don’t generalize very well and lack superior meth-
ods of learning.
Prediction by induction has not achieved the AI goals either, but several projects are moving in that direction, such as the work of Marcus Hutter on a system called AIXI: a reinforcement-learning agent that uses a form of algorithmic probability to weight programs and the rewards they generate, depending on the agent’s next action.[9]
indeed, work.”[38]
12 Good Old Fashioned AI
and myself we talked about the Dartmouth paper. Marvin said
“Anyway this is the paper [Ray’s paper at Dartmouth] and it’s full of these
little N-grams and I was very impressed by this and this led me to switch from
working on neural nets to working on symbolic ideas about AI. Neural nets just
learned by computing correlations between things and sort of in a way it was
a kind of simple statistical type of learning whereas Ray’s idea was you would
consider the possible meanings of different rearrangement of symbols and from
my point of view the progress of artificial intelligence in the 60s and 70s was rapid
and very impressive because everybody that I knew was working on symbolic
reasoning and representation and that sort of thing, and then pretty much
starting in the 1980s people started to go back to statistical learning so that the
artificial intelligence being done now resembles the kind that was being done
from 1900 to 1920 when people like Watson and Pavlov were doing statistical
learning. So in some sense I see Ray as having started the modern direction,
then when he made this more beautiful mathematical theory it seems to me that
the general theory of induction that he produced solves lots of philosophical
problems and it sort of ended 3000 years of speculations in philosophy of what
is meaning and what is information and so forth. But I’m not sure that it’s a
good line to pursue.” Marvin concludes, “As a tour de force we should republish the old paper ‘The Inductive Inference Machine.’ ” (marvingroup1)
It Was Fun!
These are some examples of what the Dartmouth Summer was like, how it began,
what happened during those summer weeks and after. The summer conference
was influential, in ways less obvious than the few well-known programs that
came out of it. Several directions were initiated or encouraged by the Workshop,
debated to this day. By understanding what each other was thinking, and
why, the participants got valuable new ideas, even when disagreeing. The most
interesting and direct changes with respect to Ray, as described here, were the
change in Marvin’s ideas, due to Ray; and a development of Ray’s ideas, due to
McCarthy.
There were many other inspired ideas, such as Selfridge’s thoughts that later led to his “Pandemonium,” Rochester’s work on cell assembly theory, More’s work leading to his “NIAL” programming language, and Marvin’s paper (July 5, 1956), “A Framework for Artificial Intelligence,” which foreshadowed his seminal work “Steps Toward Artificial Intelligence.”
Regardless of what papers and philosophical views emerged, the Dartmouth
Summer of 1956 was a wonderful event. A group of amazing people got together
and freely exchanged ideas and enthusiasms. As Arthur Samuel said, “It was
very interesting, very stimulating, very exciting.”[16] It was not a directed group
research project. It was more like inviting a bunch of brilliant people to an
eight-week conference party, where everyone was brimming over with their own
ideas. Ray always remembered that time as part of the joy and promise of those
beginning days of AI.
AI has a colorful and contentious history. Many scientists and mathemati-
cians have contributed to it and been influenced by each other in meaningful
but little-known ways. Ray was there at one of AI’s bright early moments: the
Dartmouth Summer Workshop of 1956. Hopefully this paper, by looking at what happened through early notes, has added some new insights into that summer, and how it affected Ray’s work as well as that of others. Papers of other
scientists yet to be discovered will add to the wealth.
The five attendees at the AI@50 conference in 2006 who were part of the
Dartmouth Summer of 1956: Trenchard More, John McCarthy, Marvin
Minsky, Oliver Selfridge, Ray Solomonoff.
References
[1] A. Barr and E.A. Feigenbaum, editors. The Handbook of Artificial Intelligence, volume 1. William Kaufmann, Inc., 1981.
[2] R. Carnap. Logical Foundations of Probability. University of Chicago Press, 1950.
[3] P. Cohen and E.A. Feigenbaum, editors. The Handbook of Artificial Intelligence, volume 3. William Kaufmann, Inc., 1982.
[4] W. Feller. An Introduction to Probability Theory and Its Applications, volume 1. Wiley and Sons, 1950.
[5] Memorial Guests. Ray Solomonoff Memorial Notes. March 2010.
[7] Kirby Smith. Letter (kirletter.pdf). To be published at http://raysolomonoff.com/dartmouth/dart/misc, Nov 2011.
[8] Ronald R. Kline. Cybernetics, automata studies, and the Dartmouth conference on artificial intelligence. IEEE Annals of the History of Computing, October–December 2011.
[23] Trenchard More. Home interview. To be published at raysolomonoff.com, September 2011.
[24] Morison. Rockefeller_to_shannon_and_mccarthy_6_17_55.pdf. Rocke-
feller Foundation Archives, Dartmouth file, June 1955.
[40] R.J. Solomonoff. An inductive inference machine. Dartmouth Summer Re-
search Project on Artificial Intelligence, August 1956. A privately circulated
report.
[41] R.J. Solomonoff. rayaiapproach.pdf. http://raysolomonoff.com/dartmouth/dart/boxa, January 1956.