An Information-Processing Account of Representation Change: International Mathematical Olympiad Problems Are Hard Not Only For Humans

The document presents a new information-processing model for mathematical problem solving that incorporates representation change theory. It divides the problem representation process into translating problem texts into formulas in Zermelo-Fraenkel set theory and then interpreting those formulas in local mathematical theories. This allows representation change to be implemented as choosing an appropriate interpretation. The document develops a prototype system using real closed fields theory and benchmark problems to suggest this model can quantitatively study representation change by how well the system solves problems.

An Information-Processing Account of Representation Change:

International Mathematical Olympiad Problems are Hard not only for Humans
Takuya Matsuzaki ([email protected])
Nagoya University, Furo-cho, Chikusa-ku, Nagoya, 464-8603, JAPAN

Munehiro Kobayashi ([email protected])

University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8571, JAPAN

Noriko H. Arai ([email protected])

National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, 101-8634, JAPAN

Abstract

In this paper, we present a new information-processing model of math problem solving in which representation change theory can be implemented. Specifically, we divide the problem representation process into two steps. One is to straightforwardly translate problem texts into formulas in a conservative extension of Zermelo-Fraenkel's set theory (ZF), and the other is to interpret the translated formulas in local mathematical theories. A ZF formula has several interpretations, and representation change is thus implementable as a choice of an appropriate interpretation. Adopting the theory of real closed fields as an example of a local theory and its quantifier elimination algorithms as an approximate process of searching for solutions, we develop a prototype system. We use more than 400 problems from three sources as benchmarks: exercise books, university entrance examinations, and the International Mathematical Olympiad. Our experimental results suggest that our model can serve as a basis of a quantitative study on representation change in the sense that the performance of our prototype system reflects the difficulties of the problems quite precisely.

Keywords: problem solving; information-processing model; insight; representation change

Introduction

Some math problems are much more difficult than others to solve even though they do not require higher levels of mathematical knowledge or techniques. The nine-dot problem and the mutilated draughtboard problem are examples of such problems. Where does the difficulty come from?

In classical information-processing models, the difficulty of a given problem is explained by its computational complexity: the cost of search (Kaplan & Simon, 1990; MacGregor, Ormerod, & Chronicle, 2001). In contrast, Gestalt theorists explain the phenomenon in terms of insight (Isaak & Just, 1995; Ohlsson, 1992). A problem is called an insight problem when solving it requires a key feature of the problem to be recognized or restructured (representation change).

One of the major criticisms of the classical information-processing account is that it has no mechanism to implement representation change since problem solving is understood as a search within a well-defined problem space (Öllinger, Jones, & Knoblich, 2014). If one tries to enlarge the framework (theory) of the problem to implement representation change inside it, then search space explosion is almost always inevitable. For example, it is a well-known fact that almost all mathematical activities can be formalized in Zermelo-Fraenkel's set theory (ZF), thus representation change can be formalized as proof search in ZF. However, we cannot expect the search to terminate in a realistic time since the search space of ZF is too vast. On the other hand, the representation change account also has some downsides. It does not provide any process model, and the analysis remains qualitative rather than quantitative (MacGregor et al., 2001).

In this paper, we present a new information-processing model that enables us to include the representation change account. On the basis of the flow chart of insight problem solving (Öllinger et al., 2014), we first specify the perceptual process as translation of a given problem into a formula in ZF. We extend the language of ZF so that the translation is kept as straightforward as possible. In other words, we assume that the perceptual process requires no insight but rather corresponds to natural language and image processing. This is worth mentioning since the inputs of the existing information-processing accounts are usually not obtainable without insight regardless of the theories in which the problems are represented (Newell & Simon, 1972; Chou, 1988; Kerber & Pollet, 2006). The obtained ZF formula is considered to be the primary problem representation. There are usually many possible interpretations of the primary problem representation in different local mathematical theories. For example, the mutilated draughtboard problem can be embedded not only in propositional logic but also in Peano Arithmetic and Presburger Arithmetic. The possible interpretations of the primary representation are called secondary representations.

We take the theory of real closed fields (RCF) as an example of a local theory and implement an interpretation process from the primary to the secondary representation. We adopt a quantifier elimination (QE) algorithm as an approximate process of searching for solutions (Iwane, Yanami, & Anai, 2014) and develop a prototype system to solve geometry and introductory calculus problems.

We manually formalize more than 400 math problems from three different sources in our extended ZF language as a benchmark. The problems are translated so that the formalizations could be obtained automatically from the problem text using state-of-the-art natural language processing theories and techniques (Kamp & Reyle, 1993; Steedman, 2001; Zettlemoyer & Collins, 2005). One source of the problems is exercise books, another is university entrance examinations, and
the other is the International Mathematical Olympiad (IMO). Though all the problems require mathematical knowledge and techniques no higher than high-school level, they have different levels of difficulty. We naturally assume that more insight problems can be found in the IMO than in the other two because IMO problems are known to be solvable by only a few mathematically talented students. The highlight of our paper is the experimental results on the benchmark. This is the first paper to report automated problem solving results not only on a few problems or a set of artificial problems but on a large number of real high-school-level problems.

Problem: Suppose that x and a are real numbers. Find the range of a satisfying x² + ax + 1 > 0 for all x.
∀x ∈ R (x² + ax + 1 > 0) ⇔ a² − 4 < 0 ⇔ −2 < a < 2

Problem: Suppose that x and a are real numbers. Find the range of a such that there exists x satisfying x² + ax + 1 < 0.
∃x ∈ R (x² + ax + 1 < 0) ⇔ a² − 4 > 0 ⇔ a < −2 ∨ a > 2

Figure 1: Problem solving and quantifier-elimination

Figure 2: End-to-end problem solving model
[Flow chart. Problem → Perception → Primary Representation → Formulation → Secondary Representation → Search / Reasoning → Solution. A failure in search / reasoning leads back to the formulation stage (representation change). The stages and the examples shown in the figure are:]
Perception, lexical processing (formula parsing, POS tagging): "the graph of y = x²" → the graph of λx.x² / NP; "let y = x²" → let proposition(y = x²) / S; "the perimeter of circle O" → the/DT perimeter/NN of/PP circle/NN O/PN.
Perception, sentence processing (zero anaphora detection, syntactic parsing, semantic composition): "ABC is a right triangle with ∠ABC = 90°. The length of the hypotenuse (of φ) is 3."; "AB is a diagonal of R" is composed with CCG into S: ∃x(diag(x, R) ∧ seg(A, B) = x).
Perception, discourse processing (coreference resolution, discourse structure analysis): "Let m, n be natural numbers. Assume that m = n². Prove that (m − n) is even." → ∀m∀n((m, n ∈ N ∧ m = n²) → even(m − n)).
Formulation (HOL→FOL transformation, theory choice, embedding into local theories), for the statement "When n pigeons sit in n − 1 holes, some hole contains more than one pigeon":
primary representation: |P| = n ∧ |H| = n − 1 ∧ S ⊂ P × H ∧ π₁(S) = P → ∃x∃y∃z(x ≠ y ∧ ⟨x, z⟩ ∈ S ∧ ⟨y, z⟩ ∈ S)
propositional logic: $\bigwedge_{i=1}^{n} \bigvee_{j=1}^{n-1} p_{ij} \rightarrow \bigvee_{1 \le i < m \le n} \bigvee_{j=1}^{n-1} (p_{ij} \wedge p_{mj})$
Peano Arithmetic: ∀m(m < n → f(m) < n) → ∃i∃j(i < j ≤ n ∧ f(i) = f(j))
Search / Reasoning (constraint satisfaction, quantifier elimination, theorem proving): ∀x ∈ R (x² + ax + 1 > 0) ↓ −2 < a < 2.

Preliminaries

Let us first redefine what we mean by "mathematical problem solving." A math problem is usually expressed as a combination of sentences, formulas, and figures. In principle, it can be expressed as a logical formula in a theory. A theory consists of a set of symbols called a language and a set of axioms. A language consists of constants, variables, relations, functions, and logical symbols. Constants and variables are terms, and $f(t_1, \ldots, t_n)$ is also a term if f is an n-ary function symbol and all $t_i$ are terms. $R(t_1, \ldots, t_n)$ is an atomic formula if R is an n-ary relation symbol and all $t_i$ are terms. For example, 2x + 1 = y and x > y + z are atomic formulas in arithmetic. In first-order logic, formulas are defined recursively from atomic formulas and logical symbols. In classical logic, we have seven connectives: ∧ (and), ∨ (or), ¬ (not), → (implies), ↔ (if and only if), ∀ (for all), and ∃ (there exists). The last two connectives, ∀ and ∃, are called quantifiers. When a variable is quantified, it is called a bound variable. For example, the variable x is bound in the formula ∃x(f(a) = x) though a remains free (not bound). A formula containing no free variable is called a sentence. A formula containing no quantifier is called quantifier-free. The set of seven connectives is known to be complete in the sense that any mathematical assertion can be expressed as a first-order formula provided that an appropriate language and a set of axioms are given.

A mathematical problem is solved when we find a formal procedure to show that the problem is equivalent to a quantifier-free formula of the simplest form. Fig. 1 gives examples. Specifically, we say that a problem is proved when we show that a given problem is equivalent to True.

A theory is called decidable when there is an algorithm to determine whether any sentence is true or not. Gödel's incompleteness theorem shows that any consistent theory containing Peano Arithmetic is undecidable. Propositional logic, RCF, and Presburger Arithmetic are rare exceptions that are known to be decidable. However, the computational complexity of the decision procedures is quite high. The theoretical lower bound of the decision procedure for propositional logic is superpolynomial in the size of the input formulas assuming that P ≠ NP, and those for RCF and Presburger Arithmetic are doubly exponential (Tarski, 1951; Fischer & Rabin, 1974). These lower bounds reflect the phenomenon of search space explosion.
An End-to-end Math Problem Solving Model

Fig. 2 presents an overview of our problem solving model. It consists of three modules. The perception module translates a problem into a primary representation expressed in ZF by language processing. The formulation module transforms the primary representation to another formula in ZF that is interpretable in a local theory such as RCF. Finally, the search/reasoning module works on the secondary representation. Once a failure is detected in the reasoning, the process backtracks to the formulation module and seeks another problem representation that makes the reasoning easier. The rest of this section provides more details on the three modules.

Perception Module

The perception module is organized along a hierarchy in natural language: words, sentences, and discourses (i.e., sequences of sentences). The lexical processing unit identifies the parts-of-speech and other syntactic properties of the words and math formulas in a problem. Since math formulas have their own grammar, they are analyzed by a specialized parser. Fig. 2 provides an example in which the same formula y = x² has different syntactic roles, noun phrase (NP) and embedded sentence (S), in accordance with its context.

The sentence processing unit translates each sentence in the problem into a formal representation. We assume a grammar-driven translation model here, which composes the semantic representation of a sentence along its syntactic structure (Carpenter, 1997; Heim & Kratzer, 1998). Specifically, we developed a Japanese grammar in the formalism of Combinatory Categorial Grammar (CCG) (Steedman, 2001). Fig. 2 depicts the process of semantic composition with CCG for the sentence "AB is a diagonal of R."

We need to detect omissions (zero pronouns) in the text before the semantic composition. Our current implementation detects them using a list of words and their syntactic arguments (i.e., case frames). Fig. 2 provides an example where an omission ("of φ," where φ, a zero pronoun, stands for something) is detected as the argument of 'hypotenuse.'

The discourse processing unit combines the sentence-level semantic representations into a single formula. We adopt discourse representation theory (Kamp & Reyle, 1993) as the basic mechanism of the inter-sentential composition. Fig. 2 depicts an example where the semantic representations of three sentences are combined into one with the two connectives ∧ and →, and two universal quantifications (∀m∀n). The discourse processing unit also determines the antecedents of the anaphoric expressions including zero pronouns.

Formulation Module

The formulation module receives a primary problem representation and transforms it into a secondary representation that is amenable to reasoning. The process consists of two steps. One is the transformation of the higher-order formulas produced by the perception module to first-order formulas in ZF. The other is the transformation of the ZF formulas to those interpretable in a local theory, such as RCF and propositional logic. The former is usually a routine procedure for a person with the necessary math knowledge. The latter requires a target theory to be chosen beforehand. In the experiments, we chose RCF as the target local theory and confirmed that many pre-university math problems can be mechanically reformulated in RCF. This suggests that, once an appropriate local theory is chosen, the reformulation can be modeled as a heuristic search that seeks a formula in the local theory that is equivalent to the primary representation.

What is missing in our prototype implementation is a mechanism to choose an appropriate local theory. Our hypothesis is that this is the key ability in representation change, which truly requires 'insight.' Our information-processing model thus serves as a test bed for computational models of insight problem solving by plugging a theory choice model into it. The rest of this section summarizes the implementation of the two reformulation steps and elucidates the contribution of the problem solving model.

Higher-order to first-order transformation. The primary representation often includes higher-order elements (λ-abstractions), which denote functions (e.g., λx.x²) and conditions (e.g., λy.(|y| < 1)). They are necessary to translate natural language expressions such as "The function that maps x ∈ R to x²" and "The absolute value of y is less than 1. The same condition also applies to x." We eliminate such higher-order elements to obtain a first-order formula. In the current implementation, this is done by iteratively applying a handful of transformation rules such as β-reduction and variable elimination by substitution (∃x(x = α ∧ φ(x)) ⇔ φ(α)).

Reformulation in RCF. In the prototype implementation, a primary representation is rewritten into the language of RCF. The first-order language of RCF consists of polynomial equations and inequalities, logical connectives, and quantifiers. We developed a set of axioms that define various math concepts in the (higher-order) language of RCF, such as:

∀x∀f (minimize(x, f) ↔ ∀x′ (f(x) ≤ f(x′))).

The primary representation is iteratively rewritten with these axioms until an equivalent formula is found in the first-order language of RCF. There is no theoretical guarantee that such a formula will eventually be found even when it exists. We empirically examined how often it succeeds in the experiments.

Where in the process does insight come? The vocabulary of a problem usually tells us in which theory it should be solved. However, this is not always the case. For instance, the wording in the mutilated draughtboard problem does not suggest that it should be formulated in arithmetic rather than in propositional logic. Human solvers thus usually start by searching for the solution in propositional logic, putting dominoes on the board in a trial-and-error manner. It is inevitable to change
the representation of the problem to solve it in a realistic time. When and where does representation change happen in the cognitive process? The main contribution of our processing model lies in pinning it down to a specific step in the problem formulation process, namely the theory choice.

Our information-processing model helps discriminate between different kinds of 'insight' problems. The nine-dot problem and the mutilated draughtboard problem have been considered typical insight problems of the same kind. However, the reasons why people have difficulties are different in nature. Failure in solving the nine-dot problem is at least partially due to the ambiguity of the term "line (segment)". Disambiguation of terms is a part of the perception process, but not of the formulation or representation change in our model. In contrast, people fail to solve the mutilated draughtboard problem because they cannot choose an appropriate theory to solve it only from the superficial properties of the problem.

Solution Search/Reasoning

In the current implementation, we adopt a QE algorithm for RCF (Iwane et al., 2014) as an example for solution search. Note that we do not argue that the QE algorithm per se is the model of the human answer-deduction process. We utilize it to approximately measure the difficulty of mathematical reasoning on a given problem representation. The computational cost of the QE algorithm is quite sensitive to the problem representation; its time complexity is doubly exponential in the number of variables in the representation. We regard a long running time of the algorithm as a sign of an impasse in the reasoning, which has been considered a trigger of representation change (e.g., Öllinger et al., 2014). In the experiments, we examined to what extent this failure detection mechanism correctly reflects the difficulty of the problems.

Experimental Procedure

Aim of the Experiment

We developed a prototype implementation of the model described in the previous section. The theory choice process and the representation change mechanism are not yet implemented. The aim of the experiment is to test whether we can use the model as a basis for developing a computational model of theory choice and representation change. We thus need to verify that: A) the model can solve many non-insight problems, which do not require representation change, and B) the response of the model correlates with the difficulty of the problems. B) means the model is usable to quantify the difficulty of a problem in a certain representation, which is a requirement for a quantitative study on representation change.

Material

We collected more than 400 problems taken from three sources: exercise books (Ex), Japanese university entrance exams (Univ), and International Mathematical Olympiads (IMO). The Ex problems were sampled from a popular exercise book series. The problems in the books are marked with one to five stars in accordance with their difficulty: one to three stars signify textbook exercise level and four and five stars signify university entrance exam level. We sampled approximately the same number of problems from those marked with one, two, and three stars. The Univ problems were taken from the past entrance exams of seven top Japanese national universities. The IMO problems were taken from the past IMOs held from 1959 through 2014.

We examined the problems and exhaustively selected those that can be formulated (by humans) in the theory of RCF. The distinction between RCF and non-RCF problems was made solely on the basis of the essential mathematical content of the problems. The selected problems thus contain problems in several subject areas as shown in Table 1.

Table 1: Subject areas of the benchmark problems

                   Ex   Univ   IMO
Algebra             0     10    21
Linear Algebra     14     62     0
Geometry           81     65    94
Pre-calculus        0     75     0
Calculus            6     33     0
total             101    245   115

The problems were manually formalized in a higher-order language. Operators, who all majored in computer science and/or mathematics, were trained to translate the problems as faithfully as possible to the original natural language statements following the design of the perception module.

Experimental Results

The prototype system was run on the benchmark problems with a time limit of 3600 s per problem. Table 2 shows the percentage of successfully solved problems; the minimum, median, average, and maximum (wallclock) time spent on solved problems; the percentage of failures in the reformulation of the primary ZF representation in RCF (FM); the percentage of failures due to timeout (TO); and the percentage of wrong answers (WR). Wrong answers were due to bugs in the current implementation.

Table 2: Overall benchmark results

Problem   Solved   Time on solved problems (sec)   Failed
source    (%)      min / med / avg / max           FM (%)   TO (%)   WR (%)
Ex        75.2     1 / 4 / 20 / 1069                 7.9     13.9     3.0
Univ      65.3     1 / 7 / 38 / 1061                 7.3     22.9     4.5
IMO       26.1     2 / 10 / 56 / 513                10.4     60.0     3.5

Overall, the performances on the Ex, Univ, and IMO problems seem to reflect the inherent differences in their difficulty levels well. We conducted a χ² test on the difference in the rates of success on them. The difference between IMO and the other sources was statistically significant (p < 0.01) though that between Ex and Univ was not (p = 0.09).

We further examined how well the system performance correlates with the fine-grained difficulty level assessed by human experts. Table 3 lists the performance figures for
the Ex problems (one to three stars) and additional problems sampled from those marked with four and five stars in the same exercise books. The overall correlation between the difficulty level and the system performance is clear although the difference in the success rates was statistically significant only between the problems with one star and five stars (p < 0.05, χ² test).

Table 3: Results for Ex problems by number of stars

       Succeeded                                      Failed
#★     Success %        Time (sec)                    FM (%)   TO (%)
                        min / med / avg / max
1      82.4 (28/34)     1 / 4 / 5 / 39                 11.8      5.9
2      73.5 (25/34)     2 / 4 / 6 / 39                  5.9     11.8
3      69.7 (23/33)     2 / 4 / 51 / 1069               6.1     24.2
4      63.2 (24/38)     2 / 6 / 36 / 589               10.5     23.7
5      54.3 (19/35)     3 / 10 / 198 / 3245             2.9     42.9

Analysis of the Experimental Results

Can we estimate the difficulty of a problem just by seeing it? If we can, the difficulty of the problems should be attributed more to their inherent search cost (e.g., the time complexity determined by the number of variables) than to the necessity of representation change. Table 4 presents several syntactic features of the benchmark problems. The figures are averaged over the problems taken from each source. It reveals that the syntactic features of the IMO problems are not very different from those of the exercise problems in Ex except for the distribution of variable binders (∀, ∃, λ).

Table 4: Syntactic profiles of the formalized problems

                        Ex    Univ    IMO
# of ∀                  2.2    2.0    5.8
# of ∃                  5.3    9.3    3.1
# of λ                  1.3    2.1    0.1
# of relations         12.5   19.8   13.8
# of functions         19.9   36.3   21.9
# of bound variables    8.8   13.4    9.1
# of free variables     3.0    3.1    1.8

In addition to the basic features listed in Table 4, we may be able to estimate the difficulty of a problem by the vocabulary (i.e., the distribution of function/relation symbols). To see this, we trained a binary classifier that predicts whether or not a problem can be solved by the prototype system in one hour. We used the features in Table 4 and the number of occurrences of each symbol in a problem as the input and trained the classifier on the results of the benchmark test. Table 5 lists the precision and the recall of the classification obtained by 5-fold cross-validation. The definitions of the precision and recall are: precision = TP/(TP + FP) and recall = TP/(TP + FN), where TP (resp. FP) is the number of problems correctly (resp. wrongly) predicted 'solvable' and FN is the number of problems wrongly predicted 'unsolvable.' The overall prediction accuracy in Table 5 is well above the majority baseline of 57% but the accuracy is not very high, especially on Univ and IMO problems.

Table 5: Accuracy of the solvability prediction

Source   Precision         Recall
Ex       88% (67/76)       93% (67/72)
Univ     73% (116/160)     78% (116/149)
IMO      57% (17/30)       47% (17/36)
All      75% (200/266)     78% (200/257)

The analysis presented above revealed that certain types of difficulty are not captured by the superficial properties of the problems including the problem size and the vocabulary. This is a partial indication of the necessity of representation change or other kinds of insight for solving the problems. Future work is to examine such problems and clarify why they are difficult and what kinds of theory choice appear in human solutions of such problems.

Discussions

A first-order theory consists of a language and axioms. A formal theory is expressed in propositional logic, first-order predicate logic, or higher-order predicate logic (typed lambda calculus). In our model, the primary representation is expressed in first-order ZF. Thus, there are three kinds of theory changes: axiom change, language (and axiom) change, and change from propositional to predicate logic.

Axiom Change

There are infinitely many possible representations for propositional logic. Among them we can find analytic tableaux (cut-free LK), resolution, and the Frege system (LK). The former two are the major systems used as the basis for automated theorem proving. The cut rule (axiom) allows one to introduce "lemmas" to prove theorems.

The pigeonhole principle is known to require exponential-size proofs both in analytic tableaux (Cook & Reckhow, 1979) and resolution (Haken, 1985), but it has polynomial-size proofs in the Frege system (Buss, 1987). This is because we can introduce concepts of "addition", "subtraction" and "counting", and manipulate them to do some "restricted arithmetic" in the Frege system. However, the search cost for appropriate cut-formulas is extravagant, and there is almost no hope that someone comes up with appropriate cut-formulas.

Another way to shorten proofs is to introduce a "symmetry rule" (Arai, 1996) as a new axiom. The propositional variable $p_{i,j}$ stands for "the i-th pigeon sitting in the j-th hole", and $\bigvee_{j=1}^{n} p_{1,j}$ for "the first pigeon sitting in some hole" when expressing the pigeonhole principle in propositional logic (Fig. 2). If we have to check all the possible positions of the pigeons, proofs blow up exponentially. Proofs will be shortened if we, without loss of generality, assume that the first pigeon sits in the first hole. In other words, the "insight" of realizing that a given problem has the property of symmetry helps us to escape from an exhaustive search. There are some known heuristics for detecting symmetry, and they have been implemented on a computer (Arai & Masukawa, 2000).
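The exhaustive search mentioned above can be made concrete with a small sketch. It encodes the pigeonhole principle with the variables p_{i,j}, using n pigeons and n − 1 holes as in the statement shown in Fig. 2, and checks it by enumerating the full truth table. This is neither a tableau nor a resolution prover and does not implement the symmetry rule; the function names are assumptions introduced here only to show how quickly the number of assignments grows.

```python
from itertools import combinations, product


def php_holds(assignment, n):
    """Evaluate the pigeonhole implication for n pigeons and n - 1 holes.

    assignment[(i, j)] is True when pigeon i sits in hole j.
    """
    pigeons = range(1, n + 1)
    holes = range(1, n)
    # Antecedent: every pigeon sits in at least one hole.
    every_pigeon_placed = all(
        any(assignment[(i, j)] for j in holes) for i in pigeons
    )
    # Consequent: two distinct pigeons share some hole.
    some_hole_shared = any(
        assignment[(i, j)] and assignment[(m, j)]
        for i, m in combinations(pigeons, 2)
        for j in holes
    )
    return (not every_pigeon_placed) or some_hole_shared


def is_tautology(n):
    """Brute-force truth-table check; returns (result, number of assignments tried)."""
    variables = [(i, j) for i in range(1, n + 1) for j in range(1, n)]
    tried = 0
    for values in product([False, True], repeat=len(variables)):
        tried += 1
        if not php_holds(dict(zip(variables, values)), n):
            return False, tried
    return True, tried


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, is_tautology(n))
```

With n(n − 1) propositional variables, the check visits 2^(n(n−1)) assignments, so each additional pigeon multiplies the work by a rapidly growing factor. Avoiding this kind of enumeration is the role the text assigns to cut-formulas and the symmetry rule.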
D D the eighth symposium on the integration of symbolic com-
putation and mechanized reasoning.
A A
P’
Buss, S. R. (1987). Polynomial size proofs of the proposi-
P
tional pigeonhole principle. The Journal of Symbolic Logic,
52(04), 916–927.
B C B C
Carpenter, B. (1997). Type-logical semantics. MIT Press.
Chou, S.-C. (1988). Mechanical geometry theorem proving
(Vol. 41). Springer Science & Business Media.
Figure 3: Solution to the quadrangle problem Cook, S. A., & Reckhow, R. A. (1979). The relative efficiency
of propositional proof systems. The Journal of Symbolic
Logic, 44(01), 36–50.
Language Change Fischer, M. J., & Rabin, M. O. (1974). Super-exponential
Elementary (Euclidean) geometry is known to be embeddable complexity of presburger arithmetic. In Proc. of the siam-
into the Cartesian coordinate system, and finally to RCF. ams symposia in applied mathematics (Vol. 7, pp. 27–41).
However, languages and sets of axioms are different. As a Haken, A. (1985). The intractability of resolution. Theoreti-
result, the difficulties of problems do not remain the same. cal Computer Science, 39, 297–308.
Consider the following problem: “Let ABCD be a quad- Heim, I., & Kratzer, A. (1998). Semantics in generative
rangle. Find the point P that minimizes the sum of AP, BP, CP, and DP.” Fig. 3 illustrates that the intersection of the diagonals minimizes the sum because of the triangle inequality: for any point P, AP + CP ≥ AC and BP + DP ≥ BD, and both equalities hold exactly when P lies on both diagonals. Insight may be required to consider the intersection of the diagonals as a candidate for P. However, the idea is easier to conceive when the problem is represented in Euclidean geometry than in RCF, since the intersection of the diagonals is salient in Euclidean geometry.

Propositional or Predicate?
The mutilated draughtboard problem (2n × 2n version) is a good example of a problem that becomes solvable when one changes the setting radically. The problem requires exponential-size proofs in resolution and analytic tableaux. It is not known whether it has short proofs in tableaux with a symmetry rule. However, it has a short proof in arithmetic.

Conclusion
An end-to-end model of math problem solving has been presented. In the model, representation change is explained as the result of choosing a local theory and reformulating a primary problem representation in it. Experimental results on more than 400 problems show that our prototype implementation reflects the difficulty of the problems closely. Specifically, when a timeout is interpreted as an “impasse”, IMO problems require a “theory change” from the system more often than other problems. This indicates that the model captures the difficulty of the problems and hence can serve as a basis for a quantitative study of representation change. Future work includes further analysis of the difficulty of math problems in light of our information-processing account and the development of computational models of theory choice.

References
Arai, N. H. (1996). Tractability of cut-free Gentzen type propositional calculus with permutation inference. Theoretical Computer Science, 170(1), 129–144.
Arai, N. H., & Masukawa, R. (2000). How to find symmetries hidden in combinatorial problems. In Proceedings of
grammar. Wiley.
Isaak, M. I., & Just, M. A. (1995). Constraints on thinking in insight and invention. In R. J. Sternberg & J. E. Davidson (Eds.), The nature of insight (pp. 281–325). MIT Press.
Iwane, H., Yanami, H., & Anai, H. (2014). Synrac: A toolbox for solving real algebraic constraints. In Mathematical software–icms 2014 (pp. 518–522). Springer.
Kamp, H., & Reyle, U. (1993). From discourse to logic: Introduction to modeltheoretic semantics of natural language, formal logic and discourse representation theory. Kluwer Academic.
Kaplan, C. A., & Simon, H. A. (1990). In search of insight. Cognitive Psychology, 22(3), 374–419.
Kerber, M., & Pollet, M. (2006). A tough nut for mathematical knowledge management. In Mathematical knowledge management (pp. 81–95). Springer.
MacGregor, J. N., Ormerod, T. C., & Chronicle, E. P. (2001). Information processing and insight: A process model of performance on the nine-dot and related problems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27(1), 176.
Newell, A., & Simon, H. A. (1972). Human problem solving (Vol. 104) (No. 9). Englewood Cliffs, NJ: Prentice-Hall.
Ohlsson, S. (1992). Information-processing explanations of insight and related phenomena. In Advances in the psychology of thinking (pp. 1–44). Harvester Wheatsheaf.
Öllinger, M., Jones, G., & Knoblich, G. (2014). The dynamics of search, impasse, and representational change provide a coherent explanation of difficulty in the nine-dot problem. Psychological Research, 78(2), 266–275.
Steedman, M. (2001). The syntactic process. MIT Press.
Tarski, A. (1951). A decision method for elementary algebra and geometry. University of California Press.
Zettlemoyer, L. S., & Collins, M. (2005). Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars. In Proc. of the 21st conference in uncertainty in artificial intelligence (pp. 658–666).
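
The remark in the Propositional or Predicate? section that the mutilated draughtboard problem has a short proof in arithmetic can be illustrated with the familiar colouring argument. The sketch below is not taken from the prototype system, and it is only an assumption that the arithmetic proof meant there is of this counting kind. The function names `colour_counts` and `tiling_ruled_out` are hypothetical. The code counts the squares of each colour on a 2n × 2n board with two opposite corners removed. Since every domino covers one square of each colour, unequal counts rule out a tiling.

```python
def colour_counts(n):
    """Count squares of each colour on a 2n x 2n board
    with two opposite corners removed.

    Returns (dark, light), where a square (row, col) is "dark"
    when row + col is even. Both removed corners are dark.
    """
    size = 2 * n
    removed = {(0, 0), (size - 1, size - 1)}
    dark = 0
    light = 0
    for row in range(size):
        for col in range(size):
            if (row, col) in removed:
                continue
            if (row + col) % 2 == 0:
                dark += 1
            else:
                light += 1
    return dark, light


def tiling_ruled_out(n):
    """A domino covers one dark and one light square, so a tiling
    needs equal counts. Return True when the counts differ."""
    dark, light = colour_counts(n)
    return dark != light


if __name__ == "__main__":
    for n in range(1, 5):
        dark, light = colour_counts(n)
        print(f"n={n}: dark={dark}, light={light}, "
              f"tiling ruled out: {tiling_ruled_out(n)}")
```

The counts are always 2n² − 2 and 2n², so the argument has constant length in n, in contrast to the exponential-size resolution and tableau proofs mentioned in that section.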
