Shaw-Yi Chaw, Ken Barker, Bruce Porter, Dan Tecuci
University of Texas at Austin
Department of Computer Sciences
1 University Station C0500
Austin, TX 78712, USA
{jchaw,kbarker,porter,tecuci}@cs.utexas.edu

Peter Z. Yeh
Accenture Technology Labs
50 West San Fernando St., Suite 1200
San Jose, CA 95113, USA
[email protected]
In the end, the problem solver answers the question using information in the HUMAN-CELL concept (see panel 3), and generates the answer shown in panel 4.

The problem solver answers a physics question in a similar manner, as shown in Fig. 2. In this case, relevant information is found in the MOTION-UNDER-FORCE and MOTION-WITH-CONSTANT-ACCELERATION concepts.

Details on formulating original questions using simplified English can be found in [6, 5]. Additional details on generating explanations for derived answers are described in [9, 2].

3 Approach

The problem solver's challenge is to efficiently find relevant information in large knowledge bases. Our approach is to answer each question with a mini-knowledge-base ("mini-kb") containing just enough information to infer an answer. Initially, the mini-kb contains only the information (triples) provided by the question. The problem solver incrementally extends the mini-kb with frames (domain concepts) drawn from the knowledge base being queried. The frames include both domain assertions and inference methods. The problem solver succeeds if it constructs a mini-kb that is sufficient to answer the question before a time bound is reached.

We describe the problem solver in two steps. First, we describe the problem solver's method of incrementally growing the mini-kb. The method consists of searching a state space in which states are mini-kbs and operators select content from the knowledge-base being queried and add it to the mini-kb. Second, we describe the search heuristics enabling the problem-solver to preferentially select content it judges to be relevant.

To ground our discussion, Figure 4 shows a portion of the search graph for solving the physics question in Section 2.

3.1 Problem-Solver without relevance reasoning

3.1.1 States

Each state in our state-space graph is a mini-kb. The initial state contains only the triples resulting from the interpretation of the question. Descendant states are elaborations of the initial state containing additional information drawn from the knowledge base.

In Fig. 4, State 1 is the initial state containing the physics question in Section 2. State 2 results from elaborating State 1 using the MOTION-WITH-CONSTANT-ACCELERATION concept from the knowledge base being queried. This elaboration introduced an equation for calculating the acceleration of the MOVE event to be -14.45 meters-per-second-squared.

3.1.2 Goal Test and the Goal State

The goal test determines whether a state (itself a knowledge base) can be used to answer the query.

Consider states 2 and 5 in Figure 4. State 2 fails the goal test because it does not answer the question; it lacks the knowledge needed to infer a value for the net-force of the EXERT-FORCE event. In contrast, state 5 contains the necessary equations to compute the net-force on the EXERT-FORCE event to be -1156 Newtons. Thus, state 5 satisfies the goal test.
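The search just described can be summarized in a short loop. The following Python is only an illustrative sketch, not the authors' implementation; candidate_concepts, elaborate, and answers_query are hypothetical stand-ins for the operator generator, the state elaboration (the join of Figure 3), and the goal test of Section 3.1.2.

    import time

    def solve(question_triples, kb, candidate_concepts, elaborate, answers_query,
              time_bound=1200):
        """Search for a mini-kb sufficient to answer the query.

        States are mini-kbs (here, frozensets of triples); the initial state
        holds only the interpretation of the question. Each operator
        elaborates a state by joining one concept from the knowledge base
        being queried into it.
        """
        start = time.time()
        frontier = [frozenset(question_triples)]   # State 1: the question alone
        seen = set(frontier)
        while frontier and time.time() - start < time_bound:
            state = frontier.pop(0)
            if answers_query(state):               # goal test (Section 3.1.2)
                return state                       # a mini-kb sufficient to answer
            for concept in candidate_concepts(state, kb):
                child = elaborate(state, concept)  # descendant state
                if child not in seen:
                    seen.add(child)
                    frontier.append(child)
        return None                                # time bound reached without an answer

Without relevance reasoning, candidate_concepts returns operators in an arbitrary order; Section 3.2.2 describes how relevance reasoning orders these candidates.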
Figure 2. In panel 1, a physics question is posed to the system in simplified English. The system interprets the question as shown in panel 2. The scenario and query of the question are interpreted as a MOVE event on an OBJECT having mass 80 kg. The initial and final velocities of the MOVE are 17 m/s and 0 m/s, respectively. The distance of the MOVE is 10 m. There is also an EXERT-FORCE event whose object is the same as that of the MOVE event. The EXERT-FORCE event causes the MOVE event. The query is on the net-force of the EXERT-FORCE and is the node with a question mark. In panel 3, the problem-solver draws in information from the knowledge-base. The final answer and explanation are shown in panel 4.
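As a check on the values in panels 3 and 4: with v_i = 17 m/s, v_f = 0 m/s, d = 10 m, and m = 80 kg, constant-acceleration kinematics and Newton's second law give

    a = \frac{v_f^2 - v_i^2}{2d} = \frac{0^2 - 17^2}{2 \cdot 10} = -14.45\ \mathrm{m/s^2},
    \qquad
    F_{net} = m\,a = 80 \cdot (-14.45) = -1156\ \mathrm{N}.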
Figure 3 shows the initial state and the MOTION-WITH-CONSTANT-ACCELERATION concept. Their overlapping features, identified by the semantic-matcher, are highlighted in bold. Joining state 1 with MOTION-WITH-CONSTANT-ACCELERATION results in state 2, shown in panel 3. State 2 contains the equations to compute the acceleration of the MOVE given values for initial velocity, final velocity, and distance.
Figure 3. The three panels show the initial state (consisting of only the question interpretation), the concept MOTION-WITH-CONSTANT-ACCELERATION, and the results of extending the initial state with the concept. State 2 is formed by joining the graph for MOTION-WITH-CONSTANT-ACCELERATION with state 1 using their overlapping features (highlighted in bold in panels 1 and 2). State 2 contains the problem-solving knowledge, in this case equations, to compute the acceleration of the MOVE given values for initial velocity, final velocity, and distance.
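Figure 3's join can be made concrete with a small sketch. This assumes (as the figure suggests) that both the state and the concept are sets of triples, and that the semantic matcher has already produced a binding from concept nodes to the state nodes they overlap with; the function name, binding format, and node names are illustrative, not the authors' API.

    def join(state, concept_triples, bindings):
        """Join a concept graph into a state (mini-kb).

        bindings maps concept nodes to the state nodes they matched (the
        overlapping features found by the semantic matcher). Unmatched
        concept nodes carry over unchanged, so the concept's additional
        structure, e.g. its equations, is added to the state.
        """
        renamed = {(bindings.get(s, s), rel, bindings.get(o, o))
                   for (s, rel, o) in concept_triples}
        return frozenset(state) | renamed

    # e.g., forming state 2: join MOTION-WITH-CONSTANT-ACCELERATION's graph
    # into state 1, with the matcher having aligned the concept's Move node
    # with the question's Move node (hypothetical node names):
    # state2 = join(state1, mwca_triples, {"Move": "Move01"})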
Figure 4. The search graph created by the problem-solver in solving the example question in Section 2. Panels 2-4 show states 1, 2, and 5 in the search graph. State 1, the initial state in the search graph, contains only the triples from the interpretation of the question. State 2 contains the results of elaborating State 1 with the MOTION-WITH-CONSTANT-ACCELERATION concept in the knowledge base being queried. This elaboration introduced an equation to infer the acceleration of the MOVE to be -14.45 meters-per-second-squared. State 2 is elaborated using the MOTION-UNDER-FORCE concept to create state 5. State 5 contains the necessary equations to infer the net-force causing the MOVE to be -1156 Newtons.
3.2.2 Control Strategy

We order the application of operators to prioritize concepts in the knowledge-base having

1. knowledge structures directly affecting the query,

2. a high degree of similarity, and

3. the least number of assumptions added.

Fig. 6 shows how operators B and C in Fig. 4 are ordered based on their concepts' similarity to State 1. Figures 6(a) and 6(b) show how state 1 matches MOTION-UNDER-FORCE and TWO-DIMENSIONAL-MOVE, respectively. The match with MOTION-UNDER-FORCE is preferred because a larger portion of the MOTION-UNDER-FORCE concept graph matches state 1.
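One plausible reading of these three criteria is a lexicographic ordering, sketched below; the Operator fields are hypothetical, and the source does not specify how the criteria are actually combined.

    from dataclasses import dataclass

    @dataclass
    class Operator:
        concept: str          # concept the operator would join in
        affects_query: bool   # criterion 1: directly affects the query
        similarity: float     # criterion 2: degree of match with the state
        assumptions: int      # criterion 3: assumptions the join would add

    def order_operators(operators):
        """Sort operators so criterion 1 dominates 2, and 2 dominates 3.

        Python compares key tuples element by element; the negations turn
        "higher is better" fields into an ascending sort.
        """
        return sorted(operators, key=lambda op:
                      (not op.affects_query, -op.similarity, op.assumptions))

    # Operators B and C of Fig. 4 (similarity values illustrative): B's
    # concept matches a larger portion of state 1, so B is ordered first.
    ops = [Operator("TWO-DIMENSIONAL-MOVE", True, 0.3, 2),   # operator C
           Operator("MOTION-UNDER-FORCE", True, 0.8, 1)]     # operator B
    assert order_operators(ops)[0].concept == "MOTION-UNDER-FORCE"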
4 Evaluation

We claim that simple relevance reasoning can significantly improve performance without sacrificing correctness.

We built two versions of the problem-solver to assess this claim. First, we built a problem-solver without relevance reasoning to establish a gold-standard on the number of questions that can be correctly answered by searching an entire knowledge base. Following that, we built a second problem-solver by enhancing the naive problem-solver with the relevance reasoning described in Section 3.2 to control operator creation and ordering. Both problem-solvers use the Knowledge Machine inference engine [8].

We assess the claim in two ways. First, we assess whether the heuristics used for relevance reasoning reduce correctness scores. Second, we compare the number of states explored by both problem-solvers during search to assess the efficiency of our problem-solver on large knowledge-bases.

4.1 Setup

We pose AP-like exam questions in biology and physics on a variety of knowledge-bases. The test set consists of 308 questions in biology and 105 questions in physics. Our test harness determines whether an answer is correct by comparing the generated answer with an answer key. This enables the test harness to automatically retry a question, causing the problem-solver to return different answers, until the question is correctly answered or a time-bound of 20 minutes is reached.
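The harness's retry behavior can be sketched as follows; ask and check are hypothetical hooks standing in for posing the question to the problem-solver and comparing its answer against the key.

    import time

    def run_question(ask, check, time_bound=20 * 60):
        """Retry a question until its answer matches the key or time runs out.

        ask() returns a (possibly different) answer on each call; check()
        compares that answer with the answer key. Returns whether the
        question was answered correctly and the number of retries used.
        """
        start, retries = time.time(), 0
        while time.time() - start < time_bound:
            if check(ask()):
                return True, retries
            retries += 1               # wrong answer: try again
        return False, retries          # time bound reached

This is the mechanism behind the "average retries required" columns of Table 1.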
Three knowledge-bases are used in the evaluation: physics-domain-kb, biology-domain-kb, and multi-domain-kb. The two domain-kbs were authored by domain experts to cover selected chapters of college-level science textbooks. The domain-kb for physics contained 38 concepts
edge base. We found that on all knowledge-bases, the questions answered in the gold-standard were also answered by the problem-solver with relevance reasoning. Additionally, the problem-solver with relevance reasoning required fewer retries to find the correct answer. This indicates that the heuristics used in our problem-solver do not sacrifice correctness and, in fact, enable the problem-solver to find the correct answer with fewer retries. We were pleasantly surprised to find the problem-solver with relevance reasoning answering additional questions outside the gold-standard. This is due to the larger number of states explored by the problem-solver without relevance reasoning and our setup aborting an attempt after a time-bound. In summary, the lower correctness scores and the higher number of retries required by the problem-solver without relevance reasoning motivate the need for relevance reasoning to focus the search on the most relevant portions of the knowledge base.

Assessment #2: On the significantly larger multi-domain-kb, the problem-solver with relevance reasoning outperforms the problem-solver without relevance reasoning, especially in physics (see Table 1). Without relevance reasoning, a number of questions go unanswered due to the significantly larger number of states explored by the problem-solver and our setup aborting the attempt after a time-bound. Table 2 lists the number of states explored by both problem-solvers to find the correct answers on all knowledge bases. We found relevance reasoning to exhibit good scalability on large knowledge-bases by reducing the number of states explored by the problem-solver.

Figure 6. Different degrees of match between state 1 in Fig. 4 and the MOTION-UNDER-FORCE and TWO-DIMENSIONAL-MOVE concepts in the knowledge-base. The match with MOTION-UNDER-FORCE has a higher degree of match because a larger portion of MOTION-UNDER-FORCE matches state 1. Thus, Operator B is preferred over Operator C in Fig. 4.
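The "degree of match" in Figure 6 can be read as the fraction of the concept graph that the matcher aligned with the state; the source does not give the exact measure, so the following is one simple reading.

    def degree_of_match(concept_triples, matched_triples):
        """Fraction of a concept graph aligned with the state.

        A larger fraction (as with MOTION-UNDER-FORCE in Fig. 6) marks the
        concept as more relevant, so its operator is preferred.
        """
        return len(matched_triples) / len(concept_triples) if concept_triples else 0.0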
5 Summary
Question set   Problem-solver                 Domain-kb                  Multi-domain-kb
                                              % correct   Avg. retries   % correct   Avg. retries
Biology        With relevance reasoning       68.51       0.23           65.91       0.19
Biology        Without relevance reasoning    68.18       0.27           64.29       0.26
Physics        With relevance reasoning       73.33       0.19           70.67       0.16
Physics        Without relevance reasoning    70.67       0.57           56.00       0.54
Table 1. The correctness scores for the problem-solvers with and without relevance reasoning. Both problem-solvers achieved similar correctness scores on the domain knowledge-bases, indicating that relevance reasoning did not sacrifice correctness. The problem-solver without relevance reasoning recorded lower correctness scores when answering physics questions on the significantly larger multi-domain-kb. This is due to the large number of states explored during blind search and our evaluation setup aborting an attempt after a time-bound is reached. This result highlights the need for the problem-solver to select only the most relevant portions of the knowledge base to reason with.
Table 2. The average, median, 75th percentile, 90th percentile, and maximum number of states explored by both problem-solvers (with and without relevance reasoning) for both domains.
of First International Conference on Knowledge Capture,
2001.
[4] R. J. Brachman and J. G. Schmolze. An overview of the KL-ONE knowledge representation system. Cognitive Science, 9:171–216, 1985.
[5] S.-Y. Chaw, J. Fan, D. Tecuci, and P. Yeh. Capturing a Taxonomy of Failures During Automatic Interpretation of Questions Posed in Natural Language. In Proceedings of the Fourth International Conference on Knowledge Capture (K-CAP), 2007.
[6] P. Clark, S.-Y. Chaw, K. Barker, V. Chaudhri, P. Harrison,
J. Fan, B. John, B. Porter, A. Spaulding, J. Thompson, and
P. Z. Yeh. Capturing and Answering Questions Posed to
a Knowledge-Based System. In Proceedings of the Fourth
International Conference on Knowledge Capture, 2007.
[7] P. Clark and P. Harrison. Controlled language processing for
Halo question-asking, 2003.
[8] P. Clark and B. Porter. KM - The Knowledge Machine: Reference manual. Technical report, University of Texas at Austin, 1998. http://www.cs.utexas.edu/users/mfkb/km.html.
[9] N. Friedland, P. Allen, P. Matthews, M. Witbrock,
D. Baxter, J. Curtis, B. Shepard, P. Miraglia, J. Angele,
S. Staab, E. Moench, H. Oppermann, D. Wenke, D. Israel,
V. Chaudhri, B. Porter, K. Barker, J. Fan, S. Chaw, P. Yeh,
D. Tecuci, and P. Clark. Project Halo: Towards a Digital
Aristotle. AI Magazine, 2004.
[10] S. M. Harabagiu, S. J. Maiorano, and M. A. Paşca. Open-
domain textual question answering techniques. Nat. Lang.
Eng., 9(3):231–267, 2003.
[11] J. F. Sowa. Conceptual Structures: Information Processing
in Mind and Machine. Addison-Wesley, 1984.
[12] Vulcan Inc. Project Halo, 2003. http://projecthalo.com.