Semantic Parsing Via Staged Query Graph Generation: Question Answering With Knowledge Base
… after linking “Family Guy” in the question “Who first voiced Meg on Family Guy?” to FamilyGuy (the TV show) in the knowledge base, the procedure needs only to examine the predicates that can be applied to FamilyGuy.

Figure 2: Query graph that represents the question “Who first voiced Meg on Family Guy?”

Figure 3: The legitimate actions to grow a query graph. See text for detail.

grounded entities. In particular, we would like to retrieve all the entities that can map to the lambda variables in the end as the answers. The aggregation function is designed to operate on a specific entity, which typically captures some numerical properties. Just like in the knowledge graph, related nodes in the query graph are connected by directed edges, labeled with predicates in K.

To demonstrate this design, Fig. 2 shows one possible query graph for the question “Who first voiced Meg on Family Guy?” using Freebase. The two entities, MegGriffin and FamilyGuy, are represented by two rounded-rectangle nodes. The circle node y means that there should exist an entity describing some casting relations, such as the character, the actor and the time she started the role². The shaded circle node x is also called the answer node, and is used to map entities retrieved by the query. The diamond node argmin constrains that the answer needs to be the earliest actor for this role. Equivalently, the logical form query in λ-calculus, without the aggregation function, is: λx.∃y. cast(FamilyGuy, y) ∧ actor(y, x) ∧ character(y, MegGriffin).

Running this query graph against K as in Fig. 1 will match both LaceyChabert and MilaKunis before applying the aggregation function, but only LaceyChabert is the correct answer, as she started this role earlier (which can be checked via the from property of the grounded CVT node).

Our query graph design is inspired by (Reddy et al., 2014), but with some key differences. The nodes and edges in our query graph closely resemble the exact entities and predicates from the knowledge base. As a result, the graph can be straightforwardly translated to a logical form query that is directly executable. In contrast, the query graph in (Reddy et al., 2014) is mapped from the CCG parse of the question, and needs further transformations before being mapped to subgraphs of the target knowledge base to retrieve answers. Semantically, our query graph is more closely related to simple λ-DCS (Berant et al., 2013; Liang, 2013), which is a syntactic simplification of λ-calculus when applied to graph databases. A query graph can be viewed as the tree-like graph pattern of a logical form in λ-DCS. For instance, the path from the answer node to an entity node can be described using a series of join operations in λ-DCS. Different paths of the tree graph are combined via the intersection operators.

3 Staged Query Graph Generation

We focus on generating query graphs with the following properties. First, the tree graph consists of one entity node as the root, referred to as the topic entity. Second, there exists only one lambda variable x as the answer node, with a directed path from the root to it and zero or more existential variables in between. We call this path the core inferential chain of the graph, as it describes the main relationship between the answer and the topic entity. Variables can only occur in this chain, and the chain contains only variable nodes except for the root. Finally, zero or more entity or aggregation nodes can be attached to each variable node, including the answer node. These branches are the additional constraints that the answers need to satisfy. For example, in Fig. 2, FamilyGuy is the root and FamilyGuy → y → x is the core inferential chain. The branch y → MegGriffin specifies the character, and y → argmin constrains that the answer needs to be the earliest actor for this role.

Given a question, we formalize the query graph generation process as a search problem with staged states and actions. Let S = {φ, Se, Sp, Sc} be the set of states, where each state can be an empty graph (φ), a single-node graph with the topic entity (Se), a core inferential chain (Sp), or a more complex query graph with additional constraints (Sc). Let A = {Ae, Ap, Ac, Aa} be the set of actions. An action grows a given graph by adding some edges …

² y should be grounded to a CVT entity in this case.
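To make the mapping from a query graph to an executable logical form concrete, the short sketch below represents the graph of Fig. 2 as a small Python data structure and prints the corresponding λ-calculus-style conjunction. This is an illustrative reconstruction rather than the authors' code; the class name QueryGraph and its fields are ours, while the predicates (cast, actor, character) follow the example above.

```python
from dataclasses import dataclass, field

@dataclass
class QueryGraph:
    """Tree-like query graph: a topic entity root, variable nodes, and constraint branches."""
    topic_entity: str                                   # grounded root entity, e.g. "FamilyGuy"
    chain: list = field(default_factory=list)           # core inferential chain: [(predicate, variable), ...]
    constraints: list = field(default_factory=list)     # [(variable, predicate, entity), ...]
    aggregations: list = field(default_factory=list)    # [(variable, function, attribute), ...]

    def to_lambda(self):
        """Render the λ-calculus conjunction (ignoring the aggregation nodes)."""
        terms, prev = [], self.topic_entity
        for pred, var in self.chain:
            terms.append(f"{pred}({prev}, {var})")
            prev = var
        for var, pred, ent in self.constraints:
            terms.append(f"{pred}({var}, {ent})")
        return "λx.∃y. " + " ∧ ".join(terms)

# The graph of Fig. 2: FamilyGuy -cast-> y -actor-> x, plus the character constraint
# and an argmin over the CVT node's `from` attribute.
g = QueryGraph(
    topic_entity="FamilyGuy",
    chain=[("cast", "y"), ("actor", "x")],
    constraints=[("y", "character", "MegGriffin")],
    aggregations=[("y", "argmin", "from")],
)
print(g.to_lambda())
# λx.∃y. cast(FamilyGuy, y) ∧ actor(y, x) ∧ character(y, MegGriffin)
```

Executing the graph then amounts to matching this conjunctive pattern against K and returning the bindings of x, with any aggregation node (here, argmin over the from attribute) applied to the matched groundings afterwards.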
Figure 4: Two possible topic entity linking actions applied to an empty graph, for question “Who first voiced [Meg] on [Family Guy]?”

Figure 5: Candidate core inferential chains start from the entity FamilyGuy.
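As Fig. 5 illustrates with the chains cast-actor, writer-start and genre, candidate core inferential chains are predicate paths of length one or two starting from the topic entity (the length-2 restriction is noted later in footnote 7). The sketch below is only illustrative; kb_neighbors is a hypothetical stand-in for a real knowledge-graph lookup, and the toy KB fragment mimics Fig. 5.

```python
def enumerate_chains(topic_entity, kb_neighbors, max_len=2):
    """Enumerate candidate core inferential chains (predicate sequences) of length <= max_len
    starting from the topic entity, e.g. FamilyGuy -> [genre], [cast, actor], [writer, start]."""
    chains = []
    for pred1, node1 in kb_neighbors(topic_entity):
        chains.append([pred1])                       # length-1 chain: topic -> x
        if max_len >= 2:
            for pred2, _ in kb_neighbors(node1):     # length-2 chain: topic -> y -> x
                chains.append([pred1, pred2])
    return chains

# Toy KB fragment mimicking Fig. 5: s3 = cast-actor, s4 = writer-start, s5 = genre.
toy_kb = {
    "FamilyGuy": [("cast", "cvt1"), ("writer", "cvt2"), ("genre", "Comedy")],
    "cvt1": [("actor", "MilaKunis")],
    "cvt2": [("start", "1999")],
}
print(enumerate_chains("FamilyGuy", lambda n: toy_kb.get(n, [])))
# [['cast'], ['cast', 'actor'], ['writer'], ['writer', 'start'], ['genre']]
```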
Figure (CNN architecture): max pooling layer v (300), semantic projection matrix Ws, semantic layer y (300).

Figure 7: example query graph states, including s3 (FamilyGuy → y → x via cast and actor) and s6 (s3 with the branch y → MegGriffin).
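Only the layer labels of the CNN figure survive above, but together with the hyper-parameters reported later (roughly 1,000 convolutional nodes and a 300-dimensional semantic layer, Sec. 4.1) they outline a convolutional matching model that maps a word sequence to a low-dimensional semantic vector and scores pairs by cosine similarity. The numpy sketch below is a rough, untrained illustration of that kind of architecture, not the authors' implementation; the vocabulary size, the token hashing, and the exact wiring are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, CONV_DIM, SEM_DIM, WIN = 2000, 1000, 300, 3   # ~1000 conv nodes, 300-d semantic layer (Sec. 4.1)

Wc = rng.standard_normal((CONV_DIM, WIN * VOCAB)) * 0.01   # convolution matrix (untrained)
Ws = rng.standard_normal((SEM_DIM, CONV_DIM)) * 0.01       # semantic projection matrix Ws (untrained)

def embed(tokens, hash_fn=hash):
    """Map a token sequence to a SEM_DIM vector: sliding-window convolution,
    tanh, element-wise max pooling (v), then the semantic projection Ws (y)."""
    onehots = np.zeros((len(tokens), VOCAB))
    for i, t in enumerate(tokens):
        onehots[i, hash_fn(t) % VOCAB] = 1.0               # stand-in for letter-trigram word hashing
    pad = np.zeros((WIN // 2, VOCAB))
    padded = np.vstack([pad, onehots, pad])
    conv = np.stack([np.tanh(Wc @ padded[i:i + WIN].ravel()) for i in range(len(tokens))])
    v = conv.max(axis=0)                                   # max pooling layer v (CONV_DIM)
    return np.tanh(Ws @ v)                                 # semantic layer y (SEM_DIM)

def similarity(pattern_tokens, predicate_tokens):
    a, b = embed(pattern_tokens), embed(predicate_tokens)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# e.g. a PatChain-style comparison of the question pattern with a predicate sequence:
print(similarity("who first voiced meg on <e>".split(), ["cast", "actor"]))
```

In the full system, one such model scores a question pattern against a predicate sequence, which is where the PatChain, QuesEP and ClueWeb features of Sec. 3.4.1 come from.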
3.3 Augmenting Constraints & Aggregations

A graph with just the inferential chain forms the simplest legitimate query graph and can be executed against the knowledge base K to retrieve the answers; namely, all the entities that x can be grounded to. For instance, the graph in s3 in Fig. 7 will retrieve all the actors who have been on FamilyGuy. Although this set of entities obviously contains the correct answer to the question (assuming the topic entity FamilyGuy is correct), it also includes incorrect entities that do not satisfy additional constraints implicitly or explicitly mentioned in the question.

To further restrict the set of answer entities, the graph with only the core inferential chain can be expanded by two types of actions: Ac and Aa. Ac is the set of possible ways to attach an entity to a variable node, where the edge denotes one of the valid predicates that can link the variable to the entity. For instance, in Fig. 7, s6 is created by attaching MegGriffin to y with the predicate character. This is equivalent to the last conjunctive term in the corresponding λ-expression: λx.∃y. cast(FamilyGuy, y) ∧ actor(y, x) ∧ character(y, MegGriffin). Sometimes the constraints are described over the entire answer set through an aggregation function, such as the word “first” in our example question qex. This is handled similarly by the actions Aa, which attach an aggregation node to a variable node. For example, the argmin node of s7 in Fig. 7 chooses the grounding with the smallest from attribute of y.

The full possible constraint set can be derived by first issuing the core inferential chain as a query to the knowledge base to find the bindings of the variables y and x, and then enumerating all neighboring nodes of these entities. This, however, often results in an unnecessarily large constraint pool. In this work, we employ simple rules to retain only the nodes that have some possibility of being legitimate constraints. For instance, a constraint node can be an entity that also appears in the question (detected by our entity linking component), or an aggregation constraint can only be added if certain keywords like “first” or “latest” occur in the question. The complete set of these rules can be found in Appendix B.

3.4 Learning the reward function

Given a state s, the reward function γ(s) basically judges whether the query graph represented by s is the correct semantic parse of the input question q. We use a log-linear model to learn the reward function. Below we describe the features and the learning process.

3.4.1 Features

The features we designed essentially match specific portions of the graph to the question, and generally correspond to the staged actions described previously, including:

Topic Entity The score returned by the entity linking system is directly used as a feature.

Core Inferential Chain We use the similarity scores of different CNN models described in Sec. 3.2.1 to measure the quality of the core inferential chain. PatChain compares the pattern (replacing the topic entity with an entity symbol) with the predicate sequence. QuesEP concatenates the canonical name of the topic entity and the predicate sequence, and compares it with the question; this feature conceptually tries to verify the entity linking suggestion. These two CNN models are learned using pairs of the question and the inferential chain of the parse in the training data. In addition to the in-domain similarity features, we also train a ClueWeb model using the Freebase annotation of ClueWeb corpora (Gabrilovich et al., 2013). For two entities in a sentence that can be linked by one or two predicates, we pair the sentences and predicates to form a parallel corpus to train the CNN model.

Constraints & Aggregations When a constraint node is present in the graph, we use some simple features to check whether there are words in the question that can be associated with the constraint entity or property. Examples of such features include whether a mention in the question can be linked to this entity, and the percentage of the words in the name of the constraint entity that appear in the question. Similarly, we check for the existence of some keywords from a pre-compiled list, such as “first”, “current” or “latest”, as features for aggregation nodes such as argmin. The complete list of these simple word matching features can also be found in Appendix B.

Overall The number of answer entities retrieved when issuing the query to the knowledge base and the number of nodes in the query graph are both included as features.
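Given these components, the reward of a candidate state is a log-linear score over the assembled feature vector. The sketch below is illustrative only: the feature names follow Fig. 8, the state is a plain dictionary, and the entity-linker and CNN similarity scores are assumed to be supplied by the components described above.

```python
import math

def extract_features(state, question, linker_score, cnn_scores, kb_answer_count):
    """Assemble the feature vector of a candidate query graph (cf. Fig. 8).
    `cnn_scores` holds the PatChain / QuesEP / ClueWeb similarities of the core chain."""
    feats = {
        "EntityLinkingScore": linker_score,
        "PatChain": cnn_scores["PatChain"],
        "QuesEP": cnn_scores["QuesEP"],
        "ClueWeb": cnn_scores["ClueWeb"],
        "NumNodes": len(state["nodes"]),
        "NumAns": kb_answer_count,
    }
    q_words = set(question.lower().split())
    for ent in state.get("constraint_entities", []):       # e.g. "Meg Griffin"
        words = ent.lower().split()
        # fraction of the constraint entity's name words that appear in the question
        feats["ConstraintEntityWord"] = sum(w in q_words for w in words) / len(words)
    if state.get("aggregation") == "argmin" and {"first", "current", "latest"} & q_words:
        feats["AggregationKeyword"] = 1.0                  # keyword evidence for the aggregation node
    return feats

def reward(feats, weights):
    """Log-linear reward: exponential of the weighted feature sum (any monotone link works for ranking)."""
    return math.exp(sum(weights.get(k, 0.0) * v for k, v in feats.items()))
```

The weights themselves come from the ranking procedure described next in Sec. 3.4.2.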
To illustrate our feature design, Fig. 8 presents the active features of an example query graph.

Figure 8: Active features of a query graph s for q = “Who first voiced Meg on Family Guy?”, where s is the graph FamilyGuy → y → x (via cast and actor) with the branches y → MegGriffin and y → argmin:
(1) EntityLinkingScore(FamilyGuy, “Family Guy”) = 0.9
(2) PatChain(“who first voiced meg on <e>”, cast-actor) = 0.7
(3) QuesEP(q, “family guy cast-actor”) = 0.6
(4) ClueWeb(“who first voiced meg on <e>”, cast-actor) = 0.2
(5) ConstraintEntityWord(“Meg Griffin”, q) = 0.5
(6) ConstraintEntityInQ(“Meg Griffin”, q) = 1
(7) AggregationKeyword(argmin, q) = 1
(8) NumNodes(s) = 5
(9) NumAns(s) = 1
(1) is the entity linking score of the topic entity. (2)-(4) are different model scores of the core chain. (5) indicates that 50% of the words in “Meg Griffin” appear in the question q. (6) is 1 when the mention “Meg” in q is correctly linked to MegGriffin by the entity linking component. (8) is the number of nodes in s. The knowledge base returns only 1 entity when issuing this query, so (9) is 1.

3.4.2 Learning

In principle, once the features are extracted, the model can be trained using any standard off-the-shelf learning algorithm. Instead of treating it as a binary classification problem, where only the correct query graphs are labeled as positive, we view it as a ranking problem. Suppose we have several candidate query graphs for each question⁴. Let ga and gb be the query graphs described in states sa and sb for the same question q, and let the entity sets Aa and Ab be those retrieved by executing ga and gb, respectively. Suppose that A is the set of labeled answers to q. We first compute the precision, recall and F1 score of Aa and Ab, compared with the gold answer set A. We then rank sa and sb by their F1 scores⁵. The intuition behind this is that even if a query is not completely correct, it is still preferred to other, totally incorrect queries. In this work, we use a one-layer neural network model based on lambda-rank (Burges, 2010) for training the ranker.

4 Experiments

We first introduce the dataset and evaluation metric, followed by the main experimental results and some analysis.

4.1 Data & evaluation metric

We use the WEBQUESTIONS dataset (Berant et al., 2013), which consists of 5,810 question/answer pairs. These questions were collected using the Google Suggest API and the answers were obtained from Freebase with the help of Amazon MTurk. The questions are split into training and testing sets, which contain 3,778 questions (65%) and 2,032 questions (35%), respectively. This dataset has several unique properties that make it appealing, and it was used in several recent papers on semantic parsing and question answering. For instance, although the questions are not directly sampled from search query logs, the selection process was still biased toward commonly asked questions on a search engine. The distribution of this question set is thus closer to the “real” information need of search users than that of a small number of human editors. The system performance is basically measured by the ratio of questions that are answered correctly. Because there can be more than one answer to a question, precision, recall and F1 are computed based on the system output for each individual question. The average F1 score is reported as the main evaluation metric⁶.

Because this dataset contains only question and answer pairs, we use essentially the same search procedure to simulate the semantic parses for training the CNN models and the overall reward function. Candidate topic entities are first generated using the same entity linking system for each question in the training data. Paths on the Freebase knowledge graph that connect a candidate entity to at least one answer entity are identified as the core inferential chains⁷. If an inferential-chain query returns more entities than the correct answers, we explore adding constraint and aggregation nodes, until the entities retrieved by the query graph are identical to the labeled answers, or the F1 score cannot be increased further. Negative examples are sampled from the incorrect candidate graphs generated during the search process.

⁴ We will discuss how to create these candidate query graphs from question/answer pairs in Sec. 4.1.
⁵ We use F1 partially because it is the evaluation metric used in the experiments.
⁶ We used the official evaluation script from http://www-nlp.stanford.edu/software/sempre/.
⁷ We restrict the path length to 2. In principle, parses of shorter chains can be used to train the initial reward function, for exploring longer paths using the same search procedure.
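The per-question precision, recall and F1 computation used both for ranking candidate parses (Sec. 3.4.2) and for the evaluation metric above is simple set arithmetic. Below is a minimal, illustrative sketch with toy candidates; the actual ranker is trained with lambda-rank on the extracted features rather than by sorting directly.

```python
def prf1(predicted, gold):
    """Precision, recall and F1 of a predicted answer set against the labeled answers."""
    predicted, gold = set(predicted), set(gold)
    if not predicted or not gold:
        return 0.0, 0.0, 0.0
    tp = len(predicted & gold)
    p, r = tp / len(predicted), tp / len(gold)
    f1 = 2 * p * r / (p + r) if tp else 0.0
    return p, r, f1

def rank_candidates(candidates, gold):
    """Order candidate query graphs by the F1 of their retrieved entities;
    a partially correct graph is preferred to a totally incorrect one."""
    return sorted(candidates, key=lambda c: prf1(c["answers"], gold)[2], reverse=True)

# Toy example for "Who first voiced Meg on Family Guy?" with gold answer LaceyChabert.
gold = {"LaceyChabert"}
candidates = [
    {"graph": "chain only", "answers": {"LaceyChabert", "MilaKunis"}},   # F1 = 2/3
    {"graph": "chain + argmin", "answers": {"LaceyChabert"}},            # F1 = 1
    {"graph": "wrong chain", "answers": {"SethMacFarlane"}},             # F1 = 0
]
print([c["graph"] for c in rank_candidates(candidates, gold)])
# ['chain + argmin', 'chain only', 'wrong chain']
```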
In the end, we produce 17,277 query graphs with non-zero F1 scores from the training set questions and about 1.7M completely incorrect ones.

For training the CNN models to identify the core inferential chain (Sec. 3.2.1), we only use the 4,058 chain-only query graphs that achieve F1 = 0.5 to form the parallel question and predicate sequence pairs. The hyper-parameters in the CNN, such as the learning rate and the numbers of hidden nodes at the convolutional and semantic layers, were chosen via cross-validation. We reserved 684 pairs of patterns and inference chains from the whole training examples as the held-out set, and used the rest as the initial training set. The optimal hyper-parameters were determined by the performance of models trained on the initial training set when applied to the held-out data. We then fixed the hyper-parameters and retrained the CNN models using the whole training set. The performance of the CNN is insensitive to the hyper-parameters as long as they are in a reasonable range (e.g., 1000 ± 200 nodes in the convolutional layer, 300 ± 100 nodes in the semantic layer, and learning rate 0.05 ∼ 0.005), and the training process often converges after ∼ 800 epochs.

When training the reward function, we created up to 4,000 examples for each question, containing all the positive query graphs and randomly selected negative examples. The model is trained as a ranker, where example query graphs are ranked by their F1 scores.

4.2 Results

Tab. 1 shows the results of our system, STAGG (Staged query graph generation), compared to existing work⁸. As can be seen from the table, our system outperforms the previous state-of-the-art method by a large margin (a 7.2% absolute gain).

Method                      Prec.  Rec.   F1
(Berant et al., 2013)       48.0   41.3   35.7
(Bordes et al., 2014b)      -      -      29.7
(Yao and Van Durme, 2014)   -      -      33.0
(Berant and Liang, 2014)    40.5   46.6   39.9
(Bao et al., 2014)          -      -      37.5
(Bordes et al., 2014a)      -      -      39.2
(Yang et al., 2014)         -      -      41.3
(Wang et al., 2014)         -      -      45.3
Our approach – STAGG        52.8   60.7   52.5

Table 1: The results of our approach compared to existing work. The numbers of other systems are either from the original papers or derived from the evaluation script, when the output is available.

Given the staged design of our approach, it is interesting to examine the contributions of each component. Because topic entity linking is the very first stage, the quality of the entities found in the questions, in both precision and recall, affects the final results significantly. To get some insight into how our topic entity linking component performs, we also experimented with applying the Freebase Search API to suggest entities for possible mentions in a question. As can be observed in Tab. 2, to cover most of the training questions, we need only half the number of suggestions when using our entity linking component, compared to the Freebase API. Moreover, they also cover more of the entities that were selected as the topic entities in the original dataset. Starting from those 9,147 entities output by our component, the answers of 3,453 questions (91.4%) can be found in their neighboring nodes. When replacing our entity linking component with the results from the Freebase API, we also observed a significant performance degradation: the overall system performance drops from 52.5% to 48.4% in F1 (Prec = 49.8%, Rec = 55.7%), which is 4.1 points lower.

Method        #Entities   #Covered Ques.   #Labeled Ent.
Freebase API  19,485      3,734 (98.8%)    3,069 (81.2%)
Ours          9,147       3,770 (99.8%)    3,318 (87.8%)

Table 2: Statistics of entity linking results on training set questions. Both methods cover roughly the same number of questions, but the Freebase API suggests twice the number of entities output by our entity linking system and covers fewer of the topic entities labeled in the original data.

Next, we test the system performance when the query graph has just the core inferential chain. Tab. 3 summarizes the results. When only the PatChain CNN model is used, the performance is already very strong, outperforming all existing work. Adding the other CNN models boosts the performance further, reaching 51.8%, which is only slightly lower than the full system performance. This may be due to two reasons. First, the questions from search engine users are often short, and a large portion of them simply ask about properties of an entity.

⁸ We do not include results of (Reddy et al., 2014) because they used only a subset of 570 test questions, which are not directly comparable to results from other work. On these 570 questions, our system achieves 67.0% in F1.
Method     Prec.  Rec.   F1
PatChain   48.8   59.3   49.6
+QuesEP    50.7   60.6   50.9
+ClueWeb   51.3   62.6   51.8

Table 3: The system results when only the inferential-chain query graphs are generated. We started with the PatChain CNN model and then added QuesEP and ClueWeb sequentially. See Sec. 3.4 for the description of these models.

Examining the query graphs generated for training set questions, we found that 1,888 (50.0%) can be answered exactly (i.e., F1 = 1) using a chain-only query graph. Second, even if the correct parse requires more constraints, the less constrained graph still gets a partial score, as its results cover the correct answers.

4.3 Error Analysis

Although our approach substantially outperforms existing methods, the room for improvement seems big. After all, the accuracy for the intended application, question answering, is still low and only slightly above 50%. We randomly sampled 100 questions for which our system did not generate the completely correct query graphs, and categorized the errors. About one third of the errors are in fact due to label issues and are not real mistakes. This includes label errors (2%), incomplete labels (17%; e.g., only one song is labeled as the answer to “What songs did Bob Dylan write?”) and acceptable answers (15%; e.g., “Time in China” vs. “UTC+8”). 8% of the errors are due to incorrect entity linking; however, sometimes the mention is inherently ambiguous (e.g., AFL in “Who founded the AFL?” could mean either “American Football League” or “American Federation of Labor”). 35% of the errors are because of incorrect inferential chains, and 23% are due to incorrect or missing constraints.

5 Related Work and Discussion

Several semantic parsing methods use a domain-independent meaning representation derived from combinatory categorial grammar (CCG) parses (e.g., (Cai and Yates, 2013; Kwiatkowski et al., 2013; Reddy et al., 2014)). In contrast, our query graph design matches the graph knowledge base closely. Although not fully demonstrated in this paper, the query graph can in fact be fairly expressive. For instance, negations can be handled by adding tags to the constraint nodes indicating that certain conditions cannot be satisfied. Our graph generation method is inspired by (Yao and Van Durme, 2014; Bao et al., 2014). Unlike traditional semantic parsing approaches, it uses the knowledge base to help prune the search space when forming the parse. Similar ideas have also been explored in (Poon, 2013).

Empirically, our results suggest that it is crucial to identify the core inferential chain, which matches the relationship between the topic entity in the question and the answer. Our CNN models are analogous to the embedding approaches (Bordes et al., 2014a; Yang et al., 2014), but are more sophisticated. By allowing parameter sharing among different question-pattern and KB-predicate pairs, the matching score of a rare or even unseen pair in the training data can still be predicted precisely, because the prediction is based on the shared model parameters (i.e., projection matrices) that are estimated using all training pairs.

6 Conclusion

In this paper, we present a semantic parsing framework for question answering using a knowledge base. We define a query graph as the meaning representation that can be directly mapped to a logical form. Semantic parsing is reduced to query graph generation, formulated as a staged search problem. With the help of an advanced entity linking system and a deep convolutional neural network model that matches questions and predicate sequences, our system outperforms previous methods substantially on the WEBQUESTIONS dataset. In the future, we would like to extend our query graph to represent more complicated questions, and to explore more features and models for matching constraints and aggregation functions. Applying other structured-output prediction methods to graph generation will also be investigated.

Acknowledgments

We thank the anonymous reviewers for their thoughtful comments, Ming Zhou, Nan Duan and Xuchen Yao for sharing their experience on the question answering problem studied in this work, and Chris Meek for his valuable suggestions. We are also grateful to Siva Reddy and Jonathan Berant for providing us additional data.

Appendix

See supplementary notes.
References

Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary Ives. 2007. DBpedia: A nucleus for a web of open data. In The Semantic Web, pages 722–735. Springer.

Junwei Bao, Nan Duan, Ming Zhou, and Tiejun Zhao. 2014. Knowledge-based question answering as machine translation. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 967–976, Baltimore, Maryland, June. Association for Computational Linguistics.

Jonathan Berant and Percy Liang. 2014. Semantic parsing via paraphrasing. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1415–1425, Baltimore, Maryland, June. Association for Computational Linguistics.

Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang. 2013. Semantic parsing on Freebase from question-answer pairs. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1533–1544, Seattle, Washington, USA, October. Association for Computational Linguistics.

Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. 2008. Freebase: A collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, SIGMOD '08, pages 1247–1250, New York, NY, USA. ACM.

Antoine Bordes, Sumit Chopra, and Jason Weston. 2014a. Question answering with subgraph embeddings. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 615–620, Doha, Qatar, October. Association for Computational Linguistics.

Antoine Bordes, Jason Weston, and Nicolas Usunier. 2014b. Open question answering with weakly supervised embedding models. In Proceedings of ECML-PKDD.

Jane Bromley, James W. Bentz, Léon Bottou, Isabelle Guyon, Yann LeCun, Cliff Moore, Eduard Säckinger, and Roopak Shah. 1993. Signature verification using a “Siamese” time delay neural network. International Journal of Pattern Recognition and Artificial Intelligence, 7(4):669–688.

Christopher J.C. Burges. 2010. From RankNet to LambdaRank to LambdaMART: An overview. Learning, 11:23–581.

Qingqing Cai and Alexander Yates. 2013. Large-scale semantic parsing via schema matching and lexicon extension. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 423–433, Sofia, Bulgaria, August. Association for Computational Linguistics.

Evgeniy Gabrilovich, Michael Ringgaard, and Amarnag Subramanya. 2013. FACC1: Freebase annotation of ClueWeb corpora, version 1. Technical report, June.

Jianfeng Gao, Patrick Pantel, Michael Gamon, Xiaodong He, Li Deng, and Yelong Shen. 2014. Modeling interestingness with deep neural networks. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing.

Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry Heck. 2013. Learning deep structured semantic models for Web search using clickthrough data. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, pages 2333–2338. ACM.

Tom Kwiatkowski, Eunsol Choi, Yoav Artzi, and Luke Zettlemoyer. 2013. Scaling semantic parsers with on-the-fly ontology matching. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1545–1556, Seattle, Washington, USA, October. Association for Computational Linguistics.

Percy Liang. 2013. Lambda dependency-based compositional semantics. Technical report, arXiv.

Hoifung Poon. 2013. Grounded unsupervised semantic parsing. In Annual Meeting of the Association for Computational Linguistics (ACL), pages 933–943.

Siva Reddy, Mirella Lapata, and Mark Steedman. 2014. Large-scale semantic parsing without question-answer pairs. Transactions of the Association for Computational Linguistics, 2:377–392.

Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Grégoire Mesnil. 2014a. A latent semantic model with convolutional-pooling structure for information retrieval. In Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, pages 101–110. ACM.

Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Grégoire Mesnil. 2014b. Learning semantic representations using convolutional neural networks for web search. In Proceedings of the Companion Publication of the 23rd International Conference on World Wide Web, pages 373–374.

Zhenghao Wang, Shengquan Yan, Huaming Wang, and Xuedong Huang. 2014. An overview of Microsoft Deep QA system on Stanford WebQuestions benchmark. Technical Report MSR-TR-2014-121, Microsoft, September.

Yi Yang and Ming-Wei Chang. 2015. S-MART: Novel tree-based structured learning algorithms applied to tweet entity linking. In Annual Meeting of the Association for Computational Linguistics (ACL).

Min-Chul Yang, Nan Duan, Ming Zhou, and Hae-Chang Rim. 2014. Joint relational embeddings for knowledge-based question answering. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 645–650, Doha, Qatar, October. Association for Computational Linguistics.

Xuchen Yao and Benjamin Van Durme. 2014. Information extraction over structured data: Question answering with Freebase. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 956–966, Baltimore, Maryland, June. Association for Computational Linguistics.

Wen-tau Yih, Xiaodong He, and Christopher Meek. 2014. Semantic parsing for single-relation question answering. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 643–648, Baltimore, Maryland, June. Association for Computational Linguistics.