RAT-SQL: Relation-Aware Schema Encoding and Linking

for Text-to-SQL Parsers

Bailin Wang∗† Richard Shin∗‡
University of Edinburgh UC Berkeley
[email protected] [email protected]

Xiaodong Liu Oleksandr Polozov Matthew Richardson

Microsoft Research, Redmond
{xiaodl,polozov,mattri}@microsoft.com

∗ Equal contribution. Order decided by a coin toss.
† Work done during an internship at Microsoft Research.
‡ Work done while partly affiliated with Microsoft Research. Now at Microsoft: [email protected].

Abstract

When translating natural language questions into SQL queries to answer questions from a database, contemporary semantic parsing models struggle to generalize to unseen database schemas. The generalization challenge lies in (a) encoding the database relations in an accessible way for the semantic parser, and (b) modeling alignment between database columns and their mentions in a given query. We present a unified framework, based on the relation-aware self-attention mechanism, to address schema encoding, schema linking, and feature representation within a text-to-SQL encoder. On the challenging Spider dataset this framework boosts the exact match accuracy to 57.2%, surpassing its best counterparts by 8.7% absolute improvement. Further augmented with BERT, it achieves the new state-of-the-art performance of 65.6% on the Spider leaderboard. In addition, we observe qualitative improvements in the model’s understanding of schema linking and alignment. Our implementation will be open-sourced at https://github.com/Microsoft/rat-sql.

1 Introduction

The ability to effectively query databases with natural language (NL) unlocks the power of large datasets to the vast majority of users who are not proficient in query languages. As such, a large body of research has focused on the task of translating NL questions into SQL queries that existing database software can execute.

The development of large annotated datasets of questions and the corresponding SQL queries has catalyzed progress in the field. In contrast to prior semantic parsing datasets (Finegan-Dollak et al., 2018), new tasks such as WikiSQL (Zhong et al., 2017) and Spider (Yu et al., 2018b) pose the real-life challenge of generalization to unseen database schemas. Every query is conditioned on a multi-table database schema, and the databases do not overlap between the train and test sets.

Schema generalization is challenging for three interconnected reasons. First, any text-to-SQL parsing model must encode the schema into representations suitable for decoding a SQL query that might involve the given columns or tables. Second, these representations should encode all the information about the schema such as its column types, foreign key relations, and primary keys used for database joins. Finally, the model must recognize NL used to refer to columns and tables, which might differ from the referential language seen in training. The latter challenge is known as schema linking – aligning entity references in the question to the intended schema columns or tables.

While the question of schema encoding has been studied in recent literature (Bogin et al., 2019a), schema linking has been relatively less explored. Consider the example in Figure 1. It illustrates the challenge of ambiguity in linking: while “model” in the question refers to car_names.model rather than model_list.model, “cars” actually refers to both cars_data and car_names (but not car_makers) for the purpose of table joining. To resolve the column/table references properly, the semantic parser must take into account both the known schema relations (e.g. foreign keys) and the question context.

Prior work (Bogin et al., 2019a) addressed the schema representation problem by encoding the directed graph of foreign key relations in the schema with a graph neural network (GNN). While effective, this approach has two important shortcomings. First, it does not contextualize schema encoding with the question, thus making reasoning about

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7567–7578
July 5 - 10, 2020. © 2020 Association for Computational Linguistics
Natural Language Question:
For the cars with 4 cylinders, which model has the largest horsepower?

Desired SQL:
SELECT T1.model
FROM car_names AS T1 JOIN cars_data AS T2
ON T1.make_id = T2.id
WHERE T2.cylinders = 4
ORDER BY T2.horsepower DESC LIMIT 1

Schema:
cars_data: id, mpg, cylinders, edispl, horsepower, weight, accelerate, year, …
car_names: make_id, model, make
model_list: model_id, maker, model
car_makers: id, maker, full_name, country

Legend: Question → Column linking (unknown); Question → Table linking (unknown); Column → Column foreign keys (known)

Figure 1: A challenging text-to-SQL task from the Spider dataset.
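
To make the example in Figure 1 concrete, the following minimal sketch runs the desired SQL against a tiny in-memory SQLite database. The table and column names follow Figure 1, but only the columns used by the query are created, and every row is invented purely for illustration (none of it comes from Spider).

```python
import sqlite3


def run_figure1_query():
    """Run the Figure 1 query on a toy database whose rows are made up."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cars_data (id INTEGER PRIMARY KEY, cylinders INTEGER, horsepower INTEGER);
        CREATE TABLE car_names (make_id INTEGER PRIMARY KEY, model TEXT, make TEXT);
        -- Hypothetical rows: two 4-cylinder cars and one 8-cylinder car.
        INSERT INTO cars_data VALUES (1, 4, 90), (2, 4, 115), (3, 8, 200);
        INSERT INTO car_names VALUES (1, 'alpha', 'alpha one'),
                                     (2, 'beta', 'beta two'),
                                     (3, 'gamma', 'gamma three');
        """
    )
    rows = conn.execute(
        """
        SELECT T1.model
        FROM car_names AS T1 JOIN cars_data AS T2 ON T1.make_id = T2.id
        WHERE T2.cylinders = 4
        ORDER BY T2.horsepower DESC LIMIT 1
        """
    ).fetchall()
    conn.close()
    return rows


print(run_figure1_query())
```

Note that the query never touches model_list or car_makers, even though both contain plausible matches for “model” and “cars”; this is the linking ambiguity the figure is meant to illustrate.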

schema linking difficult after both the column representations and question word representations are built. Second, it limits information propagation during schema encoding to the predefined graph of foreign key relations. The advent of self-attentional mechanisms in NLP (Vaswani et al., 2017) shows that global reasoning is crucial to effective representations of relational structures. However, we would like any global reasoning to still take into account the aforementioned schema relations.

In this work, we present a unified framework, called RAT-SQL,1 for encoding relational structure in the database schema and a given question. It uses relation-aware self-attention to combine global reasoning over the schema entities and question words with structured reasoning over predefined schema relations. We then apply RAT-SQL to the problems of schema encoding and schema linking. As a result, we obtain 57.2% exact match accuracy on the Spider test set. At the time of writing, this result is the state of the art among models unaugmented with pretrained BERT embeddings – and further reaches to the overall state of the art (65.6%) when RAT-SQL is augmented with BERT. In addition, we experimentally demonstrate that RAT-SQL enables the model to build more accurate internal representations of the question’s true alignment with schema columns and tables.

1 Relation-Aware Transformer.

2 Related Work

Semantic parsing of NL to SQL recently surged in popularity thanks to the creation of two new multi-table datasets with the challenge of schema generalization – WikiSQL (Zhong et al., 2017) and Spider (Yu et al., 2018b). Schema encoding is not as challenging in WikiSQL as in Spider because it lacks multi-table relations. Schema linking is relevant for both tasks but also more challenging in Spider due to the richer NL expressiveness and less restricted SQL grammar observed in it. The state of the art semantic parser on WikiSQL (He et al., 2019) achieves a test set accuracy of 91.8%, significantly higher than the state of the art on Spider.

The recent state-of-the-art models evaluated on Spider use various attentional architectures for question/schema encoding and AST-based structural architectures for query decoding. IRNet (Guo et al., 2019) encodes the question and schema separately with LSTM and self-attention respectively, augmenting them with custom type vectors for schema linking. They further use the AST-based decoder of Yin and Neubig (2017) to decode a query in an intermediate representation (IR) that exhibits higher-level abstractions than SQL. Bogin et al. (2019a) encode the schema with a GNN and a similar grammar-based decoder. Both works emphasize schema encoding and schema linking, but design separate featurization techniques to augment word vectors (as opposed to relations between words and columns) to resolve it. In contrast, the RAT-SQL framework provides a unified way to encode arbitrary relational information among inputs.

Concurrently with this work, Bogin et al. (2019b) published Global-GNN, a different approach to schema linking for Spider, which applies global reasoning between question words and schema columns/tables. Global reasoning is implemented by gating the GNN that encodes the schema using the question token representations. This differs from RAT-SQL in two important ways: (a) question word representations influence the schema representations but not vice versa, and (b) like in other GNN-based encoders, message propagation is limited to the schema-induced edges such as foreign key relations. In contrast, our relation-aware transformer mechanism allows encoding arbitrary relations between question words and schema elements explicitly, and these representations are computed jointly over all inputs using self-attention.

We use the same formulation of relation-aware self-attention as Shaw et al. (2018). However, they only apply it to sequences of words in the context of machine translation, and as such, their relation
types only encode the relative distance between two words. We extend their work and show that relation-aware self-attention can effectively encode more complex relationships within an unordered set of elements (in our case, columns and tables within a database schema as well as relations between the schema and the question). To the best of our knowledge, this is the first application of relation-aware self-attention to joint representation learning with both predefined and softly induced relations in the input structure. Hellendoorn et al. (2020) develop a similar model concurrently with this work, where they use relation-aware self-attention to encode data flow structure in source code embeddings.

Sun et al. (2018) use a heterogeneous graph of KB facts and relevant documents for open-domain question answering. The nodes of their graph are analogous to the database schema nodes in RAT-SQL, but RAT-SQL also incorporates the question in the same formalism to enable joint representation learning between the question and the schema.

3 Relation-Aware Self-Attention

First, we introduce relation-aware self-attention, a model for embedding semi-structured input sequences in a way that jointly encodes pre-existing relational structure in the input as well as induced “soft” relations between sequence elements in the same embedding. Our solutions to schema embedding and linking naturally arise as features implemented in this framework.

Consider a set of inputs $X = \{x_i\}_{i=1}^{n}$ where $x_i \in \mathbb{R}^{d_x}$. In general, we consider it an unordered set, although $x_i$ may be imbued with positional embeddings to add an explicit ordering relation. A self-attention encoder, or Transformer, introduced by Vaswani et al. (2017), is a stack of self-attention layers where each layer (consisting of $H$ heads) transforms each $x_i$ into $y_i \in \mathbb{R}^{d_x}$ as follows:

$$
\begin{aligned}
e_{ij}^{(h)} &= \frac{x_i W_Q^{(h)} \left(x_j W_K^{(h)}\right)^{\top}}{\sqrt{d_z / H}}; \qquad \alpha_{ij}^{(h)} = \operatorname{softmax}_j \left\{ e_{ij}^{(h)} \right\} \\
z_i^{(h)} &= \sum_{j=1}^{n} \alpha_{ij}^{(h)} \left(x_j W_V^{(h)}\right); \qquad z_i = \operatorname{Concat}\left(z_i^{(1)}, \cdots, z_i^{(H)}\right) \\
\tilde{y}_i &= \operatorname{LayerNorm}(x_i + z_i) \\
y_i &= \operatorname{LayerNorm}\left(\tilde{y}_i + \operatorname{FC}(\operatorname{ReLU}(\operatorname{FC}(\tilde{y}_i)))\right)
\end{aligned}
\tag{1}
$$

where FC is a fully-connected layer, LayerNorm is layer normalization (Ba et al., 2016), $1 \le h \le H$, and $W_Q^{(h)}, W_K^{(h)}, W_V^{(h)} \in \mathbb{R}^{d_x \times (d_x / H)}$.

One interpretation of the embeddings computed by a Transformer is that each head of each layer computes a learned relation between all the input elements $x_i$, and the strength of this relation is encoded in the attention weights $\alpha_{ij}^{(h)}$. However, in many applications (including text-to-SQL parsing) we are aware of some preexisting relational features between the inputs, and would like to bias our encoder model toward them. This is straightforward for non-relational features (represented directly in each $x_i$). We could limit the attention computation only to the “hard” edges where the preexisting relations are known to hold. This would make the model similar to a graph attention network (Veličković et al., 2018), and would also impede the Transformer’s ability to learn new relations. Instead, RAT provides a way to communicate known relations to the encoder by adding their representations to the attention mechanism.

Shaw et al. (2018) describe a way to represent relative position information in a self-attention layer by changing Equation (1) as follows:

$$
\begin{aligned}
e_{ij}^{(h)} &= \frac{x_i W_Q^{(h)} \left(x_j W_K^{(h)} + r_{ij}^{K}\right)^{\top}}{\sqrt{d_z / H}} \\
z_i^{(h)} &= \sum_{j=1}^{n} \alpha_{ij}^{(h)} \left(x_j W_V^{(h)} + r_{ij}^{V}\right).
\end{aligned}
\tag{2}
$$

Here the $r_{ij}$ terms encode the known relationship between the two elements $x_i$ and $x_j$ in the input. While Shaw et al. used it exclusively for relative position representation, we show how to use the same framework to effectively bias the Transformer toward arbitrary relational information.

Consider $R$ relational features, each a binary relation $\mathcal{R}^{(s)} \subseteq X \times X$ ($1 \le s \le R$). The RAT framework represents all the pre-existing features for each edge $(i, j)$ as $r_{ij}^{K} = r_{ij}^{V} = \operatorname{Concat}\left(\rho_{ij}^{(1)}, \ldots, \rho_{ij}^{(R)}\right)$ where each $\rho_{ij}^{(s)}$ is either a learned embedding for the relation $\mathcal{R}^{(s)}$ if the relation holds for the corresponding edge (i.e. if $(i, j) \in \mathcal{R}^{(s)}$), or a zero vector of appropriate size. In the following section, we will describe the set of relations our RAT-SQL model uses to encode a given database schema.

4 RAT-SQL

We now describe the RAT-SQL framework and its application to the problems of schema encoding and linking. First, we formally define the text-to-SQL semantic parsing problem and its components. In the rest of the section, we present our implementation of schema linking in the RAT framework.

Type of x   Type of y   Edge label           Description
Column      Column      SAME-TABLE           x and y belong to the same table.
Column      Column      FOREIGN-KEY-COL-F    x is a foreign key for y.
Column      Column      FOREIGN-KEY-COL-R    y is a foreign key for x.
Column      Table       PRIMARY-KEY-F        x is the primary key of y.
Column      Table       BELONGS-TO-F         x is a column of y (but not the primary key).
Table       Column      PRIMARY-KEY-R        y is the primary key of x.
Table       Column      BELONGS-TO-R         y is a column of x (but not the primary key).
Table       Table       FOREIGN-KEY-TAB-F    Table x has a foreign key column in y.
Table       Table       FOREIGN-KEY-TAB-R    Same as above, but x and y are reversed.
Table       Table       FOREIGN-KEY-TAB-B    x and y have foreign keys in both directions.

Table 1: Description of edge types present in the directed graph G created to represent the schema. An edge exists from source node x ∈ S to target node y ∈ S if the pair fulfills one of the descriptions listed in the table, with the corresponding label. Otherwise, no edge exists from x to y.
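
The edge types of Table 1 can be derived mechanically from a schema description. The sketch below is one possible reading, with several assumptions that are not in the paper: the input format (`tables`, `primary_keys`, `foreign_keys`) and the function name are hypothetical, a foreign-key label takes precedence over SAME-TABLE when both apply, self-edges are skipped, and FOREIGN-KEY-TAB-F is interpreted as "x contains a column that references a column of y".

```python
def schema_edges(tables, primary_keys, foreign_keys):
    """Label directed schema edges following Table 1.

    tables:       {table_name: [column_name, ...]}
    primary_keys: {table_name: column_name}
    foreign_keys: [((table, column), (ref_table, ref_column)), ...]
    Returns {(source_node, target_node): label}; absent pairs have no edge.
    """
    edges = {}
    cols = [(t, c) for t, cs in tables.items() for c in cs]
    fk = set(foreign_keys)

    # Column -> Column edges.
    for x in cols:
        for y in cols:
            if x == y:
                continue
            if (x, y) in fk:
                edges[("col", x), ("col", y)] = "FOREIGN-KEY-COL-F"
            elif (y, x) in fk:
                edges[("col", x), ("col", y)] = "FOREIGN-KEY-COL-R"
            elif x[0] == y[0]:
                edges[("col", x), ("col", y)] = "SAME-TABLE"

    # Column -> Table and Table -> Column edges.
    for t, c in cols:
        is_pk = primary_keys.get(t) == c
        edges[("col", (t, c)), ("tab", t)] = "PRIMARY-KEY-F" if is_pk else "BELONGS-TO-F"
        edges[("tab", t), ("col", (t, c))] = "PRIMARY-KEY-R" if is_pk else "BELONGS-TO-R"

    # Table -> Table edges, derived from the tables on each side of a foreign key.
    links = {(src[0], dst[0]) for src, dst in fk}
    for x in tables:
        for y in tables:
            if x == y:
                continue
            forward, backward = (x, y) in links, (y, x) in links
            if forward and backward:
                edges[("tab", x), ("tab", y)] = "FOREIGN-KEY-TAB-B"
            elif forward:
                edges[("tab", x), ("tab", y)] = "FOREIGN-KEY-TAB-F"
            elif backward:
                edges[("tab", x), ("tab", y)] = "FOREIGN-KEY-TAB-R"
    return edges


# Hypothetical two-table schema, loosely inspired by Figure 2.
toy_tables = {"airlines": ["id", "name"], "flights": ["airline", "flight_number"]}
toy_pks = {"airlines": "id"}
toy_fks = [(("flights", "airline"), ("airlines", "id"))]
toy_edges = schema_edges(toy_tables, toy_pks, toy_fks)
print(toy_edges[("tab", "flights"), ("tab", "airlines")])
```

Each label would then index a learned embedding that is concatenated into the edge vectors $r_{ij}^{K}$ and $r_{ij}^{V}$ of Section 3.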
Figure 2: An illustration of an example schema as a graph G. We do not depict all the edges and label types of Table 1 to reduce clutter.

Figure 3: One RAT layer in the schema encoder.
4.1 Problem Definition

Given a natural language question $Q$ and a schema $S = \langle C, T \rangle$ for a relational database, our goal is to generate the corresponding SQL $P$. Here the question $Q = q_1 \ldots q_{|Q|}$ is a sequence of words, and the schema consists of columns $C = \{c_1, \ldots, c_{|C|}\}$ and tables $T = \{t_1, \ldots, t_{|T|}\}$. Each column name $c_i$ contains words $c_{i,1}, \ldots, c_{i,|c_i|}$ and each table name $t_i$ contains words $t_{i,1}, \ldots, t_{i,|t_i|}$. The desired program $P$ is represented as an abstract syntax tree $T$ in the context-free grammar of SQL.

Some columns in the schema are primary keys, used for uniquely indexing the corresponding table, and some are foreign keys, used to reference a primary key column in a different table. In addition, each column has a type $\tau \in \{\text{number}, \text{text}\}$.

Formally, we represent the database schema as a directed graph $G = \langle V, E \rangle$. Its nodes $V = C \cup T$ are the columns and tables of the schema, each labeled with the words in its name (for columns, we prepend their type $\tau$ to the label). Its edges $E$ are defined by the pre-existing database relations, described in Table 1. Figure 2 illustrates an example graph (with a subset of actual edges and labels).

While $G$ holds all the known information about the schema, it is insufficient for appropriately encoding a previously unseen schema in the context of the question $Q$. We would like our representations of the schema $S$ and the question $Q$ to be joint, in particular for modeling the alignment between them. Thus, we also define the question-contextualized schema graph $G_Q = \langle V_Q, E_Q \rangle$ where $V_Q = V \cup Q = C \cup T \cup Q$ includes nodes for the question words (each labeled with a corresponding word), and $E_Q = E \cup E_{Q \leftrightarrow S}$ are the schema edges $E$ extended with additional special relations between the question words and schema members, detailed in the rest of this section.

For modeling text-to-SQL generation, we adopt the encoder-decoder framework. Given the input as a graph $G_Q$, the encoder $f_{\text{enc}}$ embeds it into joint representations $c_i$, $t_i$, $q_i$ for each column $c_i \in C$, table $t_i \in T$, and question word $q_i \in Q$ respectively. The decoder $f_{\text{dec}}$ then uses them to compute a distribution $\Pr(P \mid G_Q)$ over the SQL programs.

4.2 Relation-Aware Input Encoding

Following the state-of-the-art NLP literature, our encoder first obtains the initial representations $c_i^{\text{init}}$, $t_i^{\text{init}}$ for every node of $G$ by (a) retrieving a pre-trained GloVe embedding (Pennington et al., 2014) for each word, and (b) processing the embeddings in each multi-word label with a bidirectional LSTM (BiLSTM) (Hochreiter and Schmidhuber, 1997). It also runs a separate BiLSTM over the question $Q$ to obtain initial word representations $q_i^{\text{init}}$.

The initial representations $c_i^{\text{init}}$, $t_i^{\text{init}}$, and $q_i^{\text{init}}$ are independent of each other and devoid of any relational information known to hold in $E_Q$. To produce joint representations for the entire input graph $G_Q$, we use the relation-aware self-attention mechanism (Section 3). Its input $X$ is the set of all the node representations in $G_Q$:

$$X = \left(c_1^{\text{init}}, \cdots, c_{|C|}^{\text{init}}, t_1^{\text{init}}, \cdots, t_{|T|}^{\text{init}}, q_1^{\text{init}}, \cdots, q_{|Q|}^{\text{init}}\right).$$

The encoder $f_{\text{enc}}$ applies a stack of $N$ relation-aware self-attention layers to $X$, with separate weight matrices in each layer. The final representations $c_i$, $t_i$, $q_i$ produced by the $N$-th layer constitute the output of the whole encoder.

Alternatively, we also consider pre-trained BERT (Devlin et al., 2019) embeddings to obtain the initial representations. Following (Huang et al., 2019; Zhang et al., 2019), we feed $X$ to the BERT and use the last hidden states as the initial representations before proceeding with the RAT layers.2

2 In this case, the initial representations $c_i^{\text{init}}$, $t_i^{\text{init}}$, $q_i^{\text{init}}$ are not strictly independent although still yet uninfluenced by $E$.

Importantly, as detailed in Section 3, every RAT layer uses self-attention between all elements of the input graph $G_Q$ to compute new contextual representations of question words and schema members. However, this self-attention is biased toward some pre-defined relations using the edge vectors $r_{ij}^{K}$, $r_{ij}^{V}$ in each layer. We define the set of used relation types in a way that directly addresses the challenges of schema embedding and linking. Occurrences of these relations between the question and the schema constitute the edges $E_{Q \leftrightarrow S}$. Most of these relation types address schema linking (Section 4.3); we also add some auxiliary edges to aid schema encoding (see Appendix A).

4.3 Schema Linking

Schema linking relations in $E_{Q \leftrightarrow S}$ aid the model with aligning column/table references in the question to the corresponding schema columns/tables. This alignment is implicitly defined by two kinds of information in the input: matching names and matching values, which we detail in order below.

Name-Based Linking Name-based linking refers to exact or partial occurrences of the column/table names in the question, such as the occurrences of “cylinders” and “cars” in the question in Figure 1. Textual matches are the most explicit evidence of question-schema alignment and as such, one might expect them to be directly beneficial to the encoder. However, in all our experiments the representations produced by vanilla self-attention were insensitive to textual matches even though their initial representations were identical. Brunner et al. (2020) suggest that representations produced by Transformers mix the information from different positions and cease to be directly interpretable after 2+ layers, which might explain our observations. Thus, to remedy this phenomenon, we explicitly encode name-based linking using RAT relations.

Specifically, for all n-grams of length 1 to 5 in the question, we determine (1) whether it exactly matches the name of a column/table (exact match); or (2) whether the n-gram is a subsequence of the name of a column/table (partial match).3 Then, for every $(i, j)$ where $x_i \in Q$, $x_j \in S$ (or vice versa), we set $r_{ij} \in E_{Q \leftrightarrow S}$ to QUESTION-COLUMN-M, QUESTION-TABLE-M, COLUMN-QUESTION-M or TABLE-QUESTION-M depending on the type of $x_i$ and $x_j$. Here M is one of EXACTMATCH, PARTIALMATCH, or NOMATCH.

3 This procedure matches that of Guo et al. (2019), but we use the matching information differently in RAT.

Value-Based Linking Question-schema alignment also occurs when the question mentions any values that occur in the database and consequently participate in the desired SQL, such as “4” in Figure 1. While this example makes the alignment explicit by mentioning the column name “cylinders”, many real-world questions do not. Thus, linking a value to the corresponding column requires background knowledge.

The database itself is the most comprehensive and readily available source of knowledge about possible values, but also the most challenging to process in an end-to-end model because of the privacy and speed impact. However, the RAT framework allows us to outsource this processing to the database engine to augment $G_Q$ with potential value-based linking without exposing the model itself to the data. Specifically, we add a new COLUMN-VALUE relation between any word $q_i$ and column name $c_j$ s.t. $q_i$ occurs as a value

(or a full word within a value) of $c_j$. This simple approach drastically improves the performance of RAT-SQL (see Section 5). It also directly addresses the aforementioned DB challenges: (a) the model is never exposed to database content that does not occur in the question, (b) word matches are retrieved quickly via DB indices & textual search.

Memory-Schema Alignment Matrix Our intuition suggests that the columns and tables which occur in the SQL $P$ will generally have a corresponding reference in the natural language question. To capture this intuition in the model, we apply relation-aware attention as a pointer mechanism between every memory element in $y$ and all the columns/tables to compute explicit alignment matrices $L^{\text{col}} \in \mathbb{R}^{|y| \times |C|}$ and $L^{\text{tab}} \in \mathbb{R}^{|y| \times |T|}$:

$$
\begin{aligned}
\tilde{L}^{\text{col}}_{i,j} &= \frac{y_i W_Q^{\text{col}} \left(c_j^{\text{final}} W_K^{\text{col}} + r_{ij}^{K}\right)^{\top}}{\sqrt{d_x}} \\
\tilde{L}^{\text{tab}}_{i,j} &= \frac{y_i W_Q^{\text{tab}} \left(t_j^{\text{final}} W_K^{\text{tab}} + r_{ij}^{K}\right)^{\top}}{\sqrt{d_x}} \\
L^{\text{col}}_{i,j} &= \operatorname{softmax}_j \left\{ \tilde{L}^{\text{col}}_{i,j} \right\} \qquad L^{\text{tab}}_{i,j} = \operatorname{softmax}_j \left\{ \tilde{L}^{\text{tab}}_{i,j} \right\}
\end{aligned}
\tag{3}
$$

Intuitively, the alignment matrices in Eq. (3) should resemble the real discrete alignments, therefore should respect certain constraints like sparsity. When the encoder is sufficiently parameterized, sparsity tends to arise with learning, but we can also encourage it with an explicit objective. Appendix B presents this objective and discusses our experiments with sparse alignment in RAT-SQL.

4.4 Decoder

The decoder $f_{\text{dec}}$ of RAT-SQL follows the tree-structured architecture of Yin and Neubig (2017). It generates the SQL $P$ as an abstract syntax tree in depth-first traversal order, by using an LSTM to output a sequence of decoder actions that either (i) expand the last generated node into a grammar rule, called APPLYRULE; or when completing a leaf node, (ii) choose a column/table from the schema, called SELECTCOLUMN and SELECTTABLE. Formally, $\Pr(P \mid Y) = \prod_t \Pr(a_t \mid a_{<t}, Y)$ where $Y = f_{\text{enc}}(G_Q)$ is the final encoding of the question and schema, and $a_{<t}$ are all the previous actions. In a tree-structured decoder, the LSTM state is updated as $m_t, h_t = f_{\text{LSTM}}([a_{t-1} \,\|\, z_t \,\|\, h_{p_t} \,\|\, a_{p_t} \,\|\, n_{f_t}], m_{t-1}, h_{t-1})$ where $m_t$ is the LSTM cell state, $h_t$ is the LSTM output at step $t$, $a_{t-1}$ is the embedding of the previous action, $p_t$ is the step corresponding to expanding the parent AST node of the current node, and $n_{f_t}$ is the embedding of the current node type. Finally, $z_t$ is the context representation, computed using multi-head attention (with 8 heads) on $h_{t-1}$ over $Y$.

For APPLYRULE[R], we compute $\Pr(a_t = \text{APPLYRULE}[R] \mid a_{<t}, y) = \operatorname{softmax}_R(g(h_t))$ where $g(\cdot)$ is a 2-layer MLP with a tanh non-linearity. For SELECTCOLUMN, we compute

$$
\begin{aligned}
\tilde{\lambda}_i &= \frac{h_t W_Q^{\text{sc}} \left(y_i W_K^{\text{sc}}\right)^{\top}}{\sqrt{d_x}} \qquad \lambda_i = \operatorname{softmax}_i \left\{ \tilde{\lambda}_i \right\} \\
\Pr(a_t = \text{SELECTCOLUMN}[i] \mid a_{<t}, y) &= \sum_{j=1}^{|y|} \lambda_j L^{\text{col}}_{j,i}
\end{aligned}
$$

and similarly for SELECTTABLE. We refer the reader to Yin and Neubig (2017) for details.

Figure 4: Choosing a column in a tree decoder.

5 Experiments

We implemented RAT-SQL in PyTorch (Paszke et al., 2017). During preprocessing, the input of questions, column names and table names are tokenized and lemmatized with the StanfordNLP toolkit (Manning et al., 2014). Within the encoder, we use GloVe (Pennington et al., 2014) word embeddings, held fixed in training except for the 50 most common words in the training set. For RAT-SQL + BERT, we use the WordPiece tokenization. All word embeddings have dimension 300. The bidirectional LSTMs have hidden size 128 per direction, and use the recurrent dropout method of Gal and Ghahramani (2016) with rate 0.2. We stack 8 relation-aware self-attention layers on top of the bidirectional LSTMs. Within them, we set $d_x = d_z = 256$, $H = 8$, and use dropout with rate 0.1. The position-wise feed-forward network has inner layer dimension 1024. Inside the decoder, we use rule embeddings of size 128, node type embeddings of size 64, and a hidden size of 512 inside the LSTM with dropout of 0.21.

Model                                      Dev    Test
IRNet (Guo et al., 2019)                   53.2   46.7
Global-GNN (Bogin et al., 2019b)           52.7   47.4
IRNet V2 (Guo et al., 2019)                55.4   48.5
RAT-SQL (ours)                             62.7   57.2
With BERT:
EditSQL + BERT (Zhang et al., 2019)        57.6   53.4
GNN + Bertrand-DR (Kelkar et al., 2020)    57.9   54.6
IRNet V2 + BERT (Guo et al., 2019)         63.9   55.0
RYANSQL V2 + BERT (Choi et al., 2020)      70.6   60.6
RAT-SQL + BERT (ours)                      69.7   65.6

Table 2: Accuracy on the Spider development and test sets, compared to the other approaches at the top of the dataset leaderboard as of May 1st, 2020. The test set results were scored using the Spider evaluation server.

Split            Easy   Medium   Hard   Extra Hard   All
RAT-SQL
  Dev            80.4   63.9     55.7   40.6         62.7
  Test           74.8   60.7     53.6   31.5         57.2
RAT-SQL + BERT
  Dev            86.4   73.6     62.1   42.9         69.7
  Test           83.0   71.3     58.3   38.4         65.6

Table 3: Accuracy on the Spider development and test sets, by difficulty as defined by Yu et al. (2018b).

Model                              Accuracy (%)
RAT-SQL + value-based linking      60.54 ± 0.80
RAT-SQL                            55.13 ± 0.84
  w/o schema linking relations     40.37 ± 2.32
  w/o schema graph relations       35.59 ± 0.85

Table 4: Accuracy (and ±95% confidence interval) of RAT-SQL ablations on the dev set.

We used the Adam optimizer (Kingma and Ba, 2015) with the default hyperparameters. During the first warmup_steps = max_steps/20 steps of training, the learning rate linearly increases from 0 to $7.4 \times 10^{-4}$. Afterwards, it is annealed to 0 with $7.4 \times 10^{-4} \left(1 - \frac{\text{step} - \text{warmup\_steps}}{\text{max\_steps} - \text{warmup\_steps}}\right)^{-0.5}$. We use a batch size of 20 and train for up to 40,000 steps. For RAT-SQL + BERT, we use a separate learning rate of $3 \times 10^{-6}$ to fine-tune BERT, a batch size of 24 and train for up to 90,000 steps.

Hyperparameter Search We tuned the batch size (20, 50, 80), number of RAT layers (4, 6, 8), dropout (uniformly sampled from [0.1, 0.3]), hidden size of decoder RNN (256, 512), max learning rate (log-uniformly sampled from $[5 \times 10^{-4}, 2 \times 10^{-3}]$). We randomly sampled 100 configurations and optimized on the dev set. RAT-SQL + BERT reuses most hyperparameters of RAT-SQL, only tuning the BERT learning rate ($1 \times 10^{-4}$, $3 \times 10^{-4}$, $5 \times 10^{-4}$), number of RAT layers (6, 8, 10), number of training steps ($4 \times 10^{4}$, $6 \times 10^{4}$, $9 \times 10^{4}$).

5.1 Datasets and Metrics

We use the Spider dataset (Yu et al., 2018b) for most of our experiments, and also conduct preliminary experiments on WikiSQL (Zhong et al., 2017) to confirm generalization to other datasets. As described by Yu et al., Spider contains 8,659 examples (questions and SQL queries, with the accompanying schemas), including 1,659 examples lifted from the Restaurants (Popescu et al., 2003; Tang and Mooney, 2000), GeoQuery (Zelle and Mooney, 1996), Scholar (Iyer et al., 2017), Academic (Li and Jagadish, 2014), Yelp and IMDB (Yaghmazadeh et al., 2017) datasets.

As Yu et al. (2018b) make the test set accessible only through an evaluation server, we perform most evaluations (other than the final accuracy measurement) using the development set. It contains 1,034 examples, with databases and schemas distinct from those in the training set. We report results using the same metrics as Yu et al. (2018a): exact match accuracy on all examples, as well as divided by difficulty levels. As in previous work on Spider, these metrics do not measure the model’s performance on generating values in the SQL.

5.2 Spider Results

In Table 2 we show accuracy on the (hidden) Spider test set for RAT-SQL and compare to all other approaches at or near state-of-the-art (according to the official leaderboard). RAT-SQL outperforms all other methods that are not augmented with BERT embeddings by a large margin of 8.7%. Surprisingly, it even beats other BERT-augmented models. When RAT-SQL is further augmented with BERT, it achieves the new state-of-the-art performance. Compared with other BERT-augmented models, our RAT-SQL + BERT has a smaller generalization gap between the development and test sets.

We also provide a breakdown of the accuracy by difficulty in Table 3. As expected, performance drops with increasing difficulty. The overall generalization gap between development and test of RAT-SQL was strongly affected by the significant drop in accuracy (9%) on the extra hard questions. When RAT-SQL is augmented with BERT, the generalization gaps of most difficulties are reduced.

Ablation Study Table 4 shows an ablation study over different RAT-based relations. The ablations

7573
Figure 5: Alignment between the question “For the cars with 4 cylinders, which model has the largest horsepower”
and the database car_1 schema (columns and tables) depicted in Figure 1.

are run on RAT-SQL without value-based linking to avoid interference with information from the
database. Schema linking and graph relations make statistically significant improvements
(p < 0.001). The full model accuracy here slightly differs from Table 2 because the latter shows
the best model from a hyper-parameter sweep (used for test evaluation) and the former gives the
mean over five runs where we only change the random seeds.

5.3 WikiSQL Results

We also conducted preliminary experiments on WikiSQL (Zhong et al., 2017) to test generalization
of RAT-SQL to new datasets. Although WikiSQL lacks multi-table schemas (and thus, its challenge
of schema encoding is not as prominent), it still presents the challenges of schema linking and
generalization to new schemas. For simplicity of experiments, we did not implement either BERT
augmentation or execution-guided decoding (EG) (Wang et al., 2018), both of which are common in
state-of-the-art WikiSQL models. We thus only compare to the models that also lack these two
enhancements.

While not reaching state of the art, RAT-SQL still achieves competitive performance on WikiSQL
as shown in Table 5. Most of the gap between its accuracy and state of the art is due to the
simplified implementation of value decoding, which is required for WikiSQL evaluation but not in
Spider. Our value decoding for these experiments is a simple token-based pointer mechanism, which
often fails to retrieve multi-token value constants accurately. A robust value decoding mechanism
in RAT-SQL is an important extension that we plan to address outside the scope of this work.

5.4 Discussions

Alignment. Recall from Section 4 that we explicitly model the alignment matrix between question
words and table columns, used during decoding for column and table selection. The existence of
the alignment matrix provides a mechanism for the model to align words to columns. An accurate
alignment representation has other benefits such as identifying question words to copy to emit a
constant value in SQL.

In Figure 5 we show the alignment generated by our model on the example from Figure 1.⁴ For the
three words that reference columns (“cylinders”, “model”, “horsepower”), the alignment matrix
correctly identifies their corresponding columns. The alignments of other words are strongly
affected by these three keywords, resulting in a sparse, span-to-column-like alignment, e.g.
“largest horsepower” to horsepower. The tables cars_data and car_names are implicitly mentioned
by the word “cars”. The alignment matrix successfully infers that these two tables should be used
instead of car_makers, using the evidence that they contain the three mentioned columns.

⁴ The full alignment also maps from column and table names, but those end up simply aligning to
themselves or the table they belong to, so we omit them for brevity.

The Need for Schema Linking. One natural question is how often the decoder fails to select the
correct column, even with the schema encoding and linking improvements we have made. To

                                        Dev                   Test
Model                                   LF Acc%   Ex. Acc%    LF Acc%   Ex. Acc%
IncSQL (Shi et al., 2018)               49.9      84.0        49.9      83.7
MQAN (McCann et al., 2018)              76.1      82.0        75.4      81.4
RAT-SQL (ours)                          73.6      79.5        73.3      78.8
Coarse2Fine (Dong and Lapata, 2018)     72.5      79.0        71.7      78.5
PT-MAML (Huang et al., 2018)            63.1      68.3        62.8      68.0

Table 5: RAT-SQL accuracy on WikiSQL, trained without BERT augmentation or execution-guided decoding (EG).
Compared to other approaches without EG. “LF Acc” = Logical Form Accuracy; “Ex. Acc” = Execution Accuracy.
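
The gap between the two metrics in Table 5 can be illustrated with a minimal sketch. It assumes a toy in-memory SQLite table (the table contents and the helper name `execute` are hypothetical) and uses plain string equality as a stand-in for a form-based comparison; the actual WikiSQL and Spider evaluators compare parsed query structure, not raw strings.

```python
import sqlite3


def execute(conn, sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Toy table with made-up values (hypothetical data, no NULLs).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars_data (id INTEGER, horsepower INTEGER)")
conn.executemany(
    "INSERT INTO cars_data VALUES (?, ?)",
    [(1, 130), (2, 95), (3, 165)],
)

gold = "SELECT MIN(horsepower) FROM cars_data"
pred = "SELECT horsepower FROM cars_data ORDER BY horsepower LIMIT 1"

# Form-based comparison (simplified here to string equality) rejects the prediction.
form_match = gold == pred
# Execution-based comparison accepts it, because both queries return the same rows.
execution_match = execute(conn, gold) == execute(conn, pred)

print("form match:", form_match)
print("execution match:", execution_match)
```

On this toy table the two queries differ in form but agree in execution, which is why a model can score higher on execution accuracy than on logical form accuracy.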

answer this, we conducted an oracle experiment (see Table 6). For “oracle sketch”, at every
grammar nonterminal the decoder is forced to choose the correct production so the final SQL
sketch exactly matches that of the ground truth. The rest of the decoding proceeds conditioned on
that choice. Likewise, “oracle columns” forces the decoder to emit the correct column/table at
terminal nodes. With both oracles, we see an accuracy of 99.4%, which just verifies that our
grammar is sufficient to answer nearly every question in the data set. With just “oracle sketch”,
the accuracy is only 73.0%, which means 72.4% of the questions that RAT-SQL gets wrong and could
get right have incorrect column or table selection. Similarly, with just “oracle columns”, the
accuracy is 69.8%, which means that 81.0% of the questions that RAT-SQL gets wrong have incorrect
structure. In other words, most questions have both column and structure wrong, so both problems
require important future work.

Model                                         Acc.
RAT-SQL                                       62.7
RAT-SQL + Oracle columns                      69.8
RAT-SQL + Oracle sketch                       73.0
RAT-SQL + Oracle sketch + Oracle columns      99.4

Table 6: Accuracy (exact match %) on the development set given an oracle providing correct
columns and tables (“Oracle columns”) and/or the AST sketch structure (“Oracle sketch”).

Error Analysis. An analysis of mispredicted SQL queries in the Spider dev set showed three main
causes of evaluation errors. (I) 18% of the mispredicted queries are in fact equivalent
implementations of the NL intent with a different SQL syntax (e.g. ORDER BY C LIMIT 1 vs.
SELECT MIN(C)). Measuring execution accuracy rather than exact match would detect them as valid.
(II) 39% of errors involve a wrong, missing, or extraneous column in the SELECT clause. This is a
limitation of our schema linking mechanism, which, while substantially improving column
resolution, still struggles with some ambiguous references. Some of them are unavoidable as
Spider questions do not always specify which columns should be returned by the desired SQL.
Finally, (III) 29% of errors are missing a WHERE clause, which is a common error class in
text-to-SQL models as reported by prior work. One common example is domain-specific phrasing such
as “older than 21”, which requires background knowledge to map it to age > 21 rather than
age < 21. Such errors disappear after in-domain fine-tuning.

6 Conclusion

Despite active research in text-to-SQL parsing, many contemporary models struggle to learn good
representations for a given database schema as well as to properly link column/table references
in the question. These problems are related: to encode and use columns/tables from the schema,
the model must reason about their role in the context of the question. In this work, we present
a unified framework for addressing the schema encoding and linking challenges. Thanks to
relation-aware self-attention, it jointly learns schema and question representations based on
their alignment with each other and schema relations.

Empirically, the RAT framework allows us to gain a significant improvement over the state of the
art on text-to-SQL parsing. Qualitatively, it provides a way to combine predefined hard schema
relations and inferred soft self-attended relations in the same encoder architecture. This
representation learning will be beneficial in tasks beyond text-to-SQL, as long as the input has
some predefined structure.

Acknowledgments

We thank Jianfeng Gao, Vladlen Koltun, Chris Meek, and Vignesh Shiv for the discussions that
helped shape this work. We thank Bo Pang and Tao Yu for their help with the evaluation. We also
thank anonymous reviewers for their invaluable feedback.
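
As a worked check of the oracle experiment in Section 5.4, the two error shares quoted there (72.4% and 81.0%) can be recomputed from the Table 6 accuracies. The formula below is an assumption inferred from the reported numbers, not one stated in the text: the share of the base model's errors that still remain when a single oracle is supplied. The function and dictionary names are hypothetical.

```python
# Exact-match accuracies (%) copied from Table 6.
ACC = {
    "base": 62.7,
    "oracle_columns": 69.8,
    "oracle_sketch": 73.0,
    "both": 99.4,
}


def share_of_errors_remaining(acc_with_oracle, acc_base):
    """Percentage of the base model's errors that persist under the oracle.

    Assumed reading: errors left with the oracle divided by errors without it.
    """
    return 100.0 * (100.0 - acc_with_oracle) / (100.0 - acc_base)


# With the sketch fixed, the remaining errors are attributed to column/table selection.
column_share = share_of_errors_remaining(ACC["oracle_sketch"], ACC["base"])
# With the columns fixed, the remaining errors are attributed to query structure.
structure_share = share_of_errors_remaining(ACC["oracle_columns"], ACC["base"])

print(round(column_share, 1), round(structure_share, 1))
```

Under this reading the two shares sum to more than 100%, which is consistent with the observation that most mispredicted questions have both the columns and the structure wrong.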

References

Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton. 2016. Layer Normalization.
arXiv:1607.06450.

Ben Bogin, Jonathan Berant, and Matt Gardner. 2019a. Representing schema structure with graph
neural networks for text-to-SQL parsing. In Proceedings of the 57th Annual Meeting of the
Association for Computational Linguistics, pages 4560–4565.

Ben Bogin, Matt Gardner, and Jonathan Berant. 2019b. Global reasoning over database structures
for text-to-SQL parsing. In Proceedings of the 2019 Conference on Empirical Methods in Natural
Language Processing and the 9th International Joint Conference on Natural Language Processing
(EMNLP-IJCNLP), pages 3657–3662.

Gino Brunner, Yang Liu, Damian Pascual, Oliver Richter, Massimiliano Ciaramita, and Roger
Wattenhofer. 2020. On identifiability in Transformers. In International Conference on Learning
Representations.

DongHyun Choi, Myeong Cheol Shin, EungGyun Kim, and Dong Ryeol Shin. 2020. RYANSQL: Recursively
applying sketch-based slot fillings for complex text-to-SQL in cross-domain databases. arXiv
preprint arXiv:2004.03125.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of
deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference
of the North American Chapter of the Association for Computational Linguistics: Human Language
Technologies, Volume 1 (Long and Short Papers), pages 4171–4186.

Li Dong and Mirella Lapata. 2018. Coarse-to-fine decoding for neural semantic parsing. In
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume
1: Long Papers), pages 731–742, Melbourne, Australia. Association for Computational Linguistics.

Catherine Finegan-Dollak, Jonathan K. Kummerfeld, Li Zhang, Karthik Ramanathan, Sesh Sadasivam,
Rui Zhang, and Dragomir Radev. 2018. Improving Text-to-SQL Evaluation Methodology. In Proceedings
of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long
Papers), pages 351–360.

Yarin Gal and Zoubin Ghahramani. 2016. A Theoretically Grounded Application of Dropout in
Recurrent Neural Networks. In Advances in Neural Information Processing Systems 29, pages
1019–1027.

Jiaqi Guo, Zecheng Zhan, Yan Gao, Yan Xiao, Jian-Guang Lou, Ting Liu, and Dongmei Zhang. 2019.
Towards complex text-to-SQL in cross-domain database with intermediate representation. In
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages
4524–4535.

Pengcheng He, Yi Mao, Kaushik Chakrabarti, and Weizhu Chen. 2019. X-SQL: reinforce schema
representation with context. arXiv preprint arXiv:1908.08113.

Vincent J. Hellendoorn, Charles Sutton, Rishabh Singh, Petros Maniatis, and David Bieber. 2020.
Global relational models of source code. In International Conference on Learning Representations.

Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural computation,
9(8):1735–1780.

Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Ian Simon, Curtis Hawthorne, Noam Shazeer,
Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, and Douglas Eck. 2019. Music Transformer.
In International Conference on Learning Representations.

Po-Sen Huang, Chenglong Wang, Rishabh Singh, Wen-tau Yih, and Xiaodong He. 2018. Natural language
to structured query generation via meta-learning. In Proceedings of the 2018 Conference of the
North American Chapter of the Association for Computational Linguistics: Human Language
Technologies, Volume 2 (Short Papers), pages 732–738.

Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Jayant Krishnamurthy, and Luke Zettlemoyer. 2017.
Learning a neural semantic parser from user feedback. In Proceedings of the 55th Annual Meeting
of the Association for Computational Linguistics (Volume 1: Long Papers), pages 963–973.

Amol Kelkar, Rohan Relan, Vaishali Bhardwaj, Saurabh Vaichal, and Peter Relan. 2020. Bertrand-DR:
Improving text-to-SQL using a discriminative re-ranker. arXiv preprint arXiv:2002.00557.

Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In
International Conference on Learning Representations.

Fei Li and H. V. Jagadish. 2014. Constructing an interactive natural language interface for
relational databases. Proceedings of the VLDB Endowment, 8(1):73–84.

Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David
McClosky. 2014. The Stanford CoreNLP natural language processing toolkit. In Association for
Computational Linguistics (ACL) System Demonstrations, pages 55–60.

Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, and Richard Socher. 2018. The natural
language decathlon: Multitask learning as question answering. arXiv preprint arXiv:1806.08730.

Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming
Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in PyTorch.

Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global vectors for word
representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language
Processing (EMNLP), pages 1532–1543, Doha, Qatar. Association for Computational Linguistics.

Ana-Maria Popescu, Oren Etzioni, and Henry Kautz. 2003. Towards a theory of natural language
interfaces to databases. In Proceedings of the 8th International Conference on Intelligent User
Interfaces, pages 149–157.

Peter Shaw, Jakob Uszkoreit, and Ashish Vaswani. 2018. Self-Attention with Relative Position
Representations. In Proceedings of the 2018 Conference of the North American Chapter of the
Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers),
pages 464–468.

Tianze Shi, Kedar Tatwawadi, Kaushik Chakrabarti, Yi Mao, Oleksandr Polozov, and Weizhu Chen.
2018. IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles.
arXiv:1809.05054 [cs].

Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Kathryn Mazaitis, Ruslan Salakhutdinov, and William
Cohen. 2018. Open domain question answering using early fusion of knowledge bases and text. In
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages
4231–4242, Brussels, Belgium. Association for Computational Linguistics.

Lappoon R. Tang and Raymond J. Mooney. 2000. Automated construction of database interfaces:
Integrating statistical and relational learning for semantic parsing. In 2000 Joint SIGDAT
Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, pages
133–141.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz
Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information
Processing Systems 30, pages 5998–6008.

Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua
Bengio. 2018. Graph Attention Networks. International Conference on Learning Representations.

Chenglong Wang, Kedar Tatwawadi, Marc Brockschmidt, Po-Sen Huang, Yi Mao, Oleksandr Polozov, and
Rishabh Singh. 2018. Robust Text-to-SQL Generation with Execution-Guided Decoding.
arXiv:1807.03100 [cs].

Navid Yaghmazadeh, Yuepeng Wang, Isil Dillig, and Thomas Dillig. 2017. Sqlizer: Query synthesis
from natural language. In International Conference on Object-Oriented Programming, Systems,
Languages, and Applications, ACM, pages 63:1–63:26.

Pengcheng Yin and Graham Neubig. 2017. A Syntactic Neural Model for General-Purpose Code
Generation. In Proceedings of the 55th Annual Meeting of the Association for Computational
Linguistics (Volume 1: Long Papers), pages 440–450.

Tao Yu, Michihiro Yasunaga, Kai Yang, Rui Zhang, Dongxu Wang, Zifan Li, and Dragomir Radev.
2018a. SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task. In
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages
1653–1663.

Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li,
Qingning Yao, Shanelle Roman, Zilin Zhang, and Dragomir Radev. 2018b. Spider: A Large-Scale
Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task. In
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages
3911–3921.

John M. Zelle and Raymond J. Mooney. 1996. Learning to parse database queries using inductive
logic programming. In Proceedings of the Thirteenth National Conference on Artificial
Intelligence - Volume 2, pages 1050–1055.

Rui Zhang, Tao Yu, He Yang Er, Sungrok Shim, Eric Xue, Xi Victoria Lin, Tianze Shi, Caiming
Xiong, Richard Socher, and Dragomir Radev. 2019. Editing-based SQL query generation for
cross-domain context-dependent questions. In Proceedings of the 2019 Conference on Empirical
Methods in Natural Language Processing.

Victor Zhong, Caiming Xiong, and Richard Socher. 2017. Seq2SQL: Generating Structured Queries
from Natural Language using Reinforcement Learning. arXiv:1709.00103 [cs].

A Auxiliary Relations for Schema Encoding

In addition to the schema graph edges E (Section 4.2) and schema linking edges (Section 4.3),
the edges in E_Q also include some auxiliary relation types to aid the relation-aware
self-attention. Specifically, for each x_i, x_j ∈ V_Q:

• If i = j, then COLUMN-IDENTITY or TABLE-IDENTITY.

• x_i ∈ Q, x_j ∈ Q: QUESTION-DIST-d, where
  d = clip(j − i, D),
  clip(a, D) = max(−D, min(D, a)).
  We use D = 2.

• Otherwise, one of COLUMN-COLUMN, COLUMN-TABLE, TABLE-COLUMN, or TABLE-TABLE.

B Alignment Loss

The memory-schema alignment matrix is expected to resemble the real discrete alignments, and
therefore should respect certain constraints like sparsity. For example, the question word
“model” in Figure 1 should be aligned with car_names.model rather than model_list.model or
model_list.model_id. To further bias the soft alignment towards the real discrete structures, we
add an auxiliary loss to encourage sparsity of the alignment matrix. Specifically, for a
column/table that is mentioned in the SQL query, we treat the model’s current belief of the best
alignment as the ground truth. Then we use a cross-entropy loss, referred to as alignment loss,
to strengthen the model’s belief:

\[
\mathrm{align\_loss} = -\frac{1}{|\mathrm{Rel}(C)|} \sum_{j \in \mathrm{Rel}(C)} \log \max_i L^{\mathrm{col}}_{i,j}
\; - \; \frac{1}{|\mathrm{Rel}(T)|} \sum_{j \in \mathrm{Rel}(T)} \log \max_i L^{\mathrm{tab}}_{i,j}
\]

where Rel(C) and Rel(T) denote the set of relevant columns and tables that appear in the SQL.

In earlier experiments, we found that the alignment loss did improve the model (statistically
significantly, from 53.0% to 55.4%). However, it does not make a statistically significant
difference in our final model in terms of overall exact match. We hypothesize that the
hyperparameter tuning that caused us to increase encoding depth also eliminated the need for
explicit supervision of alignment. With few layers in the Transformer, the alignment matrix
provided additional degrees of freedom, which became unnecessary once the Transformer was
sufficiently deep to build a rich joint representation of the question and the schema.

C Consistency of RAT-SQL

In the Spider dataset, most SQL queries correspond to more than one question, making it possible
to evaluate the consistency of RAT-SQL given paraphrases. We use two metrics to evaluate the
consistency: 1) Exact Match: whether RAT-SQL produces the exact same predictions given
paraphrases; 2) Correctness: whether RAT-SQL achieves the same correctness given paraphrases.
The analysis is conducted on the development set.

The results are shown in Table 7. We found that when augmented with BERT, RAT-SQL becomes more
consistent in terms of both metrics, indicating that the pre-trained representations of BERT are
beneficial for handling paraphrases.

Model               Exact Match   Correctness
RAT-SQL             0.59          0.81
RAT-SQL + BERT      0.67          0.86

Table 7: Consistency of the two RAT-SQL models.
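
The alignment loss of Appendix B can be sketched in a few lines of plain Python. This is a minimal illustration of the formula only, not the authors' implementation: the nested-list matrix layout (rows are question words i, columns are schema items j), the function name, the toy weights, and the choice to return zero for an empty relevant set are all assumptions made here.

```python
import math


def alignment_loss(l_col, rel_cols, l_tab, rel_tabs):
    """Sparsity-encouraging alignment loss over relevant columns and tables.

    l_col[i][j] / l_tab[i][j]: alignment weight of question word i to column / table j.
    rel_cols / rel_tabs: indices j of the columns / tables that appear in the SQL.
    """

    def term(matrix, relevant):
        if not relevant:  # assumption: an empty relevant set contributes nothing
            return 0.0
        # For each relevant item j, take the strongest alignment over question words i.
        total = sum(math.log(max(row[j] for row in matrix)) for j in relevant)
        return -total / len(relevant)

    return term(l_col, rel_cols) + term(l_tab, rel_tabs)


# Toy weights: 2 question words, 2 columns, 1 table (made-up numbers).
l_col = [[0.9, 0.1], [0.2, 0.8]]
l_tab = [[0.5], [0.5]]
loss = alignment_loss(l_col, [0], l_tab, [0])

# A perfectly sharp alignment for every relevant item gives zero loss.
sharp_loss = alignment_loss([[1.0, 0.0], [0.0, 1.0]], [0, 1], [[1.0], [0.0]], [0])

print(loss, sharp_loss)
```

Because each term takes the maximum over question words, the loss only rewards making the single best alignment of each relevant column or table more confident, which is what pushes the matrix toward sparsity.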

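
The QUESTION-DIST-d relation of Appendix A can be made concrete with a short sketch. It covers only pairs of question words and follows the clip definition given there with D = 2; the function names and the tuple used to represent the relation label are hypothetical choices for illustration.

```python
def clip(a, d):
    """clip(a, D) = max(-D, min(D, a)), as defined in Appendix A."""
    return max(-d, min(d, a))


def question_dist_relation(i, j, d_max=2):
    """Relation between question words at positions i and j.

    Returns a ("QUESTION-DIST", d) pair with d = clip(j - i, D).
    The tuple encoding of the label is an assumption made for this sketch.
    """
    return ("QUESTION-DIST", clip(j - i, d_max))


# Relative offsets beyond +/-2 collapse into the boundary relations.
for i, j in [(0, 5), (3, 1), (2, 3), (4, 4)]:
    print(i, j, question_dist_relation(i, j))
```

With D = 2 there are only five distinct question-to-question relation types (d from −2 to 2), so distant word pairs share a relation embedding.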
