0% found this document useful (0 votes)

5 views11 pages

Collaborative Dialogue in Minecraft: Anjali Narayan-Chen Prashant Jayannavar Julia Hockenmaier

The document introduces the Minecraft Collaborative Building Task, where two players, an Architect and a Builder, collaborate to construct 3D structures in Minecraft using text chat for communication. It presents the Minecraft Dialogue Corpus, consisting of 509 dialogues and game logs, highlighting the challenges of Architect utterance generation and the complexities of effective communication in a virtual environment. The study aims to develop interactive agents capable of collaborating with humans in such tasks, emphasizing the importance of clear instructions and the need for both roles to engage actively in the dialogue.

Uploaded by

Putra Fanny

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

5 views11 pages

Collaborative Dialogue in Minecraft: Anjali Narayan-Chen Prashant Jayannavar Julia Hockenmaier

Uploaded by

Putra Fanny

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 11

Collaborative Dialogue in Minecraft

Anjali Narayan-Chen∗ Prashant Jayannavar∗ Julia Hockenmaier

University of Illinois at Urbana-Champaign
{nrynchn2, paj3, juliahmr}@illinois.edu

Abstract 2015; Misra et al., 2016; Chai et al., 2018), simu-

lated environments that allow easier experimenta-
We wish to develop interactive agents that can
communicate with humans to collaboratively
tion are commonly used (Koller et al., 2010; Chen
solve tasks in grounded scenarios. Since com- and Mooney, 2011; Janarthanam et al., 2012).
puter games allow us to simulate such tasks In this paper, we therefore introduce the
without the need for physical robots, we de- Minecraft Collaborative Building Task, in which
fine a Minecraft-based collaborative building pairs of users control avatars in the Minecraft
task in which one player (A, the Architect) is virtual environment and collaboratively build 3D
shown a target structure and needs to instruct structures in a Blocks World-like scenario while
the other player (B, the Builder) to build this
structure. Both players interact via a chat inter-
communicating solely via text chat (Section 3).
face. A can observe B but cannot place blocks. We have built a data collection platform and have
We present the Minecraft Dialogue Corpus, a used it to collect the Minecraft Dialogue Cor-
collection of 509 conversations and game logs. pus, consisting of 509 human-human written di-
As a first step towards our goal of developing alogues, screenshots and complete game logs for
fully interactive agents for this task, we con- this task (Section 4). While our ultimate goal is
sider the subtask of Architect utterance gener- to develop fully interactive agents that can collab-
ation, and show how challenging it is.
orate with humans successfully on this task, we
1 Introduction first consider the subtask of Architect utterance
generation (Section 5) and describe a set of base-
Building interactive agents that can successfully line models that encode both the dialogue history
communicate with humans about the physical (Section 6) and the world state (Section 7). Sec-
world around them to collaboratively solve tasks tion 8 describes our experiments. Our analysis
in this environment is a long-sought goal of (Section 9) highlights the challenges of this task.
AI (e.g. Winograd, 1971). Such situated dialogue The corpus and platform as well as our models are
poses challenges that go beyond what is required available for download. 1
for the slot-value filling tasks performed by stan-
dard dialogue systems (e.g. Kim et al., 2016, 2017; 2 Related Work
Budzianowski et al., 2018) or chatbots (e.g. Ritter
et al., 2010; Schrading et al., 2015; Lowe et al., Our work is partly inspired by the HCRC Map
2015), as well as for so-called visual dialogue Task Corpus (Anderson et al., 1991), which con-
where users talk about a static image (Das et al., sists of route-following dialogues between an In-
2017) or video-context dialogue where users inter- struction Giver and a Follower who are given maps
act in a chat room while viewing a live-streamed of an environment that differ in significant details.
video (Pasunuru and Bansal, 2018). It requires the Our task also features asymmetric roles and lev-
ability to refer to real-world objects and spatial re- els of information between the two speakers, but
lations that depend on the current position of the operates in 3D space and focuses on the creation
speakers as well as changes in the environment. of structures rather than navigation around exist-
Due to the expense of actual human-robot commu- ing ones. Koller et al. (2010) design a challenge
nication (e.g. Tellex et al., 2011; Thomason et al., where systems with access to symbolic world rep-
∗ 1
Both authors equally contributed to the paper. http://juliahmr.cs.illinois.edu/Minecraft

5405
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5405–5415
Florence, Italy, July 28 - August 2, 2019. c 2019 Association for Computational Linguistics
resentations and a route planner generate real-time toy blocks: Minecraft blocks can only be placed
instructions to guide users through a treasure hunt on a discrete 3D grid, and they do not need to obey
in a virtual 3D world. gravity. That is, they do not need to be placed on
There is a resurgence of interest in Blocks the ground or on top of another block, but can be
World-like scenarios. Wang et al. (2017) let users put anywhere as long as one of their sides touches
define 3D voxel structures via a highly program- another block. That neighboring block can later
matic natural language. The interface learns to be removed, allowing the second block (and any
understand descriptions of increasing complex- structure supported by it) to “float”. Players need
ity, but does not engage in a back-and-forth di- to identify when such supporting blocks need to
alogue with the user. Most closely related to be added or removed.
our work are the corpora of Bisk et al. (2018,
2016a,b), which feature pairs of scenes involving Collaborative Building Task We define the
simulated, uniquely labeled, 3D blocks annotated Collaborative Building Task as a two-player game
with single-shot instructions aimed at guiding an between an Architect (A) and a Builder (B). A is
(imaginary) partner on how to transform an input given a target structure (Target) and has to instruct
scene into the target. In their scenario, the build- B via a text chat interface to build a copy of Target
ing area is always viewed from a fixed bird’s-eye on a given build region. A and B can communicate
perspective. Simpler versions of the data retain back and forth via chat throughout the game (e.g.
the grid-based assumption over blocks, and struc- to resolve confusions or to correct B’s mistakes).
tures consist solely of numeric digits procedurally B is given access to an inventory of 120 blocks of
reconstructed along the horizontal plane. Later six given colors that it can place and remove. A
versions increase the task complexity significantly can observe B and move around in its world, al-
by incorporating human-generated, truly 3D struc- lowing it to provide instructions from varying per-
tures and removed the grid assumption, as well as spectives. But A cannot move blocks, and remains
allowing for rotations of individual blocks. Their invisible to B. The task is complete when the struc-
blocks behave like physical blocks, disallowing ture built by B (Built) matches Target, invariant to
structures with floating blocks that are prevalent translations within the horizontal plane and rota-
in our data. Our work differs considerably in a tions about the vertical axis. Built also needs to
few other aspects: our corpus features two-way lie completely within the boundaries of the prede-
dialogue between an instructor and a real human fined build region.
partner; it also includes a wide range of perspec- Although human players were able to complete
tives as a result of using Minecraft avatars, rather each structure successfully, this task is not triv-
than a fixed bird’s-eye perspective; and we utilize ial. Figure 1 shows the perspectives seen by each
blocks of different colors, allowing for entire sub- player in the Minecraft client. This example from
structures to be identified (e.g., “the red pillar”). our corpus shows some of the challenges of this
task. A often provides instructions that they think
are sufficient, but leave B still clearly confused,
3 Minecraft Collaborative Building Task indicated either by B’s lack of initiative to start
building or a confused response. Once a multi-
Minecraft (https://minecraft.net/) is a
step instruction is understood, B also needs to plan
popular multi-player game in which players
a sequence of steps to follow that instruction; in
control avatars to navigate in a 3D world and
many cases, B chooses clearly suboptimal solu-
manipulate inherently block-like materials in
tions, resulting in large amounts of redundancy
order to build structures. Players can freely move,
in block movements. A misinterpreted instruction
jump and fly, and they can choose between first-
may also lead to a whole sequence of blocks being
or third-person perspectives. Camera angles can
misplaced by B (either due to miscommunication,
be smoothly rotated by moving around or turning
or because B made an educated guess on how to
one’s avatar’s head up, down, and side-to-side,
proceed) until A decides to intervene (in the ex-
resulting in a wide range of possible viewpoints.
ample, this can be seen with the built yellow 6). A
Blocks World in Minecraft Minecraft provides could also misinterpret the target structure, giving
an ideal setting to simulate Blocks World, al- B incorrect instructions that would later need to be
though there are two key differences to physical rectified. This illustrates the challenges involved

5406
Figure 1: In the Minecraft Collaborative Building Task, the Architect (A) has to instruct a Builder (B) to build a
target structure. A can observe B, but remains invisible to B. Both players communicate via a chat interface. (NB:
We show B’s actions in the dialogue as a visual aid to the reader.)

in designing an interactive agent for this task: the Architects were encouraged not to overwhelm
Architect needs to provide clear instructions; the the Builder with instructions and to allow their
Builder needs to identify when more information partner a chance to respond or act before moving
is required, and both agents may need to design on. Builders were instructed not to place blocks
efficient plans to construct complex structures. outside the specified build region and to stay as
faithful as possible to the Architect’s instructions.
4 The Minecraft Dialogue Corpus Both players were asked to communicate as natu-
rally as possible while avoiding idle chit-chat.
The Minecraft Dialogue Corpus consists of 509 Participants were allowed to complete multiple
human-human dialogues and game logs for the sessions if desired; we ensured that an individual
Collaborative Building Task. This section de- never saw the same target structure twice, and at-
scribes this corpus and our data collection process. tempted as much as possible to pair them with a
Further details are in the supplementary materials. previously unseen partner. While some individu-
als indicated a preference towards either the Ar-
4.1 Data Collection Procedure
chitect or Builder roles, roles were, for the most
Data was collected over the course of 3 weeks (ap- part, assigned in such a way that each individual
prox. 62 hours overall). 40 volunteers, both under- who participated in repeat sessions played both
graduate and graduate students with varying levels roles equally often. Each participant is assigned
of proficiency with Minecraft, participated in 1.5 a unique anonymous ID across sessions.
hour sessions in which they were paired up and
asked to build various predefined structures within 4.2 Data Structures and Collection Platform
a 11 × 11 × 9 sized build region. Builders be- Microsoft’s Project Malmo (Johnson et al., 2016)
gan with an inventory of 6 colors of blocks and 20 is an AI research platform that provides an API
blocks of each color. After a brief warm-up round for Minecraft agents and the ability to log, save,
to become familiar with the interface, participants and load game states. We have extended Malmo
were asked to successfully build as many struc- into a data collection platform. We represent the
tures as they could manage within this time frame. progression of each game (involving the construc-
On average, each game took 8.55 minutes. tion of a single target structure by an Architect and

5407
Builder pair) as a discrete sequence of game states. tures varied greatly, ranging from step-by-step in-
Although Malmo continuously monitors the game, structions involving temporary supporting blocks
we selectively discretize this data by only saving to single-shot descriptions such as, simply, “build
snapshots, or “observations,” of the game state at a floating yellow block” (sufficient for a veteran
certain triggering moments (whenever B picks up Minecraft player, but not necessarily for a novice).
or puts down a block or when either player sends
Referring expressions and ellipsis Architects
a chat message). This allows us to reduce the
made frequent use of implicit arguments and ref-
amount of (redundant) data to be logged while pre-
erences, relying heavily on the Builder’s current
serving significant game state changes. Each ob-
perspective and their most recent actions for ref-
servation is a JSON object that contains the fol-
erence resolution. For instance, Architect instruc-
lowing information: 1) a time stamp, 2) the chat
tions could include references such as “two more
history up until that point in time, 3) B’s posi-
in the same direction,” “one up,” “two towards
tion (a tuple of real-valued x, y, z coordinates as
you,” and “one right from the last thing you built.”
well as pitch and yaw angles, representing the
orientation of their camera), 4) B’s block inven- Recognizable shapes and sub-structures
tory, 5) the locations of the blocks in the build Some target structures were designed with com-
region, 6) screenshots taken from A’s and B’s monplace objects in mind. Some Architects took
perspectives. Whenever B manipulates a block, advantage of this in their instructions, ranging
we also capture screenshots from four invisible from straightforward (‘L’-shapes, “staircases”) to
“Fixed Viewer” clients hovering around the build more eccentric descriptions (“either a chicken or a
region at fixed angles. gun turret,” “a heart that looks diseased,” “a silly
multicolored worm”). To avoid slogging through
4.3 Data Statistics and Analysis block-by-block instructions, Architects frequently
Overall statistics The Minecraft Dialogue Cor- used such names to refer to sub-elements of the
pus contains 509 human-human dialogues (15,926 target structure. Some even defined new terms
utterances, 113,116 tokens) and game logs for 150 that get re-used across utterances: A: i will refer
target structures of varying complexity (min. 6 to this shape as r-windows from here on out... B:
blocks, max. 68 blocks, avg. 23.5 blocks). We okay A: please place the first green block in the
collected a minimum of three dialogues per struc- right open space of the blue r-window.
ture. The training, test and development sets con-
Builder utterances Even though the Architect
sist of 85 structures (281 dialogues), 39 structures
shouldered the large responsibility of describing
(137 dialogues), and 29 structures (101 dialogues)
the unseen structure, the Builder played an active
respectively. Dialogues for the same structure are
role in continuing and clarifying the dialogue, es-
fully contained within a single split; structures in
pecially for more complex structures. Builders
training are thus guaranteed to be unseen in test.
regularly took initiative during the course of a dia-
On average, dialogues contain 30.7 utterances:
logue in a variety of ways, including verification
22.5 Architect utterances (avg. length 7.9 tokens),
questions (“is this ok?”), clarification questions
8.2 Builder utterances (avg. length 2.9 tokens),
(“is it flat?” or “did I clean it up correctly?”),
and 49.5 Builder block movements. Dialogue
status updates (“i’m out of red blocks”), sugges-
length varies greatly with the complexity of the
tions (“feel free to give more than one direction at
target structure (not just the number of blocks, but
a time if you’re comfortable,” “i’ll stay in a fixed
whether it requires floating blocks or contains rec-
position so it’s easier to give me directions with
ognizable substructures).
respect to what i’m looking at”), or extrapolation
Floating blocks Blocks in Minecraft can be (“I think I know what you want. Let me try,” then
placed anywhere as long as they touch an existing continuing to build without explicit instruction).
block (or the ground). If such a supporting block is
5 Architect Utterance Generation Task
later removed, the remaining block (and any struc-
ture supported by it) will continue to “float” in Although the Minecraft Dialogue Corpus was mo-
place. This makes it possible to produce complex tivated by our ultimate goal of building agents that
designs. 53.6% of our target structures contain can successfully play an entire collaborative build-
such floating blocks. Instructions for these struc- ing game as Architect or Builder, we first con-

5408
Figure 3: A target structure (left) and corresponding
built structure at a certain point in the game (right).

ceding dialogue. Since Architects need to com-

Figure 2: An overview of the full model combining pare the current state of the build region against
global and local world representation variants. the target structure, we augment this model in the
next section with world state information.
sider the task of Architect utterance generation: Dialogue History Encoder We encode the en-
given access to the entire game state context lead- tire dialogue history as a sequence of tokens
ing up to a certain point in a human-human game in which each player’s utterances are contained
at which the human Architect spoke next, we aim within speaker-specific start and end tokens
to generate a suitable Architect utterance. (<A>...</A> or <B>...</B>....). Each
Architect utterance generation is a much sim- utterance corresponds to a single chat message,
pler task than developing a fully interactive Ar- and may consist of multiple sentences. These
chitect or Builder, but it still captures some of the tokens are fed through a word embedding layer
essential difficulties of the Architect’s role. Since and subsequently passed through a bidirectional
Architects need to be able to give instructions, cor- RNN (Schuster and Paliwal, 1997) to produce an
rect Builders’ mistakes and answer their questions, embedding of the entire dialogue history in the en-
they need the ability to compare the built structure coder RNN’s final hidden state.
against the target structure, and to understand the
Output Utterance Decoder The output utter-
preceding dialogue. We also believe that the mod-
ance is generated by a decoder RNN conditioned
els developed for this task could be leveraged to at
on the discourse context. In standard fashion, the
least bootstrap a fully interactive Architect (which
final hidden state of the encoder RNN is used to
will also need to decide when to speak, as well as
initialize the hidden state of the decoder RNN.
deal with potentially much noisier dialogue histo-
ries than those we are considering here). 7 World State Representations
Although future work should consider the task
of Builder utterance generation, the challenges in To be able to give accurate instructions, the Ar-
creating a fully interactive Builder lie more in the chitect requires a mental model of how the tar-
need to understand and execute complex instruc- get structure can be constructed successfully given
tions in a discourse and game context, to know the current state of the built structure. Since the
when it is appropriate to ask clarification questions Builder’s world is not explicitly aligned to the
and to understand the Architect’s answers, than in target structure (our space does not contain any
the need to generate complex utterances. markers that would indicate cardinal directions or
other landmarks, and we consider any built struc-
6 Seq2Seq Architect Utterance Model ture a success as long as it matches the target
structure and fits completely into the Builder’s
We define a sequence of models for Architect build region), this model must consider all possi-
utterance generation. Our most basic variant is ble translational and rotational alignment variants,
a sequence-to-sequence model (Sutskever et al., although we assume it can ignore any sub-optimal
2014) that conditions the next utterance on the pre- alignments. For any given alignment, we compute

5409
the Hamming distance between the built structure blocks Bn can be placed in the immediate next ac-
and the target (the total number of blocks of each tion. Bn , the set of blocks that can be feasibly
color to be placed and removed), and only retain placed, is a subset of Bp .
those alignments that have the smallest distance
Block counters To obtain a summary represen-
to the target. Once the game has progressed suf-
tation of the optimal alignments (without detailed
ficiently far, there is often only one optimal align-
spatial information), we represent each of the sets
ment between built and target structures, but in the
Bp and Br (as well as Bn ) of an alignment A =
early stages, a number of different optimal align-
Bp ∪ Br as sets of counters over block colors (in-
ments may be possible. Our world state represen-
dicating how many blocks of each color remain to
tation captures this uncertainty.
be placed [next] and to be removed). We compute
Figure 3 depicts a target structure (left) and a
the set of expected block counters for each color
point in the game at which a single red block has
c ∈ {red,blue,orange, purple, yellow, green} and
been placed (right). We can identify three poten-
action a ∈ {p, r, n} as the average over all k opti-
tial paths (left, up, and down) to continue the struc-
mal alignments A∗ = arg minA (|diff(T, S, A)|).
ture by extending it along the four cardinal direc-
tions. A permissibility check disqualifies the op- k
1X
tion of extending to the right, as blocks would end E[countc,a ] = countic,a
k
up placed outside the build region. These remain- i=1
ing paths, considered equally likely, indicate the With six colors, and three sets of blocks (all place-
colors and locations of blocks to be placed (or re- ments, next placements, removals), we obtain an
moved). A summary of this information forms the 18-dimensional vector of expected block counts.
basis of the input to our model.
7.1 Block Counter Models
Computing the distance between structures
Computing the Hamming distance between the We augment our basic seq2seq model with two
built and target structure under a given alignment variants of block counters that capture the current
tells us also which blocks need to be placed or re- state of the built structure:
moved. A structure S is a set of blocks (c, x, y, z). Global block counters are 18-dimensional vec-
Each block has a color c and occupies a location tors (capturing expected overall placements, next
(x, y, z) in absolute coordinate space (i.e., the co- placements, and removals for each of the six col-
ordinate system defined by the Minecraft client). ors) that are computed over the whole build region.
A structure’s position and orientation can be mu-
tated by an alignment A in which S undergoes a Local block counters Since many Builder ac-
translation AT (shift) followed by a rotation AR , tions involve locations immediately adjacent to
denoted A(S) = AR (AT (S)). We only consider their last action, we construct local block coun-
rotations about the vertical axis in 90-degree inter- ters that focus on and encode spatial information
vals, but allow all possible translations along the of this concentrated region. Here, we consider
horizontal plane. The symmetric difference be- a 3 × 3 × 3 cube of block locations: those di-
tween the target T and a built structure S w.r.t. an rectly surrounding the location of the last Builder
alignment A, diff(T, S, A), consists of the set of action as well as the last action itself. We com-
blocks to be placed, Bp = A(T ) − S and the set pute a separate set of block counters for each of
of blocks to be removed from S, Br = S − A(T ). these 27 locations. Using the Builder’s position
and gaze, we deterministically assign a relative
diff(T, S, A) = Bp ∪ Br direction for each location that indicates its posi-
tion relative to the last action in the Builder’s per-
The cardinality |diff(T, S, A)| is the Hamming spective, e.g., “left”, “top”, “back-right,” etc. The
distance between A(T ) and S. 27 18-dimensional block counters of each location
are concatenated, using a fixed canonical ordering
Feasible next placements Architects’ instruc-
of the assigned directions.
tions often concern the immediate next blocks to
be placed. Since new blocks can only be feasi- Adding block counters to the model To add
bly placed if one of their faces touches the ground block counters to out models, we found the best re-
or another block, we also wish to capture which sults by feeding the concatenated global and local

5410
counter vectors through a single fully-connected dicative of dialogue acts (e.g., responding “yes”
layer before concatenating them to the word em- vs. “no”, instructing to “place” vs. “remove”,
bedding vector that is fed into the decoder at each etc.). These lists also capture synonyms that are
time step (Figure 2). common in our data (e.g. “yes”/“yeah”), and
were obtained by curating non-overlapping lists
8 Experimental Setup of words (with a frequency ≥ 10 across all data
Data Our training, test and dev splits contain splits) that are appropriate to each category.2
6,548, 2,855, and 2,251 Architect utterances. We report precision and recall scores per cate-
gory, and for an “all keywords” list consisting of
Training We trained for a maximum of 40 the union of all category word lists. For each cat-
epochs using the Adam optimizer (Kingma and egory, we reduce both human and generated utter-
Ba, 2015). During training, we minimize the ances to those tokens that occur in the correspond-
sum of the cross entropy losses between each pre- ing keyword list: “place another red left of the
dicted and ground truth token. We stop training green” reduces to “red green” for color, to “left”
early when perplexity on the held-out validation for spatial relations and “place” for dialogue.
set had increased monotonically for two epochs. For a given (reduced) generated sentence Sg
All word embeddings were initialized with pre- and its associated (reduced) human utterance Sh ,
trained GloVe vectors (Pennington et al., 2014). we calculate term-specific precision (and recall) as
We first performed grid search over model archi- follows. Any token tg in Sg matches a token th in
tecture hyperparameters (embedding layer sizes Sh if tg and th are identical or synonyms. Similar
and RNN layer depths). Once the best-performing to BLEU’s modified unigram precision, once tg is
architecture was found, we then varied dropout pa- matched to one token th , it cannot be used for fur-
rameters (Srivastava et al., 2014). More details can ther matches to other tokens within Sh . Counts are
be found in the supplementary materials. accumulated over the entire corpus to compute the
ratio of matched to total tokens in Sg (or Sh ).
Decoding We use beam search decoding to
generate the utterance with the maximum log- Ablation study Table 1 shows the results of an
likelihood score according to our model normal- ablation study on the validation set. All model
ized by utterance length (beam size = 10). In or- variants here share the same RNN parameters.
der to promote diversity of generated utterances, While the individual addition of global and local
we use a γ penalty (Li et al., 2016) of γ = 0.8. block counters each see a slight boost in perfor-
These parameters were found by a grid search on mance in precision and recall respectively, com-
the validation set for our best model. bining them as in our final model shows significant
performance increase, especially on colors.
9 Results and Analysis
Test set results We finetune our most basic and
We evaluate our models in three ways: we use au-
most complex model via a grid search over all ar-
tomated metrics to assess how closely the gener-
chitectural parameters and dropout values on the
ated utterances match the human utterances. For
validation set. The best model’s results on the test
a random sample of 100 utterances per model, we
set are shown in Table 2. Our full model shows no-
use human evaluators to identify dialogue acts and
ticeable improvements on each of our metrics over
to evaluate whether the generated utterances are
the baseline. Most promising is again the signifi-
correct in the given game context. Finally, we per-
cant increase in performance on colors, indicating
form a qualitative analysis of our best model.
that the block counters capture necessary informa-
9.1 Automated Evaluation tion about next Builder actions.
Metrics To evaluate how closely the generated 9.2 Human Evaluation
utterances resemble the human utterances, we re-
port standard BLEU scores (Papineni et al., 2002). In order to better evaluate the quality of generated
We also compute (modified) precision and recall utterances as well as benchmark human perfor-
of a number of lists of domain-specific keywords mance, we performed a small-scale human eval-
that are instrumental to task success: colors, spa- uation of Architect utterances. We asked 3 hu-
2
tial relations, and other words that are highly in- These word lists are in the supplementary materials.

5411
BLEU Precision / Recall
Metric B-1 B-2 B-3 B-4 all keywords colors spatial dialogue
seq2seq 14.9 6.9 3.8 2.1 12.0 / 10.3 8.4 / 12.1 9.9 / 9.1 16.5 / 19.1
+ global only 16.1 7.7 4.1 2.4 12.9 / 11.6 14.4 / 15.5 8.8 / 7.0 19.1 / 18.8
+ local only 16.0 7.9 4.5 2.6 13.5 / 13.8 13.3 / 23.5 9.5 / 11.3 19.3 / 22.0
+ global & local 16.2 8.1 4.7 2.8 14.5 / 13.8 14.8 / 23.3 10.7 / 9.5 17.9 / 20.6

Table 1: BLEU score and term-specific precision and recall ablation study on the validation set.

BLEU Precision / Recall

Metric B-1 B-2 B-3 B-4 all keywords colors spatial dialogue
seq2seq 15.3 7.8 4.5 2.8 11.8 / 11.1 8.1 / 17.0 9.3 / 8.6 17.9 / 19.3
+ global & local 15.7 8.1 4.8 2.9 13.5 / 14.4 14.9 / 28.7 8.7 / 8.7 18.5 / 19.9

Table 2: BLEU and term-specific precision and recall scores of the seq2seq and the full model on the test set.

man participants who had previously completed Utterance correctness Given a window of
the Minecraft Collaborative Building Task to eval- game context (consisting of at least the last seven
uate 100 randomly sampled scenarios from the test Builder’s and Architect’s actions, but always in-
set. Each scenario was reenacted from an actual cluding the previous Architect’s utterance) and ac-
human-human game by simulating the context of cess to the target structure to be built, evaluators
dialogue and Builder actions in Minecraft. Then, were asked to rate the correctness of an utterance
we presented 3 candidate Architect utterances to immediately following that context with respect
follow that context (one each generated from the to task completion. For an utterance to be fully
models in Table 2 as well as the original human correct, information contained within it must both
utterance) to the evaluators in randomized order. be consistent with the current state of the world
as well as not lead the Builder off-course from
Here, we analyze a subset of results on coarse
the target. Utterances could be considered par-
annotation of dialogue acts and utterance correct-
tially correct if some described elements (e.g. col-
ness. More details on the full evaluation frame-
ors) were accurate, but other incorrect elements
work, including descriptions of evaluation crite-
precluded full correctness. Otherwise, utterances
ria and inter-annotator agreement statistics, are in-
could be deemed incorrect (if wildly off-course) or
cluded in the supplementary materials.
N/A (if there was not enough information). Results
can be found in Table 4. Unsurprisingly, with-
Dialogue acts Given a list of six predefined out access to world state information, the baseline
coarse-grained dialogue acts (including Instruct B, model performs poorly, conveying incorrect infor-
Describe Target, etc.; see the supplementary ma- mation about half of the time. With access to a
terial for full details), evaluators were asked to simple world representation, our full model shows
choose all dialogue acts that categorized a candi- marked improvement on generating both fully and
date utterance. An utterance could belong to any partially correct utterances. Finally, human per-
number of categories; e.g., “great! now place a formance sets a high bar; when not engaging in
red block” is both a confirmation as well as an in- chitchat or correcting typos, humans consistently
struction. Results can be found in Table 3. These produce fully correct utterances constructive to-
results show a significantly higher diversity of ut- wards task completion.
terance types generated by humans. Humans pro- 9.3 Qualitative Analysis
vided instructions only about half of the time, and
Here, we use examples to illustrate different as-
devoted more energy to providing higher-level de-
pects of our best model’s utterances.
scriptions of the target, responding to the Builder’s
actions and queries, and rectifying mistakes. On Identifying the game state In the course of a
the other hand, even the improved model failed to game, players progress through different states. In
capture this, mainly generating instructions even if the human-human data, dialogue is peppered with
it was inappropriate or unhelpful to do so. context cues (greetings, questions, apologies, in-

5412
Describe Answer Confirm B’s Correct/
Model Instruct B Target question actions/plans clarify A/B Other
seq2seq 76.0 12.0 7.0 9.0 3.0 4.0
+ global & local 72.0 14.0 8.0 9.0 3.0 4.0
human 47.0 14.0 12.0 17.0 23.0 8.0

Table 3: Percentage of utterances categorized as a given dialogue act. Labels were determined per dialogue act by
majority vote across three human evaluators. An utterance can belong to multiple dialogue acts.

Model Full Partial None N/A the blue block, put a blue block on top of the blue”
seq2seq 14.0 28.0 48.0 10.0 or “yes, now, purple, purple, purple, ...”
+ global & local 25.0 36.0 32.0 7.0
human 89.0 2.0 0.0 9.0 10 Conclusion and Future Work

Table 4: Percentage of utterances deemed correct by The Minecraft Collaborative Building Task pro-
human evaluators. vides interesting challenges for interactive agents:
they must understand and generate spatially-aware
dialogue, execute instructions, identify and re-
structions to move or place blocks) that indicate cover from mistakes. As a first step towards the
the flow of a game. Our model is able to capture goal of developing fully interactive agents for this
some of these aspects. It often begins games with task, we considered the subtask of Architect utter-
an instruction like “we’ll start with blue”, and ance generation. To give accurate, high-level in-
may end them with “ok we’re done!” (although structions, Architects need to align the Builder’s
it occasionally continues with further instructions, world state to the target structure and identify
e.g “great! now we’ll do the same thing on the complex substructures. We show that models
other side”.) It often says “perfect!” immediately that capture some world state information improve
followed by a new instruction which indicates the over naive baselines. Richer models (e.g. CNNs
model’s ability to acknowledge a Builder’s previ- over world states, attention mechanisms (Bah-
ous actions before continuing. The model often danau et al., 2015), memory networks (Bordes
describes the type of the next required action cor- et al., 2017)) and/or explicit semantic representa-
rectly (even if it makes mistakes in the specifics of tions should be able to generate better utterances.
that action): it generated “remove the bottom row” Clearly, much work remains to be done to create
when the ground truth was “okay so now get rid of actual agents that can play either role interactively
the inner most layer of purple in the square”. against a human. The Minecraft Dialogue Corpus
as well as the Malmo platform and our extension
Predicting block colors and spatial relations of it enable many such future directions. Our plat-
Generated utterances often identify the correct form can also be extended to support fully inter-
color of blocks, e.g “then place a red block on active scenarios that may involve a human player,
top of that” in a context when the the next place- measure task completion, or support other training
ments include a layer of red blocks (ground truth regimes (e.g. reinforcement learning).
utterance: “the second level of the structure con-
sists wholly of red blocks. start by putting a red Acknowledgements
block on each orange block”.) Less frequently,
the model is also able to predict accurate spatial We would like to thank the reviewers for their
relations (“perfect! now place a red block to the valuable comments. This work was supported
left of that”) for referent blocks. by Contract W911NF-15-1-0461 with the US
Defense Advanced Research Projects Agency
Utterance diversity and repetition Generated (DARPA) Communicating with Computers Pro-
utterances lack diversity: the pattern “a x b” (for gram and the Army Research Office (ARO). Ap-
a rectangle of size a × b) is almost exclusively proved for Public Release, Distribution Unlimited.
used to describe squares (an extremely common The views expressed are those of the authors and
shape in our data). Utterances are mostly fluent, do not reflect the official policy or position of the
but sometimes contain repeats: “okay, on top of Department of Defense or the U.S. Government.

5413
References Abhishek Das, Satwik Kottur, Khushi Gupta, Avi
Singh, Deshraj Yadav, José M.F. Moura, Devi
Anne H Anderson, Miles Bader, Ellen Gurman Bard, Parikh, and Dhruv Batra. 2017. Visual Dialog. In
Elizabeth Boyle, Gwyneth Doherty, Simon Garrod, Proceedings of the IEEE Conference on Computer
Stephen Isard, Jacqueline Kowtko, Jan McAllister, Vision and Pattern Recognition (CVPR), pages 326–
Jim Miller, et al. 1991. The HCRC map task corpus. 335.
Language and speech, 34(4):351–366.
Srinivasan Janarthanam, Oliver Lemon, and Xingkun
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Ben- Liu. 2012. A web-based evaluation framework for
gio. 2015. Neural machine translation by jointly spatial instruction-giving systems. In Proceedings
learning to align and translate. In 3rd Inter- of the ACL 2012 System Demonstrations, pages 49–
national Conference on Learning Representations, 54, Jeju Island, Korea. Association for Computa-
ICLR 2015, San Diego, CA, USA, May 7-9, 2015, tional Linguistics.
Conference Track Proceedings.
Matthew Johnson, Katja Hofmann, Tim Hutton, and
Yonatan Bisk, Daniel Marcu, and William Wong. David Bignell. 2016. The Malmo platform for artifi-
2016a. Towards a dataset for human computer com- cial intelligence experimentation. In Proceedings of
munication via grounded language acquisition. In the Twenty-Fifth International Joint Conference on
AAAI Workshop: Symbiotic Cognitive Systems. Artificial Intelligence (IJCAI-16), pages 4246–4247.
Yonatan Bisk, Kevin Shih, Yejin Choi, and Daniel
Seokhwan Kim, Luis Fernando D’Haro, Rafael E
Marcu. 2018. Learning interpretable spatial oper-
Banchs, Jason D Williams, and Matthew Henderson.
ations in a rich 3D Blocks World. In Proceedings
2017. The fourth dialog state tracking challenge.
of the Thirty-Second AAAI Conference on Artificial
In Dialogues with Social Robots, pages 435–449.
Intelligence, pages 5028–5036.
Springer.
Yonatan Bisk, Deniz Yuret, and Daniel Marcu. 2016b.
Natural language communication with robots. In Seokhwan Kim, Luis Fernando D’Haro, Rafael E
Proceedings of the 2016 Conference of the North Banchs, Jason D Williams, Matthew Henderson, and
American Chapter of the Association for Computa- Koichiro Yoshino. 2016. The fifth dialog state track-
tional Linguistics: Human Language Technologies, ing challenge. In 2016 IEEE Spoken Language
pages 751–761, San Diego, California. Association Technology Workshop (SLT), pages 511–517.
for Computational Linguistics.
Diederik P. Kingma and Jimmy Ba. 2015. Adam: A
Antoine Bordes, Y.-Lan Boureau, and Jason Weston. method for stochastic optimization. In 3rd Inter-
2017. Learning end-to-end goal-oriented dialog. In national Conference on Learning Representations,
5th International Conference on Learning Represen- ICLR 2015, San Diego, CA, USA, May 7-9, 2015,
tations, ICLR 2017, Toulon, France, April 24-26, Conference Track Proceedings.
2017, Conference Track Proceedings.
Alexander Koller, Kristina Striegnitz, Donna Byron,
Paweł Budzianowski, Tsung-Hsien Wen, Bo-Hsiang Justine Cassell, Robert Dale, Johanna Moore, and
Tseng, Iñigo Casanueva, Stefan Ultes, Osman Ra- Jon Oberlander. 2010. The first challenge on gen-
madan, and Milica Gašić. 2018. MultiWOZ - a erating instructions in virtual environments. In
large-scale multi-domain wizard-of-Oz dataset for Empirical Methods in Natural Language Genera-
task-oriented dialogue modelling. In Proceedings of tion, pages 328–352, Berlin, Heidelberg. Springer-
the 2018 Conference on Empirical Methods in Nat- Verlag.
ural Language Processing, pages 5016–5026, Brus-
sels, Belgium. Association for Computational Lin- Jiwei Li, Will Monroe, and Dan Jurafsky. 2016. A sim-
guistics. ple, fast diverse decoding algorithm for neural gen-
eration. arXiv preprint arXiv:1611.08562.
Joyce Y. Chai, Qiaozi Gao, Lanbo She, Shaohua Yang,
Sari Saba-Sadiya, and Guangyue Xu. 2018. Lan- Ryan Lowe, Nissan Pow, Iulian Serban, and Joelle
guage to action: Towards interactive task learn- Pineau. 2015. The Ubuntu dialogue corpus: A large
ing with physical agents. In Proceedings of the dataset for research in unstructured multi-turn dia-
Twenty-Seventh International Joint Conference on logue systems. In Proceedings of the 16th Annual
Artificial Intelligence (IJCAI-18), pages 2–9. Inter- Meeting of the Special Interest Group on Discourse
national Joint Conferences on Artificial Intelligence and Dialogue, pages 285–294, Prague, Czech Re-
Organization. public. Association for Computational Linguistics.

David Chen and Raymond Mooney. 2011. Learning Dipendra K. Misra, Jaeyong Sung, Kevin Lee, and
to interpret natural language navigation instructions Ashutosh Saxena. 2016. Tell me Dave: Context-
from observations. In Proceedings of the Twenty- sensitive grounding of natural language to manip-
Fifth AAAI Conference on Artificial Intelligence, ulation instructions. The International Journal of
pages 859–865. Robotics Research, 35(1-3):281–300.

5414
Kishore Papineni, Salim Roukos, Todd Ward, and Wei- Sida I. Wang, Samuel Ginn, Percy Liang, and Christo-
Jing Zhu. 2002. Bleu: a method for automatic eval- pher D. Manning. 2017. Naturalizing a program-
uation of machine translation. In Proceedings of ming language via interactive learning. In Proceed-
40th Annual Meeting of the Association for Com- ings of the 55th Annual Meeting of the Association
putational Linguistics, pages 311–318, Philadelphia, for Computational Linguistics (Volume 1: Long Pa-
Pennsylvania, USA. Association for Computational pers), pages 929–938, Vancouver, Canada. Associa-
Linguistics. tion for Computational Linguistics.
Ramakanth Pasunuru and Mohit Bansal. 2018. Game- Terry Winograd. 1971. Procedures as a representa-
based video-context dialogue. In Proceedings of tion for data in a computer program for understand-
the 2018 Conference on Empirical Methods in Nat- ing natural language. Technical report, MIT. Cent.
ural Language Processing, pages 125–136, Brus- Space Res.
sels, Belgium. Association for Computational Lin-
guistics.
Jeffrey Pennington, Richard Socher, and Christopher
Manning. 2014. GloVe: Global vectors for word
representation. In Proceedings of the 2014 Con-
ference on Empirical Methods in Natural Language
Processing, pages 1532–1543, Doha, Qatar. Associ-
ation for Computational Linguistics.
Alan Ritter, Colin Cherry, and Bill Dolan. 2010. Un-
supervised modeling of Twitter conversations. In
Human Language Technologies: The 2010 Annual
Conference of the North American Chapter of the
Association for Computational Linguistics, pages
172–180, Los Angeles, California. Association for
Computational Linguistics.
Nicolas Schrading, Cecilia Ovesdotter Alm, Ray
Ptucha, and Christopher Homan. 2015. An analy-
sis of domestic abuse discourse on Reddit. In Pro-
ceedings of the 2015 Conference on Empirical Meth-
ods in Natural Language Processing, pages 2577–
2583, Lisbon, Portugal. Association for Computa-
tional Linguistics.
M. Schuster and K. K. Paliwal. 1997. Bidirectional re-
current neural networks. IEEE Transactions on Sig-
nal Processing, 45(11):2673–2681.
Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky,
Ilya Sutskever, and Ruslan Salakhutdinov. 2014.
Dropout: A simple way to prevent neural networks
from overfitting. Journal of Machine Learning Re-
search, 15:1929–1958.
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. 2014.
Sequence to sequence learning with neural net-
works. In Advances in neural information process-
ing systems, pages 3104–3112.
Stefanie Tellex, Thomas Kollar, Steven Dickerson,
Matthew Walter, Ashis Banerjee, Seth Teller, and
Nicholas Roy. 2011. Understanding natural lan-
guage commands for robotic navigation and mobile
manipulation. In Proceedings of the Twenty-Fifth
AAAI Conference on Artificial Intelligence, pages
1507–1514.
Jesse Thomason, Shiqi Zhang, Raymond J Mooney,
and Peter Stone. 2015. Learning to interpret nat-
ural language commands through human-robot di-
alog. In Proceedings of the Twenty-Fourth Interna-
tional Joint Conference on Artificial Intelligence (IJ-
CAI 2015), pages 1923–1929.

5415

Basement Ventilation
67% (3)
Basement Ventilation
8 pages
Coding With Minecraft - Curriculum Overview
100% (1)
Coding With Minecraft - Curriculum Overview
22 pages
Coding With Minecraft - Unit 1 - Introduction
100% (1)
Coding With Minecraft - Unit 1 - Introduction
10 pages
2625566761469343-Coding With Minecraft - Unit 2 - Events and Event Handlers
100% (1)
2625566761469343-Coding With Minecraft - Unit 2 - Events and Event Handlers
24 pages
9223644809867444-Computing With Minecraft - Unit 2 - City Planner
100% (1)
9223644809867444-Computing With Minecraft - Unit 2 - City Planner
22 pages
Unofficial Minecraft Lab For Kids Family
75% (4)
Unofficial Minecraft Lab For Kids Family
357 pages
Minecraft Game Design - Complete Edition 063337
No ratings yet
Minecraft Game Design - Complete Edition 063337
48 pages
TeamCraft A Benchmark For Multi-Modal Multi-Agent Systems in Minecraft
No ratings yet
TeamCraft A Benchmark For Multi-Modal Multi-Agent Systems in Minecraft
50 pages
Coding With Minecraft 1 Introduction Student Workbook 1.docx - 0
100% (2)
Coding With Minecraft 1 Introduction Student Workbook 1.docx - 0
13 pages
NeurIPS 2023 Describe Explain Plan and Select Interactive Planning With Llms Enables Open World Multi Task Agents Paper Conference
No ratings yet
NeurIPS 2023 Describe Explain Plan and Select Interactive Planning With Llms Enables Open World Multi Task Agents Paper Conference
37 pages
Ward Patterson Final Report
No ratings yet
Ward Patterson Final Report
28 pages
Dreu Final Report
No ratings yet
Dreu Final Report
5 pages
Malik Ayman Lesson 4
No ratings yet
Malik Ayman Lesson 4
14 pages
Modelling of Virtual Campus Tour in Minecraft
No ratings yet
Modelling of Virtual Campus Tour in Minecraft
26 pages
Sign Language Translator
100% (1)
Sign Language Translator
4 pages
See and Think: Embodied Agent in Virtual Environment: Abstract
No ratings yet
See and Think: Embodied Agent in Virtual Environment: Abstract
19 pages
Metaversee
No ratings yet
Metaversee
39 pages
Collaborative LLM Agent Minecraft
No ratings yet
Collaborative LLM Agent Minecraft
12 pages
Craft An Iron Sword Minecraft Collaboration LLM 1
No ratings yet
Craft An Iron Sword Minecraft Collaboration LLM 1
19 pages
Buildings 13 00857
No ratings yet
Buildings 13 00857
16 pages
(Lecture Games) Python Programming Game: Andreas Lyngstad Johnsen Georgy Ushakov
100% (1)
(Lecture Games) Python Programming Game: Andreas Lyngstad Johnsen Georgy Ushakov
110 pages
Essentials Minecraft v1
82% (11)
Essentials Minecraft v1
98 pages
Classroom Build Challenges
86% (7)
Classroom Build Challenges
36 pages
Read Through Minecraft For Makers Minecraft in The Real World With LEGO, 3D Printing, Arduino, and More! 1st Edition Direct Ebook Download
No ratings yet
Read Through Minecraft For Makers Minecraft in The Real World With LEGO, 3D Printing, Arduino, and More! 1st Edition Direct Ebook Download
18 pages
Essentials Minecraft v1 PDF
No ratings yet
Essentials Minecraft v1 PDF
98 pages
Laigoddddd
No ratings yet
Laigoddddd
16 pages
Quantum Mechanics - Special Chapters PDF
No ratings yet
Quantum Mechanics - Special Chapters PDF
398 pages
World-GAN - A Generative Model For Minecraft
No ratings yet
World-GAN - A Generative Model For Minecraft
8 pages
Next Generation MMO Architecture
No ratings yet
Next Generation MMO Architecture
23 pages
Build Your Own 3D Story
No ratings yet
Build Your Own 3D Story
24 pages
Gamification-Game-Based Learning For World Language Teachers Resource Document - Lesson Plans
No ratings yet
Gamification-Game-Based Learning For World Language Teachers Resource Document - Lesson Plans
37 pages
Deriving Quests From Open World Mechanics - 2017
No ratings yet
Deriving Quests From Open World Mechanics - 2017
6 pages
Saep 349 PDF
100% (1)
Saep 349 PDF
41 pages
Coding With Minecraft 1 Introduction Student Workbook 1
20% (5)
Coding With Minecraft 1 Introduction Student Workbook 1
13 pages
Lesson Plan
No ratings yet
Lesson Plan
30 pages
Explore The Future Earth With Wander 2.0: AI Chatbot Driven by Knowledge-Base Story Generation and Text-To-Image Model
No ratings yet
Explore The Future Earth With Wander 2.0: AI Chatbot Driven by Knowledge-Base Story Generation and Text-To-Image Model
5 pages
SAP CONTROLLING - PRODUCT COSTING PART-1 - SAP Blogs
No ratings yet
SAP CONTROLLING - PRODUCT COSTING PART-1 - SAP Blogs
47 pages
Exploring AI Innovations in Minecraft
No ratings yet
Exploring AI Innovations in Minecraft
5 pages
Mincraft Pi Teacher Notes
No ratings yet
Mincraft Pi Teacher Notes
51 pages
7280609100589213-HOC 2023 - EDU Guide 11.14.23
No ratings yet
7280609100589213-HOC 2023 - EDU Guide 11.14.23
21 pages
Adding Physical Properties To 3D Models in Augmented Reality For Realistic Interactions Experiments
No ratings yet
Adding Physical Properties To 3D Models in Augmented Reality For Realistic Interactions Experiments
6 pages
Build Challenges Collection
No ratings yet
Build Challenges Collection
52 pages
Mind Craft
No ratings yet
Mind Craft
8 pages
Note de Cours 23 10 Math
No ratings yet
Note de Cours 23 10 Math
3 pages
Hour of Code - Zachariah Stuive
No ratings yet
Hour of Code - Zachariah Stuive
13 pages
Mining Learning and Crafting Scientific Experiments: A Literature Review On The Use of Minecraft in Education and Research
No ratings yet
Mining Learning and Crafting Scientific Experiments: A Literature Review On The Use of Minecraft in Education and Research
12 pages
Computational Thinking in Constructionist Video Games: Volume 6 - Issue 1 - January-March 2016
No ratings yet
Computational Thinking in Constructionist Video Games: Volume 6 - Issue 1 - January-March 2016
17 pages
Mine Tutorial Ifrn
No ratings yet
Mine Tutorial Ifrn
6 pages
Fine-Tuning GPT-2 On Annotated RPG Quests For NPC Dialogue Generation
No ratings yet
Fine-Tuning GPT-2 On Annotated RPG Quests For NPC Dialogue Generation
8 pages
Minecraft
No ratings yet
Minecraft
6 pages
Screening Tasks OpenFOAM GUI
No ratings yet
Screening Tasks OpenFOAM GUI
6 pages
We Are and VR - Final Year Project - Research Paper
No ratings yet
We Are and VR - Final Year Project - Research Paper
3 pages
Class 2 - Plan & Ws
No ratings yet
Class 2 - Plan & Ws
7 pages
Maker Culture and Minecraft Implications
No ratings yet
Maker Culture and Minecraft Implications
12 pages
Creating 3D Games in Godot (Alvaro Del Castillo)
No ratings yet
Creating 3D Games in Godot (Alvaro Del Castillo)
24 pages
Intervention Proposal Ideas 17
No ratings yet
Intervention Proposal Ideas 17
4 pages
University of Puerto Rico, Mayagüez Campus Department of Computer Science and Engineering
No ratings yet
University of Puerto Rico, Mayagüez Campus Department of Computer Science and Engineering
7 pages
Ambient Adventures: Teaching Chatgpt On Developing Complex Stories
No ratings yet
Ambient Adventures: Teaching Chatgpt On Developing Complex Stories
5 pages
Jian Lee (MMS) - Spooky Story
No ratings yet
Jian Lee (MMS) - Spooky Story
4 pages
Minecraft: A Game As An Education and Scientific Learning Tool
No ratings yet
Minecraft: A Game As An Education and Scientific Learning Tool
6 pages
The Design Process & The Role of CAD
100% (1)
The Design Process & The Role of CAD
12 pages
Dialogue
No ratings yet
Dialogue
1 page
Google Proposal
No ratings yet
Google Proposal
7 pages
Design of Horizontal Axis Tidal Turbines
No ratings yet
Design of Horizontal Axis Tidal Turbines
8 pages
2.RGP Corneal Lens
No ratings yet
2.RGP Corneal Lens
13 pages
Computer Awareness: Computer Awareness For IBPS PO/MT and Clerk
No ratings yet
Computer Awareness: Computer Awareness For IBPS PO/MT and Clerk
10 pages
First Order Open-Loop Systems
0% (1)
First Order Open-Loop Systems
18 pages
Maths Scanner
No ratings yet
Maths Scanner
136 pages
Line Parameters Program: Frequency-Dependent Electromagnetic
No ratings yet
Line Parameters Program: Frequency-Dependent Electromagnetic
10 pages
Class Xii Latest (Ii) Updated Checklist
No ratings yet
Class Xii Latest (Ii) Updated Checklist
36 pages
Ionic Equilibrium DPP
No ratings yet
Ionic Equilibrium DPP
33 pages
Computer Ebook English RBE
No ratings yet
Computer Ebook English RBE
69 pages
9 Database - PPT Compatibility Mode
No ratings yet
9 Database - PPT Compatibility Mode
30 pages
Microsoft Excel Intermediate
No ratings yet
Microsoft Excel Intermediate
9 pages
Moment Gradient Factor For Steel I-Beams
No ratings yet
Moment Gradient Factor For Steel I-Beams
20 pages
Ramsey S Legacy 1st Edition Lillehammer Download PDF
100% (6)
Ramsey S Legacy 1st Edition Lillehammer Download PDF
84 pages
CH 19 Cardiovascular System
No ratings yet
CH 19 Cardiovascular System
25 pages
How To Know (Check) My Own Mobile Number - Airtel, Idea, Jio Vodafone, Tata Docomo, Reliance, BSNL, Aircel, MTNL, Videocon, Virgin, Uninor
No ratings yet
How To Know (Check) My Own Mobile Number - Airtel, Idea, Jio Vodafone, Tata Docomo, Reliance, BSNL, Aircel, MTNL, Videocon, Virgin, Uninor
3 pages
Lec1 PDF
No ratings yet
Lec1 PDF
28 pages
Mark Scheme For Grade 11 HL Chemistry-Revision Booklet
No ratings yet
Mark Scheme For Grade 11 HL Chemistry-Revision Booklet
15 pages
Classroom Inventory List SCHOOL YEAR
No ratings yet
Classroom Inventory List SCHOOL YEAR
1 page
JavaScript Cheat Sheet & Quick Reference
No ratings yet
JavaScript Cheat Sheet & Quick Reference
23 pages
Biology Revision KS3 Cells To Systems and Respiration
No ratings yet
Biology Revision KS3 Cells To Systems and Respiration
3 pages
LTspice Tutorial Part 4 - Intermediate Circuits
No ratings yet
LTspice Tutorial Part 4 - Intermediate Circuits
23 pages
Its A Small Small Small Small World
No ratings yet
Its A Small Small Small Small World
15 pages
DVE Viscometer
No ratings yet
DVE Viscometer
1 page
Automatic High Beam Controller For Vehicles
No ratings yet
Automatic High Beam Controller For Vehicles
6 pages
Risc VS Cisc
No ratings yet
Risc VS Cisc
2 pages

Collaborative Dialogue in Minecraft: Anjali Narayan-Chen Prashant Jayannavar Julia Hockenmaier

Uploaded by

Collaborative Dialogue in Minecraft: Anjali Narayan-Chen Prashant Jayannavar Julia Hockenmaier

Uploaded by

Collaborative Dialogue in Minecraft

Anjali Narayan-Chen∗ Prashant Jayannavar∗ Julia Hockenmaier

Abstract 2015; Misra et al., 2016; Chai et al., 2018), simu-

ceding dialogue. Since Architects need to com-

BLEU Precision / Recall

You might also like