Chatbots for language learning—Are they really useful? A systematic review of chatbot-supported language learning
DOI: 10.1111/jcal.12610
REVIEW ARTICLE
KEYWORDS
affordances, chatbot usefulness, chatbots, language learning
1.1 | A niche for chatbots in language learning

Chatbots have caught the attention of language teaching researchers due to their capacity to communicate with users in the target language (Fryer et al., 2019; Jia, Chen, Ding, & Ruan, 2012; Tegos, Demetriadis, & Karakostas, 2015). Chatbot-supported language learning refers to the use of a chatbot to interact with students in natural language for daily language practice (e.g., conversation practice; Fryer et al., 2017), answering language learning questions (e.g., storybook reading; Xu, Wang, Collins, Lee, & Warschauer, 2021), and conducting assessment and providing feedback (e.g., vocabulary tests; Jia et al., 2012). With the help of visual chatbot development platforms, teachers can create chatbots by themselves without prior programming experience. For instance, Dialogflow from Google enables users to customize conversational contents by adding pre-set databases. BotStar, an online chatbot platform, allows users to drag and drop conversational flows using a design dashboard, by which teachers can script students' learning experience around the intended learning objectives. More recently, Artificial Intelligence and Machine Learning techniques have enhanced chatbots' ability to adapt to end-users' unstructured inputs.
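The platforms just described all follow the same basic pattern: the teacher defines a set of intents, each with trigger phrases and a scripted reply, and the chatbot matches each learner utterance against that pre-set data. The sketch below is only a generic illustration of this idea in Python — the intent names, keywords and replies are invented, and it does not represent the internals of Dialogflow or BotStar. Its crude keyword matching also makes clear why misspelled or unstructured input defeats purely scripted bots, which is where the machine-learning techniques mentioned above come in.

import re

# A teacher-scripted "database" of intents: trigger keywords plus a canned reply.
INTENTS = {
    "greeting":   {"keywords": {"hello", "hi", "good", "morning"},
                   "reply": "Hello! Shall we practise some English today?"},
    "order_food": {"keywords": {"order", "menu", "restaurant", "eat"},
                   "reply": "Welcome to the restaurant. What would you like to order?"},
    "vocab_quiz": {"keywords": {"quiz", "vocabulary", "word", "test"},
                   "reply": "Sure - can you use the word 'reluctant' in a sentence?"},
}
FALLBACK = "Sorry, I did not understand that. Could you rephrase it?"

def respond(utterance: str) -> str:
    """Pick the intent whose keywords overlap most with the learner's input."""
    tokens = set(re.findall(r"[a-z']+", utterance.lower()))
    best_reply, best_overlap = FALLBACK, 0
    for intent in INTENTS.values():
        overlap = len(tokens & intent["keywords"])
        if overlap > best_overlap:
            best_reply, best_overlap = intent["reply"], overlap
    return best_reply

print(respond("Hi! Good morning"))      # matches the scripted greeting intent
print(respond("Can I see the menu?"))   # matches the restaurant role-play intent
print(respond("Helo, gud mornin"))      # misspellings fall through to the fallback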
Active dialogue practice and sufficient immersion in language learning contexts are critical drivers of learners' communication competence and language proficiency. However, language teachers are often challenged by the unwillingness of many students to communicate in their second or foreign language. Chatbot researchers have suggested that a more interactive and authentic language environment enabled by chatbot-supported activities can improve student language learning outcomes (Lu, Chiou, Day, Ong, & Hsu, 2006; Wang, Petrina, & Feng, 2017). Fryer and Carpenter (2006) highlighted the potential of chatbots to diminish the shyness that students may feel during language practice compared with talking with a human partner. Chatbots may also reduce the transactional distance between learners and instructors in an online learning space. According to Moore's (1993) theory of transactional distance, there is a psychological and communication gap between the instructor and the learner in an online learning space, which creates room for potential misunderstanding. If the transactional distance is reduced, learners are more likely to feel satisfied with their learning environment. Chatbots can help reduce the transactional distance by providing a dialogue through which the learner interacts with the course content.

Educational chatbots in language learning contexts can generally be identified by the following three common features. First, they are available to support students 24/7 (Garcia, Fuertes, & Molas, 2018). Students can practise their language skills with chatbots anytime they like, which a human partner could not easily do (Haristiani, 2019; Winkler & Soellner, 2018). Second, chatbots can provide students with broad language information that human language partners may lack. Given that most EFL/ESL students and their peers are at a similar target language proficiency level, learners may not be able to provide extra language knowledge to their peers (Fryer et al., 2019). A well-designed chatbot, however, could provide extra information such as a broad range of expressions, questions, and vocabulary. Third, chatbots can play the role of a tireless assistant, freeing humans from repetitive work (Fryer et al., 2019; Kim, 2018b) such as answering frequently asked questions and sustaining language practice. Chatbots as learning partners are willing to communicate with students endlessly, which offers students the chance to continuously practise the new language.

1.2 | Critical appraisal of using chatbots in language learning

However, despite the potential of chatbots to reduce students' anxiety (Ayedoun, Hayashi, & Seta, 2015, 2019; Bao, 2019) and engage them in language learning (Ruan et al., 2019), the novelty effect of chatbots has been mentioned as a possible reason why learners' engagement and performance improvement are only short-term (Fryer et al., 2019; Ayedoun et al., 2019). "The novelty effect" refers to the newness of a technology to students, which disappears after students become more familiar with the technology.

An additional worry that has been expressed about the use of chatbots is their limited capabilities, despite the exponential increase of chatbot implementation in educational contexts (Smutny & Schreiberova, 2020). Designing intelligent dialogue in chatbots is challenging for software developers despite the advancement of artificial intelligence (Brandtzaeg & Følstad, 2018). For instance, if students misspell their inputs, they may receive irrelevant responses from the chatbot. A chatbot with low intelligence cannot fulfil students' requests and thereby may provide unrelated answers (Haristiani, 2019; Lu et al., 2006). Students' interaction may be restricted to the pre-set knowledge base (Grudin & Jacques, 2019). Another limitation is the chatbot's inability to understand multiple sentences at once (Kim, Cha, & Kim, 2019), which is unlike human-human interaction in a real language learning context.

To more effectively implement chatbot use, it is crucial to know how chatbots have been used for current language learning and what improvements might be incorporated into future chatbot-supported language learning environments.

1.3 | Rationale for the current review

Several recent articles have reviewed the use of chatbots in language learning (Fryer et al., 2020; Haristiani, 2019; Kim et al., 2019). Although these articles have undoubtedly increased our understanding of chatbot use in language learning, they have mainly focused on only one or two narrow aspects of chatbot use. Haristiani (2019), for instance, reviewed the different types of chatbots used in language learning, and found that Cleverbot was the main chatbot used. Kim et al. (2019) similarly reviewed and reported on the different types of chatbots used in language learning, and found that few chatbot programs allowed chatbots and humans to directly interact via voice recognition systems or texting for the purpose of learning foreign languages. Fryer et al. (2020) reported on two current developments of
chatbots ("Cleverbot", "Mondly"), and provided some suggestions on how the two chatbots could be structured to make the technology more useful to foreign language learners.

Unlike the previous reviews, the current review goes beyond merely reporting the specific types of chatbot employed in past empirical studies. It empirically examines the possible technological, pedagogical, and social affordances associated with chatbots in language learning through the lens of the usefulness theoretical perspective (Kirschner, Strijbos, Kreijns, & Beers, 2004; see the following section for detail). This could help educators better understand how chatbots have actually been used in language learning, their benefits and challenges, and suggestions for dealing with these challenges.

Pedagogical affordances concern whether and how a particular learning behaviour could possibly be enacted within a given context (Kirschner et al., 2004). For example, the characteristics of a learning tool for teaching and learning activities can determine if and how individual and group-based learning can take place. Social affordances, as portrayed in Kirschner et al. (2004), are the functions of a tool that facilitate social interaction among participants. One key feature that can determine the extent of social interaction is the technological tool's ability to support social presence among the different participants (Tu, 2000). Social presence can be defined as the feeling that the interactants in an online space are real people (Garrison, Anderson, & Archer, 1999). Hence, in this study, we characterize social affordances as the potential of chatbots to promote social presence in communication with the learners.
However, we also included conference proceedings (if any) due to the emerging nature of this field, to obtain the most up-to-date information about chatbots' implementation. In addition, according to the Cochrane Handbook for Systematic Reviews of Interventions, searching for conference proceedings is a highly desirable practice because it can help capture as many studies as possible and minimize the risk of publication bias (Lefebvre et al., 2021). Peer-reviewed journal articles, as well as conference proceeding papers, were selected if they exhibited academic rigour in reporting empirical data on students' language learning using chatbots. For example, articles had to clearly report the types of student outcomes that were examined. In addition, if the article was a quasi-experiment, it had to report the statistical values (e.g., p value).

By the end of October 2020, two rounds of literature search had yielded 2,261 candidate papers, which were screened against the inclusion and exclusion criteria (Table 1). Figure 2 presents the literature searching and selection process, which was in accordance with PRISMA guidelines (Moher et al., 2009). After deleting duplicates, reviewing abstracts, and reading full-text papers, we identified 25 eligible studies for this review.

TABLE 1 Inclusion and exclusion criteria
Inclusion: The study must be conducted with a chatbot. Exclusion: Any study on other tools for language learning (e.g., physical robots, multimedia platforms).
Inclusion: The study must be conducted in the field of language learning. Exclusion: Any study about computer language learning (e.g., natural language processing, programming).
Inclusion: The study must include empirical data. Exclusion: Any study that proposes only the design, development, or evaluation of a chatbot but does not describe in detail how the chatbot was used to support language learning.
Inclusion: The study must be conducted in educational settings for educational purposes. Exclusion: Any study involving students as participants that merely reports about instruction activities, with no details given about the chatbot used.
Inclusion: The study must be written in the English language. Exclusion: Studies written in any other language.
FIGURE 2 The process of literature searching and selection. Records identified through database searching (n = 2,261): EBSCOhost (n = 268), Web of Science (n = 38), Scopus (n = 67), ProQuest (n = 122), Google Scholar (n = 1,766).
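As a quick arithmetic check on the identification stage summarized in Figure 2, the per-database counts do sum to the 2,261 records reported; the snippet below (variable names are mine) simply tallies them.

records_identified = {
    "EBSCOhost": 268, "Web of Science": 38, "Scopus": 67,
    "ProQuest": 122, "Google Scholar": 1766,
}
print(sum(records_identified.values()))  # 2261 records identified; 25 studies were finally included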
2.2 | Data analysis

To answer RQ1—in what contexts (e.g., country, language domain) have chatbots been used in language learning—we analysed the descriptions of the chatbot implementations in the 25 studies and reported the relevant information, such as the country or region where the research took place, the language being taught, the educational context, the chatbot interface design, and the learning mode between the chatbot and students (see Table 2).

To answer RQ2, RQ3, and RQ5—what are the technological and pedagogical affordances, and the challenges, of using chatbots in language learning—we used an inductive grounded approach (Braun & Clarke, 2006) to identify and categorize the major relevant themes. The unit of analysis was each individual empirical study. The coding scheme was not predetermined prior to our analysis but emerged inductively and was continually refined through our interaction with the data. The two examples below illustrate how the data were analysed and coded.

The first example was taken from the study by Ruan et al. (2019), which reported the use of a voice-based chatbot called BookBuddy that asked a child for basic information such as name, gender, and interests (e.g., animals, gardens). Then the underlying recommendation algorithm in the chatbot would find the most appropriate book in a book database for the child to read. The example described here was coded as the pedagogical activity of "providing recommendation" because the most salient element appeared to be the chatbot using a recommender system to suggest relevant books for a child.

The second example was taken from a study by Fryer et al. (2017), which demonstrated that students' interest in interacting with a chatbot partner significantly declined over time due to a novelty effect. This apparent novelty effect, however, did not occur when students interacted with a human partner. The example described here was coded as a challenge of using chatbots called "novelty effect" because this was the explanation provided by the study researchers.

To answer RQ4—what are the social affordances of using chatbots in language learning—we used Garrison's social presence framework (Garrison, 2011; Garrison et al., 1999) to guide our initial analysis and coding. According to Garrison (2011), the classification scheme for social presence consists of three categories, namely interpersonal communication, open communication, and cohesive communication. Interpersonal communication (e.g., the expression of emotion, self-disclosure, use of humour) creates an academic climate and a sense of belonging for students' purposeful communication. Open communication (e.g., asking questions, expressing agreement and appreciation) fosters a trustful learning environment that enables students to question each other while protecting self-esteem and acceptance. Cohesive communication (e.g., using vocatives, greetings and closures) builds students' group identity and sustains a collaborative online learning environment. Although Garrison's framework was used a priori, we did not forcefully impose any of the indicators onto our data corpus. During the course of our analysis, we also allowed new types of indicators (if any) to emerge inductively during the coding process.

Each article was first read in its entirety and coded into themes by the first author. To ensure the reliability of the data analysis, seven studies (28% of all eligible studies) were randomly chosen and a trained coder was involved to code the information. The inter-coder agreement was 86%. Disagreements were resolved by discussion between the first author and the trained coder until consensus was reached.
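The 86% figure above is a simple percentage agreement over the double-coded studies. Purely as an illustration (the review itself reports only the percentage), the sketch below shows how percentage agreement and the chance-corrected Cohen's kappa can be computed for two coders' theme labels; the example labels are invented.

from collections import Counter

def agreement_stats(coder_a, coder_b):
    """Percentage agreement and Cohen's kappa for two coders' category labels."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement expected from each coder's marginal label distribution
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in counts_a | counts_b)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

first_author = ["timeliness", "novelty effect", "helpline", "recommendation",
                "interlocutor", "novelty effect", "helpline"]
trained_coder = ["timeliness", "novelty effect", "helpline", "interlocutor",
                 "interlocutor", "novelty effect", "helpline"]
observed, kappa = agreement_stats(first_author, trained_coder)
print(f"agreement = {observed:.0%}, kappa = {kappa:.2f}")  # agreement = 86%, kappa = 0.82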
3 | RESULTS

3.1 | In what contexts have chatbots been used in language learning?

Table 2 presents the contextual information of all 25 reviewed studies.

3.1.1 | Geographic distribution

The geographic distribution of studies indicated that 18 studies were conducted in Asia, five in North America, one in Ireland, and one in Ukraine. One study (Yin & Satar, 2020) was considered cross-continental because it was conducted in both China and the United Kingdom.

3.1.2 | Language

The results showed that English was the dominant language in the use of chatbots for students' language learning. Among the 23 studies involving chatbots used for English learning, three addressed English literacy for native English speakers (Lin & Chang, 2020; Xu et al., 2021; Xu & Warschauer, 2020); the other 20 studies involved the teaching of English as a foreign language (EFL) or second language (ESL). One study involved the teaching of Chinese as a second language. One study involved the teaching of Irish, an endangered language, to students in Ireland, where the dominant national language has shifted to English.

3.1.3 | Domains

Chatbots were implemented to assist students' learning of speaking, listening, reading, and writing skills. The learning topics covered vocabulary, grammar (e.g., verb tense), academic pragmatics, and meaning negotiation strategies (e.g., checking comprehension during dialogue). For example, in one study, students conversed with a chatbot on diverse topics (e.g., business, the environment) for more than 10 min per session, as homework (Kim, 2018b).

3.1.4 | Educational settings

The use of chatbots for language learning was concentrated in higher education. A total of 19 of the 25 studies were conducted with university students. Three studies involved elementary school students as the target learners; the other three studies were conducted in a secondary education context.
TABLE 2 Summary of 25 articles reviewed
First author (year) | Country/region | Language: Domain | Edu. context | Chatbot name | Chatbot interface | Development | Capability | Learning mode
Ayedoun et al. (2015) | Japan | EFL: Speaking | HE | Jack | Web-based, human-like avatar | Self-designed system | TF, CD | Individual
Ayedoun et al. (2019) | Japan | ESL: Speaking | HE | Peter | Web-based, human-like avatar | Self-designed system | TF, CD | Individual
Ayedoun (2020) | Japan | ESL: Speaking | HE | Peter | Web-based, human-like avatar | Self-designed system | TF, CD | Individual
Chen, Vicki Widarso, and Sutrisno (2020) | Taiwan | CSL: Vocabulary | HE | Xiaowen | Mobile messenger, textual | Self-designed system | TF, CD | Individual
Chiaráin (2016) | Ireland | Irish: Speaking | SE | Taidhgín | Web-based, text-to-speech | Self-designed system | VC, UD | Individual
Fryer (2017) | Japan | EFL: Speaking | HE | Cleverbot | Web-based, speech-to-text | Existing system | VC, UD | Individual
Fryer (2019) | Japan | EFL: Speaking | HE | Cleverbot | Web-based, speech-to-text | Existing system | VC, UD | Individual
Fryer (2020) | Japan | EFL: Speaking | HE | Cleverbot | Web-based, speech-to-text | Existing system | VC, UD | Individual
Gallacher, Thompson, and Howarth (2018) | Japan | EFL: Speaking | HE | Cleverbot | Web-based, speech-to-text | Existing system | VC, UD | Individual
Goda, Yamada, Matsukawa, Hata, and Yasunami (2014) | Japan | EFL: Speaking | HE | ELIZA | Web-based | Self-designed system | TF, CD | Individual
Hsu (2020) | Taiwan | EFL: Speaking | HE | n/a | n/a | n/a | n/a | Individual
Jia (2008) | China | EFL: Grammar | SE | CSIEC | Web-based, textual and auditory | Self-designed system | VC, TF, UD, CD | Individual
Jia et al. (2012) | China | EFL: Vocabulary | SE | CSIEC | Web-based, textual and auditory | Self-designed system | TF, CD | Individual
Kim (2016) | Korea | EFL: Negotiation of meaning | HE | Indigo | Mobile messenger, auditory | Existing system | VC, UD | Individual
Kim (2018a) | Korea | EFL: Vocabulary | HE | Elbot | Mobile messenger, textual and auditory | Existing system | VC, UD | Individual
Kim (2018b) | Korea | EFL: Listening & Reading | HE | Elbot | Mobile messenger, textual and auditory | Existing system | VC, UD | Individual
Kim et al. (2019) | Korea | EFL: Grammar | HE | Replika | Mobile messenger, textual | Existing system | VC, UD | Individual
Lin and Chang (2020) | Canada | English: Writing | HE | DD | Web-based, textual | Self-designed system | TF, CD | Individual
Ruan, Willis (2019) | United States | EFL: Reading | ECE | BookBuddy | Web-based, auditory | Self-designed system | TF, CD | Individual
Tegos, Demetriadis, and Tsiatsos (2014) | Ukraine | EFL: Speaking | HE | MentorChat | Web-based, textual | Self-designed system | TF, CD | Group
Wang et al. (2017) | China | EFL: Grammar | HE | VILLAGE | Web-based | Self-designed system | TF, CD | Individual
Abbreviations: CD, chatbot-driven interaction; CSL, Chinese as a second language; ECE, early childhood education; EFL, English as a foreign language; ESL, English as a second language; HE, higher education; SE, secondary education; TF, task-focused chatbot; UD, user-driven interaction; VC, virtual companion chatbot.

The remaining four studies summarized in Table 2 are Xu et al. (2021) and Xu (2020) (United States; English: Reading; ECE; auditory interface; self-designed systems; TF, CD; individual), Yang (2010) (United States; ESL: Speaking; HE; Web-based, human-like avatar; self-designed system; TF, CD; individual), and Yin (2020) (China and United Kingdom; EFL: Negotiation of meaning; HE; existing system; VC, UD; individual).

The most common chatbot interface designs can be categorized into two types: Web-based (Figure 3) and mobile messenger (Figure 4) interfaces.

FIGURE 3 Screenshot of Cleverbot (an example of a web-based chatbot)

Virtual companion chatbots provide open conversation with the user on any topic, with the primary purpose of engaging users in a continuous dialogue (Grudin & Jacques, 2019). Students can direct the conversation topics. Task-focused chatbots, in contrast, support specific learning tasks such as thesis writing (Lin & Chang, 2020) and restaurant reservation (Ayedoun et al., 2015).

The majority of the chatbots were used for individual learning activities, in which students communicated with one chatbot via an individual channel and could not interact with each other synchronously in the same environment.

3.2 | What are the technological affordances, if any, of using chatbots in language learning?

3.2.1 | Timeliness
Chatbots offer students the opportunity to learn the language at any time. For instance, the students in Kim's (2018a) study practiced English with chatbots outside class. Real-time interaction with chatbots can satisfy students' need to learn at their own pace (Chen et al., 2020) and offer students a sense of authenticity in a native-speaking environment (Wang et al., 2017).

3.2.2 | Personalization

Even when given a common topic to discuss with a chatbot, different students can communicate with the chatbot differently, with different inputs. The ability of chatbots to respond with specific information to students' previous utterances can personalize students' learning. For example, students in Ruan and her colleagues' (2019) study received recommended reading materials, based on their gender and interests, from the chatbot BookBuddy. Similarly, Jia et al. (2012) enabled the chatbot CSIEC to provide students with English chatting topics based on students' registration information, such as their educational level and address.

3.3 | What are the pedagogical affordances, if any, of using chatbots in language learning?

3.3.1 | Interlocutor

This function emphasized the role of the chatbot as a learning companion to assist students' language learning.
Three subcategories of interlocution were revealed: (a) language knowledge practice activities, (b) learning skills facilitation activities, and (c) the coordination of group discussion.

The first type of learning activity involved students interacting with chatbots to facilitate the daily practice of targeted language knowledge. Kim (2018b), for example, explored the use of chatbots in
an EFL course for freshmen students majoring in different subjects at a Korean university; the chatbots were used to promote their vocabulary learning. The students were randomly allocated to a treatment group or a control group. Across an eight-week intervention period, the students in the treatment group interacted with a messenger chatbot for 10 min per week, discussing diverse chat topics regarding school life and movies, while no chatbot was used in the control group. The students' pre- and post-test results revealed a significant improvement only in the treatment group in terms of a change in their vocabulary knowledge, specifically adjective and verb knowledge. In the post-survey, students reported that they felt confident practicing English vocabulary with the chatbot. In another example of a language practice activity, 50 students in their first year of secondary school in China were introduced to the CSIEC chatbot system to practise grammar knowledge and sentence expression (Jia & Ruan, 2008). The learning content in the CSIEC system was taken from English textbooks. The students interacted with the chatbot both during class time and at home through gap-filling exercises. Once a student had finished a practice session, the chatbot system rewarded the student with a star badge as positive reinforcement. In the post-intervention survey, the students reported that using the chatbot had benefited their English learning. More than 75% of the students stated that they wished to use the chatbot throughout their English instruction.

The second type of activity involved using chatbots to help students learn certain skills such as critical thinking and negotiation skills. An example of this can be found in Goda's study (2014), in which students in the experimental group conversed with the Web-based chatbot ELIZA for 10 min to prepare for a group discussion about the topic "an ideal family," while the control group searched for related information online. During the pre-discussion preparation, ELIZA required students to clarify their ideas or thoughts on the topic using Socratic questions (e.g., "Why do you think that?"). The experimental group showed a significant improvement in critical thinking, especially in their awareness of critical thinking. For instance, students reported that they were able to organize ideas well and engage with difficult problems. In another example, students with different levels of language proficiency interacted with the voice-based chatbot Indigo over their mobile phones. The students engaged with Indigo for 16 weeks, and their chat scripts in the last session indicated an increasing frequency of use of negotiation strategies compared with the first session (Kim, 2016). Students at different language proficiency levels showed improvement in different negotiation strategies: low-level students demonstrated more repetition and reformulation skills to overcome communication silence during chatting, while medium-level and high-level students were more inclined to use confirmation check strategies to comprehend the conversation.

The third type of activity involved using chatbots to coordinate student online group discussions, where students were asked to interact with one another synchronously. Tegos et al. (2014), for example, explored the use of the dialogue-based chatbot MentorChat to trigger students' utterances in group discussions and balance the conversation between "weak" and "strong" students to foster peer interactions.
Once MentorChat identified "weak" students, who gave no response to a given question, the system would direct the question again to these students by mentioning their names, for instance, "Janna, what can help you to block out negative thoughts?" (p. 78). The participating students reported that the chatbot helped them recall key concepts learned in previous sessions.
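The coordination rule described above — spotting group members who have not yet responded and re-addressing the question to them by name — can be pictured with the small sketch below. It is only an illustration under assumptions of mine (the data structures and names are invented), not MentorChat's actual implementation.

def redirect_prompt(question, group, responses):
    """Re-address the discussion question to members with no recorded utterance."""
    silent = [student for student in group if not responses.get(student)]
    return [f"{student}, {question}" for student in silent]

group = ["Janna", "Ming", "Sofia"]
responses = {
    "Ming": ["We could go for a walk outside."],
    "Sofia": ["Listening to music helps me relax."],
}
print(redirect_prompt("what can help you to block out negative thoughts?", group, responses))
# -> ['Janna, what can help you to block out negative thoughts?']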
3.3.2 | Simulating an authentic language environment

This pedagogical approach emphasized the use of chatbots to simulate a virtual target-language speaking environment through two types of learning activities: (a) role-play activities and (b) learning scenario representation. Role-playing entails having chatbots perform real-life roles to simulate an authentic communication environment in the targeted language. In a study conducted by Ayedoun et al. (2015), for example, a Web-based chatbot was presented as a native English-speaking waiter equipped with verbal and non-verbal (i.e., facial expressions, head movements, and lip-syncs) functions to simulate a restaurant setting. Five university students played the role of customers, interacting with the chatbot individually. An analysis of answers to the pre- and post-questionnaires suggested students' rising self-confidence and growing desire to communicate in English. Another example of a role-play simulation activity was found in Yang and Zapata-Rivera's (2010) study, in which a chatbot took the role of a professor named Dr. Brown who responded to students' request strategies. Fifteen students from a university in the U.S. interacted with the chatbot system for 45 min during the intervention. The results of the usability questionnaire indicated that 92% of the students agreed that communicating with the chatbot motivated them to learn how to make requests in academic settings with professors.

An example of a learning scenario representation was reported by Wang et al. (2017). Within the virtual simulation, chatbots were built into different scenarios that students could "visit" (e.g., a real estate company office, stores, a supermarket, a hotel, and a restaurant). The virtual learning environment immersed students in simulations of scenarios in which they had to tackle problems similar to those that they might meet in reality. Students who experienced interacting with the chatbots later reported they felt a sense of being physically present in the language learning environment.

3.3.3 | Transmission of information

This function highlighted the use of chatbots as a channel to deliver learning contents prepared beforehand by the course teacher. For example, Lin and Chang (2020) explored the use of the chatbot DD to deliver an essay-writing outline in two tutorial sessions, where the chatbot introduced the features of a thesis statement. Chiaráin and Chasaide (2016) designed the chatbot Taidhgín as an Irish native speaker to talk to secondary students. The topics (e.g., hobbies and holidays) were associated with the curriculum for the second-level school oral examination. A correction system was used in this chatbot to present the most common grammatical and orthographic errors made by students.

3.3.4 | Helpline

By trawling through enormous amounts of information in a database, chatbots can perform the function of a helpline, providing students with information about the learning content when queried by users. Wang et al. (2017), for example, explored the integration of chatbots in the Web-based virtual learning platform VILLAGE, in which students were assigned to practice their grammar (e.g., using linking verbs and constructing sentences using different verb tenses). Students could access the chatbot for help whenever they encountered problems with the learning activities.

3.3.5 | Providing recommendation

Chatbots can also automatically recommend learning materials according to the students' prior utterances. For example, Ruan et al. (2019) explored a voice-based chatbot, BookBuddy, as a virtual learning partner to facilitate children's reading comprehension. The chatbot system collected children's information, analysed appropriate topics during the interaction, and recommended books to children from the database. The children reported that they enjoyed speaking English with the chatbot and were highly engaged during the interaction.
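The recommendation behaviour described for BookBuddy — matching what a child says about their interests against a book database — amounts to a simple content-based ranking. The sketch below is a schematic stand-in only: the titles, tags, levels and scoring rule are invented and are not the algorithm used in the reviewed study.

BOOKS = [
    {"title": "The Garden Mystery", "tags": {"gardens", "mystery"}, "level": 2},
    {"title": "Animal Friends",     "tags": {"animals", "friends"}, "level": 1},
    {"title": "Space Explorers",    "tags": {"space", "science"},   "level": 3},
]

def recommend(interests, reading_level):
    """Rank books by interest overlap, breaking ties by closeness of reading level."""
    def score(book):
        return (len(book["tags"] & interests), -abs(book["level"] - reading_level))
    return max(BOOKS, key=score)

child_interests = {"animals", "gardens"}   # collected through the chatbot's opening questions
print(recommend(child_interests, reading_level=1)["title"])   # -> Animal Friends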
3.3.6 | Effects of using chatbots on students' behavioural and cognitive outcomes

Chatbots are synchronous tools that support individual learning and have been used with the aim of increasing student outcomes (Winkler & Soellner, 2018). Fryer et al. (2020), however, pointed out that the limitations of the technology itself (e.g., low accuracy of chatbots' responses) may diminish the degree to which chatbots can help improve students' performance. To robustly understand the effects of using chatbots on students' language outcomes, we searched within the previous list of 25 articles using more restrictive criteria to select eligible studies (Table 5).

Guided by the inclusion and exclusion criteria, we identified eight experimental research studies (Table 6). Two of them examined students' behavioural outcomes and the other six investigated students' cognitive outcomes. Behavioural outcomes refer to students performing learning tasks such as participating in discussions (Goda et al., 2014), which can be assessed by observations (e.g., the number of conversations). Cognitive outcomes refer to students' learning performance concerning domain-specific knowledge, such as reading (Xu et al., 2021), writing (Lin & Chang, 2020), vocabulary (Jia et al., 2012; Kim, 2018b) and grammar (Kim, 2019).
TABLE 5 Criteria of studies evaluating the effects of chatbots on student behavioural and cognitive outcomes
Inclusion: The study must have an experimental group (i.e., learning with a chatbot) and a control group (i.e., learning without a chatbot). Exclusion: Any study that has no control group.
Inclusion: The study must report quantitative findings regarding chatbots' impact on students' language learning performance for specific language knowledge or the obtainment of certain language skills. Exclusion: Any study that evaluates students' anxiety, confidence, task and course interest, and attention.
Inclusion: Students' learning outcomes must be assessed by objective measurements such as test scores, chat records, or the number of conversations observed. Exclusion: Any study that relies only on self-reported data, such as students' questionnaires.
Inclusion: The study must provide statistical data, such as means, standard deviations, or t-test results, for both the experimental group and the control group. Exclusion: Any study that merely reports percentages of outcomes; any study that reports incomplete p values; any study that reports data only from the experimental groups.

Overall, previous studies examining the effects of chatbots on students' behavioural outcomes showed positive results when chatbots were used to buttress the learning content through interaction with students. For example, Goda et al. (2014) used a chatbot to prepare students for a group discussion activity. All students were given a discussion topic and 10 min to prepare. A chatbot was used in the experimental group to help students organize their ideas and structure the discussion; students in the control group were asked to search for information online. After the 10-min preparation, students joined the group discussion with their peers in the same condition. The number of interactions in this peer group discussion was coded for both groups. The results indicated that the number of conversation actions in the experimental group was higher than that of the control group. However, it should be noted that conversation frequency is not equivalent to communication quality, which the authors did not discuss. In Kim's (2016) study, students were first grouped into three proficiency levels (i.e., low, medium, and high, based on a TOEIC test). TOEIC is a standardized English test that measures the English language skills required in the workplace. Next, students in each level (i.e., low, medium, high) were randomly assigned to either an experiment group (i.e., communicating with a chatbot) or a control group (i.e., communicating with human peers). Results indicated that the low-level students in the chatbot group performed more repetition and reformulation strategies (e.g., repeating words or paraphrases from previous interactions rather than paraphrasing the meanings) than the non-chatbot group. In contrast, the medium-level and high-level students requested clarifications and performed confirmation checks more frequently during the conversation with the chatbot. There was no significant difference in the performance of comprehension check strategies between the chatbot groups and the control group.

Previous research, however, suggested mixed findings concerning the effects of chatbots on students' cognitive outcomes. On the one hand, several studies (Kim, 2018a, 2018b reading; Xu et al., 2021) reported no significant difference between participants who used chatbots and those who did not. For example, Xu et al. (2021) compared children's reading comprehension under a chatbot-assisted conversation (experiment group) and a human conversation (control group). The chatbot was assigned to ask questions to guide children in the experimental group to understand the story, whereas children in the control group were asked the same questions by a human teacher. A post-hoc analysis of the comprehension scores from the experimental and control groups indicated that the chatbot had a similar effect (p = 0.29) to a human teacher in facilitating children's reading comprehension by asking guided questions. Kim (2018a and 2018b reading) integrated a chatbot as a language practice partner into university students' homework to engage students in out-of-class vocabulary and reading learning. Students interacted with the chatbot "Elbot" on mobile phones via text or auditory messages. No significant differences in students' vocabulary knowledge (adjective use, noun use) and reading skill were reported between the experimental groups (interacting with the chatbot) and the control groups (which received no treatment).

On the other hand, other studies (Jia et al., 2012; Kim, 2018b listening, 2019; Lin & Chang, 2020) reported positive effects of chatbots on students' language learning. For example, Jia et al. (2012) designed specific dialogue scripts based on a course syllabus. The students in the experimental group were required to use the CSIEC chatbot to take one vocabulary assessment per week to review their vocabulary knowledge through both closed questions and multiple-choice questions, whereas the students in the control group did not complete any chatbot assessments. A post-test on students' vocabulary acquisition indicated a significant difference in favour of the experimental group (p = 0.044), with a moderate effect size (g = 0.417).

The experiment group students in Kim (2018b listening) carried out conversations on topics ranging from business to the environment using voice chats with the chatbot Elbot, for a total of 20 sessions with each session lasting more than 10 min, over 16 weeks. Although both the experimental and the control group received formal listening instruction during the regular English teaching time period, the control group received no treatment. Results showed that students using the chatbot significantly outperformed the control group (p = 0.013) in their listening skills, with a large effect size (g = 0.752).

In another study, Kim et al. (2019) examined the effects of a chatbot on Korean college students' English grammar skills. Students in the experiment group conversed via text on mobile phones with the chatbot Replika, which could ask them questions. The chatbot conversations took place in 10 chat sessions, each session lasting 10 min, over a period of 16 weeks. Students in the control group conversed with a human partner. Student grammar outcomes were measured by a grammar test adapted from a standardized test. A post-test indicated a significant difference in favour of the chatbot group (p = 0.046), with a moderate effect size (g = 0.482).

The final study, by Lin and Chang (2020), employed the chatbot DD to introduce students to the main elements of argumentative essay writing (e.g., statement, topic sentences, and conclusion) during tutorial sessions over 2 weeks. Each tutorial class lasted 50 min. In the experimental group, the chatbot DD was assigned to deliver the outline of the essay statement in the first week and then guide students in providing peer feedback on their classmates' essay outlines. Students in the control group wrote the essay outline without interacting with the chatbot. Students' achievement was measured by their essay outline writing. Students learning with chatbot DD performed better than the control group (p = 0.027), with an almost negligible effect size (g = 0.039). The authors did not establish whether any significant difference existed between the experimental and control group students' initial writing ability at the start of the study.

In summary, results from the eight experimental studies reveal that chatbots can positively enhance students' language learning in some topics such as grammar, listening, and writing. Chatbots do not appear to improve students' reading comprehension. Results concerning the effectiveness of chatbots in enhancing vocabulary learning were mixed. However, because these results (reading and vocabulary) were based on only two experimental studies each, they should be interpreted with caution. So far, no studies have reported an adverse effect of using chatbots on students' language learning outcomes.
TABLE 6 Experimental studies using chatbots (continued)
Kim (2018b): EG = 24, CG = 22; 10 min a week for 16 weeks; t-test (TOEIC listening and reading test). Listening: + (p < 0.05) in favour of EG, g = 0.752. Reading: no significant differences (p > 0.05) between EG and CG, g = 0.148. EG: students talked with the chatbot Elbot on mobile phones as homework; the conversations could be textual or auditory and took place in weekly 10-min sessions for 16 weeks (20 chat sessions in total); chat topics varied from business to the environment. CG: no treatment was received. No initial difference between EG and CG.
Kim et al. (2019): EG = 36, CG = 34; 10 min a week for 16 weeks; t-test (grammar test adapted from a standardized test). Grammar: + (p < 0.05) in favour of EG, g = 0.482. EG: students talked via text on mobile phones with the chatbot Replika, which could ask them questions; the conversations took place in weekly 10-min sessions for 16 weeks. CG: students chatted with student partners. No initial difference between EG and CG.
Xu et al. (2021): EG = 33, CG = 31; 20 min; post-hoc analysis (story comprehension quizzes). Reading: no significant differences (p > 0.05) between EG and CG, g = 0.024. EG: students read a story while being asked questions by a voice-based chatbot to guide their comprehension, within 20 min. CG: students were asked the same questions by a human teacher. No initial difference between EG and CG.
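The entries in Table 6 report group comparisons as t-tests with p values and Hedges' g effect sizes. For readers less familiar with how such figures are produced, the sketch below computes both from two sets of post-test scores; the scores are invented for illustration and do not reproduce any reviewed study's data.

import numpy as np
from scipy import stats

def hedges_g(x, y):
    """Standardized mean difference between two groups, with the small-sample correction."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_sd = np.sqrt(((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2))
    cohen_d = (x.mean() - y.mean()) / pooled_sd
    correction = 1 - 3 / (4 * (nx + ny) - 9)   # Hedges' correction for small samples
    return cohen_d * correction

# Hypothetical post-test scores for a chatbot group (EG) and a control group (CG)
eg = [78, 82, 75, 88, 91, 70, 84, 79, 86, 80]
cg = [72, 74, 69, 83, 77, 66, 75, 71, 79, 73]

t_stat, p_value = stats.ttest_ind(eg, cg)      # independent-samples t-test
print(f"t = {t_stat:.2f}, p = {p_value:.3f}, g = {hedges_g(eg, cg):.2f}")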
3.4 | What are the social affordances, if any, of using chatbots in language learning?

We examined the social presence of chatbots in language learning by focusing especially on the eligible studies' instructional design, the analysis of students' discourse, and interviews with students about their views on chatbot-supported language learning. As given in Table 7, three categories of social presence were identified in chatbot-supported language learning: interpersonal communication (e.g., students' self-disclosure), open communication (e.g., continuing a thread, asking questions, expressing agreement) and cohesive communication (e.g., using vocatives and greetings).

The findings suggested that interpersonal communication can be established through students' self-disclosure with chatbots. For example, Goda et al. (2014) reported that students demonstrated self-disclosure when they discussed the essential factors of an ideal family with a chatbot partner. The students expressed their personal opinions about their own family, such as "my family members aren't friendly," and they also asked the chatbot partner questions, such as "Do you have family?" Similarly, Xu and Warschauer (2020) reported that students shared personal experiences when they were explaining their opinions to the chatbot. The findings showed that it was possible to use a chatbot as a learning partner to enhance social interaction through exchanging self-disclosure information. In another study, analysis of the conversation scripts by Ayedoun et al. (2020, p. 608) showed that the chatbot was able to continue a thread with students by providing examples of possible responses when students faced difficulties in answering a question, for example, "You may say, 'one beer please' to order a beer." Social presence can also be found in chatbots' use of greetings, expressions of agreement with students' ideas, asking questions, and vocatives. Greetings were usually used at the beginning of the conversation as an ice-breaking activity. The chatbot DD in the study of Lin and Chang (2020, p. 82) greeted students by saying "Hi, remember me? I worked with you for your thesis statement. It's me DD!" When chatbots expressed their agreement during a conversation, for example, "Good job! You're correct!", students perceived the chatbot as a patient, friendly, and non-judgmental partner (Ruan et al., 2019). In terms of asking other students questions in an online language learning environment, we found only one study (i.e., Tegos et al., 2014) that employed a chatbot as a group discussion coordinator asking students questions, which in turn helped increase students' actions of asking questions of each other. The chatbot in this study also referred to students by name, such as in the statement "Janna, what can help you to block out negative thoughts?" (p. 78) [Janna was the name of a student].

Embracing chatbots in language learning can be a way to encourage an open learning climate of interpersonal communication, which can help overcome students' nervousness about speaking the target language and promote their willingness to communicate (Ayedoun et al., 2015), help them better understand learning objectives and support them in collaborative learning (Tegos et al., 2014), and strengthen their sense of social presence within virtual language environments (Wang et al., 2017).
However, the immature development of chatbot technology can also lessen social presence in students' language learning. Fryer et al. (2017) criticized chatbots for their inability to maintain students' language learning interest; students' interest in speaking tasks with chatbot partners dropped after the first task compared with speaking to a human partner. Similarly, the participants in Hsu's (2020) study who interacted face-to-face with human interlocutors perceived higher socialization than those who conversed with chatbot partners in a second language context.

3.5 | What are the challenges, if any, of using chatbots in language learning?

Despite the aforementioned technological, pedagogical, and social affordances, researchers have noted substantial existing challenges in implementing chatbots in language learning. Three categories of challenges were summarized, namely chatbots' technological limitations, novelty effects, and students' cognitive load limitations.

3.5.1 | Technological limitations of chatbots

Although chatbots can contribute to students' language learning, the limitations of their technological capability cannot be overlooked. The most frequently reported technological challenge was the perceived unnaturalness of the computer-generated voice, which students contrasted with human voices (Goda et al., 2014; Tegos et al., 2014). Failed communication was also found to happen when students entered incomplete sentences (Yin & Satar, 2020) or when chatbots responded with nonsense outputs (Fryer et al., 2019). Along with the design of the chatbot interface, the lack of emotion and visible cues from chatbots during interactions was found to diminish students' positive affective states (e.g., interest) in relation to language learning (Gallacher et al., 2018). Chatbots with limited artificial intelligence could not decipher students' inputs that were out of their range. For example, students' ideas headed in unpredictable directions as conversations developed, and new topics introduced by students could not be recognized by the chatbot system (Yang & Zapata-Rivera, 2010). Due to chatbots' unnatural robotic voices and their inability to carry on long conversations, chatbots may end up isolating learners from the language learning environment.

3.5.2 | The novelty effect in language learning

A novelty effect arises when a new technology is introduced to students, which may increase students' motivation or learning performance due to the newness of the technology (Chen et al., 2016). Fryer et al. (2017) demonstrated this effect with chatbots in a 16-week experimental study, in which students' interest in speaking tasks declined after the first communication task with the chatbot. The students viewed the chatbot as a novelty rather than a lasting partner in daily language practice (Gallacher et al., 2018).

3.5.3 | Cognitive load aroused via chatbots

"Cognitive load" in this context refers to the additional attention or mental effort that students need to exert to perform a learning task during the learning process. More specifically, the instructional design of chatbot-supported activities determines how much mental effort students are required to expend. Given that humans have a limited capacity for cognitive processing, the cognitive load imposed on students influences their learning performance (Sweller, 1988). For example, chatbot-supported learning designs with complex elements (e.g., voice and animation) can make it harder for students to allocate attention and process task information. Kim (2016) reported that students of medium and high language proficiency engaged in more interactions with a voice-based chatbot than students with low language proficiency, suggesting that the medium- and high-level students benefited more from the voice-based chatbot than the low-level students, who were burdened with a higher cognitive load when processing the auditory information. In such a situation, the use of chatbots may be a barrier to students' language learning. A higher extraneous cognitive load could diminish students' learning outcomes (Fryer et al., 2020).

4 | DISCUSSION

This systematic review set out to identify the usefulness of chatbots in language learning and their effects on students' learning outcomes. In this section, we first address the main findings of the five research questions regarding the current use of chatbots in language learning, and then propose several implications for future chatbot implementation based on our findings. Figure 5 shows all identified affordances within the usefulness framework. Finally, we suggest directions for further research on chatbot-supported language learning.

4.1 | The current stage of chatbots in language learning

The first research question identifies the current contexts where chatbots have been used in language education. Higher education has been the main setting where chatbots are used. The explanation for this may be found in the growth of online learning and computer-assisted learning in higher education. Chatbots embedded in either webpages or instant messaging applications provide higher education students convenient online access to language learning. The findings also encourage educators to use chatbots for open conversation (i.e., discussing any topic with students), mainly because chatbots' capability of keeping a conversation going (Grudin & Jacques, 2019) helps increase students' willingness to communicate in specific languages (Ayedoun et al., 2019).
F I G U R E 5 Identified technological,
pedagogical, and social affordances of
chatbots in language learning
this may be found in the growth of online learning and computer-assisted learning in higher education. Chatbots embedded in either webpages or instant messaging applications give higher education students convenient online access to language learning. The findings also encourage educators to use chatbots for open conversation (i.e., discussing any topic with students), mainly because chatbots' capability of keeping a conversation going (Grudin & Jacques, 2019) helps increase students' willingness to communicate in specific languages (Ayedoun et al., 2019). Additionally, educators can use task-focused chatbots to assist students in learning specific procedures, such as outlining a thesis (Lin & Chang, 2020).

4.2 | The technological affordances of chatbots in language learning

Evidence addressing the second research question indicates that chatbots can promote students' communication in target languages through three technological affordances: timeliness, personalization, and ease of use. Students can practise their language knowledge without the temporal and geographical limitations of having a human learning partner. Unlike human partners, a chatbot can provide immediate responses tirelessly (Brandtzaeg & Følstad, 2017). Students can receive personalized language learning materials based on their previous interactions with the chatbot, and chatbots embedded into webpages, tablets, or mobile instant messaging applications make student–chatbot interaction easier to use. Chatbots thus deliver the "fundamental benefit of using online technologies" in an educational environment (Bower, 2017) because they give students access to learning resources. These findings should encourage educators to use chatbots to fill the role of a tireless learning companion who can be contacted for domain-specific conversation on any accessible device.

interaction during their language learning process. This can be explained by Moore's (1989) three types of interaction in online learning, namely student–student, student–teacher, and student–content interaction. Educators can employ a chatbot as a knowledgeable friend accessible to students (i.e., student–student interaction), as a virtual tutor providing guidance and recommendations (i.e., student–teacher interaction), and as a way to deliver language learning content (i.e., student–content interaction) in a simulated language learning scenario.

However, our understanding of the effects of chatbots on student outcomes is still limited. In this review, only eight eligible studies specifically examined the possible effects of chatbots on students' behavioural and cognitive outcomes, and two of these studies failed to establish whether there was an initial difference between the control and experimental groups (see Table 5). The equivalence of students and teachers before the intervention should be considered in future research design and data analysis. In addition, the measurement of behavioural engagement in previous studies has so far focused mainly on counting the number of student–chatbot interactions. Counting interactions alone does not indicate the quality of students' language learning; interaction oriented towards students' cognitive outcomes should be characterized more by the qualitative nature of the interaction and less by quantitative measures (Garrison & Cleveland-Innes, 2005).
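A minimal sketch of the kind of baseline check recommended above is given below. The pre-test scores are synthetic and the independent-samples t-test is only one of several ways to verify that a control group and a chatbot group did not already differ before the intervention; nothing in the sketch is drawn from the reviewed studies.

```python
from scipy import stats

# Hypothetical pre-test scores (e.g., a vocabulary test) collected before a
# chatbot intervention. The numbers are synthetic and purely illustrative.
control_pretest = [62, 58, 71, 65, 60, 68, 55, 63]
chatbot_pretest = [61, 66, 59, 70, 64, 57, 62, 69]

# One simple check of baseline equivalence: an independent-samples t-test on
# the pre-test scores of the control and experimental (chatbot) groups.
t_stat, p_value = stats.ttest_ind(control_pretest, chatbot_pretest)

if p_value < 0.05:
    print(f"Baseline difference detected (t = {t_stat:.2f}, p = {p_value:.3f}); "
          "post-test gains cannot be attributed to the chatbot alone.")
else:
    print(f"No significant baseline difference (t = {t_stat:.2f}, p = {p_value:.3f}).")
```

Reporting such a check alongside post-test results makes it easier to attribute any observed gains to the intervention rather than to pre-existing group differences.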
Yamashita, Huang, & Fu, 2020). Educators can assign the chatbot the role of a social member who communicates with students in the target language in a warm and friendly way. The study of De Gennaro, Krumhuber, and Lucas (2020) suggests that using an empathic chatbot mitigates students' social exclusion, which can help develop an open and interpersonal communication atmosphere within the language learning environment.

4.5 | Current challenges of using chatbots in language learning and potential solutions

The fifth research question reveals three challenges of using chatbots in language education: the technological limitations of chatbots, novelty effects, and cognitive load during students' learning process. Studies reported technological limitations such as narrow database support and unnatural robotic voices. With regard to tackling the current technological challenges, teachers can take a leadership role in determining how chatbots can best be used to help achieve learning outcomes. Teachers can decide how best to use chatbots in their current state of technological development, thereby mitigating their limitations. For example, Fryer and Carpenter (2006) argued that using chatbots is more appropriate for advanced language learners than for beginners, because inaccurate word input cannot be analysed by the system and may give students disappointing responses. In contrast, De Gasperis and Florio (2012) successfully used a restricted chatbot to correct learners' spelling errors, demonstrating that it is possible to transform a technical limitation into a benefit. Similarly, teachers can exploit the restrictedness of chatbot conversations to check beginners' factual knowledge, such as vocabulary memorization; chatbots with narrow functionalities or a limited database may be more acceptable to students for learning factual knowledge than for conceptual knowledge (Huang et al., 2019). As for advanced language learners, teachers can set rules of communication with chatbots to help students understand the chatbots' capabilities and limitations. For example, if help is needed during a conversation, students can be asked to type a particular word or symbol (e.g., "helpline") to activate the helpline function without interrupting the ongoing interaction. For the future implementation of chatbots in language learning, educators should take the current stage of technological development into account.

To mitigate the novelty effect, delivering a workshop prior to the first lesson can prepare students by giving them prior experience of chatbot-integrated learning. As suggested by Fryer et al. (2020), students' cognitive processing (i.e., how students process new information in an environment) can be enhanced by applying Mayer's (2017) principles of multimedia learning. For instance, students' language learning can be more efficient if the texts in chatbots are presented in the form of a conversation and humanlike gestures are employed in the agent. Additionally, the integration of quick buttons can make chatbots easier to use and allow students to choose learning resources with a single click, which helps enhance the interactivity between the chatbot and students and further engages students (Sundar, 2012).
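The two suggestions above, a reserved help keyword and quick-reply buttons, can be made concrete with a small, framework-agnostic sketch. The handler below is a hypothetical illustration: the keyword "helpline", the function names, and the button labels are assumptions made for the example, not features of any chatbot platform discussed in the reviewed studies.

```python
HELP_KEYWORD = "helpline"  # reserved word students type to ask for help (illustrative)

# Quick-reply buttons offered with every bot message, so learners can choose a
# learning resource with one click instead of typing free text.
QUICK_REPLIES = ["Vocabulary drill", "Grammar tips", "Ask my teacher"]

def build_reply(text, quick_replies=QUICK_REPLIES):
    """Package a bot message together with clickable quick-reply options."""
    return {"text": text, "quick_replies": list(quick_replies)}

def handle_student_message(message, dialogue_engine):
    """Intercept the reserved help keyword before normal dialogue processing,
    so that asking for help does not interrupt the ongoing conversation."""
    if message.strip().lower() == HELP_KEYWORD:
        return build_reply(
            "Help: type a sentence to practise, or tap a button below. "
            "Your current task is still open; just keep typing to continue."
        )
    # Otherwise hand the message to the ordinary dialogue logic.
    return build_reply(dialogue_engine(message))

# Example usage with a stand-in dialogue engine.
echo_engine = lambda msg: f"(normal dialogue response to: {msg!r})"
print(handle_student_message("How do I say 'library' in French?", echo_engine))
print(handle_student_message("helpline", echo_engine))
```

Because the help response is handled outside the main dialogue flow, the learner's current task state is left untouched, which is the point of reserving a keyword for this purpose.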
4.6 | Suggestions for future research

First, the era of mass adoption of chatbots in language education has not yet arrived. Currently, we do not have enough empirical evidence of whether using chatbots is beneficial for language learners across all age ranges. To further investigate the validity of using chatbots in language learning, research involving education levels other than university (e.g., primary and secondary school) ought to be conducted in the future to fill the gaps in our knowledge.

Second, no studies in the current review were undertaken longitudinally. The intervention times in the studies examined lasted from several minutes to one semester, which could have led to a novelty effect. To evaluate the long-term effects of chatbots on students' language learning, studies lasting two or more semesters should be conducted to see whether students' interest in, or motivation for, interacting with chatbots changes over time. Researchers are encouraged to measure both students' behavioural and cognitive outcomes. Other variables, such as students' technology literacy and learning adaptability, the design of the chatbot interface, and different target languages, can be evaluated in future empirical research.

Third, the majority of previous studies on chatbot-supported language learning relied on self-reported questionnaires, which insufficiently capture the potential effects of using chatbots on students' language achievement. As Fryer et al. (2020) suggested, observed variables such as students' achievement and classroom observations should be included in future research to validate the integration of educational chatbots in language learning. Researchers are advised to include objective measurements in future studies.

Fourth, the majority of previous experimental research on chatbot-supported language learning focused on the differing effects of chatbots and human partners. Few studies have evaluated the use of chatbots against other equivalent tools (e.g., students in a control group searching for information online; Goda et al., 2014). Additionally, given that the current use of chatbots is tied closely to online learning, computer-assisted learning, and mobile learning, these learning conditions also involve the presence of other technologies (e.g., 3D learning platforms; Wang et al., 2017). Therefore, it is vital to consider whether any increase in students' performance and engagement arises from the chatbot alone or from a combination of tools. Researchers are encouraged to follow up along these lines to provide educators with empirically based evidence for making appropriate use of chatbots in the future.

Finally, few studies reported on teachers' perceptions surrounding the use of chatbots in language teaching activities. Teachers may be sidelined as chatbot designers due to the extra effort required to create specific chatbots for target learners (Nghi et al., 2019). It is challenging to satisfy all students' learning expectations with just one type of chatbot because different students may want to talk about different topics (Fryer & Carpenter, 2006). Future studies could explore teachers' perceptions of chatbot implementation in language learning.

possible solutions to address them. (d) Finally, this review proposes several directions for future research that can advance our understanding of chatbot use in language learning.
Fryer, L., & Carpenter, R. (2006). Bots as language learning tools. Language Learning & Technology, 10(3), 8–14.
Fryer, L. K., Ainley, M., Thompson, A., Gibson, A., & Sherlock, Z. (2017). Stimulating and sustaining interest in a language course: An experimental comparison of Chatbot and Human task partners. Computers in Human Behavior, 75, 461–468.
Fryer, L. K., Coniam, D., Carpenter, R., & Lăpuşneanu, D. (2020). Bots for language learning now: Current and future directions. Language Learning & Technology, 24(2), 8–22. Retrieved from http://hdl.handle.net/10125/44719
Fryer, L. K., Nakao, K., & Thompson, A. (2019). Chatbot learning partners: Connecting learning experiences, interest and competence. Computers in Human Behavior, 93, 279–289.
Gallacher, A., Thompson, A., & Howarth, M. (2018). “My robot is an idiot!”–Students' perceptions of AI in the L2 classroom. In P. Taalas, J. Jalkanen, L. Bradley, & S. Thouësny (Eds.), Future-Proof CALL: Language Learning as Exploration and Encounters: Short Papers from EUROCALL (pp. 70–76). Research-publishing.net.
Garrison, D. R. (2011). E-learning in the 21st Century: A Framework for Research and Practice. Taylor & Francis.
Garrison, D. R., Anderson, T., & Archer, W. (1999). Critical inquiry in a text-based environment: Computer conferencing in higher education. The Internet and Higher Education, 2(2–3), 87–105.
Garrison, D. R., & Cleveland-Innes, M. (2005). Facilitating cognitive presence in online learning: Interaction is not enough. The American Journal of Distance Education, 19(3), 133–148.
Gibson, J. J. (1977). The theory of affordances. In R. Shaw & J. Bransford (Eds.), Perceiving, Acting, and Knowing: Toward an Ecological Psychology (pp. 67–82). Erlbaum.
Goda, Y., Yamada, M., Matsukawa, H., Hata, K., & Yasunami, S. (2014). Conversation with a chatbot before an online EFL group discussion and the effects on critical thinking. The Journal of Information and Systems in Education, 13(1), 1–7.
Grudin, J., & Jacques, R. (2019). Chatbots, humbots, and the quest for artificial general intelligence. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1–11). ACM.
Haristiani, N. (2019). Artificial intelligence (AI) chatbot as language learning medium: An inquiry. Journal of Physics: Conference Series, 1387(1), 012020.
Hsu, L. (2020). To CALL or not to CALL: Empirical evidence from neuroscience. Computer Assisted Language Learning, 1–24.
Jia, J., Chen, Y., Ding, Z., & Ruan, M. (2012). Effects of a vocabulary acquisition and assessment system on students' performance in a blended learning class for English subject. Computers & Education, 58(1), 63–76.
Jia, J., & Ruan, M. (2008). Use chatbot CSIEC to facilitate the individual learning in English instruction: A case study. In Proceedings of the International Conference on Intelligent Tutoring Systems (pp. 706–708). Springer.
Kim, N.-Y. (2016). Effects of voice chat on EFL learners' speaking ability according to proficiency levels. Multimedia-Assisted Language Learning, 19(4), 63–88.
Kim, N.-Y. (2018a). Chatbots and Korean EFL students' English vocabulary learning. Journal of Digital Convergence, 16(2), 1–7.
Kim, N.-Y. (2018b). A study on chatbots for developing Korean college students' English listening and reading skills. Journal of Digital Convergence, 16(8), 19–26.
Kim, N.-Y., Cha, Y., & Kim, H.-S. (2019). Future English learning: Chatbots and artificial intelligence. Multimedia-Assisted Language Learning, 22(3), 32–53.
Kirschner, P., Strijbos, J.-W., Kreijns, K., & Beers, P. J. (2004). Designing electronic collaborative learning environments. Educational Technology Research and Development, 52(3), 47–66.
Korpershoek, H., Harms, T., de Boer, H., van Kuijk, M., & Doolaard, S. (2016). A meta-analysis of the effects of classroom management strategies and classroom management programs on students' academic, behavioral, emotional, and motivational outcomes. Review of Educational Research, 86(3), 643–680.
Lee, Y.-C., Yamashita, N., Huang, Y., & Fu, W. (2020). "I hear you, I feel you": Encouraging deep self-disclosure through a chatbot. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1–12). ACM.
Lefebvre, C., Glanville, J., Briscoe, S., Littlewood, A., Marshall, C., Metzendorf, M.-I., Noel-Storr, A., Rader, T., Shokraneh, F., Thomas, J., & Wieland, L. S. (2021). Chapter 4: Searching for and selecting studies. In J. P. T. Higgins, J. Thomas, J. Chandler, M. Cumpston, T. Li, M. J. Page, & V. A. Welch (Eds.), Cochrane Handbook for Systematic Reviews of Interventions (Version 6.2). Cochrane. www.training.cochrane.org/handbook
Lin, M. P.-C., & Chang, D. (2020). Enhancing post-secondary writers' writing skills with a chatbot. Journal of Educational Technology & Society, 23(1), 78–92.
Lu, C. H., Chiou, G. F., Day, M. Y., Ong, C. S., & Hsu, W. L. (2006). Using instant messaging to provide an intelligent learning environment. In Proceedings of the International Conference on Intelligent Tutoring Systems (pp. 575–583). Springer.
Mayer, R. E. (2017). Using multimedia for e-learning. Journal of Computer Assisted Learning, 33(5), 403–423.
Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & The PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Medicine, 6(7), e1000097.
Moore, M. G. (1989). Three types of interaction. American Journal of Distance Education, 3(2), 1–7.
Moore, M. G. (1993). Theory of transactional distance. In D. Keegan (Ed.), Theoretical Principles of Distance Education. Routledge.
Ruan, S., Willis, A., Xu, Q., Davis, G. M., Jiang, L., Brunskill, E., & Landay, J. A. (2019). Bookbuddy: Turning digital materials into interactive foreign language lessons through a voice chatbot. In Proceedings of the Sixth (2019) ACM Conference on Learning @ Scale (pp. 1–4). ACM.
Schmulian, A., & Coetzee, S. A. (2019). The development of messenger bots for teaching and learning and accounting students' experience of the use thereof. British Journal of Educational Technology, 50(5), 2751–2777.
Smutny, P., & Schreiberova, P. (2020). Chatbots for learning: A review of educational chatbots for the Facebook messenger. Computers & Education, 151, 103862.
Sulosaari, V., Suhonen, R., & Leino-Kilpi, H. (2011). An integrative review of the literature on registered nurses' medication competence. Journal of Clinical Nursing, 20(3–4), 464–478.
Sundar, S. S. (2012). Social psychology of interactivity in human-website interaction. In Oxford Handbook of Internet Psychology. Oxford University Press.
Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257–285.
Tegos, S., Demetriadis, S., & Karakostas, A. (2015). Promoting academically productive talk with conversational agent interventions in collaborative learning settings. Computers & Education, 87, 309–325.
Tegos, S., Demetriadis, S., & Tsiatsos, T. (2014). A configurable conversational agent to trigger students' productive dialogue: A pilot study in the CALL domain. International Journal of Artificial Intelligence in Education, 24(1), 62–91.
Tu, C. H. (2000). On-line learning migration: From social learning theory to social presence theory in a CMC environment. Journal of Network and Computer Applications, 23, 27–37. https://doi.org/10.1006/jnca.1999.0099
Wang, Y. F., Petrina, S., & Feng, F. (2017). VILLAGE—Virtual immersive language learning and gaming environment: Immersion and presence. British Journal of Educational Technology, 48(2), 431–450. https://doi.org/10.1111/bjet.12388
Weizenbaum, J. (1966). ELIZA—A computer program for the study of natural language communication between man and machine. Communications of the ACM, 9(1), 36–45.
Winkler, R., & Soellner, M. (2018). Unleashing the potential of chatbots in education: A state-of-the-art analysis. In Academy of Management Annual Meeting Proceedings. Academy of Management.
Xu, Y., Wang, D., Collins, P., Lee, H., & Warschauer, M. (2021). Same benefits, different communication patterns: Comparing children's reading with a conversational agent vs. a human partner. Computers & Education, 161, 104059.
Xu, Y., & Warschauer, M. (2020). Exploring young children's engagement in joint reading with a conversational agent. In Proceedings of the Interaction Design and Children Conference (pp. 216–228). ACM.
Yang, H.-C., & Zapata-Rivera, D. (2010). Interlanguage pragmatics with a pedagogical agent: The request game. Computer Assisted Language Learning, 23(5), 395–412.
Yin, Q., & Satar, M. (2020). English as a foreign language learner interactions with chatbots: Negotiation for meaning. International Online Journal of Education and Teaching, 7(2), 390–410.

How to cite this article: Huang, W., Hew, K. F., & Fryer, L. K. (2022). Chatbots for language learning—Are they really useful? A systematic review of chatbot-supported language learning. Journal of Computer Assisted Learning, 38(1), 237–257. https://doi.org/10.1111/jcal.12610