
Journal Pre-proof

Comparing L2 Learners' Writing Against Parallel Machine-Translated Texts: Raters' Assessment, Linguistic Complexity and Errors

Yuah V. Chon, Dongkwang Shin, Go Eun Kim

PII: S0346-251X(20)30768-5
DOI: https://doi.org/10.1016/j.system.2020.102408
Reference: SYS 102408

To appear in: System

Received Date: 7 February 2020


Revised Date: 9 September 2020
Accepted Date: 2 November 2020

Please cite this article as: Chon, Y.V., Shin, D., Kim, G.E., Comparing L2 Learners' Writing Against Parallel Machine-Translated Texts: Raters' Assessment, Linguistic Complexity and Errors, System, https://doi.org/10.1016/j.system.2020.102408.

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition
of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of
record. This version will undergo additional copyediting, typesetting and review before it is published
in its final form, but we are providing this version to give early visibility of the article. Please note that,
during the production process, errors may be discovered which could affect the content, and all legal
disclaimers that apply to the journal pertain.

© 2020 Published by Elsevier Ltd.


Author Statement

Yuah V. Chon: Investigation; Formal analysis; Software; Writing - original draft; Writing - review & editing; Visualization

Dongkwang Shin: Conceptualization; Methodology; Investigation; Formal analysis; Software; Writing - original draft; Visualization

Go Eun Kim: Formal analysis; Visualization

Comparing L2 Learners' Writing Against Parallel Machine-Translated
Texts: Raters' Assessment, Linguistic Complexity and Errors

(Previously "The Use of Machine Translation for L2 Writing: A Nuisance or a Solution?")

First author:

Name: Yuah V. Chon
Department or School: Department of English Education
University: Hanyang University
Address: 222 Wangshimli-Ro, Seongdong-Gu
City and postcode: Seoul 04763
Country: South Korea
Email: [email protected]

Corresponding author:

Name: Dongkwang Shin
Department or School: Department of English Education
University: Gwangju National University of Education
Address: 55 Pilmundae-Ro, Buk-Gu
City and postcode: Gwangju 61204
Country: South Korea
Email address: [email protected]

Co-Author:

Name: Go Eun Kim
Department or School: Graduate School of Education
University: Hanyang University
Address: 222 Wangshimli-Ro, Seongdong-Gu
City and postcode: Seoul 04763
Country: South Korea
Email address: [email protected]
Comparing L2 Learners’ Writing Against Parallel Machine-Translated Texts:
Raters’ Assessment, Linguistic Complexity, and Errors

Abstract

Recent developments in machine translation (MT), such as Google Translate, may help second language (L2) writers produce texts in the target language according to their intended meaning. The aim of the present study was to examine the role of MT in L2 writing. For this purpose, 66 Korean English as a foreign language (EFL) university learners produced compositions in which writing tasks were counterbalanced across three writing modes (i.e., Direct Writing, Self-Translated Writing, and Machine-Translated Writing). The learners' writing products were first graded by independent markers and later submitted to computerized text analyses using BNC-COCA 25000, Coh-Metrix, and SynLex to assess linguistic complexity. The texts were also analyzed for types of errors. The results indicate that MT narrowed the difference in writing ability between the skilled and less skilled learners, facilitated learner use of lower-frequency words, and produced syntactically more complex sentences. Error analysis showed a reduction in the number of grammatical errors when MT was used to aid L2 writing. However, machine-translated compositions contained more mistranslations and a greater number of poor word choices. The results offer pedagogical implications for using MT in L2 writing.

Keywords: Machine translation, Google Translate, second language writing, lexical complexity, syntactic complexity, Coh-Metrix, SynLex, error analysis

1. Introduction

In academic contexts where second language (L2) writing is involved, machine translation (MT) through PCs, mobile phones, and the web is a widely used source of reference during writing. Despite its widespread use in the era of artificial intelligence (AI), MT is usually not recommended by teachers and researchers as the primary source of reference to help learners solve the language problems they face, since the ability of MT to accurately translate a writer's intended message is still in question (Groves & Mundt, 2015). One reason is that MT has not yet been developed sufficiently to convey writers' intended meaning accurately. Indeed, MT itself can be a potential source of error if learners fail to notice MT-produced errors and self-correct them. However, recent advancements in MT technology (Le & Schuster, 2016) with the introduction of "Google's neural machine translation" (GNMT) have greatly improved the quality of MT (Jia, Carl, & Wang, 2019). In November 2016, Google switched from a phrase-based approach to neural machine translation (NMT), an approach that uses AI to learn from millions of examples. While phrase-based machine translation (PBMT) breaks an input sentence into words and phrases to be translated largely independently, NMT treats the entire input sentence as the unit of translation. This development has greatly improved the quality of Google Translate. Despite this improvement, GNMT can still make significant errors that a human translator never would, for instance, translating sentences in isolation rather than considering the context of a paragraph (Le & Schuster, 2016).

To investigate the impact of MT on L2 writing, we compared learners' use of MT with their usual writing practices: writing directly in the L2 (i.e., direct writing) or writing in the first language (L1) and translating into the L2 (i.e., self-translated writing). To understand the characteristics of MT-assisted writing compared with the other two modes of L2 writing, independent raters assessed the writing products. Computational analyses were subsequently conducted to examine the linguistic characteristics of the texts. To assess the accuracy of the learners' writing products, a manual error analysis was used.

1.1 Background

Early MT studies were conducted primarily within the context of translation studies and often viewed MT as a source of errors (Belam, 2003; Kliffer, 2005; Niño, 2008), specifically lexico-grammatical errors that must be corrected through post-editing. However, with the widespread use of MT for writing in a foreign or second language, recent studies have highlighted its significance in L2 writing.

Some researchers asked participants to identify and rectify errors in MT products (Groves & Mundt, 2015; Kol, Schcolnik, & Spector-Cohen, 2018; Tsai, 2019), while others analyzed the error types that appeared due to MT (Lee, 2019; Niño, 2008; Wallwork, 2016). Groves and Mundt (2015) asked students to submit an essay in their L1, which was subsequently translated into English through a web-based translation engine. The texts were full of errors, but the renditions reached a level of accuracy close to the minimum needed for university admission. Wallwork (2016) analyzed the types of mistakes Google Translate makes when translating from Italian to English and found that they involved word order, omitting the plural -s on acronyms, making uncountable nouns countable, and misusing tenses. Kol et al. (2018) included an awareness task to assess students' awareness of Google Translate mistakes and a correction task to evaluate their ability to rectify the identified mistakes. The awareness and correction tasks showed that intermediate students identified 54% of the mistakes, while advanced students identified 73% and corrected 87% of the identified mistakes. Tsai (2019) asked English as a foreign language (EFL) learners to write first in Chinese, then draft the corresponding text in English, translate the Chinese into English using Google Translate, and finally compare their self-written English texts with the machine-translated English texts. When both English drafts were analyzed, the machine-translated texts had more words, fewer mistakes in spelling and grammar, and fewer errors per word. Lee (2019) studied the impact of MT on EFL students' writing by asking them to translate Korean into English without the help of MT and later correct their English writing using MT. The students' writing outcomes revealed that MT was more beneficial for lower-level learners, who made fewer lexico-grammatical errors and produced improved revisions.

Products of MT writing have also been examined for quality by having them graded by independent markers (Stapleton & Kin, 2019; Van Rensburg, Snyman, & Lotz, 2012). Van Rensburg et al. (2012) had writing products assessed by raters to compare the quality of translations created by 1) Google Translate, 2) a translation student, and 3) a professional translator. There was a significant difference in quality between the translation products of Google Translate (M = 33.8), the translation student (M = 72.2), and the professional translator (M = 96.6). The researchers claimed that the MT products could be made useful, but post-editing would be required to make them intelligible and meet the functional requirements of the text. Stapleton and Kin (2019) asked L2 primary students to write in English for the first task. In another task, the students wrote in their native Chinese to the same prompt and subsequently translated the text into English using MT. When teachers were asked to grade the grammar, vocabulary, and comprehensibility of the two sets of parallel essays, they considered MT writing to be significantly better than non-MT writing in grammar. Although the difference in vocabulary was not significant between MT and non-MT writing, MT writing appeared equally comprehensible to teachers.

The increased use of MT for writing has led teachers/researchers (Correa, 2014; Ducar & Schocket, 2018; Mundt & Groves, 2016; Stapleton & Kin, 2019) and learners (Lee, 2019; Tsai, 2019) to reevaluate this recent reference tool. Correa (2014) proposed that MT should be utilized to discourage or minimize academic dishonesty and to raise metalinguistic awareness by urging learners to view writing as a process and not just an end product. Ducar and Schocket (2018) argued that MT is not a type of technology that instructors can prevent learners from using, and that it should be treated as a 21st-century skill that helps learners make positive progress toward greater proficiency and the ethical use of technologies. Mundt and Groves (2016) claimed that the advancement of web-based MT in providing grammatically accurate translations may be misinterpreted as a remedy for writers' lack of language proficiency. Lee (2019) also found, through learner interviews and reflection papers, that using MT for revisions positively affected learners' use of writing strategies while helping them to think of writing as a process.

Considered together, despite the flaws and complications noted by learners and researchers regarding the use of MT, this technology is a subject of interest both to the average person who seeks the translation of a given text and to L2 writers who write primarily for academic purposes (Groves & Mundt, 2015; Lee, 2019; Tsai, 2019). Moreover, previous studies (Lee, 2019; Stapleton & Kin, 2019; Tsai, 2019) on MT-translated texts can be criticized for study designs that required students to respond to the same writing topic twice. That is, when the student-translated L2 texts were produced before the same L1 text was submitted for MT, the texts rendered by MT could have been subject to a practice effect. Wallwork (2016) also suggested that, with the evolution of MT, teaching future students how to spot mistakes and revise already written texts could be as important as teaching them how to generate a text from scratch. The idea of allowing L2 learners to use MT in classrooms may appall in-service L2 teachers (Stapleton & Kin, 2019), but this appears to be the direction in which we are inevitably heading. Rather than condemning this practice, the present research investigates the potential benefits of teaching the use of MT for language learning. That is, the practice may not yet be widespread, but the trend appears to be on the rise.
1.2 Research Questions

Considering the recent improvements in MT and the lack of studies on the pedagogical implications of these improvements, the time seems propitious to explore the quality and nature of MT and to consider its implications for language learning. Therefore, an empirical study was designed to discover whether MT had reached a level of quality at which students could use their L1 (in this case, Korean) to write a passage and subsequently use MT to translate it into English. The MT texts were rated for the learners' level of writing ability and analyzed on measures of lexical diversity, syntactic complexity, and error type. The following research questions guided our study:

1. Do raters differ in their assessment of EFL learners' compositions on three writing tasks: direct writing (DW), self-translated writing (TW), and machine-translated writing (MW)?

2. Does writing mode (DW, TW, MW) have an effect on the linguistic complexity (lexical diversity, syntactic complexity) of L2 texts?

3. What types of lexical and grammatical errors can be identified in the three modes of writing?
2. Method

2.1 Participants

Seventy L2 speakers of English participated in this study, but four participants were excluded (two essays were not written as directed, one writing sample could not be read due to mishandling of the electronic word-processing document, and one participant was a native speaker of English). All the remaining participants were native speakers of Korean in South Korea (hereafter, Korea or Korean). The participants, who were majoring in English Language Teaching in Colleges of Education, were recruited from universities in two cities, Gwangju and Seoul. Both groups of learners were participating in classes focused on practicing L2 writing skills. The students ranged from sophomores to juniors between 20 and 28 years of age. Their level of proficiency was high-intermediate (TOEFL iBT: 80 to 100) according to international standards. Within the local context, most of the students had scored at the highest stanine level of "one" (the top 4% of the examinee population) on the nine-point standard scale used for the Korean university entrance exam (Korean College Scholastic Ability Test). Any possible difference in their writing proficiency was captured by a diagnostic writing test (detailed later).

All the learners had been learning EFL since grade three. However, most of them lacked experience in productive skills (i.e., speaking and writing), since their English learning had centered on instruction in reading and grammar. This reflects a socio-educational milieu in which teaching focuses primarily on the grammar-translation method and standardized tests. Nonetheless, the learners' knowledge of English was expected to be sufficient for them to recognize major errors arising either from their own writing or from MT and to revise them adequately.

2.2 Instruments

2.3 Writing Tasks

Participants wrote short English compositions of around 300 words based on anecdotal pictures. Three different picture description tasks (Park, Min, & Kim, 2014) were provided in varying sequences (see Appendix A for the writing prompts). For direct writing, students composed in English. For the self-translated writing task, students first wrote in Korean and subsequently translated the text into English (L2). For the machine-translated writing task, students wrote in Korean (L1), submitted their L1 text to Google Translate, and revised the machine-generated translation. To offset the possibility of an order effect (practice effect), the researchers counterbalanced the three writing topics of the picture description tasks with the three writing modes, as illustrated in Table 1.

A number of considerations informed the design of the writing tasks. Three prompts were necessary since using the same writing topic in the second or third writing condition may have led learners to write better because they knew what to do. On the other hand, their performance might have deteriorated in the second or third condition due to tiredness (i.e., a fatigue effect). The three picture description tasks were designed to prompt the learners to connect the situations "to young people's life as much as possible to aim for acceptable levels of ecological validity" (Schoonen, van Gelderen, Stoel, Hulstijn, & de Glopper, 2011, p. 42). Nevertheless, it can be argued that the learners' familiarity with the assigned topics differed, which could have affected their communication and expression in both their L1 and English.

To offset the possible influence of different writing topics and analyze writing mode differences, a repeated measures design was used: an experimental design in which the same participants take part in each condition of the independent variable (i.e., writing mode) and in which the ordering of tasks differs across participants (Bulté & Housen, 2012; Mackey & Gass, 2015; Polio & Friedman, 2017; Van Weijen, Van den Bergh, Rijlaarsdam, & Sanders, 2009). That is, to analyze writing mode differences, the researchers counterbalanced the order of the conditions by alternating the order in which participants attempted the writing topics and writing modes. In the end, after all the writers had completed the three writing topics A, B, and C, summing the writing products by writing mode allowed the researchers to cancel out the effect of writing topic (and the effects incurred by differences in contextual familiarity, L1 knowledge, and English) and to compare the products only for writing mode differences.
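A counterbalanced design of this kind can be sketched as a 3x3 Latin square in which each of three participant groups meets every topic and every mode exactly once. The sketch below is a minimal illustration of the general technique, not a reproduction of the study's actual Table 1 assignment:

```python
# Counterbalancing three topics (A, B, C) across three writing modes
# (DW, TW, MW): each group sees every topic and every mode exactly once,
# and each topic-mode pairing occurs exactly once overall.
TOPICS = ["A", "B", "C"]
MODES = ["DW", "TW", "MW"]

def latin_square(items):
    """Build a Latin square by rotating the item list one step per row."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]

def assignment():
    """Map each group to its (mode, topic) pairs, one pair per session."""
    square = latin_square(TOPICS)
    return {
        f"Group {g + 1}": list(zip(MODES, square[g]))
        for g in range(len(square))
    }

plans = assignment()
# e.g. plans["Group 1"] pairs DW/TW/MW with topics A/B/C in rotated order
```

Summed over the groups, each mode is then paired with every topic once, which is why topic effects cancel when products are aggregated by writing mode.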

2.4 Diagnostic Essay Prompt

To understand how the writing products of the three writing modes would be graded by independent raters, the learners' writing proficiency was examined for its effect on the three modes of writing. To assess the learners' writing proficiency, an expository essay was used for diagnostic purposes after verifying the appropriateness of using a genre different from those assigned to the students in the experimental tasks (i.e., narrative). A Pearson correlation between the scores of the diagnostic writing and DW was significant and positive (r = .336, p < .01), which supported the legitimacy of using the students' diagnostic essay scores to divide the learners into different writing ability groups. Jeong's (2017) study with Korean EFL university learners also validated the use of expository essays for assessing L2 learners' writing proficiency. While investigating how narrative and expository genres affect students' writing performance, Jeong (2017) found no genre effect on individual student essay scores. This finding aligns with previous studies (Hoetker & Brossell, 1989; O'Loughlin & Wigglesworth, 2007) that found no score differences due to genre. As such, the use of the expository genre was considered valid for assessing the learners' writing proficiency. An expository essay would also be more sensitive in exploring the learners' writing ability, since it demands that writers think at higher levels of cognitive ability, evaluate evidence, expound on an idea, and set forth an argument. All the participants were asked to write a diagnostic essay in the first week of the semester. See Appendix B for the writing prompt.

2.5 Data Collection



Before the writing tasks were conducted, a training session was held. The learners were introduced to MT, to its advantages and limitations, and to the types of errors that they could expect from this technology. As part of the training, the learners also received hands-on practice by machine translating an anecdote to note the features of MT.

The three writing tasks were administered during the regular spring semester in 2019 with learners in intact classes in Gwangju and Seoul. Participants were directed to include paragraph structure in their writing. Learners were permitted to use only the word processor for planning, outlining, and drafting. For DW and TW, the learners were not allowed to use dictionaries. For TW and MW, learners were asked to save their drafts written in Korean (L1) and to submit them later with their final L2 essays. For MW, the learners were also asked to copy the translated output from the MT and revise it in a separate paragraph to show how the revisions had been made.

The learners wrote on each prompt on three separate days in timed 50-minute sessions. In one sitting for TW, the learners first wrote in L1 (Korean) (20 minutes) and later translated the essay into English (30 minutes). For MW, the learners first wrote in L1 (20 minutes), then translated with the machine translator and revised the L2 essay (30 minutes). In the DW mode, the learners wrote on the topic exclusively in English (30 minutes), with announced time for revision (20 minutes). The rationale for providing revision time in the DW mode was to offset the advantage that may have accrued in the TW and MW modes. The learners' keyboarding skill while using the word processor was not expected to be a factor, since they were accustomed to using computers and word processors for their academic tasks.



2.6 Data Analysis



2.6.1 Assessment of Writing Products

An area of interest of the study was to examine how raters assessed the learners' writing products for the three modes of writing. The two raters selected for this study were native speakers of English with more than eight years of experience rating a large pool of essays written by learners of English at one of the universities where the study was conducted. For assessing writing, our choice of an analytic scale was Jacobs et al.'s (1981) ESL Composition Profile, which rates five aspects of writing: content, organization, vocabulary, language use, and mechanics. The raters underwent a day of training to ensure that they reached an agreement on the descriptors of the analytic scale.

The raters assessed the learners' diagnostic essays and those produced by DW, TW, and MW, resulting in 264 essays (4 x 66 learners) for each rater to mark. The raters graded the essays independently without being aware of the three different modes of writing. The essays were scored on a scale of 100, and the scores were checked for inter-rater reliability. For essays with a difference of more than 10 points, the leading researcher advised the raters to reconsider their scores, and the raters negotiated until an agreement was reached. In the end, the inter-rater reliability (Pearson correlation) for the essays was significant (p < .01) and reliable: Diagnostic Essay (.908), DW (.870), TW (.849), and MW (.823).
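Inter-rater reliability of this kind is a plain Pearson product-moment correlation over the two raters' score vectors. A minimal pure-Python sketch (the score lists below are invented for illustration, not the study's data):

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical scores from two raters on the same five essays (scale of 100):
rater1 = [72, 85, 64, 90, 78]
rater2 = [70, 88, 60, 93, 75]
r = pearson_r(rater1, rater2)  # close to 1.0 when raters broadly agree
```

In practice one would use scipy.stats.pearsonr, which also returns the p-value; the hand-rolled version only makes the formula explicit.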



2.6.2 Writing Ability Groups



The scores from the diagnostic essays were analyzed to form ability groups for examining the effect of writing mode. Using the visual binning function in SPSS (Statistical Package for the Social Sciences) and the mean diagnostic test score (M = 73.3) as the cut point, we divided the participants into two ability groups: those "Below the mean" (n = 29) and those "Above the mean" (n = 37), between which a significant difference was found (p < .001) (see Table 3). The groups were labeled the "Less Skilled" and the "Skilled" learners, respectively.

2.6.3 Analysis of Written Texts

The written texts were analyzed for lexical diversity and syntactic complexity. The complexity indices were obtained using computer-based text analysis tools. Prior to submitting the texts for machine coding, they were corrected for misspellings and punctuation errors to ensure that the software functioned as intended.

Jarvis (2013) suggested that lexical diversity can be captured in terms of volume (i.e., text length), rarity (i.e., frequency of words in the language), and variability (i.e., type-token ratio corrected for text length). To measure volume, text length and sentence length were calculated with Coh-Metrix 3.0 (Graesser, McNamara, & Kulikowich, 2011). Rarity measures were calculated as the proportion of K3 and K4 words (i.e., the third and fourth 1,000 word families of English) using BNC-COCA 25,000 RANGE (https://www.wgtn.ac.nz/lals/about/staff/paul-nation). As another measure of rarity, the percentage of complex words was calculated with the Readability Test Tool (WebFX, 2012). According to Broda, Niton, Gruszczynski, and Ogrodniczuk (2014), a complex word is one with more than three syllables, and it is a component of the Gunning FOG readability index. Variability was assessed by the measure of textual lexical diversity (MTLD; McCarthy & Jarvis, 2010) via Coh-Metrix 3.0. MTLD was utilized rather than the type-token ratio (TTR), as TTR is usually affected by the length of the text sample, whereas MTLD can overcome the potential confound of text length by using sampling and estimation methods (Jarvis, 2013).
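MTLD works by reading through the token stream, counting how many stretches ("factors") it takes for the running type-token ratio to fall to a threshold (conventionally 0.72), and dividing the token count by the number of factors. The sketch below is a simplified one-directional version; real implementations, including the one reported by McCarthy and Jarvis (2010), average a forward and a reversed pass and normalize case and punctuation:

```python
def mtld_forward(tokens, threshold=0.72):
    """Simplified one-pass MTLD: mean tokens per TTR factor."""
    factors = 0.0
    types, count = set(), 0
    for tok in tokens:
        count += 1
        types.add(tok)
        if len(types) / count <= threshold:
            factors += 1              # a full factor is complete; start over
            types, count = set(), 0
    if count:                          # partial factor left at the end of text
        ttr = len(types) / count
        factors += (1 - ttr) / (1 - threshold)
    return len(tokens) / factors if factors else float(len(tokens))

# A maximally repetitive text yields a low MTLD; a fully varied one, a high MTLD:
low = mtld_forward(["cat"] * 100)
high = mtld_forward([f"w{i}" for i in range(100)])
```

The key property the text describes is visible here: unlike raw TTR, the factor count grows with text length, so the ratio stays comparable across texts of different sizes.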

Syntactic complexity was assessed in terms of four indices, drawing on the work of Bulté and Housen (2012) and Norris and Ortega (2009): overall complexity, subordination complexity, phrasal complexity, and syntactic sophistication. Following previous task complexity studies in L2 writing, the t-unit, defined as "one main clause plus whatever subordinate clauses happen to be attached to or embedded within it" (Hunt, 1970, p. 4), was adopted as the principal unit of analysis. Overall complexity was expressed as the ratio of words to t-units. Subordination complexity was operationalized as the proportion of clauses in relation to t-units. These two indices were calculated with the text analysis software SynLex (Lu, 2010). To measure phrasal complexity, the mean number of modifiers per noun phrase was calculated with Coh-Metrix 3.0. The level of syntactic sophistication was also assessed using Coh-Metrix 3.0. This measure estimates the extent to which syntactic structures are consistent in a text; that is, a lower syntactic structure similarity index indicates a larger selection of structures.
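Once words, clauses, and t-units have been counted for a text, the two SynLex-style indices reduce to simple ratios. A minimal sketch with a hypothetical annotated essay (the counts are invented):

```python
def overall_complexity(words, t_units):
    """Mean length of t-unit: words per t-unit."""
    return words / t_units

def subordination_complexity(clauses, t_units):
    """Clauses per t-unit: 1.0 means every t-unit is a single clause."""
    return clauses / t_units

# Hypothetical essay: 180 words segmented into 15 clauses and 10 t-units.
w_per_t = overall_complexity(180, 10)        # 18.0 words per t-unit
c_per_t = subordination_complexity(15, 10)   # 1.5 clauses per t-unit
```

The hard part in practice is the segmentation itself (identifying clause and t-unit boundaries), which SynLex automates; the ratios above are only the final arithmetic.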

2.6.4 Error Analysis of Written Texts



In the context of the current study, errors were defined as "lexical, morphological, or syntactic constructions that clearly deviate from the rules of standard written English—deviations about which most literate and proficient users of the language would agree" (Bitchener & Ferris, 2012, p. 146). The syntactic and lexical elements of L2 writing that impede meaning from being expressed accurately were examined. Ferris's (2011) analytical framework of error taxonomy (p. 102) was referenced for coding errors. Ferris's taxonomy includes errors of 1) word choice, 2) verb tense, 3) verb form, 4) word form, 5) subject-verb agreement, 6) articles, 7) noun ending, 8) pronouns, 9) run-on, 10) fragment, 11) punctuation, 12) spelling, 13) sentence structure, 14) informal, and 15) idiom. After coding the errors in the students' compositions, we derived a framework of error taxonomy consisting of 1) word choice, 2) verb tense, 3) verb form, 4) subject-verb agreement, 5) articles, 6) noun ending, 7) pronoun, 8) sentence structure, 9) mechanics, 10) mistranslations, and 11) prepositions. Because both run-ons and fragments relate to structuring sentences, the two types were identified together as "sentence structure" errors. A category for "mechanics" was added to the taxonomy to include punctuation and spelling errors. Owing to cross-linguistic influences, categories for "mistranslation" and "prepositions" were added because these errors were relatively frequent in the compositions.

To identify the types of errors, two raters were chosen. One was a Korean bilingual speaker of English (not directly involved in our study) with 10 years of experience in evaluating L2 writing, recruited to corroborate the appropriateness of the error taxonomy. The other was an author of this study. To mark the errors, the two raters independently went through a reiterative process of reading the texts. The inter-rater reliability, calculated as percent agreement between the raters, was initially 78.9%. The raters then discussed their codings until agreement was reached. Table 2 shows the complete version of the error taxonomy with examples.
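Percent agreement is simply the share of coding decisions on which both raters assigned the same category. A sketch with invented codings (the category labels follow the taxonomy above, but the items are hypothetical):

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items that two raters coded identically."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)

# Hypothetical codings by two raters over the same ten flagged errors:
rater_a = ["word choice", "articles", "verb tense", "mechanics", "articles",
           "pronoun", "mistranslation", "articles", "verb form", "mechanics"]
rater_b = ["word choice", "articles", "verb form", "mechanics", "articles",
           "pronoun", "word choice", "articles", "verb form", "mechanics"]
agreement = percent_agreement(rater_a, rater_b)  # 80.0
```

Note that raw percent agreement does not correct for chance agreement; chance-corrected statistics such as Cohen's kappa are the usual next step when categories are few.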

2.6.5 Statistical Analysis

The learners' writing scores, indices of linguistic complexity, and frequencies of the different types of errors were calculated to derive descriptive statistics (means and frequencies). To compare the skilled and less skilled learners' writing scores, an independent-samples t-test was employed. To test for a writing mode effect (DW, TW, MW), a repeated measures one-way ANOVA was conducted on the learners' writing ability scores, indices of linguistic complexity, and frequencies of errors. The alpha level was set at .05 for all tests.
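A one-way repeated measures ANOVA partitions total variability into condition, subject, and residual components; the F ratio is MS_condition / MS_error with (k-1) and (k-1)(n-1) degrees of freedom. A pure-Python sketch over a tiny invented data matrix (rows = participants, columns = conditions such as the three writing modes), intended only to make the sums-of-squares decomposition explicit:

```python
def rm_anova(data):
    """One-way repeated measures ANOVA.

    data: list of per-subject score lists, one score per condition.
    Returns (F, df_condition, df_error).
    """
    n = len(data)            # number of subjects
    k = len(data[0])         # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj   # residual after removing subjects

    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error

# Three hypothetical participants scored under two conditions:
scores = [[1, 2], [2, 3], [3, 3]]
f_stat, df1, df2 = rm_anova(scores)
```

In practice one would use a library routine such as statsmodels' AnovaRM, which also reports p-values; removing the subject sum of squares from the error term is what distinguishes this design from a between-subjects ANOVA.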

3. Results

3.1 Relative effects of writing mode on the skilled and less skilled learners’ writing ability

The repeated measures one-way ANOVA indicated that the learners' writing ability scores across the three writing modes were not significantly different, F(2, 130) = .677, p = .510. Further analysis within each proficiency group also indicated no effect of writing mode on learners' writing ability, both for the skilled, F(2, 56) = 1.088, p = .344, and the less skilled learners, F(2, 72) = 2.658, p = .077. However, a significant difference was found between the two groups for DW (p < .05) (Table 3). In contrast, no significant difference was found between the skilled and less skilled learners for either TW (p = .202) or MW (p = .858).

3.2 Three Writing Modes and Linguistic Complexity

Table 4 provides the descriptive statistics and the results of the repeated measures one-way ANOVA for lexical diversity and syntactic complexity across DW, TW, and MW. The analysis of lexical diversity indicated that MW produced the longest "text length" (M = 194.35) and "sentence length" (M = 13.87) (p < .001) of the three modes of writing. Although the learners were asked to write around 300 words for each writing task, they wrote fewer than 200 words on average in all three modes (DW: min = 77 words, max = 359 words; TW: min = 86, max = 286; MW: min = 83, max = 381). The researchers surmised that this was due to the limitations of time and the learners' lack of L2 writing proficiency in producing a draft and conducting post-editing within the set time.

While the most frequent 3,000- to 4,000-word families serve as the "minimum" requirement for beginner-to-intermediate L2 learners (Saito, 2018), MW produced a larger percentage of words at the 3,000- and 4,000-word family levels than DW (p < .01). The support of MT also enabled the participants to produce the greatest number of "complex words" (words of more than three syllables) (p < .001). These results indicate that MT helped the learners produce longer words at higher levels of lexical sophistication, which is expected to require cognitive processing at higher levels (Schütze, 2017). However, the MTLD indicated that it was TW that allowed the learners to retrieve the larger range of lexical items (p < .01). That is, the translation process appears to have directed the learners to reflect on the different lexical choices available to them to most effectively express their intended message in L2.
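MTLD, the lexical diversity index used here (McCarthy & Jarvis, 2010), is the mean length of sequential token strings that sustain a type-token ratio (TTR) above 0.72. The sketch below is a simplified forward-only pass over an invented token list; the published measure averages a forward and a reversed pass:

```python
def mtld_forward(tokens, ttr_threshold=0.72):
    """Simplified one-directional MTLD: tokens per completed 'factor'."""
    factors = 0.0
    types = set()
    count = 0
    for tok in tokens:
        count += 1
        types.add(tok.lower())
        ttr = len(types) / count
        if ttr < ttr_threshold:  # diversity has dropped: factor complete
            factors += 1
            types.clear()
            count = 0
    if count > 0:                # partial credit for the leftover segment
        ttr = len(types) / count
        factors += (1 - ttr) / (1 - ttr_threshold)
    return len(tokens) / factors if factors else float("inf")

# A highly repetitive 20-token text completes a factor every 6 tokens
print(round(mtld_forward(["the", "cat", "sat", "on"] * 5), 2))  # 6.67
```

Higher MTLD values indicate that longer stretches of text maintain high lexical diversity; unlike raw TTR, the index is not systematically deflated by text length, which matters here because text lengths differed across the three writing modes.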
Syntactic complexity, which can be considered an integral part of L2 learners' overall development in the target language (Foster & Skehan, 1996; Lu, 2010; Ortega, 2003), indicated that MW was more effective than DW for Words per t-unit (overall complexity), Modifiers per Noun Phrase (NP) (phrasal complexity), and Structural Similarity (sentence syntax similarity). The index of overall complexity indicated that MW (p < .01) helped learners produce more varied and sophisticated grammatical structures than DW. Similarly, the index of Modifiers per NP suggested that the learners were able to produce denser noun phrases with more modifiers; that is, the noun phrases were more complex in the MW mode. The lower Structural Similarity index for both TW and MW (0.13) reflected that the learners had gone beyond using sentences with similar structures to express similar thoughts and feelings. That is, both TW and MW helped learners use sentence structures that they did not use in DW (p < .001).

3.3 Three Writing Modes and Types of Errors

A total of 2,400 errors occurred across all texts regardless of writing mode. The most frequent errors involved articles (n = 460, 19.17%), word choice (n = 435, 18.12%), mechanics (n = 323, 13.46%), prepositions (n = 303, 12.63%), sentence structure (n = 242, 10.08%), and verb tense (n = 217, 9.04%), together accounting for 82.5% of the errors. The remaining error types involved verb form (n = 120, 5.0%), word form (n = 87, 3.63%), pronouns (n = 74, 3.08%), noun endings (n = 67, 2.79%), mistranslations (n = 40, 1.67%), and subject-verb agreement (n = 32, 1.33%). Since the texts differed in length even though all the learners were given the same amount of time for the three writing tasks, the number of errors was converted to an error rate (number of errors / number of words per text). The repeated measures one-way ANOVA indicated a significant difference in error rates between the writing modes (DW: M = .062, SD = .040; TW: M = .088, SD = .052; MW: M = .065, SD = .044), F(1.699, 110.451) = 10.047, p < .001. Post-hoc comparisons showed significantly higher error rates in TW than in DW (p < .001) and MW (p < .001). There was no difference in error rates between DW and MW. For a more detailed interpretation, an analysis by error type follows.
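The normalization described above divides each text's error count by its word count, so that longer texts are not penalized simply for accumulating more raw errors. A minimal sketch; the texts and counts below are invented, not taken from the study's data:

```python
def error_rate(num_errors, text):
    """Errors per word for one composition (count / word count)."""
    num_words = len(text.split())
    return num_errors / num_words

# Two hypothetical drafts with the same raw error count but
# very different lengths end up with very different rates.
short_draft = "I went to the park yesterday with my friend"  # 9 words
long_draft = " ".join(["word"] * 180)                        # 180 words
print(round(error_rate(6, short_draft), 3))  # 0.667
print(round(error_rate(6, long_draft), 3))   # 0.033
```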

When statistical tests were performed, the error rates differed between the writing modes for articles, mistranslations, prepositions, sentence structure, and word choice, as illustrated in Table 5. MW was effective in reducing article and preposition errors. In contrast, both forms of translated writing (TW and MW) yielded a greater number of mistranslation errors than DW (p < .05). Word choice errors were also least frequent in DW and increased with MW.

4. Discussion

The writing products as a whole revealed that writing mode did not influence the learners' ability to write in L2. However, an analysis by writing ability group revealed that TW and MW helped the less skilled learners narrow the gap in writing ability between themselves and the skilled learners, whereas DW did not offer the less skilled learners such an advantage. This difference is not surprising considering that in DW the less skilled learners relied on their existing linguistic knowledge to attend to the given writing task. In TW, the translation process appeared to offer the less skilled learners an opportunity to refine the expressions used to state their intended message. Similarly, access to MT may have helped the less skilled learners communicate at a level of writing proficiency that was not significantly different from that of the skilled learners. This suggests that MT was a source of reference that functioned as an aid to the lexico-grammatical problems that the less skilled learners faced during the writing process (Kol et al., 2018; Lee, 2019; Tsai, 2019). Most strikingly, the writing scores obtained from MW were not significantly different from those of DW. While skepticism has often been expressed toward the quality of MT for writing, the results suggest that Google Translate may have reached a stage that allows users to produce texts as good as those written directly in L2.

When the writing products were submitted to computerized analyses, MW showed higher quality of lexis and syntax. MW produced lexical measures that were enhanced relative to the other modes of writing in terms of text length, sentence length, and the quantity of low-frequency and complex words (Kol et al., 2018). Furthermore, MT helped learners produce a larger variety of syntactic structures than DW. The index of structural similarity in particular indicated that submitting L1 texts to MT helped learners retrieve a larger range of sentence structures, which is important for avoiding monotony and providing appropriate emphasis within discourse.

Considered together, and in line with findings that learners depend on their L1 while learning an L2 (Cook, 2010), the results support the position that learners who lack the ability to write in L2 can be assisted to improve the quality of their writing by utilizing L1 as a mediator to produce L2 (Laufer & Girsai, 2008; Van Weijen et al., 2009; Wang & Wen, 2002; Woodall, 2002). In a similar vein, the greater range of L1 lexico-grammatical items that the learners wanted to produce could be more effectively retrieved through the use of MT.

MT is expected to offer several advantages to L2 writers. First, MT can benefit learners by giving them access to words, phrases, and sentence structures that they might not be able to retrieve on their own. Apparently, when the learners wrote with MT to express their intentions, they were encouraged to select words that they already knew in L1 with the freedom of being able to construct complex sentence structures, which as a whole allowed them to elaborate in more detail on the intended message initially encapsulated in L1 (Lee, 2019).

Second, MT can function similarly to teacher correction or peer feedback. Given that peer feedback is often unsatisfactory for both teachers and learners (Hyland & Hyland, 2019; Paulus, 1999; Rollinson, 2005), MT can provide immediate feedback on writers' drafts and alleviate complications associated with loss of writing self-efficacy or heightened writing anxiety. Third, the process of using MT also fostered language learning in general by improving the learners' metalinguistic knowledge of L2, which is positively correlated with L2 proficiency (Roehr-Brackin, 2018). MT can help learners become more aware of patterns, correlations between form and meaning, lexical choices, and collocational patterns. It can also help learners understand that there is more than one way to create meaning (Vold, 2018). Similarly, MT may have alerted the students to errors by suggesting alternatives, which may in turn create opportunities for L2 acquisition (Lee, 2019; Tsai, 2019). Awareness of how to write in L2 can be further raised when learners use MT to notice mismatches between their L1 and L2 (Qi & Lapkin, 2001; Swain & Lapkin, 1995).

However, the error analysis in this study did not straightforwardly reflect the benefits of MT. DW evidenced the lowest error rates for mistranslations, sentence structures, and word choice. A probable explanation is that when the learners had to write directly in L2 without any human or machine assistance, they actively employed communication strategies (Dörnyei & Scott, 1997; Færch & Kasper, 1983) to achieve their communicative goal or to avoid making mistakes. Access to MT was beneficial for reducing errors related to articles and prepositions, but not mistranslations and poor word choices. The L2 learners' improvement in the use of articles and prepositions is a noteworthy interlanguage development, since articles are one of the most common error types among Korean learners, apparently because the construct does not exist in Korean (Park & Song, 2008). In the present study, learners often omitted articles with generic nouns and confused definite ("the") and indefinite ("a" and "an") articles. Preposition errors were also common, such as with the words to, of, in, on, at, and between.

In comparison, the mistranslations and poor word choices that remained in the MW products may have occurred when the learners were unable to correct the mistranslations in the MT renditions. This could be attributed to two possible causes: 1) writers take more risks and experiment more in their native language before translating, and 2) EFL writers tend to write in structures and/or styles that do not suit the characteristics of the English language. That is, when the writers translated L1 texts containing more difficult vocabulary and sentence structures than they could translate themselves, the resulting L2 texts could show comparatively higher error rates. However, the high number of errors does not necessarily imply a fall in the quality of writing on this occasion. In addition, L2 writers make types of errors similar to those that appear in MT writing (Stapleton & Kin, 2019), often related to mistranslations due to cross-linguistic influences (Kobayashi & Rinnert, 1992; Laufer & Girsai, 2008) and word choices (author; Ferris & Roberts, 2001).

Considered together, the significantly higher error rates found for word choice and mistranslation in MW do not necessarily indicate low-quality writing. There was evidence in the texts that the learners embraced a rich vocabulary rather than conveniently relying on their "lexical teddy bears" (Hasselgren, 1994). In fact, qualitative observation showed that the L1 texts written for MW were marked by a more diverse set of expressions. The participants used figurative speech, including onomatopoeia, repetition of synonyms, proverbs, and four-Chinese-character idioms, to enrich their writing with colorful expressions. Some samples showed that the L2 writers read the MT texts carefully and tried to revise them to improve both their accuracy and rhetorical effect. For instance, a writer using MW wrote a four-Chinese-character idiom that means to look forward to something (학수고대 = 鶴首苦待). MT translated the expression as "I have been waiting and waiting," reiterating the word "wait" to intensify the message but failing to obtain the exact equivalent. The student successfully revised the expression: "I have been eagerly waiting for this South Korea vs. Greece soccer game." This case suggests that upon seeing errors or imprecise translations made by MT, learners actively sought the best option in their mental schema. Similarly, another learner revised MT's rendering of "한숨도 못 잤지만" (meaning "I could not sleep at all") from "I could not even sigh" (a result of the polysemy of "한숨," which can mean a breath, a pause, or a sigh) to "I could not sleep a wink," showing the depth of the writer's vocabulary. Another frequent error concerned the Korean word "당황하다," which was almost always translated into "embarrassed" regardless of the context. This example shows that one-to-one translation may not always be attainable, especially for languages that are largely different, such as Korean and English. Similarly, the Korean verb "기대하다" was also mistranslated into the verb "expect." Actual examples from the writers' compositions are provided in Table 6.
Pedagogical implications can be proposed as follows. First, EFL teachers should guide learners in correcting the word choice and mistranslation errors that MT is prone to make. For instance, participants struggled to convey the message that "the TV showed a black screen." In Google Translate, the Korean expression "화면이 나오지 않았다" yielded the awkward expression "the TV/screen did not come out." These findings indicate that writing instruction needs to provide adequate mini-lessons on grammar and word choice with English corpora, based on examples from MT texts. Teachers should conduct strategy-based instruction to train learners to notice errors or inappropriate uses of the language in MT texts. On noticing the problems, the learners should be able to devise appropriate strategies to resolve them. If problems persist, the instructor can provide further (written) selective feedback.

Second, before learners submit L1 texts to MT, they should be instructed to pay greater attention to reformulating and revising their L1 texts. Although Google Translate, based on the implementation of its new AI algorithm, has improved its ability to avoid literal translations, words that have more than one meaning (polysemous words) and lower-frequency idioms still cause problems (Correa, 2014; Ducar & Schocket, 2018). Consciousness-raising activities should be conducted to make learners aware of this shortcoming rather than to condemn the deficiency of Google Translate. Google Translate can offer multiple options for polysemous single-word items; however, this function may not be readily available when the L1 text is submitted in the form of a sentence, proverb, or idiom.

Third, MT should be utilized to provide a scaffold for L2 writing, through which learners can use their working memory to refine their text to express themselves accurately. MT frees learners from having to attend to grammar rules or literal translations so that they can be more involved in improving the content or rhetorical features of the text (Lee, 2019; Tsai, 2019).


5. Conclusion, Limitations, and Recommendations for Further Research

The skill necessary for the 21st century is "technology literacy," which requires learners to understand the machines that make information accessible (Jewitt, 2012). Crossley (2018) suggested that there is likely to be "technological disruption" whereby potential language learners will rely on MT applications to communicate in a target language. Similarly, current L2 writing may require learners to extend their abilities to search for and analyze information as they write online. In addition to dictionaries and concordancers, MT can be considered a reference tool for L2 writing. Although this tool cannot yet substitute for a fluent human translator, web-based MT, most noticeably Google Translate, can be expected to be the referencing technology of the next generation (Kirchhoff, Turner, Axelrod, & Saavedra, 2011; Van Rensburg et al., 2012). The benefits of MT are potentially multifaceted: MT can function like a dictionary for productive purposes (e.g., to check the meaning of a partially known L2 word, or to retrieve an unknown L2 word for an already known L1 lexical item), to find usage information about lexico-grammatical patterns, and to confirm the intended meaning of an L2 text written by the learner.

With developing interest in the use of MT for L2 writing, and to learn more about the characteristics of this reference tool, further qualitative and quantitative studies should compare the renditions of MT with the L2 writers' post-edited versions of the MT text; such studies would address a limitation of the present study. The students in the present study had the opportunity to improve their texts by using MT rather than submitting the MT texts as finished products; in other words, the texts that the learners produced were not pure MT output. Moreover, in exploring the relationship between L2 learners' writing proficiency and the quality of their MT writing outcomes, matching the genres of the diagnostic and study tasks may provide a more theoretically valid explanation for the effects of writing proficiency. Further process-oriented studies can be conducted with think-aloud methodology or stimulated recall to learn more about the MT-assisted writing process. At a larger scale, Crossley (2018) predicted that the rise of MT would lead to greater scientific understanding of language learning and language acquisition, which will help us analyze how languages are learned, coded, stored, and processed.

Funding

This work was supported by the National Research Foundation of Korea Grant funded by the Korean Government (NRF-2017S1A5A2A01024598).

References

Author XX.
Belam, J. (2003, September). "Buying up to falling down": A deductive approach to teaching post-editing. Paper presented at the MT Summit IX: Workshop on teaching translation technologies and tools, New Orleans, USA.
Bitchener, J., & Ferris, D. R. (2012). Written corrective feedback in second language acquisition and writing. New York: Routledge.
Broda, B., Niton, B., Gruszczynski, W., & Ogrodniczuk, M. (2014). Measuring readability of Polish texts: Baseline experiments. LREC, 24, 573–580.
Bulté, B., & Housen, A. (2012). Defining and operationalising L2 complexity. In A. Housen, F. Kuiken & I. Vedder (Eds.), Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA (pp. 23–46). Amsterdam/Philadelphia: John Benjamins.
Cook, G. (2010). Translation in language teaching: An argument for reassessment. Oxford: Oxford University Press.
Correa, M. (2014). Leaving the "peer" out of peer-editing: Online translators as a pedagogical tool in the Spanish as a second language classroom. Latin American Journal of Content & Language Integrated Learning, 7(1), 1–20.
Crossley, S. A. (2018). Technological disruption in foreign language teaching: The rise of simultaneous machine translation. Language Teaching, 51(4), 541–552.
Dörnyei, Z., & Scott, M. L. (1997). Communication strategies in a second language: Definitions and taxonomies. Language Learning, 47(1), 173–210.
Ducar, C., & Schocket, D. H. (2018). Machine translation and the L2 classroom: Pedagogical solutions for making peace with Google Translate. Foreign Language Annals, 51(4), 779–795.
Færch, C., & Kasper, G. (1983). On identifying communication strategies in interlanguage production. In C. Færch & G. Kasper (Eds.), Strategies in interlanguage communication (pp. 210–238). London: Longman.
Ferris, D. (2011). Treatment of error in second language student writing. Michigan: University of Michigan Press.
Ferris, D., & Roberts, B. (2001). Error feedback in L2 writing classes: How explicit does it need to be? Journal of Second Language Writing, 10(3), 161–184.
Foster, P., & Skehan, P. (1996). The influence of planning and task type on second language performance. Studies in Second Language Acquisition, 18(3), 299–323.
Graesser, A. C., McNamara, D. S., & Kulikowich, J. M. (2011). Coh-Metrix: Providing multilevel analyses of text characteristics. Educational Researcher, 40(5), 223–234.
Groves, M., & Mundt, K. (2015). Friend or foe? Google Translate in language for academic purposes. English for Specific Purposes, 37, 112–121.
Hasselgren, A. (1994). Lexical teddy bears and advanced learners: A study into the ways Norwegian students cope with English vocabulary. International Journal of Applied Linguistics, 4(2), 237–258.
Hoetker, J., & Brossell, G. (1989). The effects of systematic variations in essay topics on the writing performance of college freshmen. College Composition and Communication, 40(4), 414–421.
Hunt, K. W. (1970). Syntactic maturity in school children and adults. Chicago: University of Chicago Press.
Hyland, K., & Hyland, F. (Eds.). (2019). Feedback in second language writing: Contexts and issues. New York: Cambridge University Press.
Jacobs, H. L., Fay Hartfiel, V., Hughey, J. B., & Wormuth, D. R. (1981). Testing ESL composition: A practical approach. Rowley, MA: Newbury House Publishers.
Jarvis, S. (2013). Defining and measuring lexical diversity. In S. Jarvis & M. Daller (Eds.), Vocabulary knowledge: Human ratings and automated measures (pp. 13–45). Amsterdam: John Benjamins.
Jeong, H. (2017). Narrative and expository genre effects on students, raters, and performance criteria. Assessing Writing, 31, 113–125.
Jewitt, C. (2012). Technology, literacy, learning: A multimodal approach. New York: Routledge.
Jia, Y., Carl, M., & Wang, X. (2019). How does the post-editing of neural machine translation compare with from-scratch translation? A product and process study. The Journal of Specialised Translation, 31, 61–86.
Kirchhoff, K., Turner, A. M., Axelrod, A., & Saavedra, F. (2011). Application of statistical machine translation to public health information: A feasibility study. Journal of the American Medical Informatics Association, 18(4), 473–478.
Kliffer, M. (2005). An experiment in MT post-editing by a class of intermediate/advanced French majors. Proceedings of EAMT 10th Annual Conference (pp. 160–165). Budapest, Hungary.
Kobayashi, H., & Rinnert, C. (1992). Effects of first language on second language writing: Translation versus direct composition. Language Learning, 42(2), 183–209.
Kol, S., Schcolnik, M., & Spector-Cohen, E. (2018). Google Translate in academic writing courses? The EuroCALL Review, 26(2), 50–57.
Laufer, B., & Girsai, N. (2008). Form-focused instruction in second language vocabulary learning: A case for contrastive analysis and translation. Applied Linguistics, 29(4), 694–716.
Le, Q. V., & Schuster, M. (2016). A neural network for machine translation, at production scale. Retrieved January 16, 2020, from https://research.googleblog.com/2016/09/a-neural-network-for-machine.html
Lee, S. M. (2019). The impact of using machine translation on EFL students' writing. Computer Assisted Language Learning, 1–19.
Lu, X. (2010). Automatic analysis of syntactic complexity in second language writing. International Journal of Corpus Linguistics, 15, 474–496.
Mackey, A., & Gass, S. M. (2015). Second language research: Methodology and design. New York: Routledge.
McCarthy, P. M., & Jarvis, S. (2010). MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods, 42(2), 381–392.
Mundt, K., & Groves, M. (2016). A double-edged sword: The merits and the policy implications of Google Translate in higher education. European Journal of Higher Education, 6(4), 387–401.
Niño, A. (2008). Evaluating the use of machine translation post-editing in the foreign language class. Computer Assisted Language Learning, 21(1), 29–49.
Norris, J. M., & Ortega, L. (2009). Towards an organic approach to investigating CAF in instructed SLA: The case of complexity. Applied Linguistics, 30(4), 555–578.
O'Loughlin, K., & Wigglesworth, G. (2007). Investigating task design in academic writing prompts. In L. Taylor & P. Falvey (Eds.), IELTS collected papers: Research in speaking and writing performance (pp. 379–421). Cambridge: Cambridge University Press.
Ortega, L. (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing. Applied Linguistics, 24(4), 492–518.
Park, T. S., & Song, M. J. (2008). The primary causes of article errors made by Korean advanced learners of English. English Teaching, 63(3), 71–90.
Park, Y., Min, H., & Kim, J. (2014). Development of writing assessment software for high school classrooms in Korea (Report No. RRE 2014-11). Seoul: Korea Institute for Curriculum and Evaluation.
Paulus, T. M. (1999). The effect of peer and teacher feedback on student writing. Journal of Second Language Writing, 8(3), 265–289.
Polio, C., & Friedman, D. A. (2017). Understanding, evaluating, and conducting second language writing research. London: Routledge.
Qi, D. S., & Lapkin, S. (2001). Exploring the role of noticing in a three-stage second language writing task. Journal of Second Language Writing, 10(4), 277–303.
Roehr-Brackin, K. (2018). Metalinguistic awareness and second language acquisition. New York: Routledge.
Rollinson, P. (2005). Using peer feedback in the ESL writing class. ELT Journal, 59(1), 23–30.
Saito, K. (2018). Advanced second language segmental and suprasegmental acquisition. In P. A. Malovrh & A. G. Benati (Eds.), The handbook of advanced proficiency in second language acquisition (pp. 282–303). New Jersey: Wiley Blackwell.
Schoonen, R., van Gelderen, A., Stoel, R., Hulstijn, J., & de Glopper, K. (2011). Modeling the development of L1 and EFL writing proficiency of secondary school students. Language Learning, 61, 31–79.
Schütze, U. (2017). Language learning and the brain: Lexical processing in second language acquisition. Cambridge: Cambridge University Press.
Stapleton, P., & Kin, B. L. K. (2019). Assessing the accuracy and teachers' impressions of Google Translate: A study of primary L2 writers in Hong Kong. English for Specific Purposes, 56, 18–34.
Swain, M., & Lapkin, S. (1995). Problems in output and the cognitive processes they generate: A step towards second language learning. Applied Linguistics, 16(3), 371–391.
Tsai, S. C. (2019). Using Google Translate in EFL drafts: A preliminary investigation. Computer Assisted Language Learning, 32(5-6), 510–526.
Van Rensburg, A., Snyman, C., & Lotz, S. (2012). Applying Google Translate in a higher education environment: Translation products assessed. Southern African Linguistics and Applied Language Studies, 30(4), 511–524.
Van Weijen, D., Van den Bergh, H., Rijlaarsdam, G., & Sanders, T. (2009). L1 use during L2 writing: An empirical study of a complex phenomenon. Journal of Second Language Writing, 18(4), 235–250.
Vold, E. T. (2018). Using machine translated texts to generate L3 learners' metalinguistic talk. In Å. Haukås, C. Bjørke & M. Dypedahl (Eds.), Metacognition in language learning and teaching (pp. 67–97). New York: Routledge.
Wallwork, A. (2016). Using Google Translate and analysing student- and GT-generated mistakes. In English for academic research: A guide for teachers (pp. 55–68). Cham: Springer.
Wang, W., & Wen, Q. (2002). L1 use in the L2 composing process: An exploratory study of 16 Chinese EFL writers. Journal of Second Language Writing, 11(3), 225–246.
Woodall, B. R. (2002). Language-switching: Using the first language while writing in a second language. Journal of Second Language Writing, 11, 7–28.
Appendices

A. Writing Prompts for the Three Modes of Writing

 Writing prompt with Machine Translation

The following occurs in the sequence of Figures (1), (2), and (3). Describe the situations respectively for Figures (1) and (2), and then write about what may happen for Figure (3). Write your draft in L1 first, and then use a machine translator (e.g., Google Translate) to write the draft in English, and then make revisions. Write about 300 words.

[Figure panels (1), (2), and (3) not shown; panel (3) displays a question mark.]

 Writing prompt for Direct Writing

The following occurs in the sequence of Figures (1), (2), and (3). Describe the situations respectively for Figures (1) and (2) as shown, and write about what may happen for Figure (3). Write your draft in English and continue to revise. Write about 300 words.

[Figure panels (1), (2), and (3) not shown; panel (3) displays a question mark.]

 Writing prompt for Translated Writing

The following Figures (1), (2), and (3) occur in order. Describe the situations respectively for Figures (1) and (2) as shown, and write about what may happen for Figure (3). Write your draft in Korean first, and then try to translate what you have written into English. Write about 300 words. Moreover, make sure your Korean draft is saved together with your L2 writing.

[Figure panels (1), (2), and (3) not shown; panel (3) displays a question mark.]

B. Writing Prompt for Diagnostic Essay

Please state your position as to whether it is appropriate to install closed-circuit television (CCTV) cameras in schools. After considering the advantages and disadvantages of having them in schools, discuss three reasons for your position. Make sure you write an introduction and a conclusion for your essay. Write at least 400 words.
List of Tables

Table 1. Writing Topics and Writing Modes
Table 2. Classification of Writing Errors with Examples
Table 3. Three Writing Modes and Writing Abilities
Table 4. Descriptive Statistics and Inferential Statistics for Linguistic Complexity
Table 5. Descriptive Statistics and Inferential Statistics for Error Rates
Table 6. Errors Resulting from Mistranslations
Table 1
Writing Topics and Writing Modes

Group (N)       Writing Task 1               Writing Task 2               Writing Task 3
Group 1 (20)    Direct Writing, Topic A      Translated Writing, Topic B  Machine-translated Writing, Topic C
Group 2 (24)    Direct Writing, Topic C      Translated Writing, Topic A  Machine-translated Writing, Topic B
Group 3 (22)    Translated Writing, Topic C  Direct Writing, Topic B      Machine-translated Writing, Topic A

Note: Topic A = Amusement Park, Topic B = Subway, Topic C = FIFA World Cup
Table 2
Classification of Writing Errors with Examples

1. Article (Art): Cases where an article is missing.
   e.g., As the woman receives call from her friend. (MW #29)
2. Mechanics (MC): Spelling, punctuation, and capitalization errors.
   e.g., What is wrong with tv? (MW #65)
3. Mistranslation (MT): Includes incorrect translations, words, or expressions that are incomprehensible.
   e.g., On 8:00pm I turned on tv, however, the screen of the television did not come out. (MW #7) [As in when the learner wanted to indicate that the TV screen showed nothing.]
4. Noun Endings (N): Refers to cases where plural nouns are in incorrect forms or nouns that are incorrectly written in the plural form.
   e.g., After grabbing my hair, I call my boyfriends. (TW #9)
5. Preposition (Prep): Cases where prepositions are misused or omitted.
   e.g., There were three girls who felt intimate each other. (MW #46)
6. Pronoun (Pro): Involves the incorrect use of personal pronouns or relative pronouns.
   e.g., Me and my friends looked forward to go there. (DW #2)
7. Sentence Structure (SS): Includes cases of i) run-on sentences (including comma splices) and fragments, ii) verb phrase errors, and iii) word order or phrase order errors.
   e.g., I’ll buy pizza, chicken and beer you have to turn on the TV. (TW #38)
8. Subject-Verb Agreement (SV): Cases where the predicate does not agree with the number of the subject.
   e.g., I knew the fact that there are South Korea and Greece soccer game at the eight o'clock. (GT #23)
9. Verb Form (VF): Cases of wrong voice (active/passive) or confusion between infinitives and gerunds.
   e.g., I excited about the game which begins in 20:00 today. (GT #12)
10. Verb Tense (VT): Cases of wrong verb tense (e.g., not using perfect tense when necessary) or inconsistency in verb tense use.
   e.g., I feel very bad. I struggled with my friends, and I immediately booked another guesthouse. (GT #34)
11. Word Choice (WC): Includes lexical errors for i) incorrect word choice or ii) unclear messages or awkward expressions.
   e.g., Some people ran up for looking for the driver’s life. (TW #34)
12. Word Form (WF): Instances of i) incorrect part of speech or ii) ill-formed word.
   e.g., It was a very tired day. (DW #54)

Note: DW = Direct Writing, TW = Translated Writing, MW = Machine-translated Writing
Table 3
Three Writing Modes and Writing Abilities

Writing Mode                Proficiency Group        M      SD     t       df   Sig.
Diagnostic Writing          Less Skilled Learners    65.34  13.13  -6.194  64   .000**
                            Skilled Learners         79.54  4.21
Direct Writing              Less Skilled Learners    77.74  4.77   -2.291  64   .025*
                            Skilled Learners         80.80  5.84
Translated Writing          Less Skilled Learners    77.46  7.36   -1.290  64   .202
                            Skilled Learners         79.51  5.57
Machine-translated Writing  Less Skilled Learners    79.14  5.88   .180    64   .858
                            Skilled Learners         78.88  5.58

Note: **p < .001, *p < .05

Table 4
Descriptive Statistics and Inferential Statistics for Linguistic Complexity

                                 Direct Writing   Translated Writing   Machine-translated Writing
                                 M       SD       M       SD           M       SD       F           Post-hoc
Lexical Measures
Text length (No. of words)       167.95  51.28    181.23  46.35        194.35  54.76    11.405***   1<2**, 1<3***, 2<3*
Sentence length (No. of words)   11.47   2.99     12.56   3.32         13.87   3.34     18.717***   1<2**, 1<3***, 2<3**
3,000 Word Families              0.84    0.97     0.91    0.93         1.21    1.01     3.402*      1=2, 1<3*, 2=3
4,000 Word Families              0.54    0.68     1.19    1.28         1.04    0.98     6.736**     1<2**, 1<3**, 2=3
Complex words                    11.18   4.85     12.03   6.23         15.26   6.96     9.647***    1=2, 1<3***, 2<3**
MTLD                             56.26   16.27    62.43   17.34        59.70   17.16    3.276*      1<2**, 1=3, 2=3

Syntactic Measures
Words/t-unit                     11.10   2.62     12.02   2.82         12.43   2.83     7.210**     1<2**, 1<3**, 2=3
Clause/t-unit                    1.46    0.23     1.55    0.30         1.51    0.24     3.701*      1<2*, 1=3, 2=3
Modifiers per Noun Phrase        0.65    0.16     0.69    0.13         0.79    0.14     15.900***   1=2, 1<3***, 2<3***
Structural similarity            0.15    0.05     0.13    0.03         0.13    0.04     11.426***   1>2***, 1>3***, 2=3

Note: ***p < .001, **p < .01, *p < .05; 1 = Direct Writing, 2 = Translated Writing, 3 = Machine-translated Writing; MTLD = measure of textual lexical diversity
Table 5
Descriptive Statistics and Inferential Statistics for Error Rates

                          Direct Writing   Translated Writing   Machine-translated Writing
                          M      SD        M      SD            M      SD       F          Post-hoc
Articles                  1.21   1.27      1.93   1.57          0.99   1.04     9.542***   1<2*, 1=3, 2>3***
Mechanics                 1.02   1.44      1.18   1.22          0.76   0.98     3.041      N/A
Mistranslations           0.02   0.11      0.16   0.37          0.17   0.44     4.808*     1<2*, 1<3*, 2=3
Noun Endings              0.13   0.35      0.23   0.44          0.24   0.42     1.991      N/A
Prepositions              0.99   1.24      1.14   0.97          0.62   0.67     6.440**    1=2, 1=3, 2>3***
Pronouns                  0.18   0.35      0.25   0.56          0.22   0.42     .414       N/A
Sentence Structures       0.51   0.66      1.00   1.01          0.67   0.98     7.733**    1<2**, 1=3, 2=3
Subject-verb Agreements   0.05   0.18      0.09   0.25          0.16   0.47     2.174      N/A
Word Choice               1.00   0.89      1.24   1.22          1.57   1.28     4.523*     1=2, 1<3*, 2=3
Word Form                 0.18   0.36      0.36   0.59          0.23   0.40     2.656      N/A
Verb Form                 0.32   0.42      0.48   0.77          0.29   0.43     2.351      N/A
Verb Tense                0.59   0.87      0.74   1.25          0.54   0.78     .835       N/A

Note: *p < .05, **p < .01, ***p < .001; 1 = Direct Writing, 2 = Translated Writing, 3 = Machine-translated Writing
Table 6
Errors Resulting from Mistranslations

Intended message: The TV showed a black screen. / Nothing appeared on the TV screen.
L1 expression: ‘화면이 나오지 않았다’
Mistranslation/word choice errors:
• “tv did not come out yet”
• “Screen did not come out correctly.”
• “television did not float any video screen”
• “But TV has become the off”
• “But it came out only black channel”

Intended message: Felt bewildered / felt frustrated / did not know what to do
L1 expression: ‘당황하였다’
Mistranslation/word choice errors:
• “… there was no picture on the screen. I was very embarrassed.”
• “TV screen has become invisible, so I was embarrassed.”
• “Then the TV did not work at all. Embarrassment and irritation rises at the same time.”
• “The boy gets embarrassed because he cannot watch the game.”

Intended message: One anticipated / looked forward to an event / an anticipated event
L1 expression: ‘기대하던’, ‘기대했다’
Mistranslation/word choice errors:
• “[The World Cup match] is really expected.”
• “I expected much, so I was very angry about this situation.”
• “It was the amusement park that we expected for a few days.”
