Smart 2014
http://journals.cambridge.org/REC
Jonathan Smart
Abstract
This study examines the role of guided induction as an instructional approach in paper-based data-
driven learning (DDL) in the context of an ESL grammar course during an intensive English program at
an American public university. Specifically, it examines whether corpus-informed grammar instruction
is more effective through inductive, data-driven learning or through traditional deductive instruc-
tion. In the study, 49 participants completed two weeks of ESL grammar instruction on the passive
voice in English. The learners participated in one of three instructional treatments: a data-driven
learning treatment, a deductive instructional treatment using corpus-informed teaching materials,
and a deductive instructional treatment using traditional (i.e., non-corpus-informed) materials.
Results from pre-test, post-test, and delayed post-test indicated that the DDL group significantly
improved their grammar ability with the passive voice, while the other two treatment groups did not
show significant gains. The findings from this study suggest that in this learning context there are
measurable benefits to teaching ESL grammar inductively using paper-based DDL.
1 Introduction
Language corpora and corpus-informed teaching materials have been used in language
teaching over several decades, and the predominant model of teaching with corpora
has been data-driven learning or DDL, originally developed and promoted by Johns and
colleagues (Johns, 1991; Johns & King, 1991). DDL allows learners to inductively discover
language structures and patterns through interacting with concordancing software or
with concordance-based instructional materials. Although researchers and proponents of
corpus-informed instruction have been cautiously optimistic about the potential of DDL in
language teaching since Johns’ groundbreaking work, direct uses of corpora in language
teaching have not been widely adopted (Conrad, 2005; Flowerdew, 2012; Römer, 2006).
The goal of this paper is to examine the role that guided induction plays in DDL-based
ESL grammar instruction through a classroom-based study comparing DDL with more
traditional approaches to grammar instruction.
Classroom-based research on learning outcomes and learner attitudes towards DDL has
been carried out in several areas of language learning. Much of this research has been in the
field of vocabulary learning (cf. Boulton, 2010a, 2012 for comprehensive lists of empirical
The role of guided induction in paper-based data-driven learning 185
DDL research). Researchers have also seen the potential for corpora to contribute to
grammar instruction, as corpus-based research in grammar has contributed greatly to
descriptive linguistics (Conrad 2000, 2005). There have been several classroom-based
studies of teaching grammar with DDL, including work by Belz and Vyatkina (2005a,
2005b, 2008), Boulton (2008, 2009a, 2009b, 2010b), Chambers and O’Sullivan (2004),
Conroy (2010), Estling Vannestål and Lindquist (2007), Gaskell and Cobb (2004), Hadley
(2002), O’Sullivan and Chambers (2006), Pérez-Paredes, Sánchez-Tornel and Alcaraz
Calero (2012), Pérez-Paredes, Sánchez-Tornel, Alcaraz Calero and Aguado Jiménez
(2011), Tian (2005a, 2005b), Whistle (1999), Yoon (2008) and Yoon and Hirvela (2004).
Many of these studies have measured learners’ attitudes and experiences with DDL and
grammar instruction, and some have also measured learning outcomes and found benefits
to using DDL in grammar instruction. Though there is evidence to support such uses of
DDL, it is not entirely clear whether the benefits are a result of the inductive approach to
language instruction, the use of corpus-informed materials and tools, or a combination
thereof. In short, there is still a need for further research to understand how the inductive
approach to DDL contributes to language learning (Flowerdew, 2009).
Despite the research supporting DDL, a recent trend over the last decade in developing
corpus-informed grammar textbooks has not been the incorporation of an inductive
approach to grammar learning, but rather the presentation of information from corpora and
corpus-based research about grammar in an explicit, rule-centered framework which runs
counter to the inductive principles inherent in DDL (cf. Conrad & Biber, 2009; McCarthy,
McCarten & Sandiford, 2005; Reppen, Bunting, Diniz, Blass, Iannuzzi & Savage, 2012).
While there are examples of corpus-driven grammar books that are not purely rule-based
(e.g., Thornbury, 2004), they remain rare. The apparent contrast between DDL research on
grammar instruction and corpus-informed grammar textbooks could, perhaps, be dismissed
as an effect of the fundamental difference between a textbook and an in-class approach to
instruction based around learner-centered, computer-based activities.
However, over the history of DDL, there has been a range of approaches to learning from
corpora as outlined in Mukherjee (2006: 12), varying in the degree of learner autonomy and
the degree to which computers and concordancing software have been used directly by
learners. Since Johns’ early work in DDL, a popular instructional strategy has been to have
learners interact directly with corpora using concordancers to conduct their own searches of
language features and patterns. However, as an example of rethinking traditional computer-
based DDL, Boulton (2010b) demonstrates that DDL can still be effective with prepared,
preselected inductive activities based on corpora, and that this type of less autonomous,
paper-based DDL may actually be more effective with certain groups of learners.
In fact, interacting directly with corpora is not a necessary part of DDL, nor is it always
desirable. Hands-on DDL faces several hurdles, including helping learners understand what
a corpus is and why it would be a useful tool in language learning, familiarizing learners
with a concordancer (often using a software interface not in their L1), and helping learners
decide what language features can be used with the software. A teacher must carefully
consider these potential challenges when considering implementing DDL in a grammar
classroom. Boulton (2010b) suggests that when working with lower-level learners
186 J. Smart
especially, using computers for DDL is possibly an unnecessary complication when activ-
ities with pre-prepared DDL materials can accomplish many of the same goals. There are
other situations, additionally, where using prepared DDL materials instead of having lear-
ners interact directly with corpora may be more beneficial. For learners who are unfamiliar
with using software in language learning and for those who are unaccustomed to inductive
or learner-centered activities, the challenge of directly interacting with corpora may have the
unintended result of inhibiting learning instead of being a benefit.
So DDL, in its use over the last few decades, is not necessarily characterized by direct
interaction with language corpora or by total learner autonomy. Rather, I propose that two
particular characteristics define DDL:
1) real language data are used as sources of language learning materials or reference
resources;
2) learning activities are student-centered and focus on language discovery.
DDL, under this definition, is not based on having learners work semi-autonomously with
concordancing software, but rather relies on carefully designed and scaffolded activities to
help learners discover language structures through working with real language samples,
whether on computer or not. In this sense, explicit presentations of rules in corpus-informed
grammar textbooks and discovery learning through DDL are not solely distinguished from
one another by mode of delivery, but also in their understanding of how grammar is learned.
Only the first characteristic listed above, using language from authentic texts as a tool for
learning, has been a consistent link between DDL research and the recent emergence
of corpus-informed textbooks. In fact, the benefits of using corpus-based language
samples and language descriptions have been the foundation of both DDL research and the
development of corpus-informed grammar textbooks.
In both hands-on and paper-based DDL the pedagogical approach focuses on induction as a
learning strategy (Flowerdew, 2009; Mukherjee, 2006). The assumption is that there is an
inherent connection between inductive language discovery and corpora as learning tools, and
this assumption is not shared by the aforementioned corpus-informed grammar textbooks like
the Touchstone series (McCarthy et al., 2005), Real grammar (Conrad & Biber, 2009), or
Grammar and beyond (Reppen et al., 2012). Rather, these texts present information about
language deductively with explicit information about grammar rules and patterns. The exis-
tence of textbooks in parallel with DDL-based approaches to grammar instruction outlines an
important question, in the current research on learning with corpora, of whether instruction
benefits from an inductive or a deductive approach. When the current research on DDL has
shown positive learning outcomes, are these gains attributable to the use of corpus-informed
teaching materials and tools, or to the inductive approach of DDL, or both?
At the less autonomous end of Mukherjee’s (2006) cline of DDL, learning through
preselected concordance materials does not necessarily entail a purely inductive approach.
Rather, the language discovery activities are guided and facilitated by an instructor
(Boulton, 2010b; Flowerdew, 2009; Stevens, 1991). Research on the pedagogical approach
of guided induction has sought to examine best practices of teacher-facilitated discovery
learning in second language teaching. Sinclair (2003), in particular, models how facilitated
3. Intervention: optional step to provide learners with hints or clearer guides for
induction.
4. Induction: making one’s own rule for a particular feature.
Consequently, corpus-based activities in this guided inductive approach would begin
with learners looking at pre-selected examples of language data. They are then guided by the
teacher to make observations, typically through group- or pair-based problem-solving
activities, to identify patterns or trends in the language data. At this point, the teacher can,
if necessary, intervene with careful hints or suggestions for the learners to help them
accomplish the problem-solving activities. Finally, the learners are guided to complete
subsequent activities using the patterns of the grammar feature under review.
2 The study
Following Boulton’s (2010b) model of teaching using corpora without computers, this
study compared the learning outcomes from three groups of learners each participating
in a different instructional treatment. The first group (data-driven learning, henceforth
DDL) completed a guided inductive treatment using the aforementioned practices in
paper-based DDL. The second group (i.e., deductive, corpus-informed instruction,
henceforth DCI) completed a deductive, rule-based instructional treatment using teaching
materials derived from language corpora. Finally, to provide a baseline for comparison, a
third instructional group received traditional grammar instruction (TGI) using conventional
teaching materials. The participants in each group first completed a pre-test on the
grammatical structure of the passive voice, then over the course of two weeks received four
hours of instruction and practice on this structure. They completed an immediate post-test
following instruction, and a delayed post-test two weeks later.
Female        4    2    5   12
Male         12   13   13   37
L1 Arabic    11   13   13   37
L1 Chinese    5    2    5   12
Total        16   15   18   49
2.1 Participants
Volunteer participants were recruited from three advanced grammar classes in an intensive
English program at a regional public American university. The learners were L2 learners of
English who had completed, on average, 5.4 years of English study, and had a mean age
of 22. In addition to the grammar class, the learners also took courses in listening and
speaking, composition, and reading and vocabulary. Of the 49 participants, 38 were male
and 11 were female. All participants were from one of two L1 language backgrounds:
Chinese was the first language for 12 participants and Arabic for the remaining 37. Neither
sex nor L1 was controlled for, as pre-existing classes were used for the study. However,
the three groups (DDL, DCI, and TGI) had similar distributions of Chinese and Arabic
speakers and of males and females (Table 1).
Prior to the instructional treatment, all participants completed a general test of grammar
knowledge based on the structure section of the paper-based TOEFL (i.e., section 2)
(Table 2). A one-way ANOVA of the participants’ mean scores indicated no significant
differences between the three treatment groups, F (2, 43) = .988, p = .38.
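The reported statistic can be checked by hand. For an F test with two numerator degrees of freedom, the upper-tail p-value has the closed form p = (1 + 2F/df2)^(-df2/2). The sketch below applies that identity to the values given above. The function name is hypothetical, and the code is only an illustrative check, not the analysis procedure used in the study, which was presumably run in a statistics package.

```python
def f_test_p_value_df1_2(f_stat: float, df_error: int) -> float:
    """Upper-tail p-value for an F statistic with 2 numerator degrees of freedom.

    For df1 = 2 the F survival function reduces to (1 + 2F/df2) ** (-df2 / 2),
    so no statistics library is needed. Hypothetical helper for illustration.
    """
    if f_stat < 0 or df_error <= 0:
        raise ValueError("F must be non-negative and df_error positive")
    return (1.0 + 2.0 * f_stat / df_error) ** (-df_error / 2.0)


# Values reported for the TOEFL-based grammar test: F(2, 43) = .988
p = f_test_p_value_df1_2(0.988, 43)
print(f"p = {p:.2f}")  # prints "p = 0.38"
```

The closed form holds only when the numerator degrees of freedom equal 2, as they do whenever three groups are compared.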
The researcher was also the classroom instructor for all three groups. This decision was
made in order to maintain as consistent instruction as possible across the three groups and to
control for variation in teaching style and related effects. In order to reduce the possibility of
influencing the outcome of the instructional treatments (i.e., via a Hawthorne effect), the
teacher kept a reflective journal of his own behavior in each of the classes, both prior to and
during the instructional treatments, as a way to monitor interactions with the treatment
groups. Each of the three classes met twice weekly for one-hour sessions, and the study was
conducted over a three-week period, with four class periods (i.e., four hours) dedicated to
instruction on the active and passive voice.
Each grammar class received instruction on the passive voice using a different instructional
approach. For the two corpus-informed treatment groups (DDL and DCI), teaching
materials were developed using available corpus-based descriptions of the passive voice
(Biber, Johansson, Leech, Conrad & Finegan, 1999; Conrad & Biber, 2009); an example of
a data-driven activity is provided in Appendix A. Grammar teaching materials for the two
approaches to corpus-informed instruction were developed using an operational framework
for describing the passive voice based on Celce-Murcia and Larsen-Freeman’s (1999; also
Larsen-Freeman 2001) three-dimensional model of Form – Meaning – Use. Information on
use included simple register differences in use (between speech and academic writing) as
well as differences in use between ‘short’ and ‘long’ passives.
Authentic language samples were also used in developing materials, the samples being
taken from academic prose and news articles from two publicly available online corpora: the
TIME Magazine corpus (Davies 2007) and the Corpus of Contemporary American English
(Davies 2008). Whenever possible, the language samples were used for materials for both
corpus groups.
In the DDL group, the instructional approach was based on the principles of guided
induction and followed the ‘four I’s’ outlined in Flowerdew (2009): illustration, interaction,
intervention, and induction. Specifically, the participants received preselected language
samples (as printed concordance lines) that were designed to illustrate specific differences in
form, meaning, and use of the passive voice. The participants worked in small groups and in
jigsaw activities to identify and explain what they found in the examples. Following the
discovery activities, they shared their findings with the class in a discussion activity, revised
their findings according to feedback from classmates, and then completed additional pro-
ductive writing activities where they applied their rules.
The DCI group used corpus-informed materials in a deductive instructional approach. The
instruction followed the widely used instructional approach of Presentation – Practice –
Production (PPP). This group first received corpus-informed grammar rules about the passive
voice, including information about form, meaning, and use. They then practiced different
forms of the passive voice in different contexts (in exercises designed using sentences taken
from the corpora), and then produced the target structures in the same short writing activities
as the DDL group.
As a point of comparison, the TGI group received grammar instruction following the same
deductive instructional approach as the DCI group (i.e., PPP), but used traditional grammar
teaching materials instead of corpus-informed descriptions of rules and activities. This group
first received instruction on the form, meaning, and use of the passive voice based on widely
used grammar textbooks, and then completed practice activities from the textbooks. Following
the practice, they also produced the passive voice in short writing tasks.
2.3 Instruments
Prior to the instructional period, the participants in all three groups were given a short pre-test of
their grammar ability related to the active and passive voice in English. Immediately following
the two weeks of instruction, the participants completed another test with the same test tasks.
Then, two weeks following the instructional period, participants completed a delayed post-test,
again with the same test tasks. Given the short duration of each class period (60 minutes), the
tests were designed to be short demonstrations of the participants’ overall grammar knowledge
in relation to the active and passive voice in English through different tasks.
The pre-test, post-test, and delayed post-test all consisted of three tasks (see sample test in
Appendix B) that were designed to measure ability related to form, meaning, and use
(Table 3). The first task was to determine if sentences contained errors in the use of either
the active or passive voice, and if so, to correct the error. This task was designed to measure
the participants’ knowledge of the form and meaning of the active and passive voice. The
second task was to rewrite active sentences into the passive voice or vice versa; and in
the third task, participants had to decide which type of sentence would be more appropriate
based on the register of use (i.e., speech or academic writing).
The goal of the register awareness task was to measure participants’ immediate ability to
distinguish which use of a particular voice was appropriate in one of the two major registers
of speech and academic writing (Biber, 1988). Understanding how registers differ based on
language structure is a valuable area of language ability for learners (Aguado-Jiménez,
Pérez-Paredes & Sánchez, 2012), and while simplistic, the task was designed to provide
some feedback on how the instructional treatments may have affected participants’
awareness of register differences.
2.4 Analysis
The participants’ performance on the three tests was scored and mean scores were calculated
across the groups (Table 4). As the data met the assumptions of normality for
parametric analysis, ANOVA tests were conducted to determine whether there was
improvement during the study among the three groups. Initially, a one-way ANOVA was
conducted on the participants’ pre-test scores to determine if there were differences between
groups prior to the instructional treatment. Subsequently, Repeated Measures ANOVAs
were conducted for each treatment group to measure their performance on the three
assessment tasks over time. Additionally, scores on individual test tasks were analyzed to
compare performance across the three treatment groups. However, as these individual task
scores were based on low point values and did not individually meet the assumptions
of normality or sphericity, non-parametric Kruskal-Wallis H tests were used in this stage of
the analysis.
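The non-parametric step can be sketched as follows. This is a minimal, standard-library illustration of the Kruskal-Wallis H statistic with the usual tie correction, which matters for task scores based on low point values because many learners share the same score. The function name and the toy scores are hypothetical, and the study itself presumably relied on a statistics package rather than code like this.

```python
from collections import Counter


def kruskal_wallis_h(groups: list[list[float]]) -> float:
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = sorted(score for group in groups for score in group)
    n_total = len(pooled)
    if n_total < 2:
        raise ValueError("need at least two observations in total")

    # Give each distinct score the average of the 1-based ranks it occupies.
    counts = Counter(pooled)
    rank_of = {}
    position = 0
    for score in sorted(counts):
        tied = counts[score]
        rank_of[score] = position + (tied + 1) / 2.0
        position += tied

    # Uncorrected H, computed from the rank sum of each group.
    h = 0.0
    for group in groups:
        rank_sum = sum(rank_of[score] for score in group)
        h += rank_sum ** 2 / len(group)
    h = 12.0 / (n_total * (n_total + 1)) * h - 3.0 * (n_total + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    ties = sum(t ** 3 - t for t in counts.values())
    correction = 1.0 - ties / (n_total ** 3 - n_total)
    # If every score is identical the correction is zero and H is defined as 0.
    return h / correction if correction > 0 else 0.0


# Hypothetical low-point task scores for two small groups (not study data).
h = kruskal_wallis_h([[1, 1, 2], [2, 3, 3]])
print(f"H = {h:.3f}")  # prints "H = 3.333"
```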
3 Findings
The one-way ANOVA on pre-test scores failed to indicate significant differences between
the three treatment groups prior to instruction, F (2, 43) = 2.993, p > .05 (Table 5). This
suggests that the participant groups were at the same ability at the beginning of the
instructional period.
The RMANOVA analyses revealed that the overall tests of passive voice ability can
distinguish between the three groups. The DDL group showed statistically significant
increases in their mean scores from the pre-test to the post-tests while the other two groups
did not.
TGI group: mean scores from this group increased somewhat from the pre-test to the
post-test, but decreased in the delayed post-test (delayed post-test scores were very close to
pre-test scores). The RMANOVA of overall test scores for the TGI group indicated that
there was a significant main effect for test, F (2, 24) = 7.06, p < .05 (Table 6). However,
pairwise comparisons indicated that the increase in mean scores from pre-test to post-test
was not significant (p = .02) when using a Bonferroni adjustment of α = .05/3 = .017 to
calculate significance. Rather, the comparisons revealed that the significant main effect was
due to a significant decrease in mean scores from the post-test to the delayed post-test
(p = .001). These results indicate that the gains from pre-test to post-test were not significant
while the decrease following the instructional period (i.e., from post-test to delayed post-
test) was, in fact, significant. Consequently, there is no evidence of measurable learning
from the TGI instructional treatment.
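The Bonferroni decision rule used in these pairwise comparisons is simple arithmetic, sketched below with the two p-values reported for the TGI group. The helper name is hypothetical and the snippet only restates the comparison described above; it is not the software used in the study.

```python
def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under a Bonferroni adjustment."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


# Three pairwise comparisons among pre-test, post-test, and delayed post-test.
adjusted = bonferroni_alpha(0.05, 3)
print(f"adjusted alpha = {adjusted:.3f}")  # prints "adjusted alpha = 0.017"

# p-values reported for the TGI group.
tgi_pairwise = {
    "pre-test vs post-test": 0.02,
    "post-test vs delayed post-test": 0.001,
}
for comparison, p_value in tgi_pairwise.items():
    verdict = "significant" if p_value < adjusted else "not significant"
    print(f"{comparison}: p = {p_value} -> {verdict}")
```

With the unadjusted threshold of .05, the pre-test to post-test gain (p = .02) would have counted as significant, which is why the adjustment changes the interpretation for this group.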
DCI group: the one-way RMANOVA failed to show a significant main effect for test,
F (2, 18) = .594, p > .05 (Table 7). The changes in mean scores between pre-test (2.60),
post-test (3.30), and delayed post-test (2.80) were not significant for this treatment group,
suggesting that while the mean scores did increase from the pre-test to the post-tests, the
gains were not statistically significant.
DDL group: a one-way RMANOVA of overall test scores indicated that there was a
significant main effect for test for this group, F (2, 20) = 13.207, p < .05 (Table 8). In the
post hoc analysis, pairwise comparisons indicated that the increase in mean scores from
pre-test to post-test was significant (p < .001), as was the increase from pre-test to delayed
post-test mean scores (p = .009) based on the parameters of the aforementioned adjusted
alpha. The decrease in mean scores from post-test (5.09) to delayed post-test (4.82) was not
statistically significant (p = .43). Based on these findings, it is reasonable to conclude that
the DDL instructional treatment led to a significant increase in grammar performance for
this group and that these gains were maintained over time, indicative of an instructional
effect from the treatment.
Following the analysis of the overall tests of the active/passive voice, analyses of the
three test tasks were conducted individually. The first task, error correction, measured
participants’ receptive and productive grammar ability associated with the form of the
active/passive voice construction. Scores on the pre-test, post-test, and delayed post-test
tasks are reported in Table 9. The Kruskal-Wallis analysis failed to show any differences
between the three treatment groups on the pre-test performance for this task (H (2) = .412,
p = .814). On the post-test, the scores did differentiate significantly between the three
groups (H (2) = 11.083, p = .004), with a mean rank of 13.15 for TGI, 25.05 for DDL, and
14.85 for DCI. Analysis of the delayed post-test also indicated a significant difference
between the treatment groups (H (2) = 7.863, p = .02), with a mean rank of 12.77 for TGI,
23.45 for DDL, and 17.10 for DCI. The DDL group outperformed the other two groups on
both the post-test and the delayed post-test for this task.
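The p-values reported for this task are consistent with the usual chi-square approximation for H. With three groups the test has two degrees of freedom, and the chi-square upper-tail probability then reduces to p = exp(-H/2). The sketch below assumes that this approximation was the one used (the text does not say so) and uses a hypothetical function name.

```python
from math import exp


def kruskal_p_value_three_groups(h_stat: float) -> float:
    """Chi-square approximation to the Kruskal-Wallis p-value for 3 groups.

    With df = 2 the chi-square survival function is exactly exp(-H / 2).
    Assumes the asymptotic approximation rather than an exact test.
    """
    if h_stat < 0:
        raise ValueError("H must be non-negative")
    return exp(-h_stat / 2.0)


# (H, reported p) pairs for the error correction task.
reported = {
    "pre-test": (0.412, 0.814),
    "post-test": (11.083, 0.004),
    "delayed post-test": (7.863, 0.02),
}
for test_name, (h_stat, reported_p) in reported.items():
    computed = kruskal_p_value_three_groups(h_stat)
    print(f"{test_name}: computed p = {computed:.3f}, reported p = {reported_p}")
```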
On the second task, rewriting sentences in the active/passive voice, the analysis showed
a significant difference between the three treatment groups on the pre-test (H (2) = 11.897,
p = .003), with a mean rank of 15.00 for TGI, 22.73 for DDL, and 15.00 for DCI. While the
previously reported analysis of overall scores for the active/passive voice did not show a
significant pre-test difference between the three treatment groups, this analysis suggests that
the participant groups did not have the same ability level related to this particular task at the
beginning of the instructional treatments. Specifically, the mean scores for the TGI and DCI
groups were lower than the DDL group on the pre-test (Table 10). The TGI and DCI
participants both had a mean score of 0.00 on this task, with the DDL participants scoring
only slightly higher (M = .45). Analysis of the post-test scores for this task also indicated a
significant difference between treatment groups (H (2) = 6.354, p = .042), with a mean rank
of 18.04 for TGI, 22.05 for DDL, and 11.80 for DCI. Analysis of the delayed post-test
scores showed an even greater (and statistically significant) difference between groups
(H (2) = 12.543, p = .002), with a mean rank of 14.27 for TGI, 25.41 for DDL, and
13.00 for DCI. The differences between the DDL group and the other two groups that were
present at the pre-test persisted through the post-tests, with DDL scoring higher on all
three tests.
For the third, register awareness task, the mean results of group scores by test are reported
in Table 11. Analysis of these scores indicated that there was no significant difference
between treatment groups on the pre-test (H (2) = .949, p = .622), on the post-test
(H (2) = .180, p = .949), or on the delayed post-test (H (2) = 3.427, p = .180). Although this
task was designed to distinguish between the corpus-derived instructional treatments and
the traditional instruction as a measure of language performance directly related to the type
of information corpora can provide, it did not provide any meaningful discrimination
between groups.
4 Discussion
The results of the analysis present a complex picture of how these three instructional
treatments differ in terms of learning outcomes. From the analysis of the overall test scores,
DDL led to clear gains in learning from pre-test to post-test that were maintained into the
delayed post-test. The other two groups did not show the same improvement. However, the
more detailed task-specific analysis revealed that within the overall test scores, one task in
particular, the error correction task, was a better indicator of group differences than the other
two tasks, when considered individually.
Given that the two corpus-informed instructional approaches included information about
register differences in the use of passive voice, the fact that the register awareness task did
not indicate any differences between the three groups was somewhat surprising. However,
as the register awareness information was limited to broad differences between speech
and academic writing, this may be a limitation of the practical scope of the instructional
materials as much as of the testing instrument.
On the other hand, the differences that emerged between the three groups on the error
correction task may relate directly to the inductive nature of the DDL instructional
approach. As learners in this treatment group completed detailed, learner-centered analyses
of sentences with passive voice constructions and were responsible for identifying and
explaining this information, this may have played a role in their knowledge of the accurate
form and meaning of sentences that use the passive voice. As the DDL group improved on
this task from pre-test to post-test, and the DCI group did not, it seems reasonable to
conclude that the improvements were in part related to the combination of the inductive
instructional approach along with the use of real language from corpora and not solely the latter.
Based on the teacher’s own reflective note taking during the instructional treatments, the
three groups received the same amount of time on tasks and interacting with language
samples. However, an important difference that emerged during the treatment was that
the participants in the DDL group, due to the nature of the learning tasks, engaged with the
language learning activities and the sample language in ways that the other learners did not.
They discussed the examples with their classmates in an effort to discover patterns in
the language and to solve problems; whereas the students in the other treatment groups
primarily referred to the rules they had available to complete learning tasks. The DDL
learners’ engagement with the material and interest in what many perceived as a novel
approach to grammar instruction may have led to more learning during the course of this
brief instructional intervention.
Finally, this study has provided some evidence that in comparison with traditional
grammar instruction (e.g., PPP) and more conventional instructional materials, inductive
paper-based DDL can be a valuable resource for language teachers. The practical design
of the approach to DDL in this study (i.e., developing materials using online corpora,
consulting pre-existing corpus-based research, and not requiring the use of computers in the
classroom) is an instructional approach that should be accessible in a broad range of
language learning contexts. This approach is certainly not novel in the field of DDL
(cf. Boulton, 2010b), but this small-scale study of teaching one grammar feature using DDL
may provide further support for educators interested in the possible benefits of using this
approach in language classrooms.
While these results provide some limited evidence that this combination of inductive
instruction with corpus-informed teaching materials can lead to better learning outcomes, caution
must be used in generalizing the findings. The study focused primarily on rather concrete
‘rules’ for the grammatical differences between the active and passive voice, taking a
narrow view of what a grammar structure is and also, due to the constraints of using real
grammar classrooms with immediate curricular goals, taking only limited advantage of the
possibilities of using corpus-informed teaching materials for providing more robust insight
into register and use differences between the two voices in English. Further study of this
type of instructional intervention may be better addressed by integrating these instructional
treatments into a teaching approach that incorporates communicative language learning as
the basis for learning about register and uses of the target language feature. Additionally, the
passive voice, as a grammatical construction, may be simpler to learn using DDL than other,
more complex grammar structures (in terms of form, meaning, and use).
5 Conclusion
This study used a guided inductive approach to paper-based data-driven learning that fol-
lowed the four-stage instructional sequence proposed by Flowerdew (2009). Historically,
DDL studies have varied both in instruction methods and in technologies used, limiting both
generalizability and a cohesive framework for applying DDL, but the results of this study
suggest that using a well-structured, guided inductive approach for DDL (especially
in paper-based contexts) can lead to positive learning outcomes, and that there is more to
DDL as an instructional approach than simply using corpora as a teaching tool. Future
research on corpus-informed approaches to grammar instruction that incorporates this or a
similar DDL framework and considers a range of grammatical features may provide a more robust
understanding of how inductive, corpus-informed approaches to grammar instruction can
benefit language learners.
References
Aguado-Jiménez, A., Pérez-Paredes, P. and Sánchez, P. (2012) Exploring the use of multidimensional
analysis of learner language to promote register awareness. System, 40(1): 90–103.
Belz, J. and Vyatkina, N. (2005a) Learner corpus analysis and the development of L2 pragmatic
competence in networked intercultural language study: The case of German modal particles.
Canadian Modern Language Review, 62(1): 17–48.
Belz, J. and Vyatkina, N. (2005b) Computer-mediated learner corpus research and the data-driven
teaching of L2 pragmatic competence: The case of German modal particles. CALPER Working
Papers 4.
Belz, J. and Vyatkina, N. (2008) The pedagogical mediation of a developmental learner corpus for
classroom-based language instruction. Language Learning & Technology, 12(3): 33–52.
Biber, D. (1988) Variation across speech and writing. Cambridge: Cambridge University Press.
Biber, D., Johansson, S., Leech, G., Conrad, S. and Finegan, E. (1999) Longman grammar of spoken
and written English. London: Longman.
Boulton, A. (2008) Looking for empirical evidence of data-driven learning at lower levels. In:
Lewandowska-Tomaszczyk, B. (ed.), Corpus linguistics, computer tools, and applications: State of
the art. Frankfurt: Peter Lang, 581–598.
Boulton, A. (2009a) Testing the limits of data-driven learning: Language proficiency and training.
ReCALL, 21(1): 37–51.
Boulton, A. (2009b) Corpora for all? Learning styles and data-driven learning. In: Mahlberg, M.,
González-Díaz, V. and Smith, C. (eds.), Proceedings of the 5th Corpus Linguistics Conference.
http://ucrel.lancs.ac.uk/publications/cl2009/
Boulton, A. (2010a) Learning outcomes from corpus consultation. In: Moreno Jaén, M., Serrano
Valverde, F. and Calzada Perez, M. (eds.), Exploring new paths in language pedagogy: Lexis and
corpus-based language teaching. London: Equinox, 129–144.
Boulton, A. (2010b) Data-driven learning: Taking the computer out of the equation. Language
Learning, 60(3): 534–572.
Boulton, A. (2012) Empirical research in data-driven learning: A summary. http://bit.ly/BoultonATILF
Carter, R. and McCarthy, M. (1995) Grammar and the spoken language. Applied Linguistics, 16(2):
141–158.
Celce-Murcia, M. and Larsen-Freeman, D. (1999) The grammar book: An ESL/EFL teacher’s course
(2nd ed.). Boston: Thomson/Heinle.
Chambers, A. and O’Sullivan, Í. (2004) Corpus consultation and advanced learners’ writing skills
in French. ReCALL, 16(1): 158–172.
Chang, C. and Kuo, C. (2011) A corpus-based approach to online materials development for writing
research articles. English for Specific Purposes, 30(3): 222–234.
Chang, P. (2012) Using a stance corpus to learn about effective authorial stance-taking: A textlinguistic
approach. ReCALL, 24(2): 209–236.
Chujo, K., Anthony, L. and Oghigian, K. (2009) DDL for the EFL classroom: Effective uses
of a Japanese-English parallel corpus and the development of a learner-friendly, online parallel
concordance. In: Mahlberg, M., González-Díaz, V. and Smith, C. (eds.), Proceedings of the
5th Corpus Linguistics Conference. Liverpool: University of Liverpool.
http://ucrel.lancs.ac.uk/publications/cl2009/
Chujo, K. and Oghigian, K. (2012) DDL for EFL beginners: A report on student gains and views on
paper-based concordancing and the role of L1. In: Thomas, J. and Boulton, A. (eds.), Input, process,
and product: Developments in teaching and language corpora. Brno: Masaryk University Press,
170–183.
Conrad, S. (2000) Will corpus linguistics revolutionize grammar teaching in the 21st century? TESOL
Quarterly, 34(3): 548–560.
Conrad, S. (2005) Corpus linguistics and L2 teaching. In: Hinkel, E. (ed.), Handbook of research in
second language teaching and learning. Mahwah, NJ: Lawrence Erlbaum, 393–409.
Conrad, S. and Biber, D. (2009) Real grammar: A corpus-based approach to instruction. New York:
Pearson Longman.
Conroy, M. (2010) Internet tools for language learning: University students taking control of their
writing. Australasian Journal of Educational Technology, 26(6): 861–882.
Davies, M. (2007) TIME Magazine Corpus: 100 million words, 1920s–2000s.
http://corpus.byu.edu/time/
Davies, M. (2008) The Corpus of Contemporary American English: 450 million words, 1990–present.
http://corpus.byu.edu/coca/
Ellis, R. (2006) Current issues in the teaching of grammar: An SLA perspective. TESOL Quarterly,
40(1): 83–107.
Estling Vannestål, M. and Lindquist, H. (2007) Learning English grammar with a corpus: Experi-
menting with concordancing in a university grammar course. ReCALL, 19(3): 329–350.
Flowerdew, L. (2009) Applying corpus linguistics to pedagogy: A critical evaluation. International
Journal of Corpus Linguistics, 14(3): 393–417.
Flowerdew, L. (2012) Corpora and language education. New York: Palgrave Macmillan.
Gaskell, D. and Cobb, T. (2004) Can learners use concordance feedback for writing errors? System,
32(3): 301–319.
Hadi, Z. and Alibakhshi, G. (2012) On the effectiveness of Corpus Analysis Tool in the use of correct
preposition in Persian into English translation. The Iranian EFL Journal, 8(5): 284–294.
Hadley, G. (2002) An introduction to data-driven learning. RELC Journal, 33(2): 99–124.
Haight, C., Herron, C. and Cole, S. (2007) Approaches on the learning of grammar in the elementary
foreign language college classroom. Foreign Language Annals, 40(2): 288–310.
Hammerly, H. (1975) The deduction/induction controversy. The Modern Language Journal, 59(1):
15–18.
Hanafiyeh, M. and Keshi, A. K. (2013) Corpus-based instruction and thesaurus-based teaching on
Iranian EFL learners’ grammatical knowledge. Journal of Basic and Applied Scientific Research,
3(2): 167–179.
Herron, C. and Tomasello, M. (1992) Acquiring grammatical structures by guided induction. The
French Review, 65(5): 708–718.
Johns, T. (1991) From printout to handout: grammar and vocabulary teaching in the context of
data-driven learning. In: Johns, T. and King, P. (eds.), Classroom concordancing. English Language
Research Journal, 4: 27–45.
Johns, T. and King, P. (eds.) (1991) Classroom concordancing. English Language Research Journal, 4.
Kirschner, P., Sweller, J. and Clark, R. (2006) Why minimal guidance during instruction does not
work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and
inquiry-based teaching. Educational Psychologist, 41(2): 75–86.
Larsen-Freeman, D. (2001) Teaching Grammar. In: Celce-Murcia, M. (ed.), Teaching English as a
second or foreign language (3rd ed.). Boston: Thomson/Heinle, 251–266.
McCarthy, M., McCarten, J. and Sandiford, H. (2005) Touchstone. Cambridge: Cambridge University
Press.
Mukherjee, J. (2006) Corpus linguistics and language pedagogy: The state of the art and beyond.
In: Braun, S., Kohn, K. and Mukherjee, J. (eds.), Corpus technology and language pedagogy:
New resources, new tools, new methods. Frankfurt: Peter Lang, 5–24.
Norris, J. M. and Ortega, L. (2000) Effectiveness of L2 instruction: A research synthesis and quanti-
tative meta-analysis. Language Learning, 50(3): 417–528.
O’Sullivan, Í. and Chambers, A. (2006) Learners’ writing skills in French: Corpus consultation and
learner evaluation. Journal of Second Language Writing, 15(1): 49–68.
Pérez-Paredes, P., Sánchez-Tornel, M. and Alcaraz Calero, J. M. (2012) Learners’ search patterns during
corpus-based focus-on-form activities. International Journal of Corpus Linguistics, 17(4): 483–516.
Pérez-Paredes, P., Sánchez-Tornel, M., Alcaraz Calero, J. and Aguado Jiménez, P. (2011) Tracking
learners’ actual uses of corpora: guided vs non-guided corpus consultation. Computer Assisted
Language Learning, 24(3): 233–253.
Reppen, R., Bunting, J., Diniz, L., Blass, L., Iannuzzi, S. and Savage, A. (2012) Grammar and
beyond (Vols. 1–4). Cambridge: Cambridge University Press.
Römer, U. (2006) Pedagogical applications of corpora: Some reflections on the current scope and a
wish list for future developments. Zeitschrift für Anglistik und Amerikanistik, 54(2): 121–134.
Schmied, J. (2006) Corpus linguistics and grammar learning: Tutor versus learner perspectives.
In: Braun, S., Kohn, K. and Mukherjee, J. (eds.), Corpus technology and language pedagogy: New
resources, new tools, new methods. Frankfurt: Peter Lang, 87–106.
Shaffer, C. (1989) A comparison of inductive and deductive approaches to teaching foreign languages.
The Modern Language Journal, 73(4): 395–403.
Sinclair, J. (2003) Reading concordances. London: Longman.
Sripicharn, P. (2003) Evaluating classroom concordancing: The use of concordance-based materials
by a group of Thai students. Thammasat Review, 8(1): 203–236.
Stevens, V. (1991) Classroom concordancing: Vocabulary materials derived from relevant,
authentic text. English for Specific Purposes, 10(1): 35–46.
Thornbury, S. (2004) Natural grammar: The keywords of English and how they work. Oxford: Oxford
University Press.
Tian, S. (2005a) Data-driven learning. Do learning tasks and proficiency make a difference? Pro-
ceedings of the 9th conference of the Pan-Pacific Association of Applied Linguistics. Tokyo:
Waseda University Media Mix Corp, 360–371.
Tian, S. (2005b) The impact of learning tasks and learner proficiency on the effectiveness of data-
driven learning. Journal of Pan-Pacific Association of Applied Linguistics, 9(2): 263–275.
Whistle, J. (1999) Concordancing with students using an ‘off-the-web’ corpus. ReCALL, 11(2): 74–80.
Yoon, H. (2008) More than a linguistic reference: The influence of corpus technology on L2 academic
writing. Language Learning & Technology, 12(2): 31–48.
Yoon, H. and Hirvela, A. (2004) ESL student attitudes toward corpus use in L2 writing. Journal of Second
Language Writing, 13(4): 257–283.
Appendix A
Data-driven Activity for Passive Voice
Read each of the following sentences. Underline the use of the verb “measured” in each sentence.
Then, with a partner, circle the subject of each sentence and answer the questions below.
1. Richardson and his colleagues measured the expansion rate of the plume.
2. The same researchers measured the immune response of human subjects to soybeans
using a skin-prick test – an evaluation used often by allergy doctors.
3. During a 1958 flood, for example, sediment levels in the river were measured at
35 pounds per cubic foot, and an observer described its surface as “wrinkled.”
4. The height values were measured as recommended by the CDC.
5. Kashlinsky's team measured cluster motions relative to the cosmic microwave
background.
6. Fifty years ago franchise fees and costs were measured in the hundreds of thousands
of dollars.
What subjects did you circle for sentences 3, 4, and 6?
Look back carefully at the verbs in each sentence. The verb measured appears differently in
sentences 3, 4, and 6. How so?
When the subject of the sentence is doing the measuring, which verb is used?
When the subject isn’t doing the measuring, which verb do we use?
Appendix B
Post-test Active/Passive Voice
Part 1. Read the following sentences. If a sentence is grammatically correct, write “correct”
in the space provided. If the sentence is not correct, revise the sentence in the space
provided. The first two are done for you.
00. In Los Angeles it often seems as though screenplays are being writing by everyone who
can put a noun and a verb together.
In Los Angeles it often seems as though screenplays are being written by everyone who
can put a noun and a verb together.
2. Last week 3.3 million Americans learned within the past year that their names had been
used to open fraudulent bank or credit-card accounts.
___________________________________________________________________
___________________________________________________________________
3. Nerve gas was using on enemy troops attempting a counterattack on the U.S. forces.
___________________________________________________________________
___________________________________________________________________
Part 2. The following sentences are all correct. However, they need to be rewritten
to make them more appropriate for speech/writing. Rewrite the passive voice verbs to active
voice verbs and the active verbs to passive. The first one has been done for you.
Part 3. Determine which version of the sentence sounds better in speech or writing. Circle
your answer.