Exploring AI-Generated English Relative Clauses in Comparison to Human Production
Abstract
Human behavioral studies have consistently indicated a
preference for subject-extracted relative clauses (SRCs) over
object-extracted relative clauses (ORCs) in sentence production
and comprehension. Some studies have further shown that this
preference can be influenced by the semantic properties of head
nouns, particularly animacy. In this study, we use AI language
models, specifically GPT-2 and ChatGPT 3.5, to simulate human
sentence generation. Our primary goal is to evaluate the extent
to which these language models replicate human behaviors in
sentence production and to identify any divergences. We tasked the
models with completing sentence fragments structured as ‘the’
followed by a head noun and ‘that’ (The reporter that …). We
varied the semantic properties of the head nouns such that they were all
animate (the secretary that …) in Study 1 and either animate
or inanimate (the musician/book that …) in Study 2. Our findings
reveal that in Study 1, both GPT models exhibited a robust SRC
bias, replicating human-like behavior in relative clause production.
However, in Study 2, we observed divergent behavior between the
models when head nouns were inanimate, while consistency was
maintained when head nouns were animate. Specifically, ChatGPT
3.5 generated more ORCs than SRCs in the presence of inanimate
head nouns. These results, particularly those from ChatGPT 3.5,
1. Introduction
Grave, Linzen, & Baroni, 2018; Linzen, Dupoux, & Goldberg, 2016),
grammaticality judgment with garden path constructions (Frank & Hoeks,
2019; Futrell & Levy, 2019; van Schijndel & Linzen, 2018), negative
polarity items licensing (Futrell, Wilcox, Morita, & Levy, 2018; Shin, Yi,
& Song, 2023), filler-gap grammaticality judgments in center embedding
and syntactic islands involving long-distance dependencies (Wilcox, Levy,
Morita, & Futrell, 2018; 2019), and discourse expectations (Yi, Cho, &
Song, 2022). Overall, pre-ChatGPT neural language models have revealed
considerable syntactic sensitivity to grammaticality (Warstadt & Bowman,
2020), but these language models have struggled with making commonsense
inferences and role-based event prediction (Ettinger, 2020). Since the
launch of ChatGPT, language models have seen dramatic improvements
in semantic, pragmatic, and syntactic knowledge. However, few studies
have attempted to simulate the cognitive aspects that underlie how humans
use their linguistic knowledge during the process of comprehension and
production (Cai, Haslett, Duan, Wang, & Pickering, 2023). One potential
approach to address this gap is by simulating psycholinguistic findings.
The processing of relative clauses is a topic of significant psycholinguistic
interest. When English-speaking humans read sentences like (1a-b), they
typically read clauses like (1a), where the subject is extracted to the head
position of the relative clause (i.e., SRC), faster or with less difficulty
than clauses like (1b), where the object is extracted (i.e., ORC). The
phenomenon, known as SRC advantages in comprehension, is a well-
established psycholinguistic observation widely replicated across different
languages. In addition to comprehension studies, corpus studies have
consistently found that subject-extracted relatives are produced more
frequently than their corresponding object-extracted counterparts in large-
scale corpora (Levy, 2008; Reali & Christiansen, 2007; Roland, Dick, &
Elman, 2007). Computational linguistic theories have proposed probabilistic
models, such as the surprisal model, which mathematically demonstrate
2. Previous studies
                            British    Brown    Switchboard  Wall Street    British
                            National                         Journal        National
                            Corpus                           (Treebank 2)   Corpus (Spoken)
  Subject relative          14,182      9,851    15,024       9,548          18,229
  Object relative            2,943      3,863     1,976       5,616           1,802
  Object relative (reduced)  5,455     14,423     4,746       5,314           3,385
nouns were animate but decreased to 31% when they were inanimate. This
semantic sensitivity of the SRC-ORC frequency distribution, observed in
corpus counts, has also been found in laboratory-based studies of human
production. Gennari and MacDonald (2008) manipulated
the animacy of head nouns, using nouns that were animate as in (3a) and
inanimate as in (3b). In a completion task, English native speakers were
asked to continue sentence fragments like (3a-b). The results revealed that
when prompted with animate head nouns, there was a strong bias for SRCs,
with an 85% preference for SRCs over 15% for ORCs. Conversely, when
prompted with inanimate head nouns, the strong bias for SRCs weakened,
resulting in a 65% preference for SRCs versus 35% for ORCs.
3. Study 1
3.1 Method
GPT-2
GPT-2, featuring 1.5 billion parameters, underwent pretraining on
an extensive dataset comprising web pages, books, and various written
materials. During pretraining, the model learned the language by
predicting the next word in a given context sequence (Shrivastava, Pupale,
& Singh, 2021). Following this phase, GPT-2 underwent fine-tuning for
a wide range of downstream tasks, such as text classification, sentiment
analysis, and question-answering (Schneider, de Souza, Gumiel, Moro, &
Paraiso, 2021).
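As a minimal illustration of this next-word-prediction objective (our own sketch, not part of the original study; it assumes the Hugging Face transformers package), one can query GPT-2 for its most probable continuations of a sentence fragment:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the pretrained model and tokenizer.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2-large")
model = GPT2LMHeadModel.from_pretrained("gpt2-large")
model.eval()

# Encode a sentence fragment and get the distribution over the next token.
inputs = tokenizer("The reporter that", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)
probs = torch.softmax(logits[0, -1], dim=-1)

# Show the five most probable next tokens.
top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode([idx.item()])!r}: {p.item():.3f}")
```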
ChatGPT 3.5
ChatGPT, powered by GPT 3.5, employs a stack of 13 Transformer
blocks, each featuring 12 attention heads and 768 hidden units. This model
was pretrained on a vast corpus of text data, including books, articles,
and websites, through a language modeling task (Abdullah, Madain, &
Jararweh, 2022). This pretraining enables ChatGPT to learn connections
between words and phrases in natural language, resulting in coherent
responses during conversations. Notably, compared to GPT-2, ChatGPT
demonstrates significantly improved generation capabilities. The texts it
generates are contextually accurate, grammatically precise, and logically
coherent. Many have attested to the fluency of ChatGPT, finding it valuable
across diverse applications like content writing, summarization, machine
translation, and text rewriting (Sallam, 2023).
3.1.2 Materials
A total of 76 stimuli were sampled from Roland et al. (2012), Gordon
et al. (2004), and Reali and Christiansen (2007), all of which are
psycholinguistic studies that documented faster reading times for subject
relatives compared to object relatives. To generate relative clauses by neural
language models (i.e., GPT-2 and ChatGPT 3.5), we prepared incomplete
sentence prompts, as indicated in Example (4), which have been extracted
from the 76 original stimuli. All of the head NPs used in the present study
were animate. Appendix A displays the stimuli that we used.
3.1.3 Procedure
The incomplete sentence fragments were entered into neural language
models, GPT-2 and ChatGPT 3.5. First, we used the large version of GPT-2
and asked the model to complete the incomplete sentence fragments across
six different temperature settings: 0.1, 0.3, 0.5, 0.7, 0.9, and 1.0. At each
temperature level, we asked the model to generate ten sentence samples
for each stimulus. In total, we ended up with 4,500 sentence samples
from six different temperature levels. Second, we used the free version of
ChatGPT 3.5, accessible via the OpenAI website (https://chat.openai.com/).
To obtain sentence samples, we formulated our request within the prompt
box, as follows: “Would you please generate 10 English sentences starting
with The secretary that … ?”. To mitigate potential implicit priming effects
arising from previously produced samples that the model might hold in
its working memory space during generation (Cai et al., 2023), we had the
ChatGPT 3.5 model generate ten sentence samples for each stimulus and
repeated the process 15 times in distinct trials. This resulted in a total of
11,250 sentence samples.
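A scripted equivalent of the GPT-2 side of this procedure might look like the sketch below (a minimal reconstruction under stated assumptions, not the authors' code; the prompt list is a placeholder for the 76 stimuli):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-large")
model = GPT2LMHeadModel.from_pretrained("gpt2-large")
model.eval()

TEMPERATURES = [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]
prompts = ["The secretary that", "The reporter that"]  # placeholder stimuli

completions = []
for temp in TEMPERATURES:
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt")
        # Ten sampled completions per stimulus at each temperature level.
        outputs = model.generate(
            **inputs,
            do_sample=True,
            temperature=temp,
            max_new_tokens=20,
            num_return_sequences=10,
            pad_token_id=tokenizer.eos_token_id,
        )
        for seq in outputs:
            completions.append(
                (temp, prompt, tokenizer.decode(seq, skip_special_tokens=True))
            )
```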
1 To compute the relative frequency of ORCs, we replaced SRC occurrences with ORC occurrences.
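The surviving text does not describe how completions were coded as SRCs or ORCs. One plausible heuristic, consistent with the reasoning in Section 5, is to inspect the part of speech of the first word following that (verb-like → SRC, noun-like → ORC); the sketch below is a hypothetical implementation using spaCy, not the authors' actual procedure:

```python
import spacy

# Requires: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def classify_rc(sentence: str) -> str:
    """Heuristically label a completion as SRC or ORC from the word after 'that'."""
    doc = nlp(sentence)
    for i, tok in enumerate(doc[:-1]):
        if tok.lower_ == "that":
            nxt = doc[i + 1]
            if nxt.pos_ in ("VERB", "AUX"):
                return "SRC"  # e.g., "The secretary that attacked the senator ..."
            if nxt.pos_ in ("NOUN", "PROPN", "PRON", "DET"):
                return "ORC"  # e.g., "The book that the monks read ..."
    return "OTHER"

def src_relative_frequency(labels) -> float:
    """Relative frequency of SRCs among classified completions."""
    classified = [l for l in labels if l != "OTHER"]
    return sum(l == "SRC" for l in classified) / len(classified) if classified else 0.0
```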
3.3 Results
Figure 1a-d illustrate the means and standard errors of the relative
frequencies of both subject-extracted relative clauses and object-extracted
relative clauses generated by both GPT-2 and ChatGPT 3.5 using the stimuli
taken from the three previous studies. A simple visual examination clearly
reveals that subject-extracted relative clauses greatly outnumber object-
extracted relative clauses across all studies, regardless of the GPT versions.
Figure 1. Frequency distributions of RC types generated by both GPT-2 and ChatGPT 3.5 using head nouns extracted from the three previous studies. The extent of SRC biases for each study is depicted in (d). Error bars represent 95% Confidence Intervals.
Table 2 presents the results of paired t-tests, confirming that the neural
language models generated subject-extracted relative clauses significantly
more frequently than object-extracted relative clauses. The significant
differences reported in Table 2 indicate that neural large language models
succeed in simulating human producers, similar to Roland et al.'s (2007)
corpus counts. As shown in Figure 1d, the SRC bias was consistently present
in all studies, irrespective of the GPT versions. These results supported our
hypothesis, raised in Research Question 1, suggesting that AI-generated
corpora approximate human-generated corpora. Additionally, our results
indicated that the SRC bias was consistently much stronger when using
ChatGPT 3.5 models compared to GPT-2 models.
Table 2. Paired t-test results comparing relative clause frequencies across studies

                                     GPT-2                        ChatGPT 3.5
                        RC type      M (SD)     t-test            M (SD)     t-test
  Roland et al. (2012)  ORCs         .31 (.24)  t(23) = -3.63,    .10 (.21)  t(23) = -9.54,
                        SRCs         .66 (.25)  p = .001          .90 (.21)  p = .000
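For reference, paired comparisons of this kind can be computed with a standard paired t-test; a minimal sketch using scipy (the values shown are illustrative placeholders, not the study's data):

```python
from scipy import stats

# Per-stimulus relative frequencies (illustrative placeholders; the study used
# 24 item pairs from Roland et al., 2012, hence t(23)).
orc_rel_freq = [0.2, 0.4, 0.1, 0.3]  # relative frequency of ORCs per stimulus
src_rel_freq = [0.8, 0.6, 0.9, 0.7]  # relative frequency of SRCs per stimulus

t_stat, p_value = stats.ttest_rel(orc_rel_freq, src_rel_freq)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```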
4. Study 2
GPT models.
4.1 Method
4.1.2 Materials
A total of 56 stimuli were taken from Gennari and MacDonald (2008),
who conducted a sentence completion task with English native speakers.
Among these stimuli, 28 had animate head nouns, as in (6a), whereas the
remaining 28 had inanimate head nouns, as in (6b). Recall that Gennari
and MacDonald (2008) observed a strong bias for subject relatives when
head nouns were animate (i.e., 85% vs. 15%), but this bias weakened when
head NPs were inanimate (i.e., 65% vs. 35%). As in Study 1, we prepared
incomplete sentence prompts, as in (6a-b), using an NP plus that, for the
neural large language models to generate relative clauses. The stimuli used
in this study are presented in Appendix A.
4.2 Results
Figure 2a-b illustrate the means and standard errors of the relative
Figure 2. Frequency distributions of the types of relative clauses for animate head NPs (a) and inanimate head NPs (b), generated by both GPT-2 and ChatGPT 3.5 based on the stimuli extracted from Gennari and MacDonald (2008). Error bars represent 95% Confidence Intervals.
Figure 3. The extent of SRC biases when head NPs are animate (a) and inanimate (b). Error bars represent 95% Confidence Intervals.
Recall that a higher SRC bias indicates a stronger preference for SRCs
over ORCs. Figure 3 illustrates the extent to which the SRC bias is
modulated by the animacy of head nouns and the GPT version. It shows
that the SRC bias dropped below zero only for ChatGPT 3.5 when
inanimate head nouns were used. Statistical tests revealed no significant
difference in SRC bias between the GPT versions when head nouns were
animate (t(27) = .32, p = .754), but a significant difference emerged when
head nouns were inanimate (t(26) = 5.01, p = .000). These results support
our hypothesis, as presented in Research Question 2, suggesting that
AI-generated corpora approximate human-generated corpora in terms of
semantics. However, the successful simulations were observed only with
ChatGPT 3.5. Additionally, our results show that the shift in bias toward
object relatives with inanimate head nouns was much more dramatic for
ChatGPT 3.5 than for human speakers.
Table 3. Results of paired t-tests between RC relative frequencies across the NP types

                                     GPT-2                        ChatGPT 3.5
                        RC type      M (SD)     t-test            M (SD)     t-test
  Inanimate head NPs    ORCs         .21 (.21)  t(27) = -7.15,    .56 (.35)  t(26) = .88,
                        SRCs         .78 (.22)  p = .000          .44 (.35)  p = .386†

Note. † The degrees of freedom (df) for this comparison are one less than for the others. For this comparison, we ended up with 26 pairs because ChatGPT 3.5 declined to process one stimulus (The grenade that ~) on ethical grounds.
5. General discussion
generated more frequently than object relatives. However, when head nouns
were inanimate, ChatGPT 3.5 generated numerically more object relatives
than subject relatives, while GPT-2 did not exhibit this shift in preference.
These results suggest that when provided with sentence fragments
beginning with ‘the + inanimate noun + that,’ ChatGPT’s computations
estimate that nouns, rather than verbs, are more probable in the following
position (i.e., the first word within relative clauses). Conversely, in cases
where the sentence fragments began with ‘the + animate noun + that,’ both
language models showed a preference for verbs over nouns in the next
position. Notably, these ChatGPT outputs closely replicate the frequency
distributions of relative clauses observed in human-generated corpora.
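This claim about next-word estimates can be probed directly in GPT-2 by comparing the probability mass the model assigns to SRC-initiating (verb-like) versus ORC-initiating (noun- or determiner-like) continuations; a minimal sketch, with illustrative word lists of our own choosing:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def next_word_prob(prompt: str, word: str) -> float:
    """Probability of `word` as the next token after `prompt` (first-token approximation)."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]
    probs = torch.softmax(logits, dim=-1)
    # Leading space so the word is tokenized as a word-initial token.
    token_ids = tokenizer.encode(" " + word)
    return probs[token_ids[0]].item()

for prompt in ("The book that", "The lady that"):
    # Verb-like continuations start SRCs; determiner/pronoun continuations start ORCs.
    src_mass = sum(next_word_prob(prompt, w) for w in ["was", "is", "had", "made"])
    orc_mass = sum(next_word_prob(prompt, w) for w in ["the", "she", "he", "they"])
    print(prompt, f"SRC-initiating: {src_mass:.3f}", f"ORC-initiating: {orc_mass:.3f}")
```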
We have considered several accounts for the bias observed in
the language models. As discussed previously, the models may depend on
exemplar-based learning, where specific items, such as The book that ~, are
associated with a higher likelihood of continuing with nouns, while others,
like The lady that ~, favor verbs. It is also possible that ChatGPT 3.5 has
developed abstract semantic knowledge (e.g., [+/- animate]) during pretraining
and applies this knowledge during sentence production. For example,
inanimate NPs like the book or the wine can be associated with a patient
role occurring at an object position (Kako, 2006), leading to a preference
for object-extracted relatives. On the contrary, animate NPs like the lady
may be linked to an agent role occurring at a subject position (Kako, 2006),
resulting in a preference for subject-extracted relatives. Importantly, the
performance of ChatGPT in Study 2 cannot be accounted for only by the
syntactic approach.
As a final note, close examination of ChatGPT’s outputs revealed that
not all inanimate head nouns generated a bias for object relatives. For
example, inanimate head nouns like the accident that or the incident that
continued more frequently with subject relatives than object relatives.
The presence of the [-animate] feature alone may not be sufficient to
Our present study has a couple of limitations.2 First, given that ChatGPT
2 We appreciate the anonymous reviewers' comments on this matter.
6. Conclusion
Funding This research was supported by the 2023 scientific promotion program funded
by Jeju National University.
Declarations
Ethics Approval This study is exempt from ethics approval as it did not involve human
participants in obtaining results.
Conflict of Interest The authors declare that they have no competing interests.
References
Ettinger, A., Elgohary, A., & Resnik, P. (2016). Probing for semantic evidence of
composition by means of simple classification tasks. Proceedings of the 1st
Workshop on Evaluating Vector-space Representations for NLP, 134–139.
Filippova, K., Alfonseca, E., Colmenares, C. A., Kaiser, L., & Vinyals, O. (2015).
Sentence compression by deletion with LSTMs. Proceedings of the 2015
Conference on Empirical Methods in Natural Language Processing, 360–
368.
Ford, M. (1983). A method for obtaining measures of local parsing complexity
throughout sentences. Journal of Verbal Learning and Verbal Behavior, 22,
203–218.
Fox, B. A., & Thompson, S. A. (1990). A Discourse Explanation of the Grammar
of Relative Clauses in English Conversation. Language, 66, 297-316.
Frank, S. L., & Hoeks, J. (2019). The Interaction Between Structure and Meaning
in Sentence Comprehension: Recurrent Neural Networks and Reading
Times. Proceedings of the 2019 Cognitive Science Society, 337-343.
Futrell, R., Wilcox, E., Morita, T., & Levy, R. (2018). RNNs as psycholinguistic
subjects: Syntactic state and grammatical dependency. arXiv:1809.01329
Futrell, R., & Levy, R. (2019). Do RNNs learn human-like abstract word
order preferences? Proceedings of the 2019 Society for Computation in
Linguistics (SCiL), 50–59.
Gennari, S. P., & MacDonald, M. C. (2008). Semantic indeterminacy in object
relative clauses. Journal of Memory and Language, 58 (2), 161-187.
Gibson, E., Bergen, L., & Piantadosi, S. T. (2013). Rational integration of noisy
evidence and prior semantic expectations in sentence interpretation.
Proceedings of the National Academy of Sciences, 110(20), 8051-8056.
Gordon, P., Hendrick, R., & Johnson, M. (2001). Memory interference during
language processing. Journal of Experimental Psychology: Learning,
Memory, and Cognition, 27(6), 1411–1423.
Grodner, D., & Gibson, E. (2005). Consequences of the Serial Nature of
Linguistic Input for Sentential Complexity. Cognitive Science, 29 (2), 261-
290.
Gulordava, K., Bojanowski, P., Grave, E., Linzen, T., & Baroni, M. (2018).
Colorless green recurrent networks dream hierarchically. Proceedings of
the 2018 Conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, 1, 1195–1205.
Mikolov, T., Karafiát, M., Burget, L., Černocký, J. H., & Khudanpur, S. (2010).
Recurrent neural network based language model. Interspeech, 1045-1048.
Mitchell, D. C., Cuetos, F., & Corley, M. M. B. (1995). Exposure-based models
of human parsing: Evidence for the use of coarse-grained (nonlexical)
statistical records. Journal of Psycholinguistic Research, 24, 469–488.
OpenAI. (2023). ChatGPT (Mar 14 version) [Large language model]. https://chat.
openai.com/chat
Pickering, M. J., & Branigan, H. P. (1998). The representation of verbs: Evidence
from syntactic priming in language production. Journal of Memory and
Language, 39(4), 633–651.
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019).
Language models are unsupervised multitask learners. OpenAI tech report.
Reali, F., & Christiansen, M. H. (2007). Processing of relative clauses is made
easier by frequency of occurrence. Journal of Memory and Language, 57
(1), 1-23.
Roland, D., Mauner, G., & Hirose, Y. (2021). The processing of pronominal
relative clauses: Evidence from eye movements. Journal of Memory and
Language, 119, 104244.
Roland, D., Dick, F., & Elman, J. L. (2007). Frequency of basic English
grammatical structures: A corpus analysis. Journal of Memory and Language,
57(3), 348-379.
Roland, D., Mauner, G., O’Meara, C., & Yun, H. (2012). Discourse expectations
and relative clause processing. Journal of Memory and Language, 66 (3),
479-508.
Rush, A. M., Chopra, S., & Weston, J. (2015). A Neural Attention Model for
Abstractive Sentence Summarization. Proceedings of the 2015 Conference
on Empirical Methods in Natural Language Processing, 379–389.
Sallam, M. (2023). ChatGPT utility in health care education, research, and
practice: systematic review on the promising perspectives and valid
concerns. Healthcare, 11 (6), 887, 1-20.
Schneider, E. T. R., de Souza, J. V. A., Gumiel, Y. B., Moro, C., & Paraiso, E.
C. (2021). A GPT-2 language model for biomedical texts in Portuguese,
Proceedings of the 2021 IEEE 34th International Symposium on Computer-
Based Medical Systems (CBMS), 474–479.
Schwenk, J., Harmel, N., Brechet, A., Zolles, G., Berkefeld, H., Müller, C. S.,
Bildl, W., Baehrens, D., Hüber, B., Kulik, A., Klöcker, N., Schulte, U., &
Fakler, B. (2012). High-resolution proteomics unravel architecture and
molecular diversity of native AMPA receptor complexes. Neuron, 74(4),
621–633.
Shin, U., Yi, E., & Song, S. (2023). Investigating a neural language model’s
replicability of psycholinguistic experiments: A case study of NPI licensing.
Frontiers in Psychology, 14, 937656.
Shrivastava, A., Pupale, R., & Singh, P. (2021). Enhancing aggression detection
using GPT-2-based data balancing technique, Proceedings of the 2021 5th
International Conference on Intelligent Computing and Control Systems
(ICICCS), 1345–1350.
Traxler, M. J., Morris, R. K., & Seely, R. E. (2002). Processing subject and object
relative clauses: Evidence from eye movements. Journal of Memory and
Language, 47(1), 69–90.
Thorp, H. H. (2023). ChatGPT is fun, but not an author. Science, 379 (6630), 313.
Van Schijndel, M., & Linzen, T. (2018). Modeling garden path effects without
explicit hierarchical syntax. Proceedings of the 40th Annual Conference of
the Cognitive Science Society, 2600–2605.
Wang, F. Y., Miao, Q., Li, X., Wang, X., & Lin, Y. (2023). What does ChatGPT
say: the DAO from algorithmic intelligence to linguistic intelligence, IEEE/
CAA Journal of Automatica Sinica, 10 (3), 575–579.
Warstadt, A., & Bowman, S. R. (2020). Can neural networks acquire a structural
bias from raw linguistic data? arXiv:2007.06761.
Wilcox, E., Levy, R., Morita, T., & Futrell, R. (2018). What do RNN language
models learn about filler–gap dependencies? Proceedings of the 2018
EMNLP workshop BlackboxNLP: Analyzing and interpreting neural
networks for NLP, 211–221.
Wu, C., Yin, S., Qi, W., Wang, X., Tang, Z., & Duan, N. (2023). Visual
ChatGPT: Talking, Drawing and Editing with Visual Foundation Models.
arXiv:2303.04671.
Yi, E., Cho, H., & Song, S. (2022). An experimental investigation of discourse
expectations in neural language models. Korean Journal of English
Language and Linguistics, 22, 1101-1115.
Yun, H., & Yi, E. (2019). The role of frequency in the processing of giving and
receiving events. Language Research, 55(2), 253-279.