Cerebral lateralisation of first and second languages in bilinguals assessed using functional transcranial Doppler ultrasound

Article  in  Wellcome Open Research · July 2021


DOI: 10.12688/wellcomeopenres.9869.2



Wellcome Open Research 2021, 1:15 Last updated: 28 JUL 2021

RESEARCH ARTICLE

Cerebral lateralisation of first and second languages in bilinguals assessed using functional transcranial Doppler ultrasound [version 2; peer review: 1 approved, 1 approved with reservations, 1 not approved]
Dorothy V. M. Bishop1, Clara R. Grabitz1, Sophie C. Harte1,2, Kate E. Watkins1, Miho Sasaki2,3, Eva Gutierrez-Sigut2,4, Mairéad MacSweeney2,5, Zoe V. J. Woodhead1, Heather Payne2,5
1Department of Experimental Psychology, University of Oxford, Oxford, UK
2Deafness, Cognition, Language Research Centre, UCL, London, UK
3Faculty of Business and Commerce, Keio University, Tokyo, Japan
4Department of Psychology, University of Essex, Colchester, UK
5Institute of Cognitive Neuroscience, UCL, London, UK

First published: 15 Nov 2016, 1:15 https://doi.org/10.12688/wellcomeopenres.9869.1
Latest published: 28 Jul 2021, 1:15 https://doi.org/10.12688/wellcomeopenres.9869.2

Abstract
Background: Lateralised language processing is a well-established finding in monolinguals. In bilinguals, studies using fMRI have typically found substantial regional overlap between the two languages, though results may be influenced by factors such as proficiency, age of acquisition and exposure to the second language. Few studies have focused specifically on individual differences in brain lateralisation, and those that have suggested reduced lateralisation may characterise representation of the second language (L2) in some bilingual individuals.
Methods: In Study 1, we used functional transcranial Doppler sonography (FTCD) to measure cerebral lateralisation in both languages in high proficiency bilinguals who varied in age of acquisition (AoA) of L2. They had German (N = 14) or French (N = 10) as their first language (L1) and English as their second language. FTCD was used to measure task-dependent blood flow velocity changes in the left and right middle cerebral arteries during phonological word generation cued by single letters. Language history measures and handedness were assessed through self-report. Study 2 followed a similar format with 25 Japanese (L1)/English (L2) bilinguals, with proficiency in their second language ranging from basic to advanced, using phonological and semantic word generation tasks with overt speech production.
Results: In Study 1, participants were significantly left lateralised for both L1 and L2, with a high correlation (r = .70) in the size of laterality indices for L1 and L2. In Study 2, again there was good agreement between LIs for the two languages (r = .77 for both word generation tasks). There was no evidence in either study of an effect of age of acquisition, though the sample sizes were too small to detect any but large effects.
Conclusion: In proficient bilinguals, there is strong concordance for cerebral lateralisation of first and second language as assessed by a verbal fluency task.

Open Peer Review
Invited Reviewers: 1, 2, 3
Version 2 (update): 28 Jul 2021
Version 1: 15 Nov 2016 (report, report, report)
1. Marc Brysbaert, Ghent University, Ghent, Belgium
2. David W. Green, University College London, London, UK; Tom Hope, University College London, London, UK
3. Andreas Jansen, University of Marburg, Marburg, Germany; Verena Schuster, University of Marburg, Marburg, Germany
Any reports and responses or comments on the article can be found at the end of the article.

Keywords
Laterality, Bilingualism, FTCD

Corresponding author: Dorothy V. M. Bishop ([email protected])


Author roles: Bishop DVM: Conceptualization, Formal Analysis, Methodology, Supervision, Writing – Original Draft Preparation, Writing
– Review & Editing; Grabitz CR: Conceptualization, Formal Analysis, Investigation, Writing – Original Draft Preparation; Harte SC: Formal
Analysis, Writing – Original Draft Preparation, Writing – Review & Editing; Watkins KE: Conceptualization, Methodology, Supervision,
Writing – Original Draft Preparation, Writing – Review & Editing; Sasaki M: Conceptualization, Data Curation, Writing – Original Draft
Preparation, Writing – Review & Editing; Gutierrez-Sigut E: Conceptualization, Formal Analysis, Writing – Original Draft Preparation,
Writing – Review & Editing; MacSweeney M: Conceptualization, Writing – Original Draft Preparation, Writing – Review & Editing;
Woodhead ZVJ: Formal Analysis, Writing – Original Draft Preparation, Writing – Review & Editing; Payne H: Conceptualization, Data
Curation, Formal Analysis, Writing – Original Draft Preparation, Writing – Review & Editing
Competing interests: No competing interests were disclosed.
Grant information: Study 1 was supported by a Wellcome Trust programme grant [082498] and European Research Council Advanced
Grant [694189]. Study 2 was supported by funding for the ESRC Deafness Cognition and Language Research Centre (DCAL) [grant
number RES-620-28-0002] and MEXT Grants-in-Aid for Scientific Research [grant number JP24720187] for MS. MM was supported by a
Wellcome Trust Senior Research Fellowship [grant number WT100229MA].
Copyright: © 2021 Bishop DVM et al. This is an open access article distributed under the terms of the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to cite this article: Bishop DVM, Grabitz CR, Harte SC et al. Cerebral lateralisation of first and second languages in bilinguals assessed using functional transcranial Doppler ultrasound [version 2; peer review: 1 approved, 1 approved with reservations, 1 not approved] Wellcome Open Research 2021, 1:15 https://doi.org/10.12688/wellcomeopenres.9869.2
First published: 15 Nov 2016, 1:15 https://doi.org/10.12688/wellcomeopenres.9869.1

 

REVISED Amendments from Version 1

In this revised paper, the original experiment forms Study 1, and new data from Japanese-English bilinguals, again using functional transcranial Doppler ultrasound, form Study 2. The second study was conducted by colleagues (now co-authors) from University College London and University of Essex. Clara Grabitz, who did Study 1 as a research project, has given permission for a change of corresponding author to Dorothy Bishop.

Addition of Study 2 allows us to confirm that there is high similarity between laterality indices for the two languages in proficient bilinguals across two different groups of bilinguals, and in one of these groups, across two word generation tasks. We make the case that the similarity is meaningful and not just a consequence of low power, as indicated by Bayes Factors for the comparison of means. Furthermore, although FTCD is not suitable for studying localisation of language within a hemisphere, we present evidence that it gives a reliable and sensitive measure of extent of lateralisation that allows us to go beyond simple classification of language laterality as left or right, via a direct comparison of blood flow in the middle cerebral arteries.

We analysed data from both studies using our most recent analysis scripts; this ensures reproducibility as well as consistency with our other recent studies, and allows for identical analytic steps to be used for Studies 1 and 2. This change in approach (basing the measure of laterality index on mean difference over an interval rather than a region around a peak, and using prespecified criteria for removing outliers with noisy data) has little impact on the results, other than to improve reliability of one subset of data in Study 1.

Any further responses from the reviewers can be found at the end of the article

Introduction
The two cerebral hemispheres of the brain are neither structurally nor functionally identical. Hemispheric specialisation reflects a variety of factors influencing the brain, including genetics, development, experience and pathology. Language ability is particularly striking in this regard, since, at least in monolinguals, it is predominantly left lateralised in most people (Knecht et al., 1998a). The representation of language in the bilingual brain has been a topic of controversy. On the one hand, differential recovery patterns for individual languages in stroke patients point towards separate neural representations (Paradis, 2004), yet on the other hand, neuroimaging of healthy individuals has mostly reported the involvement of overlapping cortical areas in the left hemisphere for first (L1) and second (L2) languages (Abutalebi et al., 2005; Perani & Abutalebi, 2005; Sulpizio et al., 2020).

The picture is complicated by the complex nature of bilingualism, with individuals varying in age of acquisition (AoA), proficiency, exposure to the different languages, and number of languages spoken. A recent review of brain structure and connectivity concluded that brain organisation was influenced by duration and extent of language use, and their combined effects (DeLuca et al., 2019). In functional imaging, differential activation for L2 vs L1 has been reported for late acquisition or low proficiency groups, though results have not always been consistent across studies, and the impact of these individual differences appears to be task dependent (Kim et al., 1997; Klein, 2003; Klein et al., 1995; Wartenburger et al., 2003). More generally, studies on this topic tend to have relatively small sample sizes and hence low power to detect any but large effects.

A range of methods has been used to assess anatomical and functional differences between cerebral hemispheres, depending on experimental aims as well as task constraints. Here our focus is on functional lateralisation, and the possibility that in bilingualism there may be a differential contribution from the right hemisphere for the two languages. This was suggested by a meta-analysis of behavioural studies by Hull & Vaid (2007), incorporating studies using dichotic listening, visual preference, and dual task methods; surprisingly, they found that proficient bilinguals who learned L2 in infancy had more bilateral language representation of L2 than those who acquired L2 after 5 years of age. Few fMRI studies have focussed on language lateralisation in bilinguals. An fMRI study of 16 bilingual people with epilepsy found excellent agreement between laterality indices for L1 and L2 on verb production tasks (Centeno et al., 2014). In contrast, Dehaene et al. (1997) found, consistent with other studies, that when listening to L1, there was consistency between participants in the locus of activation in the left hemisphere, but when listening to L2, there was substantial variability from person to person, not just within a hemisphere, but also in terms of which hemisphere was most activated. A recent study of basic and advanced L2 learners by Gurunandan et al. (2020) reported that, whereas language production tended to be left-lateralised in both languages, in receptive tasks, the two languages tended to lateralise to opposite hemispheres, with this effect increasing with language proficiency. For language production, the size of the laterality index was similar for L1 and L2, regardless of proficiency. Taking these findings on language laterality together, we predict that on production tasks, we should find equivalent lateralisation for L1 and L2 in moderate-to-high proficiency bilinguals. Although there is suggestive evidence that laterality indices might show some dissociation between L1 and L2 in bilinguals, this tends to be seen on receptive tasks, and it is hard to know if such dissociations are reliable, as test-retest reliability of the laterality index is usually unknown.

Here we report two studies using functional transcranial Doppler ultrasonography (FTCD) to test the hypothesis that cerebral lateralisation is equivalent for first and second languages in proficient bilinguals. This method uses ultrasound to measure cerebral blood flow velocity (CBFV) in the left and right hemispheres. The change in CBFV reflects the task-dependent contribution of each hemisphere due to neurometabolic coupling, i.e. brain areas showing task-dependent neuronal firing need to replenish metabolic resources, requiring increased blood flow (Aaslid et al., 1982; Deppe et al., 2004). In order to assess language lateralisation, CBFV is measured in the middle cerebral artery (MCA), which supplies extensive regions of the cortex, including frontal, temporal and parietal areas (van der Zwan et al., 1993). These cortical regions in the left


hemisphere contain areas that are necessary for language processing and production, including classical Broca's and Wernicke's areas in the inferior frontal and superior temporal lobes, respectively. FTCD is a reliable and valid measure of language lateralisation (Bishop et al., 2009; Groen et al., 2012; Illingworth & Bishop, 2009; Stroobant et al., 2011), giving good correlations with the gold standard intracarotid amobarbital test and functional MRI (fMRI) (Deppe et al., 2004; Knake et al., 2003; Knecht et al., 1998a; Knecht et al., 1998b; Rihs et al., 1999; Somers et al., 2011). Importantly, FTCD has moderate-to-good within-session (split half) and test-retest reliability (Woodhead et al., 2020). We can therefore distinguish between true dissociations between LIs on different tasks and lack of agreement attributable to poor reliability of measurement.

FTCD lacks within-hemisphere spatial resolution, so is not suitable for identifying topographic differences in language representation within one hemisphere. However, it provides a measure of changes in blood flow velocity in the middle cerebral artery, which can give a direct index of the relative contribution of the two hemispheres, without any need to specify thresholds or regions of interest. Advantages of FTCD are that it is inexpensive, non-invasive, comfortable, easily applicable, mobile, and child-friendly, and it has excellent resolution in the time domain (Bishop et al., 2010; Knecht et al., 1998b). FTCD has been used to study cerebral lateralisation in monolinguals, but it has not, to our knowledge, been used to compare lateralisation of two languages in bilingual participants, defined here as people who use more than one language on a regular basis (Grosjean, 1989).

Study 1: Highly proficient French-English or German-English bilinguals
In Study 1, we used the cued word generation task, which is a well validated and commonly used productive language task (Knecht et al., 1998a; Knecht et al., 1998b), to test whether language lateralisation is equivalent for first and second languages in bilinguals. A secondary aim was to consider whether there is any impact of AoA. Participants were highly proficient bilinguals, all with English as a second language, who were working or studying in Oxford, UK at an advanced level. We predicted that the extent of left lateralisation of bilingual speakers would relate to their AoA of L2. On the basis of Hull & Vaid's (2007) behavioural meta-analysis we might expect to see weaker lateralisation for L2 in bilinguals with an early AoA. On the other hand, the convergence hypothesis (Green, 2003) predicts that as proficiency increases, the neural substrates of L1 and L2 become more similar. Green's hypothesis did not focus on lateralisation, but it might nevertheless be taken to suggest the opposite pattern to that predicted by Hull and Vaid, i.e., greater similarity in the neural basis of L1 and L2 in those with the longest experience of L2, i.e. those with early AoA.

Methods
Participants. Participants were recruited through the Oxford University German Society and Oxford University French Society, as well as through posters in the Experimental Psychology building. Participants were aged over 18 years and were either German-English (N = 14) or French-English (N = 10) bilinguals, with a self-reported high level of proficiency in English. All had normal or corrected-to-normal vision. Individuals with a diagnosis of any speech, language or learning impairment, affected by a neurological disorder or taking medication affecting brain function, e.g. antidepressants, were not included in the study.

A total of 40 individuals were assessed for viability as study participants. In total, 14 participants were excluded for a range of reasons, including no suitable Doppler signal, due to the inability to find a suitable temporal window in the skull, or failure to stabilize the Doppler signal for the required amount of time (11 participants), or low quality data (3 participants). Data were analysed from 26 participants. During the analysis, 2 further participants were dropped because of an insufficient number of useable trials. All further analyses are based on the final sample of 24 participants (18 female; mean age = 23.04 years, sd = 3.64 years).

Ethics statement. The study was approved by the University of Oxford Central Research Ethics Committee (CUREC; approval number MS-IDREC-C1-2015-126). All participants provided written informed consent.

Apparatus. A commercially available transcranial Doppler ultrasonography device (DWL, Multidop T2; manufacturer, DWL Elektronische Systeme, Singen, Germany) was used for continuous measurements of the changes in cerebral blood flow velocity (CBFV) through the left and right MCA. The MCA was insonated at ~5 cm (40–60 mm). Activity in frontal and medial cortical areas, supplied by the anterior cerebral artery, and inferior temporal cortex, supplied by the posterior cerebral artery, does not contribute to the measurements made in the MCA. Two 2-MHz transducer probes, which are relatively insensitive to participant motion, were mounted on a screw-top headset and positioned bilaterally over the temporal skull window (Deppe et al., 2004).

Handedness. Handedness was not a selection criterion, and was assessed via the Edinburgh Handedness Inventory (EHI; Oldfield, 1971). The inventory consists of 10 items assessing dominance of a person's right or left hand in everyday activities. Each item is scored on a 5-step scale ("always left", "usually left", "both equally", "usually right", "always right"). A person can score between -100 and +100 for each item and an overall score is calculated by averaging across all items ("always left" -100; "usually left" -50; "both equally" 0).

Language history. The Language Experience and Proficiency Questionnaire (LEAP-Q; Marian et al., 2007) was used to assess language history for all participants. The LEAP-Q is a self-assessment questionnaire consisting of nine general questions and seven additional questions per language that explore acquisition history, context of acquisition, current language use, and language preference and proficiency ratings across language domains (speaking, understanding and reading) as well as accent ratings. An overall self-reported proficiency


rating was calculated by taking the mean ratings for proficiency in speaking, reading and understanding English.

The main variable of interest from the LEAP-Q was age of acquisition of L2 (AoA), i.e. the answer to the question 'age when you began acquiring the language'; we subdivided participants into early AoA (before 6 years of age) and late AoA subgroups, to test the prediction from Hull & Vaid (2007) that language is more bilaterally represented when L2 is learned in early childhood. To characterise the sample, we also report the number of languages spoken; age of achieving fluency in English; self-reported strength of foreign accent when speaking English (on a scale from 0 [none] to 10 [pervasive]); and mean self-reported proficiency in English.

Word generation task. Tasks were programmed using Presentation® software (version 17.2; www.neurobs.com). All instructions were presented centrally in white Arial font on a black background. Each participant was tested in English (L2) and their native language (L1; French or German) in a single session using two tasks, each consisting of 23 trials.

The order of the two languages was counterbalanced across participants and the entire testing session lasted between 75 and 90 minutes. The experimenter spoke English at all times. So that they were focussed on their native language, participants were asked to describe the Cookie Theft picture of the Boston Diagnostic Aphasia Examination in their native language prior to being tested in that language (Goodglass & Kaplan, 1983).

The cued word generation paradigms were based on Knecht and colleagues' 1998 paradigm (Knecht et al., 1998b). For each trial, the participant is shown a letter and is asked to silently generate words starting with that letter. Each task comprised 23 trials and lasted for around 20 minutes. We excluded the three letters with the lowest first letter word frequency: Q, X and Y in English; Q, X and Z in German; and W, X and Y in French. Written task instructions for the German and French word generation tasks were translated into German and French by the experimenter (CG).

Each trial started with an auditory tone and the written instruction "Clear Mind" (5 s), followed by the letter cue to which the participant silently generated words (15 s), and then overt word generation (5 s) (Figure 1). To restore baseline activity, participants were instructed to relax (25 s) at the end of each trial. Event markers were sent to the Multi-Dop system when the letter cue appeared, denoting trial onset for subsequent analysis of the Doppler signal.

Data pre-analysis and calculation of asymmetry indices. The cerebral blood flow velocity data were analysed using custom scripts in R Studio (R Core Team, 2020), which are available in the Underlying data (Bishop et al., 2021a). The data preprocessing followed conventional methods (Deppe et al., 2004), and included the following steps:
• Downsampling from 100 Hz to 25 Hz.
• Epoching from -11 s to 30 s relative to the onset of the 'Clear Mind' cue.
• Manual exclusion of trials with obvious spiking or dropout artefacts.
• Automated detection of data points with signal intensity beyond 0.0001-0.9999 quantiles. If a trial contained one of these extreme data points, it was replaced by the mean for that epoch; if it contained more than one, the trial was excluded from further analysis.
• Normalisation of signal intensity by dividing CBFV values by the mean for all included trials and multiplying by 100.

Figure 1. A schematic diagram of the word generation task. Period of interest (POI) is marked in grey from 8 to 20 s, and the event marker is displayed in red.


• Heart cycle integration by averaging the signal intensity from peak to peak of the heartbeat.
• Baseline correction by subtracting the mean CBFV across the baseline period (-10 s to 0 s relative to the 'Clear Mind' cue) from all values in the trial.
• Automated detection and rejection of trials containing normalized values below 60 or above 140.

Participants with fewer than 15 usable trials for either language were excluded from all further analyses. For each participant that was included in the analysis, a grand mean was calculated over all of their included trials. A laterality index (LI) was calculated by taking the mean of the difference between left and right CBFVs (L-R) within a period of interest (POI) that started 8 s after the 'Clear Mind' cue (i.e. 3 s after the word generation task had begun) and ended at 20 s (i.e. when the covert generation task ended). The start time of the POI was chosen to allow time for the blood flow to respond to the task; and the end time was chosen to prevent capturing the response to the overt speech generation phase.

This method of calculating LI using the mean L-R difference across the whole of the POI (the 'mean' method) deviates from the conventional method that we had used in the first version of this paper (https://doi.org/10.12688/wellcomeopenres.9869.1). The original 'peak' method, popularised by Deppe et al. (1997), takes the mean of a narrow time window around the peak difference within the POI. This method forces the LI to be either left or right: even if the waveform is close to zero with no clear lateralised peak, the highest absolute value in the POI will be treated as a peak. This creates a bimodal distribution of LIs. We have compared the 'peak' method with our 'mean' method, and shown that, while they give high agreement, the mean method is at least as reliable and gives normally distributed LI values, albeit with lower values, due to averaging over the whole POI (Woodhead et al., 2020). We have therefore moved to using the mean method in our current research. Nonetheless, peak LI values were computed in case they are required for comparison with other studies, and are available on the online data repository: https://osf.io/4pm76/.

In a final step, to bring our methods in alignment with Woodhead et al. (2019), we identified and excluded datasets with unusually high trial-by-trial variability using the Hoaglin & Iglewicz (1987) outlier detection method. For this analysis, LI was calculated for each trial, rather than just for the grand average. The standard error of these single-trial LI values was then calculated. Outliers were defined as datasets where the standard error was above an upper threshold, calculated as:

Upper threshold = Q3 + 2.2 * (Q3 – Q1)

where Q1 is the first quartile of the standard errors among all participants, and Q3 is the third quartile. Participants who had standard error above the upper threshold for either L1 or L2 were excluded from all further analyses.

Statistical analysis. All analyses were conducted using the R Programming Language (R Core Team, 2020). We first checked for a leftward bias in the overall laterality index, using a one-sample t-test, and also categorised each participant as left-biased, right-biased or bilateral. The bilateral group were those whose confidence interval around the LI included zero. Split half reliability of the LI was estimated using LIs computed from odd or even trials only. Spearman correlations were computed between LIs for L1 and L2.

To test our main hypothesis, the association between strength of lateralization (LI values) for L1 and L2 was first visualized using a scatterplot, with the strength of association computed as Spearman's correlation coefficient. Following Woodhead et al. (2020), we adopted an approach based on Bland & Altman (1986) to determine whether the individual LIs for L1 and L2 were equivalent. This involves specifying boundaries for the expected distribution of difference scores, which should contain 95% of bivariate points, if the two values are equivalent. The expected range can be computed from knowledge of the task reliability. We adopted the range specified by Woodhead et al. (2020); they computed difference scores between LIs for odd and even trials, and set boundaries corresponding to an expected mean of zero +/-1.96 standard deviations. If the two measures are equivalent, 95% of difference scores between LIs for L1 and L2 should fall within this range (the repeatability coefficient; from -2.5 to 2.5).

For our second hypothesis, that laterality for L2 would be associated with AoA, we used a t-test to compare laterality for L2 between those with early vs late AoA. A two-tailed test was used because the literature does not give clear predictions about direction of effect.

In addition, we report the correlation between LI values and strength of handedness (EHI quotient), and the impact of testing order (L1 then L2, or L2 then L1).

Results
Handedness. Summary statistics for the EHI handedness measure can be seen in Table 1. Of 24 participants included in the data analysis, 23 had EHI values above 0, indicating right handedness. The remaining participant had an EHI of -20, indicating weak left handedness. Correlations between LI from FTCD and handedness scores on the EHI were not statistically distinguishable from zero for either L1 (r = -0.145) or L2 (r = 0.137).

Language history. Summary statistics for the language history questionnaire can be seen in Table 1. Self-reported proficiency ratings in speaking, reading and understanding English were all generally high (all around 9/10), with a minimum for any individual rating of 6/10. Age of acquisition, defined as age when first started acquiring the language, was more variable, ranging from 0 to 15 years. Binary categorisation of AoA, using Hull & Vaid's (2006) criteria, gave 7 cases of early AoA (below 6 years of age), and 17 cases of late AoA.
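The 'mean' laterality index, the outlier threshold and the Bland-Altman style equivalence check described in the Methods can be sketched in a few lines. This is an illustrative Python sketch only, not the authors' R scripts: the function names, the inclusive handling of the POI end points, and the quartile convention (Python's "inclusive" method) are assumptions made here for the example.

```python
import statistics

def laterality_index(times, left, right, poi=(8.0, 20.0)):
    """'Mean' method: average of (left - right) CBFV over the period of interest.

    times, left and right are equal-length sequences for one averaged epoch;
    times are in seconds relative to the 'Clear Mind' cue. Treating both POI
    end points as inclusive is an assumption of this sketch.
    """
    diffs = [l - r for t, l, r in zip(times, left, right) if poi[0] <= t <= poi[1]]
    if not diffs:
        raise ValueError("no samples fall inside the period of interest")
    return statistics.fmean(diffs)

def upper_outlier_threshold(standard_errors, k=2.2):
    """Hoaglin & Iglewicz style cut-off: Q3 + k * (Q3 - Q1), with k = 2.2."""
    # The quartile convention is an assumption; other conventions give
    # slightly different values in small samples.
    q1, _, q3 = statistics.quantiles(standard_errors, n=4, method="inclusive")
    return q3 + k * (q3 - q1)

def proportion_within_bounds(li_l1, li_l2, bound=2.5):
    """Share of participants whose L1 - L2 difference lies within +/- bound."""
    diffs = [a - b for a, b in zip(li_l1, li_l2)]
    return sum(abs(d) <= bound for d in diffs) / len(diffs)

if __name__ == "__main__":
    # Toy epoch sampled at 25 Hz from -11 s to 30 s, left 3 units above right in the POI.
    times = [i / 25 - 11 for i in range(41 * 25 + 1)]
    left = [103.0 if 8 <= t <= 20 else 100.0 for t in times]
    right = [100.0] * len(times)
    print(laterality_index(times, left, right))
    print(upper_outlier_threshold([1, 2, 3, 4, 5]))
    print(proportion_within_bounds([2, 3, 4, 0], [2.5, 3, 1, 0.2]))
```

If L1 and L2 are equivalent, the last function would be expected to return a value close to 0.95 when applied to the real laterality indices with the bounds stated in the Methods.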


Table 1. Demographics for the Study 1 participants, N=24 (18 female).

Characteristic                       Mean (sd)
Age, years                           23.04 (3.64)
EHI/100                              73.67 (26.74)
Languages spoken                     3.71 (0.95)
Age of English acquisition, years    7.54 (4.41)
Age of English fluency, years        12 (6.83)
English accent/10                    2.58 (2.41)
English overall rating/10            9.1 (1)
English speaking rating/10           8.92 (1.1)
English listening rating/10          9.12 (1.08)
English reading rating/10            9.25 (0.94)

FTCD data quality and reliability. As mentioned in the Methods, two participants were excluded from the analysis because of an insufficient number of usable trials. For the remaining 24 participants, 5.98% of trials were excluded for L1, and 6.34% for L2.

Normality of the LI values was assessed using Shapiro-Wilk tests. Distributions of LIs were unimodal for both L1 and L2. Data for L1 deviated significantly from normality (W = 0.88, p = 0.009), showing a rightward skew, whereas data for L2 did not (W = 0.96, p = 0.514).

Split-half reliability was assessed by correlating the LI values from odd and even trials. The Spearman’s correlation for the L1 data was 0.58, and for the L2 data it was 0.7, indicating medium to good within-session reliability.

Normalized blood flow velocities for the left and right middle cerebral arteries are presented for each task in Figure 2.

Table 2 shows summary statistics for the LI values for L1 and L2. The Bayes factor was computed to check the equivalence of the mean LI for the two languages using the R package ‘BayesFactor’ with default settings (Morey & Rouder, 2018), and gave a value of 0.234, which may be interpreted as moderate evidence for the null hypothesis (Lee & Wagenmakers, 2014). The percentage of participants in each group categorised as left lateralised, bilateral or right lateralised is also shown. The majority of participants were left lateralised, with only around 10% showing bilateral activation. No participants showed right lateralisation for either L1 or L2. T-tests showed that

Figure 2. Left and right hemisphere activation is displayed as a function of epoch time in seconds for the word generation
task for L1 (French or German) and L2 (English) in Study 1. Dotted lines indicate the start and end of the baseline period (from -10 to
0 seconds) and the period of interest (from 8 to 20 seconds). L1, first language; L2, second language.
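The split-half reliability check reported for the LI values (correlating odd-trial and even-trial LIs across participants with Spearman's correlation) can be sketched as follows. This is a minimal illustration, assuming that a per-trial LI is available for each participant and that the odd and even halves are summarised by simple means; the data layout and function names are hypothetical and may differ from the analysis scripts actually used.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def split_half_reliability(trial_lis):
    """Correlate odd-trial and even-trial mean LIs across participants.

    trial_lis holds one list of per-trial LI values per participant
    (a hypothetical layout chosen for this sketch).
    """
    odd_means = [mean(trials[0::2]) for trials in trial_lis]   # trials 1, 3, 5, ...
    even_means = [mean(trials[1::2]) for trials in trial_lis]  # trials 2, 4, 6, ...
    return spearman(odd_means, even_means)


# Made-up per-trial LIs for four participants (not data from the study).
example = [
    [2.1, 2.4, 1.9, 2.2],
    [3.0, 2.7, 3.3, 3.1],
    [0.5, 1.2, 0.8, 0.4],
    [1.6, 1.5, 2.0, 1.9],
]
print(round(split_half_reliability(example), 3))
```

A rank-based correlation is used here because the text reports Spearman's correlations, which are less sensitive to outlying LI values than Pearson's.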

Table 2. Summary statistics for Study 1 laterality indices (N = 24).

Language  Mean trials  Mean LI  SE LI  % left  % bilateral  % right
L1        21.62        2.72     0.36   92      8            0
L2        21.54        2.82     0.28   88      12           0


there were no significant effects of testing order on LI values, either for L1 (p = 0.113) or L2 (p = 0.712).

As can be seen in the scatterplot in Figure 3, laterality indices for L1 and L2 were similar, with Spearman’s R = 0.703. Furthermore, the points cluster around the continuous grey line, which shows the point of equivalence between L1 and L2, and all but one point falls within the Bland-Altman bounds (dotted grey lines), as would be expected if L1 and L2 were equivalent.

Effect of age of acquisition. One can see by inspection of Figure 3 that there is no evidence of a trend for lower LI for L2 in those with early AoA, and a t-test of differences in L2 LI for those with early and late AoA revealed no differences: t = 0.84, p = 0.419. For a more quantitative assessment of association, we computed the Spearman’s correlation between the LI values for L2 (English) and the age of acquisition of English. This was not statistically different from zero (r = 0, p = 0.99).

Figure 3. Scatterplot showing individual mean LIs in L1 and L2, with horizontal and vertical error bars denoting standard errors. The continuous grey line corresponds to the point of equality of the two measures, and the dotted lines show the limits where difference between LIs is +/- 2.5.

Discussion
Nearly all participants showed significantly left-lateralised blood flow for both L1 and L2 during the word generation task. Only 5 participants were classified as bilateral for one language, and for 3 of these it was L2 that was bilateral. Furthermore, laterality indices for L1 and L2 were highly related and similar in magnitude, indicating good reliability of the measure.

Proficiency was generally high in this sample, so it was not possible to assess the impact of variation in proficiency on lateralisation. The sample was small, and so lacking in power to detect small effects, but there was no indication of support for the hypothesis that AoA affected absolute levels of language lateralisation or was related to a difference in lateralisation between the two languages.

Study 2: Japanese-English bilinguals with moderate-high proficiency
In Study 1 we found no difference in laterality patterns for L1 and L2 in French-English and German-English bilinguals, but it is possible that differences might be more apparent with languages that are more different from one another, in grammatical structure, lexical items and/or phonology. These factors have been shown to influence the ease with which a second language is learned, and might plausibly affect the extent to which language representations are shared or distinct (Schepens et al., 2016). Study 2 provided the opportunity to assess this idea in a sample of adults whose native language was Japanese, with English as the L2.

Study 2 was run independently of Study 1, at a different institution by different experimenters, to address similar questions to Study 1, but with Japanese-English bilinguals. We report the two studies together here as they make it possible to test generalisability of the Study 1 findings in a different language, and with some methodological modifications. In addition,


Study 2 included bilinguals with a wider range of proficiency than Study 1, making it possible to consider the effect of this variable on lateralisation.

An additional aim of Study 2 was to test whether a language that uses both logographic and syllabic orthographic systems would show a more pronounced difference between phonological and semantic processing in the strength of lateralisation (cf. Gutierrez-Sigut et al., 2015). Japanese Kana carry phonological information, but Kanji are more strongly linked to semantic information. We expected that phonological fluency would stimulate typically left-lateralised pre-motor articulatory planning processes more strongly than semantic fluency, and therefore be more strongly left-lateralised.

Methods
Participants. We recruited participants through the UCL psychology participant pool, research posters around the University, and through email communication to contacts within Japanese communities in London. We initially recruited 32 adult native speakers of Japanese, who reported using English on a daily basis. None of the participants had a history of reading or language difficulties. All had normal or corrected to normal vision.

Seven participants were excluded from the study. This was due to inability to find a suitable temporal window (6 participants), or an insufficient number of usable trials after preprocessing (1 participant). All analyses are based on the final sample of 25 participants (19 female, mean age = 29.32 years, sd = 6.73 years).

Ethics statement. Ethical approval for the study was granted by the UCL Research Ethics Committee (ID:3612/001). Participants gave written informed consent and were aware they could withdraw at any time.

Language history and ability. Age of acquisition of English and number of years of using English were evaluated via self-report. As with Study 1, a binary age of acquisition (AoA) variable was created by subdividing participants into early (below 6 years) and late (6 years or over) subgroups.

English language ability was measured using the Quick Placement Test (University of Cambridge Local Examinations Syndicate, 2001), which assesses English reading, vocabulary, and grammar. The test is scored out of 60. Those who scored under 40 were classed as having basic level proficiency (N = 4); between 40 and 48 were classed as having intermediate level proficiency (N = 3); and above 48 were classed as having advanced level proficiency (N = 17). The test data were not available for one participant.

FTCD apparatus. Blood flow velocity through the left and right MCAs was examined using a DopplerBox ultrasonography device and DiaMon headset (manufactured by DWL Elektronische Systeme, Singen, Germany). Two 2-MHz transducer monitoring probes were mounted on the headset and placed at each temporal skull window.

Word generation tasks. Stimuli were presented using Cogent toolbox (http://www.vislab.ucl.ac.uk/cogent) for MATLAB (Mathworks Inc., Sherborn, MA). Triggers time locked to the onset of the stimulus were sent from the presentation PC to the Doppler Box set-up.

The task was based on Gutierrez-Sigut et al. (2015), and involved phonological and semantic word generation tasks in English and Japanese, with order counterbalanced across participants. Task instructions were delivered to correspond to the tested language. Unlike in Study 1, there was no silent interval for covert word generation: participants spoke the words aloud as they thought of them. Gutierrez-Sigut et al. had previously shown that LIs were similar regardless of whether overt or covert responses were given, and they noted a benefit of overt production was that the experimenter could record the participants’ responses as they occurred. For each trial, participants saw “Clear Mind” presented on the screen for 3 seconds. The cue stimulus was then presented, and participants had 17 seconds to overtly generate as many words as possible. Participants were then instructed to relax for 16 seconds to restore baseline activity. Each trial lasted a total of 36 seconds.

Stimuli
Phonological word generation - Japanese and English. In Japanese, participants were presented with a cue in Hiragana, one of the Japanese phonological scripts. Following the Japanese mora frequency analysis conducted by Dan et al. (2013) based on the familiarity ratings in Amano & Kondo (1999), 10 of the 12 most frequent moras that are positioned at the beginning of words were selected (あ/a/, い/i/, お/o/, か/ka/, き/ki/, こ/ko/, さ/sa/, し/shi/, た/ta/, ふ/hu/). The two moras omitted were は (/ha/) and じ (/ji/). は was omitted because it would be pronounced /wa/ when it was the subject-marker and じ was omitted because it was the voiced sound of し (/shi/) that was included in the stimuli. Participants had to produce as many words as possible that began with the specified Kana. Each Kana was presented twice, and the 20 trials were presented in a pseudo-randomised order to ensure all 10 cues had been presented once before a cue was repeated.

In the English phonological word generation task, participants were presented with 10 alphabetic letters (A, B, C, F, H, M, O, S, T, W) and asked to produce as many words as possible that began with the specified letter. Trials were presented in the same manner as the Japanese task.

Semantic word generation - Japanese and English. Ten Japanese words representing semantic categories were presented in the standard written form, i.e. the mixture of Kanji and Kana: 家畜 farm animals, 動物園の動物 zoo animals, 野菜 vegetables, 果物 fruits, 飲み物 drinks, 色 colours, スポーツ sports, ペット pets, 道具 tools, and 乗り物 transport. The same semantic categories were presented in English. Participants had to report as many words that matched these categories as possible. Each category was repeated twice in the semantic fluency blocks. Categories were presented in a pseudo-randomised order.
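The pseudo-randomised ordering described for the cues (every cue shown once before any cue is repeated) and the 36-second trial structure can be sketched as below. This is an illustration only: the study presented stimuli with the Cogent toolbox in MATLAB, and the function names, the block-wise shuffling strategy, and the use of a seed are assumptions made here.

```python
import random

# Cue sets listed in the text.
JAPANESE_KANA_CUES = ["あ", "い", "お", "か", "き", "こ", "さ", "し", "た", "ふ"]
ENGLISH_LETTER_CUES = ["A", "B", "C", "F", "H", "M", "O", "S", "T", "W"]

# "Clear Mind" (3 s) + word generation (17 s) + rest (16 s) = 36 s per trial.
TRIAL_SECONDS = 3 + 17 + 16


def pseudo_randomised_order(cues, repetitions=2, seed=None):
    """Shuffle the full cue set once per repetition and concatenate the blocks.

    Each block contains every cue exactly once, so no cue can repeat
    before all cues have been presented.
    """
    rng = random.Random(seed)
    order = []
    for _ in range(repetitions):
        block = list(cues)
        rng.shuffle(block)
        order.extend(block)
    return order


def trial_onsets(order, trial_seconds=TRIAL_SECONDS):
    """Pair each cue with the onset time (in seconds) of its trial."""
    return [(index * trial_seconds, cue) for index, cue in enumerate(order)]


schedule = trial_onsets(pseudo_randomised_order(ENGLISH_LETTER_CUES, seed=1))
print(len(schedule), "trials; last trial starts at", schedule[-1][0], "s")
```

Note that this simple block shuffle does not prevent the last cue of the first block from matching the first cue of the second block; the text does not say whether such back-to-back repeats were avoided.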


FTCD analysis. The same FTCD analysis method was used as in Study 1, except that the epoch lengths were changed to match timings for Study 2. The POI started at 6 s after the onset of the ‘Clear Mind’ stimulus (i.e., 3 s after the word generation task had begun) and ended at 20 s (i.e., at the end of the word generation task).

Results
Language history and task performance. Summary statistics of language history can be seen in Table 3. Age of English acquisition ranged from 0 to 13 years. In contrast to Study 1, where there was little variation in proficiency, Study 2 included 4 cases with basic proficiency, 3 cases with intermediate proficiency, and 17 cases with advanced proficiency, according to the Quick Placement Test. The usage of English was assessed using the question “how much English and Japanese (and other languages if you have) do you use in a typical week?” and the percentages of use of English out of 100% are shown in Table 3. The participants tended to use English more than Japanese.

Table 3. Demographics for the Study 2 participants, N=25 (19 female).

Characteristic                       Mean (sd)
Age, years                           29.32 (6.73)
Age of English acquisition, years    10 (4.21)
Time using English, years            11.08 (6.43)
English overall score/60             47.38 (7.1)
English speaking/100                 65.65 (20.96)
English listening/100                69.78 (16.48)
English reading/100                  65.43 (16.51)
English writing/100                  69.78 (18.8)

The mean number of words produced per trial in the phonological conditions was 5.84 (SD = 1.34) for Japanese and 5.98 (SD = 1.32) for English. The mean number of words produced per trial in the semantic condition was 7.61 (SD = 1.24) for Japanese and 6.95 (SD = 1.29) for English. There was no significant difference between the mean number of words produced per trial for L1 and L2 in the phonological condition (t (48) = -0.36, p = 0.719) or the semantic condition (t (47.9) = 1.84, p = 0.071).

FTCD data quality and reliability. Normality of LI values was assessed using Shapiro-Wilk tests. For the phonological tasks, data were normally distributed for L1 (W = 0.96, p = 0.507) and L2 (W = 0.95, p = 0.278). Data were also normally distributed for the semantic tasks for L1 (W = 0.97, p = 0.63) and L2 (W = 0.98, p = 0.949).

Split-half reliability was assessed by correlating the LI values from odd and even trials, using Spearman’s correlations for consistency with Study 1. For phonological word generation, the split-half correlation was 0.6 for L1 and 0.83 for L2. For semantic word generation, the correlation was 0.61 for L1 and 0.69 for L2. This indicated moderate to good reliability for all tasks.

LI values. Normalized blood flow velocities for the left and right middle cerebral arteries are presented for each language and task in Figure 4. Table 4 shows summary statistics for L1 and L2 in both phonological and semantic word generation tasks. Bayes factors were computed to check the equivalence of the mean LI for the two tasks in the two languages using the R package ‘BayesFactor’ with default settings (Morey & Rouder, 2018). This gave a value of 0.211 for the Phonological task, which may be interpreted as moderate evidence for the null hypothesis, and a value of 0.368 for the Semantic task, which corresponds to anecdotal evidence for the null hypothesis (Lee & Wagenmakers, 2014).

Laterality indices for L1 and L2 were strongly correlated in both the phonological task (Spearman’s R = 0.769) and the semantic task (Spearman’s R = 0.775), closely replicating the results of Study 1. This is shown in the scatterplots in Figures 5a and 5b.

Effects of age of acquisition and proficiency. Points in Figure 5 are coded to show age of acquisition. We explored whether age of acquisition for English was related to strength of laterality in L2. There was no significant correlation between AoA and LI for the phonological task (r = -0.12, p = 0.583; Figure 5A) or for the semantic task (r = -0.02, p = 0.907; Figure 5B).

Data on the Quick Placement Test, the measure of proficiency in L2, were available for 24 participants. These were not correlated with the LI for either the phonological task (r = 0.11, p = 0.614) or the semantic task (r = 0.02, p = 0.932).

Discussion
Study 2 found that most participants were left-lateralised for language on both tasks in both languages and there was close correspondence between the LIs for L1 and L2. Furthermore, the pattern of results was very similar for the phonological and semantic fluency tasks. For this sample we had direct measures of proficiency, but again we found no relationship between lateralisation and either age of acquisition or proficiency.

General discussion
The results of Studies 1 and 2 show strong similarity despite the differing format of the tasks (covert and overt), native languages (French/German and Japanese), and English proficiencies (mostly highly proficient but varying between basic and advanced proficiency).

The correlations between the LIs for L1 and L2 were uniformly high (ranging from .70 to .78) with 79% of participants left lateralised for L1 and 76% of participants left lateralised in L2. The data reported here add to a growing pool of results


supporting the idea that laterality of expressive language processing is the same for L1 and L2 in proficient bilinguals. It is worth highlighting that our studies only used expressive language tasks, which typically produce strong lateralisation.

Figure 4. Left and right hemisphere activation is displayed as a function of epoch time in seconds for word generation for L1 (Japanese) and L2 (English) in Study 2. Plot 4a shows the phonological word generation task, and 4b shows the semantic word generation task. Dotted lines indicate the start and end of the baseline period (from -10 to 0 seconds) and the period of interest (from 6 to 20 seconds). L1, first language; L2, second language.

Table 4. Summary statistics for Study 2 laterality indices.

Task          Language  Mean trials  Mean LI  SE LI  % left  % bilateral  % right
Phonological  L1        19.12        1.96     0.28   72      28           0
Phonological  L2        19.52        2.01     0.38   76      20           4
Semantic      L1        19.00        1.53     0.28   72      28           0
Semantic      L2        19.12        1.65     0.28   64      32           4
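The verbal labels attached to the reported Bayes factors (0.234 in Study 1, and 0.211 and 0.368 in Study 2) follow the conventional bands attributed to Lee & Wagenmakers (2014). The sketch below maps a Bayes factor to those labels; it assumes the reported value expresses evidence for a difference relative to the null (so values below 1 favour the null), and the function name is hypothetical. The Bayes factors themselves were computed with the R package 'BayesFactor', which is not reproduced here.

```python
def describe_bayes_factor(bf10):
    """Return a verbal label for a Bayes factor using the conventional bands.

    bf10 is assumed to express evidence for the alternative (H1) over
    the null (H0), so values below 1 favour the null.
    """
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    if bf10 == 1:
        return "no evidence"
    favoured = "H1" if bf10 > 1 else "H0"
    strength = bf10 if bf10 > 1 else 1 / bf10
    if strength > 100:
        label = "extreme"
    elif strength > 30:
        label = "very strong"
    elif strength > 10:
        label = "strong"
    elif strength > 3:
        label = "moderate"
    else:
        label = "anecdotal"
    return f"{label} evidence for {favoured}"


for value in (0.234, 0.211, 0.368):
    print(value, "->", describe_bayes_factor(value))
```

Taking the reciprocal for values below 1 keeps a single set of cut-offs (3, 10, 30, 100) for both directions of evidence.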


Figure 5. Scatterplot showing individual mean LIs in L1 and L2 for (a) Phonological and (b) Semantic Word Generation, with horizontal and
vertical error bars denoting standard errors. The continuous grey line corresponds to the point of equality of the two measures, and the
dotted lines show the limits where difference between LIs is +/- 2.5.
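The agreement check shown by the dotted lines in the scatterplots (a difference between the L1 and L2 LIs of more than 2.5 in either direction) can be sketched as follows. The fixed limit of 2.5 is taken from the figure captions; how that limit was derived is not stated in this passage, so the second function shows the standard Bland-Altman limits of agreement (mean difference ± 1.96 standard deviations) purely for comparison. The function names and the example values are hypothetical.

```python
from statistics import mean, stdev


def paired_differences(li_l1, li_l2):
    """Return the L1 minus L2 difference in LI for each participant."""
    return [a - b for a, b in zip(li_l1, li_l2)]


def outside_fixed_bounds(li_l1, li_l2, limit=2.5):
    """Return indices of participants whose LI difference exceeds +/- limit."""
    differences = paired_differences(li_l1, li_l2)
    return [index for index, d in enumerate(differences) if abs(d) > limit]


def bland_altman_limits(li_l1, li_l2):
    """Return (lower, upper) limits of agreement: mean difference +/- 1.96 SD."""
    differences = paired_differences(li_l1, li_l2)
    centre = mean(differences)
    spread = 1.96 * stdev(differences)
    return centre - spread, centre + spread


# Made-up LI values for five participants (not data from the study).
example_l1 = [2.8, 3.1, 0.4, 1.9, 4.6]
example_l2 = [2.5, 2.9, 0.9, 2.2, 1.8]
print(outside_fixed_bounds(example_l1, example_l2))
print(bland_altman_limits(example_l1, example_l2))
```

A fixed limit makes the check comparable across tasks and studies, whereas data-driven limits of agreement widen or narrow with the spread of differences in each sample.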

Where discrepancies in laterality have previously been reported, this has been for receptive language tasks - both in behavioural contexts (dichotic listening), and in neuroimaging (comprehension or lexical decision tasks) (Gurunandan et al., 2020; Hull & Vaid, 2007; Wartenburger et al., 2003). It is possible that the processes that drive this effect seen in the literature are not recruited during expressive language production. There would be considerable interest in studying laterality of perception and comprehension of spoken language using FTCD, for which we have developed some paradigms that have good reliability (Woodhead et al., 2020).

Split-half reliabilities for all tasks were also uniform and high (ranging from .58 to .83) for both languages. This suggests that previously reported dissociations between laterality for L1 and L2 could simply reflect low reliability of the chosen measure. We believe our results are not an artefact of bimodality in the distributions; few cases had atypical lateralisation, and we used nonparametric correlations to guard against undue influence on correlations by outliers.

Age of acquisition has been proposed as a key factor in determining divergence of lateralisation patterns. For example, Hull & Vaid (2007) found that bilinguals who were exposed to a second language before the age of 6 years had more bilateral representation than those who acquired a second language later. In our studies, age of acquisition (defined as age at first acquiring L2) ranged from 0–15 years in Study 1 and 0–13 years in Study 2. We


found no difference in lateralisation strength for L1 and L2 in those who acquired English early compared to those who acquired English later in either study for both phonological and semantic word generation tasks. This suggests that when a second language is proficiently acquired, lateralisation patterns of expressive language remain stable, regardless of age at which acquisition began.

Our research can also add to the literature regarding lateralisation and proficiency. Study 2 included participants with proficiency levels varying from basic to advanced as measured by the standardised Quick Placement Test. Gurunandan et al. (2020) reported that increasing proficiency of L2 accompanied more divergent lateralization patterns between L1 and L2. This result was not replicated in our study, with participants of all proficiencies showing similar LI strengths across all tasks. As we found no indication that degree of language laterality is of functional significance, this opens up the possibility that variations in strength of LI, as measured by FTCD, may reflect anatomical differences. Individual variation in anatomy of the cerebral blood vessels has been documented (Payne, 2017), but has not, to our knowledge, been related to measures of lateralised blood flow.

Limitations
Sample population: Our samples were relatively small, with relatively few individuals with early age of acquisition or low proficiency. Given the dearth of data on cerebral lateralisation in bilinguals, we nevertheless feel that the data are worth reporting so they can contribute to future meta-analyses. To that end we have made the data openly available in a repository.

Language assessment: In study 1, we used a self-report questionnaire to describe our sample and assess language history and proficiency, but behavioural measurements of proficiency may have revealed a wider response range for correlational analysis. Although Marian and colleagues established high reliability and validity for the self-report questionnaire used here, and validated it against behavioural measures, their questionnaire was devised to describe a population rather than provide an analysis measure of individual differences (Marian et al., 2007).

For study 2, we had a direct measure of language proficiency, but we did not find any coherent associations between level of proficiency and lateralisation.

Method: While test-retest reliability of FTCD measurements is high and the time-locked correlation analysis of CBFV is robust and non-invasive, the main limitation of the method is that findings can only be interpreted on a hemispheric level, and do not give information about brain regions within a hemisphere that are involved in processing first and second languages. To uncover the specific networks involved in processing L1 and L2, we would need techniques that provide finer-grained information about within-hemisphere localisation, microcircuitry, and connectivity (Abutalebi & Green, 2007).

Conclusions
In two studies, we showed that proficient bilinguals have comparable levels of lateralisation for L1 and L2 when laterality is measured using FTCD during modified versions of the well-validated word generation tasks. Our results indicate that degree of language laterality is reasonably stable in individuals, rather than simply reflecting error of measurement.

Laterality and language are multidimensional constructs, and in future work FTCD could be used to test bilingual laterality with different tasks and larger, more heterogeneous samples, differing on what DeLuca et al. (2019) referred to as “the spectrum of experiences”. As an inexpensive, non-invasive, comfortable, easily applicable, mobile, and child-friendly method, with a high temporal resolution, FTCD can complement fMRI, allowing us to test large samples and track changes throughout development, with repeated administration and with different tasks.

Data availability
Underlying data
Open Science Framework: Bilingual FTCD, https://doi.org/10.17605/OSF.IO/VD9DT (Bishop et al., 2021a).

This project includes the original raw and processed data for study 1. Please see the Data Dictionary for a description of the files.

Open Science Framework: Lateralisation in bilinguals, https://doi.org/10.17605/OSF.IO/MDCZ5 (Bishop et al., 2021b).

This project includes the raw data for study 2 and processed data for studies 1 and 2, with custom scripts used for analysis.

Data are available under the terms of the Creative Commons Zero “No rights reserved” data waiver (CC0 1.0 Public domain dedication).

Author contributions
Conception and design (CRG, KEW, DVMB, HP, MM, EG, MS); data collection (CRG, HP, MS); data processing and statistical analyses (CRG, DVMB, SCH, ZVJW, HP, MS); interpretation of data (CRG, KEW, DVMB, HP); drafting and revising the manuscript (CRG, KEW, DVMB, SCH, ZVJW, HP, EG, MS).


References

Aaslid R, Markwalder TM, Nornes H: Noninvasive transcranial Doppler ultrasound recording of flow velocity in basal cerebral arteries. J Neurosurg. 1982; 57(6): 769–774.
Abutalebi J, Cappa SF, Perani D: What can functional neuroimaging tell us about the bilingual brain? In JF Kroll & AMB de Groot (Eds.), Handbook of bilingualism: Psycholinguistic approaches. New York: Oxford University Press, 2005; 497–515.
Abutalebi J, Green D: Bilingual language production: The neurocognition of language representation and control. J Neurolinguistics. 2007; 20(3): 242–275.
Amano S, Kondo T: NTT database series, Nihongo-no Goitokusei: Lexical properties of Japanese. Tokyo, Japan: Sanseido-shoten (in Japanese). 1999; 1.
Bishop DVM, Badcock NA, Holt G: Assessment of cerebral lateralization in children using functional transcranial Doppler ultrasound (FTCD). J Vis Exp. 2010; (43): 2161.
Bishop DVM, Watt H, Papadatou-Pastou M: An efficient and reliable method for measuring cerebral lateralization during speech with functional transcranial doppler ultrasound. Neuropsychologia. 2009; 47(2): 587–590.
Bishop DVM, Grabitz CR, Watkins KE: Bilingual FTCD. 2021a. http://www.doi.org/10.17605/OSF.IO/VD9DT
Bishop DVM, Grabitz CR, Harte SC, et al.: Lateralisation in bilinguals. 2021b. http://www.doi.org/10.17605/OSF.IO/MDCZ5
Bland JM, Altman DG: Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986; 1(8476): 307–310.
Centeno M, Koepp MJ, Vollmar C, et al.: Language dominance assessment in a bilingual population: Validity of fMRI in the second language. Epilepsia. 2014; 55(10): 1504–1511.
Dan H, Dan I, Sano T, et al.: Language-specific cortical activation patterns for verbal fluency tasks in Japanese as assessed by multichannel functional near-infrared spectroscopy. Brain Lang. 2013; 126(2): 208–216.
Dehaene S, Dupoux E, Mehler J, et al.: Anatomical variability in the cortical representation of first and second language. Neuroreport. 1997; 8(17): 3809–3815.
DeLuca V, Rothman J, Bialystok E, et al.: Redefining bilingualism as a spectrum of experiences that differentially affects brain structure and function. Proc Natl Acad Sci U S A. 2019; 116(15): 7565–7574.
Deppe M, Knecht S, Henningsen H, et al.: AVERAGE: a Windows program for automated analysis of event related cerebral blood flow. J Neurosci Methods. 1997; 75(2): 147–154.
Deppe M, Ringelstein EB, Knecht S: The investigation of functional brain lateralization by transcranial Doppler sonography. Neuroimage. 2004; 21(3): 1124–1146.
Goodglass H, Kaplan E: Boston naming test. Lea & Febiger, Philadelphia, 1983.
Green DW: Neural basis of lexicon and grammar in L2 acquisition: The convergence hypothesis. In Van Hout R, Hulk A, Kuiken F, & Towell R, (Eds). The lexicon-syntax interface in second language acquisition. John Benjamins, 2003; 197–218.
Groen MA, Whitehouse AJO, Badcock NA, et al.: Does cerebral lateralization develop? A study using functional transcranial Doppler ultrasound assessing lateralization for language production and visuospatial memory. Brain Behav. 2012; 2(3): 256–269.
Grosjean F: Neurolinguists, beware! The bilingual is not two monolinguals in one person. Brain Lang. 1989; 36(1): 3–15.
Gurunandan K, Arnaez-Telleria J, Carreiras M, et al.: Converging evidence for differential specialization and plasticity of language systems. J Neurosci. 2020; 40(50): 9715–9724.
Gutierrez-Sigut E, Payne H, MacSweeney M: Investigating language lateralization during phonological and semantic fluency tasks using functional transcranial Doppler sonography. Laterality. 2015; 20(1): 49–68.
Hoaglin DC, Iglewicz B: Fine-tuning some resistant rules for outlier labeling. J Am Stat Assoc. 1987; 82(400): 1147–1149.
Hull R, Vaid J: Laterality and language experience. Laterality. 2006; 11(5): 436–464.
Hull R, Vaid J: Bilingual language lateralization: a meta-analytic tale of two hemispheres. Neuropsychologia. 2007; 45(9): 1987–2008.
Illingworth S, Bishop DVM: Atypical cerebral lateralisation in adults with compensated developmental dyslexia demonstrated using functional transcranial doppler ultrasound. Brain Lang. 2009; 111(1): 61–65.
Kim KH, Relkin NR, Lee KM, et al.: Distinct cortical areas associated with native and second languages. Nature. 1997; 388(6638): 171–174.
Klein D: A positron emission tomography study of presurgical language mapping in a bilingual patient with a left posterior temporal cavernous angioma. J Neurolinguist. 2003; 16(4–5): 417–427.
Klein D, Milner B, Zatorre RJ, et al.: The neural substrates underlying word generation: A bilingual functional-imaging study. Proc Natl Acad Sci U S A. 1995; 92(7): 2899–2903.
Knake S, Haag A, Hamer HM, et al.: Language lateralization in patients with temporal lobe epilepsy: A comparison of functional transcranial Doppler sonography and the Wada test. Neuroimage. 2003; 19(3): 1228–1232.
Knecht S, Deppe M, Ebner A, et al.: Noninvasive determination of language lateralization by functional transcranial Doppler sonography: A comparison with the Wada test. Stroke. 1998a; 29(1): 82–86.
Knecht S, Deppe M, Ringelstein EB, et al.: Reproducibility of functional transcranial Doppler sonography in determining hemispheric language lateralization. Stroke. 1998b; 29(6): 1155–1159.
Lee MD, Wagenmakers EJ: Bayesian Cognitive Modeling: A practical course. Cambridge University Press. 2014.
Marian V, Blumenfeld HK, Kaushanskaya M: The language experience and proficiency questionnaire (LEAP-Q): Assessing language profiles in bilinguals and multilinguals. J Speech Lang Hear Res. 2007; 50(4): 940–967.
Morey RD, Rouder JM: BayesFactor: Computation of Bayes Factors for Common Designs. R package version 0.9.12-4.2. 2018.
Oldfield RC: The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia. 1971; 9(1): 97–113.
Paradis M: A neurolinguistic theory of bilingualism. John Benjamins Publishing, 2004; 18.
Payne S: Cerebral Blood Flow and Metabolism: A Quantitative Approach. World Scientific. 2017.
Perani D, Abutalebi J: The neural basis of first and second language processing. Curr Opin Neurobiol. 2005; 15(2): 202–206.
R Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. 2020.
Rihs F, Sturzenegger M, Gutbrod K, et al.: Determination of language dominance: Wada test confirms functional transcranial Doppler sonography. Neurology. 1999; 52(8): 1591–1596.
Schepens JJ, van der Slik F, van Hout R: L1 and L2 distance effects in learning L3 Dutch. Lang Learn. 2016; 66(1): 224–256.
Somers M, Neggers SF, Diederen KM, et al.: The measurement of language lateralization with functional transcranial Doppler and functional MRI: a critical evaluation. Front Hum Neurosci. 2011; 5: 31.
Stroobant N, Van Boxstael J, Vingerhoets G: Language lateralization in children: A functional transcranial Doppler reliability study. J Neurolinguistics. 2011; 24(1): 14–24.
Sulpizio S, Del Maschio N, Fedeli D, et al.: Bilingual language processing: A meta-analysis of functional neuroimaging studies. Neurosci Biobehav Rev. 2020; 108: 834–853.
University of Cambridge Local Examinations Syndicate: Quick placement test. Oxford: Oxford University Press. 2001.
van der Zwan A, Hillen B, Tulleken CA, et al.: A quantitative investigation of the variability of the major cerebral arterial territories. Stroke. 1993; 24(12): 1951–1959.
Wartenburger I, Heekeren HR, Abutalebi J, et al.: Early setting of grammatical processing in the bilingual brain. Neuron. 2003; 37(1): 159–170.
Woodhead ZVJ, Bradshaw AR, Wilson AC, et al.: Testing the unitary theory of language lateralization using functional transcranial Doppler sonography in adults. R Soc Open Sci. 2019; 6(3): 181801.
Woodhead ZVJ, Rutherford HA, Bishop DVM: Measurement of language laterality using functional transcranial Doppler ultrasound: A comparison of different tasks [version 3; peer review: 3 approved, 1 approved with reservations]. Wellcome Open Research. 2020; 3: 104.

Open Peer Review


Version 1

Reviewer Report 20 March 2017

https://doi.org/10.21956/wellcomeopenres.10640.r20237

© 2017 Jansen A et al. This is an open access peer review report distributed under the terms of the Creative
Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium,
provided the original work is properly cited.

Andreas Jansen
Laboratory for Multimodal Neuroimaging (LMN), Department of Psychiatry, University of Marburg,
Marburg, Germany
Verena Schuster
Department of Psychiatry, University of Marburg, Marburg, Germany

Although functional asymmetries between the hemispheres have been known since the mid-19th
century, we still lack a thorough understanding of the underlying mechanisms. In particular, we
do not have precise models that reveal which factors drive hemispheric specialization, how
lateralization processes of different cognitive functions interact with each other, and how the
brain integrates processes that are lateralized to opposite hemispheres. In the present study,
Grabitz and colleagues aimed to investigate whether the hemispheric lateralization of first and
second languages is different. Hemispheric dominance was assessed by functional transcranial
Doppler sonography (fTCD) in 26 high proficiency bilinguals with either German or French as their
first language (L1) and English as their second language (L2). fTCD was used to assess task-
dependent blood flow velocity changes in the left and right middle cerebral arteries during a cued
word generation task. The authors report that the majority of participants (22/26) were
significantly left lateralized for both L1 and L2. They found no significant difference between the
lateralization of L1 and L2, as assessed by a lateralization index (LI). They conclude that in highly
proficient bilinguals, there is strong concordance for cerebral lateralization of first and second
languages. Although the study was competently performed, there are some concerns about the
conceptual planning of the study and the application of fTCD.

Conceptual foundations of the study: There are many aspects of functional neuroanatomy that
might differ between L1 and L2, for instance the recruitment of brain regions, the strength of
brain activity in specific regions or the connectivity between language regions. Hemispheric
lateralization is only one aspect. It might have been useful to explain why the authors assessed in
particular hemispheric dominance, it might have been useful to state why they anticipated that
language lateralization is stronger for a bilingual person’s first language than for the second
language, and it might have been useful to explain what the authors would have
concluded when they had found significant differences between the lateralization of L1 and L2 –
expect that there are significant differences. To interpret non-significant differences, as in the
present study, it is also necessary to explicitly state how strongly the LI would be expected to differ
between L1 and L2. What is a minimal difference that would have been considered as relevant? It
is also not clear whether the authors intended to assess differences between L1 and L2 on a group
level or in individual subjects? What would be the putative role of interindividual differences? In
summary, in the present version of the manuscript a theoretical concept is completely missing.
Without this concept, it is not possible to properly interpret the findings. The manuscript gives the
impression that the authors were just looking for differences in a rather exploratory way.

Application of fTCD: Before performing a study, it might be a useful exercise to ask whether the
imaging technique used is a suitable tool to answer the question asked. I have serious doubts that
fTCD can be applied for that purpose. The authors expect to find differences between the
lateralization of L1 and L2. It is important to know whether the technique is sufficiently sensitive to
find differences, if they exist. As mentioned before, the authors do not explicitly state what
differences they expect. In our opinion, it is rather unlikely that the hemispheric dominance (left,
right, bilateral) of L1 and L2 will be different. If a subject is for instance left dominant for L1, we do
not expect that she will be right dominant for L2. The expected differences will most likely be on a
smaller scale. A subject that is left dominant for L1 might be a bit less left dominant for L2. Is fTCD
able to find these differences? Unfortunately, there are no methodological studies that assessed
how sensitive fTCD is to find potentially small differences in the degree of lateralization. We
certainly agree that fTCD is a useful tool to determine hemispheric dominance (that is, left- or
right-hemispheric lateralization). It is, however, unknown if the technique can be used to also
assess small differences in the degree of hemispheric lateralization. Large methodological studies
in this regard, in particular from independent groups (i.e., not from the developers of AVERAGE),
are missing. One might also ask why the developers report correlations between fTCD and other
techniques (such as fMRI or the Wada test) as high as r~0.9 (and even much higher), when it is not
possible to reproduce these findings even with the same modality. Furthermore, fTCD assesses
blood flow velocity changes in the vascular territory of the left and right middle cerebral artery.
This territory however shows a high interindividual variability. While one might argue that main
network nodes of the language system, such as “Broca’s area”, lie within this territory in all
subjects, other regions that are also active during the task might be included in the calculation of
the LI in some subjects, but not in others. What are the consequences when one compares a LI
between subjects? In summary, it is unclear whether fTCD is sensitive enough to measure small
differences between the lateralization of L1 and L2.

To conclude, the study deals with an interesting topic and is competently performed. However, the
theoretical foundation should be described in more detail, the expected difference between the LI
of L1 and L2 should be reported, and it should be made clear that fTCD is able to measure the
expected differences at all.

Competing Interests: No competing interests were disclosed.

We confirm that we have read this submission and believe that we have an appropriate level
of expertise to state that we do not consider it to be of an acceptable scientific standard, for
reasons outlined above.

Author Response 24 Jun 2021

Dorothy Bishop, University of Oxford, Oxford, UK

‘explain why the authors assessed in particular hemispheric dominance’ – many reasons for
differences in L1 and L2: recruitment of brain regions, strength of brain activity, connectivity
between language regions

We now make it clear that we recognise that there are potentially many ways in which
language processing may differ for the two languages in bilinguals, but we do not think that
invalidates a decision to look specifically at brain lateralisation, which has previously been
discussed as potentially differing between languages.

State why it was anticipated that language lateralization is stronger for a bilingual person's L1
than for L2. Explain what the authors would have concluded when they had found significant
differences between the lateralisation of L1 and L2. Explicitly state how strong the LI would
be expected to differ between L1 and L2.

We now go into more detail regarding predictions from prior literature. The prediction of
discrepant laterality between languages was not strong: In the literature, there are reports
of both the same strength of lateralisation for L1 and L2 and also reduced lateralisation for
L2.  A finding of significant difference in lateralisation between L1 and L2 would have lent
further support to one side of this debate. 

What is a minimal difference that would have been considered as relevant? There are issues with
interpreting non-significant findings.

As well as reporting Bayes Factors for mean comparisons, we have now conducted further
analysis using the Bland-Altman method, which is specifically designed to address this issue.
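
As a rough illustration of the Bland-Altman approach mentioned above, the sketch below computes the mean difference (bias) between paired L1 and L2 laterality indices and the conventional 95% limits of agreement (bias ± 1.96 standard deviations of the differences). The function name `bland_altman` and the six pairs of values are invented for illustration and are not data from the study; the analysis scripts actually used may differ in detail.

```python
from statistics import mean, stdev


def bland_altman(li_first, li_second, z=1.96):
    """Return (bias, lower limit, upper limit) for paired laterality indices."""
    if len(li_first) != len(li_second):
        raise ValueError("paired measurements must have equal length")
    # Per-participant difference between the two measurements
    diffs = [a - b for a, b in zip(li_first, li_second)]
    bias = mean(diffs)
    # Limits of agreement: bias +/- z * sample SD of the differences
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Made-up laterality indices for six hypothetical participants
li_l1 = [2.1, 3.4, 1.2, -0.5, 2.8, 4.0]
li_l2 = [1.9, 3.0, 1.5, -0.2, 2.5, 3.6]

bias, lower, upper = bland_altman(li_l1, li_l2)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

If the limits of agreement fall within a range judged in advance to be unimportant, the two measures can be treated as interchangeable for practical purposes. That is a different question from whether their means differ significantly.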

It is also not clear whether the authors intended to assess differences between L1 and L2 on a
group level or in individual subjects? What would be the putative role of interindividual
differences?

This is a within-subjects study, with each person tested in both their languages, so the
differences are evaluated in individual subjects. The correlations that are reported depend
on there being individual differences in the extent of lateralisation. The result, therefore,
hinges on interindividual differences.

Important to know if technique is sensitive to find differences if they exist – are there
methodological studies to assess the sensitivity of fTCD to small differences in lateralisation?  ' I
have serious doubts that fTCD can be applied for that purpose. The authors expect to find
differences between the lateralization of L1 and L2. It is important to know whether the technique
is sufficiently sensitive to find differences, if they exist.'

Since this study was conducted, we have reported a study of test-retest reliability of
laterality indices assessed using fTCD, which we now cite.  They are high enough to give
confidence that the degree, as well as direction of laterality measured this way, is
reasonably stable. See Woodhead, Z. V. J., Bradshaw, A. R., Wilson, A. C., Thompson, P. A., &
Bishop, D. V. M. (2019). Testing the unitary theory of language lateralization using functional
transcranial Doppler sonography in adults. Royal Society Open Science, 6(3), 181801.
https://doi.org/10.1098/rsos.181801.
The reviewers clearly have a very negative impression of fTCD as a measure of laterality, but
it's unclear what they prefer. The Wada technique is a blunt instrument that is useful in
clinical contexts for making a basic distinction between left, right and bilateral, but it is
neither feasible nor useful for measuring degrees of lateralisation. With fMRI one can
quantify the LI, but the results will depend on the statistical approach (e.g. height or extent
of statistic, %signal change etc), ROI studied and on thresholding. The kinds of individual
difference in vasculature that the reviewers mentioned may well affect the observed LI - we
now make that point in the Discussion. However, this will be as true for measures from fMRI
as for fTCD, and in addition, with fMRI, the issue is complicated by the possibility of
individual differences in localisation of language regions. 
So, while we accept that fTCD is not perfect, neither are other methods, and part of our goal
in ongoing research is to use them as complementary methods. Indeed we regard it as a
worthwhile endeavour in future to consider how far the LI in fTCD relates to anatomical
variation. But we don't see any of these as reasons to dispense with the results we have
obtained, which we regard as part of a complex pattern of evidence on these issues.

Why did the developers report correlations between fTCD and other techniques (such as fMRI or
Wada test) as high as r~0.9, when it is not possible to reproduce these findings even with the
same modality?

We cannot say why the Münster group who developed fTCD reported these correlations. 
Our work is independent of theirs and we have not used the Average software for some
years, though the processing steps we adopt are largely the same. The correlations they
originally reported were based on small sample sizes and would have large confidence
intervals around them. In addition, language laterality, as conventionally measured,  is
usually not normally distributed and should be evaluated with a nonparametric correlation
coefficient. We hope to obtain data on larger samples in future that will provide more solid
evidence on the relationship between lateralisation as assessed by fTCD and fMRI.
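
To make the point about small samples concrete, the following sketch computes an approximate 95% confidence interval for a Pearson correlation using the Fisher z-transformation. The sample sizes are hypothetical and chosen only for illustration; they are not the sample sizes of the original validation studies. The method also assumes bivariate normality, which, as noted above, may not hold for laterality indices.

```python
import math


def pearson_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson r via Fisher's z."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)            # Fisher z-transform of r
    se = 1 / math.sqrt(n - 3)    # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical sample sizes, for illustration only
for n in (10, 30, 100):
    low, high = pearson_ci(0.9, n)
    print(f"n={n}: r=0.90, 95% CI approx. ({low:.2f}, {high:.2f})")
```

The interval narrows markedly as n grows, which is why a correlation of about 0.9 from a small sample is compatible with a much weaker underlying relationship.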

Competing Interests: No competing interests were disclosed.

Reviewer Report 05 December 2016

https://doi.org/10.21956/wellcomeopenres.10640.r18023

© 2016 Green D et al. This is an open access peer review report distributed under the terms of the Creative
Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium,
provided the original work is properly cited.

David W. Green
Experimental Psychology, Faculty of Brain Sciences, University College London, London, UK
Tom Hope
Wellcome Trust Centre for Neuroimaging Institute of Neurology, University College London,
London, UK

This is a succinctly written paper reporting the novel results from a non-invasive technique
(functional transcranial Doppler ultrasound) that examines changes in blood flow velocities in the
left and right middle cerebral arteries in response to a cued word production task in a person’s native
language (L1, either French or German) and in their second language (L2, English). The
participants were young proficient bilingual speakers immersed in an English context. The aim
was to examine the degree of lateralisation in response to this task in L1 and in L2.  The data are
appropriately analysed with suitable correction for the number of comparisons made where
required.
 
Rationale
It is important to deploy non-invasive methods that can be used to assess brain response for
particular tasks in children and in adults. The specific question addressed concerns the extent to
which L1 and L2 reveal a comparable pattern of asymmetry as revealed by the measure of blood
flow velocity.

It is worth noting that both hemispheres play a role in speech processing in monolingual speakers.
Functional imaging data are consistent with the idea that regional activation during speech
production is bilateral for motor, premotor, subcortical, and superior temporal regions whereas
middle frontal activation is predominantly left lateralised (Price, 2010). As the authors correctly
note, neuroimaging data strongly implicate common regions in the processing of L1 and L2.
Indeed from a neurocomputational point of view, there is no reason to envisage that the
processing of a second language would recruit radically distinct regions (Green, 2003). Instead,
different languages may recruit different microcircuits within common regions (e.g., Paradis,
2004).  We should then expect differences attributable to the distinct phonological and syntactic
properties of words in different languages and commonalities in terms of their reference to
common entities. Consistent with this possibility, Correia et al. (2014), using multi-voxel pattern
analysis, reported discriminating neural response in multiple temporal, parietal and frontal
cortical regions to individual spoken animal nouns (horse/duck) in English and Dutch combined
with an invariant response pattern to the translation equivalents (paard/eend) indicative of access
to common semantic/conceptual knowledge in regions such as the anterior temporal pole. In
modelling recovery post-stroke, we found that models implicating the same brain regions were
equally predictive for both monolingual and bilingual speakers displaying parallel recovery
patterns (Hope et al., 2015). Evidence for selective recovery post-stroke does not contradict this
position, but rather points to a difficulty in control (Green, 2008). Detailed determination of this
possibility in the context of speech production awaits future research. However, the Wada test
(using injection of intracarotid amobarbital), referred to by the authors as the gold standard in
determining lateralization, strongly implicates left hemisphere representation for both languages
of a bilingual speaker (e.g., Rapport, Tan & Whitaker, 1983).  A non-invasive method as reported
here provides a useful adjunct despite its noted limitations in terms of identifying the microcircuits
involved.
 
Participant information
Self-reported proficiency does generally correlate reasonably well with more objective measures
as the authors note. Nevertheless, it is usually desirable to report such objective measures. For
instance, for vocabulary measures the various tests under the rubric of LexTALE offer a good
source (Brysbaert, 2013; Lemhöfer & Broersma, 2012). There are also Quick Placement tests to
assess syntactic knowledge.

Procedure
Given the experimenter spoke English all the time, how was the transition to the word generation
task managed, in particular the switch from describing a picture in L1 to the naming task?

Data analysis
The word generation task involved an interval for the silent generation of words in response to a
cued letter (15 secs) followed by a 5 second recall interval. Although this interval is short and so
constrains information on relative difficulty, it is of interest to know the mean scores and their
variance. If there is variance, does such variance have detectable effects on the signal?

Estimates of reliability
The authors nicely use odd-even trials to estimate signal reliability for the asymmetry index. This
estimate proved significant for the production task in the native language (L1) but not for the
second language, English (L2). If there is no asymmetry difference then shouldn’t there be a
significant correlation when alternate trials are taken from different language runs?

Competing Interests: No competing interests were disclosed.

We confirm that we have read this submission and believe that we have an appropriate level
of expertise to confirm that it is of an acceptable scientific standard.

Author Response 24 Jun 2021


Dorothy Bishop, University of Oxford, Oxford, UK

Comments under Rationale


We thank the reviewers for providing an overview of the literature on microcircuitry, which
we now mention in the Discussion.

Usually desirable to report objective measures of language proficiency


Please see response to reviewer 1. We did not have such measures for Study 1, but we do
for Study 2.

Given the experimenter spoke English all the time, how was the transition to the word generation
task managed, in particular the switch from describing a picture in L1 to the naming task?
Alas, this was not noted at the time of data collection for Study 1 and we do not have a
record of how this was handled, though we do state that the examiner used English
throughout. For Study 2, the experimenter switched instruction languages according to the
tested languages.

Data analysis. The word generation task involved an interval for the silent generation of words
in response to a cued letter (15 secs) followed by a 5 second recall interval. Although this interval
is short and so constrains information on relative difficulty, it is of interest to know the mean
scores and their variance. If there is variance, does such variance have detectable effects on the
signal? 
We were not sure we had interpreted this correctly; in Study 1 words were generated
covertly, therefore we did not have a record of responses. However, in previous studies with
monolinguals, we have specifically considered whether varying task difficulty affects
laterality. Where difficulty is varied by constraining the task (requiring words starting with 2
specific letters rather than one), this reduced performance but did not affect the LI.
(Badcock, N. A., Nye, A., & Bishop, D. V. M. (2012). Using functional transcranial Doppler
ultrasonography to assess language lateralisation: Influence of task and difficulty level.
Laterality, 17(6), 694–710. https://doi.org/10.1080/1357650X.2011.615128). In Study 2
subjects generated words overtly and we report data on number of words produced.  There
was no relationship between number of words generated and LI. 

Estimates of reliability The authors nicely use odd-even trials to estimate signal reliability for
the asymmetry index. This estimate proved significant for the production task in the native
language (L1) but not for the second language, English (L2). If there is no asymmetry difference
then shouldn’t there be a significant correlation when alternate trials are taken from different
language runs? 
We were also puzzled by the differing estimates of split-half reliability. As it turns out, when
we reanalysed the data for this version using our current analysis scripts, the estimates of
split-half reliability were more similar for the two languages. For L2, the original analysis gave
r = .28. With our new method, one participant met criteria as an outlier and was excluded,
we used Spearman rather than Pearson correlation, and we based the LI on the mean
rather than the peak of the difference waveform; this gives r = .60. Please note that the analytic
decisions leading to these changes were made a priori: we used the scripts and outlier
exclusion criteria that we documented in Woodhead et al (2019). We list here how each
modification of the method affected the correlation:
- Discarding one participant with noisy data (participant 14): r = .44
- Using Spearman’s correlations instead of Pearson’s: r = .49
- Using the mean LI method instead of peak: r = .60
We feel this provides further justification for basing analyses on mean rather than peak
values: the latter can be more noisy, especially if the data do not show a single pronounced
peak.
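
The odd-even split-half procedure discussed above can be sketched as follows. This is a simplified illustration, not the analysis code documented in Woodhead et al (2019): it assumes each trial is already a preprocessed, baseline-corrected velocity trace, it takes the LI as the mean left-minus-right difference within a period of interest, and all function names and numbers are hypothetical.

```python
import math
from statistics import mean


def mean_li(left, right, start, end):
    """Mean left-minus-right difference within the period of interest [start, end)."""
    return mean([l - r for l, r in zip(left[start:end], right[start:end])])


def average_trials(trials):
    """Average equal-length trials sample by sample."""
    return [mean(samples) for samples in zip(*trials)]


def split_half_lis(left_trials, right_trials, start, end):
    """Return (LI from odd-numbered trials, LI from even-numbered trials)."""
    halves = []
    for offset in (0, 1):  # offset 0: trials 1, 3, 5...; offset 1: trials 2, 4, 6...
        left_avg = average_trials(left_trials[offset::2])
        right_avg = average_trials(right_trials[offset::2])
        halves.append(mean_li(left_avg, right_avg, start, end))
    return halves[0], halves[1]


def ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up per-participant LIs from odd and even trials (five hypothetical participants)
odd_lis = [1.0, 2.0, 3.0, 4.0, 5.0]
even_lis = [1.2, 1.9, 3.5, 3.3, 6.0]
print(f"Spearman split-half correlation: {spearman(odd_lis, even_lis):.2f}")
```

Because the mean is taken over every sample in the period of interest, it is less affected by a single noisy sample than a peak-based LI, which is consistent with the rationale given above for preferring mean values.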

Competing Interests: No competing interests were disclosed.

Reviewer Report 05 December 2016

https://doi.org/10.21956/wellcomeopenres.10640.r17649

© 2016 Brysbaert M. This is an open access peer review report distributed under the terms of the Creative
Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium,
provided the original work is properly cited.

Marc Brysbaert
Department of Experimental Psychology, Ghent University, Ghent, Belgium

It pains me to have to write this review. The research done is good and reliable (and hence the
Wellcome Trust may decide to publish it), but the question addressed is futile and the methods
used far from optimal. Therefore, I fear that if this article is indexed to PubMed Central, it will not
do the authors much good.

For a start, the authors had anticipated that language lateralization might be stronger for a
bilingual's first language than for the second language. In 1992, Paradis already called this the
Loch Ness Monster of research on laterality and bilingualism. There is no sound evidence
whatsoever that L2 processing would be less lateralized than L1 processing (as the authors indeed
found). There is even very good evidence that as L2 proficiency increases, it increasingly uses the
very same brain areas as L1 processing. Only at low levels of L2 proficiency can one sometimes
see extra right and left hemisphere activity, arguably because the participants are using all types
of strategies, including non-language ones.

Second, the authors are using the crudest neuroscientific technique available, fTCD. As they say, it
is cheap, it can be applied easily (but leads to a considerable loss of participants), but it is also very
crude, as it only compares blood flow to the left vs. the right hemisphere. In the present study,
the reliability is good (except for L2 processing), but even so it remains a technique that only can
tell you something about more left than right processing, nothing more. So, in the end the
authors are investigating a strawman hypothesis with an uninformative technique.

Third, there are power issues. A lot of subject-related variables are tested on a group of 26
participants. Luckily the authors did not find anything significant, because any significance they
would have found, would have been very likely due to a statistical fluke, which cannot be
replicated (see papers by Gelman). 

Fourth, individual differences are thought to be of interest. Still, they are studied with subjective
scales. Why not measure the proficiency with a vocabulary test (e.g., LexTALE, Shipley)? Why use
Likert scales? In several studies (involving the French and Spanish Lextale tests), I've reported that
although there is a good correlation between subjective estimates on the basis of a Likert scale
and the Lextale scores, for individual participants there can be a big difference, because
participants use different comparison groups (e.g., L2 learners compare their performance to
other L2 learners, not to L1 speakers). If the authors want to keep on using subjective measures,
they may want to try descriptions as the levels defined by the European framework.

As said, if the goal of the Wellcome Open Research initiative is to make all reliable empirical data
available, I am not against publication. However, for the above reasons I do not think this
publication will do the authors (nor the journal) much good. The only bit of information I found of
value was Figure 4. Even then it would be good to see this supported by fMRI validation. I know
the Knecht group did so, but still I'd like to see it done in other groups as well. I'm very curious, for
instance, to what extent the bilateral patterns are valid. We rarely see them in fMRI research. 

Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of
expertise to confirm that it is of an acceptable scientific standard, however I have
significant reservations, as outlined above.

Author Response 24 Jun 2021


Dorothy Bishop, University of Oxford, Oxford, UK

Brysbaert, while unenthusiastic about our original paper, accepted that "if the goal of the
Wellcome Open Research initiative is to make all reliable empirical data available, I am not
against publication." Nevertheless, he queried whether the study was worth doing, given
that prior work with fMRI had not found discordant laterality for two languages in
bilinguals. In addition, it was felt that reliance on self-reported proficiency was non-optimal,
and that there were concerns about concluding a lack of difference between languages
when sample size, and hence statistical power, was low. He also expressed misgivings about
the method we had used, functional transcranial Doppler ultrasound.

1. Prediction of different laterality in L1 and L2 is futile. We already know that is not the case. 

If we understood this point correctly, the reviewer is arguing that the fact that laterality is
the same in L1 and L2 for proficient bilinguals is so well established that it is pointless to
provide a further demonstration of the point. We disagree. The literature has not always
been consistent and most studies are small, so an accurate picture may only become clear
when there is sufficient information for a meta-analysis. Our aim was to use fTCD to
contribute to this literature.  We accept that the reviewer has a very low opinion of fTCD, but
we do not think this is justified, and indeed would argue that the strong correlations
between L1 and L2 laterality indices obtained with this method provide some evidence that
individual variation in degree of lateralisation is meaningful.

2. fTCD is not sensitive enough to detect relevant hemispheric differences

We now provide more arguments in support of fTCD. We note also reviewer 2's comment: '
A non-invasive method as reported here provides a useful adjunct despite its noted
limitations in terms of identifying the microcircuits involved.'  Brysbaert states that fTCD
only tells you about left- vs right hemisphere blood flow. That's exactly what we are
interested in, so this criticism does not seem valid. We should stress we are not making
massive claims for fTCD - it clearly has its limitations -  but dismissing a study on laterality
just because it uses this method seems premature.  It is one tool in the range of possible
methods: we need to do more work with all of them (behavioural, fTCD, fMRI) to study how
they relate to one another and how reliable and sensitive they are, in order to make
progress in laterality research.  This is exactly what we are doing in our current research
programme. 

3. ‘there are power issues. A lot of subject-related variables are tested on a group of 26
participants.’

We agree that there are power issues. Brysbaert claims the result is unlikely to replicate. We
have now included Study 2 - this confirms that the result does replicate, and generalises to
another language and task (semantic fluency). 

 
In terms of the subsequent exploratory analysis of correlations with other measures, as
noted by reviewer 2: ' The data are appropriately analysed with suitable correction for the
number of comparisons made where required.' (our emphasis). However, we agree the
sample is too small for sensible exploratory analyses, and have now modified our focus to
the Age of Acquisition effect, which is a matter of some debate in the literature.  Other
variables are reported for completeness, but we agree that it is not sensible to report all
correlations in the absence of a priori predictions.

4. ‘individual differences are thought to be of interest. Still, they are studied with subjective scales.
Why not measure the proficiency with a vocabulary test (e.g., LexTALE, Shipley)? Why use Likert
scales? In several studies (involving the French and Spanish Lextale tests), I've reported that
although there is a good correlation between subjective estimates on the basis of a Likert scale
and the Lextale scores, for individual participants there can be a big difference, because
participants use different comparison groups (e.g., L2 learners compare their performance to
other L2 learners, not to L1 speakers). If the authors want to keep on using subjective measures,
they may want to try descriptions as the levels defined by the European framework’

We agree. This point was also made by Reviewers 2, though they also pointed out that
self-reported proficiency correlates reasonably well with more objective measures. A strength of
Study 2 is that it used more objective measures of language proficiency.  We thank the
reviewer for the excellent suggestions for measures to be used in future work.

Competing Interests: No competing interests were disclosed.

 