Articles

https://doi.org/10.1038/s41562-021-01118-4

Training spatial cognition enhances mathematical learning in a randomized study of 17,000 children
Nicholas Judd and Torkel Klingberg ✉

Spatial and mathematical abilities are strongly associated. Here, we analysed data from 17,648 children, aged 6–8 years, who
performed 7 weeks of mathematical training together with randomly assigned spatial cognitive training with tasks demanding
more spatial manipulation (mental rotation or tangram), maintenance of spatial information (a visuospatial working memory
task) or spatial, non-verbal reasoning. We found that the type of cognitive training children performed had a significant impact
on mathematical learning, with training of visuospatial working memory and reasoning being the most effective. This large,
community-based study shows that spatial cognitive training can result in transfer to academic abilities, and that reasoning
ability and maintenance of spatial information are relevant for mathematics learning in young children.

Spatial ability is closely associated with performance in science, technology, engineering and mathematics1. For example, the ability to mentally rotate a figure in the mind correlates with current performance in mathematics and predicts later learning in children1–3. Similar associations exist for other spatial tasks, including visuospatial working memory (VSWM)4,5 and spatial, non-verbal reasoning (NVR)6.

Consequently, it has been suggested that improving spatial abilities might be a way to enhance mathematical learning7–9, with some teacher organizations going as far as to place equal emphasis on spatial training and numerosity10. However, the mechanisms of spatial training are still unclear, and there is a lack of large, randomized studies assessing its effect on mathematical performance9.

Spatial tasks involve at least three aspects: creating an internal spatial representation; maintaining it; and manipulating it11. A mental rotation task involves representation and manipulation, but minimal maintenance. In contrast, a VSWM task puts a higher demand on maintenance, but less on manipulation.

This distinction is at the heart of two hypotheses on why training spatial abilities could transfer to improvements in mathematics. The first hypothesis is that manipulation is the critical aspect, and that rotation training would therefore be more effective than training on VSWM2,8. In the second hypothesis, it is the maintenance of spatial information that is critical, and training on VSWM should therefore be superior11,12. It is also possible, of course, that both hypotheses are wrong and spatial training does not transfer to mathematics at all.

Previous literature gives mixed support for both hypotheses of transfer. Several studies have shown a positive impact of rotation training on mathematics13–15 while others found no effect16–18. Similarly, training of VSWM improved mathematical outcomes in children in some studies19–21, although there have also been negative findings22,23. Given these mixed results, some suggest that cognitive training, in general, does not transfer at all to academic abilities24.

This broad lack of consensus in the cognitive training literature could be due to inadequate statistical power25 along with meta-analyses combining different training methods, populations and outcome measures26. Perhaps most importantly, there exists a lack of large, randomized studies.

Here, we report data from over 17,000 children who engaged in various forms of mathematical and cognitive training. We asked two questions that are of both theoretical and practical relevance: (1) does spatial cognitive training impact mathematical learning, and if so is it more effective to train mental rotation or VSWM; and (2) are there inter-individual characteristics that predict the optimal type of cognitive training to enhance mathematics?

Although our main hypothesis relates to the contrast between mental rotation and VSWM, we also included training of spatial NVR. NVR tasks rely heavily on one’s ability to find visuospatial patterns and determine how these patterns interrelate spatially27. They were added based on the high correlation between NVR and mathematical ability6 and some evidence that such training can enhance problem-solving ability28–30.

We conducted our study via modifications on a freely available app (Vektor), using number line tasks to train mathematics. The app was used as an extra, voluntary activity in school, organized by teachers. When using the app, children spent half of their time with number line tasks, as the efficacy of this training was established for this app31 as well as for similar training using the number line32,33. The remaining time was allotted, by randomization, to the training of rotation tasks (two-dimensional (2D) mental rotation and tangram), VSWM or NVR (Fig. 1 and Supplementary Fig. 1). In the first, fifth and seventh week, children performed self-administered tests of mathematics (addition, subtraction and number comparison). For feasibility reasons, mathematical tests had to be self-administered, but this enabled us to perform a large, comprehensive (12–20 h of training) and ecologically valid training study.

Results
Over the course of 7 weeks, children completed cognitive training for either 20 or 33 min d−1. Our final sample consisted of 17,648 children between the ages of 6 and 8 years, who on average completed 5,077 trials (s.d. = 1,710).

To measure the baseline performance within each training domain, we designed the app to give every child an identical first week. Task performance for all tasks was moderately correlated between tasks, as well as with the mathematical transfer tests, consistent with the well-documented association between spatial cognition and mathematics (Fig. 2b).

Following this first week, individuals were randomly split into one of five training plans. Each plan involved the same amount of mathematical training (~50% of the training time), yet differed in the

Department of Neuroscience, Karolinska Institutet, Stockholm, Sweden. ✉e-mail: [email protected]

Nature Human Behaviour | www.nature.com/nathumbehav


[Figure 1. Panel a: images of the four training tasks. Panel b: stacked bars showing the percentage of time on task (y axis, 0–100) for each training plan (x axis); legend: rotation tasks (tangram, 2D mental rotation), VSWM, NVR and number line.]

Fig. 1 | Overview of the four training tasks and the percentages of time allocated to each in the five training plans. a, Images of the four training tasks:
VSWM (a visuospatial working memory task in which participants had to remember a sequence of dots on a 4 × 4 grid and, after a delay, respond to them
in order); NVR (a spatial task in which participants had to find spatial patterns among figures); 2D mental rotation (a task in which children had to choose
the shape that, when rotated, would fit into an empty silhouette); and the tangram task (a second rotation task in which children had to manipulate and
rotate several pieces to fit into an empty silhouette). b, Percentages of time allocated to each cognitive training task in each of the five training plans. Both
2D mental rotation (that is, 2D rotation) and the tangram task are classified as rotation tasks. Images adapted from Cognition Matters.
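
The VSWM task described in the caption (remember a sequence of dots on a 4 × 4 grid and respond to them in order) can be illustrated with a small sketch. This is not the Vektor implementation: the function names, the use of distinct grid cells, the one-up/one-down adaptation rule and the span limits are all assumptions made for illustration.

```python
import random

GRID_SIZE = 4  # 4 x 4 grid, as stated in the Fig. 1 caption


def make_sequence(span, rng):
    """Draw `span` distinct grid cells (row, col) in presentation order.

    Using distinct cells is an assumption; the source does not say
    whether a location can repeat within a sequence.
    """
    cells = [(r, c) for r in range(GRID_SIZE) for c in range(GRID_SIZE)]
    return rng.sample(cells, span)


def is_correct(target, response):
    """A trial counts as correct only if all cells are given in the same order."""
    return list(response) == list(target)


def next_span(span, correct, min_span=2, max_span=GRID_SIZE * GRID_SIZE):
    """Hypothetical adaptive rule: one step up after a hit, one down after a miss."""
    step = 1 if correct else -1
    return max(min_span, min(max_span, span + step))


rng = random.Random(0)           # fixed seed so the sketch is reproducible
target = make_sequence(3, rng)   # a three-dot sequence to remember
```

A perfect response such as `is_correct(target, target)` passes, whereas the same cells in reverse order fail, which is what makes the task a test of ordered maintenance rather than recognition.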

[Figure 2. Panel a: density of mathematics factor scores (x axis: factor score in s.d., −2 to 4; y axis: density) for the first, fifth and seventh week. Panel b: Pearson correlation matrix for first-week NVR, 2D rotation, VSWM, number line and baseline mathematics; rows as printed: NVR 1.00; 2D rotation 1.00, 0.43; VSWM 1.00, 0.51, 0.41; number line 1.00, 0.63, 0.55, 0.63; baseline mathematics 1.00, 0.58, 0.53, 0.40, 0.25. Panel c: training curves (x axis: training day, 0–30; y axis: mean level) for VSWM, number line, 2D rotation and NVR.]

Fig. 2 | Predicted factor scores by test week, correlations with baseline mathematics and training curves of the density of children for each difficulty
level and day of training. a, Density plot (n = 17,648) of predicted factor scores from three tests of mathematics by test week (Supplementary Fig. 3
shows individual tests). b, Correlations (n = 16,484) of the mean correct difficulty level for the training tasks in the first week with baseline mathematics
(a participant-specific intercept of the mathematics factor). All correlations were significant (P < 0.001). c, Training curves (maximum n = 17,647) showing
the color-coded density of children for each difficulty level and day of training. The light blue diamonds denote the mean correct level per day of each of
the corresponding tasks.
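
For readers who want to reproduce this kind of correlation summary, the sketch below computes a Pearson correlation and an approximate 95% confidence interval using the Fisher z-transformation. The choice of the Fisher method is an assumption, since the paper does not state how its intervals were computed, and the function names are illustrative. The example values (r = 0.70, n = 106) are those of the online versus experimenter-administered validity check reported in the Results.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r via the Fisher z-transformation.

    Assumes roughly bivariate-normal data; standard error is 1 / sqrt(n - 3).
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Validity correlation reported in the Results: r = 0.70 with n = 106
low, high = fisher_ci(0.70, 106)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

Under these assumptions the interval comes out at roughly 0.59 to 0.79, close to the reported 0.58 to 0.78.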



proportions of cognitive training tasks (Fig. 1b and Supplementary Table 1). All training tasks were adaptive, so children progressed to gradually more difficult levels (Fig. 2c and Supplementary Fig. 2).

There was a significant improvement, as judged by an increase in the average difficulty level, in all training tasks between the first and seventh week. The improvement (in s.d.) was 0.57 for VSWM (t(33,752) = 127.20; P < 0.001), 0.73 for the number line task (t(34,428) = 196.56; P < 0.001), 0.82 for 2D mental rotation (t(24,388) = 226.80; P < 0.001), 0.80 for NVR (t(18,533) = 185.39; P < 0.001) and 0.53 for the tangram task (t(28,923) = 105.60; P < 0.001).

We combined the performance on the three transfer tests using a confirmatory factor analysis (CFA), which fit the data well (comparative fit index (CFI) > 0.95; root mean square error of approximation (RMSEA) < 0.08; Supplementary Figs. 3 and 4 and Supplementary Table 3). The average improvement from the first to the fifth week was 0.45 s.d., and from the first to the seventh week it was 0.56 s.d. (Fig. 2a and Supplementary Table 2), showing clear improvement but also suggesting a diminished return for the last 2 weeks of training compared with the previous weeks.

To estimate the validity of these online tests, we compared the performance of 106 children (aged 6–8 years) who took both online and experimenter-administered tests of addition and subtraction and the Wechsler Intelligence Scale for Children Fourth Edition (WISC-IV) verbal arithmetic test, which were also combined using factor analysis. We found a high correlation (r = 0.70; 95% confidence interval (CI) = (0.58 to 0.78); P < 0.001) between factors from online and experimenter-led tests, supporting the use of online tests as outcome measures (Supplementary Fig. 5).

We proceeded to use the CFA scores in a linear, mixed-effects model with a random intercept and slope for each child (Supplementary Tables 4a–c). The intercept and slope were moderately correlated (r = 0.28; change in Akaike information criterion (∆AIC) = 678; χ²(1) = 680.6; P < 0.001), showing that children with higher starting scores improved more from training (that is, there was a Matthew effect).

To evaluate the differences between the spatial cognitive training tasks, we used three complementary analyses: (1) an omnibus test of the effect of training plans (Fig. 1b), coded categorically; (2) a model accounting for the proportions of rotation training (including both mental rotation and the tangram task) and NVR within those plans, with the default plan (with ~50% VSWM) as a reference (Supplementary Table 1); and (3) modelling of the size of the effect per minute spent with each of the cognitive tasks on the slope of mathematical improvement. All three analyses thus addressed the same question (that is, whether the type of cognitive training impacts mathematical learning), but analysed cognitive training in terms of plans, categories or single tasks.

In the first analysis, a model with training plans interacting with test week predicted mathematical improvement significantly better than a model without (∆AIC = 24; χ²(4) = 32.24; P < 0.001) (Supplementary Table 4a,b and Supplementary Fig. 6). The largest improvement was seen in the NVR group (0.602 units) and the smallest improvement was seen in the rotation-heavy group (0.540 units); that is, there was an 11.5% larger effect of training, or slightly more than the effect of increasing the total training time (including the number line task) from 20 to 33 min d−1.

In the second analysis, to evaluate the effect of differing amounts of cognitive training, we coded for time spent on rotation (combining the time for 2D mental rotation and tangram training) or NVR (Supplementary Table 4a,c). In this model, VSWM was coded as the intercept; therefore, the effects of NVR and rotation are in reference to active VSWM training. Adding terms for NVR and rotation interacting with test week improved the model fit (∆AIC = 24; χ²(2) = 28.23; P < 0.001), showing that mathematical improvement depended on the amount of NVR and rotation training. NVR training showed a positive effect (b = 0.023; 95% CI = (0.003 to 0.042); P = 0.024) in reference to only VSWM training. In contrast, children with rotation training improved less (b = −0.023; 95% CI = (−0.033 to −0.013); P < 0.001), again compared with those with VSWM training instead. This shows that removing VSWM training to incorporate rotation had a negative outcome on children’s mathematics abilities, while NVR had a positive effect.

To evaluate whether the two different rotation tasks (2D rotation and the tangram task) had differential effects, we performed a third analysis in which we evaluated the impact of each of the different cognitive tasks separately on mathematical outcomes. Participant-specific slopes of mathematical improvement were extracted. We then calculated the amount of time, in minutes, spent on different training tasks. These values were entered into a linear model, with the slope as the outcome and covarying for participants’ baseline mathematical abilities (that is, participant-specific intercepts).

The amounts of time spent training with NVR, VSWM, 2D mental rotation or the tangram task all positively predicted mathematical improvement. This test was in line with the previous two analyses, as the amount of NVR training (b = 0.030; 95% CI = (0.020 to 0.039); P < 0.001) showed the largest effect, followed by VSWM (b = 0.021; 95% CI = (0.017 to 0.025); P < 0.001) and then 2D mental rotation (b = 0.012; 95% CI = (0.004 to 0.020); P = 0.005) and the tangram task (b = 0.008; 95% CI = (0.001 to 0.016); P = 0.040). The difference between 2D mental rotation and the tangram task was not significant (Wald test; F(1) = 0.34; P = 0.56), yet both differed compared with VSWM (Wald test for rotation: F(1) = 5.32; P = 0.02; Wald test for the tangram task: F(1) = 11.53; P < 0.001) and NVR (Wald test for rotation: F(1) = 10.01; P = 0.002; Wald test for the tangram task: F(1) = 13.02; P < 0.001).

All three analyses were thus consistent in finding a significant impact of the type of cognitive training performed, with significantly larger effects of VSWM and NVR compared with rotation training for the mathematics transfer tests (Fig. 3).

Our second question was to assess whether there were inter-individual characteristics at baseline that predicted the optimum type of cognitive training to enhance mathematics. For example, our previous analysis showed that rotation training is less effective

[Figure 3. Bar chart of improvement per minute (y axis, 0–0.05) for the tangram, 2D rotation, VSWM and NVR tasks (x axis), with pairwise comparison brackets labelled P < 0.001, P = 0.02, P = 0.55 and P = 0.06.]

Fig. 3 | Effects of the training tasks on mathematical improvement. Impact of the different training tasks on inter-individual mathematical improvement (that is, random slopes), correcting for participants’ baseline mathematics (that is, random intercepts). These coefficients were predicted from a mixed-effects model (see Methods for further details; n = 17,648). The error bars indicate 95% confidence intervals. Full model results are provided in the main text.
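
The model comparisons on this page report both a likelihood-ratio statistic and a change in AIC. A minimal sketch of how the two relate, and of how an AIC difference converts to an evidence ratio, is given below. The formulas are the standard ones for nested models (∆AIC equals the χ² statistic minus twice the number of added parameters; the evidence ratio is exp(∆AIC / 2)). The function names are illustrative, and reading the reported degrees of freedom as the number of added parameters is an assumption.

```python
import math


def delta_aic(chi2, added_params):
    """AIC gain of the larger nested model: LRT statistic minus 2 per added parameter."""
    return chi2 - 2 * added_params


def evidence_ratio(d_aic):
    """Relative likelihood exp(dAIC / 2) in favour of the lower-AIC model."""
    return math.exp(d_aic / 2)


# (chi-square statistic, degrees of freedom) as reported in the Results
reported = {
    "intercept-slope correlation": (680.6, 1),
    "training plan x test week": (32.24, 4),
    "NVR/rotation time x test week": (28.23, 2),
}

for name, (chi2, df) in reported.items():
    print(f"{name}: dAIC = {delta_aic(chi2, df):.2f}")

# An AIC difference of 24 corresponds to exp(12)
print(f"evidence ratio for dAIC = 24: {evidence_ratio(24):,.0f}")
```

Under these assumptions the three computed values (about 678.6, 24.2 and 24.2) agree with the reported ∆AIC of 678, 24 and 24, and an AIC difference of 24 corresponds to an evidence ratio of exp(12), on the order of 160,000.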



[Figure 4. Panels a and b: predicted mathematical improvement (y axis, s.d., 0–0.8) against baseline performance (x axis, s.d., −1.5 to 1.5). Panel a: one line per amount of VSWM training (4.3, 7.0, 9.7, 11.5 and 16.0 min). Panel b: one line per amount of NVR training (0, 2.0, 2.7, 3.3 and 4.5 min).]

Fig. 4 | Predicted performance following differing amounts of VSWM and NVR training for different levels of baseline cognitive performance.
a,b, Predicted mean improvement in mathematical performance following VSWM (a) and NVR training (b), as a function of baseline performance
and time spent training (n = 16,484). Baseline performance was measured from the first week of training, before randomization. The improvement in
mathematics was estimated from the slopes of improvement over the three tests, extracted from the mixed-effects model and corrected for baseline
mathematical performance. The error bars indicate 95% confidence intervals. Full model results are provided in the main text.
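
The fan of lines in each panel is what a baseline-by-training-time interaction looks like. The sketch below shows the general form of such a prediction. It is only an illustration: the linear form with a single interaction term is a simplification, and every coefficient is a placeholder chosen for readability, not a fitted value from the paper's model.

```python
def predicted_improvement(baseline_sd, minutes,
                          b0=0.2, b_base=0.05, b_min=0.02, b_int=0.006):
    """Improvement (s.d.) from baseline (s.d.) and daily training minutes.

    All default coefficients are hypothetical placeholders.
    b_int > 0 means extra minutes help more at higher baseline.
    """
    return (b0
            + b_base * baseline_sd
            + b_min * minutes
            + b_int * baseline_sd * minutes)


def benefit_of_more_training(baseline_sd, low_min, high_min, **coefs):
    """Difference in predicted improvement between two training amounts."""
    return (predicted_improvement(baseline_sd, high_min, **coefs)
            - predicted_improvement(baseline_sd, low_min, **coefs))


# Compare 16 versus 4.3 min at high (+1.5 s.d.) and low (-1.5 s.d.) baseline
high_baseline_gain = benefit_of_more_training(1.5, 4.3, 16.0)
low_baseline_gain = benefit_of_more_training(-1.5, 4.3, 16.0)
```

With a positive interaction coefficient the gain from extra minutes is larger at high baseline, which matches the direction of the pattern in panel a; flipping the sign of `b_int` reproduces the direction of the pattern in panel b.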

on average, but it might be that rotation training is more effective for some children. Children were characterized based on cognitive and mathematical performance in the first week (Fig. 2b). Both accuracy and average difficulty level were included, resulting in eight predictor variables from the cognitive tasks (Supplementary Table 5).

First, we evaluated predictors of mathematical improvement independent of the training plans. As shown by earlier analyses, baseline performance on the mathematics tests (that is, intercept) had a positive impact on improvement (that is, the Matthew effect). However, after correcting for baseline mathematics, children with a lower than average level in the number line task improved more in mathematics over the course of training (b = −0.12; 95% CI = (−0.15 to −0.10); P after FDR (pFDR) < 0.001). Children with a higher level of NVR (b = 0.03; 95% CI = (0.01 to 0.05); pFDR < 0.001) and higher accuracy on VSWM (b = 0.05; 95% CI = (0.03 to 0.06); pFDR < 0.001) also improved more.

Second, we tested whether differing amounts of cognitive training (VSWM, rotation or NVR) interacted with any of these baseline characteristics to predict mathematical improvement (Fig. 4). This analysis showed that children with initially higher VSWM accuracy (b = 0.006; 95% CI = (0.002 to 0.011); pFDR < 0.001) or lower NVR performance (b = −0.007; 95% CI = (−0.012 to −0.002); pFDR < 0.001) benefitted more from VSWM training. Moreover, children with lower initial NVR levels benefitted more from NVR training (b = −0.014; 95% CI = (−0.024 to −0.004); pFDR < 0.001). There were no significant interactions between any of the baseline performance measures and the amount of time spent on rotation training.

Discussion
Here, we report strong evidence that spatial cognitive training impacts mathematical learning in children. Taking the type of cognitive activity into account resulted in a model that was around 160,000 times more likely to predict mathematical improvement than a model that did not account for it (∆AIC = 24). Specifically, we found that training on VSWM was more effective than both types of rotation training (2D mental rotation and the tangram task). This suggests that, when it comes to transfer to mathematics, the crucial aspect of spatial training is maintaining a spatial representation, rather than manipulating it. This is in line with suggestions that a bottleneck for spatial cognition is the ability to maintain the spatial representation11,12 and that individuals with problems relating to mental rotation lose the image they attempt to keep in mind.

A priori, we expected rotation training to have a positive impact on mathematics learning. However, rotation turned out to have the smallest effect. Being the worst-performing condition, rotation training can be considered an active and very strict control group for the other conditions. Our analysis showed that, compared with rotation training, both VSWM and NVR training had a significant and meaningful impact on mathematics learning. In this large, randomized study, we show that cognitive training results in transfer to academic abilities, adding evidence in favour of the view that cognitive abilities are malleable34–36. We also show the relevance of such training for academic performance in an ecologically valid setting.

Our finding of transfer to mathematical learning is consistent with some previous studies19–21 but is at odds with others22,23. The largest negative study was performed by Roberts and colleagues22. One reason for the discrepancies could be that the current study included a general population while the study by Roberts et al. included only children with low working memory capacity (the lowest 15th percentile). As shown by the interaction analysis (Fig. 4a), and consistent with previous findings, the impact of VSWM training depends heavily on baseline performance31. Such dependence on baseline characteristics needs to be taken into account when interpreting the existent literature as well as planning future studies.

An interesting finding was the positive effect of NVR training. Although such training is much less researched, our finding is consistent with previous research showing NVR training to improve



cognitive abilities and academic performance28–30. Our NVR task was similar to the sequential order task from the Leiter test battery37. It would be of interest in future research to focus on specific mechanisms of transfer, to study the relationship between NVR tasks and mathematics2,38 and to evaluate training on different types of fluid-reasoning tasks.

Regarding inter-individual characteristics, children with higher initial VSWM abilities, but lower NVR performance, benefited more from VSWM training. The benefit of 16 versus 4.3 min of WM training differed almost fourfold between the high and low working memory groups (a predicted gain of 0.31 versus 0.08 s.d.; Fig. 4a). In contrast, there was a negative interaction between baseline NVR performance and the amount of NVR training (Fig. 4b). However, it is important to note that the range of NVR training time was constrained to between 0 and 4.5 min. Whether the same pattern would be observed if the NVR training time were expanded to 16 min remains to be seen. Critically, we did not find any inter-individual characteristics showing children to benefit from rotation training. These findings show how baseline characteristics can predispose children to benefit from specific types of training.

Here, we only randomized a minor part of training activities, as 70% of the time was identical for all children. Yet, slight changes in the cognitive content 30% of the time resulted in an 11.5% difference in mathematical learning, with some tasks being two to three times more effective than others (Fig. 3). Because this study did not include a passive control group, we cannot estimate the absolute effect size, only the differences between conditions. In evaluating these, it is important to consider that most educational interventions have small effect sizes: the mean of 141 interventions was recently estimated to be 0.06 (refs. 39,40). However, small effects are important if they have a repeated impact, such as improved learning processes over time41,42.

Another limitation of this study is that we could not evaluate how effective cognitive training is relative to mathematical training alone because all children received the same proportion of mathematical training. It is likely that, for any given test (for example, addition or geometry), training on that particular skill is the most time-effective way to improve test results. However, this study offers a proof of principle that spatial cognitive training transfers to academic abilities. Given the wide range of areas associated with spatial cognition (including not only other fields of mathematics, but also science, technology and engineering) it is possible that training transfers to multiple areas, which should be included in any calculation, by teachers and policymakers, of how time-efficient spatial training is relative to training for a particular test.

Methods
Implementation and inclusion criteria. This research project was approved by the Swedish Ethical Review Authority under application number 2016/136-31/1. As specified in our ethics application and in accordance with Swedish law (2003:460), informed consent was not sought from children or their guardians, as no personally identifiable information was collected or stored and the study involved no risk of harm to the participating children, all of whom stood to benefit from the training. Data were collected in collaboration with the non-profit foundation Cognition Matters (https://cognitionmatters.org/), to implement our training plans using the freely available app Vektor. Vektor is an adaptive cognitive training app primarily aimed at improving school performance in mathematics. Previous research has shown the efficacy of Vektor to improve mathematics in children31. There was no active recruitment or advertising for this study. Educators signed up their classes entirely of their own volition, including agreeing on data storage, and children’s data were automatically anonymized. Children and educators could withdraw from the study at any point. There was no compensation for participation. Educators chose the amount of training per day (either 20 or 33 min) and the app automatically logged out after the prescribed amount of time.

Children were only included if they completed 36 d of training (corresponding to the mathematical tests in week 7) and if they were between the ages of 6 and 8 years (see Supplementary Fig. 7 for attrition). The app automatically randomized each account to one of the training plans, ensuring that schools with differing demographics would have equal percentages of students in each training plan. Data were collected remotely over five semesters (August 2017 to December 2019). Each semester was coded as a cohort to be used as a covariate. In August 2018, we added two training plans (rotation heavy and rotation/NVR) and removed the NVR plan (Supplementary Table 1). The sample size corresponded to the number of children who met these criteria in the date range specified. This resulted in a total sample size of 17,648 children. Sex was not recorded, and the children (and teacher) were blind to group assignment.

Training tasks and plans. For the first 5 d, every child completed an identical proportion of training tasks: the working memory grid, number line task, NVR and 2D mental rotation (see Supplementary Fig. 1 for task examples). Each task entailed a certain proportion of guided trials to help teach the children how the task functioned. The working memory grid is a type of VSWM training in which the participant is presented with a sequence of dots in different locations on a grid and needs to accurately reproduce the sequence by touching the screen34. The app automatically adjusts the difficulty by increasing the span of items. Number line training starts simply by instructing the child to use their index fingers to drag the number line to the correct position corresponding to an Arabic numeral. The difficulty is initially moderated by removing spatial cues (for example, ticks on the number line). It then progresses to incorporate mathematical problems (for example, addition or subtraction with gradually larger numbers), then introduces negative numbers, decimals and fractions. Ten pals (another number line task) consisted of bars on the right side of the screen that were partially filled number lines. The participant was then instructed to pull the correct bar (that is, the one with the right number of units to fit) from multiple options on the right-hand side43. The difficulty was moderated by bar length, the replacement of bars with Arabic numerals and increasing the sum from 5 to 10 and then 15. 2D mental rotation consisted of mental rotation tasks in which the difficulty was increased by increasing the angle of the required rotation and the complexity of the objects being rotated, similar to other rotation training in the literature44. The NVR task consisted of sequential ordering tasks in which participants viewed a sequence of tiles with spatial patterns and had to choose the correct image to fill the blank in the sequence28. The difficulty was increased by introducing additional stimulus dimensions (colours, shapes and numbers of dots) on which the stimuli should be compared.

After the first week, children were randomly divided into five training plans. The default training plan included only VSWM and mathematical training. This plan was identical to the one reported by Nemmi et al.31 showing the highest mean gains. Due to ethical considerations, we did not want to deviate too far from this previous design. The experimental training plans substituted VSWM for the domain of interest (see Supplementary Table 1 for proportions).

The rate of improvement of the trained rotation tasks was gradual throughout the training (Supplementary Fig. 2). Unpublished data were available from 28 children training on the same rotation tasks on an equivalent plan (50% mathematics; 50% rotation) for 5 weeks. Children were evaluated with the picture rotation test before and after intervention45. Compared with a control group (n = 48), who only received regular education and repeated tests with equal spacing in time, the improvement on this task was d = 1.6 s.d. This suggests that our rotation training is comparable to the improvements in other training studies of spatial rotation training44. In the same sample of children (n = 76), we found a Cohen’s d improvement of 0.72 on experimenter-administered tests of addition and subtraction and the WISC-IV verbal arithmetic test, which were combined using factor analysis (see Methods and Supplementary Fig. 5) for the children training in mathematics and rotation compared with the control group.

Testing tasks. Testing was self-administered through the app, and appeared on specific days. The first testing session was delivered on the third and fourth days of training, with the second on the 25th and 26th days of training and the final session on the 35th and 36th days of training. Addition and subtraction tests appeared on the first day per testing session, while a number comparison task appeared on the following day. For the addition and subtraction tasks, children responded with on-screen buttons (0–9). The test finished if a trial was not completed within 60 s or if three consecutive errors were made. The sum of correct responses was used as the outcome measure for both tests. The number comparison task presented two single-digit Arabic numerals on either side of the screen and the children had to respond by tapping on the larger number. The outcome measure was the mean response time of correct trials.

As expected in this age range, the subtraction and addition tests were not normally distributed, due to a large number of zero responses in the first test (Supplementary Fig. 3a). However, the third test (the mean reaction time of correct number comparison trials) was normally distributed. Around 2% of the number comparison test data were missing and therefore imputed. This represented under 1% of total test data.

Validity of online mathematical testing. To test the validity of the online tasks, we used data from two schools gathered over the course of a year. The first study consisted of 46 preschool-age children, while the second consisted of 60 children (36 of whom were in first grade). Ethical approval was granted by a regional ethics committee. A trained experimenter individually administered an addition and

Nature Human Behaviour | www.nature.com/nathumbehav


subtraction test and a WISC verbal arithmetic test. The online tests were identical to those used in this study. The time between experimenter-administered and online tests was a maximum of 4 weeks. Both school and grade were residualized from the mathematical tests. We then fit exploratory factor analyses individually for experimenter-administered and online tests, using the fa function from the R package psych (ref. 46).

Following the extraction of factor scores, we used a Pearson correlation to determine the consistency between mathematics tests administered in person and those given online. It should be noted that these factors will not represent exactly the same underlying concept (that is, lacking configural invariance), as the number comparison was substituted by the WISC verbal arithmetic test.

CFA. A CFA was used in which the three self-administered tests were combined into a latent mathematical factor for each time point. This was done to increase the power by removing measurement error. Correlated error variance was modelled between times for each task. For model estimation, we used full-information maximum-likelihood estimation and a robust maximum-likelihood estimator with a Yuan–Bentler-scaled test statistic from the R package lavaan (refs. 47,48). Missing follow-up behavioural data were imputed under the assumption that they were missing at random (ref. 49). We assessed model fits using the CFI (fit > 0.95) and RMSEA (fit < 0.08) (ref. 50).

The model was then tested for longitudinal measurement invariance. This is a necessary step to ensure the latent factors represent the same concept throughout time. This procedure is a series of increasingly strict tests: (1) the loadings are constrained to be equal through time (weak invariance); (2) the intercepts are constrained to be equal between the time points (strong invariance); and (3) the error variances of each of the tests are also constrained to be equal through time (strict invariance; Supplementary Fig. 4). The end result was four models: the baseline model (that is, configural invariance) in which the same tests were used through time, as well as weak, strong and strict invariant models. A series of model comparisons, in ascending order of strictness, were run to determine longitudinal measurement invariance. Regrettably, there is no consensus in the literature about which fit indices to use or the cut-off values; therefore, we decided to include a variety of fit indices in the preprint manuscript (ref. 51; Supplementary Table 3). Due to our large sample size, we decided to test invariance via a change of ≥−0.010 for the CFI, supplemented by a change of ≥0.015 for the RMSEA (ref. 52).

Linear mixed-effects modelling. Factor scores were estimated from a longitudinal, strict-measurement invariant CFA to be used as the dependent variable in mixed-effects models using maximum-likelihood estimation with the R package lme4 (ref. 53). This modelling strategy was chosen over a latent growth curve model, mainly due to the numerous categorical covariates along with the ability to easily compare coefficients. Covariates were added step-wise using AIC model comparison. An AIC difference of 8 was considered meaningful for a model fit. The emmeans R package was used to estimate marginal means and test contrasts (ref. 54). We predominantly relied on model comparison; however, wherever P values are reported, tests were two tailed with an alpha level of 0.05.

All models, except the intercept-only model, were fit with a correlated random intercept and a slope per participant per test week. Test week was coded using the number of weeks between tests (that is, t1 = 0, t2 = 4 and t3 = 6). Models were built hierarchically, by adding fixed effects for test week, age bracket, cohort and training time. To test the effect of training time, we added an interaction of training time with test week.

Effect of different cognitive training. We took three complementary approaches to determine the effects of different types of training. The first two built on earlier models, while for the third we took a different strategy by extracting random slopes and intercepts. For the first approach, we added training plans (categorically dummy coded), building on the aforementioned model (that is, with an interaction of training time with test week). A model with training plans interacting with test week was compared against one with only the fixed effects, and a more parsimonious model without training plans was included.

In the second approach, we added terms for the amount of rotation (as a combination of 2D mental rotation and the tangram task) and the amount of NVR training. The amount of rotation or NVR training did not take into consideration whether participants had trained for 20 or 33 min, as this variance was already accounted for in a simpler model (via the interaction of test week with training time). A model with rotation and NVR terms both interacting with test week was compared against one with only fixed effects, and again a more parsimonious model. We refer the reader to AIC differences in model comparisons to draw conclusions; yet, when we report P values from mixed-effects models, these were derived using the Satterthwaite degrees-of-freedom method.

The third method was used for two reasons: (1) to estimate the amount of effect from each spatial task; and (2) to check that treating 2D mental rotation and the tangram task as similar tasks holds. To accomplish this goal, we extracted random intercepts from participants and random slopes of test week from a model with fixed effects for test week, age bracket, cohort and test time. We then coded four tasks (VSWM, NVR, 2D mental rotation and the tangram task) based on the daily minutes of training in each of them, taking into consideration whether participants trained for 20 or 33 min. These were entered into a linear model predicting children’s random slopes (standardized with mean = 0 and s.d. = 1) while correcting for their baseline mathematics (random intercepts).

Predicting improvement based on performance in the first week of training. Since all of the participants completed identical cognitive training within the first week, this allowed us to extract training indices from four tasks: VSWM, the number line task, rotation and NVR. For performance measures, we used the mean level of correct trials and the accuracy (that is, the percentage of correct trials).

This resulted in a total of eight cognitive training predictors that were entered in a linear model predicting children’s improvement. For our measure of inter-individual improvement, we extracted children’s coefficients for test week (that is, their random slopes) from a mixed-effects model with fixed effects for test week, age bracket, cohort and training time. We controlled for participants’ baseline mathematics (that is, their random intercepts), to see what predicts mathematical improvement independent of baseline abilities.

Lastly, we carried out three analyses, in which we examined the interactions between the amount of different training (that is, rotation, NVR and VSWM) and baseline training indices to predict improvement. Similar to an earlier analysis, we coded the amounts of training, taking into consideration whether children completed 20 or 33 min d⁻¹. This analysis was based on the rationale that participants struggling in one cognitive domain might benefit from extra training in this domain, which could then transfer to mathematical improvement. We used FDR correction to control our type 1 error rate in all prediction-based analyses.

Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The data to replicate the main analysis (that is, mixed-effects model) are available at https://github.com/njudd/spatialcognition. Data for the baseline characteristics and graphs in this study are available upon request from the corresponding author.

Code availability
The code to replicate the main analysis (that is, mixed-effects model) is available at https://github.com/njudd/spatialcognition. Code for the baseline characteristics and graphs in this study is available upon request from the corresponding author.

Received: 8 June 2020; Accepted: 16 April 2021;
Published: xx xx xxxx

References
1. Wai, J., Lubinski, D. & Benbow, C. P. Spatial ability for STEM domains: aligning over 50 years of cumulative psychological knowledge solidifies its importance. J. Educ. Psychol. 101, 817–835 (2009).
2. Hawes, Z. & Ansari, D. What explains the relationship between spatial and mathematical skills? A review of evidence from brain and behavior. Psychon. Bull. Rev. 27, 465–482 (2020).
3. Mix, K. S. et al. Separate but correlated: the latent structure of space and mathematics across development. J. Exp. Psychol. Gen. 145, 1206–1227 (2016).
4. Peng, P., Namkung, J., Barnes, M. & Sun, C. A meta-analysis of mathematics and working memory: moderating effects of working memory domain, type of mathematics skill, and sample characteristics. J. Educ. Psychol. 108, 455–473 (2016).
5. Gathercole, S. E. & Brown, L. Working memory assessments at school entry as longitudinal predictors of National Curriculum attainment levels. Educ. Child Psychol. 20, 109–122 (2003).
6. Geary, D. C. Cognitive predictors of achievement growth in mathematics: a 5-year longitudinal study. Dev. Psychol. 47, 1539–1552 (2011).
7. Dillon, M. R., Kannan, H., Dean, J. T., Spelke, E. S. & Duflo, E. Cognitive science in the field: a preschool intervention durably enhances intuitive but not formal mathematics. Science 357, 47–55 (2017).
8. Newcombe, N. Harnessing Spatial Thinking to Support STEM Learning Working paper 161 (OECD iLibrary, 2017); https://www.oecd-ilibrary.org/education/harnessing-spatial-thinking-to-support-stem-learning_7d5dcae6-en
9. Stieff, M. & Uttal, D. How much can spatial training improve STEM achievement? Educ. Psychol. Rev. 27, 607–615 (2015).
10. Paying Attention to Spatial Reasoning, K-12: Support Document for Paying Attention to Mathematics Education (Ontario Ministry of Education, 2014); http://www.edu.gov.on.ca/eng/literacynumeracy/lnspayingattention.pdf
11. Lohman, D. F. in Advances in the Psychology of Human Intelligence Vol. 4 (ed. Sternberg, R. J.) 181–248 (Lawrence Erlbaum Associates, 1988).
12. Carpenter, P. A. & Just, M. A. in Advances in the Psychology of Human Intelligence Vol. 3 (ed. Sternberg, R. J.) 221–252 (Erlbaum, 1986).
13. Cheng, Y.-L. & Mix, K. S. Spatial training improves children’s mathematics ability. J. Cogn. Dev. 15, 2–11 (2014).

14. Hawes, Z., Moss, J., Caswell, B., Naqvi, S. & MacKinnon, S. Enhancing children’s spatial and numerical skills through a dynamic spatial approach to early geometry instruction: effects of a 32-week intervention. Cogn. Instr. 35, 236–264 (2017).
15. Lowrie, T., Logan, T. & Hegarty, M. The influence of spatial visualization training on students’ spatial reasoning and mathematics performance. J. Cogn. Dev. 20, 729–751 (2019).
16. Hawes, Z., Moss, J., Caswell, B. & Poliszczuk, D. Effects of mental rotation training on children’s spatial and mathematics performance: a randomized controlled study. Trends Neurosci. Educ. 4, 60–68 (2015).
17. Cornu, V., Schiltz, C., Pazouki, T. & Martin, R. Training early visuo-spatial abilities: a controlled classroom-based intervention study. Appl. Dev. Sci. 23, 1–21 (2017).
18. Rodán, A., Gimeno, P., Elosúa, M. R., Montoro, P. R. & Contreras, M. J. Boys and girls gain in spatial, but not in mathematical ability after mental rotation training in primary education. Learn. Individ. Differ. 70, 1–11 (2019).
19. Wright, H. et al. Improving Working Memory (Education Endowment Foundation, 2019); https://westminsterresearch.westminster.ac.uk/download/1d07359f1fda308387fc3679b914dac0eb89947b618f7f68f999a6f87793bfba/1174522/Working%20Memory.pdf
20. Berger, E. M., Fehr, E., Hermes, H., Schunk, D. & Winkel, K. The Impact of Working Memory Training on Children’s Cognitive and Noncognitive Skills Discussion Paper No. 09/2020 (NHH Department of Economics, 2020); https://doi.org/10.2139/ssrn.3622985
21. Bergman-Nutley, S. & Klingberg, T. Effect of working memory training on working memory, arithmetic and following instructions. Psychol. Res. 78, 869–877 (2014).
22. Roberts, G. et al. Academic outcomes 2 years after working memory training for children with low working memory: a randomized clinical trial. JAMA Pediatr. 170, e154568 (2016).
23. Schwaighofer, M., Fischer, F. & Bühner, M. Does working memory training transfer? A meta-analysis including training conditions as moderators. Educ. Psychol. 50, 138–166 (2015).
24. Simons, D. J. et al. Do “brain-training” programs work? Psychol. Sci. Public Interest 17, 103–186 (2016).
25. Francis, G. Too good to be true: publication bias in two prominent studies from experimental psychology. Psychon. Bull. Rev. 19, 151–156 (2012).
26. Green, C. S. et al. Improving methodological standards in behavioral interventions for cognitive enhancement. J. Cogn. Enhanc. 3, 2–29 (2019).
27. Mackintosh, N. & Mackintosh, N. J. IQ and Human Intelligence (Oxford Univ. Press, 2011).
28. Bergman Nutley, S. et al. Gains in fluid intelligence after training non-verbal reasoning in 4-year-old children: a controlled, randomized study. Dev. Sci. 14, 591–601 (2011).
29. Klauer, K. J. & Phye, G. D. Inductive reasoning: a training approach. Rev. Educ. Res. 78, 85–123 (2008).
30. Mackey, A. P., Hill, S. S., Stone, S. I. & Bunge, S. A. Differential effects of reasoning and speed training in children. Dev. Sci. 14, 582–590 (2011).
31. Nemmi, F. et al. Behavior and neuroimaging at baseline predict individual response to combined mathematical and working memory training in children. Dev. Cogn. Neurosci. 20, 43–51 (2016).
32. Fischer, U., Moeller, K., Bientzle, M., Cress, U. & Nuerk, H.-C. Sensori-motor spatial training of number magnitude representation. Psychon. Bull. Rev. 18, 177–183 (2011).
33. Outhwaite, L. A., Faulder, M., Gulliford, A. & Pitchford, N. J. Raising early achievement in math with interactive apps: a randomized control trial. J. Educ. Psychol. 111, 284–298 (2019).
34. Klingberg, T. et al. Computerized training of working memory in children with ADHD—a randomized, controlled trial. J. Am. Acad. Child Adolesc. Psychiatry 44, 177–186 (2005).
35. Jaeggi, S. M., Buschkuehl, M., Jonides, J. & Perrig, W. J. Improving fluid intelligence with training on working memory. Proc. Natl Acad. Sci. USA 105, 6829–6833 (2008).
36. Schmiedek, F., Lövdén, M. & Lindenberger, U. Hundred days of cognitive training enhance broad cognitive abilities in adulthood: findings from the COGITO study. Front. Aging Neurosci. 2, 27 (2010).
37. Roid, G. H. & Miller, L. J. Leiter International Performance Scale—Revised: Examiner’s Manual (Stoelting, 1997).
38. Mix, K. S. Why are spatial skill and mathematics related? Child Dev. Perspect. 13, 121–126 (2019).
39. Lortie-Forgues, H. & Inglis, M. Rigorous large-scale educational RCTs are often uninformative: should we be concerned? Educ. Res. 48, 158–166 (2019).
40. Bloom, H. S., Hill, C. J., Black, A. R. & Lipsey, M. W. Performance trajectories and performance gaps as achievement effect-size benchmarks for educational interventions. J. Res. Educ. Eff. 1, 289–328 (2008).
41. Abelson, R. P. A variance explanation paradox: when a little is a lot. Psychol. Bull. 97, 129–133 (1985).
42. Funder, D. C. & Ozer, D. J. Evaluating effect size in psychological research: sense and nonsense. Adv. Methods Pract. Psychol. Sci. 2, 156–168 (2019).
43. Butterworth, B. & Kovas, Y. Understanding neurocognitive developmental disorders can improve education for all. Science 340, 300–305 (2013).
44. Uttal, D. H. et al. The malleability of spatial skills: a meta-analysis of training studies. Psychol. Bull. 139, 352–402 (2013).
45. Neuburger, S., Jansen, P., Heil, M. & Quaiser-Pohl, C. Gender differences in pre-adolescents’ mental-rotation performance: do they depend on grade and stimulus type? Pers. Individ. Differ. 50, 1238–1242 (2011).
46. Revelle, W. psych: procedures for psychological, psychometric, and personality research. https://www.scholars.northwestern.edu/en/publications/psych-procedures-for-personality-and-psychological-research (Northwestern University, 2019).
47. R Core Development Team. R: a language and environment for statistical computing. (R Foundation for Statistical Computing, 2014).
48. Rosseel, Y. Lavaan: an R package for structural equation modeling and more. Version 0.5-12 (BETA). J. Stat. Softw. 48, 1–36 (2012).
49. Enders, C. K. & Bandalos, D. L. The relative performance of full information maximum likelihood estimation for missing data in structural equation models. Struct. Equ. Model. 8, 430–457 (2001).
50. Hu, L. & Bentler, P. M. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct. Equ. Model. 6, 1–55 (1999).
51. Putnick, D. L. & Bornstein, M. H. Measurement invariance conventions and reporting: the state of the art and future directions for psychological research. Dev. Rev. 41, 71–90 (2016).
52. Chen, F. F. Sensitivity of goodness of fit indexes to lack of measurement invariance. Struct. Equ. Model. 14, 464–504 (2007).
53. Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).
54. Lenth, R. emmeans: estimated marginal means, aka least-squares means. https://cran.r-project.org/web/packages/emmeans/index.html (Univ. Iowa, 2019).

Acknowledgements
We acknowledge R. Almeida, D. Sjölander, J. Beckeman, B. Sauce and D. Zhang for extensive help with various aspects of the study. This work was supported by contributions from M. Westman and S. Westman, along with funding from The Swedish Medical Research Foundation. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author contributions
N.J. and T.K. contributed equally in all aspects of the study.

Competing interests
T.K. holds an unpaid position as Chief Scientific Officer for the non-profit organization Cognition Matters. N.J. declares no competing interests.

Additional information
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41562-021-01118-4.
Correspondence and requests for materials should be addressed to T.K.
Peer review information Nature Human Behaviour thanks Kelly S. Mix and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© The Author(s), under exclusive licence to Springer Nature Limited 2021



nature research | reporting summary
Corresponding author(s): Torkel Klingberg
Last updated by author(s): Apr 14, 2021

Reporting Summary
Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency
in reporting. For further information on Nature Research policies, see our Editorial Policies and the Editorial Policy Checklist.

Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.
n/a Confirmed
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
The statistical test(s) used AND whether they are one- or two-sided
Only common tests should be described solely by name; describe more complex techniques in the Methods section.

A description of all covariates tested


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient)
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)

For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted
Give P values as exact values whenever suitable.

For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated
Our web collection on statistics for biologists contains articles on many of the points above.

Software and code


Policy information about availability of computer code
Data collection The data were collected remotely with the app Vektor (https://cognitionmatters.org/).

Data analysis We only used open source software for the analysis. R version 3.6.0 (2019-04-26) was used and the relevant packages are mentioned in the
methods.
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and
reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data

- A description of any restrictions on data availability

Data and code for the main analysis (i.e., mixed-effect model) will be made available upon publication. Any other data will be made available upon request.

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.
Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Behavioural & social sciences study design


All studies must disclose on these points even when the disclosure is negative.
Study description This is a randomized controlled trial; the data are quantitative.

Research sample The sample comprises 17,648 children aged 6–8 years who completed mathematical training for at least 36 days. This age range
 was chosen because it is the target demographic of the training. We included all children from Aug 2017 – Jan 2020.

Sampling strategy No sample size was predetermined. As this is the first study of its kind comparing different active types of training (WM vs Rotation vs
 NVR), it is difficult to estimate expected effect sizes accurately. Much of the previous literature suffers from a variety of issues (which
 we discuss in the manuscript), making it even harder to calculate an expected effect size for each type. Lastly, we modified only a small
 proportion of the training (unrelated to mathematics) and therefore assumed we would need a large sample.

Data collection All data were gathered remotely and anonymized, keeping the experimenter naive to the condition until analysis.

Timing We included all children from Aug 2017 – Jan 2020.

Data exclusions 184 subjects were excluded because they took part in an experimental training task, making them "unique"; one further subject was
 excluded for missing cohort information. These exclusions were decided prior to analysis.

Non-participation We accessed the database and included only subjects with 36 days of training; we have therefore not calculated the number of subjects
 who completed only a few days of training.

Randomization An algorithm randomly allocated children to training programs; this was done at the level of the child, not the classroom.
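
As an illustration only, child-level allocation of the kind described above could be sketched as follows. The app's actual algorithm is not described here, so the function name `allocate`, the plan labels and the seeding are all assumptions made for this example.

```python
import random
from collections import Counter

# Hypothetical plan labels for illustration; not the study's actual training plans.
PLANS = ["plan_A", "plan_B", "plan_C"]


def allocate(child_ids, plans=PLANS, seed=0):
    """Assign every child independently to one plan (child-level randomization)."""
    rng = random.Random(seed)  # seeded only so the example is reproducible
    return {child: rng.choice(plans) for child in child_ids}


# One hypothetical classroom of 25 anonymized accounts.
classroom = [f"child_{i:02d}" for i in range(25)]
allocation = allocate(classroom)

# Tally how many accounts ended up in each plan.
counts = Counter(allocation.values())
```

Because this sketch ignores classroom membership, classmates may receive different plans, and the share of children in each plan evens out as the number of accounts grows.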

Reporting for specific materials, systems and methods


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material,
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.

Materials & experimental systems (n/a / Involved in the study)
 Antibodies
 Eukaryotic cell lines
 Palaeontology and archaeology
 Animals and other organisms
 Human research participants
 Clinical data
 Dual use research of concern

Methods (n/a / Involved in the study)
 ChIP-seq
 Flow cytometry
 MRI-based neuroimaging

Human research participants


Policy information about studies involving human research participants
Population characteristics See above

Recruitment There was no advertisement or active recruitment. Teachers independently decided to implement the training program
 themselves. There could be classroom-level self-selection bias, but this should not affect the results because the different training
 plans were randomized within classrooms.

Ethics oversight Swedish ethical review authority (Stockholm region)

Note that full information on the approval of the study protocol must also be provided in the manuscript.
