Neuroinformatics (2022) 20:981–990

https://doi.org/10.1007/s12021-022-09570-x

ORIGINAL ARTICLE

Decentralized Brain Age Estimation Using MRI Data

Sunitha Basodi1 · Rajikha Raja1,2 · Bhaskar Ray1,3 · Harshvardhan Gazula4 · Anand D. Sarwate5 · Sergey Plis1 ·
Jingyu Liu1,3 · Eric Verner1 · Vince D. Calhoun1,3,6

Accepted: 27 January 2022 / Published online: 5 April 2022

Abstract
Recent studies have demonstrated that neuroimaging data can be used to estimate biological brain age, as it captures information about the neuroanatomical and functional changes the brain undergoes during development and the aging process. However, researchers often have limited access to neuroimaging data because of its challenging and expensive acquisition process, thereby limiting the effectiveness of the predictive model. Decentralized models provide a way to build more accurate and generalizable prediction models, bypassing the traditional data-sharing methodology. In this work, we propose a decentralized method for biological brain age estimation using support vector regression models and evaluate it on three different feature sets, including both volumetric and voxelwise structural MRI data as well as resting functional MRI data. The results demonstrate that our decentralized brain age regression models can achieve similar performance compared to the models trained with all the data in one location.

Keywords Brain age · Decentralized · Federated · COINSTAC

* Sunitha Basodi
[email protected]

1 Tri-institutional Center for Translational Research in Neuroimaging and Data Science, Georgia State University, Georgia Institute of Technology, Emory University, Atlanta, GA, USA
2 Department of Radiology, University of Arkansas for Medical Sciences, Little Rock, AR, USA
3 Department of Computer Science, Georgia State University, Atlanta, GA, USA
4 Princeton Neuroscience Institute, Princeton, NJ, USA
5 Department of Electrical and Computer Engineering, Rutgers University, Piscataway, NJ, USA
6 Department of Psychology, Georgia State University, Atlanta, GA, USA

Introduction

Brain age estimation (BAE) from magnetic resonance imaging (MRI) data has become widely popular in recent years. Brain age here refers to biological brain age and we use this terminology throughout the manuscript. Defined as the difference between chronological age and biological brain age, the brain age gap can be helpful in early identification of various neurodegenerative diseases (Cole et al., 2019; Franke & Gaser, 2019; Elliott et al., 2019) and was also found to be wider in patients with dementia and autism (Reeve et al., 2014). Several studies have used traditional machine learning (Cole et al., 2018; Liem et al., 2017) and deep learning approaches (Jónsson et al., 2019) to develop promising prediction models on large datasets that have either been collected at one location or sourced from different locations. Although having a large training dataset can help in achieving robust models, there are limitations on the number/size of the data that can be gathered for such large-scale analyses. Examples of such limitations include the cost-prohibitiveness of MRI data collection, varying institutional data-sharing policies, constrained data-usage agreements, data-privacy concerns, and many more (White et al., 2020).

One efficient approach to bypass these limitations is to use decentralized algorithms, which do not require pooling the data in one location. Decentralized learning algorithms have become popular in recent years due to the demand for collaborative studies (White et al., 2020; Plis et al., 2016). Such approaches are particularly important when there is a need to perform a large-N analysis involving heterogeneous datasets without worrying about data transmission or violating privacy.


In this work, we have developed a decentralized brain age estimation algorithm using support vector regression (SVR), performed detailed experiments on three different feature sets (types of datasets) using varying sampling methods, and demonstrated the robustness of our approach. We implement our approach within the Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymous Computation (COINSTAC) (Plis et al., 2016) framework (more on this later). Results show that the proposed decentralized method using SVR models attains performance on par with a centralized approach.

The remainder of this paper is organized as follows. The background on biological brain age, current estimation approaches, decentralized learning, and COINSTAC is presented in "Background". The decentralized BAE method is presented in "Methods", followed by a discussion of the results in "Results" and the conclusion in "Conclusion".

Background

Brain Age

The biological brain age (of an individual) is the observable age of the brain in contrast to the chronological (actual) age. As brain age cannot be directly measured, models to estimate brain age are typically trained with the chronological age of cognitively normal subjects. Unless stated otherwise, the term 'brain age' corresponds to 'biological brain age' throughout this manuscript. The difference between the estimated brain age and chronological age is considered to be the brain age gap. Studies have shown that patients suffering from neurodegenerative diseases (such as Alzheimer's, Huntington's, Parkinson's, and Multiple Sclerosis) have an increased brain age gap or estimated brain age (Reeve et al., 2014). In contrast, there are studies that show activities such as meditation (Luders et al., 2016) and physical exercise decrease brain age (Steffener et al., 2016). These and other studies (Cole et al., 2017; Yang et al., 2019; Franke & Gaser, 2019) suggest BAE may serve as an important biomarker in understanding the aging process of the brain and its relationship to brain health and disorder.

Current Brain Age Estimation Approaches

Estimating biological brain age on the basis of MRI is now a widely used approach. Broadly, three different approaches have been used previously for structural MRI, namely voxel-based, region-based, and surface-based approaches (Sajedi & Pardakhti, 2019). In a voxel-based approach, MR images are first segmented into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF), and then this voxel-level information is used to fit the models. In a region-based approach, an atlas is used to summarize the data for estimating prediction models and brain age. Surface-based methods generally compute a triangulated mesh using WM/GM boundaries or GM/CSF boundaries and then extract surface-based features for estimation. In this work, we also introduce a component-based approach for functional MRI (fMRI) data that uses the timecourses from independent components, or rather their cross-correlations, called functional network connectivity (FNC), as features (Jafri et al., 2008). The different features extracted from each of these approaches can be used with either traditional machine learning or deep learning models. Deep learning models have shown promising results in BAE. In Cole et al. (2017), BAE was computed using both raw MR images and preprocessed MR images, and both showed consistent results. The authors of Niu et al. (2020) employ single and multimodal brain imaging data, including structural MRI (sMRI), diffusion tensor imaging (DTI), and resting state fMRI (rs-fMRI), and evaluate performance on many models. Recently, graph neural networks have also been used for BAE (Stankevičiūtė et al., 2020) using data from the UK Biobank (Sudlow et al., 2015). There is also a recent study which describes the effect of feature selection for brain age estimation models (Ray et al., 2021). However, there is no clear understanding about which models perform the best, though it is clear that having large amounts of training data usually helps in achieving robust models.

Decentralized Learning

Decentralized learning has recently gained attention as it allows privacy-preserving, large-scale, machine learning analysis (Sarwate et al., 2014). Also referred to as federated learning in the literature, decentralized machine learning is focused on training models on data that are not shared but that researchers want to be analyzed together. In this method, all the participating sites start with the same model, and each site trains its local copy on its own data. In each iteration, gradients from local models are sent back to the main site for aggregation, after which updated gradients are returned to local sites to update their models. This is repeated over many iterations until the model achieves its stopping criteria, such as the desired performance. There are several challenges in designing decentralized algorithms, including heterogeneity across different sites, transmission, maintaining synchronization during training iterations, and preserving data privacy. Heterogeneity arises from the different devices (hardware and software) involved during the training process, and one cannot overlook this issue, as it can affect the overall performance of a decentralized model. Though there is no raw data exchange, decentralized learning involves transmission of model gradients/parameters that utilize significant bandwidth, especially when training deep learning models.
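The iterative scheme described above, in which local sites compute gradients and the main site averages and redistributes them, can be illustrated with a minimal sketch. This is a toy under stated assumptions and not the method used later in this paper: it uses a linear model with squared-error loss on synthetic data, and the function names (`local_gradient`, `federated_gradient_descent`), the learning rate, and the tolerance are hypothetical choices made only to expose the communication pattern.

```python
import random


def local_gradient(w, X, y):
    """Squared-error gradient of a linear model, computed on one site's own data."""
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
        for j, xj in enumerate(xi):
            grad[j] += 2.0 * err * xj / len(X)
    return grad


def federated_gradient_descent(sites, n_features, lr=0.1, tol=1e-6, max_rounds=2000):
    """sites is a list of (X, y) pairs; only gradients are exchanged, never raw data."""
    w = [0.0] * n_features                  # every site starts from the same model
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        # Each local site computes a gradient on its own data.
        grads = [local_gradient(w, X, y) for X, y in sites]
        # The main site aggregates the gradients by averaging.
        avg = [sum(g[j] for g in grads) / len(grads) for j in range(n_features)]
        # The averaged gradient is returned to the sites, which all apply the same update.
        w = [wj - lr * gj for wj, gj in zip(w, avg)]
        # Stopping criterion (assumed): the aggregated gradient is close to zero.
        if sum(g * g for g in avg) ** 0.5 < tol:
            break
    return w, rounds


# Synthetic, noiseless data for three hypothetical sites.
rng = random.Random(0)
true_w = [2.0, -1.0]
sites = []
for _ in range(3):
    X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(50)]
    y = [sum(t * x for t, x in zip(true_w, xi)) for xi in X]
    sites.append((X, y))

w, rounds = federated_gradient_descent(sites, n_features=2)
print([round(v, 4) for v in w], rounds)
```

Because the three sites hold the same number of samples, the averaged gradient equals the gradient that would be obtained on the pooled data, so the sketch recovers the pooled solution while each site keeps its data local. Every round costs one message in each direction per site, which is the transmission overhead mentioned above.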


The main site also needs to ensure that the aggregation is performed on the correct iteration data from all the sites when there is repeated communication between the local and main site. There are also studies (Bostami et al., 2021a, b) which explore the data heterogeneity effects at different sites and work on harmonizing these effects while building decentralized models. For a detailed summary of the machine learning algorithms, challenges, and architectures for decentralization, we refer the readers to Aledhari et al. (2020); Li et al. (2020).

COINSTAC

COINSTAC (White et al., 2020; Plis et al., 2016; COINSTAC, 2020) is a platform that enables decentralized analysis of (neuroimaging) data without the need to pool the data at one location. It was created to enable collaborative research by removing the barriers to traditional data-centric approaches. As there is no pooling of data, COINSTAC protects the privacy of individual datasets and solves the problem of researchers wanting to collaborate but being unable to because of data sharing restrictions (Ming et al., 2017). Gazula et al. (2019) used this application to perform a decentralized regression on sMRI data preprocessed using voxel-based morphometry to analyze the structural changes in the brain as linked to age, body mass index, and smoking. This study also emphasized the benefits of large-scale neuroimaging analysis. COINSTAC implements a wide and growing range of decentralized neuroimaging pipelines and supports such large-scale analysis of decentralized data with results on par with results from pooled data.

Methods

Though decentralized learning has been applied in some domains, this paper, to our knowledge, presents the first approach for decentralized brain age analysis. In this work, we estimate biological brain age using a decentralized approach. Let us assume there are N + 1 participating sites, each gathering data from a different set of participants. In centralized models, data from all the local sites is combined at a central location, where it is used for training a model, and therefore a centralized model is expected to perform better than any model trained only on a subset of that data. In decentralized studies, instead of transferring the original data, only the information needed to train a model is shared, which not only improves the performance of the estimated model but also keeps the data secure at the local sites. Information sent from local sites is collected at the main site to build an aggregated model. The challenge is to reduce the performance gap between decentralized and centralized models without sharing the original data.

Such a decentralized training approach can be used to train any estimation model; however, the type of information transferred between local and main sites depends strongly on the type of estimation model used (as different machine learning algorithms have different parameters and approaches to reach the optimal solution).

Formally, let N + 1 be the total number of participating sites, where site 0 is the main site and the remaining sites are local sites. Let the matrix $X_i^{m \times n} \in \mathbb{R}^{m \times n}$ be the data of m subjects with n features available at site $i \in \{0, 1, 2, \ldots, N\}$, with the matrix $Y_i^{m \times 1} \in \mathbb{R}^{m \times 1}$ representing their chronological ages. At each local site i, a regression model $M_i$ is constructed, and the corresponding model parameters $P_i^{n \times k}$ are sent to the main site, where k is the dimension of the parameters generated by the model.

$$Y_i^{\mathrm{pred}} = M_i(X_i^{m \times n}, Y_i^{m \times 1}) \tag{1}$$

At the main site (site 0), these model parameters are aggregated and used to transform the original data $X_0^{m \times n} \in \mathbb{R}^{m \times n}$. The data are transformed using the parameter matrix with the $\otimes$ operation. In our case, it is a simple matrix multiplication due to a linear regression model, although in general it can be any matrix operation. The transformed features $\mathit{Xnew}_0^{m \times k}$ are used for BAE.

$$P_0^{n \times k} = \mathrm{Average}(P_i^{n \times k}), \quad i \in \{1, 2, 3, \ldots, N\}$$

$$\mathit{Xnew}_0^{m \times k} = X_0^{m \times n} \otimes P_0^{n \times k}$$

$$Y_0^{\mathrm{pred}} = M_0(\mathit{Xnew}_0^{m \times k}, Y_0^{m \times 1})$$

Here $M_0$ represents the aggregated model containing information from models $M_i$, $i \in \{1, 2, \ldots, N\}$.

In our initial setup, all the local sites are provided with the details of the estimation model and the type of information to be shared with the main site after training their local models. Of all the participating sites, N sites are used as local sites, and the remaining site is used as the main site. In this study, we use support vector regression (SVR) as the estimation model and apply decentralization by employing a training strategy similar to Sarwate et al. (2014); Chaudhuri et al. (2011). As a first step, all the local sites train an SVR model locally with their data and transfer the weight vectors of the locally learned models to the main site. The main site averages these vectors and uses the average weight vector to transform its data into a new feature space. This modified data is then used to train a decentralized SVR model (see Fig. 1).

Metrics We use root mean square error (RMSE) and mean absolute error (MAE) as metrics to measure the performance of the fitted regression models. RMSE is a standard measure used to analyze the performance of regression models. It is a measure of the distance between the estimated values and actual values and is computed as follows:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n_s} (y_a - y_e)^2}{n_s}}$$

Here, $n_s$, $y_a$ and $y_e$ correspond to the number of samples, the actual values, and the estimated values, respectively.

We also compute MAE as it reflects the performance of brain age gaps and is commonly used in the literature. It is an average measure of the magnitude of estimation errors without considering their directions:

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n_s} |y_a - y_e|}{n_s}$$

Results

The primary goal of this work is to develop a decentralized BAE model and contrast its performance against a centralized model. For this purpose, we performed three experiments with features extracted from two different datasets. All experiments were performed in COINSTAC (Plis et al., 2016) with six sites (one as main and the remaining five as local nodes).

Datasets Two MRI datasets are analyzed in this work, viz., UPENN-PNC (Satterthwaite et al., 2014) and UKBiobank (Miller et al., 2016). The UPENN-PNC data consists of T1-weighted sMRI of 1417 healthy participants with chronological ages ranging from 8 to 21 years during acquisition. This data is jointly spatially normalized and segmented and then smoothed by a 6-mm full-width at half maximum (FWHM) Gaussian kernel. Segmentation was performed in SPM12 (Ashburner et al., 2014). The second dataset, UKBiobank, has fMRI images of 11,754 subjects with chronological ages in the range of 44 to 80 years. The data were preprocessed using a combination of FSL (Smith et al., 2020) and SPM12. This included distortion correction, rigid body motion correction, normalization to Montreal Neurological Institute (MNI) space and smoothing using a Gaussian kernel with an FWHM of 6 mm. Next, we ran a fully automated independent component analysis (ICA) pipeline that can capture corresponding functional network features while retaining more single-subject variability (Du et al., 2015). This pipeline has been successfully applied to multiple studies to identify a wide range of connectivity abnormalities in numerous brain diseases (Du et al., 2015). Group ICA was first performed on two large healthy control datasets to create spatial network (component) priors, following which a spatially constrained ICA algorithm was applied to back-reconstruct spatial maps and time-courses (TCs) for each subject (Du et al., 2019). Figure 2 shows the age range distribution of the subjects in these datasets. Figure 3 shows the three different features extracted using these datasets as described below. These are some of the interesting features generally used in the literature for estimating brain age.

Estimation using FreeSurfer features We used FreeSurfer (version v5.3) (Fischl, 2012) to extract brain structural features from UPENN-PNC data. A standard aseg.stats file that has features corresponding to estimated total intracranial volume (eTIV) and left hemisphere (lh) and right hemisphere (rh) subcortical regions is generated. Additionally, features corresponding to cortical thicknesses and volumes of the parcellated regions in surface GM in both left and right hemispheres are also extracted. In total, we use 152 features for each subject to train the regression models.

Estimation using GM features We use the Automatic Anatomical Labeling (AAL) (Rolls et al., 2020) brain atlas of 116 brain parcellations to extract the mean GM density of each region of interest (ROIs)
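The RMSE and MAE definitions given under Metrics can be computed directly. The sketch below is illustrative only: the function names `rmse` and `mae` and the four age values are hypothetical and are not taken from the study data.

```python
import math


def rmse(actual, estimated):
    """Root mean square error between actual and estimated values."""
    n_s = len(actual)
    return math.sqrt(sum((y_a - y_e) ** 2 for y_a, y_e in zip(actual, estimated)) / n_s)


def mae(actual, estimated):
    """Mean absolute error: average error magnitude, ignoring direction."""
    n_s = len(actual)
    return sum(abs(y_a - y_e) for y_a, y_e in zip(actual, estimated)) / n_s


# Hypothetical chronological and estimated ages (years) for four subjects.
chronological = [10.0, 14.0, 18.0, 21.0]
estimated = [12.0, 13.0, 18.0, 17.0]

print(f"RMSE = {rmse(chronological, estimated):.3f} years")
print(f"MAE  = {mae(chronological, estimated):.3f} years")
```

For these values the errors are -2, 1, 0 and 4 years, so MAE is 7/4 = 1.75 years and RMSE is the square root of 21/4, about 2.29 years. RMSE is larger because squaring weights the single 4-year error more heavily.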

Fig. 1 Overall flow of model parameters from locally trained SVR models that are sent to the main site for aggregation. The aggregated model parameters are used to transform the data at the main site into new features, which are used for training a new decentralized SVR model
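The flow summarized in Fig. 1 can be sketched end to end in a few steps. The code below is a minimal illustration under several assumptions and should not be read as the published implementation: a plain subgradient-descent fit of a linear epsilon-insensitive regressor (`fit_linear_svr`, a hypothetical helper) stands in for a library SVR solver, the data are synthetic with mean-centered ages, the shared parameters are the weight vector only (so k = 1), and all names and hyperparameters are invented for the example.

```python
import math
import random


def fit_linear_svr(X, y, epsilon=0.1, reg=1e-3, lr=0.05, epochs=300):
    """Linear epsilon-insensitive regression fitted by subgradient descent.

    Assumed stand-in for a library SVR solver. Returns (weights, bias).
    """
    m, n = len(X), len(X[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w = [reg * wj for wj in w]          # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            residual = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            if abs(residual) <= epsilon:         # inside the tube: no loss
                continue
            sign = 1.0 if residual > 0 else -1.0
            grad_b += sign / m
            for j in range(n):
                grad_w[j] += sign * xi[j] / m
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, X):
    return [sum(wj * xj for wj, xj in zip(w, xi)) + b for xi in X]


def make_site(rng, m, true_w, noise=0.5):
    """Synthetic (features, mean-centered age) data for one hypothetical site."""
    X = [[rng.gauss(0.0, 1.0) for _ in true_w] for _ in range(m)]
    y = [sum(t * x for t, x in zip(true_w, xi)) + rng.gauss(0.0, noise) for xi in X]
    return X, y


rng = random.Random(0)
true_w = [3.0, -2.0, 1.0]                                      # hypothetical ground truth
local_sites = [make_site(rng, 80, true_w) for _ in range(5)]   # sites 1..N
X0, Y0 = make_site(rng, 80, true_w)                            # main site (site 0)

# Step 1 (local sites): train locally and share only the weight vector P_i.
local_P = [fit_linear_svr(X, y)[0] for X, y in local_sites]

# Step 2 (main site): P_0 = Average(P_i), i = 1..N.
n = len(true_w)
P0 = [sum(P[j] for P in local_P) / len(local_P) for j in range(n)]

# Step 3 (main site): Xnew_0 = X_0 P_0, an (m x n)(n x 1) matrix product.
X_new = [[sum(xj * pj for xj, pj in zip(xi, P0))] for xi in X0]

# Step 4 (main site): train the final model M_0 on the transformed feature.
w0, b0 = fit_linear_svr(X_new, Y0)
Y_pred = predict(w0, b0, X_new)

rmse_dec = math.sqrt(sum((a - e) ** 2 for a, e in zip(Y0, Y_pred)) / len(Y0))
print("averaged weights P_0:", [round(p, 2) for p in P0])
print(f"training RMSE at the main site: {rmse_dec:.3f}")
```

With a single weight vector per site, the transformed feature at the main site is the averaged local model's prediction without its bias term, so the final model only has to learn a rescaling and an offset from the main site's own data. Only n numbers per local site cross the network, and the exchange happens once rather than over repeated rounds.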


(a) UPENN-PNC (b) UKBioBank

Fig. 2 Histograms showing the chronological age distribution of healthy subjects of UPENN-PNC (left) and UKBiobank (right) datasets
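The FNC feature vector used for the UKBiobank experiments consists of the off-diagonal upper-triangular entries of each subject's 53 × 53 correlation matrix, computed from 53 component timecourses of 490 time points. A minimal sketch of that extraction follows. It assumes Pearson correlation between timecourses and uses random numbers in place of real ICA timecourses; `standardize` and `fnc_features` are hypothetical helper names.

```python
import math
import random


def standardize(series):
    """Return the series with zero mean and unit (population) standard deviation."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return [(x - mean) / sd for x in series]


def fnc_features(timecourses):
    """Flatten the off-diagonal upper triangle of the component correlation matrix.

    timecourses: list of k components, each a list of time points.
    Returns k * (k - 1) / 2 Pearson correlations, ordered row by row.
    """
    z = [standardize(tc) for tc in timecourses]
    n_t = len(z[0])
    k = len(z)
    features = []
    for i in range(k):
        for j in range(i + 1, k):        # j > i skips the diagonal and the lower triangle
            features.append(sum(a * b for a, b in zip(z[i], z[j])) / n_t)
    return features


# Placeholder data: 53 components with 490 time points each (random, not real fMRI).
rng = random.Random(0)
timecourses = [[rng.gauss(0.0, 1.0) for _ in range(490)] for _ in range(53)]

features = fnc_features(timecourses)
print(len(features))                     # 53 * 52 / 2 = 1378
```

Dropping the diagonal (always 1) and the mirrored lower triangle removes redundant values, which is why a 53 × 53 matrix contributes 1378 rather than 2809 features per subject.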

Estimation using FNC features For the UKBiobank data, we applied a fully automated spatially constrained ICA approach to fMRI data from 11,754 subjects, yielding 53 nonartifactual intrinsic connectivity networks (ICNs) with 490 time points each (for more details, see Datasets in "Results"). Therefore, for each subject we have a 53 × 53 correlation matrix as features. Because these matrices are symmetric, we extract the off-diagonal upper-triangular values (53 × 52 / 2 = 1378 features) for each subject to train the regression models.

In this study, analyzing such different features using SVR models shows that some features have better estimation performance and lower error levels for the same dataset or across different datasets. For instance, FreeSurfer and GM features are extracted from the same UPENN-PNC dataset to estimate brain age. SVR models with GM features have lower RMSE and MAE scores than those with FreeSurfer features (see Table 1). Also, models with FNC features have higher error levels than models with FreeSurfer and GM features, even though the FNC models are trained with roughly 8 times more data (11,754 versus 1417 subjects) than the FreeSurfer or GM models. This is interesting at a practical level from a biological perspective. Though there is no theoretical analysis on SVR, federated SVMs have been examined and found to be accurate (Sarwate et al., 2014). We have used 152 FreeSurfer features and 116 GM features in separate tests, which we believe provides good test datasets without introducing too much complexity due to size. We also evaluate with larger, more complex features and also find good performance.

(a) FreeSurfer features (b) Graymatter ROI (c) FNC Component maps

Fig. 3 Different features extracted from sMRI and fMRI datasets for BAE

Sampling Methods As the data need to be distributed across six sites, we employ three strategies to split the data equally. We employ a 90%-10% train-test split at each site. In the first approach, we randomly partition the data into these sites, and within each site, the data is further randomly split into training and testing datasets. This approach is referred to as random sampling. In the second approach, we use stratified sampling based on subject age to partition the data into six sites. Within each site, the training and testing datasets were randomly split. We call this age-stratified sampling. In the third approach, we group the subjects into different bins based on their age ranges, label these bins and use these labels to perform stratified sampling. We refer to this method as age-bin-stratified sampling.

All the experiments were repeated five times, referred to as five runs, for every sampling method (similar to 5-fold cross validation) to avoid sampling bias. Each experiment run includes splitting the data across all the sites using a sampling method and training models using the decentralized approach. In each experiment run, to maintain consistency across training and testing datasets while building a centralized model, training data from all the sites is pooled into one training dataset, and testing data from all the sites is combined to create the testing dataset.

The results show that the decentralized models have similar performance compared to centralized models (see Table 1). For models trained using FreeSurfer features, the decentralized models showed higher RMSE and MAE values on the training data across all sampling strategies, whereas the centralized models demonstrated slightly higher values on the test data. For models trained with GM features, we observed slightly higher error values for decentralized models compared to centralized models on both the training and testing sets across all sampling strategies. For models trained with FNC features, the observations were similar to the ones found on models trained with FreeSurfer features. The higher MAE/RMSE values for models with FNC features are likely due to the age range of the UKBioBank data (40-80), rather than high dimensionality. But the decentralized models perform similarly to the centralized model and have much better performance than the local sites. For all the feature sets, the decentralized models have slightly higher error values than the centralized models during training, whereas the testing metrics are slightly improved relative to the centralized values for the FreeSurfer and FNC features.

Table 1 Performance comparison of decentralized and centralized models for BAE

Input data  Sampling           Method         RMSE train     RMSE test      MAE train      MAE test
FreeSurfer  random             Decentralized  3.673 ± 0.045  3.624 ± 0.242  3.139 ± 0.068  3.083 ± 0.221
FreeSurfer  random             Centralized    3.518 ± 0.025  3.845 ± 0.149  2.863 ± 0.017  3.244 ± 0.161
FreeSurfer  age stratified     Decentralized  3.701 ± 0.052  3.7 ± 0.184    3.156 ± 0.051  3.176 ± 0.172
FreeSurfer  age stratified     Centralized    3.501 ± 0.012  3.911 ± 0.137  2.85 ± 0.01    3.312 ± 0.159
FreeSurfer  age-bin stratified Decentralized  3.652 ± 0.033  3.646 ± 0.071  3.131 ± 0.038  3.102 ± 0.03
FreeSurfer  age-bin stratified Centralized    3.522 ± 0.01   3.862 ± 0.15   2.863 ± 0.013  3.24 ± 0.115
GM          random             Decentralized  2.652 ± 0.132  2.459 ± 0.156  2.172 ± 0.134  1.983 ± 0.147
GM          random             Centralized    2.138 ± 0.015  2.213 ± 0.141  1.686 ± 0.012  1.769 ± 0.109
GM          age stratified     Decentralized  2.588 ± 0.035  2.489 ± 0.141  2.063 ± 0.031  2.029 ± 0.105
GM          age stratified     Centralized    2.151 ± 0.01   2.185 ± 0.081  1.691 ± 0.009  1.761 ± 0.055
GM          age-bin stratified Decentralized  2.625 ± 0.061  2.613 ± 0.087  2.131 ± 0.061  2.12 ± 0.056
GM          age-bin stratified Centralized    2.149 ± 0.004  2.2 ± 0.062    1.694 ± 0.005  1.752 ± 0.054
FNC         random             Decentralized  7.497 ± 0.076  7.486 ± 0.074  6.285 ± 0.078  6.284 ± 0.066
FNC         random             Centralized    7.21 ± 0.005   7.857 ± 0.084  5.722 ± 0.009  6.535 ± 0.06
FNC         age stratified     Decentralized  7.502 ± 0.034  7.494 ± 0.073  6.298 ± 0.04   6.284 ± 0.058
FNC         age stratified     Centralized    7.203 ± 0.01   7.968 ± 0.123  5.709 ± 0.018  6.623 ± 0.138
FNC         age-bin stratified Decentralized  7.511 ± 0.056  7.501 ± 0.047  6.318 ± 0.045  6.279 ± 0.038
FNC         age-bin stratified Centralized    7.194 ± 0.005  7.997 ± 0.096  5.708 ± 0.01   6.624 ± 0.059

Figure 4 shows the performance of the final (decentralized) model compared to the local models across five runs (repetitions) for different feature sets. Figure 4a, b show the performance of decentralized and local models for FreeSurfer and GM models for 1417 subjects, and Fig. 4c shows the performance of the FNC dataset with 11,754 subjects. These results show that as the amount of training data increases, the decentralized model achieves better accuracy compared to any of the locally trained models.

(a) FreeSurfer (b) GM

Fig. 4 Performance of decentralized models compared to local models across five runs (repetitions) for different feature sets

We statistically compare the performance of centralized and decentralized models for both measures using a Wilcoxon signed-rank test (Woolson, 2007). This is a nonparametric pairwise comparison test, with no assumptions on the data distribution and a null hypothesis that the differences between two samples have a distribution centered about zero. For each metric (RMSE and MAE), we compare the training scores of decentralized and centralized models for different sampling methods separately using different input data features. We repeat this setup and also compare the testing scores for these metrics. In all the cases, this test fails to reject the null hypothesis, indicating that the decentralized models achieved performance similar to that of centralized models. This is a very interesting observation as we tested decentralized models with samples of over one thousand and over 11 thousand, and in all the cases decentralized models perform similarly to centralized models. This indicates that the decentralized models work consistently with centralized models for smaller or larger data samples.

We also test the performance of the models by varying the number of local sites. Performance comparisons of centralized and decentralized models using FreeSurfer (FS) and gray matter (GM) features from UPENN-PNC sMRI data and FNC features of UKBiobank fMRI data with varying numbers of local sites are shown in Figs. 5, 6 and 7, respectively. To perform these experiments, as we vary the number of sites (from 2 to 10), all the data is partitioned randomly across a given number of sites. At each site, 90% of the data is reserved for training and 10% of the data is used for testing. We repeat this process 5 times (similar to 5-fold cross validation) to avoid any sampling bias. During the training phase, the performance scores (MAE and RMSE) of all the models (Figs. 5a, c, 6a, c, 7a, c) increase with the increase in the number of local sites. As the number of sites increases, only a small number of data samples is present at each site, which could result in this trend. However, as the number of local sites increases for decentralization, the number of data samples and the type (and number) of features extracted from the data seem to play an important role in the convergence of the decentralized models. For instance, though the FreeSurfer (FS) based features and GM features are derived from the same dataset, UPENN-PNC with 1417 subjects, they have different patterns for testing scores. GM feature based models (which use 116 features for each subject) (Fig. 6b, d) start diverging with the increase in the number of sites, whereas for FreeSurfer feature based models (which use 152 features for each subject) (Fig. 5b, d), both MAE and RMSE testing scores converge. Similarly, for FNC based features (Fig. 7b, d), both MAE and RMSE testing scores of the models converge with the increase of the number of sites.

(a) FS MAE Training (b) FS MAE Testing (c) FS RMSE Training (d) FS RMSE Testing

Fig. 5 Performance comparison of centralized and decentralized models using 152 FreeSurfer (FS) features of UPENN-PNC sMRI data of 1417 subjects with varying number of local sites

(a) GM MAE Training (b) GM MAE Testing (c) GM RMSE Training (d) GM RMSE Testing

Fig. 6 Performance comparison of centralized and decentralized models using 116 gray matter (GM) features of UPENN-PNC sMRI data of 1417 subjects with varying number of local sites

(a) FNC MAE Training (b) FNC MAE Testing (c) FNC RMSE Training (d) FNC RMSE Testing

Fig. 7 Performance comparison of centralized and decentralized models using 1378 FNC features of UKBiobank fMRI data of 11,754 subjects with varying number of local sites

Conclusion

In this work, we employ decentralized machine learning SVR models for brain age estimation and compare the results with a centralized SVR model. The decentralized model is trained by utilizing information from locally trained models at different sites and involves no data sharing. Results from models trained on three different feature sets with three different data splitting strategies show that the performance of the decentralized model is as good as the centralized model and is consistently better than that of the individual sites separately. The key benefit of decentralization is that it does not require data sharing and therefore encourages collaboration by allowing different research groups to readily participate in larger studies without worrying about their data-sharing policies or data transmission. Future work will include developing a multishot version of this pipeline coupled with differential privacy strategies and mathematically exploring the effects of regression errors on decentralized models. We would also like to extend our existing model to harmonize heterogeneous data across different sites, analyze its effects on decentralized brain age estimation, implement our decentralized approach using other machine learning techniques, and compare their performance with the centralized method as well as across different machine learning models.

Information Sharing Statement

All the scripts used to perform the experiments in this paper are published in the GitHub repository: https://github.com/trendscenter/decentralized_brainage_paper

Acknowledgements We sincerely thank Debbrata Kumar Saha and Biozid Bostami for their comments. We are also grateful to the funding institutions for their support (National Institutes of Health, National Institute on Drug Abuse and the National Institute of Mental Health).

Author Contributions All the authors helped improve the manuscript. SB designed decentralized models for FNC features, performed the data analysis for all the models and wrote the initial manuscript. RR and HG designed the decentralized model for FreeSurfer and GM features. BR designed feature extraction strategies. AS and SP designed and provided insights into the decentralized regression models and helped in tuning their performances. JL provided guidance about feature extraction and decentralized model design. EV manages the COINSTAC project and helped write the paper. VDC supervised all the stages of the project and also funded the project.


Funding This work was funded by the National Institutes of Health (R01DA040487), National Institute on Drug Abuse (R01DA049238) and the National Institute of Mental Health (R01MH121246).

Data Availability UPENN-PNC (Satterthwaite et al., 2014) and UKBiobank (Miller et al., 2016) are openly available MRI datasets which can be accessed upon request. Please contact the authors of Satterthwaite et al. (2014) and Miller et al. (2016) for data.

Declarations

Competing Interest All the authors declare that they have no competing interests.

References

Aledhari, M., Razzak, R., Parizi, R. M., & Saeed, F. (2020). Federated learning: A survey on enabling technologies, protocols, and applications. IEEE Access, 8, 140699–140725.
Ashburner, J., Barnes, G., Chen, C.-C., Daunizeau, J., Flandin, G., Friston, K., Kiebel, S., Kilner, J., Litvak, V., Moran, R., et al. (2014). SPM12 manual. Wellcome Trust Centre for Neuroimaging, London, UK 2464.
Bostami, B., Vergara, V., & Calhoun, V. D. (2021a). Harmonization of multi-site dynamic functional connectivity network data. IEEE BIBE.
Bostami, B., Vergara, V., Calhoun, V. D., & Hillary, F. (2021b). Networking brain networks: Federated harmonization of neuroimaging data. Complex Networks, Madrid, Spain.
Chaudhuri, K., Monteleoni, C., & Sarwate, A. D. (2011). Differentially private empirical risk minimization. Journal of Machine Learning Research, 12, 3.
COINSTAC. http://coinstac.trendscenter.org.
Cole, J. H., Marioni, R. E., Harris, S. E., & Deary, I. J. (2019). Brain age and other bodily ages: implications for neuropsychiatry. Molecular psychiatry, 24(2), 266–281.
Cole, J. H., Poudel, R. P., Tsagkrasoulis, D., Caan, M. W., Steves, C., Spector, T. D., & Montana, G. (2017). Predicting brain age with deep learning from raw imaging data results in a reliable and heritable biomarker. NeuroImage, 163, 115–124.
Cole, J. H., Ritchie, S. J., Bastin, M. E., Hernández, M. V., Maniega, S. M., Royle, N., et al. (2018). Brain age predicts mortality. Molecular psychiatry, 23(5), 1385–1392.
Du, Y., Fu, Z., Sui, J., Gao, S., Xing, Y., Lin, D., Salman, M., Rahaman, M. A., Abrol, A., Chen, J., et al. (2019). Neuromark: a fully automated ICA method to identify effective fMRI markers of brain disorders. medRxiv, 19008631.
Du, Y., Pearlson, G. D., Liu, J., Sui, J., Yu, Q., He, H., et al. (2015). A group ICA based framework for evaluating resting fMRI markers when disease categories are unclear: application to schizophrenia, bipolar, and schizoaffective disorders. NeuroImage, 122, 272–280.
Elliott, M. L., Belsky, D. W., Knodt, A. R., Ireland, D., Melzer, T. R., Poulton, R., Ramrakha, S., Caspi, A., Moffitt, T. E., & Hariri, A. R. (2019). Brain-age in midlife is associated with accelerated biological aging and cognitive decline in a longitudinal birth cohort. Molecular psychiatry, 1–10.
Fischl, B. (2012). FreeSurfer. NeuroImage, 62(2), 774–781.
Franke, K., & Gaser, C. (2019). Ten years of BrainAGE as a neuroimaging biomarker of brain aging: what insights have we gained? Frontiers in neurology, 10, 789.
Gazula, H., Holla, B., Zhang, Z., Xu, J., Verner, E., Kelly, R., Schumann, G., & Calhoun, V. D. (2019). Decentralized multi-site VBM analysis during adolescence shows structural changes linked to age, body mass index, and smoking: A COINSTAC analysis. bioRxiv, 846386.
Jafri, M. J., Pearlson, G. D., Stevens, M., & Calhoun, V. D. (2008). A method for functional network connectivity among spatially independent resting-state components in schizophrenia. NeuroImage, 39(4), 1666–1681.
Jónsson, B. A., Bjornsdottir, G., Thorgeirsson, T., Ellingsen, L. M., Walters, G. B., Gudbjartsson, D., et al. (2019). Brain age prediction using deep learning uncovers associated sequence variants. Nature communications, 10(1), 1–10.
Li, T., Sahu, A. K., Talwalkar, A., & Smith, V. (2020). Federated learning: Challenges, methods, and future directions. IEEE Signal Processing Magazine, 37(3), 50–60.
Liem, F., Varoquaux, G., Kynast, J., Beyer, F., Masouleh, S. K., Huntenburg, J. M., et al. (2017). Predicting brain-age from multimodal imaging data captures cognitive impairment. NeuroImage, 148, 179–188.
Luders, E., Cherbuin, N., & Gaser, C. (2016). Estimating brain age using high-resolution pattern recognition: Younger brains in long-term meditation practitioners. NeuroImage, 134, 508–513.
Miller, K. L., Alfaro-Almagro, F., Bangerter, N. K., Thomas, D. L., Yacoub, E., Xu, J., et al. (2016). Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nature neuroscience, 19(11), 1523–1536.
Ming, J., Verner, E., Sarwate, A., Kelly, R., Reed, C., Kahleck, T., Silva, R., Panta, S., Turner, J., Plis, S., et al. (2017). COINSTAC: Decentralizing the future of brain imaging analysis. F1000Research 6.
Niu, X., Zhang, F., Kounios, J., & Liang, H. (2020). Improved prediction of brain age using multimodal neuroimaging data. Human Brain Mapping, 41(6), 1626–1643.
Plis, S. M., Sarwate, A. D., Wood, D., Dieringer, C., Landis, D., Reed, C., et al. (2016). COINSTAC: a privacy enabled model and prototype for leveraging and processing decentralized brain imaging data. Frontiers in neuroscience, 10, 365.
Ray, B., Duan, K., Chen, J., Fu, Z., Suresh, P., Johnson, S., Calhoun, V. D., & Liu, J. (2021). Multimodal brain age prediction with feature selection and comparison. EMBC.
Reeve, A., Simcox, E., & Turnbull, D. (2014). Ageing and Parkinson’s disease: why is advancing age the biggest risk factor? Ageing research reviews, 14, 19–30.
Rolls, E. T., Huang, C.-C., Lin, C.-P., Feng, J., & Joliot, M. (2020). Automated anatomical labelling atlas 3. NeuroImage, 206, 116189.
Sajedi, H., & Pardakhti, N. (2019). Age prediction based on brain MRI image: a survey. Journal of medical systems, 43(8), 279.
Sarwate, A. D., Plis, S. M., Turner, J. A., Arbabshirani, M. R., & Calhoun, V. D. (2014). Sharing privacy-sensitive access to neuroimaging and genetics data: a review and preliminary validation. Frontiers in neuroinformatics, 8, 35.
Satterthwaite, T. D., Elliott, M. A., Ruparel, K., Loughead, J., Prabhakaran, K., Calkins, M. E., et al. (2014). Neuroimaging of the Philadelphia neurodevelopmental cohort. NeuroImage, 86, 544–553.
Smith, S., Woolrich, M., Behrens, T., Beckmann, C., Flitney, D., Jenkinson, M., Bannister, P., Clare, S., De Luca, M., Hansen, P., et al. FMRIB software library.
Stankevičiūtė, K., Azevedo, T., Campbell, A., Bethlehem, R. A., & Liò, P. (2020). Population graph GNNs for brain age prediction. bioRxiv.
Steffener, J., Habeck, C., O’Shea, D., Razlighi, Q., Bherer, L., & Stern, Y. (2016). Differences between chronological and brain age are related to education and self-reported physical activity. Neurobiology of aging, 40, 138–144.
Sudlow, C., Gallacher, J., Allen, N., Beral, V., Burton, P., Danesh, J., et al. (2015). UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med, 12(3), e1001779.


White, T., Blok, E., & Calhoun, V. D. (2020). Data sharing and privacy issues in neuroimaging research: Opportunities, obstacles, challenges, and monsters under the bed. Human Brain Mapping.
Woolson, R. (2007). Wilcoxon signed-rank test. Wiley encyclopedia of clinical trials, 1–3.
Yang, L., Cao, C., Kantor, E. D., Nguyen, L. H., Zheng, X., Park, Y., et al. (2019). Trends in sedentary behavior among the US population, 2001–2016. JAMA, 321(16), 1587–1597.

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Keywords Brain age · Decentralized · Federated · COINSTAC