
Shivgram Education Society’s

SHREE RAYESHWAR INSTITUTE OF ENGINEERING & INFORMATION TECHNOLOGY


SHIRODA, GOA – 403 103

DEPARTMENT OF COMPUTER ENGINEERING

2022 – 2023

“Covid-19 detection using Federated Learning”


By
Mr. Lenin Fernandes
Mr. Vaibhav Kamat
Mr. Reuben Pereira
Mr. Srajan Shetty

Under the Guidance of


Prof. Pratiksha Shetgaonkar
Assistant Professor, Dept. of Computer Engineering

BACHELOR OF ENGINEERING: GOA UNIVERSITY

Shivgram Education Society’s
SHREE RAYESHWAR INSTITUTE OF ENGINEERING & INFORMATION TECHNOLOGY
SHIRODA, GOA – 403 103

2022 - 2023

CERTIFICATE
This is to certify that this dissertation entitled

“Covid-19 detection using Federated Learning”


Submitted in partial fulfillment of the requirements for Bachelor’s Degree in Computer
Engineering of Goa University is the bonafide work of

Mr. Lenin Fernandes


Mr. Vaibhav Kamat
Mr. Reuben Pereira
Mr. Srajan Shetty

__________________________________ ___________________________________
PROF. PRATIKSHA SHETGAONKAR PROF. PRATIKSHA SHETGAONKAR
ASSISTANT PROFESSOR HEAD OF DEPARTMENT (I/C)

________________________________
PROF. JATEEN SHET SHIRODKAR
PRINCIPAL (I/C)

Shivgram Education Society’s
SHREE RAYESHWAR INSTITUTE OF ENGINEERING & INFORMATION TECHNOLOGY
SHIRODA, GOA – 403 103

2022 - 2023

CERTIFICATE

The dissertation entitled,


“Covid-19 detection using Federated Learning”
Submitted by,
Mr. Lenin Fernandes
Mr. Vaibhav Kamat
Mr. Reuben Pereira
Mr. Srajan Shetty

In partial fulfillment of the requirements of the Bachelor’s Degree in Computer


Engineering of Goa University is evaluated and found satisfactory.

DATE: _____________ EXAMINER 1: _______________

PLACE: ___________ EXAMINER 2: _______________

ACKNOWLEDGEMENT

This dissertation would not have been possible without the guidance and help of several individuals who, in one way or another, contributed and extended their valuable assistance in the preparation and completion of this report.
We would like to express our gratitude to Goa University and Shree Rayeshwar Institute of Engineering and Information Technology for including this project in the course, which has provided us with an opportunity to work on the topic of "Covid-19 detection using Federated Learning".
We would like to express our gratitude to our mentor, Prof. Pratiksha Shetgaonkar, HOD, Computer Engineering, SRIEIT, who was a continual source of inspiration. She pushed us to think imaginatively and urged us to take on this work without hesitation. Her vast knowledge, extensive experience, and professional competence enabled us to successfully accomplish this project. This endeavor would not have been possible without her help and supervision.
This project would not have been a success without the contributions of each and every member. We were always there to cheer each other on, and that is what kept us together until the end.

Last but not least, we would like to express our gratitude to our families, siblings, and friends for their invaluable assistance. We are deeply grateful to everyone who has contributed to the successful completion of this project.

ABSTRACT

The COVID-19 pandemic has had a profound impact on global health and economies, resulting in
a staggering number of reported cases surpassing 360 million in 2022 alone. However, the strain
on healthcare systems has posed challenges in delivering adequate care to patients with various
other illnesses. One potential solution lies in leveraging deep learning technology to aid
radiologists in swiftly diagnosing COVID-19 by analyzing x-ray images.

Unfortunately, the collection and sharing of patient data on centralized servers is limited by
data privacy regulations. To address this issue, we propose federated learning as a means of
training models on COVID-19 data while respecting privacy concerns. Federated learning allows
collaborative model training across multiple institutions or devices without transferring patient
data to a central server.

In our research, we aim to assess the effectiveness of federated learning in accurately diagnosing
COVID-19. To do so, we plan to compare the performance of three widely recognized deep
learning models, namely VGG19, MobileNetV1, and InceptionV3. We will evaluate these models
both with and without the federated learning framework using chest x-ray image datasets.

By implementing federated learning, we can overcome the limitations imposed by data privacy
regulations. This approach enables us to conduct experiments and analyze the performance of
different models while keeping patient data localized and secure.

The findings from our research will provide valuable insights into the efficacy of federated
learning for COVID-19 diagnosis. This knowledge will contribute to the development of accurate
and efficient diagnostic tools, benefiting both radiologists and patients. Furthermore, it will pave
the way for future applications of federated learning in healthcare, allowing for collaborative
research and improved patient care while preserving data privacy.

TABLE OF CONTENTS

CHAPTER NO. TITLE PAGE NO.

TITLE PAGE i
CERTIFICATE ii
ACKNOWLEDGEMENT iv
ABSTRACT v
TABLE OF CONTENTS vi
LIST OF FIGURES vii
LIST OF TABLES ix
LIST OF ALGORITHMS x
ABBREVIATIONS xi

Chapter 1 INTRODUCTION …………………………….………….………….…………………….. (11)

1.1 Introduction & Background of Project…………..………..…………...(11)

1.2 Motivation for Research ……………………………………….………..….(13)

1.3 Research Questions, Problem Statement & Objectives….……..(15)

1.4 Report organization ………........................................………........….......(17)

Chapter 2 LITERATURE REVIEW ……….………….………….………….……………..……… (18)

2.1 Introduction .............................................................................…..……..... (18)

2.2 Review of Existing Literature……..……..……..……..….….…..……… (20)

2.3 Summary…………………………………………………………………....….... (27)

Chapter 3 RESEARCH DESIGN & METHODOLOGY ……………………….…….. (28)

3.1 Assumptions ………………….…………………………………..…………....…. (28)

3.2 Proposed Research Design/methodology ….….……………...……… (42)

Chapter 4 IMPLEMENTATION & RESULT ANALYSIS……………………………....… (48)

Chapter 5 CONCLUSION & FUTURE SCOPE………………………………………………… (60)

5.1 Conclusion………… ………………………………………………………….…. (60)

5.2 Future Scope….…………………………………………………………..…..… (61)

REFERENCES ….………………………………………………………………………………….………..……..(62)

LIST OF FIGURES

FIGURE NO. DESCRIPTION PAGE NO.

3.1 Architecture of VGG19 29

3.2 Architecture of InceptionV3 30

3.3 Architecture of MobilenetV1 31

3.4 Architecture of flower 35

3.5 Overview of the FL training process 42

4.1 Loss Convergence Comparison of VGG19 Model - Federated Learning vs Centralized Training 51

4.2 Loss Convergence Comparison of MobileNet Model - Federated Learning vs Centralized Training 52

4.3 Loss Convergence Comparison of InceptionV3 Model - Federated Learning vs Centralized Training 53

4.4 VGG19: Confusion Matrices of Client 1,2 and 3 54

4.5 InceptionV3: Confusion Matrices of Client 1,2 and 3 55

4.6 MobileNet: Confusion Matrices of Client 1,2 and 3 56

4.7 UI Screenshots 58

LIST OF TABLES

TABLE NO. DESCRIPTION PAGE NO.

2.1 Literature Review Table 21

3.1 Data-set Split 28

4.1 Test Results of clients training on their own datasets (non-FL) 48

4.2 Test results of clients combining datasets to train a centralized model 48

4.3 FL individual Client Accuracy values 49

4.4 FL individual Client Loss values 49

4.5 Comparison of Losses for VGG19 Model - Clients Trained with Federated Learning (FL) and Individual Datasets 50

4.6 Comparison of Losses for InceptionV3 Model - Clients Trained with Federated Learning (FL) and Individual Datasets 50

4.7 Comparison of Losses for MobileNet Model - Clients Trained with Federated Learning (FL) and Individual Datasets 51

LIST OF ALGORITHMS
ALGORITHM NO. DESCRIPTION PAGE NO.

3.1 Flower Client algorithm from official documentation 37

3.2 Flower Server algorithm from official documentation 38

4.1 Prediction code under front-end 59

Chapter 1: INTRODUCTION

1.1 Introduction & Background of Project

The ongoing COVID-19 pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has threatened human life, health, and productivity and has spread rapidly worldwide. Like other members of its family, the COVID-19 virus is sensitive to ultraviolet rays and heat. AI and deep learning play an essential role in the identification and classification of COVID-19 cases using computer-aided applications, achieving excellent results in identifying COVID-19 cases based on known symptoms including fever, chills, dry cough, and a positive x-ray. AI and deep learning models can also be used to forecast the spread of the virus based on historical data, which can help control its spread. There is therefore a need to build machine learning models to identify COVID-19-infected patients or to predict the future spread of the virus, but this is not easy to achieve because patient data is confidential, and without enough data it is difficult to build a robust model. A new approach is needed that makes it possible to build a model without accessing a patient's private data or transferring a patient's raw data, and one which gives high prediction accuracy.
The concept of Federated Machine Learning was proposed by Google in 2016 as a new machine learning paradigm. The objective of federated learning is to build a machine learning model based on distributed datasets without sharing raw data, thereby preserving data privacy. In Federated Machine Learning, each client (an organization, server, mobile device, or IoT device) has its own dataset and a local machine learning model. A centralized machine learning model (the global model) aggregates the distributed clients' model parameters (model gradients or weights). Each client trains its local machine learning model on its own dataset and shares the model parameters or weights with the global model. The global model runs a number of rounds to collect the distributed clients' model updates without sharing raw data.
Need for Federated Learning:
 Decentralized training removes the need to transfer all the data to one server to train the model, since training on each node occurs locally, unlike traditional machine learning, which requires moving all the data to a centralized server to build and train the model.
 No data privacy violation, as it applies methodologies such as differential privacy, homomorphic encryption, and secure multiparty computation, unlike traditional machine learning.
 A third party can be part of the training process as long as there is no data privacy violation and the data is secured; with traditional machine learning, involving a third party may not be an option, for example for military organizations.
 Less computational power is needed, as model training is performed on each client and the centralized model's primary role is to collect the distributed models' gradient updates, unlike traditional machine learning, where one centralized server contains all the data and requires high computational power for model training.
 Decentralized algorithms may provide the same or better performance as centralized algorithms.

It is highly recommended to use Federated Machine Learning rather than traditional machine learning in environments where data privacy is essential. Federated Learning can be applied in many disciplines, such as smart healthcare, sales, multi-party databases, and smart retail.

1.2 Motivation for Research
Federated learning provides an option in the medical domain wherein sensitive data is not shared and privacy is maintained: the training dataset stays on the devices, so a central data pool is not required for the model. Federated learning facilitates access to heterogeneous data even in cases where data sources can communicate only during certain times. Models are continually improved using client data, with no need to aggregate data for continual learning. This approach uses less complex hardware, because federated learning models do not need one complex central server to analyze data.

Advantages of Federated Learning are:

1.2.1 Communication Efficiency

Communication is a key bottleneck to consider when developing methods for federated networks. Federated networks potentially include a massive number of devices (for example, millions of smartphones), and communication in the network can be slower than local computation by many orders of magnitude. Therefore, federated learning depends on communication-efficient methods that iteratively send small messages or model updates as part of the distributed training process instead of sending the entire dataset over the network. There are two main goals to further reduce communication: (1) reducing the total number of communication rounds, or (2) reducing the size of the transmitted messages at each round. The following are general concepts that aim to achieve communication-efficient distributed learning methods:
 Local updating methods,
 Model compression schemes,
 Decentralized training.
1.2.2 Systems Heterogeneity
The storage, computational, and communication capabilities of the devices that are part of a
federated network may differ significantly. Differences usually occur due to variability in hardware
(CPU, memory), network connectivity (3G, 4G, 5G, wifi), and power supply (battery level).
Additionally, only a small fraction of the devices may be active at once. Each device may be
unreliable as it is not uncommon for an edge device to drop out due to connectivity or energy
constraints. Therefore, fault tolerance is important as participating devices may drop out before
completing the given training iteration. Therefore, federated learning methods have to be developed
so that they (1) anticipate a low amount of participation, (2) tolerate heterogeneous hardware, and (3) are robust to dropped devices in the network. There are some key directions to handle systems
heterogeneity:
 Asynchronous communication,
 Active device sampling,
 Fault tolerance.

1.2.3 Statistical Heterogeneity


Devices frequently generate and collect data in a non-identically distributed manner across the
network, e.g., mobile phone users have varied use of language in the context of a next-word
prediction task. Also, the number of data points across devices may vary significantly, and there may
be an underlying structure present that captures the relationship between devices and their associated
distributions. This data generation paradigm violates frequently-used independent and identically
distributed (I.I.D.) assumptions in distributed optimization, increases the likelihood of stragglers, and
may add complexity in terms of modeling, analysis, and evaluation. Challenges arise when training
federated models from data that is not identically distributed across devices, both in terms of
modeling the data and in terms of analyzing the convergence behavior of associated training
procedures.

1.2.4 Privacy Concerns


Privacy concerns often motivate the need to keep raw data on each device local in federated settings.
However, sharing other information such as model updates as part of the training process can also
potentially reveal sensitive information, either to a third party or to the central server. Recent methods aim to enhance the privacy of federated learning using secure multiparty computation (SMC) or differential privacy. However, those methods usually provide privacy at the cost of
reduced model performance or system efficiency. Therefore, balancing these trade-offs is a
considerable challenge in realizing private federated learning systems. Recently, multiple privacy-
preserving methods for machine learning have been researched. For example, the following three
main strategies could be used for federated settings: Differential privacy to communicate noisy data
sketches, homomorphic encryption to operate on encrypted data, and secure function evaluation or
multiparty computation.
● Differential Privacy
● Homomorphic Encryption

1.3 Research Questions, Problem Statement & Objectives
Research questions:
This research aims to find a solution to the following research questions:
RQ1) What are the existing solutions for dealing with COVID-19, and what methodologies have researchers used in the literature to address the problem?
RQ2) What are the problems with the existing solutions?
RQ3) How can the existing problems be addressed using available techniques?
RQ4) How can federated learning be used to tackle the problem?
Problem statement:
Computer-aided automated diagnosis of COVID-19 using federated learning, without sharing patients' sensitive data.

Research Methodology:
For our research, we employed Google Scholar as our database and performed a search using the
keywords 'COVID-19' and 'federated learning' to identify relevant research papers. The search
yielded a total of 128 papers published between 2018 and 2022. From this initial set, we carefully
selected 24 papers for further review based on their relevance to our project.

Objectives:
1. To understand the existing solutions for dealing with COVID-19 and the methodologies that researchers have used in the literature to address the problem.
Existing solutions to deal with COVID-19 include:
● Traditional medical diagnostic tools such as RT-PCR and rapid antigen tests.
● Telemedicine and remote monitoring tools for patients in self-isolation.
● Machine learning-based tools for detecting COVID-19 from medical images such as X-rays and CT scans.
2. To identify and understand the problems with the existing solutions to deal with COVID-19.
Problems with existing solutions include:
● Limited availability of diagnostic tests, particularly in resource-limited settings.
● Inaccurate results due to factors such as operator error and variability in test performance.
● Privacy concerns related to the collection, storage, and sharing of medical data.
3. To propose solutions to deal with the existing problems using available techniques.
Potential solutions to address the problems with existing COVID-19 detection solutions include:
● Developing more accurate and reliable diagnostic tests.
● Improving the accuracy of machine learning-based tools for detecting COVID-19 from
medical images.
● Implementing secure and privacy-preserving data storage and sharing protocols.
4. To explore the use of federated learning as a potential solution to tackle the problem of COVID-19.
Federated learning is a distributed machine learning approach that allows for the training of machine
learning models on decentralized datasets, without the need for central data collection and storage.
This can help to address the privacy concerns associated with traditional machine learning-based
COVID-19 detection tools, and enable the development of accurate and effective COVID-19
detection models without compromising the privacy of individual patients.

1.4 Report Organization
The report is organized into five chapters. Chapter 1 discusses the crucial role of AI and deep
learning in identifying and classifying COVID-19 cases using computer-aided applications. It
introduces federated learning as a technique that allows building ML models without accessing
patients' private data or transferring raw data. The chapter highlights the necessity of federated
learning, the motivation for the research, and the advantages of FL. It also lists the research
questions and objectives. Chapter 2 provides a comprehensive review of existing literature on the
research topic, including a summary of the findings. In Chapter 3, the assumptions, hypotheses, and
proposed methodology for the study are presented. Chapter 4 presents the implementation and an analysis
of the results. The final chapter, Chapter 5, serves as the conclusion and outlines the future scope of the
research, emphasizing the significance of AI and deep learning in COVID-19 management and the
potential impact of federated learning in healthcare and medical research.

CHAPTER 2 : LITERATURE REVIEW

2.1 Introduction

In the wake of the COVID-19 pandemic, the healthcare industry has been actively exploring various
technological solutions to aid in the detection and diagnosis of the disease. One such approach that
has gained significant attention is the application of federated learning in the field of COVID-19
detection. Federated learning is a decentralized machine learning approach that enables multiple
institutions or devices to collaboratively train a shared model without sharing raw patient data. This
privacy-preserving technique has the potential to revolutionize the way COVID-19 is diagnosed and
screened while ensuring data privacy and security. Federated learning leverages the power of
distributed computing and advanced machine learning algorithms to train models on data stored
locally at different medical institutions or devices. The models are trained in a collaborative manner,
with each institution contributing its data while keeping it securely within its own premises. This
approach addresses privacy concerns associated with sharing sensitive patient data, which is
particularly crucial in the healthcare domain. Several studies have explored the application of
federated learning specifically for COVID-19 detection and diagnosis. These studies utilize various
deep learning techniques and architectures, such as convolutional neural networks (CNNs), to
analyze medical imaging data, including chest X-rays and CT scans. By training models on
distributed datasets, these approaches enable accurate detection of COVID-19 cases while preserving
patient privacy.

One important aspect of federated learning for COVID-19 detection is the optimization of
hyperparameters and model aggregation. Genetic algorithms have been employed to optimize
learning rates, batch sizes, and other hyperparameters for individual end-device models. By
clustering clients based on their hyperparameters, the learning efficiency per training unit is
increased. The use of genetic algorithms helps in fine-tuning the hyperparameters and achieving
better model aggregation within a cluster. The performance of federated learning models for COVID-
19 detection has been evaluated using various metrics, including accuracy, area under the curve
(AUC), and receiver operating characteristic (ROC) scores. Comparisons have been made with
traditional learning methods, centralized models, and individual client models to assess the
effectiveness of federated learning. The results have shown promising improvements in accuracy and

performance, indicating the potential of federated learning in enhancing COVID-19 detection and
diagnosis.

Furthermore, federated learning has been extended to address other challenges in the field of
COVID-19 detection. For example, the generation of synthetic COVID-19 images using generative
adversarial networks (GANs) has been explored to enhance the detection process. Additionally,
federated learning has been combined with edge cloud computing to enable efficient and privacy-
enhanced COVID-19 detection in resource-constrained environments. In conclusion, federated
learning holds significant promise for COVID-19 detection and diagnosis. By utilizing distributed
datasets and collaborative model training, federated learning enables accurate identification of
COVID-19 cases while respecting data privacy and security. The optimization of hyperparameters,
model aggregation, and the integration of advanced techniques like GANs and edge cloud computing
further enhance the potential of federated learning in combating the COVID-19 pandemic. Continued
research and development in this field can lead to improved diagnostic accuracy, better resource
utilization, and enhanced patient privacy, ultimately contributing to more effective strategies for
managing and controlling the spread of COVID-19.

2.2 Review of Existing Literature

This literature review table provides a comprehensive overview of various studies and research
related to Federated Learning in the context of COVID-19 detection and medical imaging analysis.
The table highlights key performance metrics, methodologies, and focus areas of each study. Several
machine learning models, such as VGG19, InceptionV3, and MobileNetV1, have been applied in
federated learning frameworks to achieve accurate COVID-19 detection and medical diagnosis.
Overall, the literature review demonstrates the potential and impact of federated learning in various
fields, especially in healthcare. It highlights the growing interest in collaborative approaches that
facilitate multi-institutional collaborations without sharing sensitive patient data. The studies reveal
that federated learning achieves comparable results to centralized learning methods while preserving
privacy and avoiding data sharing.

Table 2.1: Literature Review Table

1. Genetic Clustered Federated Learning for COVID-19 Detection (MDPI, 2022)
Focus/Methodology: Genetic algorithm to optimize learning rates and batch sizes for each of the individual end-device models; ANN; whale optimization; FedTune, FedEx.
Data Type: Symptoms (cough, fever, sore throat, shortness of breath, headache); other features: gender, age 60 and above, test indication, test date; target feature: corona result.
Results: Training loss 0.2868, training accuracy 0.9109, validation loss 0.2836, validation accuracy 0.9125.

2. Federated learning for COVID-19 screening from Chest X-ray images (Elsevier, 2021)
Focus/Methodology: A collaborative federated learning framework allowing multiple medical institutions to screen COVID-19 from chest X-ray images using deep learning without sharing patient data; VGG16 and ResNet50.
Data Type: Chest X-ray images.
Results: FL-VGG16 + data augmentation: accuracy 94.40%, sensitivity 96.15%, specificity 92.66%. FL-ResNet50 + data augmentation: accuracy 97.0%, sensitivity 98.11%, specificity 95.89%.

3. Application of Federated Learning in building a robust COVID-19 Chest X-ray classification Model (2022)
Focus/Methodology: Compares FL model performance to a conventional central model and individual client models; DenseNet121; binary cross-entropy; Adam optimizer (10^-3).
Data Type: Chest X-ray images.
Results: FL: ROC-AUC scores of 99.61 (client 1), 92.31 (client 2), 88.31 (client 3); ROC-PRC scores of 99.85, 91.54, 88.57. Centralized: ROC-AUC scores of 99.86, 99.92, 93.51; ROC-PRC scores of 97.96, 98.94, 92.86.

4. Federated Learning for COVID-19 Detection with Generative Adversarial Networks in Edge Cloud Computing (IEEE, 2021)
Focus/Methodology: Proposes a new federated learning scheme, FedGAN, to generate realistic COVID-19 images for facilitating privacy-enhanced COVID-19 detection with generative adversarial networks (GANs) in edge cloud computing; FedGAN, custom CNN.
Data Type: X-ray: DarkCOVID dataset, ChestCOVID dataset.
Results: Proposed FedGAN scheme: COVID-19 precision 0.993, sensitivity 0.978, F1-score 0.991; Normal 0.964, 1.000, 0.969; Pneumonia 0.966, 0.876, 0.932.

5. CNN Based Transfer Learning Framework For Classification Of COVID-19 Disease From Chest X-ray (IEEE, 2021)
Focus/Methodology: A deep learning-based approach for identifying COVID-19 from patients' chest X-ray images; CNN, ReLU, SoftMax.
Data Type: X-ray (Kaggle).
Results: 5x5 kernel, ReLU activation, 10 epochs: loss 19.64, accuracy 91.85, val_loss 0.055, val_acc 97.5.

6. Diagnosing COVID-19 Pneumonia from X-Ray and CT Images using Deep Learning and Transfer Learning Algorithms (SPIE)
Focus/Methodology: Uses X-ray and CT scan images from multiple sources for COVID-19 detection using deep learning and transfer learning algorithms; a CNN and a modified pretrained AlexNet model are applied to the prepared X-ray and CT scan image datasets.
Data Type: COVID-19 X-ray and CT images (GitHub, BSTI, Kaggle, Radiopaedia).
Results: CNN: X-ray sensitivity 100, specificity 88, accuracy 94; CT 90, 100, 94.1. AlexNet: X-ray 100, 96, 98; CT 72, 100, 82.

7. Federated learning based Covid-19 detection (Wiley Online Library, 2022)
Focus/Methodology: A basic sequential CNN model based on deep and federated learning that focuses on user data security while simultaneously enhancing test accuracy; uses chest X-ray images for diagnosis; website, Flower, Xception, InceptionV3, DenseNet121, ResNet50.
Data Type: COVID19ACTION-RADIOLOGY-CXR.
Results: Xception: accuracy 0.9959, sensitivity 0.9911, specificity 1.0000, F1-score 0.9955; InceptionV3: 0.9917, 1.0000, 0.9845, 0.9922; DenseNet121: 0.9751, 1.0000, 0.9535, 0.9762; ResNet50: 0.9087, 0.8125, 0.9922, 0.8934.

8. Collaborative Federated Learning for Healthcare: Multi-Modal COVID-19 Diagnosis at the Edge (IEEE, 2022)
Focus/Methodology: Leverages the capabilities of edge computing in medicine by analyzing and evaluating the potential of intelligent processing of clinical visual data; VGG16 and ResNet50.
Data Type: Ultrasound and X-ray.
Results: CFL performance is considerably better than that of a multi-modal model trained in a conventional federated learning environment. Parameters of the CFL experiments: communication rounds 30, 50 and 100; epochs 5 and 10.

9. A Systematic Literature Review on Federated Learning (arXiv.org, 2020)
Focus/Methodology: Research questions: (1) since FL was proposed in 2016, what are the overall research and application trends of FL; (2) what methods improve the quality of FL model design and training; (3) what methods improve the quality of data privacy protection in FL; (4) what incentive mechanisms improve the data quality of FL; (5) can FL achieve learning effects similar to non-FL on the same dataset, and under what conditions. Methods covered: AE, ANN, CNN, DNN, ELM, GAN, NN, RNN, DT, RF, LR.
Results: By reading and analyzing articles related to FL in the five major databases, the authors conduct statistical analysis and present the results of a systematic literature review. Problem areas addressed: edge computing and the Internet of Things, healthcare, urban computing and smart city physical information systems, internet and finance, industrial manufacturing.

10. The future of digital health with federated learning (NPJ Digital Medicine, 2020)
Focus/Methodology: Considers key factors contributing to this issue, explores how federated learning (FL) may provide a solution for the future of digital health, and highlights the challenges and considerations that need to be addressed; FedAvg.
Results: FL is a promising approach to obtain powerful, accurate, safe, robust and unbiased models.

11. Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data (Nature Publishing Group, 2020)
Focus/Methodology: Primary focus on the performance impact of differentially private training techniques, which may reduce the risk of training data being reverse-engineered from model parameters; such reverse engineering is one of the many security and privacy concerns that remain for FL; ResNeXt, MobileNet-v2 and ResNet18.
Results: Ample and diverse data are needed; collaborative learning is superior; FL performs comparably to data sharing.

12. Federated Learning For Microvasculature Segmentation and Diabetic Retinopathy Classification of OCT Data (Ophthalmology Science, 2021)
Focus/Methodology: Evaluates the performance of a federated learning framework for deep neural network-based retinal microvascular segmentation and referable diabetic retinopathy (RDR) classification using OCT and OCT angiography (OCTA).
Data Type: 153 OCTA en face images used for microvasculature segmentation, acquired from 4 OCT instruments with fields of view ranging from 2 x 2 mm to 6 x 6 mm; the 700 eyes used for RDR classification consisted of OCTA en face images and structural OCT projections acquired from 2 commercial OCT systems.
Results: For both applications, federated learning achieved performance similar to internal models. For microvasculature segmentation, the federated learning model achieved a mean DSC across all test sets of 0.793, versus 0.807 for models trained on a fully centralized dataset. For RDR classification, federated learning achieved mean AUROCs of 0.954 and 0.960; the internal models attained mean AUROCs of 0.956 and 0.973. Similar results are reflected in the other calculated evaluation metrics.

13. COVID-19 Screening on Chest X-ray Images Using Deep Learning based Anomaly Detection (2020)
Focus/Methodology: Develops a new deep anomaly detection model for fast, reliable screening; CNN (trained on ImageNet).
Data Type: X-rays, CT scans.
Results: The model, which combines the classification and anomaly detection tasks, outperforms each single-task learning model.

14. Federated learning of predictive models from federated Electronic Health Records (NLM, 2020)
Focus/Methodology: Aims at (1) solving a binary supervised classification problem to predict hospitalizations for cardiac events using a distributed algorithm and (2) developing a general decentralized optimization framework enabling multiple data holders to collaborate and converge to a common predictive model without explicitly exchanging raw data; sparse Support Vector Machine (sSVM) classifier, cluster Primal Dual Splitting (cPDS) algorithm.
Data Type: Demographic data such as age, gender, and race; physical characteristics such as weight, height, and Body Mass Index (BMI); medical history captured by diagnoses.
Results: cPDS converges faster than centralized methods at the cost of some communication between agents. It also converges faster and with less communication overhead compared to an alternative distributed algorithm.

16. Federated learning based Covid-19 detection (2022)
Focus/Methodology: The proposed model helps users detect COVID-19 in a few seconds by uploading a single chest X-ray image; ResNet50, DenseNet121, InceptionV3, Xception.
Data Type: Chest X-rays, CT scans.
Results: The model can successfully classify X-ray images into COVID-19 positive and negative.

17. COVID-19 detection using federated machine learning (PLOS ONE, 2021)
Focus/Methodology: Studies the efficacy of federated learning versus traditional learning by developing two machine learning models (a federated learning model and a traditional machine learning model) using Keras and TensorFlow Federated. Algorithm 1: the federated learning model; Algorithm 2: the traditional machine learning model; input: COVID-19 dataset as a CSV file; output: model prediction accuracy and loss.
Data Type: Chest X-ray radiography.
Results: The proposed federated learning model gives better prediction accuracy and lower loss than the traditional model, but requires a higher training time.

18. Experiments of Federated Learning for COVID-19 Chest X-ray Images (2020)
Focus/Methodology: Four individual experiments to present the performance in federated learning of four different networks for COVID-19 CXR images: COVID-Net, ResNeXt, MobileNet-v2, and ResNet18; 3 individual client models, a conventional method using all 3 clients, and an FL method using the 3 clients are compared; the authors analyze the results and propose possible future improvements to inspire more research in federated learning for COVID-19.
Data Type: COVID-19 CXR images.
Results: ResNet18 has the fastest convergence speed and the highest accuracy rate (96.15%, 91.26%) on both the training set and the testing set. The ResNeXt convergence rate follows closely, but its accuracy is not as good as the second-ranked COVID-Net. Although MobileNet-v2 has a loss value similar to COVID-Net, its accuracy on the testing set is not satisfactory.

19. Federated Learning for Mobile Keyboard Prediction (Google, 2019)
Focus/Methodology: Trains a recurrent neural network language model using a distributed, on-device learning framework called federated learning for the purpose of next-word prediction in a virtual keyboard for smartphones; FederatedAveraging algorithm.
Data Type: Text from users.
Results: The performance of each model is evaluated using the recall metric, defined as the ratio of the number of correct predictions to the total number of tokens.

20. The future of digital health with federated learning (Springer Nature, 2020)
Focus/Methodology: Explores how federated learning (FL) may provide a solution for the future of digital health and highlights the challenges and considerations that need to be addressed; notes that evaluation on curated data sets introduces biases where demographics (e.g., gender, age) or technical imbalances (e.g., acquisition protocol, equipment manufacturer) skew predictions and adversely affect the accuracy for certain groups or sites.
Data Type: CT scan, MRI, BraTS.
Results: FL has an impact on nearly all stakeholders and the entire treatment cycle, ranging from improved medical image analysis providing clinicians with better diagnostic tools, over true precision medicine by helping to find similar patients, to collaborative and accelerated drug discovery decreasing cost and time-to-market for pharma companies.

21. Collaborative federated learning for healthcare: Multi-Modal COVID-19 Diagnosis at the Edge (IEEE, 2022)
Focus/Methodology: Leverages the capabilities of edge computing in medicine by evaluating the potential of intelligent processing of clinical data at the edge; the emerging concept of clustered federated learning (CFL) is used for automatic COVID-19 diagnosis, and the performance of the proposed framework is evaluated under different experimental setups on two benchmark datasets.
Data Type: X-ray and ultrasound.
Results: Improvements of 16% and 11% in overall F1-scores over the model trained with multi-modal COVID-19 data were achieved in the CFL setup on the X-ray and ultrasound datasets.

22. Federated Learning of electronic health records to improve mortality prediction in hospitalized patients with COVID-19 (JMIR Publications, 2021)
Focus/Methodology: Uses federated learning, a machine learning technique that avoids locally aggregating raw clinical data across multiple institutions, to predict mortality in hospitalized patients with COVID-19 within 7 days. Patient data were collected from the electronic health records of 5 hospitals within the Mount Sinai Health System; logistic regression with L1 regularization (least absolute shrinkage and selection operator, LASSO) and multilayer perceptron (MLP) models were trained using local data at each site; a pooled model with combined data from all 5 sites and a federated model that only shared parameters with a central aggregator were also developed.
Data Type: Electronic health records.
Results: The Mount Sinai Queens hospital was the only hospital where the LASSO-federated model performed worse than the LASSO-local model, with a difference of 0.012 in AUROC values. At the Mount Sinai West hospital, the LASSO-local model severely underperformed compared to the LASSO-federated model, with an AUROC difference of 0.319; Mount Sinai West had the lowest sample size (n=485) and the lowest COVID-19 mortality prevalence (5.6%) compared to all hospitals.

23. Dynamic fusion-based federated learning for COVID-19 detection (IEEE, 2021)
Focus/Methodology: A novel dynamic fusion-based federated learning approach for medical diagnostic image analysis to detect COVID-19 infections; an architecture is designed for dynamic fusion-based federated learning systems to analyze medical diagnostic images, which dynamically decides the participating clients according to their local model performance and schedules the model fusion based on the participating clients' training time.
Data Type: CT scan and X-ray.
Results: Dynamic fusion-based federated learning (DF_FL) achieves higher accuracy than the default setting of federated learning (D_FL) in 14 groups of experiments; in the other groups DF_FL has lower accuracy than D_FL by 0.57%, 1.331%, 0.951%, and 1.141%, respectively.

24. Human Activity Recognition using Federated Learning (IEEE, 2018)
Focus/Methodology: Compares a HAR (Human Activity Recognition) classifier trained using centralized learning with a HAR classifier trained using federated learning; DNN and a softmax regression model.
Data Type: Records of human activities: biking, sitting, standing, walking, stair up, stair down, and null.
Results: In experiments on synthetic and real-world datasets, federated learning achieves accuracy of up to 89% compared to up to 93% in centralized learning, at the price of higher communication cost for complex models with a high number of parameters, such as DNNs. The best test accuracy was 93% for the DNN and 83% for the softmax regression. The best-performing DNN model consists of 2 hidden layers with 100 neurons in each layer, with a total of 24,006 parameters.

25. Federated learning of predictive models from federated Electronic Health Records (Elsevier, 2018)
Focus/Methodology: Solves a binary supervised classification problem to predict hospitalizations for cardiac events using a distributed algorithm. A federated learning model was developed that predicts future hospitalizations for patients with heart-related diseases; the proposed decentralized framework, the cluster Primal Dual Splitting (cPDS) algorithm, solves the sparse Support Vector Machine (sSVM) problem, which yields classifiers using relatively few features and facilitates the interpretability of the classification decisions.
Data Type: Heart Disease dataset.
Results: cPDS converges faster than centralized methods at the cost of some communication between agents. It also converges faster and with less communication overhead compared to an alternative distributed algorithm. In both cases, it achieves similar prediction accuracy as measured by the Area Under the Receiver Operating Characteristic Curve (AUC) of the classifier.

26. Federated learning for predicting clinical outcomes in patients with COVID-19 (Springer Nature, 2021)
Focus/Methodology: Predicts the future oxygen requirements of symptomatic patients with COVID-19 using inputs of vital signs, laboratory data and chest X-rays. Data from 20 institutes across the globe were used to train an FL model called EXAM (electronic medical record (EMR) chest X-ray AI model), which is based on the CDS (clinical decision support) model; in total, 20 features (19 from the EMR and one CXR) were used as input to the model.
Data Type: CT scan and X-ray.
Results: EXAM achieved an average area under the curve (AUC) > 0.92 for predicting outcomes at 24 and 72 hours, provided a 16% improvement in average AUC measured across all participating sites, and gave an average increase in generalizability of 38% compared with models trained on single-site data. For prediction of mechanical ventilation treatment or death at 24 hours at the largest independent test site, EXAM achieved a sensitivity of 0.950 and a specificity of 0.882.

2.3 Summary

The concept of federated learning is a new and popular research topic and is being widely explored
in healthcare. Numerous reports have demonstrated proof of concept with respect to federated
learning applied to real-world medical imaging. With federated learning, members of the
participating community can obtain performance that is akin to training with a large dataset.
Therefore, federated learning offers a way to bypass the problem of not having sufficient labeled data
to deploy top-level machine-learning solutions, through the combined effort of various medical
institutions.

To conclude, it is extremely difficult for individual sites with small labeled datasets to build
their own AI models for patient diagnosis. With federated learning, this barrier is eliminated.
Through collaboration, multiple institutions can jointly train a global model that offers
greater accuracy over a larger spectrum of patients. In this collaborative effort, there is no direct data
sharing, as federated learning prioritizes the privacy of patient data. Federated learning offers easy
scalability, flexible training scheduling, and large effective training datasets through multi-site collaborations,
all essential conditions for the successful deployment of an AI solution.

Chapter 3: RESEARCH DESIGN & METHODOLOGY

3.1 Assumptions
3.1.1 Dataset

The proposed research aims to develop a federated learning (FL) model for COVID-19 detection using chest
X-ray images. The data used for each of the clients in this project will be non-Independent Identically
Distributed (non-IID) and will be collected from different sources such as the Eurorad database from Kaggle
Repository, and the IEEE data repository. The data will contain images belonging to two classes, i.e., COVID-19
present and absent, with variations in the number of images per class. The data will be divided into
training, validation, and test datasets for each of the clients. In this study, three different clients' datasets were
used to train and evaluate the model.

Client 1 utilized the "ieee8023" GitHub repository for obtaining COVID-19 positive chest X-ray images, and
the "paultimothymooney/chest-xray-pneumonia" Kaggle dataset for normal chest X-ray images. In total, the
client used 284 images, with an equal distribution of positive and normal cases. Client 2 contributed the
"COVID19ACTION-RADIOLOGY-CXR" dataset, which comprised 536 COVID-19 positive chest X-ray
images and 668 normal chest X-ray images, resulting in 1204 images. Lastly, Client 3's dataset, known as the
"(SIRM) COVID-19 DATABASE," included a larger number of samples. It contained 3516 COVID-19
positive chest X-ray images and 10,000 normal chest X-ray images, summing up to 13,516 images of which
1000 images from each class were used. In summary, the combined dataset used in this study consisted of
3,488 images, with varying distributions of normal and COVID-19 cases across the three clients' datasets.

Table 3.1

Data-set split

Client    Source                                                                  Covid    Normal

1         ieee8023 (GitHub), paultimothymooney/chest-xray-pneumonia (Kaggle)     142      142

2         COVID19ACTION-RADIOLOGY-CXR                                            536      668

3         (SIRM) COVID-19 DATABASE                                               1000     1000
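
As an illustration of how each client's local dataset might be prepared, the sketch below loads one client's folder of chest X-ray images and splits it into training and validation sets with TensorFlow. The directory layout, image size, and batch size are assumptions made for this example, not values prescribed by the datasets above.

# Minimal sketch of loading one client's local images. The directory layout
# (e.g., data/client1/{covid,normal}) is an illustrative assumption.
import tensorflow as tf

IMG_SIZE = (224, 224)   # assumed input size for the CNN backbones
BATCH_SIZE = 32

def load_client_dataset(client_dir, validation_split=0.2, seed=42):
    """Returns (train_ds, val_ds) for one client's folder of X-ray images."""
    train_ds = tf.keras.utils.image_dataset_from_directory(
        client_dir,
        validation_split=validation_split,
        subset="training",
        seed=seed,
        image_size=IMG_SIZE,
        batch_size=BATCH_SIZE,
        label_mode="binary",   # two classes: COVID-19 present / absent
    )
    val_ds = tf.keras.utils.image_dataset_from_directory(
        client_dir,
        validation_split=validation_split,
        subset="validation",
        seed=seed,
        image_size=IMG_SIZE,
        batch_size=BATCH_SIZE,
        label_mode="binary",
    )
    return train_ds, val_ds

# Example usage for client 1 (the path is hypothetical):
# train_ds, val_ds = load_client_dataset("data/client1")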

3.1.2 Models
1. VGG19:
 Details: VGG19 is a deep convolutional neural network (CNN) architecture. It consists of 19 layers,
including convolutional layers with small 3x3 filters, max-pooling layers, and fully connected layers. The
network is known for its simplicity and uniformity in architecture, with the same filter size and stride used
throughout the network.
 Uniqueness: VGG19 is renowned for its depth, which allows it to learn complex features and patterns
from images. It has a larger number of layers compared to earlier versions like VGG16, which enhances its
representational capacity and enables better feature extraction.
 Improvements: The main improvement of VGG19 over earlier versions is its increased depth, which
enables better feature learning. However, this increased depth also leads to a higher number of parameters and
computational complexity, making it more resource-intensive.

Fig 3.1 Architecture of VGG19

2. InceptionV3:
 Details: InceptionV3 is a deep CNN architecture developed by Google. It is known for its use of
"Inception modules" that employ multiple filter sizes (1x1, 3x3, 5x5) in parallel to capture different levels of
information. The network also incorporates factorized convolution and dimensionality reduction techniques to
reduce computational complexity.

 Uniqueness: InceptionV3's uniqueness lies in its efficient use of computational resources through the
Inception modules. The parallel filters capture both local and global information, allowing the network to
extract diverse and multi-scale features from images effectively.
 Improvements: InceptionV3 improved upon earlier versions like InceptionV1 by introducing factorized
convolutions, which reduces computational complexity while maintaining accuracy. This version also includes
additional regularization techniques like batch normalization and label smoothing, further enhancing
performance.

Fig 3.2 Architecture of InceptionV3

3. MobileNetV1:
 Details: MobileNetV1 is a lightweight CNN architecture designed for efficient inference on mobile and
embedded devices. It utilizes depth-wise separable convolutions, where each convolutional filter operates on
individual input channels separately, reducing computational requirements.
 Uniqueness: MobileNetV1's uniqueness lies in its emphasis on efficiency without significant loss in
accuracy. By employing depth-wise separable convolutions, it significantly reduces the number of parameters
and computations, making it ideal for resource-constrained devices.
 Improvements: Subsequent versions of MobileNet, such as MobileNetV2 and MobileNetV3, have
introduced improvements over MobileNetV1. MobileNetV2 introduced linear bottlenecks, inverted residuals,
and shortcut connections to enhance performance. MobileNetV3 further optimized the architecture with dynamic width control and improved activation functions, achieving higher accuracy with reduced
computational cost.

Fig 3.3 Architecture of MobilenetV1

These models each have their strengths and uniqueness, allowing researchers and practitioners to choose the
most suitable option based on their specific requirements and constraints.
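
To make the comparison concrete, the following sketch shows one way the three backbones could be instantiated with Keras for binary COVID-19 vs. normal classification. The transfer-learning head (global average pooling plus a sigmoid unit), the frozen base, and the 224x224 input size are illustrative assumptions rather than the exact configuration used in this project.

# Sketch of adapting the three backbones for binary COVID-19 vs. normal
# classification. The pooling/sigmoid head and input size are assumptions.
import tensorflow as tf

def build_model(backbone_name="vgg19", input_shape=(224, 224, 3)):
    backbones = {
        "vgg19": tf.keras.applications.VGG19,
        "inceptionv3": tf.keras.applications.InceptionV3,
        "mobilenet": tf.keras.applications.MobileNet,  # MobileNetV1
    }
    base = backbones[backbone_name](
        include_top=False, weights="imagenet", input_shape=input_shape
    )
    base.trainable = False  # transfer learning: freeze the convolutional base
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # COVID vs. normal
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# model = build_model("mobilenet")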

3.1.3 Federated learning averaging strategies
Federated Learning (FL) averaging strategies are algorithms used to aggregate model updates from multiple
participants in a federated learning setting. Here are the details of four popular FL averaging strategies:

1. FedAvg (Federated Averaging)


FedAvg follows a simple averaging approach. It starts by initializing a global model on a central server. Then,
in each round of federated learning, a subset of participants (e.g., devices or institutions) is selected. These
participants train the global model using their local data and compute model updates. The central server
collects these updates, aggregates them by averaging, and sends the updated global model back to the
participants.
Formula:
 Client update: `w_i(t+1) = ClientUpdate(w(t), D_i)`
 Aggregation: `w(t+1) = (1/N) * ∑(w_i(t+1))`
Algorithm
 Server:
Initialize the global model parameters, w(0).
 For each round t:
Randomly select a subset of participants, P_t.
Participants update their local model parameters:
Client update: w_i(t+1) = ClientUpdate(w(t), D_i)
Server aggregates the model updates:
Aggregation: w(t+1) = (1/|P_t|) * ∑(w_i(t+1)) where |P_t| represents the number of participants in round t.
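
The aggregation step above can be written in a few lines of Python. The sketch below performs the plain average of the formula over client weight lists as returned by Keras' model.get_weights(); it is a minimal illustration, not Flower's built-in FedAvg strategy.

# Minimal sketch of the FedAvg aggregation step. Each client update is a
# list of NumPy arrays (one per model layer), as returned by
# model.get_weights() in Keras.
import numpy as np

def fedavg_aggregate(client_weight_lists):
    """Unweighted average of the clients' layer weights,
    matching w(t+1) = (1/N) * sum_i w_i(t+1)."""
    num_clients = len(client_weight_lists)
    return [
        sum(layer_weights) / num_clients
        for layer_weights in zip(*client_weight_lists)
    ]

# Usage sketch: weights_a, weights_b come from model.get_weights() on each client
# new_global = fedavg_aggregate([weights_a, weights_b])
# model.set_weights(new_global)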

2. FedProx (Federated Proximal):


FedProx extends FedAvg by introducing a proximal term to address non-i.i.d. (non-independent and
identically distributed) data and encourage local models to stay close to the global model. It includes a
regularization term that penalizes large differences between local and global models.
Formula:
 Client update: the client minimizes its local loss plus a proximal term, `F_i(w) + (μ/2) * ||w - w(t)||²`, giving `w_i(t+1) = ClientUpdate(w(t), D_i, μ)`
 Aggregation: `w(t+1) = (1/N) * ∑(w_i(t+1))`
Algorithm
 Server:
Initialize the global model parameters, w(0).
 For each round t:

Randomly select a subset of participants, P_t.
Participants update their local model parameters:
Client update (with proximal term): w_i(t+1) = ClientUpdate(w(t), D_i, μ), where the client minimizes F_i(w) + (μ/2) * ||w - w(t)||²
Server aggregates the model updates:
Aggregation: w(t+1) = (1/|P_t|) * ∑(w_i(t+1)) where |P_t| represents the number of participants in
round t.
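
A minimal sketch of a FedProx-style local step is shown below, assuming a custom TensorFlow training loop: the client's loss is augmented with the proximal term (μ/2)·||w − w(t)||², which keeps the local weights close to the downloaded global weights. The value of μ, the alignment of global_weights with the model's trainable weights, and the loop structure are assumptions for illustration.

# Sketch of a FedProx-style local update: the usual loss plus a proximal
# penalty on the distance from the global weights. Illustrative only.
import tensorflow as tf

def fedprox_train_step(model, global_weights, x_batch, y_batch,
                       optimizer, mu=0.01):
    # global_weights is assumed to be a list aligned with model.trainable_weights
    loss_fn = tf.keras.losses.BinaryCrossentropy()
    with tf.GradientTape() as tape:
        preds = model(x_batch, training=True)
        loss = loss_fn(y_batch, preds)
        # Proximal term: penalise deviation from the current global model.
        prox = tf.add_n([
            tf.reduce_sum(tf.square(w - tf.constant(gw)))
            for w, gw in zip(model.trainable_weights, global_weights)
        ])
        loss += (mu / 2.0) * prox
    grads = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    return loss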

3. q-FedAvg (Quantization-Based Federated Averaging):


q-FedAvg addresses the communication bottleneck in FL by quantizing model updates before transmission. It
reduces the amount of data sent from participants to the central server by compressing the updates. The
quantization process involves reducing the precision of the model parameters.
Formula:
 Client update and aggregation are similar to FedAvg but with quantization of model updates.
Algorithm
 Server:
Initialize the global model parameters, w(0).
 For each round t:
Randomly select a subset of participants, P_t.
Participants update their local model parameters:
Client update with quantization: w_i(t+1) = Quantize(ClientUpdate(w(t), D_i))
Server aggregates the model updates:
Aggregation: w(t+1) = (1/|P_t|) * ∑(w_i(t+1)) where |P_t| represents the number of participants in
round t.
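
The sketch below illustrates the generic idea described above with a simple uniform 8-bit quantization of one weight array before transmission, together with its dequantization on the server side. The scheme is an assumption for illustration and not a specific q-FedAvg implementation.

# Generic sketch of uniform weight quantisation before transmission:
# each layer's parameters are reduced to 8-bit values plus a scale/offset,
# shrinking the update sent to the server. Illustrative only.
import numpy as np

def quantize_layer(w, num_bits=8):
    """Uniformly quantise a weight array; returns the values needed to
    reconstruct (dequantise) it on the server. Assumes num_bits <= 8."""
    w_min, w_max = w.min(), w.max()
    levels = 2 ** num_bits - 1
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    q = np.round((w - w_min) / scale).astype(np.uint8)
    return q, w_min, scale

def dequantize_layer(q, w_min, scale):
    return q.astype(np.float32) * scale + w_min

# Usage sketch for one layer of model.get_weights():
# q, w_min, scale = quantize_layer(weights[0])
# approx = dequantize_layer(q, w_min, scale)   # what the server would average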

4. per-FedAvg (Personalized Federated Averaging):


per-FedAvg aims to personalize the global model for each participant by adapting the learning rate based on
the importance of local data. Participants with more representative data for a particular task are assigned
higher learning rates, enabling them to have a larger impact on the global model.
Formula:
 Client update: `w_i(t+1) = ClientUpdate(w(t), D_i, η_i)`, with a personalized learning rate η_i based on data importance.
 Aggregation: `w(t+1) = (1/N) * ∑(w_i(t+1))`
Algorithm

 Server:
Initialize the global model parameters, w(0).
 For each round t:
Randomly select a subset of participants, P_t.
Participants update their local model parameters:
Client update with personalized learning rate: w_i(t+1) = ClientUpdate(w(t), D_i, η_i)
Server aggregates the model updates:
Aggregation: w(t+1) = (1/|P_t|) * ∑(w_i(t+1)) where |P_t| represents the number of participants in
round t.
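
A small sketch of a personalized client update is given below; it assumes, purely for illustration, that a client's learning rate η_i is scaled by the size of its local dataset, which is one possible proxy for data importance.

# Sketch of a personalised client update: each client trains with its own
# learning rate eta_i. Scoring "importance" by local dataset size is an
# illustrative assumption; other weightings are possible.
import tensorflow as tf

def personalized_client_update(model, global_weights, train_ds,
                               num_samples, base_lr=1e-3, ref_size=1000):
    model.set_weights(global_weights)
    eta_i = base_lr * min(num_samples / ref_size, 1.0)  # hypothetical rule
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=eta_i),
                  loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(train_ds, epochs=1, verbose=0)
    return model.get_weights()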

These algorithms provide different approaches to aggregating model updates in federated learning, taking into
account various considerations such as non-i.i.d. data, communication efficiency, and personalization. The
formulas provided describe the key steps in each algorithm's update and aggregation processes.

3.1.4 Flower
Flower is an open-source Python library designed to simplify the development of federated learning systems.
It provides a framework for implementing federated learning algorithms and enables collaboration among
distributed participants without the need to transfer sensitive data to a central server. Here's an overview of
Flower and its benefits:

Fig 3.4 Architecture of flower

1. What is Flower?
 Flower (flwr) is a "friendly federated learning framework": a library that facilitates the implementation of federated learning algorithms and workflows.
 Flower provides a unified interface and abstraction layer, making it easier to develop federated
learning systems regardless of the underlying deep learning framework (e.g., TensorFlow, PyTorch).

2. Why Flower?
 Federated learning presents unique challenges compared to traditional centralized machine learning
approaches, mainly due to privacy and data distribution concerns.

 Flower addresses these challenges by providing a decentralized framework for collaborative model
training, where participants keep their data local and only share model updates.
 Flower enables privacy-preserving federated learning, allowing participants to train models on their
local data while keeping sensitive information secure.

3. Benefits of using Flower:


 Simplified development: Flower offers an intuitive and unified interface, making it easier for developers
to implement federated learning systems and algorithms.
 Cross-framework compatibility: Flower is compatible with popular deep learning frameworks like
TensorFlow and PyTorch, allowing developers to leverage existing models and tools.
 Privacy preservation: With Flower, participants keep their data on local devices, addressing privacy
concerns associated with transferring sensitive data to a central server.
 Collaborative learning: Flower enables collaborative model training by aggregating model updates from
multiple participants, allowing for the collective intelligence of a distributed network.
 Flexible and customizable: Flower provides flexibility to customize the federated learning workflow,
allowing developers to tailor the process to specific requirements and constraints.

By using Flower, researchers and developers can overcome the challenges associated with federated learning,
including privacy concerns and distributed collaboration, while benefiting from a simplified development
experience and cross-framework compatibility.

RPC and gRPC


 Remote Procedure Call (RPC) is a protocol that one program can use to request a service from a program
located in another computer on a network without having to understand the network's details.
 gRPC is a modern open-source high performance Remote Procedure Call (RPC) framework that can run
in any environment.
 Flower uses gRPC framework to allow servers to call and execute the various steps of the training
process.

3.1.5 Flower client algorithm

The Flower client is a Python object that extends the NumPyClient class and has three procedures defined inside it:

 get_parameters
Called by the server to retrieve the current local model parameters from the client (for example, to initialize the global model).
 fit
Receives the current global weights from the server, trains the local model on the client's private data starting from those weights, and returns the updated weights once local training is done.
 evaluate
Called by the server to test the current state of the global model; the client evaluates the received weights on its private data and reports the resulting loss and metrics back to the server.

import flwr as fl
import tensorflow as tf

# Load model and data (MobileNetV2, CIFAR-10)
model = tf.keras.applications.MobileNetV2((32, 32, 3), classes=10, weights=None)
model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Define Flower client
class CifarClient(fl.client.NumPyClient):
    def get_parameters(self, config):
        # Return the current local model weights to the server
        return model.get_weights()

    def fit(self, parameters, config):
        # Receive global weights, train locally, and return the updated weights
        model.set_weights(parameters)
        model.fit(x_train, y_train, epochs=1, batch_size=32)
        return model.get_weights(), len(x_train), {}

    def evaluate(self, parameters, config):
        # Evaluate the received global weights on the local test data
        model.set_weights(parameters)
        loss, accuracy = model.evaluate(x_test, y_test)
        return loss, len(x_test), {"accuracy": accuracy}

# Start Flower client
fl.client.start_numpy_client(server_address="127.0.0.1:8080", client=CifarClient())

Algorithm 3.1
Flower Client algorithm from official documentation

3.1.6 Flower Server algorithm

 Contains the averaging algorithm and overall training strategy.


 Clients connect to the server over a secure gRPC connection and declare that they are ready to participate
in the training process.
 Once the required number of clients (minimum of 2) join the process, the server calls procedures over
gRPC.

import flwr as fl

# Start Flower server
fl.server.start_server(
    server_address="0.0.0.0:8080",
    config=fl.server.ServerConfig(num_rounds=3),
)

Algorithm 3.2
Flower Server algorithm from official documentation
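
By default the server uses the FedAvg strategy. A strategy object can also be passed explicitly to control client sampling and aggregation; the sketch below shows this for FedAvg (the parameter values are illustrative, not the settings used in our experiments):

import flwr as fl

strategy = fl.server.strategy.FedAvg(
    fraction_fit=1.0,         # sample all available clients in each round
    min_fit_clients=2,        # at least two clients must train per round
    min_available_clients=2,  # wait until two clients have connected
)

fl.server.start_server(
    server_address="0.0.0.0:8080",
    config=fl.server.ServerConfig(num_rounds=3),
    strategy=strategy,
)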

3.1.7 Non-i.i.d. data


Non-i.i.d. (non-independent and identically distributed) data refers to datasets in which the data samples are
not uniformly distributed or independently and identically generated across different participants or clients in
a federated learning setting. In non-i.i.d. scenarios, the data distribution and characteristics can vary
significantly across participants, leading to unique challenges in federated learning.
1. Heterogeneous Data Distribution:
 In non-i.i.d. settings, participants may have different data distributions, meaning their data samples come
from distinct populations or have varying statistical properties.
 Heterogeneous data distribution can arise due to factors like location, demographics, or individual
preferences, making it challenging to create a global model that performs well across all participants.

2. Imbalanced Data:
 Non-i.i.d. data often exhibits class imbalance, where certain classes or categories have significantly more
or fewer instances than others.
 Imbalanced data can affect the model's learning process, as the model may become biased towards the
majority class and struggle to learn from the minority class.

3. Concept Drift:

 Non-i.i.d. data can exhibit concept drift, which refers to the change in underlying data distribution over
time.
 Concept drift can occur due to various reasons, such as evolving user preferences, changes in the
environment, or shifts in data collection methods.
 Concept drift makes it challenging to maintain a consistent model performance over time, as the model
needs to adapt to changing data distributions.

4. Transfer Learning and Domain Adaptation:


 Non-i.i.d. data requires techniques like transfer learning and domain adaptation to handle the differences
in data distribution across participants.
 Transfer learning allows models to leverage knowledge learned from one domain (source domain) to
improve performance on a different but related domain (target domain). Instead of starting the training process
from scratch, the model can initialize with parameters pre-trained on a large, labeled dataset from the source
domain. This pre-training provides the model with general knowledge about the task, such as feature
extraction, that can be useful across domains. The model is then fine-tuned using a smaller labeled dataset
from the target domain, adapting its parameters to the specific characteristics of the target domain.
 Domain adaptation techniques aim to align or adapt the model to account for the differences in data
distributions between participants.

5. Personalization:
 Non-i.i.d. data provides an opportunity for personalized federated learning, where models can be
customized to suit individual participants' specific data distributions and requirements.
 Personalization allows the model to capture participant-specific characteristics and improve performance
for individual users.

Strategies to Address Non-I.I.D. Data:


 Client Selection: Careful client selection can be employed to ensure a representative subset of clients in
each round. This helps mitigate the impact of non-i.i.d. data by considering a diverse range of data sources.
 Local Training: Allowing clients to perform multiple local training iterations or epochs can help mitigate
non-i.i.d. effects by allowing clients to learn from their data more effectively.
 Personalized Federated Learning: Personalized federated learning algorithms assign different learning
rates or weights to different clients based on the relevance or importance of their data, which can help address
non-i.i.d. effects.

 Model Aggregation Techniques: Employing sophisticated aggregation techniques, such as weighted
averaging or weighted majority voting, can take into account the quality and representativeness of clients'
data.
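
As a concrete illustration of such weighted averaging, the generic sketch below (not tied to any particular framework) weights each client's update by its local sample count:

import numpy as np

def weighted_average(client_weights, client_sizes):
    # Aggregate layer by layer, weighting each client by its number of local samples
    coeffs = np.array(client_sizes, dtype=float) / sum(client_sizes)
    aggregated = []
    for layers in zip(*client_weights):             # one tuple of arrays per layer
        stacked = np.stack(layers)                  # shape: (num_clients, ...)
        aggregated.append(np.tensordot(coeffs, stacked, axes=1))
    return aggregated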

Research and Solutions for non-i.i.d data:


 Various research works and techniques have been proposed to tackle non-i.i.d. data in federated learning,
such as federated transfer learning, data augmentation, meta-learning, and federated meta-learning.
 These techniques aim to adapt the federated learning process to handle non-i.i.d. data by incorporating
methods from transfer learning, model adaptation, and leveraging additional meta-information.
3.1.8 Data preprocessing methods
1. Resizing:
 Resizing involves adjusting the size of images or data samples to a consistent shape or resolution.
 It is commonly used in computer vision tasks where images may have varying dimensions. Resizing
ensures uniformity, allowing models to process images efficiently.
 Techniques like bi-linear or nearest-neighbor interpolation can be used to resize images while preserving
their aspect ratio.

2. Normalization:
 Normalization is used to scale data features to a standard range, typically between 0 and 1 or -1 and 1.
 It helps in preventing certain features from dominating the learning process and improves convergence
during model training.
 Common normalization techniques include min-max scaling and z-score normalization (standardization).

3. One-Hot Encoding:
 One-hot encoding is used to represent categorical variables as binary vectors.
 It converts categorical data into a binary format that machine learning models can process effectively.
 Categories are binary-encoded, with 1 representing presence and 0 indicating absence.

4. Feature Scaling:
 Feature scaling is the process of standardizing the range of features in the datasets.
 It ensures that features with different scales and units are on a comparable level, preventing bias towards
features with larger values.
 Mean normalization or feature-wise scaling can be applied to achieve feature scaling.

5. Feature Extraction/Selection:
 In some cases, it may be necessary to extract relevant features or select a subset of features to improve
model performance and reduce complexity.
 Techniques like Principal Component Analysis (PCA), feature importance ranking, or domain knowledge
can be used for feature extraction or selection.
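
A minimal sketch of the resizing, normalization, and one-hot encoding steps described above, using TensorFlow/Keras utilities (the 224 × 224 size and two classes match our later experiments, but the code itself is only illustrative):

import numpy as np
import tensorflow as tf

def preprocess_image(image, target_size=(224, 224)):
    # Resize with bilinear interpolation and min-max normalize pixel values to [0, 1]
    image = tf.image.resize(image, target_size, method="bilinear")
    return tf.cast(image, tf.float32) / 255.0

def encode_labels(labels, num_classes=2):
    # One-hot encode integer class labels (e.g., 0 = covid, 1 = normal)
    return tf.keras.utils.to_categorical(labels, num_classes=num_classes)

# Example with a dummy image and two labels
dummy = np.random.randint(0, 256, size=(300, 300, 3)).astype("float32")
x = preprocess_image(dummy)        # shape (224, 224, 3), values in [0, 1]
y = encode_labels([0, 1])          # [[1, 0], [0, 1]]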

3.2 Proposed Research Design/methodology

Figure 3.5: Overview of the FL training process

In this study, we propose to investigate a federated learning framework utilizing a client-server architecture, as depicted in Figure 3.5. Our objective is to implement the Flower federated learning framework and utilize the
FedProx algorithm for aggregation for the classification of X-ray images into COVID-19 infected cases and
non-COVID-19 cases. In this configuration, a centralized parameter server maintains a global model, which is
shared with the clients, and coordinates their model updates. By leveraging the Flower federated learning
framework, we aim to facilitate efficient communication and collaboration between the clients while ensuring
the privacy and security of their individual datasets. In contrast to the FedAvg algorithm, we will employ the
FedProx algorithm as our optimization method. The FedProx algorithm incorporates a proximal term that
encourages the models to stay close to each client's initial model, thereby improving convergence and
robustness in federated learning scenarios. In this study, we will utilize pre-trained models such as VGG19,
MobileNetV1, and InceptionV3 for feature extraction and classification of X-ray images to detect COVID-19
disease. These models have demonstrated excellent performance in various computer vision tasks and are
widely used in the research community. The detailed architecture and specifications of the VGG19,
MobileNet, and InceptionV3 models will be described in Section 3.1 of our study.

The learning phase of this CNN model consists of several communication rounds in which the central server interacts synchronously with the clients. The CNN model is first initialized with random weights (1). These weights are then shared with all the clients (2). Each client then updates these weights by performing training on its local private data (3). Once local training is done, the updated weights are sent back to the server (4). Finally, the server receives the updates from all participating clients and aggregates the weights to compute the new global model (1). This model is then shared with the clients for the next FL round.

While choosing the models, the exact CNN architecture is not our main concern; there are several architectural choices that can slightly increase or decrease the overall performance. Our main aim is to demonstrate that federated learning of a deep CNN model allows us to reap the benefits of the rich private data held by multiple clients while preserving privacy.
For simplicity we adopt three well-known CNN architectures for image classification, namely VGG19, InceptionV3 and MobileNetV1, all pre-trained on the ImageNet dataset. To adapt the models for our specific task of classifying X-ray images into two classes (covid and normal), we made several modifications. First, for each architecture, we loaded the pre-trained model, excluding its fully connected layers, and specified the input shape of the X-ray images as (224, 224, 3), or (299, 299, 3) in the case of InceptionV3.
To prevent the pre-trained weights from being updated during training, we set all layers in the model to be
non-trainable. Next, we extracted the output tensor from the last layer of the model and flattened it to create a
1-dimensional feature vector. We then added a dense layer with 128 units and utilized the rectified linear unit
(ReLU) activation function to introduce non-linearity to the model. For the classification task, we included a
final dense layer with two units, representing the two classes (covid and normal), and applied the softmax
activation function to generate probability values for each class. To construct the complete modified model,
we used the Keras functional API to define the input and output tensors and created an instance of the Model
class. To optimize the model's performance, we selected the Adam optimizer with a learning rate of 0.0001.
We used the binary cross-entropy loss function, suitable for binary classification tasks, and measured the
model's accuracy during training.
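
The modifications above can be sketched with the Keras functional API as follows, shown here for VGG19; the InceptionV3 and MobileNetV1 variants differ only in the base model and input shape, and the settings follow the description in the text:

import tensorflow as tf
from tensorflow.keras import layers, models, optimizers

IMG_SHAPE = (224, 224, 3)   # (299, 299, 3) in the case of InceptionV3

# Pre-trained convolutional base without its fully connected head, frozen
base = tf.keras.applications.VGG19(weights="imagenet", include_top=False,
                                   input_shape=IMG_SHAPE)
base.trainable = False

# New classification head: flatten, 128-unit ReLU layer, 2-way softmax output
inputs = tf.keras.Input(shape=IMG_SHAPE)
x = base(inputs, training=False)
x = layers.Flatten()(x)
x = layers.Dense(128, activation="relu")(x)
outputs = layers.Dense(2, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer=optimizers.Adam(learning_rate=0.0001),
              loss="binary_crossentropy",   # binary cross-entropy, as described above
              metrics=["accuracy"])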

These modifications provide a framework for accurately classifying X-ray images into the two designated
categories. The model's input consists of X-ray images with dimensions of 224 × 224 pixels (299 x 299 in the
case of InceptionV3), and the output provides the probabilities for the covid and normal classes. By training
the model with appropriate X-ray datasets, we aim to achieve reliable and accurate classification results in the
context of our research.

Data
The data split has been explained in section 3.1.1. To ensure that the data is non-IID, we use a skewed class distribution and make sure that each client has a different number of images from each class. We further unbalance the distribution by letting some clients use a larger dataset than others. We applied data augmentation in order to artificially expand the size of the training and test subsets by creating modified versions of the images, using geometric transformations such as rotation, width and height shifts, shearing, flipping, and zoom; a sketch of such a pipeline is shown below.
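
A sketch of such an augmentation pipeline with Keras' ImageDataGenerator (the specific parameter values and directory path are illustrative assumptions, not the exact settings used):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=15,        # rotation
    width_shift_range=0.1,    # width shift
    height_shift_range=0.1,   # height shift
    shear_range=0.1,          # shearing
    zoom_range=0.1,           # zoom
    horizontal_flip=True,     # flipping
    rescale=1.0 / 255,
)

# Stream augmented batches from a client's local image directory (hypothetical path)
train_gen = datagen.flow_from_directory(
    "data/client_1/train", target_size=(224, 224), batch_size=32,
    class_mode="categorical"
)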

Experimental Setup
In this section, we provide details on the model hyperparameters used in our experiments, including learning
rate, batch size, activation function, optimizer details, number of epochs, and averaging algorithms at the
server side.

Model Hyperparameters
We adopted the same CNN networks, namely VGG19, MobileNetV1, and InceptionV3, for both federated
learning and centralized learning approaches. The pre-trained weights on ImageNet were utilized, and the
fully connected layer head was replaced by a new classification head for training and prediction. Below, we
outline the specific hyperparameters employed in our experiments:

Learning rate: We used a fixed learning rate of 0.0001 for all models to ensure stable convergence during
training.

Batch size: The batch size was set to 32 for all models, balancing computational efficiency and memory usage.

Activation function: ReLU (Rectified Linear Unit) was chosen as the activation function for all models due to
its effectiveness in capturing non-linear relationships.

Optimizer: We employed the Adam optimizer with a learning rate of 0.0001.

Number of epochs: To allow the models to converge adequately, we trained all models for 100 epochs,
ensuring sufficient exposure to the dataset.

Averaging algorithms at the server side: In the federated learning approach, we used FedProx as the algorithm
for aggregating model updates from participant nodes. FedProx is an extension of the federated averaging
algorithm that incorporates a proximal term to account for local participant data's distribution and ensure
fairness in the aggregation process.

FedProx aims to mitigate the effects of heterogeneous and imbalanced data distributions across participants by introducing a regularization term. This proximal term penalizes large deviations of a participant's local model from the current global model, with its strength controlled by a proximal coefficient μ. By doing so, FedProx limits how far individual clients can drift during local training and encourages participants with different data distributions to contribute more consistently to the global model.

Specifically, the FedProx algorithm can be summarized as follows:

 Initialization: The global model is initialized with pre-trained weights on ImageNet, and participant
nodes receive a copy of the global model.

 Local training: Each participant independently trains the model using their local data. The training is
typically performed using stochastic gradient descent (SGD) or another optimization algorithm, with
the participant's local loss function.

 Model update: After local training, each participant computes the difference between their local model
and the initial global model, resulting in a participant-specific update.

 Proximal term: FedProx introduces a proximal term to regularize the participant updates. This term penalizes local weights that deviate significantly from the global model; the penalty grows with the squared distance between the local and global weights and is scaled by the proximal coefficient μ.

 Aggregation: The server aggregates the participant updates using weighted averaging. The weights are
typically determined by the size of the participant's local data or another criterion that accounts for
data quality or relevance.

 Model synchronization: The aggregated update is applied to the global model, and the process repeats
for a specified number of communication rounds.
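
In code, the proximal term amounts to adding a penalty on the squared distance between the local weights and the received global weights during each local training step. The TensorFlow sketch below illustrates this local objective; the value of the proximal coefficient μ and the helper names are our own illustrative assumptions, not the Flower implementation itself:

import tensorflow as tf

MU = 0.01   # proximal coefficient (illustrative value)

def proximal_term(local_weights, global_weights):
    # (mu / 2) * ||w - w_global||^2, summed over all weight tensors
    return 0.5 * MU * tf.add_n(
        [tf.reduce_sum(tf.square(w - wg)) for w, wg in zip(local_weights, global_weights)]
    )

def local_train_step(model, global_weights, x_batch, y_batch, optimizer, loss_fn):
    # One local step minimizing: local loss + proximal penalty toward the global model
    with tf.GradientTape() as tape:
        preds = model(x_batch, training=True)
        loss = loss_fn(y_batch, preds) + proximal_term(model.trainable_weights, global_weights)
    grads = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    return loss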

By using the FedProx algorithm, we aimed to address the potential data distribution imbalances across
participants, ensuring fair aggregation and representation of participant contributions. The inclusion of the
proximal term helps maintain the integrity of the global model while incorporating the benefits of participant-
specific updates.

By adopting these hyper-parameter settings, we aimed to create a fair comparison between the federated
learning-based method and the traditional centralized learning approach. These hyper-parameters were
selected based on prior literature, initial experimentation, and computational constraints.

It is worth noting that these hyper-parameters might require adjustment based on the specific dataset
characteristics, computing resources, and convergence behavior observed during training. Therefore, we
conducted preliminary experiments and hyper-parameter tuning to ensure the selected settings provided
optimal performance for our COVID-19 diagnosis task.

Evaluation Metrics
Loss
Loss is a fundamental concept in machine learning that quantifies the discrepancy between the predicted
outputs of a model and the true values. It serves as a measure of how well the model is performing on a given
task. In the context of classification problems, where the goal is to assign a label or class to each input, the
choice of an appropriate loss function is crucial. One commonly used loss function for binary classification is
binary cross-entropy, also known as log loss. The binary cross-entropy loss function evaluates the difference
between the predicted probabilities and the true binary labels. It calculates the average cross-entropy across all
examples in the dataset.

The essence of binary cross-entropy is to penalize large deviations between the predicted probabilities and the
true labels. When the predicted probability aligns well with the true label (0 or 1), the loss is minimized.
However, as the predictions deviate from the true values, the loss increases. By minimizing the binary cross-
entropy loss during model training, the model aims to adjust its parameters in a way that improves the
accuracy of its predictions for binary classification tasks.

The formula of this loss function can be given by:

Loss = -(1/N) * ∑ [ y_i * log(p_i) + (1 - y_i) * log(1 - p_i) ]

where N is the number of examples, y_i is the true binary label of example i, and p_i is the predicted probability of the positive class.

Accuracy
Accuracy is a widely used evaluation metric in machine learning that measures the overall correctness of a
model's predictions. It quantifies the proportion of correctly classified examples out of the total number of
examples in the dataset. Accuracy provides a simple and intuitive measure of a model's performance,
especially in classification tasks. In the context of binary classification, accuracy specifically measures the
proportion of correctly predicted binary labels (true positives and true negatives) relative to the total number
of examples. It represents the model's ability to correctly identify the positive and negative instances in the
dataset. To calculate accuracy, the model's predicted labels are compared to the true labels for each example.
If the predicted label matches the true label, it is considered a correct prediction. The accuracy is then
computed as the ratio of the number of correct predictions to the total number of examples. Accuracy is a
valuable metric as it provides an easily interpretable measure of how well the model is performing overall.
However, it is important to note that accuracy alone may not always be sufficient for evaluating model
performance, especially when dealing with imbalanced datasets or when different types of errors have
different consequences. In such cases, additional metrics like precision, recall, and F1-score, or considering
the confusion matrix, can provide more comprehensive insights into the model's performance.
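
In terms of the confusion-matrix counts defined in the next subsection (TP, TN, FP, FN), accuracy can be written as:

Accuracy = (TP + TN) / (TP + TN + FP + FN)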

Confusion Matrix
A confusion matrix is a table that visualizes the performance of a classification model by comparing the predicted class labels with the true class labels. It provides insights into the types of errors made by the model. The confusion matrix consists of four components: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Each row of the matrix corresponds to the actual class labels, while each column represents the predicted class labels. The diagonal elements (top-left to bottom-right) represent the correct predictions, and the off-diagonal elements represent the misclassifications. The confusion matrix is useful for computing various evaluation metrics, such as precision, recall, and F1-score, which provide more detailed insights into the model's performance for each class.
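
For completeness, precision, recall, and the F1-score can be computed from these counts; a small generic helper (illustrative, not part of our training code) is sketched below:

def classification_metrics(tp, tn, fp, fn):
    # Accuracy, precision, recall, and F1-score from confusion-matrix counts
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Example with illustrative counts
print(classification_metrics(tp=14, tn=15, fp=0, fn=1))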

Chapter 4 : IMPLEMENTATION & RESULT ANALYSIS

In this chapter, we present the results of our experiments using three different models (VGG19, InceptionV3,
and MobileNetV1) and three clients. We compare the traditional training setup, where each client trains its
own model with its own data (Table 4.1), with a centralized setup where all three clients combine their data to
train a single model (Table 4.2). Note that combining all the client data is done only to compare it with the
Federated Learning (FL) scenario. Finally, we evaluate the results of the FL approach, where each client trains
its own model and contributes its learnings to a global central model without sharing private data (Table 4.3
and Table 4.4). These results are obtained after 100 FL rounds with the local epochs set to 3 which results in a
total of 300 epochs. The non FL trainings on the other hand were carried out for 100 epochs.

TABLE 4.1

Test Results of clients training on their own datasets (non-FL)

Model        | CLIENT 1              | CLIENT 2              | CLIENT 3
             | Accuracy (%) | Loss   | Accuracy (%) | Loss   | Accuracy (%) | Loss
VGG19        | 100          | 2.963  | 97.77        | 0.154  | 87.19        | 0.295
InceptionV3  | 100          | 3.7024 | 94.77        | 0.1790 | 76.53        | 0.5155
MobileNetV1  | 90           | 0.5303 | 61.48        | 0.6025 | 85.2         | 0.3298

TABLE 4.2

Test results of clients combining datasets to train a centralized model

Model        | Accuracy | Loss
VGG19        | 0.9930   | 0.0718
InceptionV3  | 0.8553   | 0.31575
MobileNetV1  | 0.8832   | 0.2684

TABLE 4.3

FL individual Client Accuracy values (%)

Model        | CLIENT 1               | CLIENT 2               | CLIENT 3
             | Train | Val   | Test   | Train | Val   | Test   | Train | Val   | Test
VGG19        | 96.87 | 100   | 96.66  | 93.75 | 95.50 | 94.44  | 90.62 | 89.60 | 90.80
InceptionV3  | 100   | 100   | 90.00  | 93.75 | 88.76 | 87.77  | 84.37 | 78.79 | 78.8
MobileNetV1  | 100   | 100   | 96.66  | 93.75 | 97.75 | 93.33  | 92.18 | 90.39 | 82.0

TABLE 4.4

FL individual Client Loss values

Model        | CLIENT 1               | CLIENT 2               | CLIENT 3
             | Train | Val   | Test   | Train | Val   | Test   | Train | Val   | Test
VGG19        | 0.028 | 0.060 | 0.033  | 0.233 | 0.224 | 0.055  | 0.328 | 0.593 | 0.092
InceptionV3  | 0.004 | 0.016 | 0.1    | 0.203 | 0.153 | 0.122  | 0.360 | 0.457 | 0.212
MobileNetV1  | 8.626 | 2.448 | 0.033  | 0.366 | 0.050 | 0.066  | 0.211 | 0.476 | 0.18

Our experimental findings reveal that FL results (shown in Table 4.3 and Table 4.4) outperform the traditional
training results (Table 4.1), where each client trains with only its own data. Additionally, the FL results are
comparable to those of the combined model in the centralized setup (Table 4.2). It is worth noting that the
VGG19 model consistently exhibits the best performance among the three clients, thanks to its powerful
architecture.

Specifically, when using the FL strategy, the test loss of the client 1 model trained with VGG19 decreased significantly from 2.963 to 0.033, from 0.154 to 0.055 for client 2, and from 0.295 to 0.092 for client 3. Similarly, the losses of the other models also decreased when employing FL instead of training each client only with its own data. Tables 4.5, 4.6, and 4.7 compare the losses incurred during FL training with those obtained when each client trains only on its own data.

Comparing the traditional training results with the FL results, we observe a substantial reduction in test loss
when using FL. This highlights the advantage of FL in leveraging collective knowledge from diverse datasets
without compromising privacy. By sharing model updates instead of raw data, FL enables each client to
benefit from the insights derived from other clients' data while ensuring the privacy and security of individual
datasets.

TABLE 4.5

Comparison of Losses for VGG19 Model - Clients Trained with Federated Learning (FL) and Individual Datasets

Client  | 1     | 2     | 3
Non-FL  | 2.963 | 0.154 | 0.295
FL      | 0.033 | 0.055 | 0.092

TABLE 4.6

Comparison of Losses for InceptionV3 Model - Clients Trained with Federated Learning (FL) and Individual Datasets

Client  | 1      | 2      | 3
Non-FL  | 3.7024 | 0.1790 | 0.5155
FL      | 0.1    | 0.122  | 0.212

TABLE 4.7

Comparison of Losses for MobileNetV1 Model - Clients Trained with Federated Learning (FL) and Individual Datasets

Client  | 1      | 2      | 3
Non-FL  | 0.5303 | 0.6025 | 0.3298
FL      | 0.033  | 0.066  | 0.18


Figure 4.1: Loss and Accuracy Comparison of VGG19 Model - Federated Learning vs Centralized
Training.

In Figure 4.1, in addition to the loss convergence graph, there is an accuracy graph displayed
alongside it. The accuracy graph represents the performance of the VGG19 model when trained
using federated learning (FL) compared to the scenario where all datasets are combined to train a
single global model in traditional machine learning (ML) approach.

The accuracy graph shows the performance of the model on a test dataset as training progresses. It
indicates how well the model is able to generalize to unseen data as it undergoes training rounds (in
the case of FL) or epochs (in the case of non-FL approach).

Based on the findings from the comparison, it can be observed that federated learning (FL) achieves
a level of accuracy comparable to the traditional ML approach. This implies that FL, which involves training models locally on client devices and aggregating their updates, can effectively capture the
underlying patterns in the decentralized data and achieve results on par with centralized approaches
that collect and use data from diverse sources to train a global model.


Figure 4.2 Loss and Accuracy Comparison of MobileNetV1 Model - Federated Learning vs Centralized
Training.

In Figure 4.2, we present the loss convergence graph of the MobileNetV1 model, where we compare
its performance when trained using federated learning (FL) against a scenario where all datasets are
combined to train a single global model in traditional machine learning (ML) approach. The graph
displays how the loss value changes as training progresses. In the case of federated learning (FL), we
conducted 100 rounds of training, with each round comprising 3 epochs. On the other hand, the non-
FL approach involved training for 100 epochs. The comparison allows us to assess the effectiveness
of FL in capturing underlying patterns in decentralized data and achieving results comparable to
centralized approaches. Accompanying the loss convergence graph, there is an accuracy graph shown
on the side. This accuracy graph tracks the model's performance on a test dataset as training
advances. It illustrates how well the MobileNetV1 model generalizes to unseen data during the
training rounds (FL) or epochs (non-FL). Based on the findings from the comparison, we observe
that federated learning (FL) demonstrates a level of accuracy that is on par with the traditional ML
approach. This indicates that FL, with its ability to train models locally on client devices and
aggregate their updates, can effectively capture the underlying patterns in decentralized data and
achieve results comparable to centralized approaches, where data from diverse sources is collected
and used to train a global model.


Figure 4.3: Loss and Accuracy Comparison of InceptionV3 Model - Federated Learning vs Centralized
Training.

In Figure 4.3, we depict the loss convergence graph of the InceptionV3 model, enabling a
comparison of its performance when trained using federated learning (FL) against a scenario where
all datasets are combined to train a single global model in the traditional machine learning (ML)
approach. The graph visually illustrates the changes in the loss value during the training process. For
federated learning (FL), we conducted 100 rounds of training, with each round comprising 3 epochs.
Conversely, the non-FL approach involved training the model for 100 epochs. This setup allows us to
evaluate the effectiveness of FL in capturing the underlying patterns present in decentralized data and
to assess its ability to achieve results on par with centralized approaches. In conjunction with the loss
convergence graph, we present an accuracy graph on the side. This accuracy graph tracks the
performance of the InceptionV3 model on a test dataset as training progresses. It demonstrates how
well the model generalizes to unseen data during the training rounds (FL) or epochs (non-FL). Upon
analyzing the comparison, we find that federated learning (FL) achieves a level of accuracy
comparable to the traditional ML approach. These results suggest that FL, which involves training
models locally on client devices and aggregating their updates, is proficient in capturing the
underlying patterns in decentralized data, ultimately achieving performance on par with centralized
approaches. In the traditional ML approach, data from diverse sources is collected and utilized to
train a global model. However, FL showcases its capability to harness the power of decentralized
data while delivering comparable results to the traditional centralized approach.

Fig 4.4 VGG19: Confusion Matrices of Client 1,2 and 3

In Fig 4.4 we see the confusion matrices of the 3 clients when trained using VGG19. The first confusion
matrix [14 1; 0 15] indicates that Client 1 achieved high accuracy for both classes. It correctly predicted 14
examples from class 0 (positive class) and 15 examples from class 1 (negative class), with only one
misclassification between the two classes. This suggests that Client 1 effectively learned the distinguishing
features of the dataset using the VGG19 model. The second confusion matrix [36 4; 1 49] shows that Client 2 also achieved high accuracy for both classes. It correctly predicted 36 examples from class 0 and 49 examples from class 1, making five misclassifications in total: four examples from class 0 were predicted as class 1, and one example from class 1 was predicted as class 0. While the accuracy is still relatively high, there is room for improvement in handling class 0 predictions for Client 2. The third confusion matrix [113 12; 11 114] indicates that Client 3 achieved high accuracy for both classes. It correctly predicted 113 examples from class 0 and 114 examples from class 1, with 23 misclassifications: 12 examples from class 0 were predicted as class 1 and 11 examples from class 1 were predicted as class 0. The accuracy is generally high, but further investigation is needed to understand the patterns of misclassification and potential areas for improvement.

Fig 4.5 InceptionV3: Confusion Matrices of Client 1,2 and 3

The first confusion matrix [12 3; 0 15] indicates that Client 1 achieved high accuracy for both classes. It
correctly predicted 12 examples from class 0 (positive class) and 15 examples from class 1 (negative class).
However, it made three misclassifications, predicting three examples from class 0 as class 1. These
misclassifications highlight a potential area for improvement in Client 1's training with the InceptionV3
model. The second confusion matrix [30 10; 1 49] shows that Client 2 achieved high accuracy for both classes. It correctly predicted 30 examples from class 0 and 49 examples from class 1, making eleven misclassifications in total: ten examples from class 0 were predicted as class 1 and one example from class 1 was predicted as class 0. While the accuracy is still relatively high, the misclassifications suggest the need for further refinement in Client 2's training process with the InceptionV3 model. The third confusion matrix [113 12; 41 84] shows that Client 3 achieved high accuracy for class 0 but a noticeably lower accuracy for class 1. It correctly predicted 113 examples from class 0 and 84 examples from class 1, with 53 misclassifications: 12 examples from class 0 were predicted as class 1 and 41 examples from class 1 were predicted as class 0. These misclassifications indicate areas for improvement in Client 3's training with the InceptionV3 model. Overall, the confusion matrices reveal that the InceptionV3 model's performance varied across the three clients during the federated learning process. While all clients achieved reasonably high overall accuracy, there were variations in the number of misclassifications, suggesting the need for further analysis and refinement in the training procedure.

Fig 4.6 MobileNetV1: Confusion Matrices of Client 1, 2 and 3

The first confusion matrix [14 1; 0 15] indicates that Client 1 achieved high accuracy for both classes. It
correctly predicted 14 examples from class 0 (positive class) and 15 examples from class 1 (negative class),
with only one misclassification between the two classes. This suggests that Client 1 effectively learned the
distinguishing features of the dataset using the MobileNetV1 model. The second confusion matrix [36 4; 0 50] shows that Client 2 achieved high accuracy for both classes. It correctly predicted 36 examples from class 0 and 50 examples from class 1, making only four misclassifications, all of them examples from class 0 predicted as class 1 and none in the reverse direction; this indicates a strong performance by Client 2 with the MobileNetV1 model. The third confusion matrix [118 7; 28 97] demonstrates that Client 3 achieved high accuracy for both classes. It correctly predicted 118 examples from class 0 and 97 examples from class 1, with 35 misclassifications: seven examples from class 0 were predicted as class 1 and 28 examples from class 1 were predicted as class 0. While the accuracy is generally high, there are opportunities for further improvement in minimizing these misclassifications.

Performing predictions
Once a client has trained its model under the FL system, it can perform local predictions on unseen images. Fig 4.7 shows screenshots of a UI built by us using HTML and CSS and served by a Flask back-end. This UI application takes an X-ray image as input, sends it to the back-end for prediction and, once the prediction is done, returns the results to the UI.

Fig 4.7 UI Screenshots

Algorithm 4.1 Prediction code under front-end
from flask import Flask, render_template, request
from PIL import Image
import numpy as np
from tensorflow import keras

app = Flask(__name__)
# Load the locally trained InceptionV3-based model
model = keras.models.load_model('reuben_model_innception.h5')

def preprocess_image(image):
    image = image.resize((299, 299))       # InceptionV3 expected input size
    image = image.convert('RGB')           # convert grayscale to RGB
    image = np.array(image)
    image = image / 255.0                  # scale pixel values to [0, 1]
    image = np.expand_dims(image, axis=0)  # add batch dimension
    return image

def classify_image(image):
    preprocessed_image = preprocess_image(image)
    predictions = model.predict(preprocessed_image)
    print(predictions)
    return predictions

@app.route('/')
def home():
    return render_template('index.html')

@app.route('/classify', methods=['POST'])
def classify():
    if 'image' not in request.files:
        return "No image uploaded"
    image = request.files['image']
    image = Image.open(image)
    predictions = classify_image(image)
    predicted_class = np.argmax(predictions[0])

    loss = predictions[0][1]
    message = "maybe your image is invalid"
    print(loss)
    if loss > 1:  # softmax outputs never exceed 1, so this keeps the default message
        message = "maybe your image is invalid"
    if predicted_class == 0:
        return render_template('covid.html',
                               accuracy=round((predictions[0][0]) * 100, 2),
                               message=message)
    else:
        return render_template('negative.html',
                               accuracy=round((predictions[0][0]) * 100, 2),
                               message=message)

if __name__ == '__main__':
    app.run()

Chapter 5 : CONCLUSION & FUTURE SCOPE
5.1 Conclusion
In conclusion, our experiments involving three different models (VGG19, InceptionV3, and MobileNetV1)
and three clients demonstrate the effectiveness of federated learning (FL) in comparison to traditional training
approaches. FL, which allows clients to train models locally and contribute their learning to a global central
model without sharing private data, outperformed the traditional training results where each client trained
solely with its own data. The FL results were also comparable to the combined model in the centralized setup,
highlighting the potential of FL in capturing the insights from diverse datasets without compromising privacy.

Specifically, when employing the FL strategy, the test losses of the clients trained with the VGG19,
InceptionV3, and MobileNetV1 models significantly decreased. FL reduced the test loss of Client 1 from
2.963 to 0.033, Client 2 from 0.154 to 0.055, and Client 3 from 0.295 to 0.092 when trained using the VGG19
model. Similarly, the losses of the other models also decreased when using FL instead of training each client
with its own data.

Furthermore, the confusion matrices of the clients trained with VGG19, InceptionV3, and MobileNetV1 in the
FL setup revealed high accuracy for both classes, with some variations in the number of misclassifications.
This highlights the potential for further improvement in refining the training process for individual clients
within the FL framework.

Our results highlight the advantages of FL, including its ability to leverage collective knowledge from diverse
datasets while preserving data privacy. By sharing model updates instead of raw data, FL enables each client
to benefit from the insights derived from other clients' data, leading to improved model performance.

In summary, our findings demonstrate the efficacy of FL as a viable approach for training models in a
decentralized manner while maintaining privacy. The observed reductions in test loss and the comparable
performance to centralized training approaches emphasize the potential of FL in various real-world scenarios
where data privacy is a critical concern. Future research could explore further optimization techniques and
expand the experiments to larger-scale federated learning setups to validate the scalability and robustness of
the FL approach.

5.2 Future Scope

The conclusion of the work highlights the experimental results of COVID-19 identification using CXR images
based on the federated learning framework and the comparison with traditional training approaches.

Based on these findings, several future research directions and potential improvements can be explored:

Algorithmic Enhancements: Investigate and develop advanced federated learning algorithms or optimization
techniques that can further improve the performance of different models on the federated learning framework.
This could involve exploring techniques like differential privacy, model aggregation, or better communication
strategies between clients.

Model Architectures: Explore and experiment with other state-of-the-art model architectures that have shown
promise in image classification tasks. For example, newer versions of the VGG, Inception, and MobileNet families have been developed since the time of this work, and considering these newer versions may lead to improved results.

Ensemble Methods: Investigate the potential benefits of using ensemble methods to combine the predictions
of multiple models. Ensemble techniques, such as majority voting or stacking, can often lead to improved
performance and robustness in classification tasks.

Multimodal Learning: Consider incorporating additional modalities, such as clinical data or patient metadata,
along with CXR images for COVID-19 identification. Multimodal learning can provide complementary
information and potentially improve the accuracy of the identification process.

Real-World Deployment: Evaluate the practical implications of deploying the models in real-world clinical
settings. Assess the model's performance on diverse and large-scale datasets from different healthcare
institutions to ensure its reliability and generalization.

Generalization to Other Diseases: Investigate the models' potential to identify other respiratory or lung-related
diseases from CXR images. Expanding the scope beyond COVID-19 could make the models more versatile
and useful in clinical practice.

Explainability and Interpretability: Enhance the interpretability of the models to provide clearer insights into their decision-making process. This is crucial for building trust and understanding in the medical community.

By addressing these future research areas, we can further enhance the accuracy, efficiency, and applicability of the models for COVID-19 identification, ultimately contributing to improved healthcare outcomes.

REFERENCES

[1] “Federated Learning,” Federated Learning, 2018. https://federated.withgoogle.com/ (accessed Feb. 04, 2023).

[2] Jakub Konečný, H. Brendan McMahan, F. X. Yu, P. Richtarik, Ananda Theertha Suresh, and D. Bacon, “Federated Learning:
Strategies for Improving Communication Efficiency,” Google Research, 2016. https://research.google/pubs/pub45648/ (accessed Feb.
04, 2023).

[3] N. Rieke et al., “The future of digital health with federated learning,” npj Digital Medicine, vol. 3, no. 1, Sep. 2020, doi:
10.1038/s41746-020-00323-1.

[4] T. Li, A. K. Sahu, A. Talwalkar, and V. Smith, “Federated Learning: Challenges, Methods, and Future Directions,” IEEE Signal
Processing Magazine, vol. 37, no. 3, pp. 50–60, May 2020, doi: 10.1109/msp.2020.2975749.

[5] J. Xu, B. S. Glicksberg, C. Su, P. Walker, J. Bian, and F. Wang, “Federated Learning for Healthcare Informatics,” Journal of
Healthcare Informatics Research, vol. 5, no. 1, pp. 1–19, Nov. 2020, doi: 10.1007/s41666-020-00082-4.

[6] I. Feki, S. Ammar, Y. Kessentini, and K. Muhammad, “Federated learning for COVID-19 screening from Chest X-ray images,”
Applied Soft Computing, vol. 106, p. 107330, Jul. 2021, doi: 10.1016/j.asoc.2021.107330.

[7] “Federated Learning for Smart Healthcare: A Survey | ACM Computing Surveys,” ACM Computing Surveys (CSUR), 2023.
https://dl.acm.org/doi/abs/10.1145/3501296 (accessed Feb. 04, 2023).

[8] K. M. J. Rahman et al., “Challenges, Applications and Design Aspects of Federated Learning: A Survey,” IEEE Access, vol. 9, pp.
124682–124700, 2021, doi: 10.1109/access.2021.3111118.

[9] L. Li, Y. Fan, and K.-Y. Lin, “A Survey on federated learning,” 2020 IEEE 16th International Conference on Control &
Automation (ICCA), Oct. 2020, doi: 10.1109/icca51439.2020.9264412.

[10] D. H. Mahlool and M. H. Abed, “A Comprehensive Survey on Federated Learning: Concept and Applications,” Mobile
Computing and Sustainable Informatics, pp. 539–553, 2022, doi: 10.1007/978-981-19-2069-1_37.

[11] Kandati, Dasaradharami Reddy, and Thippa Reddy Gadekallu. “Genetic Clustered Federated Learning for COVID-19 Detection.”
Electronics, vol. 11, no. 17, 29 Aug. 2022, p. 2714, 10.3390/electronics11172714. Accessed 16 Sept. 2022.

[12] Feki, Ines, et al. “Federated Learning for COVID-19 Screening from Chest X-Ray Images.” Applied Soft Computing, vol. 106,
July 2021, p. 107330, 10.1016/j.asoc.2021.107330. Accessed 17 May 2022.

[13] Bhattacharya, Amartya, et al. “Application of Federated Learning in Building a Robust COVID-19 Chest X-Ray Classification
Model.” ArXiv:2204.10505 [Cs], 22 Apr. 2022, arxiv.org/abs/2204.10505. Accessed 1 Feb. 2023.

[14] Nguyen, Dinh C., et al. “Federated Learning for COVID-19 Detection with Generative Adversarial Networks in Edge Cloud
Computing.” IEEE Internet of Things Journal, 2021, pp. 1–1, 10.1109/jiot.2021.3120998. Accessed 30 Mar. 2022.

[15] Chanda, Pramit Brata, et al. “CNN Based Transfer Learning Framework for Classification of COVID-19 Disease from Chest X-
Ray.” IEEE Xplore, 1 May 2021, ieeexplore.ieee.org/abstract/document/9432181. Accessed 1 Feb. 2023.

[16] Maghdid, Halgurd S., et al. “Diagnosing COVID-19 Pneumonia from X-Ray and CT Images Using Deep Learning and Transfer
Learning Algorithms.” ArXiv:2004.00038 [Cs, Eess], 31 Mar. 2020, arxiv.org/abs/2004.00038.

[17] M. J. Horry et al., "COVID-19 Detection Through Transfer Learning Using Multimodal Imaging Data," in IEEE
Access, vol. 8, pp. 149808-149824, 2020, doi: 10.1109/ACCESS.2020.3016780.

[19] Qayyum, Adnan, et al. “Collaborative Federated Learning for Healthcare: Multi-Modal COVID-19 Diagnosis at the Edge.” IEEE
Open Journal of the Computer Society, vol. 3, 2022, pp. 172–184, 10.1109/ojcs.2022.3206407. Accessed 2 Nov. 2022.

[20] Vaid, Akhil, et al. “Federated Learning of Electronic Health Records to Improve Mortality Prediction in Hospitalized Patients
with COVID-19: Machine Learning Approach.” JMIR Medical Informatics, vol. 9, no. 1, 27 Jan. 2021, p. e24207,
medinform.jmir.org/2021/1/e24207/, 10.2196/24207. Accessed 20 May 2022.

[21] Zhang, Weishan, et al. “Dynamic-Fusion-Based Federated Learning for COVID-19 Detection.” IEEE Internet of Things Journal,
vol. 8, no. 21, 1 Nov. 2021, pp. 15884–15891, www.ncbi.nlm.nih.gov/pmc/articles/PMC9128757/, 10.1109/jiot.2021.3056185.
Accessed 1 Feb. 2023.

[22] Sozinov, Konstantin, et al. “Human Activity Recognition Using Federated Learning.” 2018 IEEE Intl Conf on Parallel &
Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social
Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom), Dec. 2018,
people.kth.se/~sarunasg/Papers/Sozinov2018FederatedLearning.pdf, 10.1109/bdcloud.2018.00164.

[23] Brisimi, Theodora S., et al. “Federated Learning of Predictive Models from Federated Electronic Health Records.” International
Journal of Medical Informatics, vol. 112, Apr. 2018, pp. 59–67, www.sciencedirect.com/science/article/abs/pii/S138650561830008X,
10.1016/j.ijmedinf.2018.01.007.

[24] Dayan, Ittai, et al. “Federated Learning for Predicting Clinical Outcomes in Patients with COVID-19.” Nature Medicine, 15 Sept.
2021, pp. 1–9, www.nature.com/articles/s41591-021-01506-3, 10.1038/s41591-021-01506-3.

[25] Qayyum, Adnan, et al. “Collaborative Federated Learning for Healthcare: Multi-Modal COVID-19 Diagnosis at the Edge.” IEEE
Open Journal of the Computer Society, vol. 3, 2022, pp. 172–184, 10.1109/ojcs.2022.3206407. Accessed 2 Nov. 2022.

[26] Y. Liu, L. Zhang, N. Ge, and G. Li, “A Systematic Literature Review on Federated Learning: From A Model Quality
Perspective,” arXiv.org, 2020. https://arxiv.org/abs/2012.01973 (accessed Apr. 18, 2023).

[27] N. Vaibhav et al., “The future of digital health with federated learning,” npj Digital Medicine, vol. 3, no. 1, Sep. 2020, doi:
10.1038/s41746-020-00323-1.

[28] M. J. Sheller et al., “Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data,”
Scientific Reports, vol. 10, no. 1, Jul. 2020, doi: https://doi.org/10.1038/s41598-020-69250-1.
