Image Classification Using Federated Averaging Algorithm
Abstract—Machine learning privacy and data ownership concerns have prompted the exploration of innovative methods to address these issues. This research delves into the realm of image classification using federated averaging algorithms as a promising solution. By capitalizing on federated learning frameworks, multiple collaborating parties can collectively build a machine learning model without the need to share their raw data. This paper presents a comprehensive overview of the Federated Averaging algorithm, focusing on its application to image classification tasks. This approach can potentially be used in various applications where data privacy is a concern. The dataset had sixty thousand training images and ten thousand test images, divided into ten categories: T-shirt, pant, dress, sweater, coat, sandal, shirt, sneakers, bag, and boots; each example is a 28x28 gray-scale image. The study showcases the effectiveness of Federated Averaging in image classification, revealing outcomes that rival those attained through traditional centralized training methods. With a notable achievement of 87.27% accuracy in image classification, this research highlights the practical viability of Federated Learning while upholding stringent data privacy standards.

Index Terms—Federated Averaging, Image Classification, Federated Learning, Deep Learning.
I. INTRODUCTION

The advancement in sensor, computing, and communication technologies has led to a significant improvement in the collection and analysis of data. One innovative technology that has emerged to address the need for efficient and effective data processing is Federated Learning. Mobile devices can cooperatively learn a shared predictor through federated learning while retaining all training examples locally. In contrast to traditional machine learning methods, which require a central server to manage a trained model and predict outcomes, Federated Learning eliminates cloud storage services [1]. Since all data processing and storage is done on the devices themselves, the security and confidentiality of information increase while communication and computing costs decrease.

Federated Learning is an innovative approach to machine learning that enables the training of models across decentralized devices while keeping data localized and secure [2]. This emerging technology addresses the challenge of data privacy, where sensitive information often resides on individual devices such as smartphones, Internet of Things (IoT) devices, and other edge devices, by bringing the model training process to the data rather than centralizing the data on a server.

The traditional machine learning paradigm sends data from various sources to a central server for model training. This approach raises privacy concerns, as the data is vulnerable during transmission and storage, and users might be reluctant to share their sensitive information. Federated Learning, on the other hand, allows devices to collaboratively learn a shared model while keeping the raw data on the device itself. This federated approach offers several compelling advantages. One of the most significant benefits is data privacy: as sensitive data never leaves the user's device, the risk of data breaches or unauthorized access is minimized. This property is especially crucial in healthcare, finance, and other industries dealing with personal information. Moreover, Federated Learning leads to more inclusive models. It allows devices from different regions and demographics to participate in model training, ensuring that the resulting models are more representative and unbiased.

However, Federated Learning comes with its own set of challenges. Coordinating the learning process across various devices can be complex, as devices may have different hardware capabilities, connectivity issues, or varying data distributions. Ensuring the security and integrity of model updates during aggregation is critical to prevent malicious attacks or data manipulation. The federated approach opens the door to various applications, from personal assistants and predictive models to healthcare and industrial IoT, where data privacy and efficiency are paramount. As this technology evolves, it promises to make machine learning more inclusive, efficient, and secure for the benefit of society as a whole.

This study uses 70,000 images from the Fashion-MNIST dataset [3] to investigate the use of Federated Learning for image classification.
II. LITERATURE REVIEW

A widespread machine learning approach called federated learning (FL) enables several clients to jointly develop a model without consolidating their data. Federated Averaging (FedAvg) is one of the most frequently used FL algorithms. The application of FedAvg to image classification is examined in this literature review. Federated Averaging (FedAvg), proposed in [4], enables every device to train locally on its own data before sending model updates to a central server. The central server combines the updates, and the updated global model is then sent back to each device for the next round of local training. The paper presents experimental findings that show how FedAvg can save communication costs and protect data privacy while attaining excellent model training accuracy. FedAvg has gained popularity as a decentralized machine learning method and has been used in many industries, including banking, healthcare, and the Internet of Things.
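For concreteness, FedAvg's server-side aggregation step can be sketched in a few lines of NumPy. This is an illustrative sketch rather than code from [4]; `client_weights` and `client_sizes` are hypothetical inputs representing each client's trained layer weights and local dataset size.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """FedAvg aggregation: data-weighted mean of per-client weights.

    client_weights: list (one entry per client) of lists of np.ndarray,
                    each inner list holding that client's layer weights.
    client_sizes:   list of local training-set sizes, one per client.
    """
    total = float(sum(client_sizes))
    num_layers = len(client_weights[0])
    return [
        sum((n / total) * w[i] for w, n in zip(client_weights, client_sizes))
        for i in range(num_layers)
    ]
```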
The study [5] proposed a technique that makes the federated training of machine learning models across distributed devices more personalized: MAML-FA, which stands for Model Agnostic Meta-Learning for Fast Adaptation. The suggested technique employs meta-learning to discover how to swiftly adapt a global model to a local device's data distribution. MAML-FA uses a two-step method: the global model is first tailored to each device's data using a small number of local updates, and performance is subsequently optimized using additional local updates. The research presents experimental findings that show how MAML-FA can enhance personalization while lowering the number of communication rounds required for training. MAML-FA has demonstrated encouraging outcomes in many applications, including language modeling and image classification. Overall, MAML-FA is a significant step towards solving federated learning's personalization challenge and may find use in industries like healthcare and finance, where customized models are essential.
The Federated Learning with Matched Averaging (FedMA) method proposed in [6] intends to overcome some of the issues with federated learning, including the variety of devices and the communication bottleneck. To lessen noise and accelerate the convergence of the global model, FedMA uses a novel aggregation approach called matched averaging, which permits averaging model parameters only from related devices. Regarding accuracy, convergence speed, and communication effectiveness, the paper's experimental results demonstrate that FedMA performs better than current federated learning methods. The article also shows how resilient FedMA is to hardware faults and data heterogeneity. Healthcare, finance, and the Internet of Things are just a few industries where FedMA could find use; these industries place a high priority on data security and privacy. Overall, the paper offers a fresh perspective on federated learning and opens up new research directions.
Concerning the heterogeneity of endpoints and the pseudo-convexity of the loss function in federated learning, the study [7] suggests a new federated learning algorithm named FedProx. Using a proximal term in the objective function, FedProx promotes faster convergence and lessens the effects of device heterogeneity by encouraging local models to stay close to the global model. Regarding convergence speed and model accuracy, the research's experimental findings demonstrate that FedProx performs better than other federated learning algorithms currently in use. This research also shows that FedProx is resilient to communication lags and noisy gradients. Healthcare, banking, and the Internet of Things are just a few industries where FedProx could find use; these industries place a high priority on data security and privacy. Overall, the paper offers a fresh perspective on federated learning and provides guidance on how to create successful and efficient algorithms.
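The proximal term can be made concrete with a short sketch of the FedProx local objective. This is our interpretation of the idea in [7], not the authors' code; `mu` (the proximal coefficient) and the cross-entropy task loss are assumptions.

```python
import tensorflow as tf

def fedprox_local_loss(model, global_weights, x, y, mu=0.01):
    """Task loss plus the proximal penalty (mu/2) * ||w - w_global||^2,
    which discourages the local model from drifting far from the
    global model during local training."""
    task_loss = tf.reduce_mean(
        tf.keras.losses.sparse_categorical_crossentropy(y, model(x)))
    proximal = tf.add_n([
        tf.reduce_sum(tf.square(w - w_g))
        for w, w_g in zip(model.trainable_variables, global_weights)
    ])
    return task_loss + (mu / 2.0) * proximal
```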
The study [8] suggests FedQuant, an emerging federated learning technique that applies quantization to the model updates exchanged between devices in order to cut communication costs. To compress the model updates and reduce transmission bandwidth, FedQuant combines several quantization approaches, such as uniform and stochastic quantization. In terms of communication effectiveness, convergence time, and model accuracy, FedQuant beats other federated learning methods, according to experimental data presented in the research. The article also exhibits FedQuant's adaptability to many communication contexts, such as low-bandwidth and high-latency networks. Healthcare, banking, and the Internet of Things are just a few industries where FedQuant could find use; these industries place a high priority on data security and privacy. Overall, the paper offers a fresh perspective on federated learning and provides guidance on how to create successful and efficient algorithms. FedQuant is a significant step towards addressing the communication issues associated with federated learning and may make it possible to scale the technique to larger and more complicated scenarios.
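As an illustration of the stochastic quantization idea mentioned above, the sketch below rounds each value of an update onto a small grid of levels, randomly choosing the upper or lower level so that the result is unbiased in expectation. It is a generic sketch, not FedQuant's actual scheme; `num_levels` is an assumed parameter.

```python
import numpy as np

def stochastic_quantize(update, num_levels=16):
    """Stochastically round `update` onto `num_levels` evenly spaced
    levels between its min and max; each value then needs only
    log2(num_levels) bits instead of 32."""
    lo, hi = float(update.min()), float(update.max())
    scale = (hi - lo) / (num_levels - 1) if hi > lo else 1.0
    normalized = (update - lo) / scale           # in [0, num_levels - 1]
    floor = np.floor(normalized)
    prob_up = normalized - floor                 # probability of rounding up
    quantized = floor + (np.random.rand(*update.shape) < prob_up)
    return quantized * scale + lo
```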
The semi-synchronous gradient descent (SSGD) algorithm suggested in the study [9] is intended to address the communication issues associated with federated learning. As a client-centric algorithm, SSGD generates local gradients asynchronously and transmits them to a central server, allowing each device to participate in the training process. The server's semi-synchronous method of aggregating the gradients enables it to work with devices with various data distributions or computing capacities. Regarding communication effectiveness, convergence speed, and model accuracy, the paper's experimental findings demonstrate that SSGD performs better than current federated learning techniques. The research also shows SSGD's adaptability to several communication contexts, including low-bandwidth and high-latency networks. Healthcare, banking, and the Internet of Things are just a few industries where SSGD could find use; these industries place a high priority on data security and privacy. Overall, the paper offers a fresh perspective on federated learning and provides guidance about how to create successful and efficient algorithms.
By proposing compression techniques for distributed optimization, the research [10] suggests a novel strategy for overcoming the communication difficulties associated with federated learning. In order to minimize the quantity of data transmitted during training, the study provides three compression algorithms: random compression, top-k compression, and approximation quantization. These compression algorithms retain the precision of the trained model while enabling effective communication across devices. In terms of communication effectiveness, convergence speed, and model accuracy, the suggested compression algorithms surpass current federated learning techniques, according to experimental data presented in the research. The research also demonstrates how the proposed algorithms can be scaled to large-scale federated learning environments. Potential uses for the presented compression algorithms include the financial, healthcare, and Internet of Things industries, which prioritize data security and privacy. The paper offers a fresh perspective on federated learning and guides how to create communication-efficient federated learning algorithms.
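Of the three schemes, top-k compression is the easiest to sketch: each client keeps only the k largest-magnitude entries of its gradient and sends just those (index, value) pairs. The function below is an illustrative sketch under that reading of [10], not the paper's implementation.

```python
import numpy as np

def top_k_sparsify(grad, k):
    """Zero out all but the k largest-magnitude entries of `grad`, so
    only k (index, value) pairs need to be transmitted per update."""
    flat = grad.ravel()
    keep = np.argpartition(np.abs(flat), -k)[-k:]  # indices of top-k entries
    sparse = np.zeros_like(flat)
    sparse[keep] = flat[keep]
    return sparse.reshape(grad.shape)
```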
The authors of research [11] suggested a unique federated learning algorithm called FedBoost that uses gradient boosting techniques during training to cut down on communication expenses. FedBoost combines numerous weak models trained on various devices using a gradient-boosting architecture to create a robust global model. The approach dramatically lowers communication costs by allowing each device to compute and broadcast only a small collection of gradients to the central server; the central server then updates the global model using gradient boosting to aggregate these gradients. FedBoost surpasses other federated learning algorithms, according to experimental data presented in the research, in terms of communication effectiveness, convergence speed, and model accuracy. The article also exhibits FedBoost's adaptability to diverse communication environments, including low-bandwidth and high-latency networks. Healthcare, banking, and the Internet of Things are just a few industries where FedBoost could find use; these industries place a high priority on data security and privacy. Overall, the paper offers a fresh perspective on federated learning and provides guidance about how to create successful and efficient algorithms. FedBoost is an essential first step towards overcoming the communication issues associated with federated learning and may make it possible to scale the approach to larger and more complicated scenarios.
The study [12] suggested FedNAS, a federated learning algorithm that uses neural architecture search (NAS) to optimize the model architecture across many devices while maintaining data privacy. To find the best architecture that maximizes the effectiveness of the models on the federated dataset while minimizing communication costs, FedNAS employs a hierarchical optimization technique: the program first performs a global search for suitable architectures and then performs a local search on each device to optimize the chosen architectures. The models that emerge are then combined using a weighted-average method. Regarding model correctness and communication effectiveness, the paper's experimental findings demonstrate that FedNAS performs better than other federated learning algorithms already in use. The research also shows how FedNAS may be scaled to massive federated learning applications. Healthcare, finance, and the Internet of Things are just a few industries where FedNAS could find use; these industries place a high priority on data security and privacy. The approach is a big step towards overcoming the difficulties of optimizing model designs in federated learning contexts, and there is still room for efficiency and accuracy improvements. Overall, the paper offers a fresh perspective on federated learning, as well as ideas on how to create federated learning algorithms that are both efficient and successful.

The problem of training a federated learning model on non-iid data, wherein the data distribution varies among devices, is covered in the study [13]. By offering a two-stage training method that uses local and global updates, FedNova provides an innovative solution to this problem. In the first stage, the technique uses local updates to improve the model on every device while maintaining data privacy. In the next stage, the algorithm aggregates the local updates across all devices, using global updates to refine the model further. Regarding model accuracy and convergence speed, the paper's experimental results demonstrate that FedNova performs better than other federated learning methods, especially when working with non-iid data. The article also exhibits FedNova's adaptability to varied environments, including heterogeneous devices and imbalanced data distributions. Healthcare, banking, and the Internet of Things are just a few industries where FedNova could find use; these industries place a high priority on data security and privacy. The approach is a big step towards overcoming the difficulties of training a federated learning model on non-iid data, and there is still room for efficiency and accuracy improvements.

The study [14] analyzes how to use federated learning for an image classification task while protecting data privacy. It suggests a privacy-preserving data partitioning method that breaks the data into separate portions; clients can train their models on their own portions without sharing the data with the central server. The models in the paper are trained using a modified version of the Federated Averaging algorithm, and the methodology is assessed on the benchmark datasets MNIST and CIFAR-10. The results demonstrate that the suggested method protects data privacy while performing better than traditional centralized training. The study also assesses the effect of the number of clients on model performance and shows that an increase in client count results in decreased performance because of communication overhead.

III. MATERIALS AND METHODS

Training a machine learning model in Federated Learning involves several steps to ensure efficient and secure collaboration between the central server and the individual client devices. The typical training steps for Federated Learning are as follows:
1) Data Preparation: The data is gathered and pre-processed in this step. The information might come from a variety of sources, but it must be reliable and consistent. The data is then divided into several subsets, one for every device.
2) Model Initialization: The deep neural network model is initialized on a central server. The task at hand can determine the model's design; however, it is crucial …
… initially distributed among them. For improved model training, the images were then normalized and converted to grayscale. The next step was to create the client models. Each client received 10,000 test images and a manually divided training dataset of 12,000 images. The data already existed as TensorFlow Federated (TFF) data. Dataset and dataset transformations performed the pre-processing, such as flattening the 28x28 images into 784-element arrays, shuffling the individual samples, grouping them into batches, and renaming the features from pixels and label to x and y. The dataset was also repeated multiple times in order to run several epochs, as sketched below.
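A sketch of this pre-processing pipeline, closely following the public TFF image-classification tutorial that the description matches; the constants and the names `preprocess` and `element_fn` are illustrative choices, not taken from the paper.

```python
import collections
import tensorflow as tf

NUM_EPOCHS = 5        # assumed values; the paper does not state them
BATCH_SIZE = 20
SHUFFLE_BUFFER = 100

def preprocess(dataset):
    """Repeat, shuffle, batch, flatten each 28x28 image into a
    784-element vector, and rename the features from pixels/label
    to x/y, as described above."""
    def element_fn(element):
        return collections.OrderedDict(
            x=tf.reshape(element['pixels'], [-1, 784]),
            y=tf.reshape(element['label'], [-1, 1]))
    return dataset.repeat(NUM_EPOCHS).shuffle(SHUFFLE_BUFFER).batch(
        BATCH_SIZE).map(element_fn)
```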
2) Modeling and Algorithm Selection: When employing the federated averaging process, the optimizer is only one component of the optimization technique, because it only creates the local model updates on each client. The algorithm's second step is averaging these updates across clients and applying them to the server's global model. In TFF, any model to be used must be wrapped into a TFF learning instance. This model interface is similar to Keras but also includes new features, such as controlling the production of federated metrics. A compiled Keras model can be wrapped using the tff.learning.from_compiled_keras_model function, which takes the model and a sample data batch as parameters. The sparse categorical cross-entropy loss function and the Adam optimizer were used to train the model in this study. The mean of the weights was calculated using the federated averaging approach, and the model's accuracy and loss were also calculated.
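A sketch of this wrapping step, using the legacy tff.learning.from_compiled_keras_model helper the text refers to (since removed in newer TFF releases); the single-layer softmax architecture is an assumption, as the paper does not specify its model design.

```python
import tensorflow as tf
import tensorflow_federated as tff

def create_compiled_keras_model():
    # Assumed architecture: one dense softmax layer over the 784 inputs.
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(10, kernel_initializer='zeros',
                              input_shape=(784,)),
        tf.keras.layers.Softmax(),
    ])
    model.compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        optimizer=tf.keras.optimizers.Adam(),
        metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
    return model

# `example_dataset` is one client's pre-processed dataset from the
# earlier sketch; TFF uses the sample batch to infer the input spec.
sample_batch = tf.nest.map_structure(lambda x: x.numpy(),
                                     next(iter(example_dataset)))

def model_fn():
    return tff.learning.from_compiled_keras_model(
        create_compiled_keras_model(), sample_batch)
```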
3) Training Model: In supervised learning, a machine learning algorithm creates a model by adjusting weights and biases using labeled data to minimize the prediction error (loss). The aim is to find weight and bias combinations that are effective across all cases. In this project, the clients are trained concurrently on their own data, with 50 federated learning rounds used to train their models. Model weights are collected, statistically evaluated, and then refined in each communication round. Federated averaging involves generating local updates for each client, averaging them, and applying the average to the global model. This iterative process continues until convergence. Federated learning thereby enables decentralized training while preserving data privacy on the clients' devices.
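In each round, the server forms the new global weights as the data-weighted mean of the client weights (the aggregation rule of [4], where n_k is the number of examples on client k and n their total):

w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n} \, w_{t+1}^{k}

A sketch of the 50-round loop follows, assuming the TFF FedAvg process builder of the same legacy API era and a `federated_train_data` list holding the clients' pre-processed datasets (an assumption; the paper does not show its loop).

```python
import tensorflow_federated as tff

# Build the FedAvg process from the model_fn defined above (legacy API).
iterative_process = tff.learning.build_federated_averaging_process(model_fn)
state = iterative_process.initialize()

NUM_ROUNDS = 50  # the paper trains for 50 communication rounds
for round_num in range(1, NUM_ROUNDS + 1):
    # One round: broadcast global weights, train locally on each client,
    # then average the client updates into the global model.
    state, metrics = iterative_process.next(state, federated_train_data)
    print(f'round {round_num:2d}: {metrics}')
```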
4) Prediction: A prediction is the output of a machine learning model trained on historical data and then applied to fresh data to determine the likelihood of a particular outcome, such as whether a client will cancel within 30 days. The model tries to find the most probable value of an unknown variable for each record in the new data. After training, the model can predict the outcomes for particular images; for the i-th image, the prediction array gives the model's confidence in each class. Correct prediction labels are shown in blue, wrong ones in red, and the number represents the predicted label's percentage.
5) Performance Evaluation: In evaluating a machine learning project, it is crucial to consider various measures besides the accuracy score. Although a model may yield good results based on accuracy, it might not perform as well on other measures, such as logarithmic loss. The study uses classification accuracy to evaluate the model's performance but goes beyond just considering the model's overall accuracy: after the client models are trained, it also keeps track of each client's test accuracy, as sketched below. In order to assess and contrast the differences between the two models, the project also trains the model without employing federated learning and tracks the outcomes.
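A minimal sketch of that per-client bookkeeping; `client_models` and `client_test_sets` are hypothetical containers for the locally trained Keras models and each client's held-out data.

```python
def evaluate_clients(client_models, client_test_sets):
    """Return each client's test accuracy after local training."""
    accuracies = {}
    for client_id, model in client_models.items():
        x_test, y_test = client_test_sets[client_id]
        loss, accuracy = model.evaluate(x_test, y_test, verbose=0)
        accuracies[client_id] = accuracy
    return accuracies
```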
IV. RESULTS AND DISCUSSION

Traditional machine learning algorithms typically involve data centralization, meaning that sensitive data from multiple sources are collected in one location. This approach poses significant risks, such as data leaks and security breaches, and raises privacy concerns for individuals. These problems can be resolved through the application of a decentralized method
of machine learning known as federated learning. In federated learning, data is kept on individual devices or in small groups, and the only information shared between the devices or groups is the most recent model updates, ensuring data privacy. Table I shows the performance analysis for the different clients.

The three test cases in the project aimed to assess the effectiveness of the federated learning model in achieving high accuracy in image classification while preserving data privacy. The first test case involved comparing the accuracy of the federated learning model with that of a non-federated model. The accuracy of the model after numerous communication rounds was measured in the second test case to assess the federated learning model's capacity to improve over time. The last test case involved analyzing the accuracy on a new test data set in order to assess the model's ability to generalize to additional information. The outcomes of these test cases show how successfully the FL technique achieves high accuracy while protecting data privacy. The federated learning model outperformed the non-federated model in terms of accuracy, and the accuracy of the federated learning model improved over time. Additionally, the federated learning model demonstrated the ability to generalize to new data, indicating its effectiveness in real-world scenarios. Overall, these test cases highlight the potential of federated learning as a secure and efficient approach to machine learning.
TABLE I
RESULTS OF THE STUDY FOR TRAINING AND TESTING PHASES

Client      Training Accuracy (%)    Testing Accuracy (%)
Client-1    81.56                    83
Client-2    83.12                    84.67
Client-3    82.99                    84.54
Client-4    81.53                    84.29
Client-5    79.35                    82.29
The study found that the accuracy of the federated learning model was 87.27%, greater than the traditional classification accuracy of roughly 85.4%. This demonstrates that federated learning delivers improved data protection as well as quick and precise outcomes. In the upcoming years, federated learning is anticipated to gain popularity as worries about data privacy continue to rise.
federated learning is anticipated to gain popularity as worries IPrightprotection,” Machine Intelligence Research, vol. 20, no. 1, pp.
about data privacy continue to rise. 19–37, 2023, doi: https://doi.org/10.1007/s1163302213432.
[15] D. Preuveneers, V. Rimmer, I. Tsingenopoulos, J. Spooren, W. Joosen,
and E. Ilie-Zudor, “Chained Anomaly Detection Models for Federated
V. C ONCLUSIONS Learning: An Intrusion Detection Case Study,” Applied Sciences, vol.
A promising method for resolving machine learning privacy 8, no. 12, p. 2663, Dec. 2018, doi: https://doi.org/10.3390/app8122663.
[16] Joshi, R., Shah, D.: Approach to Avoid Resource Exhaustion
and data ownership problems is image classification using Caused by Editing Tools for Automating Effects Using Noise
federated averaging algorithms. Multiple parties can work Inducing Procedures in Deep Learning. International Journal of
together without sharing their data to build a machine-learning Intelligent Communication, Computing and Networks. (2021).
https://doi.org/10.51735/ijiccn/001/21.
model using the federated learning framework. For image
classification tasks, the federated learning algorithm Federated
Averaging has produced findings that are on par with conven-
tional centralized training techniques. The Federated Averag-
ing algorithm and its use for image classification tasks have
been outlined in this paper. The study’s results demonstrated
that Federated Learning was a practical approach for achieving
high accuracy reaches 87.27% in image classification while