Federated Learning Approach towards Diagnosing Acute Inflammations of Bladder

Abstract—In the era of burgeoning data privacy concerns and the ever-growing need for collaborative healthcare solutions, Federated Learning stands out as a beacon of innovation. This paper delves into the realm of medical diagnostics, particularly focusing on the diagnosis of acute bladder inflammation, a prevalent condition affecting millions worldwide. Our research introduces a novel Federated Logistic Regression model tailored explicitly for this purpose. By harnessing the collective knowledge dispersed across multiple medical institutions, our federated approach enables collaborative model training while ensuring the utmost privacy of sensitive patient data. Through rigorous experimentation and comparative analysis with traditional centralized methods, we unveil a striking revelation: our Federated Learning framework outperforms its centralized counterpart by a factor of nearly 20 in terms of efficiency and model convergence. These results underscore the transformative potential of Federated Learning in revolutionizing medical diagnostics, offering a scalable, privacy-preserving solution that holds immense promise for the future of healthcare.

Index Terms—Nephritis, Federated Learning, Logistic Regression, Distributed Systems, Bladder Inflammation

I. INTRODUCTION

Acute inflammation of the urinary bladder, clinically recognized as cystitis, stands as a prevalent and disruptive ailment affecting individuals across demographics. Clinically, cystitis presents a spectrum of symptoms including urinary urgency, frequency, dysuria, and hematuria, often accompanied by suprapubic discomfort. While typically self-limiting, recurrent or severe episodes may warrant further investigation to discern underlying etiological factors. Nephritis of the renal pelvis, a condition characterized by inflammation within the renal pelvis and surrounding structures, represents a significant pathological entity within the spectrum of urinary tract disorders. This inflammatory process typically originates from bacterial ascent in the lower urinary tract, leading to infection and subsequent inflammation in the renal pelvis [1]. Prompt recognition and appropriate management of both these diseases are imperative to prevent complications such as renal scarring, sepsis, or renal insufficiency.

Traditional methods for diagnosing these conditions often rely on centralized approaches, where vast amounts of data are gathered and processed in a single location. However, these methods suffer from several limitations, including privacy concerns, data security risks, and computational bottlenecks due to the massive amounts of data involved [2]. Moreover, centralized systems may encounter challenges in scalability and real-time responsiveness, particularly in resource-constrained environments. Hence, there is a pressing need for an improved approach that addresses these shortcomings. Federated learning emerges as a promising paradigm, offering a decentralized solution by leveraging distributed data sources while ensuring privacy preservation and computational efficiency. This paper proposes a Federated Learning Approach towards Diagnosing Acute Inflammations of Bladder, aiming to enhance detection efficiency and scalability while maintaining data privacy and security.

The literature survey conducted for this paper showcased a significant improvement in results and higher efficiency in the Federated models when compared to Traditional models [3], as represented in Fig. 1.

Fig. 1. Traditional Model vs Federated Model

This paper aims at fulfilling the following objectives:
• Produce a highly accurate model while strictly adhering to privacy regulations.
• Propose an approach that ensures scalability, allowing the model to be deployed across diverse healthcare settings while increasing efficiency and reducing training time.

Section 2 discusses the existing research work done in this domain, examining the shortcomings of existing solutions. The third section explains the Proposed Model, a Federated Logistic Regression model whose locally trained models are averaged using FedAvg. The fourth section reviews the performance measures used to evaluate the proposed model and the results generated. The final section (Section 5) summarises the paper by offering a conclusion and delves into the future scope of this research.

III. PROPOSED MODEL

Each participating client maintains a local model trained on its respective dataset. These client models autonomously update their parameters based on local data, capturing intricate features within their data distributions. Periodically, the client models communicate their updates to a central server, where federated averaging is employed to aggregate the model parameters across all clients. The aggregated parameters are then used to update the global model on the server, ensuring that the global model reflects the collective knowledge from all participating clients. This iterative process of training, aggregation, and updating enables the global model to continually improve its performance while respecting the data privacy constraints inherent in federated learning settings.
B. Logistic Regression

Logistic regression is a statistical method used for modeling the relationship between one or more independent variables and a binary outcome [12]. Unlike linear regression, which predicts continuous outcomes, logistic regression is specifically designed for binary outcomes, where the dependent variable is categorical and has only two possible values, typically represented as 0 and 1.

The logistic regression model employs the logistic function, also known as the sigmoid function, to transform the linear combination of independent variables into a probability score bounded between 0 and 1. The logistic function is defined as:

    logit(p) = ln(p / (1 − p)) = β_0 + β_1 x_1 + β_2 x_2 + … + β_n x_n    (1)

where:
• p represents the probability of the binary outcome (success or failure).
• x_1, x_2, …, x_n are the independent variables.
• β_0, β_1, …, β_n are the coefficients representing the effects of the independent variables on the log-odds of the outcome.

The logistic function transforms the linear combination of independent variables into log-odds, also known as the logit [13]. The logit is then mapped to the probability scale using the logistic function:

    p = 1 / (1 + e^{−(β_0 + β_1 x_1 + β_2 x_2 + … + β_n x_n)})    (2)

The logistic regression model estimates the coefficients β_0, β_1, …, β_n using maximum likelihood estimation or other optimization techniques to maximize the likelihood of observing the given binary outcomes given the predictor variables. Once the coefficients are estimated, the model can predict the probability of the binary outcome for new observations based on their values of the independent variables.

Fig. 3. Logistic Regression
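To make Eqs. (1) and (2) concrete, the minimal sketch below (our illustration rather than the paper's code; the coefficient values are hypothetical) computes the predicted probability for a new observation from fitted coefficients:

```python
import numpy as np

def predict_probability(beta0, betas, x):
    """Apply Eqs. (1) and (2): map a linear combination of features
    to a probability via the sigmoid function."""
    log_odds = beta0 + np.dot(betas, x)      # Eq. (1): the logit
    return 1.0 / (1.0 + np.exp(-log_odds))   # Eq. (2): the sigmoid

# Hypothetical coefficients for two symptom features
p = predict_probability(-1.2, np.array([0.8, 1.5]), np.array([1.0, 0.0]))
print(f"Predicted probability of inflammation: {p:.3f}")  # ~0.401
```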
C. Federated Learning

Federated Learning (FL) is a machine learning paradigm where a model is trained across multiple decentralized devices or servers holding local data samples, without exchanging them [14]. Instead, only model updates, such as gradients or weights, are shared among devices to collaboratively train a global model while keeping the data local. This approach addresses privacy concerns associated with centralized data storage by allowing training to occur on the device or server where the data resides, minimizing the need for data transmission.

Mathematically, federated learning can be formulated as follows. Let D_1, D_2, …, D_N represent the datasets held by N devices or servers. Each dataset D_i contains M_i samples (x_ij, y_ij), j = 1, …, M_i, where x_ij is the input data and y_ij is the corresponding label.

The goal of federated learning is to learn a global model θ that minimizes a global loss function ζ(θ), defined as the average loss over all local datasets [16]:

    min_θ ζ(θ) = min_θ (1/N) Σ_{i=1}^{N} E_{(x,y)∼D_i}[ ℓ(θ; x, y) ]    (3)

where ℓ(θ; x, y) is the loss function for a single sample and E denotes the expectation over the local dataset D_i.

During training, each device computes a local update Δθ_i by minimizing its local loss function:

    Δθ_i = argmin_{Δθ} (1/M_i) Σ_{j=1}^{M_i} ℓ(θ + Δθ; x_ij, y_ij)    (4)

The local updates Δθ_i are then aggregated at a central server using methods like federated averaging [15]. The server updates the global model θ by averaging the local updates:

    θ ← θ − η (1/N) Σ_{i=1}^{N} Δθ_i    (5)

where η is the learning rate. This process iterates until convergence, resulting in a global model trained on the collective knowledge of all participating devices while preserving data privacy.
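As a concrete illustration of the server-side aggregation in Eq. (5) (a minimal sketch, not the paper's implementation), the code below assumes each client update Δθ_i arrives as a NumPy array:

```python
import numpy as np

def federated_round(theta, local_updates, lr=1.0):
    """One server-side step of Eq. (5): average the client updates
    Δθ_i and apply them to the global parameters θ."""
    avg_update = np.mean(local_updates, axis=0)  # (1/N) Σ Δθ_i
    return theta - lr * avg_update               # θ ← θ − η · avg

# Example: a 3-parameter global model and updates from two clients
theta = np.zeros(3)
updates = [np.array([0.2, -0.1, 0.05]), np.array([0.4, 0.1, -0.05])]
print(federated_round(theta, updates, lr=0.5))  # [-0.15  0.  0.]
```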
D. Stochastic Gradient Descent

Stochastic Gradient Descent (SGD) is a popular optimization algorithm used in machine learning for minimizing the loss function by iteratively updating the parameters of a model. It belongs to the category of gradient-based optimization methods and is particularly well suited to large-scale datasets due to its computational efficiency [16].

Mathematically, the SGD algorithm updates the model parameters θ in the direction of the negative gradient of the loss function with respect to those parameters. For a given training example (x_i, y_i), the update rule for SGD is given by:

    θ^{(t+1)} = θ^{(t)} − η ∇_θ L(x_i, y_i; θ^{(t)})    (6)

where:
• θ^{(t)} represents the parameters of the model at iteration t.
• η is the learning rate, a hyperparameter that controls the size of the parameter updates.
• L(x_i, y_i; θ^{(t)}) is the loss function, which quantifies the discrepancy between the model predictions and the true labels.
• ∇_θ L(x_i, y_i; θ^{(t)}) denotes the gradient of the loss function with respect to the model parameters.

In SGD, the gradient is computed using only a single training example at each iteration, hence the term "stochastic". This characteristic makes SGD highly efficient and scalable to large datasets, as it avoids computing gradients for the entire dataset in each iteration. However, due to the randomness introduced by the selection of individual training examples, SGD may exhibit noisy updates and slower convergence compared to batch gradient descent methods.
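A minimal sketch of the update rule in Eq. (6), assuming a linear model with squared-error loss on each single example (the loss choice and values are our own illustration):

```python
import numpy as np

def sgd_step(theta, x_i, y_i, lr=0.01):
    """One stochastic update (Eq. 6) for a linear model with
    squared-error loss L = (θ·x_i − y_i)² on a single example."""
    error = np.dot(theta, x_i) - y_i   # prediction minus true label
    grad = 2.0 * error * x_i           # ∇_θ L(x_i, y_i; θ)
    return theta - lr * grad           # θ^{(t+1)} = θ^{(t)} − η ∇_θ L

theta = np.zeros(2)
for x_i, y_i in [(np.array([1.0, 2.0]), 1.0), (np.array([2.0, 0.5]), 0.0)]:
    theta = sgd_step(theta, x_i, y_i)  # one (noisy) update per example
print(theta)
```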
E. Pseudo Code

Algorithm 1 outlines the steps for training a basic Logistic Regression model. It begins by initializing the model parameters, defining the loss function, and setting up the optimizer. During each epoch, the model iterates over mini-batches of training data, making predictions, computing loss, and updating parameters using backpropagation. The model's performance is evaluated on validation data, and training and validation metrics are stored for analysis. After training, the model is tested on separate test data, and the final test accuracy is reported.

Algorithm 1 Logistic Regression Model
1) Initialize model parameters
2) Define loss function: criterion = torch.nn.CrossEntropyLoss()
3) Define optimizer
4) For each epoch:
   a) Iterate over mini-batches of training data:
      i) Forward pass: compute model predictions
      ii) Compute loss between predictions and true labels
      iii) Backward pass: compute gradients of loss with respect to model parameters
      iv) Update model parameters using the optimizer's step() method
   b) Evaluate model performance on validation data
   c) Store training and validation loss and accuracy for analysis
5) Plot training and validation loss/accuracy curves
6) Test trained model on held-out test data
7) Report final test accuracy and any additional evaluation metrics

Algorithm 2 reflects the federated learning approach, where the training process involves distributing the global model to participating hospitals, training local models using hospital-specific data, aggregating local model parameters to update the global model, and repeating the process over multiple federated rounds. It includes steps for distributing, training, aggregating, evaluating, plotting, testing, and reporting on the federated learning experiment.

Algorithm 2 Federated LR Model
1) Initialize parameters and data structures
2) Define global model architecture
3) Define loss function and optimizer
4) Iterate over federated rounds:
   a) Distribute global model to participating hospitals
   b) Iterate over hospitals:
      i) Train local model using hospital-specific data
      ii) Compute gradients and update local model parameters
      iii) Aggregate local model parameters to update global model
   c) Evaluate global model performance on validation data
   d) Store federated training metrics for analysis
5) Plot federated training metrics
6) Test global model on held-out test data
7) Report final test performance and any additional findings
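The paper does not include source code, but a minimal PyTorch rendering of Algorithm 1's training loop might look like the following (the feature count, optimizer settings, and the random stand-in data are illustrative assumptions):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Logistic regression as a single linear layer; CrossEntropyLoss applies
# the softmax internally, matching step 2 of Algorithm 1.
model = torch.nn.Linear(in_features=6, out_features=2)   # 6 symptom features
criterion = torch.nn.CrossEntropyLoss()                  # step 2
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # step 3

# Random stand-in data for the bladder-inflammation dataset
X, y = torch.randn(120, 6), torch.randint(0, 2, (120,))
loader = DataLoader(TensorDataset(X, y), batch_size=16, shuffle=True)

for epoch in range(50):                  # step 4
    for xb, yb in loader:                # step 4a: mini-batches
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)  # steps 4a(i)-(ii)
        loss.backward()                  # step 4a(iii): gradients
        optimizer.step()                 # step 4a(iv): parameter update

with torch.no_grad():                    # step 6: evaluate
    accuracy = (model(X).argmax(dim=1) == y).float().mean()
print(f"Training accuracy: {accuracy.item():.2%}")  # step 7
```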
IV. PERFORMANCE EVALUATION

A. Number of Iterations

The number of iterations required to decrease the loss of a model serves as a fundamental performance metric in machine learning training processes. It encapsulates the efficiency and effectiveness of the training procedure, offering valuable insights into various aspects of model optimization [17]. Primarily, it provides a gauge of training efficiency, indicating how rapidly the model converges towards an optimal solution. A lower number of iterations signifies quicker convergence, which not only minimizes computational time but also optimizes resource utilization. Additionally, the number of iterations sheds light on the model's complexity and capacity to capture underlying data patterns. Models with more parameters or higher complexity may necessitate more iterations to adequately learn from the data.

Furthermore, the stability of convergence across multiple training runs can be assessed through the consistency of the required iterations. A highly variable number may signal instability or sensitivity to initialization conditions. Lastly, understanding the optimal number of iterations facilitates decisions regarding early stopping strategies, preventing overfitting and conserving computational resources.
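As a rough illustration (our own helper, not from the paper) of how iterations-to-convergence might be measured, the function below counts steps until the loss improvement falls below a tolerance, which also doubles as a simple early-stopping criterion:

```python
def iterations_to_converge(losses, tol=1e-4):
    """Return the first iteration at which the per-step loss
    improvement drops below `tol`, a simple convergence proxy."""
    for t in range(1, len(losses)):
        if losses[t - 1] - losses[t] < tol:
            return t
    return len(losses)  # no plateau within the recorded run

# Example: a loss curve that flattens after three iterations
print(iterations_to_converge([1.0, 0.5, 0.3, 0.29995, 0.2999]))  # -> 3
```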
B. BCELoss

BCELoss, short for Binary Cross-Entropy Loss, is a common loss function used in binary classification tasks, particularly in neural network models. It measures the discrepancy between the predicted probabilities and the actual binary labels of the target variable.

    BCELoss(p, y) = −(1/N) Σ_{i=1}^{N} [ y_i·log(p_i) + (1 − y_i)·log(1 − p_i) ]    (7)

where:
• p is the predicted probability output by the model.
• y is the true binary label (0 or 1) of the target variable.
• N is the number of samples in the dataset.
• p_i and y_i represent the predicted probability and true label for the i-th sample, respectively.

In binary classification, p is typically the output of a sigmoid activation function, which ensures that predicted probabilities fall between 0 and 1. BCELoss serves as an effective measure for training binary classification models by quantifying the agreement between predicted probabilities and true labels, thereby guiding the optimization process towards accurate classification performance [18].
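The sketch below (with arbitrary example values) shows Eq. (7) computed both by hand and with PyTorch's built-in torch.nn.BCELoss, which implements the same formula:

```python
import torch

p = torch.tensor([0.9, 0.2, 0.7])  # predicted probabilities (post-sigmoid)
y = torch.tensor([1.0, 0.0, 1.0])  # true binary labels

# Eq. (7) computed manually: mean of the per-sample cross-entropy terms
manual = -(y * torch.log(p) + (1 - y) * torch.log(1 - p)).mean()

# The same value from the built-in criterion
criterion = torch.nn.BCELoss()
print(manual.item(), criterion(p, y).item())  # both ≈ 0.2284
```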
V. RESULTS AND DISCUSSION

This paper executes a normal and a federated Logistic Regression model to compare the efficiency of both models in terms of the number of iterations and the training loss calculated. Table 1 contains the aforementioned values recorded at intervals of 100 iterations for the training model for Inflammation of Urinary Bladder, and Table 2 shows the same for Nephritis of Renal Pelvis for a normal regression model.

TABLE I
NORMAL MODEL FOR INFLAMMATION OF URINARY BLADDER

Iterations   W1       W2       W3       W4
0            0.9096   0.8328   1.9309   1.3678
100          0.9181   0.4144   1.5114   1.1034
200          0.8108   0.4497   1.0155   0.8722
300          0.3383   0.3264   0.3469   0.4698
400          0.2006   0.1812   0.1624   0.2024
500          0.1804   0.1632   0.1447   0.1817
600          0.1647   0.1488   0.1309   0.1655
700          0.1518   0.1369   0.1197   0.1523
800          0.1409   0.1270   0.1103   0.1411
900          0.1315   0.1184   0.1024   0.1315

… while the model for nephritis achieves 0.0563 and 100%, respectively. The testing accuracy for both models is 100%. The above results from Table 1 can be better comprehended using the graphs shown in Fig. 4 and Fig. 5.

Fig. 4. Loss for Regular Bladder Inflammation Model

TABLE VI
TRAINING ACCURACY OVER ITERATIONS FOR EACH WORKER