Pneumonia Diagnosis Through Pixels - A Deep Learning Model For Detection and Classification



Gurpur Amit Karanth, Janani S, Ajeetha B
School of Electronics and Communication
Vellore Institute of Technology
Chennai, India

Dr. Brintha Therese A
Professor
Vellore Institute of Technology
Chennai, India

Dr. Rajeswaran Rangasami
Professor
Sri Ramachandra University
Chennai, India

Abstract - Manual identification and classification of pneumonia and COVID-19 infection is a cumbersome process that, if delayed, can cause irreversible damage to the patient. We have compiled CT scan images from various sources, namely the China Consortium of Chest CT Image Investigation (CC-CCII), Negin Radiology located at Sari in Iran, an open-access COVID-19 repository from the Harvard Dataverse, and Sri Ramachandra University, Chennai, India. The images were preprocessed using methods such as normalization, sharpening, median filtering, binarizing, and cropping to ensure uniformity while training the models. We present an ensemble classification approach using deep learning and machine learning methods to classify patients with the said diseases. Our ensemble model uses pre-trained networks such as ResNet-18 and ResNet-50 for classification and MobileNetV2 for feature extraction. The features from MobileNetV2 are used by the gradient-boosting classifier for the classification of patients. Using ResNet-18, ResNet-50, and the MobileNetV2-aided gradient-boosting classifier, we propose an ensemble model with an accuracy of 98 percent on unseen data.

Keywords: Pneumonia, Lung CT, Deep Learning, Convolutional Neural Networks, Medical Imaging, Computer-Aided Diagnosis.

I. INTRODUCTION

Pneumonia, a leading cause of death worldwide, is a medical condition that inflames the air sacs in one or both lungs. Pneumonia can affect people of all ages and range in severity from moderate to life-threatening. It continues to be a major global health concern, causing an estimated 2.5 million fatalities each year, with a disproportionately high impact on young children and the elderly. Improved patient outcomes and efficacious therapy depend on an early and precise diagnosis. Chest X-rays have historically been used to diagnose pneumonia in its early stages. They may, however, be insensitive and
non-specific, which could result in a false diagnosis. This is where lung CT (computed tomography) scans come in handy. Lung CT scans offer finely detailed cross-sectional images of the lungs, making abnormalities related to pneumonia easier to see.

We create a model that can automatically identify pneumonia in new CT scans by training a CNN (Convolutional Neural Network) on a sizable dataset of lung CT scans with confirmed pneumonia and healthy lungs. This approach makes use of ensemble learning, a potent machine learning methodology that enhances performance by combining the advantages of several models.

The classification algorithms that were used to classify pneumonia, COVID infection, and normal patients are:
● Pre-trained CNN models:
    ● ResNet-18
    ● ResNet-50
    ● MobileNetV2 (feature extraction)
● Gradient Boosting Classifier

II. LITERATURE REVIEW

An effective ensemble-based CAD system for pneumonia detection in chest X-ray images is introduced in [1], showcasing high accuracy but recognizing the need for improvement in occasional mispredictions and computational efficiency. The research showed encouraging outcomes. However, more study is still required to test and adapt these techniques for lung CT images in order to improve diagnostic capabilities, considering the various modalities and complexity of lung CT scans [1].

Deep learning models, particularly Convolutional Neural Networks (CNNs), demonstrate superior performance in automating the diagnosis process compared to traditional machine learning approaches. In order to improve efficiency and accuracy, transfer learning techniques are used to leverage pre-trained models and modify them for COVID-19 and pneumonia detection tasks. Using the F-RNN-LSTM system, an accuracy of 95% is achieved with low computational effort [2].

A novel method for textural feature-based automatic pneumonia diagnosis in chest X-ray images is explored. The images' textural characteristics were extracted in order to identify minute patterns that point to the existence of pneumonia. This method provides a non-invasive and effective way to diagnose pneumonia, which is important for prompt medical attention. Compared to conventional techniques, the use of textural cues improves the sensitivity and specificity of pneumonia identification. Using three machine learning algorithms (KNN, SVM, and RF), improvements of between 4% and 8% are shown in both accuracy and F-score [3].

Using datasets of CT scans and X-ray images, the research in [4] proposes an ensemble framework that combines deep learning models for better COVID-19 and pneumonia identification. The study highlights the ensemble framework's high diagnostic precision in identifying COVID-19 cases, pneumonia, and healthy individuals, enabling timely and precise patient management. The ensemble (VGG-16, ResNet, DenseNet) attains 95-96% accuracy in COVID-19 and pneumonia classification in chest imaging, with a reported time of under 0.5 s [4].

A deep learning framework for interpreting chest X-ray pictures that uses several CNN
architectures, such as AlexNet and VGG16, among others, is studied. It emphasizes the importance of transfer learning, where pre-trained models are fine-tuned for the specific task of pneumonia detection. A modified deep convolutional neural network (DCNN) architecture is used, with a training accuracy of 98.02% and a validation accuracy of 96.09%, which is much higher than the existing approaches and techniques [5].

Chest X-ray pictures are used to show how well CNNs distinguish COVID-19 pneumonia from other kinds of pneumonia. By using pre-trained CNN architectures like DenseNet-201 in conjunction with transfer learning, the authors were able to classify COVID-19 pneumonia cases with high accuracy. The DenseNet-201 CNN produced about 94.96% accuracy in COVID-19 screening from CXIs, which promises faster diagnosis, lower radiation, and cost-effective healthcare [6].

AI systems show a great deal of promise for helping to quickly and accurately diagnose COVID-19 pneumonia from chest X-ray pictures. The high sensitivity and specificity of these AI-based systems enable early detection and treatment initiation, which is essential for restricting the disease's spread. The AI models for COVID-19 pneumonia detection using CXR images show high sensitivity and scalability for remote screening, providing practical solutions for pulmonary diseases [7].

III. METHODOLOGY

This section outlines the methodology of our research paper investigating pneumonia diagnosis with lung CT scans using an ensemble model incorporating the pre-trained CNNs ResNet-18 and ResNet-50 and the gradient-boosting classifier. The block diagram in Fig 3.1 explains the flow of the proposed model.

Fig 3.1 Block diagram of the proposed model

This model uses lung CT scans of healthy and infected patients to explore deep learning for pneumonia diagnosis. We obtained lung CT scans from the China Consortium of Chest CT Image Investigation (CC-CCII), Negin Radiology located at Sari in Iran, the Harvard Dataverse, a repository with confirmed COVID-19 cases, and from Sri Ramachandra University, Chennai, India.

Next, in order to ensure consistency, we preprocess the data using techniques such as image enhancement, selection, noise removal using median filtering, and cropping. The features are extracted using the convolutional layers present in the deep learning models. We have also applied data augmentation to introduce variability into the dataset. This variability helps offset the effects of having a small and constant dataset.

The pre-trained models used in this project are ResNet-50, ResNet-18, and MobileNetV2. The ResNet-50 and ResNet-18 networks have been trained to classify the input images into different classes, which will be discussed later in
the paper. For feature extraction for the gradient-boosting classifier, we have used the pre-trained network MobileNetV2.

Lung segmentation-based characteristics are used by the ensemble classifier, Gradient Boosting. After feature extraction, we develop and train the classifier system. Convolutional and pooling layers are used by the CNN to extract spatial characteristics from images, and ensemble classifiers combine several decision trees to increase accuracy. Lastly, we assess each model's performance on a separate test dataset using metrics like accuracy, precision, recall, and F1-score.

IV. PRE-PROCESSING

Because of differences in tissue density, raw CT scans of the lungs appear dark and lack visual detail. Preprocessing techniques are essential in order to improve the image and highlight anatomical features of the lungs for additional analysis.

Fig 4.1 Block diagram of pre-processing

A. Image Enhancement

The images were obtained from different sources and were present in different formats. The raw images from Negin Radiology are in 16-bit unsigned-integer grayscale TIFF format, making them appear pitch black on normal monitors, whereas the images from the Harvard Dataverse and SRMC, Chennai were in DICOM format, so an extra step of converting these images to TIFF was required before normalizing them. The enhancer makes these images visible on regular monitors, as shown in Fig 4.2.1. Each pixel value is normalized by dividing by the maximum pixel value of the image, so the output images hold floating-point values that can be displayed on regular monitors. These images can later be converted to other formats for better compatibility. We stored these images as uint8 TIFF files.

Fig 4.2.1 Normalization of Raw Images

B. Selection

A region of interest (ROI) is chosen in every image; in our case, we chose rows 241 to 340 and columns 121 to 370 of the 512x512 image. The selected area of the image contains the lung. Within this ROI, the number of pixels that have an intensity less than a threshold of 200 is counted. Images that pass this criterion are chosen for further preprocessing, as illustrated in Fig 4.2.2, and the omitted images are discarded, as they show the closed lung and are not useful for classification.
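The normalization and selection steps described in this section can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation: it assumes the image is a list of pixel rows, that the threshold of 200 is applied to the 8-bit normalized image, and that a slice is kept when the dark-pixel count reaches a cutoff. The function names and the `min_dark_pixels` cutoff are hypothetical, since the text does not state the count that was used.

```python
# Sketch of Sections IV-A and IV-B on a plain list-of-rows grayscale image.
# Only the divide-by-maximum rule, the ROI bounds, and the threshold of 200
# come from the text; the acceptance cutoff is an assumed parameter.


def normalize_to_uint8(image):
    """Divide every pixel by the image maximum, then rescale to 0..255."""
    peak = max(max(row) for row in image)
    if peak == 0:
        return [[0 for _ in row] for row in image]
    return [[round(255 * pixel / peak) for pixel in row] for row in image]


def count_dark_pixels(image, rows=(241, 340), cols=(121, 370), threshold=200):
    """Count ROI pixels darker than the threshold (1-based inclusive bounds)."""
    r0, r1 = rows
    c0, c1 = cols
    return sum(
        1
        for row in image[r0 - 1:r1]
        for pixel in row[c0 - 1:c1]
        if pixel < threshold
    )


def keep_slice(image, min_dark_pixels=1000, **roi):
    """Hypothetical acceptance rule: keep slices with enough dark ROI pixels."""
    return count_dark_pixels(image, **roi) >= min_dark_pixels


# Toy 16-bit style example: values far below 65535 would look black on screen.
raw = [
    [0, 100, 200, 400],
    [50, 150, 250, 350],
    [0, 0, 400, 400],
    [10, 20, 30, 40],
]
scaled = normalize_to_uint8(raw)
print(scaled[0])  # [0, 64, 128, 255]
print(keep_slice(scaled, min_dark_pixels=4, rows=(1, 2), cols=(1, 2)))  # True
```

On a real 512x512 slice the default ROI bounds apply unchanged; the toy call passes a smaller ROI only because the example image is 4x4.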
Fig 4.2.2 Selection of Normalized Images

C. Laplacian

Laplacian filters excel at detecting edges by highlighting areas with rapid intensity changes. This filtering method helps to make regions of infection and minute anomalies more visible in the image. The Laplacian filter is defined as the sum of the second partial derivatives of the image intensity function f(x, y) with respect to the spatial coordinates x and y:

∇²f(x, y) = ∂²f(x, y) / ∂x² + ∂²f(x, y) / ∂y²

The filtered image improves structural details and focuses on the edges and contours of the lung, making the images better suited for training the model(s), as illustrated in Fig 4.2.3.

Fig 4.2.3 Sharpening of Selected Images

D. Median Filtering and Binarizing

The main objective of median filtering is to reduce noise in lung CT images without substantially blurring or distorting vital elements. As seen in Fig 4.2.4, the contours are now more pronounced, and the noise has been reduced. Median filtering involves substituting the intensity value of each pixel with the median value of intensity within a local neighborhood determined by a kernel size. This nonlinear procedure works well at reducing noise while preserving edges and minute details.

Fig 4.2.4 Median Filtering of Sharpened Images

The median-filtered images are binarized to capture minute details and contours of the lung, as shown in Fig 4.2.5.

Fig 4.2.5 Binarization of Median Filtered Images
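The sharpening, median filtering, and binarizing steps can be sketched in a few lines. This is an illustrative Python sketch under stated assumptions, not the authors' code: it uses the common 4-neighbour discrete approximation of the Laplacian, sharpens by subtracting the Laplacian from the image, replicates border pixels, and binarizes with a fixed threshold of 128. The kernel choice, border handling, kernel size, and threshold are all assumptions, as the text does not specify them.

```python
from statistics import median


def _pixel(image, r, c):
    """Read a pixel, clamping coordinates so borders replicate the edge."""
    r = min(max(r, 0), len(image) - 1)
    c = min(max(c, 0), len(image[0]) - 1)
    return image[r][c]


def laplacian(image):
    """Discrete Laplacian: sum of the four neighbours minus 4x the centre."""
    return [
        [
            _pixel(image, r + 1, c) + _pixel(image, r - 1, c)
            + _pixel(image, r, c + 1) + _pixel(image, r, c - 1)
            - 4 * image[r][c]
            for c in range(len(image[0]))
        ]
        for r in range(len(image))
    ]


def sharpen(image):
    """Subtract the Laplacian from the image and clip to the 0..255 range."""
    lap = laplacian(image)
    return [
        [min(255, max(0, p - l)) for p, l in zip(row, lap_row)]
        for row, lap_row in zip(image, lap)
    ]


def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighbourhood."""
    half = size // 2
    offsets = range(-half, half + 1)
    return [
        [
            median(_pixel(image, r + dr, c + dc) for dr in offsets for dc in offsets)
            for c in range(len(image[0]))
        ]
        for r in range(len(image))
    ]


def binarize(image, threshold=128):
    """Map pixels at or above the (assumed) threshold to 1, the rest to 0."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]


# A single bright outlier on a flat background is removed by the median filter.
noisy = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
print(median_filter(noisy))  # [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
```

The toy image shows why the median is preferred over averaging here: the isolated outlier disappears entirely instead of being smeared into its neighbours.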
E. Automated Cropping

Noise and unwanted parts of an image often reduce the performance of the model by training it on futile features. Eliminating background pixels and other irrelevant parts of the image helps models concentrate on the lung tissue, which cuts down on training time and processing requirements and, at the same time, increases classification accuracy. The cropped lung ROI has features that are more pertinent to the diagnosis of pneumonia, which aid in improving model performance. The cropped images are shown in Fig 4.2.6.

Fig 4.2.6 Cropping of Binarized Images

Cropping lung images with a set dimension will lead to a loss of vital characteristics, leading to poor classification of images. To tackle this, we have automated the cropping process for each image. The input image (1) is first smoothed using a Gaussian filter. On the smoothed image, edge detection is performed using the Sobel operator, as seen in image (2) in Fig. 4.3. The image is dilated first to fill any holes in the lung region, and the remaining gaps are closed, as shown in images (3) and (4) of Fig. 4.3. Further, the largest connected component is detected to aid in cropping the image. Using the bounding box function in MATLAB, the image is cropped and resized to 224x224 to keep all the images uniform.

Fig. 4.3 Process of Automated Cropping

V. PRE TRAINED NETWORKS

We have utilized pre-trained networks to detect pneumonia rather than building a CNN model from scratch. Using pretrained networks and transfer learning, we can achieve remarkable classification accuracy even with a small amount of labeled data by utilizing the information gathered from the pretrained networks. Transfer learning improves accuracy, shortens training times, and speeds up convergence in detecting anomalies linked to pneumonia in CT scans by optimizing pre-existing structures. By using well-established feature representations that have been trained from a variety of data sets, this method helps the model generalize better to new datasets and clinical scenarios.

A. ResNet-50 and ResNet-18 - for classification

ResNet-50 and ResNet-18, like other pretrained networks, are trained using over a million images from datasets such as ImageNet. These pretrained models are capable of classifying images into 1000 classes. Pre-trained networks can be optimized to classify images of classes different from the ones they were trained on. Since both networks have not been trained on medical images, their performance on CT scan images is not optimized for precise classification
of medical images. Fig 5.1 and Fig 5.2 show the block diagrams of ResNet-18 and ResNet-50, respectively.

ResNet-18: To make the model familiar with medical images and increase its overall capability, the first 44 layers (14 convolutional layers, ReLU layers, etc.) and the last 15 layers (2 convolutional layers) of the model, as in Fig 5.1, are frozen first, and the model is trained using 8700 images of patients with COVID-19 infection (49 patients) and of patients with pneumonia (48 patients). After this, the first 59 layers (17 convolutional layers) of the model are frozen, and the last 12 layers (2 convolutional layers and multiple fully connected layers) are kept unfrozen to train the model a second time. This time, the model is trained using 4600 images of patients with COVID-19 infection (20 patients) and of patients with pneumonia (23 patients). This method has made the model familiar with the actual training data, making it more efficient and precise.

Fig 5.1 Block diagram of ResNet-18

ResNet-50: A similar methodology is followed while re-training ResNet-50. Layers 1 to 152 (the first 44 convolutional layers) are frozen, and layers 163 to 183 (the last 2 convolutional layers) are also frozen for future training. The model is trained using 14049 images of patients with COVID-19 infection (49 patients), pneumonia (48 patients), and normal patients (45 patients). Further, the first 162 layers are frozen, and layers 163 to 183 are unfrozen to train on a second set of 6826 images of patients with COVID-19 infection (20 patients), pneumonia (23 patients), and normal patients (26 patients).

Fig 5.2 Block diagram of ResNet-50

B. MobileNetV2 - for feature extraction

MobileNetV2 is a lightweight architecture, making it computationally efficient for deployment on resource-constrained devices without compromising accuracy. MobileNetV2 is used to extract features for the gradient-boosting classifier. The model is trained using approximately 15000 images. In transfer learning with the MobileNetV2 model, as shown in Fig 5.3, it is common practice to freeze all the convolutional layers and fine-tune only the fully connected layers that are added on top of the pre-trained MobileNetV2 model. MobileNetV2 typically has 19 blocks, and each block contains one or more convolutional layers. A total of 53 convolutional layers are frozen by setting their weight factor, weight learn rate factor, and bias learn rate factor to zero. Freezing these layers prevents their weights from being updated during training on the new dataset,
thus preserving the learned representations. By freezing these layers, we reduce the risk of overfitting on the new dataset and maintain stability.

Fig 5.3 Block diagram of MobileNet-V2

VI. ENSEMBLE MODEL

The idea of building an ensemble model is to leverage the strengths of the individual models. A more reliable and accurate diagnosis will result from combining the predictions of multiple models. This ensemble approach has the potential to be more accurate than relying on a single model.

A. ResNet-50 and ResNet-18

The ResNet-50 model has been trained on all three classes, namely COVID-19 infection, pneumonia, and normal patients. The model is highly accurate in its predictions of COVID-19-infected patients and pneumonia patients and is moderately accurate in its predictions of normal patients. ResNet-18 has been trained with only two classes, COVID-19 infection and pneumonia, and the model's prediction accuracy is high.

B. Gradient Boost Classifier

This method sequentially builds a model by adding decision trees. Each new tree focuses on improving on the previous ones by learning from their errors. It uses the collective predictions of all the trees in the ensemble to determine whether the output class label is healthy or pneumonia. The gradient-boosting classifier has been trained on two classes, namely pneumonia and normal patients, and the model achieves an accuracy of 95 percent.

Using the three models, we have developed an ensemble classifier that takes into consideration the predictions of all three classifiers before giving a final result.

VII. RESULTS AND DISCUSSIONS

A. Training Graph of Pre-trained models

Fig 6.1.1 Training Graph of Resnet-50

Fig 6.1.2 Training Graph of Resnet-50
Fig 6.2 Training Graph of MobileNet-V2

From the training graphs in Fig 6.1 and Fig 6.2, we see that there are some initial fluctuations in loss and accuracy, which indicate that the model is still adjusting its internal parameters. The loss curve decreases over epochs and the accuracy curve increases over epochs, which is a typical pattern during successful training.

B. ResNet-18 confusion matrix

Fig 6.3 Confusion matrix of ResNet-18

From the confusion matrix in Fig 6.3, we derived that the model has an accuracy of 96.1% with a precision of 0.98. The calculated F1 score is 0.96 and the recall is 0.94.

C. ResNet-50 confusion matrix

Fig 6.4 Confusion matrix of ResNet-50

From the confusion matrix in Fig 6.4, we derived that the model has an accuracy of 84.7% with a precision of 0.98. The calculated F1 score is 0.99 and the recall is 0.99.

D. Gradient boosting confusion matrix

Fig 6.5 Confusion matrix of Gradient Boosting
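The accuracy, precision, recall, and F1 values quoted for the confusion matrices in this section follow the standard definitions. The Python sketch below shows how they are obtained from confusion-matrix counts. It assumes a binary positive/negative layout, and the counts in the example are hypothetical, not the paper's data; the function names are illustrative only. The helper `f1_from` can be used to check that a reported F1 score is consistent with the reported precision and recall.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, F1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not taken from Fig 6.3 to 6.6).
accuracy, precision, recall, f1 = binary_metrics(tp=47, fp=1, fn=3, tn=49)
print(round(accuracy, 2), round(recall, 2))  # 0.96 0.94

# Consistency check with a precision of 0.98 and a recall of 0.94.
print(round(f1_from(0.98, 0.94), 2))  # 0.96
```

For a three-class model, the same formulas are usually applied per class (one class against the rest) and then averaged; which averaging the reported figures use is not stated in the text.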


From the confusion matrix in Fig 6.5, we derived that the model has an accuracy of 95.1% with a precision of 0.98. The calculated F1 score is 0.95 and the recall is 0.92.

E. Ensemble confusion matrix

Fig 6.6 Confusion matrix of Ensemble Model

From the confusion matrix in Fig 6.6, we derived that the model has an accuracy of 98% with a precision of 0.97. The calculated F1 score is 0.98 and the recall is 0.99.

VIII. CONCLUSION

The ensemble model combining pre-trained CNN models and a Gradient Boosting Classifier exhibits promising results in pneumonia and COVID-19 infection diagnosis using lung CT scans.

The proposed model achieved a high accuracy of 98% and a precision of 97%, indicating its strong ability to correctly differentiate between healthy lungs, lungs with pneumonia, and lungs infected with COVID-19. These results suggest that this ensemble approach holds promise as a valuable tool for computer-aided diagnosis in clinical settings. Through the optimization of each algorithm's distinctive features, our model attains improved accuracy, robustness, and versatility. Moreover, the ensemble method reduces the impact of each algorithm's specific flaws, producing predictions that are more reliable.

In medical applications, this ensemble model could assist radiologists in reading CT scans of the lungs, facilitating the early identification and precise diagnosis of pneumonia. Sustained research and development in this domain can improve diagnostic procedures, ultimately resulting in enhanced patient outcomes and more effective healthcare delivery.

REFERENCES

[1] Kundu R, Chakraborty S, Pal A, Sarker A, Shakhawat Hossain SM. Pneumonia detection in chest X-ray images using an ensemble of deep learning models. PLoS One. 2021 Sep 7;16(9):e0256630.

[2] Goyal, S., & Singh, R. (2023). Detection and classification of lung diseases for pneumonia and Covid-19 using machine and deep learning techniques. Journal of Ambient Intelligence and Humanized Computing, 14(4), 3239-3259.

[3] Ortiz-Toro, C., et al. (2022). Automatic detection of pneumonia in chest X-ray images using textural features. Computers in Biology and Medicine, 145, 105466.
[4] Xue, X., et al. (2023). Design and Analysis of a Deep Learning Ensemble Framework Model for the Detection of COVID-19 and Pneumonia Using Large-Scale CT Scan and X-ray Image Datasets. IEEE Transactions on Medical Imaging.

[5] Yi, R., et al. (2023). Identification and classification of pneumonia disease using a deep learning-based intelligent computational framework. Expert Systems with Applications.

[6] Alhudhaif, A., et al. (2021). Determination of COVID-19 pneumonia based on generalized convolutional neural network model from chest X-ray images. Computers in Biology and Medicine.

[7] Baltazar, L. R., et al. (2021). Artificial intelligence on COVID-19 pneumonia detection using chest X-ray images. PLOS ONE.

[8] Rahimzadeh, M., Attar, A., & Sakhaei, S. M. (2021). A fully automated deep learning-based network for detecting COVID-19 from a new and large lung CT scan dataset. Biomedical Signal Processing and Control, 68, 102588.

[9] Ahamed, K. U., Islam, M., Uddin, A., Akhter, A., Paul, B. K., Yousuf, M. A., ... & Moni, M. A. (2021). A deep learning approach using effective preprocessing techniques to detect COVID-19 from chest CT-scan and X-ray images. Computers in Biology and Medicine, 139, 105014.

[10] Kundu, R., Das, R., Geem, Z. W., Han, G. T., & Sarkar, R. (2021). Pneumonia detection in chest X-ray images using an ensemble of deep learning models. PLoS ONE, 16(9), e0256630.

[11] Zhang, K., Liu, X., Shen, J., Li, Z., Sang, Y., Wu, X., ... & Wang, G. (2020). Clinically applicable AI system for accurate diagnosis, quantitative measurements, and prognosis of COVID-19 pneumonia using computed tomography. Cell, 181(6), 1423-1433.

[12] Mostafavi, S. M. COVID19-CT-Dataset: An Open-Access Chest CT Image Repository of 1000+ Patients with Confirmed COVID-19 Diagnosis. 2021. DOI: https://doi.org/10.7910/DVN/6ACUZJ.
