AI BASE Report Team 4
CHAPTER 1
1.1 INTRODUCTION
Pneumonia is a serious respiratory infection that affects millions of people
worldwide each year, particularly children under five and the elderly. It causes
inflammation in the air sacs of the lungs, which can fill with fluid or pus, leading to
symptoms such as coughing, difficulty breathing, and fever. Early and accurate
detection of pneumonia is crucial to ensure timely medical intervention and reduce
the risk of complications or death.
This system typically uses deep learning algorithms and medical imaging,
especially chest X-rays, to detect signs of pneumonia with high accuracy and
efficiency. By leveraging artificial intelligence, the system assists healthcare
professionals in making faster and more reliable diagnoses, even in remote or
low-resource settings. The implementation of such systems holds the potential to
transform respiratory disease management and improve patient outcomes
worldwide.
1. This section provides an overview of related studies to give a better
understanding of the area under study and a picture of the state of the art.
Convolutional neural networks (CNNs), which are among the most effective DL
models, have outperformed conventional methods in several disciplines,
including image classification and pattern recognition [13,14]. They have also
been applied successfully in medicine, with strong performance in challenging
settings. Various medical imaging systems using DL techniques have been
developed to assist physicians and specialists in effective COVID-19 diagnosis,
care, and follow-up examination [15,16]. Narin et al. [11] used five-fold
cross-validation to evaluate several binary classifiers. The pre-trained
ResNet-50 model performed best, with an accuracy of 98%, a specificity of 100%,
and a recall of 96%. Wang et al. [17] proposed COVID-Net, a new deep
architecture that detects COVID-19 cases automatically from chest X-ray (CXR)
images. On a database of 13,975 CXR images, this model achieved a
classification accuracy of 93.3%. The key strength of this approach is that its
architectural design choices balance competing goals such as accuracy and
computational cost. Hemdan et al. [18] introduced COVIDX-Net, a DL framework
for detecting COVID-19 infections in CXR images. A small dataset of
50 images was used to compare seven DL techniques (e.g., MobileNetV2,
2. In the past decade, the field of medical image analysis has grown exponentially,
with an increased number of pattern recognition tools and an increase in data set
sizes. These advances have facilitated the development of processes for
3. The coronavirus infection, COVID-19, has surprised the world with its spread
and potential virulence, with a potentially profound overall impact on the lives
of billions of people from both a safety and an economic perspective. As of this
writing, there are approximately 93,158 confirmed cases, of which 80,270 are in
“Mainland China”, with 3,198 deaths, a mortality rate of 3.4% [1]. The
sensitivity of the current diagnostic gold standard at the initial presentation
of the disease has been called into question. Fang et al. [2] compared the
sensitivity of non-contrast chest CT with reverse transcription-polymerase chain
reaction (RT-PCR), which detects viral nucleic acid and is the current reference
standard in the detection of COVID-19. Their study looked at 51 patients who had
a history of travel to or residence in endemic areas and fever or acute
respiratory symptoms of unknown cause. The patients underwent initial and repeat
RT-PCR testing. Their gold standard was an eventual confirmed diagnosis of
COVID-19 infection by serial RT-PCR testing. The authors suggested a sensitivity
of non-contrast chest CT for detection of COVID-19 infection of 98%, compared to
an initial RT-PCR sensitivity (results of the first RT-PCR test) of 71%. Cases
highlighted in their paper demonstrated either diffuse or focal ground-glass
opacities.
This lack of sensitivity on initial RT-PCR testing was also described in another
study by Xie et al. [3], who reported that 3% of 167 patients had negative RT-PCR
for the virus despite chest CT findings typical of viral pneumonia, suggesting the
use of chest CT to decrease false-negative lab studies. Bernheim et al. [4] examined
121 chest CT studies from four centers in China that were obtained in the early,
intermediate and late stages of infection. They also described ground-glass
opacities as characteristic of the disease, particularly bilateral and peripheral
ground-glass and consolidative pulmonary opacities.
They noted greater severity of disease with increasing time from onset of
symptoms, and they described later signs of disease as including greater lung
involvement, linear opacities, a “crazy-paving” pattern and the “reverse halo”
sign. There was bilateral lung involvement in 28% of early patients, 76% of
intermediate patients and 88% of patients in late stages (6-12 days) of disease.
Once a decision has been made to use thoracic CT as these recent studies suggest
for patient diagnosis or screening, a need immediately arises to rapidly evaluate
potentially very large numbers of imaging studies.
III. Testing phase in which an additional set of cases not used in training is
presented to the network and the output of the network is tested
statistically to determine its success of categorization.
In the case of a new disease, such as the coronavirus, datasets are just now being
identified and annotated. There are very limited data sources as well as limited
expertise in labeling the data specific to this new strain of the virus in humans.
Accordingly, it is not clear that there are enough examples to achieve clinically
meaningful learning at this early stage of data collection despite the increasingly
critical importance of this software, especially given fears of a pandemic. It is
our hypothesis that AI-based tools can be rapidly developed leveraging the ability
to modify and adapt existing AI models and combine them with initial clinical
understanding to address the new challenges and new category of COVID-19.
Our goal is to develop deep-learning-based automated CT image analysis tools
and demonstrate that they can enable differentiation of coronavirus patients from
those who do not have the disease to provide support in the detection,
measurements, and tracking of disease progression.
4. In this initial exploratory work, we show the capabilities of AI to assist in the
efforts to accurately detect and track the progression or resolution of the
Coronavirus. This is the first report to our knowledge in the literature of software
specifically developed to detect, characterize and track the progression of
This would allow a greater volume of patients being screened for Coronavirus,
with earlier and more rapid detection of positive cases, which could lead to more
effective identification and containment of early cases. As illustrated above, using
standard machine learning techniques and innovative AI applications, in
combination with an established pulmonary CT detection platform, an effective
tool can be utilized for the screening and early detection of patients who may have
contracted the COVID-19 pathogen. In individual patients who have contracted
the virus and have the pulmonary abnormalities associated with it, the same
methodologies can be used to accurately and more rapidly assess disease
progression and guide therapy and patient management.
Pneumonia is spreading rapidly around the world. Early diagnosis and isolation
of pneumonia patients have proven crucial in slowing the disease's spread. One
of the best options for detecting pneumonia reliably and easily is to use image
processing techniques and machine learning (neural network) strategies. This
study proposes two DL approaches based on a pretrained neural network model for
pneumonia detection using chest X-ray (CXR) images. With the ever-increasing
demand for screening millions of prospective pneumonia cases, and given the
high false-negative rate of the commonly used PCR tests, an alternative, simple
screening mechanism based on radiological images (such as chest X-rays) becomes
important. In this scenario, machine learning (ML) and deep learning (DL) offer
fast, automated, effective strategies to detect abnormalities and extract key
features of the altered lung parenchyma, which may be related to specific
signatures of pneumonia.
1.3.2 Objectives
To develop AI-based automated X-ray image analysis tools for the detection,
quantification, and tracking of pneumonia, and to demonstrate that they can
differentiate pneumonia patients from those who do not have the disease. The
scope of this project is to build an efficient AI system that detects pneumonia
from X-ray images, which cost less than CT scans. Using image processing
techniques and an ML neural network, the system aims to detect pneumonia faster
and accurately. The system is therefore to be optimized for memory and compute.
• Feature Extraction:
1.3.6 CONSTRAINT
The Pneumonia Infection Detection System using Deep Learning faces several
constraints that can impact its performance and deployment. One of the primary
limitations is data quality and availability, as deep learning models require large,
well-labeled datasets for accurate detection, yet medical imaging datasets are
often limited due to privacy concerns and variability in image acquisition
techniques.
CHAPTER 2
SL No Tasks
1 Requirement gathering/Planning
2 Design
3 UI-Preparation
4 UI-Development
5 DB design/Back design
6 Methodologies
*Image processing
*Gray scale
*Bilateral Algorithm
*Binary conversion
*ROI
*KNN
7 Testing
Table 2.1 Work Break Down
2 Design
3 UI-Preparation
NetBeans
4 UI-Development
3 week
6 Methodology [Image processing, Gray scale, Bilateral Algorithm, Binary conversion, ROI, KNN] NetBeans 6 weeks
7 Testing Manual Testing 1 week
Design
2
Mitigation: Comply with data privacy regulations like HIPAA or GDPR, use
data anonymization, and implement strong security measures to protect
sensitive information.
Mitigation: Use advanced algorithms, continuously test the model with different
datasets, and perform validation using cross-validation techniques to reduce
overfitting.
Risk: The system might not be readily accepted by healthcare providers due to a
lack of understanding, trust, or reluctance to adopt new technology.
5. Regulatory Compliance
Mitigation: Stay up-to-date with relevant regulatory bodies (FDA, EMA, etc.) and
ensure the project adheres to medical device regulations.
Risk: If the training data lacks diversity, the model might produce biased results,
leading to disparities in diagnosis across different patient demographics.
Mitigation: Build the system with scalability in mind, conduct stress tests, and
establish an ongoing maintenance and update plan.
• Pneumonia Detection
The system should process the image using a machine learning model
to detect the presence of pneumonia.
• Result Display
The system should display results (e.g., "Pneumonia Detected" or
"Normal") along with confidence scores.
• Image Preprocessing
The system should normalize, resize, and possibly enhance the X-ray
images before feeding them into the model.
• Report Generation
The system should generate a downloadable report summarizing the
findings.
• Extensibility: The project is open to future modification.
• Availability: The system will be accessible at all times, except for
the time needed to back up data.
• Portability: The application is written in Java. It is portable to other
operating systems provided a JDK is available on them.
• Integrity: The project is developed in an integrated development
environment, where every class, member, and attribute is organized
under a Java package. Building and running the main class compiles all
of the other classes, so the whole project is assembled correctly.
Users (patients or doctors) can fill in a form with the above clinical
parameters. This could be a chatbot-style interface or a web form.
2. Hardware Constraints
Hard Disk : 50 GB
RAM : 1GB
Monitor : LED
3. Software Constraints
• Technology : Java
4. Model Constraints
1: Image acquisition:
The Chest X-Ray dataset is used as input for this COVID-19 detection
system. It consists of 5000 X-ray images from 1610 patients, in two
classes: normal and COVID-19.
2. Image pre-processing
B. Bilateral Filtering:
X-ray images contain background noise caused by the brightness
characteristics of individual pixels, so image pre-processing is necessary.
Pre-processing removes noise with a bilateral filter and removes labels and
artifacts. The bilateral filter is a non-linear, edge-preserving smoothing
filter used to remove noise from an image. The image is then cropped as
required.
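As an illustration of the filtering step, the following is a minimal sketch of a bilateral filter on a gray-scale image, not the project's actual implementation. It assumes the image is held as an int[height][width] array with values 0 to 255; the class name and the parameters radius, sigmaSpace, and sigmaRange are hypothetical names chosen for this example. Each output pixel is a weighted average of its neighbors, where the weight falls off with both spatial distance and intensity difference, which is why edges are preserved.

```java
/** Minimal bilateral filter sketch for a gray-scale image (values 0..255). */
final class BilateralFilterSketch {

    /**
     * @param img        gray-scale image as int[height][width]
     * @param radius     half-width of the square window (assumed parameter)
     * @param sigmaSpace spatial Gaussian standard deviation, in pixels (assumed)
     * @param sigmaRange intensity Gaussian standard deviation, in gray levels (assumed)
     */
    static int[][] filter(int[][] img, int radius, double sigmaSpace, double sigmaRange) {
        int h = img.length;
        int w = img[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                double sum = 0.0;
                double norm = 0.0;
                for (int dy = -radius; dy <= radius; dy++) {
                    for (int dx = -radius; dx <= radius; dx++) {
                        int ny = y + dy;
                        int nx = x + dx;
                        if (ny < 0 || ny >= h || nx < 0 || nx >= w) {
                            continue; // ignore neighbors outside the image
                        }
                        double dSpace = dx * dx + dy * dy;
                        double dRange = img[ny][nx] - img[y][x];
                        // Weight is small if the neighbor is far away OR very different in intensity.
                        double weight = Math.exp(-dSpace / (2 * sigmaSpace * sigmaSpace)
                                - dRange * dRange / (2 * sigmaRange * sigmaRange));
                        sum += weight * img[ny][nx];
                        norm += weight;
                    }
                }
                // norm is always positive because the center pixel has weight 1.
                out[y][x] = (int) Math.round(sum / norm);
            }
        }
        return out;
    }
}
```

A call such as BilateralFilterSketch.filter(image, 2, 2.0, 25.0) smooths small intensity fluctuations while leaving strong edges largely unchanged; suitable parameter values depend on the images and would need to be tuned.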
C. Image segmentation:
Image segmentation is the process of partitioning an image into several
constituent components. Binary conversion is used to segment the image:
the algorithm returns a single intensity threshold that separates pixels
into two classes, foreground and background.
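The text does not name the thresholding algorithm. Otsu's method is one common technique that returns a single intensity threshold separating foreground from background, so the sketch below uses it as an assumption. The class and method names are hypothetical, and the image is assumed to be an int[height][width] array with values 0 to 255.

```java
/** Sketch of binary conversion with Otsu's method (an assumed choice of algorithm). */
final class OtsuThresholdSketch {

    /** Returns the threshold t (0..255) that maximizes the between-class variance. */
    static int threshold(int[][] gray) {
        int[] hist = new int[256];
        long total = 0;
        for (int[] row : gray) {
            for (int v : row) {
                hist[v]++;
                total++;
            }
        }
        double sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (double) i * hist[i];
        }

        double sumBg = 0;
        long countBg = 0;
        double bestVar = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            countBg += hist[t];
            sumBg += (double) t * hist[t];
            if (countBg == 0) {
                continue; // no background pixels yet
            }
            long countFg = total - countBg;
            if (countFg == 0) {
                break; // every pixel is already background
            }
            double meanBg = sumBg / countBg;
            double meanFg = (sumAll - sumBg) / countFg;
            double between = (double) countBg * countFg * (meanBg - meanFg) * (meanBg - meanFg);
            if (between > bestVar) {
                bestVar = between;
                best = t;
            }
        }
        return best;
    }

    /** Marks pixels strictly brighter than t as foreground (true). */
    static boolean[][] binarize(int[][] gray, int t) {
        boolean[][] fg = new boolean[gray.length][];
        for (int y = 0; y < gray.length; y++) {
            fg[y] = new boolean[gray[y].length];
            for (int x = 0; x < gray[y].length; x++) {
                fg[y][x] = gray[y][x] > t;
            }
        }
        return fg;
    }
}
```

Calling threshold on the filtered image and passing the result to binarize yields the binary image used by the later steps. A fixed, manually chosen threshold would work with the same binarize method.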
3. Feature extraction:
Feature extraction is a method of capturing the visual content of images
for indexing and retrieval. We consider two types of texture features:
first-order and second-order. First-order texture features are calculated
from individual pixel values and do not consider relationships between
neighboring pixels. Second-order texture features are statistics computed
from the relationships between neighboring pixels. Mean, standard
deviation, and skewness are examples of first-order texture features.
Energy, entropy, homogeneity, correlation, and contrast are examples of
second-order texture features.
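To make the distinction concrete, here is a minimal sketch that computes three first-order features (mean, standard deviation, skewness) and two second-order features (energy, contrast). The second-order features are derived from a gray-level co-occurrence matrix of horizontally adjacent pixels, which is one standard way to capture neighbor relationships and is an assumption here, since the text does not say how they are computed. The class name, method names, and the levels parameter are hypothetical; the image is assumed to be at least two pixels wide with values 0 to 255.

```java
/** Sketch of first-order and second-order texture features for a gray-scale image. */
final class TextureFeaturesSketch {

    /** First-order features {mean, standard deviation, skewness} from individual pixels. */
    static double[] firstOrder(int[][] gray) {
        long n = 0;
        double sum = 0;
        for (int[] row : gray) {
            for (int v : row) {
                sum += v;
                n++;
            }
        }
        double mean = sum / n;
        double m2 = 0;
        double m3 = 0;
        for (int[] row : gray) {
            for (int v : row) {
                double d = v - mean;
                m2 += d * d;
                m3 += d * d * d;
            }
        }
        m2 /= n;
        m3 /= n;
        double std = Math.sqrt(m2);
        double skew = (std == 0) ? 0 : m3 / (std * std * std);
        return new double[] {mean, std, skew};
    }

    /**
     * Second-order features {energy, contrast} from a normalized co-occurrence
     * matrix of horizontal neighbor pairs, after quantizing to the given number of levels.
     */
    static double[] secondOrder(int[][] gray, int levels) {
        double[][] glcm = new double[levels][levels];
        long pairs = 0;
        for (int[] row : gray) {
            for (int x = 0; x + 1 < row.length; x++) {
                int a = row[x] * levels / 256;     // quantized left pixel
                int b = row[x + 1] * levels / 256; // quantized right neighbor
                glcm[a][b]++;
                pairs++;
            }
        }
        double energy = 0;
        double contrast = 0;
        for (int i = 0; i < levels; i++) {
            for (int j = 0; j < levels; j++) {
                double p = glcm[i][j] / pairs;
                energy += p * p;
                contrast += (i - j) * (i - j) * p;
            }
        }
        return new double[] {energy, contrast};
    }
}
```

The two arrays can be concatenated into a single feature vector for the classifier. Entropy, homogeneity, and correlation can be computed from the same co-occurrence matrix in a similar loop.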
CHAPTER 3
The general algorithm for transferring color is described first; the basic
idea is then extended to use sample blocks. Converting a gray-scale image
to a color image requires a few simple steps.
Step 1: Each image is converted into the lxy color space. Select a small
subset of pixels in the color image as sample.
Step 4: The chromaticity values (x,y) of the best matching pixel are then
transferred to the gray-scale image to form the final image.
Step 5: Color transfer using sample blocks involves the same whole-image
matching procedure, but only between the source and target sample blocks.
2. Bilateral Filtering
3. ROI Segmentation
Step 1. A segmentation algorithm is applied to the loaded image to separate
its light and dark regions. Each pixel is classified by comparing the mean
of its color channels with a threshold $T$:
\[ P(x, y) = \begin{cases} \text{white}, & \text{if } \dfrac{R + G + B}{3} \ge T \\ \text{black}, & \text{otherwise} \end{cases} \]
where $P(x, y)$ represents the current pixel at coordinate positions $x$ and
$y$, and $R$, $G$, $B$ are its red, green, and blue values.
Step 3. For the given image to possibly contain a white region, its height
and width must satisfy the following criteria:
\[ \text{height} \ge 50, \quad \text{width} \ge 50, \quad 1 \le \frac{\text{height}}{\text{width}} \le 2 \]
Step 4. Convert the image into Binary image.
Step 5. Then cut the white region from the binary image according to
pixels present in the image by considering height and width.
Step 6. Then find the Regions of Interest (ROI) in the image according to the
knowledge-based method of cancer detection.
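The thresholding, size check, and cropping steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the project's code: the image is assumed to be an int[height][width] array of packed 0xRRGGBB values, the threshold is passed in as a parameter because its value is not given in the text, and all class and method names are hypothetical.

```java
/** Sketch of the ROI segmentation steps: binary conversion, size check, white-region bounds. */
final class RoiSegmentationSketch {

    /** A pixel is white (true) if the mean of its R, G, B values is at least the threshold. */
    static boolean[][] toBinary(int[][] rgb, int threshold) {
        int h = rgb.length;
        int w = rgb[0].length;
        boolean[][] white = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int p = rgb[y][x];
                int r = (p >> 16) & 0xFF;
                int g = (p >> 8) & 0xFF;
                int b = p & 0xFF;
                white[y][x] = (r + g + b) / 3 >= threshold;
            }
        }
        return white;
    }

    /** Size criteria from the text: height >= 50, width >= 50, 1 <= height/width <= 2. */
    static boolean isCandidate(int height, int width) {
        double ratio = (double) height / width;
        return height >= 50 && width >= 50 && ratio >= 1.0 && ratio <= 2.0;
    }

    /** Bounding box {top, left, height, width} of all white pixels, or null if there are none. */
    static int[] whiteBounds(boolean[][] white) {
        int top = Integer.MAX_VALUE;
        int left = Integer.MAX_VALUE;
        int bottom = -1;
        int right = -1;
        for (int y = 0; y < white.length; y++) {
            for (int x = 0; x < white[y].length; x++) {
                if (white[y][x]) {
                    top = Math.min(top, y);
                    left = Math.min(left, x);
                    bottom = Math.max(bottom, y);
                    right = Math.max(right, x);
                }
            }
        }
        if (bottom < 0) {
            return null;
        }
        return new int[] {top, left, bottom - top + 1, right - left + 1};
    }
}
```

The bounding box returned by whiteBounds gives the rows and columns to cut from the binary image; the cropped region is then the candidate from which the Regions of Interest are selected.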
4. Multilayer Perceptron NN
3.1.1 There are basically three steps in the training of the model.
Forward pass
Loss calculation
Backward pass
1. Forward pass
In this step of training the model, we pass the input to the model,
multiply it by the weights and add the bias at every layer, and obtain the
calculated output of the model.
2. Loss Calculation
When we pass a data instance (one example) through the model, we get an
output called the predicted output (pred_out). The label that comes with the
data is the real or expected output (Expect_out). Based on both, we
calculate the loss that we have to backpropagate (using the backpropagation
algorithm). Various loss functions are available, and we choose one based
on our output and requirements.
3. Backward Pass
After calculating the loss, we backpropagate it and update the weights of
the model using the gradient. This is the main step in the training of the
model. In this step, each weight is adjusted along the negative gradient,
which reduces the loss.
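The three training steps can be illustrated with a very small multilayer perceptron. The sketch below is an assumption-laden example rather than the project's model: it uses one sigmoid hidden layer, a single sigmoid output, binary cross-entropy as the loss function, and plain gradient descent. The class name, layer sizes, seed, and learning rate are hypothetical; predOut and expectOut correspond to pred_out and Expect_out in the text.

```java
import java.util.Random;

/** Tiny multilayer perceptron sketch: inputs -> sigmoid hidden layer -> one sigmoid output. */
final class MlpTrainingSketch {
    final double[][] w1; // hidden-layer weights, [hidden][inputs]
    final double[] b1;   // hidden-layer biases
    final double[] w2;   // output-layer weights
    double b2;           // output-layer bias
    private double[] hidden; // hidden activations cached by the forward pass

    MlpTrainingSketch(int inputs, int hiddenUnits, long seed) {
        Random rnd = new Random(seed);
        w1 = new double[hiddenUnits][inputs];
        b1 = new double[hiddenUnits];
        w2 = new double[hiddenUnits];
        for (int j = 0; j < hiddenUnits; j++) {
            w2[j] = rnd.nextDouble() - 0.5; // small random initial weights
            for (int i = 0; i < inputs; i++) {
                w1[j][i] = rnd.nextDouble() - 0.5;
            }
        }
    }

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** 1. Forward pass: multiply by weights, add bias, and apply the activation at every layer. */
    double forward(double[] x) {
        hidden = new double[w2.length];
        double z2 = b2;
        for (int j = 0; j < w2.length; j++) {
            double z1 = b1[j];
            for (int i = 0; i < x.length; i++) {
                z1 += w1[j][i] * x[i];
            }
            hidden[j] = sigmoid(z1);
            z2 += w2[j] * hidden[j];
        }
        return sigmoid(z2);
    }

    /** 2. Loss calculation: binary cross-entropy between predicted and expected output. */
    static double loss(double predOut, double expectOut) {
        double eps = 1e-12; // avoids log(0)
        return -(expectOut * Math.log(predOut + eps)
                + (1 - expectOut) * Math.log(1 - predOut + eps));
    }

    /** 3. Backward pass: backpropagate the loss and take one gradient-descent step. */
    double trainStep(double[] x, double expectOut, double learningRate) {
        double predOut = forward(x);
        double dz2 = predOut - expectOut; // gradient at the output for sigmoid + cross-entropy
        for (int j = 0; j < w2.length; j++) {
            // Gradient reaching hidden unit j, computed with w2[j] before it is updated.
            double dz1 = dz2 * w2[j] * hidden[j] * (1 - hidden[j]);
            w2[j] -= learningRate * dz2 * hidden[j];
            b1[j] -= learningRate * dz1;
            for (int i = 0; i < x.length; i++) {
                w1[j][i] -= learningRate * dz1 * x[i];
            }
        }
        b2 -= learningRate * dz2;
        return loss(predOut, expectOut); // loss measured before this update
    }
}
```

Calling trainStep repeatedly over the labeled feature vectors performs the forward pass, loss calculation, and backward pass for each example. In practice the inputs would be the extracted texture features, and training would loop over the whole dataset for several epochs.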
CHAPTER 4
Dept Of CSE 39 2025-26
Pneumonia Infection Detection System
Function : This button allows users to load the dataset required for
training the pneumonia detection model.
Process :
1. When clicked, it prompts the user to select a dataset from their
computer.
2. The dataset likely consists of chest X-ray images labeled as
"Pneumonia" or "Normal."
3. Once selected, the software processes and loads the dataset into
memory.
4. A confirmation message (like the one in the image) appears to inform
the user that the data has been successfully loaded.
4.1.1.2 Browse
How It Works:
1. User Clicks "Browse"
• Opens a file dialog to navigate through local storage and select
an X-ray image.
How It Works :
4.1.1.5 Predict
4. Prediction Displayed
• The identified class (either Normal or COVID+) appears
on the screen.
• The classification accuracy (e.g., 71.43%) is also shown,
indicating the confidence level of the prediction.
Interpretation of Output
• If the system predicts Normal (Accuracy: 71.43%), it means the
X-ray is classified as non-infected with 71.43% confidence.
• If the system predicts COVID+, it suggests pneumonia
infection is detected.
4.2 TYPES OF TESTING
4.2.1 Testing
Black box testing is a type of system testing where the internal structure or
code of the system being tested is not known to the tester. It focuses on
validating the functionality, inputs, and outputs of the system without
considering the internal implementation. Test cases are designed based on
system requirements and user perspectives to ensure the system functions
as expected.
White box testing, also known as structural or glass box testing, is a type
of system testing where the internal structure and code of the system are
known to the tester. It involves testing individual components, such as
functions, methods, or classes, and understanding how they interact within
the system. Test cases are designed to ensure code coverage, check for
logical errors, and validate the internal behavior of the system.
Test Description
Test system with a low-quality X-ray
image.
Sample Input
Chest X-ray of tuberculosis or lung
cancer patient.
Expected Output System does not misclassify it
as pneumonia.
Test Description
Test System With an X-ray of a
child
CHAPTER 5
2. Target Customers
• Hospitals and Clinics
• Diagnostic Laboratories
• Government health departments (public health screening)
• Non-Governmental Organizations (NGOs)
• Telemedicine companies
• Rural and Mobile Health Units
3. Revenue Model
• B2B Licensing: Sell licenses to hospitals and labs for using the detection
system.
• SaaS (Software as a Service): Subscription model for cloud-based
pneumonia detection tools.
• One-time Purchase: Selling the software/hardware package as a one-time
product.
• Partnerships: Collaborations with device manufacturers (e.g., X-ray
machine companies).
4. Cost Structure
• Development costs: AI model training, software development, image
processing integration.
• Data acquisition: Cost of medical datasets or partnerships with hospitals.
• Regulatory approvals: Compliance with medical regulations (e.g., FDA,
CE).
• Deployment and maintenance: Cloud services, updates, customer
support.
5. Competitive Advantage
• Faster and more accurate diagnosis
• Low-cost solution for resource-poor settings
• Integration with existing radiology infrastructure
• AI-driven second opinion for doctors
7. Social Impact
• Improved healthcare access in rural areas
• Reduced diagnostic errors
• Support for overwhelmed healthcare systems during outbreaks (e.g., flu
season, COVID-19)
8. Future Opportunities
• Expansion into detection of other respiratory conditions (e.g.,
tuberculosis, COVID-19)
• Integration with mobile apps or telemedicine platforms
• AI model improvement with ongoing data collection
The allocated budget for the capstone project is Twelve Thousand Rupees
(₹12,000). This budget should be carefully managed to cover expenses related
to hardware, software, data acquisition, development tools, and any other
necessary resources.
These financial projections must take into account ongoing expenses and
maintenance costs, including subscription fees for cloud services, hardware
upkeep, software licensing, and regular security updates. Additionally,
scalability is a crucial factor; as the project grows, the infrastructure should be
able to handle increased demand without significant cost overruns. By carefully
analyzing both revenue streams and recurring expenses, stakeholders can make
informed decisions, adjust the business strategy as needed, and ensure the
long-term sustainability of the project.
1. Development Costs
• Research and Development: Time and resources for algorithm
development (e.g., AI/ML for image recognition or diagnostic tools).
• Data Acquisition: Cost of collecting and annotating chest X-rays or CT
scans.
• Software/Hardware Requirements: Expenses for computing
infrastructure (cloud-based or on-premise).
2. Operational Costs
• Maintenance & Updates : Continuous improvement and system updates.
• Technical Support : Providing support for users and institutions.
• Training: Educating medical personnel to effectively use the system.
• Licensing Fees: If third-party tools or platforms are used.
Through rigorous training, testing, and validation phases, the system demonstrated
promising results in terms of sensitivity, specificity, and overall classification
performance. The automated detection model significantly reduced diagnostic
time, minimized human error, and provided a reliable decision support tool for