Bone Fracture Detection With X-Ray Images Using MobileNet V3 Architecture
Abstract—New technologies are being developed rapidly across many disciplines, particularly the medical field. To detect bone fractures in X-ray images of different body segments, our work compares the ResNet-50 and MobileNetV3 architectures. It evaluates accuracy and computational efficiency on X-rays of the elbow, hand, and shoulder from the MURA dataset. Through training and validation, the models are evaluated on normal and fractured images. While ResNet-50 achieves higher accuracy in fracture identification, MobileNetV3 offers greater speed and resource efficiency. Despite ResNet-50's accuracy advantage, MobileNetV3's faster inference makes it a viable choice for real-time clinical applications, emphasizing the importance of balancing computational efficiency and accuracy in medical imaging. We created a graphical user interface (GUI) for bone fracture detection with the MobileNetV3 model. This research underscores MobileNetV3's potential to streamline bone fracture diagnosis, potentially revolutionizing orthopedic medical procedures and enhancing patient care.

Index Terms—CNN, MobileNet V3, ResNet-50, Healthcare, MURA, X-ray, fracture detection

I. INTRODUCTION

Medical imaging has undergone significant advancements through the integration of deep learning architectures, particularly in the analysis of X-ray images for fracture detection. The work in [1] pioneers an intelligent bone fracture detection system that combines image processing techniques with a trained neural network, offering high accuracy and efficiency in classifying intricate fracture patterns. This research focuses on evaluating the effectiveness of the ResNet-50 and MobileNetV3 architectures in identifying bone fractures across specific anatomical segments: the elbow, hand, and shoulder. The aim is to compare their performance in terms of accuracy, computational efficiency, and suitability for real-time clinical applications.

Reference [2] reviews the evolution and significance of deep CNN architectures in revolutionizing computer vision applications, covering key advancements and a taxonomy for architectural analysis and comparison. Traditional diagnostic methods relying on manual interpretation are subjective and prone to errors. The integration of automated systems driven by deep learning algorithms offers a promising avenue to enhance diagnostic accuracy and expedite patient care.

This study utilizes a comprehensive repository of musculoskeletal radiographs, encompassing both normal and fractured images across distinct anatomical segments. The selected segments are critical areas susceptible to fractures, emphasizing the clinical significance of this research. Dividing the dataset into training and validation sets enables robust evaluation of model performance.

The primary focus lies on the ResNet-50 and MobileNetV3 architectures, known for their capabilities in image classification tasks. ResNet-50 exhibits remarkable accuracy, while MobileNetV3, with its lightweight design, emphasizes computational efficiency while retaining acceptable accuracy. This study aims to discern the performance trade-off between accuracy and computational efficiency in bone fracture identification.

This study has four objectives. First, classify X-ray images into different bone classes using a deep neural network model [3]. Second, use machine learning methods to correctly identify [4] and categorize bone fractures in different regions (elbow, shoulder, and hand) from X-ray images. Third, employ an AI-assisted model that expedites fracture diagnosis and simplifies therapy administration, saving time for both patients and physicians. Fourth, improve therapy results: accurate and timely fracture diagnosis speeds up intervention, possibly shortening hospital stays and recovery periods, which ultimately improves patient care.

MobileNetV3 offers low latency, hardware efficiency, a compact memory footprint, and support for quantization-aware training. Conversely, ResNet-50 leverages a deeper architecture, excels in feature extraction, and demonstrates superior capability in learning intricate details. These differences highlight MobileNet V3's efficiency
Authorized licensed use limited to: VIT University- Chennai Campus. Downloaded on August 22,2024 at 07:18:35 UTC from IEEE Xplore. Restrictions apply.
balance for image classification on mobile devices compared to other pre-trained CNN models, as validated across various representative datasets. MobileNetV3 [9], applied in a bilinear structure (BM-Net), shows promise in detecting cancer in histology images, achieving high accuracy and outperforming existing models. Its design optimizes efficiency while decreasing computing overhead by integrating multiple improvements, such as inverted residuals, linear bottlenecks, and a streamlined architecture. Owing to these improvements, MobileNetV3 can recognize bone fractures with impressive accuracy, which makes it a desirable option for medical imaging applications. With an architecture tuned for memory and accuracy, Quantization Friendly MobileNet (QF-MobileNet) [10] improves bone fracture detection with better resource utilization and inference accuracy for real-time applications on embedded platforms.

Furthermore, MobileNetV3's focus on computational efficiency enables quick inference, which is essential in clinical contexts where real-time diagnosis is required. Because it balances efficiency and accuracy, its architecture is promising for accurate and quick bone fracture detection and contributes to better diagnostic procedures in orthopedic medicine.

B. ResNet (Residual Network)

ResNet [11], particularly the adapted ResNet50 model augmented with SENet capabilities, emerges as a potent solution for detecting ankle fractures from radiographic images. Its deep learning architecture and SENet integration enable robust fracture identification. With an accuracy of 93%, an AUC of 95%, and a recall of 92%, this model outperforms other architectures in fracture detection.

Moreover, Grad-CAM visualizations offer valuable insights into the model's decision-making process, highlighting the areas of the radiographs that are most significant for fracture identification. The proficiency of the adapted ResNet50 with SENet capabilities signifies its potential as a reliable diagnostic aid that can augment traditional diagnostic methods. Nevertheless, ongoing refinement and expert validation are imperative for ensuring its optimal integration and utility in clinical settings.

The strength of ResNet-50 in bone fracture detection stems from its ability to efficiently train very deep networks. By addressing the vanishing gradient issue through residual connections, it enables training very deep structures (50+ layers) without degradation. This depth facilitates the recovery of subtle details from X-ray images, which is crucial for identifying intricate fracture patterns. ResNet-50's depth and design enhance the learning and representation of fine-grained fracture information, leading to more reliable diagnoses. Using residual blocks with skip connections, it captures hierarchical features at various abstraction levels, aiding in discerning fracture-related features and textures throughout training.

Fig. 3. The Architecture of ResNet 50 model

The ResNet-50 architecture, shown in Fig. 3, leverages its deep layers to extract complex information from X-ray images [12]. The network learns to recognize fracture indicators such as discontinuities or unique textures by combining data from different layers. Its depth enables gradual learning and combination of information, while residual connections alleviate issues like vanishing gradients, ensuring efficient learning and representation of complex fracture details.

The equations governing ResNet-50's operations involve convolutions, activation functions such as Rectified Linear Units (ReLU), and residual connections. These equations enable feature extraction, activation, and the combination of input and output from different layers, enhancing the network's capability to identify fractures accurately. The process for the bone fracture detection model is shown in the following section.

III. METHODOLOGY

This study's methodology systematically compares the effectiveness and precision of two deep learning architectures, ResNet-50 and MobileNetV3, for identifying bone fractures in X-ray images of the elbow, hand, and shoulder.

A. Implementation

The MURA dataset [13], an extensive collection of bone radiographs, serves as the foundation for our system, enabling precise detection of bone fractures and offering valuable insights for advancing diagnostic capabilities in bone-related conditions. The implementation involves deploying a system that uses the ResNet-50 and MobileNetV3 models for the automatic diagnosis of bone fractures in specific anatomical segments (elbow, hand, and shoulder) using X-ray images.

The equations used in ResNet-50 for computations involve:

1) Convolution Operation: A filter (kernel) is applied to the input data to extract features. The operation can be represented as:

$$Z^{[l]} = W^{[l]} * A^{[l-1]} + b^{[l]} \tag{1}$$

where $A^{[l-1]}$ is the activation of the previous layer, $W^{[l]}$ stands for the weights, $b^{[l]}$ is the bias term, and $Z^{[l]}$ is the output of the convolutional layer.
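As a minimal sketch of the convolution in equation (1), the operation can be written in plain Python. The sketch assumes a single-channel input, a single filter, stride 1, and no padding, none of which the text specifies; the function name `conv2d_valid` and the toy input are hypothetical. As in most deep learning frameworks, the kernel is applied without flipping (cross-correlation).

```python
def conv2d_valid(a_prev, w, b):
    """Compute Z = W * A_prev + b for one 2-D filter (stride 1, no padding)."""
    kh, kw = len(w), len(w[0])
    out_h = len(a_prev) - kh + 1
    out_w = len(a_prev[0]) - kw + 1
    z = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the patch under the kernel, plus the bias term.
            acc = b
            for u in range(kh):
                for v in range(kw):
                    acc += w[u][v] * a_prev[i + u][j + v]
            row.append(acc)
        z.append(row)
    return z


# Toy 3x3 input (assumed data) with an intensity step in the last column,
# and a 2x2 kernel that responds to left-to-right intensity increases.
a_prev = [[0, 0, 1],
          [0, 0, 1],
          [0, 0, 1]]
w = [[-1, 1],
     [-1, 1]]
z = conv2d_valid(a_prev, w, b=0)
print(z)
```

The output is largest where the kernel overlaps the intensity step, which is the sense in which a filter "extracts features" such as edges or discontinuities.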
Fig. 4. MobileNet V3 Architecture for bone fracture detection

4) Fully Connected Layer: Fully connected layers, which are represented as matrix multiplication, are used in the last layers of ResNet-50 for classification:

$$Z^{[L]} = W^{[L]} * A^{[L-1]} + b^{[L]} \tag{4}$$

The ResNet-50 fracture detection model, optimized for edge devices, features a user-friendly graphical user interface (GUI) with real-time capabilities. It employs the ResNet-50 architecture for high accuracy in identifying bone fractures from X-ray images and is designed to operate with minimal computational resources. Its intuitive GUI enables easy navigation and input of X-ray images for rapid fracture analysis, providing quick and reliable diagnoses that suit on-the-go medical applications in clinical settings.

The general equations used in MobileNetV3 for computations involve:

5) Depthwise Separable Convolution: Utilizes the concept of residual connections, enhancing information flow within the network. Following pointwise convolution, the depthwise convolution is illustrated in equations 5 and 6:

$$Z^{[l]} = W^{[l]} * A^{[l-1]} \tag{5}$$

$$\text{Residual} = F(X, W) + X \tag{7}$$

Here, $F(X, W)$ denotes the function performed by layers with weights $W$ on input $X$. The algorithm for the MobileNet V3 model is shown below.

Algorithm 1 Bone Fracture Detection using MobileNet V3
Require: X-ray images dataset for different body parts (Elbow, Hand, Shoulder)
for each body_part in [Elbow, Hand, Shoulder] do
    images ← LoadImagesForBodyPart(body_part)
    labels ← CategorizeLabels(images)
    train_images, test_images ← SplitTrainTest(images, labels)
    model ← InitializeMobileNetV3Large()
    model ← AddLayersForFineTuning(model)
    CompileModel(model)
    TrainModel(model, train_images)
    accuracy ← EvaluateModel(model, test_images)
    PlotAccuracyLoss(model)
    SaveModel(model)
end for

Fig. 4 displays a system that uses a MobileNetV3 model to automatically diagnose elbow, hand, and shoulder bone
fractures in X-rays. The model receives X-ray images as input and outputs a classification of "normal" or "fractured". The system's speed and accuracy are intended to aid medical professionals in the prompt diagnosis of bone fractures.

The MobileNetV3 fracture detection model, tailored for edge devices, has a user-friendly graphical user interface (GUI) designed for rapid bone fracture identification from X-ray images. Using the MobileNetV3 architecture, it achieves efficiency without compromising accuracy and operates swiftly with minimal computational resources. Its intuitive GUI facilitates X-ray image input and navigation, enabling quick fracture analysis and making it well suited to swift diagnoses in clinical settings.

B. Training and Validation

The models are trained on the training set, where the neural networks learn the patterns and features indicative of normal and fractured bones. Hyperparameters, including learning rate, batch size, and optimizer settings, are explored to enhance model performance. The trained models are then evaluated on the validation set to assess their accuracy, precision, recall, and F1-score for fracture detection across each anatomical segment.

C. Performance Analysis

Accuracy metrics are calculated and compared between ResNet-50 and MobileNetV3 across distinct anatomical segments to reveal the efficacy of both architectures in identifying fractures in the elbow, hand, and shoulder. Additionally, training progress, including accuracy and loss curves, is visualized across multiple epochs. This visualization aids in assessing model convergence and discerning performance trends over the training period.

to its lightweight architecture. MobileNetV3 is designed to be computationally efficient, making it more suitable for resource-constrained environments like edge devices. Its architecture allows for faster inference while maintaining reasonable accuracy, making it easier to implement on devices with limited computational power compared to ResNet-50, which is deeper and computationally heavier. Overall, in terms of deployment on edge devices, MobileNetV3 is often perceived as more straightforward and efficient.

In a nutshell, MobileNetV3 satisfies crucial efficiency targets for real-time detection on inexpensive hardware with the small on-device models essential for clinical translation, while ResNet50 maintains higher accuracy. Rapid inference directly at the image source can assist doctors in early diagnosis and improve patient outcomes through accurate triaging and prompt therapy.

We conducted our work on Colab, using a T4 GPU for processing. The outcomes of our implementation are shown in the next section.

IV. RESULTS

Three bone segments from the MURA dataset, a collection of musculoskeletal radiographs, are used: elbow, hand, and shoulder, as seen in Fig. 5.
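As a minimal sketch of the validation metrics named in the Training and Validation subsection (accuracy, precision, recall, and F1-score), the following plain-Python function computes them from binary labels. It assumes that label 1 means "fractured" and 0 means "normal"; that encoding, the function name `binary_metrics`, and the toy labels are illustrative assumptions and are not taken from the paper's implementation or results.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = fractured)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fractures correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normals correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normals flagged as fractured
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fractures that were missed

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for eight hypothetical validation images (not real results).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting recall alongside accuracy matters in this setting because a missed fracture (a false negative) and a false alarm (a false positive) carry different clinical costs, and accuracy alone does not separate them.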
TABLE II
COMPARISON OF RESNET-50 ACCURACIES ON HAND IMAGES
TABLE III
COMPARISON OF VALIDATION ACCURACY
TABLE V
COMPARISON OF TRAINING ACCURACY
Fig. 9. Fracture detected in elbow's X-ray image
Fig. 10. Fracture is not detected in elbow's X-ray image
Building a MobileNetV3-based fracture detection model for

REFERENCES
[1] K. Dimililer, “IBFDS: Intelligent bone fracture detection system,” Procedia Computer Science, vol. 120, pp. 260–267, 2017. Available online 14 December 2017, Version of Record 14 December 2017.
[2] D. Bhatt, C. Patel, H. Talsania, J. Patel, R. Vaghela, S. Pandya, K. Modi, and H. Ghayvat, “CNN variants for computer vision: History, architecture, application, challenges and future scope,” 2021. Submission received: 2 September 2021 / Revised: 24 September 2021 / Accepted: 25 September 2021 / Published: 11 October 2021.
[3] B. Y. Panchal, B. Talati, S. Shah, and S. Patel, “Bone fracture classification using modified AlexNet,” Stochastic Modeling & Applications, vol. 26, p. 10, June 2022. Special Issue.
[4] K. B. S. Kiran and B. Satyasaivani, “Bone fracture detection using convolutional neural networks,” 2022.
[5] Y. L. Thian, Y. Li, P. Jagmohan, D. Sia, V. E. Y. Chan, and R. T. Tan, “Convolutional neural networks for automated fracture detection and localization on wrist radiographs,” Radiology: Artificial Intelligence, vol. 1, no. 1, 2019.
[6] L. Alzubaidi, J. Zhang, A. J. Humaidi, A. Al-Dujaili, Y. Duan, O. Al-Shamma, J. Santamaría, M. A. Fadhel, M. Al-Amidie, and L. Farhan, “Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions,” vol. 8, March 2021.
[7] U. Kulkarni, M. S.M., S. V. Gurlahosur, and G. Bhogar, “Quantization friendly MobileNet (QF-MobileNet) architecture for vision based applications on embedded platforms,” Journal Name, 2020. Article history: Received 24 June 2020; Received in revised form 10 November 2020; Accepted 23 December 2020; Available online 29 December 2020.
[8] S. Qian, C. Ning, and Y. Hu, “MobileNetV3 for image classification,” 2021.
[9] J. Huang, L. Mei, M. Long, Y. Liu, W. Sun, X. Li, H. Shen, F. Zhou, X. Ruan, D. Wang, S. Wang, T. Hu, and C. Lei, “BM-Net: CNN-based MobileNet-V3 and bilinear structure for breast cancer detection in whole slide images,” Bioengineering, vol. 9, no. 6, p. 261, 2022. Original submission received: 31 January 2022 / Resubmission received: 24 April 2022 / Revised: 15 June 2022 / Accepted: 15 June 2022 / Published: 20 June 2022.
[10] U. Kulkarni, M. S.M., S. V. Gurlahosur, and G. Bhogar, “Quantization friendly MobileNet (QF-MobileNet) architecture for vision-based applications on embedded platforms,” January 2021. Received 24 June 2020, Revised 10 November 2020, Accepted 23 December 2020, Available online 29 December 2020, Version of Record 8 January 2021.
[11] J. Ying, H. Wang, J. Liu, T. Yu, and D. Huang, “Harnessing ResNet50 and SENet for enhanced ankle fracture identification,” 2023.
[12] “A deep learning-based method for the diagnosis of vertebral fractures on spine MRI: Retrospective training and validation of ResNet,” vol. 31, January 2022. Published on 28 January 2022.
[13] A. Solovyova and I. Solovyov, “X-ray bone abnormalities detection using MURA dataset,” arXiv preprint arXiv:2008.03356, 2020.
[14] S. Mohapatra, N. Abhishek, D. Bardhan, A. A. Ghosh, and S. Mohanty, “Comparison of MobileNet and ResNet CNN architectures in the CNN-based skin cancer classifier model,” in Book Title (S. N. Mohanty, G. Nalinipriya, O. P. Jena, and A. Sarkar, eds.), 2021. First published: 12 April 2021.
[15] I. B. Venkateswarlu, J. Kakarla, and S. Prakash, “Face mask detection using MobileNet and global pooling block,” in 2020 IEEE 4th Conference on Information & Communication Technology (CICT), IEEE, 2020.
[16] K. Üreten, H. F. Sevinç, U. İğdeli, A. Onay, and Y. Maraş, “Use of deep learning methods for hand fracture detection from plain hand radiographs,”