
Optimizing Food Image Classification Using Advanced Pre-Trained Networks


G. Prasad Babu, Assistant Professor, Department of CSE, Siddharth Institute of Engineering & Technology, Puttur, AP, India ([email protected])

Dr. R. G. Kumar, Associate Professor, Department of CSE, Siddharth Institute of Engineering & Technology, Puttur, AP, India ([email protected])

Syed Vaseem Basha, UG student, Department of CSE, Siddharth Institute of Engineering & Technology, Puttur, AP, India ([email protected])

Anshu Kumar Raj, UG student, Department of CSE, Siddharth Institute of Engineering & Technology, Puttur, AP, India ([email protected])

Akshay Kumar Singh, UG student, Department of CSE, Siddharth Institute of Engineering & Technology, Puttur, AP, India (akshaysing975@gmail.com)

Chalampalem Vasu Deva, UG student, Department of CSE, Siddharth Institute of Engineering & Technology, Puttur, AP, India ([email protected])

ABSTRACT

Food image classification has become an important area of research due to the massive number of food-related images shared across social media and the internet. Multiple deep pre-trained CNN architectures have been applied to this task, with Xception achieving the highest reported accuracy of 84.54%. These models, however, required high computational resources, significant GPU costs, and prolonged training. To overcome these challenges, this work optimizes food image classification by employing EfficientNet-B7, a model pre-trained on ImageNet. On the Food-101 dataset, comprising 101 food categories with 101,000 images, the proposed approach achieves an improved accuracy of 84.6% within just 5-9 epochs while reducing training time without compromising accuracy. Its efficient architecture ensures faster processing, making it ideal for real-time applications. This method provides a cost-effective, scalable, and highly accurate solution for food image classification, surpassing the limitations of previous approaches.

Keywords: Transfer Learning, EfficientNet-B7, Fine-Tuning, Deep Learning, Food Image Classification.

I. INTRODUCTION

In recent years, a heightened public focus on health and nutrition has emerged, fuelled by the widespread availability of food-related content on social media and the internet. This surge in visual food data has created both opportunities and challenges for automated image classification systems [1]. The complexity of this task increases with the sheer number of food categories, making robust and efficient classification methods essential [2].

Deep learning, particularly through Convolutional Neural Networks (CNNs), has revolutionized image recognition and detection tasks by automatically learning hierarchical features, thus effectively reducing the high dimensionality of image data without losing crucial information [3, 4]. Unlike traditional approaches that depended on manually engineered features, modern CNN-based solutions have consistently outperformed earlier methods [5, 6].

A key advancement in this field is transfer learning, where models pre-trained on large-scale datasets like ImageNet are adapted for specialized tasks, such
as food classification [7, 8, 9]. Notable architectures including Inception-v3 [10], Xception [11], MobileNet [12], EfficientNets [13], and DenseNet [14] have each contributed unique strengths to the domain, balancing computational efficiency with performance. Among these, EfficientNet-B7 stands out due to its superior accuracy, scalability, and optimized computational efficiency, making it a powerful choice for fine-grained image classification tasks such as food recognition.

EfficientNet-B7 employs a compound scaling approach that adjusts depth, width, and resolution simultaneously, achieving state-of-the-art accuracy while maintaining manageable computational requirements. This enables it to outperform deeper architectures while using significantly fewer parameters, making it particularly useful for real-world applications requiring both high accuracy and efficiency.
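For reference, the compound scaling rule behind the EfficientNet family [13] can be stated as follows (reproduced from the original EfficientNet formulation; phi is a user-chosen compound coefficient and alpha, beta, gamma are constants obtained by a small grid search):

```latex
% EfficientNet compound scaling (Tan and Le, 2019):
% depth, width and input resolution are scaled jointly by one coefficient \phi.
\begin{aligned}
\text{depth:}      \quad & d = \alpha^{\phi} \\
\text{width:}      \quad & w = \beta^{\phi} \\
\text{resolution:} \quad & r = \gamma^{\phi} \\
\text{subject to}  \quad & \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2, \qquad \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1
\end{aligned}
```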
This paper builds on these developments by exploring the potential of state-of-the-art pre-trained networks for food image classification, with a particular focus on optimizing efficiency and accuracy using EfficientNet-B7.

II. LITERATURE SURVEY

In [1], Bossard et al. introduced the Food-101 dataset, exploring the application of discriminative patch extraction to manage the diverse and complex nature of food images. The study emphasizes the challenges inherent in food classification due to high intra-class variability and noisy backgrounds.

In [2], Chen et al. employed deep convolutional neural networks for food image recognition, demonstrating that fine-tuning pre-trained models on domain-specific data markedly improves classification accuracy compared to traditional feature engineering approaches.

In [3], Wu and Jiang extended transfer learning methods by leveraging very deep pre-trained networks for food classification tasks. Their findings underscore the effectiveness of cross-domain knowledge transfer from large-scale datasets like ImageNet, which enhances performance on specialized tasks such as food image classification.

In [4], the concept of compound scaling in EfficientNet was introduced, which balances network depth, width, and resolution to achieve state-of-the-art performance. This study highlights the benefits of EfficientNet-B7, which delivers high accuracy with fewer parameters and improved computational efficiency.

In [5], comparative studies among architectures, including Xception, MobileNet, and DenseNet, reveal that while Xception provides robust accuracy and MobileNet offers lightweight solutions, EfficientNet-B7 strikes an optimal balance between performance and efficiency. Additional techniques such as data augmentation, mixed precision training, and prefetching have been demonstrated to further enhance model generalization and robustness in food image classification.

III. PROPOSED SYSTEM

The proposed system integrates EfficientNet-B7 within an advanced training pipeline that employs transfer learning, mixed precision training, and data prefetching to maximize both accuracy and efficiency.

A. System Overview and Implementation

Dataset: The Food-101 dataset, consisting of 101,000 images across 101 categories, is split into training and validation subsets.
Image Pre-processing: Standard resizing, normalization, and extensive data augmentation (e.g., random rotations, flips, brightness adjustments) are applied to improve model generalization.
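To make this pre-processing stage concrete, the following is a minimal sketch using TensorFlow/Keras preprocessing layers; the specific image size and augmentation strengths are illustrative assumptions rather than values reported in this paper.

```python
import tensorflow as tf

IMG_SIZE = 600  # assumed input resolution for EfficientNet-B7; not stated in the paper

# Resizing plus light augmentation, applied on the fly during training.
# Note: tf.keras.applications.EfficientNetB7 rescales pixel values internally,
# so an explicit normalization layer is optional when using that backbone.
augmentation = tf.keras.Sequential([
    tf.keras.layers.Resizing(IMG_SIZE, IMG_SIZE),
    tf.keras.layers.RandomFlip("horizontal"),   # random flips
    tf.keras.layers.RandomRotation(0.1),        # random rotations
    tf.keras.layers.RandomBrightness(0.2),      # brightness adjustments
])
```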
Transfer Learning: EfficientNet-B7 is initialized with ImageNet weights, enabling the model to leverage pre-learned features before fine-tuning on Food-101.
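The transfer-learning setup described above can be sketched as follows; this is one plausible configuration (the input resolution, pooling layer, and fully frozen backbone are assumptions, not details given in the paper).

```python
import tensorflow as tf

NUM_CLASSES = 101
IMG_SIZE = 600  # assumed EfficientNet-B7 input resolution

# EfficientNet-B7 backbone with ImageNet weights and no original classifier head.
base = tf.keras.applications.EfficientNetB7(
    include_top=False, weights="imagenet", input_shape=(IMG_SIZE, IMG_SIZE, 3))
base.trainable = False  # freeze the backbone for the initial feature-extraction phase

inputs = tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
x = base(inputs, training=False)                 # keep BatchNorm statistics frozen
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)
```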
Training Pipeline:

1. Batch Processing & Prefetching: Utilizing TensorFlow's tf.data pipelines, data batches are prefetched to reduce I/O latency and maximize GPU utilization.

2. Mixed Precision Training: TensorFlow's mixed precision API reduces memory usage and accelerates computation without sacrificing accuracy.

3. Hyperparameter Optimization: An adaptive learning rate scheduler (via the Adam optimizer) is employed, with checkpoints and TensorBoard logging for real-time monitoring.
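A compact sketch of pipeline steps 1 and 3, assuming TensorFlow's tf.data API, the tensorflow_datasets copy of Food-101, and the model built in the previous sketch; the batch size, paths, and epoch count here are illustrative assumptions (the mixed-precision policy of step 2 is illustrated in Section IV).

```python
import tensorflow as tf
import tensorflow_datasets as tfds

BATCH_SIZE = 32  # assumed; the paper does not report a batch size
IMG_SIZE = 600

# 1. Batch processing & prefetching: overlap data preparation with GPU compute.
(train_ds, val_ds), info = tfds.load(
    "food101", split=["train", "validation"], as_supervised=True, with_info=True)

def prepare(ds, training=False):
    ds = ds.map(lambda image, label: (tf.image.resize(image, (IMG_SIZE, IMG_SIZE)), label),
                num_parallel_calls=tf.data.AUTOTUNE)
    if training:
        ds = ds.shuffle(1000)
    return ds.batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)

train_ds, val_ds = prepare(train_ds, training=True), prepare(val_ds)

# 3. Adam optimizer with checkpointing and TensorBoard logging
#    (`model` is the EfficientNet-B7 classifier assembled in the previous sketch).
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
callbacks = [
    tf.keras.callbacks.ModelCheckpoint("checkpoints/effnet_b7_best.h5", save_best_only=True),
    tf.keras.callbacks.TensorBoard(log_dir="logs"),
]
model.fit(train_ds, validation_data=val_ds, epochs=6, callbacks=callbacks)
```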

Figure 1 shows the overall architectural diagram of the proposed system, depicting the flow from data pre-processing through to classification.

B. Network Architecture and Performance Analysis

EfficientNet-B7 is the core of our approach, employing a compound scaling strategy that adjusts network depth, width, and input resolution simultaneously.

Key Architectural Features:

Mobile Inverted Bottlenecks: Efficiently capture features while keeping computational costs low.

Squeeze-and-Excitation Modules: Dynamically recalibrate channel-wise feature responses to enhance representation.
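As an illustration of the squeeze-and-excitation idea, the mechanism can be written as a small Keras block (a generic sketch of the technique, not code taken from the EfficientNet implementation):

```python
import tensorflow as tf

def squeeze_excite(feature_map, reduction=4):
    """Generic squeeze-and-excitation block: pool globally ('squeeze'), then learn
    per-channel gates ('excite') that rescale the input feature map."""
    channels = feature_map.shape[-1]
    s = tf.keras.layers.GlobalAveragePooling2D()(feature_map)               # squeeze
    s = tf.keras.layers.Dense(channels // reduction, activation="swish")(s)
    s = tf.keras.layers.Dense(channels, activation="sigmoid")(s)            # channel gates
    s = tf.keras.layers.Reshape((1, 1, channels))(s)
    return tf.keras.layers.Multiply()([feature_map, s])                     # recalibrate
```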

The proposed model achieves approximately 85% Top-1 accuracy on the Food-101 validation set, outperforming traditional architectures such as ResNet101 and Xception.

The integration of mixed precision training and data prefetching results in a 30-40% reduction in training time while maintaining efficient resource utilization, which is critical for real-world deployments.

Figure 2 illustrates the scaling performance of EfficientNet variants ("Model Size vs. ImageNet Accuracy"; all numbers are for single-crop, single-model evaluation), emphasizing the trade-offs between model complexity and accuracy. EfficientNets significantly outperform other ConvNets: in particular, EfficientNet-B7 achieves a state-of-the-art 84.3% top-1 accuracy while being 8.4x smaller and 6.1x faster than GPipe, and EfficientNet-B1 is 7.6x smaller and 5.7x faster than ResNet-152.

IV. RESULT AND DISCUSSION

The proposed EfficientNet-B7 model was trained on the Food-101 dataset for 6 epochs using mixed precision and prefetching on a Tesla T4 GPU (Google Colab). To optimize computational resources, mixed precision training was employed. This approach uses half-precision (float16) for most operations while maintaining full precision (float32) for essential calculations like model updates. As a result, training throughput increased by approximately 30-40%, significantly reducing the total training time without compromising model accuracy.
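A minimal sketch of this configuration using TensorFlow's mixed-precision API; keeping the final classification layer in float32 is the usual practice so the outputs remain numerically stable (the paper does not list its exact settings).

```python
import tensorflow as tf

# Run most ops in float16 while keeping variables (and their updates) in float32.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Keep the final classification layer in float32 so the softmax output is computed
# at full precision; Keras applies loss scaling automatically when a model is
# compiled and trained with model.fit under this policy.
classifier_head = tf.keras.layers.Dense(101, activation="softmax", dtype="float32")
```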
Figure 3: Prefetching and multithreaded data loading.

By implementing prefetching and multithreading, idle GPU time was minimized: subsequent batches were prepared in parallel while the GPU processed the current batch. This seamless pipeline further enhanced training efficiency, which is crucial for large-scale datasets and limited computational resources.

Without mixed precision or techniques such as prefetch() during batch preparation, model fine-tuning takes up to 2.5-3x longer per epoch.

The EfficientNet-B7 model was fine-tuned on the Food-101 dataset, leveraging transfer learning and mixed precision training to optimize performance and efficiency. The training process was carefully controlled using a ReduceLROnPlateau learning rate scheduler, ensuring stable convergence and preventing overfitting.
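A sketch of this fine-tuning stage, reusing the model and datasets from the earlier sketches; the unfreezing depth, learning rate, and scheduler settings are illustrative assumptions rather than the paper's reported values.

```python
import tensorflow as tf

# Unfreeze the top of the backbone for fine-tuning
# (`base`, `model`, `train_ds`, `val_ds` come from the earlier sketches;
#  the number of unfrozen layers is chosen for illustration only).
base.trainable = True
for layer in base.layers[:-30]:
    layer.trainable = False

model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),  # lower learning rate for fine-tuning
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Reduce the learning rate when validation loss plateaus, stabilizing convergence.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.2, patience=2, min_lr=1e-6)

model.fit(train_ds, validation_data=val_ds, epochs=6, callbacks=[reduce_lr])
```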
Figure 4: Fine-tuning the model.

Training Performance: The model was trained for multiple epochs, demonstrating rapid convergence due to pre-trained ImageNet weights and an optimized learning strategy. The initial epochs (1-3) showed progressive improvement in accuracy, with the validation accuracy reaching 84.15% by epoch 4. As the learning rate adjusted dynamically, the accuracy continued improving, stabilizing at 84.86% in later epochs (Fig. 4).

Mixed precision training significantly reduced training time while maintaining model accuracy. The use of prefetching and multi-threaded data loading further optimized GPU utilization (Fig. 3), ensuring minimal computational bottlenecks.

B. Final Model Evaluation

The fine-tuned model achieved a final test accuracy of ~85%, indicating strong generalization to unseen data. The loss value of 0.6857 confirms stable convergence without significant overfitting. The results highlight the effectiveness of EfficientNet-B7 when combined with modern deep learning optimizations.
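Assuming a test split prepared with the same tf.data pipeline, the final evaluation reduces to a single call (a minimal sketch):

```python
# `model` comes from the earlier sketches; `test_ds` is assumed to be a held-out
# split prepared the same way as the validation pipeline.
test_loss, test_accuracy = model.evaluate(test_ds)
print(f"Test accuracy: {test_accuracy:.4f}, test loss: {test_loss:.4f}")
```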
Figure 5: Evaluation of test data.

V. CONCLUSION

This paper presented an optimized food image classification framework that leverages EfficientNet-B7 within a transfer learning paradigm. By incorporating mixed precision training and data prefetching, the proposed system achieves high classification accuracy (approximately 85%) with reduced computational costs. The experimental results validate the efficacy of the integrated approach, highlighting its potential for real-time applications in dietary monitoring and health analytics. Future research will explore multi-modal data integration and model compression techniques to facilitate deployment on resource-constrained devices.

REFERENCES

[1] Ilyukhin, Sasha V., Timothy A. Haley, and Rakesh K. Singh. "A survey of automation practices in the food industry." Food Control 12, no. 5 (2001): 285-296.

[2] Bruno, Vieira, Silva Resende, and Cui Juan. "A survey on automated food monitoring and dietary management systems." Journal of Health & Medical Informatics 8, no. 3 (2017).

[3] LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521, no. 7553 (2015): 436-444.

[4] Susan, Seba, and Seema Chandna. "Object recognition from color images by fuzzy classification of Gabor wavelet features." In 2013 5th International Conference on Computational Intelligence and Communication Networks, pp. 301-305. IEEE, 2013.

[5] Saini, Manisha, and Seba Susan. "Comparison of deep learning, data augmentation and bag-of-visual-words for classification of imbalanced image datasets." In Recent Trends in Image Processing and Pattern Recognition: Second International Conference, RTIP2R 2018, Solapur, India, December 21-22, 2018, Revised Selected Papers, Part I 2, pp. 561-571. Springer Singapore, 2019.

[6] Susan, Seba, and Jatin Malhotra. "Recognising Devanagari script by deep structure learning of image quadrants." DESIDOC Journal of Library & Information Technology 40, no. 5 (2020): 268-271.

[7] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).

[8] Susan, Seba, Dhaarna Sethi, and Kriti Arora. "Cross-domain learning for pulmonary nodule detection using Gestalt principle of similarity." Soft Computing (2023): 1-12.

[9] Saini, Manisha, and Seba Susan. "Cervical Cancer Screening on Multiclass Imbalanced Cervigram Dataset using Transfer Learning." In 2022 15th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1-6. IEEE, 2022.

[10] Szegedy, Christian, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. "Rethinking the inception architecture for computer vision." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818-2826. 2016.

[11] Chollet, François. "Xception: Deep learning with depthwise separable convolutions." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251-1258. 2017.

[12] Howard, Andrew G., Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. "Mobilenets: Efficient convolutional neural networks for mobile vision applications." arXiv preprint arXiv:1704.04861 (2017).

[13] Tan, Mingxing, and Quoc Le. "Efficientnet: Rethinking model scaling for convolutional neural networks." In International Conference on Machine Learning, pp. 6105-6114. PMLR, 2019.

[14] Iandola, Forrest, Matt Moskewicz, Sergey Karayev, Ross Girshick, Trevor Darrell, and Kurt Keutzer. "Densenet: Implementing efficient convnet descriptor pyramids." arXiv preprint arXiv:1404.1869 (2014).

[15] Tao, Huawei, Li Zhao, Ji Xi, Ling Yu, and Tong Wang. "Fruits and vegetables recognition based on color and texture features." Transactions of the Chinese Society of Agricultural Engineering 30, no. 16 (2014): 305-311.

[16] Bossard, Lukas, Matthieu Guillaumin, and Luc Van Gool. "Food-101 – mining discriminative components with random forests." In Computer Vision – ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part VI 13, pp. 446-461. Springer International Publishing, 2014.

[17] Zheng, Jiannan, Liang Zou, and Z. Jane Wang. "Mid-level deep Food Part mining for food image recognition." IET Computer Vision 12, no. 3 (2018): 298-304.

[18] Zhou, Lei, Chu Zhang, Fei Liu, Zhengjun Qiu, and Yong He. "Application of deep learning in food: a review." Comprehensive Reviews in Food Science and Food Safety 18, no. 6 (2019): 1793-1811.

[19] Sethi, Dhaarna, Kriti Arora, and Seba Susan. "Transfer Learning by Deep Tuning of Pre-trained Networks for Pulmonary Nodule Detection." In 2020 IEEE 15th International Conference on Industrial and Information Systems (ICIIS), pp. 168-173. IEEE, 2020.

[20] Pan, Lili, Samira Pouyanfar, Hao Chen, Jiaohua Qin, and Shu-Ching Chen. "Deepfood: Automatic multi-class classification of food ingredients using deep learning." In 2017 IEEE 3rd International Conference on Collaboration and Internet Computing (CIC), pp. 181-189. IEEE, 2017.

[21] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Communications of the ACM 60, no. 6 (2017): 84-90.

[22] Jia, Yangqing, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. "Caffe: Convolutional architecture for fast feature embedding." In Proceedings of the 22nd ACM International Conference on Multimedia, pp. 675-678. 2014.

[23] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep residual learning for image recognition." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778. 2016.

[24] Yanai, Keiji, and Yoshiyuki Kawano. "Food image recognition using deep convolutional network with pre-training and fine-tuning." In 2015 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp. 1-6. IEEE, 2015.

[25] VijayaKumari, G., Priyanka Vutkur, and P. Vishwanath. "Food classification using transfer learning technique." Global Transitions Proceedings 3, no. 1 (2022): 225-229.

[26] Yadav, Sapna, and Satish Chand. "Automated food image classification using deep learning approach." In 2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS), vol. 1, pp. 542-545. IEEE, 2021.

[27] Gallo, Ignazio, Gianmarco Ria, Nicola Landro, and Riccardo La Grassa. "Image and text fusion for UPMC Food-101 using BERT and CNNs." In 2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ), pp. 1-6. IEEE, 2020.

[28] Min, Weiqing, Shuqiang Jiang, Linhu Liu, Yong Rui, and Ramesh Jain. "A survey on food computing." ACM Computing Surveys (CSUR) 52, no. 5 (2019): 1-36.

[29] Kaur, Rajdeep, Rakesh Kumar, and Meenu Gupta. "Deep neural network for food image classification and nutrient identification: A systematic review." Reviews in Endocrine and Metabolic Disorders (2023): 1-21.

[30] Wu, J., and Z. Jiang. "Transfer Learning Using Very Deep Pre-Trained Models for Food Image Classification." IEEE Xplore, 2023.

[31] Sahoo, Doyen, Wang Hao, Shu Ke, Wu Xiongwei, Hung Le, Palakorn Achananuparp, Ee-Peng Lim, and Steven C. H. Hoi. "FoodAI: Food image recognition via deep learning for smart food logging." In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2260-2268. 2019.
