BACHELOR OF ENGINEERING
In
CERTIFICATE
This is to certify that the mini project work entitled “CALCULATION OF OBJECT
DIMENSIONS” was carried out by LEKHANA D GOWDA (1TJ21CS042) in partial fulfilment
of the sixth semester of Bachelor of Engineering in Computer Science of Visvesvaraya
Technological University, Belagavi, during the academic year 2023-2024. It is certified that
all the corrections/suggestions indicated for the internal assessment have been incorporated in
the report deposited in the departmental library. The mini project report has been approved as
it satisfies the academic requirements in respect of mini project work as prescribed for the
said degree.
I, Lekhana D Gowda (1TJ21CS042), a sixth semester student, hereby declare that the mini
project entitled “Calculation of Object Dimensions” has been carried out and submitted by
me in partial fulfilment of the sixth semester of Bachelor of Engineering in Computer Science
and Engineering, Visvesvaraya Technological University (VTU), during the academic year
2023-2024. I also declare that, to the best of my knowledge and belief, the work reported
here is accepted and satisfied.
Lekhana D Gowda
1TJ21CS042
ACKNOWLEDGEMENT
The project report on “Calculation of Object Dimensions” is the outcome of the guidance, moral
support and knowledge imparted to us throughout our work. For this we acknowledge and
express immense gratitude to all those who have guided and supported us during the
preparation of this mini project.
We take this opportunity to express our gratefulness to everyone who has extended their support
for helping us in the mini project completion.
First and foremost, we thank Dr. Thomas P. John, Chairman of T. John Group of
Institutions and Dr. Suresh Venugopal, Principal, T. John Institute of Technology for
giving us this opportunity to study in this prestigious institute and also providing us with best
of facilities.
We would like to show our greatest appreciation to Dr. Suma R, HOD, Dept. of CSE, and
Mrs. Tabassum Fatima, Mini Project Guide, Dept. of CSE, for constantly
guiding us throughout the project.
We would also like to thank all the teaching and non-teaching staff of the Computer Science and
Engineering Department for directly or indirectly helping us in the completion of the mini project.
Lastly and most importantly we convey our gratitude to our parents who have been the source
of inspiration and also for instrumental help in successful completion of project.
Lekhana D Gowda
1TJ21CS042
ABSTRACT
This project focuses on the development of an object detection system using the YOLOv8 (You
Only Look Once) model to calculate the dimensions of objects and count the number of
detected objects in images. The primary objective is to create a robust and efficient pipeline
that includes dataset preparation, model training, and model inference.
The project begins with the annotation of a dataset containing images with labelled boxes using
the LabelImg tool. The annotated dataset is then split into training and validation sets. A
configuration file is created to define the dataset paths and class names. The YOLOv8 model
is trained on this dataset using PyTorch and Ultralytics libraries, resulting in a custom-trained
model capable of detecting boxes.
The trained model is used to perform inference on new images, detecting and counting the
number of boxes. The Python code for model inference utilizes OpenCV to process the images,
apply edge detection, and identify contours that represent boxes. The contours are filtered based
on area, aspect ratio, and hierarchy to ensure accurate detection. The final output includes the
number of detected boxes and the dimensions of each box, displayed on the processed image.
Applications of this project span various domains, including automated quality control in
manufacturing, logistics and inventory management, surveillance and security, and retail. The
project concludes with a discussion on future enhancements, such as multi-class detection, real-
time processing, improved accuracy through advanced data augmentation, and deployment in
web or mobile applications.
This project demonstrates a comprehensive approach to object detection, leveraging the power
of YOLOv8 and OpenCV to develop a scalable solution for real-world applications.
TABLE OF CONTENTS
SL NO. CHAPTER
CERTIFICATE
DECLARATION
ACKNOWLEDGEMENT
ABSTRACT
1 OVERVIEW
2 INTRODUCTION
3 EXISTING SYSTEM
4 PROPOSED SYSTEM
5 SYSTEM REQUIREMENTS
7 MODULE DESCRIPTION
8 RESULTS
CONCLUSION
FUTURE SCOPE
REFERENCES
FIGURES
8.1 #main.py
8.4 #img.py
Calculation of Object Dimensions
CHAPTER 1
OVERVIEW
1.1 TIFF Labs: An Overview
TIFF Labs specializes in industrial automation and data analytics, aiming to enhance
operational efficiency and reduce downtime. We offer advanced automation systems, AI-driven
predictive maintenance, and real-time monitoring solutions. Our core services include data
ingestion, time series analysis, anomaly detection, and machine learning model training.
Serving industries like manufacturing and energy, we focus on providing customized, impactful
solutions. Our commitment to research and innovation ensures we stay at the forefront of
technology, delivering cutting-edge solutions to our clients.
CHAPTER 2
INTRODUCTION
In the realm of computer vision, the ability to accurately measure object dimensions from
images is a crucial capability with a wide range of applications. From industrial automation and
quality control to medical imaging and augmented reality, understanding the size and scale of
objects within an image allows for numerous practical implementations. This task, while
seemingly straightforward, involves a complex interplay of advanced image processing
techniques, machine learning algorithms, and robust data annotation tools. In this
comprehensive introduction, we will delve into the intricacies of object dimension calculation,
focusing on the synergistic use of Ultralytics, Roboflow, and state-of-the-art image processing
methodologies.
The field of computer vision has experienced transformative advancements in recent years,
enabling machines to perceive and interpret visual information with unprecedented accuracy.
One of the most compelling applications of this technology is the calculation of object
dimensions within images. By combining sophisticated image processing techniques with
cutting-edge machine learning frameworks such as Ultralytics YOLOv8 and data annotation
tools like Roboflow, we can achieve precise and efficient object dimension measurements.
The calculation of object dimensions using image processing and advanced machine learning
models is a dynamic and evolving field. By leveraging the capabilities of Ultralytics' YOLOv8
and Roboflow's annotation and management tools, this project aims to make such measurements
both accurate and efficient. Its objectives are as follows:
1. Accurate Measurement:
Achieve high accuracy in determining the dimensions of objects within various types of
images, ensuring reliability across different scenarios and conditions.
2. Real-Time Processing:
Develop a system capable of performing dimension calculations in real-time, making it
suitable for applications that require immediate analysis, such as autonomous vehicles and
robotic systems.
3. Scalability:
Create a scalable solution that can be deployed across diverse industries and applications,
from manufacturing and quality control to healthcare and augmented reality.
4. User-Friendly Interface:
Ensure the system is easy to use, with streamlined workflows for data annotation, model
training, and deployment, facilitated by integration with tools like Roboflow.
5. Versatility and Adaptability:
Design the system to be adaptable to different object types and sizes, with the flexibility to
handle a variety of image sources and formats.
By achieving these objectives, the project aims to provide a comprehensive solution for object
dimension calculation that can be utilized in a wide range of practical applications, enhancing
accuracy, efficiency, and usability in the field of computer vision.
CHAPTER 3
EXISTING SYSTEM
The existing system for calculating object dimensions using Ultralytics' YOLOv8 involves
leveraging pre-trained models and a streamlined training process. The system is designed to be
user-friendly, enabling efficient model training and object detection. Below is an overview of
the existing setup, highlighting its components and workflow:
2. Dataset: The system uses a dataset located in a specified directory. This dataset includes
images and corresponding annotations that indicate the objects and their boundaries within
each image. The dataset is crucial for training the model to accurately detect and measure
object dimensions.
3. Training Process: The model training process involves using the YOLOv8 model and the
annotated dataset. The training parameters, such as the number of epochs and image size,
are configured to optimize the model's performance. During training, the model learns to
identify objects and their dimensions from the provided images and annotations.
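The training step described above can be sketched roughly as follows. This is a minimal illustration that assumes the `ultralytics` package is installed; the helper names `build_train_args` and `train`, and the default values for epochs and image size, are hypothetical and are not taken from the project scripts.

```python
def build_train_args(data_yaml, epochs=50, imgsz=640):
    """Collect the training parameters named in the text (epochs, image size)."""
    if epochs < 1:
        raise ValueError("epochs must be at least 1")
    if imgsz < 1:
        raise ValueError("imgsz must be a positive number of pixels")
    return {"data": str(data_yaml), "epochs": epochs, "imgsz": imgsz}


def train(data_yaml, weights="yolov8s.pt", **overrides):
    """Fine-tune a pre-trained YOLOv8 checkpoint on the annotated dataset."""
    # Imported lazily so the argument check above can be used without the package.
    from ultralytics import YOLO

    model = YOLO(weights)  # load the pre-trained weights
    return model.train(**build_train_args(data_yaml, **overrides))
```

Calling `train("data.yaml", epochs=100)` would then start a training run with the chosen number of epochs; suitable values depend on the dataset and hardware.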
Preprocessing: The current system may need enhanced preprocessing steps to handle
accurate scale factors, which may not be fully addressed in the existing system.
Real-Time Processing: While the system is efficient, further optimization might be needed
By addressing these limitations and building on the existing capabilities, the project aims to
create a more robust and versatile solution for calculating object dimensions using advanced
image processing and machine learning techniques.
CHAPTER 4
PROPOSED SYSTEM
The proposed system aims to enhance and extend the existing framework for calculating object
dimensions from images using advanced image processing techniques and state-of-the-art
machine learning models. The improvements will focus on achieving higher accuracy, real-time
processing capabilities, scalability, and user-friendly interfaces. The following sections outline
the key components and workflow of the proposed system.
1. Image Input: Users can upload images through the GUI or an API endpoint. The system
will support various image formats and resolutions to ensure compatibility with different
use cases.
2. Preprocessing: Uploaded images will undergo preprocessing to enhance quality. This
includes steps like noise reduction, contrast enhancement, and normalization to prepare
the images for object detection.
3. Object Detection: The preprocessed images will be fed into the improved YOLOv8
model (or an ensemble of models) for object detection. The model will output bounding
boxes and class labels for detected objects.
4. Dimension Calculation: The system will calculate the dimensions of detected objects
based on the bounding box coordinates. This will involve converting pixel measurements
to real-world units using scale factors or calibration data.
5. Real-Time Feedback: Users will receive real-time feedback through the GUI, displaying
the detection results and calculated dimensions. The interface will provide options to
adjust parameters and reprocess images if needed.
6. Data Storage and Retrieval: Detection results and calculated dimensions will be stored
in a database for future reference and analysis. Users can retrieve historical data and
generate reports as required.
7. API Access: The system will offer API endpoints for programmatic access to the object
detection and dimension calculation functionalities. This will enable integration with
external systems and automated workflows.
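The conversion from pixel measurements to real-world units mentioned in step 4 can be illustrated with a small sketch. It assumes a reference object of known width is visible in the same plane as the measured object and that the camera looks at that plane roughly head-on; the function names are hypothetical.

```python
def pixels_per_unit(reference_px, reference_real):
    """Scale factor from a reference object: pixels per real-world unit."""
    if reference_px <= 0 or reference_real <= 0:
        raise ValueError("reference sizes must be positive")
    return reference_px / reference_real


def box_dimensions(box_xyxy, px_per_unit):
    """Convert a bounding box (x1, y1, x2, y2) in pixels to real-world width and height."""
    x1, y1, x2, y2 = box_xyxy
    width_px = abs(x2 - x1)
    height_px = abs(y2 - y1)
    return width_px / px_per_unit, height_px / px_per_unit


# A 10 cm wide reference object spans 200 px, so the scale is 20 px per cm.
scale = pixels_per_unit(200, 10.0)
print(box_dimensions((100, 50, 400, 250), scale))  # (15.0, 10.0), in cm
```

A single scale factor only holds for objects at about the same distance from the camera as the reference; objects closer or farther away would need their own calibration.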
Increased Accuracy: Enhanced preprocessing and improved detection models will lead
to more accurate object dimension calculations.
Real-Time Processing: Optimizations will enable real-time processing, making the
system suitable for applications that require immediate analysis.
Scalability: The system will be designed to handle large-scale deployments across
various industries, ensuring robustness and reliability.
User Accessibility: A user-friendly interface and comprehensive documentation will
make the system accessible to a broader audience, including non-technical users.
Integration Capabilities: The API will facilitate integration with other software
systems, enabling seamless workflows and interoperability.
CHAPTER 5
SYSTEM REQUIREMENTS
To successfully run and manage the object dimension calculation system described by the
provided scripts (main.py, sub.py, img.py), you'll need to meet the following system
requirements:
1. Processor
A modern multi-core CPU (e.g., Intel i5/Ryzen 5 or better) to handle training and inference
tasks efficiently.
For real-time processing and faster training, a GPU is highly recommended (e.g., NVIDIA
GeForce GTX 1060 or better).
2. Memory
16 GB or more RAM recommended for handling larger datasets and complex models.
3. Storage
Sufficient disk space for storing datasets, model weights, and results. A minimum of 10 GB
of free space is recommended, with more space required as datasets and models grow.
For accelerated model training and inference, a compatible NVIDIA GPU with CUDA
support is recommended. Examples include NVIDIA GTX 1660, RTX 2060, or better.
1. Operating System
Windows 10 or later, or a Linux distribution (e.g., Ubuntu 20.04 or later). Ensure the OS
supports necessary drivers and software libraries.
2. Python
Python 3.8 or later. The Ultralytics YOLOv8 implementation and related libraries require
Python 3.8 or newer.
3. Python Libraries
IPython (or Jupyter): For displaying images if using img.py in a notebook environment.
4. Development Environment:
An IDE or text editor such as PyCharm, VSCode, or Jupyter Notebook to write and execute
Python scripts.
Ensure CUDA toolkit and cuDNN are installed if using an NVIDIA GPU for training and
inference. Compatibility with the specific GPU model and PyTorch version is essential.
1. Dataset
The dataset should be properly organized and annotated in the specified directory. This
includes images and a data.yaml file for configuration.
2. Model Weights
Pre-trained model weights (yolov8s.pt) and the trained model weights (best.pt) should be
available in the specified directories.
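For illustration, a data.yaml file in the layout that Ultralytics expects (a dataset root, training and validation image folders, and class names) could be generated as below. The folder names and the single class "box" are assumptions made for this sketch, not values taken from the project.

```python
def make_data_yaml(root, class_names, train_dir="images/train", val_dir="images/val"):
    """Build the text of a YOLO dataset configuration file."""
    lines = [f"path: {root}", f"train: {train_dir}", f"val: {val_dir}", "names:"]
    # Class indices must match the indices used in the annotation files.
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical dataset location; adjust to the actual directory.
    with open("data.yaml", "w", encoding="utf-8") as handle:
        handle.write(make_data_yaml("datasets/boxes", ["box"]))
```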
By meeting these requirements, you ensure that the system can handle the object dimension
calculation tasks effectively, including model training, validation, and inference.
CHAPTER 6
2. Model Training
YOLOv8 Model: The core object detection model, loaded and trained using the
dataset.
3. Model Validation
Inference Pipeline: Uses the trained model to detect objects in new images and
calculates dimensions based on bounding box coordinates.
5. User Interface
Graphical User Interface (GUI): Allows users to interact with the system, upload
images, view results, and obtain dimension calculations.
Images are collected and annotated using tools like Roboflow. Annotated data is preprocessed
to prepare for model training.
2. Model Training
3. Model Validation
5. User Interface
Users interact with the system through a GUI or API to upload images, view results, and access
dimensions.
Results, model weights, and other data are stored in a database and file system for future access
and management.
CHAPTER 7
MODULE DESCRIPTION
The design and implementation of the object dimension calculation system involves several key
modules. Each module handles specific aspects of the system, ensuring a cohesive and efficient
workflow from data collection to result presentation. Below is a detailed description of each
module:
Description: This module is responsible for gathering, annotating, and preparing data for model
training and validation. It ensures that the dataset is properly formatted and labelled.
Components
Annotation Tool Integration: Use tools like Roboflow to label images with bounding
boxes and object classes.
Implementation
Annotation Tool Integration: API integration with Roboflow or similar tools to fetch
annotated data and manage labelling.
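Preparing the annotated data for training and validation usually involves splitting the image list into two sets. The sketch below shows one simple way to do this reproducibly; the 80/20 ratio, the fixed seed, and the function name are assumptions rather than the project's actual settings.

```python
import random


def split_dataset(image_names, val_fraction=0.2, seed=0):
    """Shuffle image names deterministically and split them into (train, val) lists."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    names = sorted(image_names)          # sort first so the result does not depend on input order
    random.Random(seed).shuffle(names)   # seeded shuffle keeps the split reproducible
    n_val = max(1, round(len(names) * val_fraction))
    return names[n_val:], names[:n_val]


train_names, val_names = split_dataset([f"img_{i:03d}.jpg" for i in range(10)])
print(len(train_names), len(val_names))  # 8 2
```

Each image's annotation file should be moved together with the image so that both sets stay consistent.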
Implementation
Description: This module handles the training of the YOLOv8 model using the prepared
dataset. It involves configuring the model, training it, and saving the trained weights.
Components
Description: This module validates the trained model to assess its performance on unseen data.
It ensures that the model generalizes well and provides reliable results.
Components
Validation Pipeline: Execute the validation process and analyze performance metrics.
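Detection metrics reported during validation, such as mean average precision, are built on the intersection over union (IoU) between predicted and ground-truth boxes. The following sketch computes IoU for two boxes in (x1, y1, x2, y2) pixel format; it is an illustration of the metric, not the validation code used in the project.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region; zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

A prediction is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold, for example 0.5.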
Description: This module is responsible for using the trained model to detect objects in new
images and calculate their dimensions based on bounding box coordinates.
Components
Inference Pipeline: Process new images through the YOLOv8 model to detect objects.
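As a rough sketch of this inference step, the code below counts detections above a confidence threshold and derives each box's width and height in pixels from its corner coordinates. The helper names and the 0.5 threshold are hypothetical; the `detect` function assumes the `ultralytics` package and a trained `best.pt` file are available.

```python
def summarize_detections(boxes_xyxy, scores, min_score=0.5):
    """Return (count, dimensions) for boxes whose confidence is at least min_score."""
    kept = []
    for (x1, y1, x2, y2), score in zip(boxes_xyxy, scores):
        if score >= min_score:
            kept.append({"width_px": x2 - x1, "height_px": y2 - y1, "score": score})
    return len(kept), kept


def detect(image_path, weights="best.pt"):
    """Run the trained model on one image and summarize its detections."""
    # Imported lazily so summarize_detections can be used without the package.
    from ultralytics import YOLO

    result = YOLO(weights).predict(image_path)[0]  # one result object per input image
    return summarize_detections(result.boxes.xyxy.tolist(), result.boxes.conf.tolist())
```

The pixel dimensions returned here still need a scale factor or calibration data before they can be reported in real-world units.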
CHAPTER 8
RESULTS
Figure 8.1: #main.py
Figure 8.4: #img.py
Advantages
1. High Accuracy in Object Detection
YOLOv8 Model: Utilizes advanced YOLOv8 models known for their high accuracy and
speed in object detection. This ensures reliable identification of objects and precise
dimension calculations.
2. Scalability
Model Flexibility: The system can be scaled to accommodate different sizes of datasets and
adapted for various types of objects by retraining the model with specific data.
3. User-Friendly Interface
GUI/API: Provides an intuitive graphical user interface (GUI) for non-technical users and
a robust API for programmatic access, facilitating easy integration with other systems and
workflows.
Data Storage: Efficient management of datasets, model weights, and results ensures data
integrity and availability for future analysis or retraining.
5. Validation
Limitations
1. Hardware Dependency
GPU Requirement: Optimal performance and real-time processing often require a high-
performance GPU. Users without access to such hardware may experience slower
performance or longer training times.
Annotation Accuracy: The accuracy of the object detection model heavily relies on the
quality and accuracy of the annotations in the training dataset. Poor annotations can lead to
suboptimal model performance.
3. Complexity in Calibration
Object Classes: The model is trained to detect specific object classes. Detecting new or
unforeseen classes requires retraining the model with updated datasets.
5. Real-Time Constraints:
Processing Speed: While the system aims for real-time processing, actual performance may
vary depending on image complexity, resolution, and hardware capabilities.
By understanding these advantages and limitations, users and developers can better leverage the
system’s capabilities while addressing its constraints through proper configuration,
CONCLUSION
The object dimension calculation system using YOLOv8 offers a sophisticated approach to
accurately detecting and measuring objects in images. By integrating advanced object detection
capabilities with a user-friendly interface and robust data management, the system ensures high
precision and adaptability. Its scalability allows it to handle a range of datasets and object types,
making it versatile across various industries.
However, the system does have some limitations. It relies on high-performance hardware for
optimal performance, which might be a constraint for users without advanced GPUs. The
accuracy of the system also depends heavily on the quality of the training data and annotations,
emphasizing the need for meticulous dataset preparation. Additionally, converting pixel
measurements to real-world dimensions involves precise calibration, which can add complexity
to the setup.
Overall, the system provides an effective and reliable solution for object dimension calculation,
balancing cutting-edge technology with practical usability. By addressing its limitations and
continuously refining its components, users can fully leverage its capabilities for accurate and
real-time measurements in diverse applications.
The effectiveness of the object detection and dimension calculation is also contingent on the
quality of the training data and annotations; any deficiencies in the dataset can impact the overall
accuracy of the system. Furthermore, accurate conversion of pixel-based measurements to real-
world dimensions requires careful calibration, adding an extra layer of complexity to the system
setup.
FUTURE SCOPE
The object dimension calculation system has significant potential for future advancements. One
key area is integrating emerging technologies like augmented reality (AR), which could enable
real-time visualization of object dimensions in physical spaces, enhancing user interaction.
Additionally, incorporating advanced artificial intelligence (AI) techniques could improve
detection accuracy and handle more complex object types and environments.
Expanding the system's capabilities to support multi-class and multi-object detection, as well as
developing 3D object detection, would broaden its applicability. Automating the calibration
process and integrating high-precision sensors could further enhance measurement accuracy and
simplify the setup.
The system also has potential for industry-specific applications, such as in healthcare,
aerospace, or automotive sectors, and could benefit from integration with remote sensing
technologies and drones for large-scale measurements. Enhancing user experience through more
intuitive interfaces and real-time analytics would improve accessibility and engagement.
Finally, cloud-based solutions could facilitate scalable storage and processing, while edge
computing could enable faster real-time data processing. These advancements would ensure the
system remains at the forefront of technology, meeting evolving demands and expanding its
impact across various fields.
REFERENCES
https://link.springer.com/article/10.1007/s00542-019-04552-w
https://www.sciencedirect.com/science/article/pii/S0166361519305761
https://www.emerald.com/insight/content/doi/10.1108/IJQRM-12-2021-0439/full/html
https://link.springer.com/article/10.1007/s10489-020-01720-z
https://blog.roboflow.com/dimension-measurement/
https://stackoverflow.com/questions/36349870/find-the-dimensions-height-width-of-an-object-using-camera
https://www.vision-doctor.com/en/optical-calculations/calculation-object-size.html
https://pyimagesearch.com/2016/03/28/measuring-size-of-objects-in-an-image-with-opencv/