
COMPUTER VISION PROJECT

A PROJECT REPORT

Submitted by

VRAJ P DIYORA (221SBEIT30005)

In fulfillment for the award of the


degree of
BACHELOR OF ENGINEERING
in
INFORMATION TECHNOLOGY

LDRP Institute of Technology and Research, Gandhinagar

Kadi Sarva Vishwavidyalaya
JANUARY, 2022-2023

LDRP INSTITUTE OF TECHNOLOGY AND RESEARCH, GANDHINAGAR

CE-IT Department

CERTIFICATE
This is to certify that the Project Work entitled “Computer Vision Project” has been carried out
by DIYORA VRAJ PRAVINBHAI (221SBEIT30005) under my guidance in fulfilment of the
degree of Bachelor of Engineering in Information Technology Semester-6 of Kadi Sarva
Vishwavidyalaya University during the academic year 2022-2023.

Prof. Deepali Jain Prof. Mehul Barot

Internal Guide Head of the Department

LDRP ITR LDRP ITR

Presentation-I for Project-I

1. Name & Signature of Internal Guide

2. Comments from Panel Members

3. Name & Signature of Panel Members

ACKNOWLEDGEMENT

We take this opportunity to express our gratitude and thankfulness towards all those concerned
with our project.

Firstly, we are thankful to LDRP-ITR for the opportunity to undertake this project. We are sincerely indebted to Prof. Deepali Jain for giving us the opportunity to work on this project.

Her continuous guidance and help have proved to be the key to our collective success in overcoming the challenges that we faced during the project work. Her support made the project a pleasantly memorable experience. Without her help at all stages, in spite of her own workload, the completion of the project would not have been possible.

We would also like to express our gratitude to our friends and of course, CE & IT Department of
LDRP-ITR.

Last but not least we are thankful to the almighty God and our parents for giving us such a good
atmosphere to work hard and to succeed.

Regards
DIYORA VRAJ PRAVINBHAI

(221SBEIT30005)

ABSTRACT

Writing in air has lately become one of the most interesting and challenging research areas in image processing and pattern recognition. A wide range of human-machine interactions benefit from this technology. Numerous studies have investigated new methods and techniques for reducing processing time and improving recognition accuracy. Object tracking is a crucial task in computer vision. Object tracking systems have become more popular as a result of faster computers, more affordable and higher-quality video cameras, and the increasing need for automated video analysis. Typically, video analysis involves three major steps: identifying the object, tracking its movement from frame to frame, and finally analyzing its behavior. Object tracking raises four issues: object representation, tracking feature selection, object identification, and object tracking itself. Object tracking algorithms are used in a variety of real-world applications, including video indexing, autonomous surveillance, and vehicle navigation. This study concerns motion-to-text converters for intelligent wearable electronics that can write in the air; the project acts as a record of transitory movements. It uses computer vision to track the path of the finger, and the generated text may then be used in messages, emails, and similar applications. It can be a great help for deaf individuals, offering a way to communicate without having to write.

LIST OF FIGURES

1 Use Case Diagram
2 Data Flow Diagram
3 E-R Diagram
4 Sequence Diagram
5 Activity Diagram
6 UML Class Diagram
TABLE OF CONTENTS

1 Introduction
  1.1 Computer Vision Project
  1.2 Aims and Objective of the Work
  1.3 Brief Literature Review
  1.4 Problem Definition
  1.5 Plan of the Work
2 Technology and Literature Review
3 System Requirements Study
  3.1 User Characteristics
  3.2 Hardware and Software Requirements
  3.3 Assumptions and Dependencies
4 System Diagrams
5 Software Testing
6 Data Dictionary
7 Result, Discussion and Conclusion
8 Future Enhancement
9 References

Chapter 1

Introduction

1.1 Computer Vision Project

Air Canvas lets the user draw on a screen simply by waving a finger fitted with a colorful point or a simple colored cap. We will use the computer vision techniques of OpenCV to build this project. The preferred language is Python because of its exhaustive libraries and easy-to-use syntax, but once the basics are understood, the project can be implemented in any language that OpenCV supports.
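As a minimal illustration of the idea (not taken from the project code), a brush stroke can be simulated by buffering successive fingertip positions and joining each point to its successor:

```python
from collections import deque

# Illustrative sketch: a bounded buffer of recent fingertip samples.
# Each new (x, y) sample is appended; consecutive pairs become the
# line segments that form the "brush stroke" on the canvas.
TRAIL_LENGTH = 64  # assumed buffer size

def update_trail(trail, point):
    """Append a fingertip sample and return the segments to draw."""
    trail.append(point)
    # Pair each point with its successor: these are the stroke segments.
    return list(zip(trail, list(trail)[1:]))

trail = deque(maxlen=TRAIL_LENGTH)
for p in [(10, 10), (12, 14), (15, 20)]:
    segments = update_trail(trail, p)

print(segments)  # [((10, 10), (12, 14)), ((12, 14), (15, 20))]
```

In the real application each segment would be rendered with a drawing call such as OpenCV's line-drawing function; the bounded deque keeps memory constant while the stroke grows.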

1.2 Aims and Objective of the work

The basic goal of Computer Vision Project is to map the coordinates of the user's pointer finger to
the screen, where colored circles are drawn and connected to simulate a crude brush stroke.

Air Canvas is developed for communicating a concept or presenting an idea in real-world space. It makes this easy for the user: with this project, we present the idea of an augmented-reality canvas for data representation.
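The coordinate mapping described above can be sketched as follows; the function name, frame and canvas sizes, and the mirroring choice are illustrative assumptions, not details from the project source:

```python
# Hypothetical sketch: scale a fingertip position from the camera
# frame to the drawing canvas, optionally mirroring horizontally so
# the drawing follows the user like a mirror image.
def map_to_canvas(x, y, frame_size, canvas_size, mirror=True):
    fw, fh = frame_size
    cw, ch = canvas_size
    if mirror:
        x = fw - 1 - x  # webcam frames are usually flipped for the user
    return int(x * cw / fw), int(y * ch / fh)

# A fingertip near the centre of a 640x480 frame lands near the
# centre of a 1280x720 canvas.
print(map_to_canvas(320, 240, (640, 480), (1280, 720)))  # -> (638, 360)
```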

1.3 Brief Literature Review

The air canvas project is an innovative solution for hand gesture recognition and its use in various
applications. It is based on two popular technologies, OpenCV and Media Pipe. This review aims
to provide an overview of the project and its development.

OpenCV is an open-source computer vision library that is used for a variety of tasks, including
image and video processing, computer vision, and machine learning. It has a large community of
developers and users who contribute to its development and support.

Media Pipe is a high-performance machine learning framework developed by Google. It is used for
computer vision and other applications in which the processing of large amounts of data is
required. The framework includes a number of pre-trained models for a variety of tasks, such as
face detection and tracking, hand tracking, and object detection.

1.4 Problem Definition

Have you ever wondered whether waving your finger in the air could draw on a real canvas? The problem addressed here is how such an air canvas works using computer vision.

1.5 Plan of the Work

1. Study the reference paper well.

2. Check for other papers related to this topic.

3. Analyze other methods (if any are found in related papers) that can be used in this project.

4. Install Python and start learning it.

5. Prepare the steps and algorithms.

6. Make a rough design of the project.

Work scheduled for the coming period:

1. Start the implementation.

2. Complete the implementation of each step

2.1 Understanding the HSV (Hue, Saturation, Value) color space for color tracking, and tracking the small colored object at the fingertip.

2.2 Detecting the position of the colored object at the fingertip and forming a circle over it (contour detection).

2.3 Tracking the fingertip and drawing points at each position for the air-canvas effect (frame processing).

2.4 Fixing the minor details of the code so the program runs smoothly (algorithmic optimization).

3. Complete the entire coding and test it.

4. Check the performance of the project.

5. Prepare for the final presentation.
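Step 2.1 above can be illustrated per-pixel with the standard library's colorsys module; OpenCV applies the same idea to whole frames with cv2.inRange. The hue and saturation thresholds below are assumed values for a blue cap, not the project's actual settings:

```python
import colorsys

# Assumed hue window for a blue cap, as a fraction of the colour
# wheel (~200-250 degrees); minimum saturation and value reject
# washed-out and dark pixels that would otherwise match.
HUE_LO, HUE_HI = 0.55, 0.70
SAT_MIN, VAL_MIN = 0.4, 0.3

def is_cap_pixel(r, g, b):
    """True if an RGB pixel (0-255 channels) falls inside the tracked hue window."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return HUE_LO <= h <= HUE_HI and s >= SAT_MIN and v >= VAL_MIN

print(is_cap_pixel(30, 60, 220))   # saturated blue -> True
print(is_cap_pixel(220, 60, 30))   # red -> False
```

HSV is preferred over RGB for this kind of tracking because the hue of the cap stays roughly constant as lighting changes, while its RGB values shift considerably.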

Chapter 2

Technology and Literature Review

1) Computer Vision: This technology is used to track and analyze hand gestures in real-time,
allowing the user to draw in the air using hand movements.

2) OpenCV: OpenCV (Open-Source Computer Vision Library) is an open-source computer vision and machine learning software library that provides a comprehensive set of algorithms and tools to enable computer vision-based applications.

3) Media Pipe: Media Pipe is a cross-platform framework that enables real-time multimedia
processing using machine learning models and computer vision algorithms.

4) Python: Python is a high-level, interpreted programming language that is widely used for
scientific computing, data analysis, and artificial intelligence.

5) GUI libraries: GUI libraries such as Tkinter or PyQt are used to create the user interface of the
application, allowing the user to interact with the application and view the results of the hand
gesture analysis.

6) Deep Learning: Deep learning algorithms and models may be used to train the computer vision
system to recognize and respond to different hand gestures.

7) TensorFlow: TensorFlow is a popular open-source software library for machine learning and
deep learning developed by Google.

Chapter 3

System Requirements Study

3.1 User Characteristics

1) Technical knowledge: The user should have some knowledge of computer vision, image
processing and programming concepts to understand the workings of the Air Canvas project.

2) Creativity: The user should have an artistic inclination and be able to think creatively in order
to use the Air Canvas to its full potential.

3) Patience: The user should be patient and willing to experiment with different techniques to get
the desired results.

4) Hand-eye coordination: The user should have good hand-eye coordination to make precise
gestures that can be accurately captured and translated into digital drawings.

5) Open to learning: The user should be open to learning new software and technologies to
enhance their experience with the Air Canvas project.

6) Device compatibility: The user should have a device that is compatible with the software used
in the Air Canvas project, such as a webcam or a depth camera.

3.2 Hardware and Software Requirements

 Hardware requirements: -

 PC or laptop with a webcam or external camera

 High-speed internet connection

 Software requirements: -

 Operating System: Windows, Mac or Linux

 Python 3.x installed

 OpenCV and Media Pipe libraries installed

 Additional libraries: NumPy, Imutils, Time, Dlib, and TensorFlow

 Functional Requirements: -

1) Real-time hand tracking and gesture recognition: The system must be able to
detect and track hands in real-time and recognize different hand gestures to enable
users to control the canvas.

2) Image processing: The system must be able to perform image processing tasks,
such as background subtraction and object detection, to identify the user's hands.

3) Painting functionality: The system must provide a virtual canvas where users can
paint, draw, or create graphics using their hands.

4) User interface: The system must have a user-friendly interface that allows users
to interact with the canvas and access different painting tools and options.

5) Compatibility with different input devices: The system must be compatible with
different input devices, such as cameras and depth sensors, to enable users to use
the system with the device of their choice.

 Non-Functional Requirements: -

1) Performance: The system must have a fast response time and low latency to
enable real-time hand tracking and gesture recognition.

2) Scalability: The system must be scalable to accommodate an increasing number of users.

3) Usability: The system must be user-friendly and easy to use for users of different
ages and skill levels.

3.3 Assumptions and Dependencies

 Assumptions: -

1) The system is designed to work with a standard webcam or a video capture device.

2) The system assumes that the user's hand gestures are clearly visible in the captured
video.

3) The system assumes that the background is uniform and does not contain any
distracting elements.

4) The system assumes that the user is wearing contrasting colors on their hands, which
will help in detecting their hand gestures.

 Dependencies: -

1) The project requires the installation of OpenCV and Media Pipe libraries to function
correctly.

2) The project also requires a computer with a dedicated graphics card and adequate
processing power to handle the computational requirements.

3) The project requires a webcam or a video capture device to capture the hand
gestures.

4) The project depends on the accuracy of the hand detection algorithms implemented
in OpenCV and Media Pipe.

5) The project depends on the availability of accurate and up-to-date libraries and
modules used in the development process.

Chapter 4

System Diagrams

 Use Case Diagram

 Data Flow Diagram Level 0

The level 0 data flow diagram represents the basic structure of the project through its overall flow of data, helping the user understand it quickly and easily.

 Data Flow Diagram Level 1

The level 1 data flow diagram represents the internal structure through the flow of data and gives a brief description of the project flow.

 Data Flow Diagram Level 2

 E-R Diagram

An entity-relationship diagram is a type of flowchart that illustrates how "entities" such as people, objects, or concepts relate to each other within a system. ER diagrams are most often used to design or debug relational databases in the fields of software engineering, business information systems, education, and research. Also known as ERDs or ER models, they use a defined set of symbols such as rectangles, diamonds, ovals, and connecting lines to depict the interconnectedness of entities, relationships, and their attributes. They mirror grammatical structure, with entities as nouns and relationships as verbs. ER diagrams are related to data structure diagrams (DSDs), which focus on the relationships of elements themselves, and are often used in conjunction with data flow diagrams (DFDs), which map out the flow of information for a process or system.

 Sequence Diagram

 Activity Diagram

An activity diagram is another important behavioral diagram, used to describe the dynamic aspects of a system. It is essentially an advanced version of a flowchart that models the flow from one activity to another.

 UML Class Diagram

Screenshots

1. Landing Page

2. Implementation Page

3. Ending Page

Chapter 5

Software Testing

Software testing is a critical element of software quality assurance and represents the ultimate review of specification, design, and coding. Testing is the exposure of a system to trial input to see whether it produces correct output. Testing cannot determine whether software meets users' needs, only whether it appears to conform to requirements. Testing cannot show that a system is free of errors, only that it contains errors. Testing finds errors; it does not correct them. Software success is a quality product, on time and within cost, and testing can reveal critical (costly) mistakes. Testing should therefore:

 Validate performance.
 Detect errors.
 Identify inconsistencies.

Testing Principles:

 All tests should be traceable to customer requirements.

 Tests should be planned long before testing begins.

 The Pareto principle applies to software testing.

 Testing should begin "in the small" and progress toward testing "in the large".

 Exhaustive testing is not possible.

1. Unit Testing

Unit testing focuses verification effort on the smallest unit of software design: the software component or module. Using the component-level design description as a guide, important control paths are tested to uncover errors within the boundary of the module. The unit test focuses on the internal processing logic and data structures within the boundaries of a component. This type of testing can be conducted in parallel for multiple components.
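A unit test in this sense might look like the following sketch, applied to a hypothetical Air Canvas helper; the function and its behavior are illustrative, not taken from the project code:

```python
import unittest

# Hypothetical helper under test: clamp a fingertip position so
# drawing never leaves the canvas bounds.
def clamp_to_canvas(x, y, width, height):
    """Keep a point inside a width x height canvas."""
    return min(max(x, 0), width - 1), min(max(y, 0), height - 1)

class ClampTest(unittest.TestCase):
    def test_inside_point_unchanged(self):
        self.assertEqual(clamp_to_canvas(100, 50, 640, 480), (100, 50))

    def test_point_clamped_to_edges(self):
        self.assertEqual(clamp_to_canvas(-5, 999, 640, 480), (0, 479))

if __name__ == "__main__":
    unittest.main(exit=False)
```

Each test exercises one control path of the unit in isolation, which is exactly the boundary-focused verification described above.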

2. Integration Testing

Integration testing is a systematic technique for constructing the software architecture while at the same time conducting tests to uncover errors associated with interfacing. The objective is to take unit-tested components and build a program structure that has been dictated by the design.

Top-down integration
This is an incremental approach to constructing the software architecture. Modules are integrated by moving downward through the control hierarchy, beginning with the main control module.

Bottom-up integration
This begins construction and testing with atomic modules. Because components are integrated from the bottom up, the processing required for components subordinate to a given level is always available, and the need for stubs is eliminated.

3. Validation Testing

In validation testing, the requirements established during software requirements analysis are validated against the software that has been constructed. All validation criteria are tested. Validation testing provides the final assurance that the software meets all functional, behavioral, and performance requirements.

The alpha test is conducted at the developer's site by end users. The software is used in a natural setting, with the developer "looking over the shoulder" of typical users and recording errors and usage problems. It is conducted in a controlled environment.

The beta test is conducted at end-user sites. Unlike alpha testing, the developer is generally not present. Therefore, the beta test is a "live" application of the software in an environment that cannot be controlled by the developer. The end user records all problems encountered during beta testing and reports them to the developer at regular intervals.

White Box Testing Principles

White-box testing, sometimes called glass-box testing, is a test design method that uses the control structure of the procedural design to derive test cases. Using white-box testing methods, the software engineer can derive test cases that:

 Guarantee that all independent paths within a module have been exercised at least once.
 Exercise all logical decisions on their true and false sides.
 Execute all loops at their boundaries and within their operational bounds.
 Exercise internal data structures to ensure their validity.

Testing is a software quality assurance activity that is very important for the system to work successfully and achieve high software quality.

Chapter 6

Data Dictionary

1) User Input:

Definition: Data collected from the user's hand gestures, including x and y coordinates, size, and
shape.

Data Type: Integer/Float

Description: This data is used to track the movement of the user's hand and draw on the canvas.

2) Canvas Dimensions:

Definition: The dimensions of the canvas where the user can draw.

Data Type: Integer

Description: This data is used to set the size of the canvas and ensure that the user's hand gestures
are within the canvas bounds.

3) Color:

Definition: The color used for drawing on the canvas.

Data Type: String

Description: This data is used to determine the color of the line drawn on the canvas.

4) Brush Size:

Definition: The size of the brush used for drawing on the canvas.

Data Type: Integer

Description: This data is used to determine the width of the line drawn on the canvas.

5) Draw Mode:

Definition: The mode used for drawing on the canvas.

Data Type: Boolean

Description: This data is used to determine whether the user is in draw mode or not.

6) Clear Canvas:

Definition: A function used to clear the canvas.

Data Type: Boolean

Description: This data is used to determine whether the canvas should be cleared or not.

7) Save Image:

Definition: A function used to save the image on the canvas.

Data Type: Boolean

Description: This data is used to determine whether the image on the canvas should be saved or
not.
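The entries above could be gathered into a single structure in code; the sketch below is one illustrative possibility (the names and default values are assumptions, not the project's actual code):

```python
from dataclasses import dataclass, field

# One possible in-code form of the data dictionary above.
@dataclass
class CanvasState:
    width: int = 640            # canvas dimensions (integers)
    height: int = 480
    color: str = "blue"         # drawing colour (string)
    brush_size: int = 4         # line width in pixels (integer)
    draw_mode: bool = False     # whether strokes are being recorded (boolean)
    strokes: list = field(default_factory=list)

    def clear(self):
        """The 'Clear Canvas' flag realised as a method: drop all strokes."""
        self.strokes.clear()

state = CanvasState()
state.draw_mode = True
state.strokes.append((10, 20))
state.clear()
print(state.strokes)  # []
```

Keeping these fields in one dataclass makes the canvas configuration easy to pass around, serialise, and test.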

Chapter 7

Result, Discussion and Conclusion

 Result

It is a digital drawing application where the user can interact with the canvas using
hand gestures. The computer vision algorithms can track the movement and position
of the hand, allowing the user to draw, color, and erase on the canvas with hand
gestures. This technology has potential applications in digital art, design, and
gaming. The result can also be evaluated based on its accuracy and efficiency in
detecting and tracking hand gestures, as well as its usability and user experience.

 Discussion

1) Performance evaluation: Discuss the accuracy and reliability of the hand detection and drawing capabilities, and any limitations that were encountered during testing.

2) Comparison with existing solutions: Compare the air canvas project with other
hand tracking and drawing solutions in terms of performance, user experience, and
overall functionality.

3) User feedback: Discuss any feedback received from users of the air canvas
project, including their experience with the technology, ease of use, and suggestions
for improvement.

4) Future work: Discuss potential areas for further development, such as improving
hand detection accuracy, adding new features, or enhancing the user experience.

5) Conclusion: Summarize the key findings of the air canvas project, highlighting
its strengths and weaknesses, and providing recommendations for future work.

Chapter 8

Future Enhancement

Air Canvas is a technology that allows users to draw and interact with virtual content in
the air using hand gestures. Here are some potential future enhancements for this
technology:

1. Improved Gesture Recognition: Air Canvas relies on precise hand gestures to function effectively. As
such, future enhancements could include improvements in gesture recognition technology to make the
system more responsive and accurate. This could involve the use of more advanced sensors, such as
radar or LiDAR, to capture and interpret hand movements more precisely.

2. Integration with Augmented Reality: Air Canvas could be integrated with augmented reality (AR)
technology to create a more immersive experience. This would allow users to draw and interact with
virtual objects in the real world, enhancing the creative possibilities of the technology.

3. Collaboration and Sharing: Future enhancements could also include features that allow multiple users
to collaborate and share their work in real-time. This could involve the ability to work on the same
canvas simultaneously, or to share content across different devices and platforms.

4. Haptic Feedback: Air Canvas could also benefit from the addition of haptic feedback technology,
which would allow users to feel tactile sensations as they draw and interact with virtual content. This
could help to enhance the overall user experience and make the technology more engaging and
intuitive to use.

5. Integration with AI: Finally, Air Canvas could be enhanced through the integration of artificial intelligence (AI) technology, which could help to automate certain aspects of the drawing process and provide users with suggestions and recommendations based on their previous work. This could help to streamline the creative process and make it more accessible to a wider range of users.

 Conclusion

The purpose of the project was to detect hand gestures and translate them into drawings on a virtual canvas using computer vision techniques. The project leveraged the power of OpenCV and Media Pipe to achieve the desired results.

The results of the air canvas project demonstrate the effectiveness of using computer vision to recognize hand gestures and translate them into drawings. The project was successful in detecting hand gestures and providing a smooth and intuitive interface for users to create art.

Overall, the air canvas project highlights the potential of computer vision to
enhance the user experience in a variety of domains. The project provides a glimpse
into the future of technology, where intuitive and seamless interfaces will play a
major role in making computing more accessible and engaging.

In conclusion, the air canvas project has successfully achieved its goals and has laid
a foundation for further research and development in the area of computer vision
and user experience.

Chapter 9

References

1) OpenCV: OpenCV (Open-Source Computer Vision Library) is an open-source computer vision and machine learning software library. See the OpenCV documentation (https://docs.opencv.org/master/) for more information.

2) Media Pipe: Media Pipe is a platform for building multimodal (e.g., video, audio) applied machine learning pipelines. See the Media Pipe official website (https://mediapipe.dev) for more information.

3) Hand Tracking and Pose Estimation using Media Pipe: See the following article for an overview of the hand tracking and pose estimation process using Media Pipe (https://mediapipe.dev/docs/examples/hand_tracking).
