
Air Canvas using OpenCV-Python

A Mini Project Report Submitted to

JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY, HYDERABAD

In Partial Fulfillment of the Requirement for the Award of the Degree of

BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE AND ENGINEERING

Submitted by

1. S. Architha (H.T.No: 19N01A0597)
2. Khuteja Nazlee (H.T.No: 19N01A0563)
3. S. Pooja (H.T.No: 19N01A0598)
4. P. Laxmi (H.T.No: 19N01A0580)
5. S. Sai Surya (H.T.No: 19N01A0599)

Under the Supervision of


Mr. R. RAMESH
Assistant Professor

Department of Computer Science and Engineering

SREE CHAITANYA COLLEGE OF ENGINEERING

(Affiliated to JNTUH, HYDERABAD)

THIMMAPUR, KARIMNAGAR, TELANGANA-505 527

DECEMBER 2022
SREE CHAITANYA COLLEGE OF ENGINEERING
(Affiliated to JNTUH, HYDERABAD)
THIMMAPUR, KARIMNAGAR, TELANGANA-505 527
Department of Computer Science and Engineering

CERTIFICATE

This is to certify that the mini project report entitled “Air Canvas using Python-OpenCV”
is being submitted by S. ARCHITHA (19N01A0597), KHUTEJA NAZLEE (19N01A0563),
S. POOJA (19N01A0598), P. LAXMI (19N01A0580), and S. SAI SURYA (19N01A0599) in partial
fulfillment of the requirement for the award of the degree of Bachelor of Technology in
the Computer Science and Engineering discipline to the Jawaharlal Nehru Technological
University, Hyderabad, during the academic year 2022-2023, and that it is a bonafide work
carried out by them under my guidance and supervision.

The results embodied in this report have not been submitted to any other university or
institution for the award of any degree or diploma.

Project Guide Head of the department

Mr. R. RAMESH                Mr. KHAJA ZIAUDDIN


Assistant Professor Associate Professor
Department of CSE Department of CSE

EXTERNAL EXAMINER

SREE CHAITANYA COLLEGE OF ENGINEERING
(Affiliated to JNTUH, HYDERABAD)
THIMMAPUR, KARIMNAGAR, TELANGANA-505 527
Department of Computer Science and Engineering

DECLARATION

We, the students of Bachelor of Technology in Computer Science and Engineering during the
academic year 2022-2023, hereby declare that the work presented in this project work entitled
Air Canvas Using Python-OpenCV is the outcome of our own bona fide work, is correct to the
best of our knowledge, and has been carried out with due regard to engineering ethics under the
supervision of Mr. Ragi Ramesh, Assistant Professor.

It contains no material previously published or written by another person nor material which
has been accepted for the award of any other degree or diploma of the university or other institute
of higher learning, except where due acknowledgment has been made in the text.

1. S. Architha (19N01A0597)
2. Khuteja Nazlee (19N01A0563)
3. S. Pooja (19N01A0598)
4. P. Laxmi (19N01A0580)
5. S. Sai Surya (19N01A0599)

SREE CHAITANYA COLLEGE OF ENGINEERING
(Affiliated to JNTUH, HYDERABAD)
THIMMAPUR, KARIMNAGAR, TELANGANA-505 527
Department of Computer Science and Engineering

ACKNOWLEDGEMENTS

The satisfaction that accompanies the successful completion of any task would be
incomplete without mentioning the people who made it possible and whose constant
guidance and encouragement crowned all efforts with success.

I would like to express my sincere gratitude and indebtedness to my project supervisor,
Mr. R. Ramesh, Assistant Professor, Department of Computer Science and Engineering,
Sree Chaitanya College of Engineering, LMD Colony, Karimnagar, for his valuable
suggestions and interest throughout the course of this project.

I am also thankful to the Head of the Department, Mr. Khaja Ziauddin, Associate Professor &
HOD, Department of Computer Science and Engineering, Sree Chaitanya College of
Engineering, LMD Colony, Karimnagar, for providing excellent infrastructure and a pleasant
atmosphere for completing this project successfully.

We sincerely extend our thanks to Dr. G. Venkateswarlu, Principal, Sree Chaitanya
College of Engineering, LMD Colony, Karimnagar, for providing all the facilities required for
the completion of this project.

I convey my heartfelt thanks to the lab staff for allowing me to use the required
equipment whenever needed.

Finally, I would like to take this opportunity to thank my family for their support throughout
the work.

I sincerely acknowledge and thank all those who directly or indirectly gave their support
in the completion of this work.

S.Architha

ABSTRACT
Writing in air has been one of the most fascinating and challenging research areas in the field
of image processing and pattern recognition in recent years. It contributes immensely to the
advancement of automation processes and can improve the interface between man and
machine in numerous applications. Several research works have focused on new techniques
and methods that would reduce the processing time while providing higher recognition
accuracy. Object tracking is considered an important task within the field of computer vision.
The invention of faster computers, the availability of inexpensive and good-quality video
cameras, and the demand for automated video analysis have made object tracking techniques
popular. Generally, the video analysis procedure has three major steps: firstly, detecting the
object; secondly, tracking its movement from frame to frame; and lastly, analysing the
behaviour of that object. For object tracking, four different issues are taken into account:
selection of a suitable object representation, feature selection for tracking, object detection,
and object tracking. In the real world, object tracking algorithms are the primary part of
different applications such as automatic surveillance, video indexing, and vehicle navigation.
The project takes advantage of this gap and focuses on developing a motion-to-text converter
that can potentially serve as software for intelligent wearable devices for writing in the air.
The project recognizes gestures and uses computer vision to trace the path of the finger. The
generated text can also be used for various purposes, such as sending messages, emails, etc.
It can be a powerful means of communication for the deaf. It is an effective communication
method that reduces mobile and laptop usage by eliminating the need to write.
CHAPTER 1
INTRODUCTION

Air Canvas, a project built in Python, is a computer vision project using MediaPipe, which is
a cross-platform framework for building multimodal applied machine learning pipelines.
Computer vision is a field of Artificial Intelligence (AI) that enables computers and systems
to derive meaningful information from digital images, videos, and other visual inputs, and to
take actions or make recommendations based on that information.

Using the tip of the pointing finger, one can interact with the canvas. The aim of building this
project is to make virtual classes effective and easy for teachers who face difficulties drawing
or writing with a mouse and who do not have a touchscreen laptop or any other pen input
device. They can simply draw on the board using the webcam and the tip of their finger; no
additional hardware is required.

An air canvas is a type of interactive display that allows users to draw or paint in mid-air,
using gestures or other input devices. While it is possible to use objects, such as pens or
sticks, to create an air canvas, this would likely require some additional technology, such as a
depth-sensing camera or another input device that can detect the movement of objects in 3D
space.

This technology could then be used to track the movement of the objects and convert it
into digital brush strokes or other drawing tools. However, creating an air canvas using
objects would likely be a complex and challenging task, and there are likely to be many
technical challenges involved. It is also worth noting that using objects to create an air canvas
may not be as intuitive or user-friendly as using gestures or other input devices designed
specifically for this purpose.

Air Canvas is a software tool that allows users to create and manipulate digital images using
hand gestures. It uses a combination of computer vision and machine learning algorithms to
track hand movements and interpret them as drawing commands. Air Canvas allows users to
draw, paint, and erase with their hands, using a webcam or other video input device as the
input source. It can be used for a variety of creative and artistic purposes, as well as for
educational and research applications.

1.1 OVERVIEW

An air canvas is a virtual whiteboard that allows users to draw or write in the air using
gestures. It can be implemented using the OpenCV (Open Source Computer Vision) library,
which is a popular open-source library for computer vision tasks. To implement an air canvas
using OpenCV, you will need a camera to capture the gestures made by the user. You will
also need some way of detecting and tracking the gestures, such as using color tracking or
motion tracking algorithms.

Once the gestures have been detected and tracked, you can use OpenCV to draw the
corresponding lines or text on a virtual canvas. This can be displayed on a screen or projected
onto a surface, allowing the user to see their drawings in real-time. Overall, implementing an
air canvas using OpenCV requires a combination of computer vision techniques and
interactive graphics. It can be a challenging project, but can also be very rewarding as it
allows users to interact with technology in a natural and intuitive way.
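
As a minimal sketch of the capture loop described above (assuming the default webcam at index 0), each frame is read, mirrored, and converted to the HSV colour space that later colour-tracking steps use:

import cv2

cap = cv2.VideoCapture(0)                             # default webcam
while True:
    ret, frame = cap.read()                           # grab one frame
    if not ret:
        break
    frame = cv2.flip(frame, 1)                        # mirror so drawing feels natural
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)      # colour space used for tracking later
    cv2.imshow("Air Canvas - camera", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):             # press 'q' to quit
        break
cap.release()
cv2.destroyAllWindows()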

1.2 MOTIVATION

The underlying inspiration was the need for a dust-free classroom for students to study in.
There are alternatives such as touch screens, but many schools cannot afford large
touch-enabled displays to teach on like a television. This led to the idea of simply tracking a
finger instead, and doing so at a basic level, without deep learning. OpenCV therefore came
to the rescue for this computer vision project.

1.3 EXISTING SYSTEM

The digital pen in the existing system consists of a tri-axial accelerometer, a microcontroller,
and an RF wireless transmission module for sensing and collecting the accelerations of
handwriting and motion trajectories. The embedded system first extracts time- and
frequency-domain features from the acceleration signals and then transmits them using an RF
transmitter. On the receiver side, the RF signals are received by an RF receiver and passed to
a microcontroller. The controller processes the data, and the results are finally shown on a
graphical LCD.

1. Fingertip detection:

The existing system only works with fingers; there are no highlighters, paints, or similar
tools. Identifying and characterizing an object such as a finger from an RGB image without a
depth sensor is a great challenge.

2. Lack of pen-up and pen-down motion:

The system uses a single RGB camera to write from above. Since depth sensing is not
possible, pen-up and pen-down movements cannot be followed. Therefore, the fingertip's
entire trajectory is traced, and the resulting image would be absurd and not recognized by the
model.

1.4 PROPOSED SYSTEM

This computer vision experiment uses an air canvas, which allows you to draw on a screen
by waving a finger equipped with a colorful tip or a basic colored cap. These computer vision
projects would not have been possible without OpenCV's help. No keypads, styluses, pens, or
gloves are needed for character input in the suggested technique.

The proposed framework uses a webcam and a display unit (monitor screen). A pen or the
hand is used to draw the desired shapes in front of the camera; the traced strokes are then
shown on the display unit. The system is suited to decoding time-series motion signals into
significant feature vectors. Users can use the pen to compose digits or make hand motions,
and the result is shown on the display unit.
Modules of the Proposed System

1. Color Tracking
Understanding the HSV (Hue, Saturation, Value) color space for color tracking, and tracking
the small colored object at the fingertip. The incoming image from the webcam is converted
to the HSV color space for detecting the colored object at the tip of the finger.
2. Trackbars
Once the trackbars are set up, the real-time values are read from the trackbars to build a
range. This range is a NumPy structure that is passed to the function cv2.inRange(). This
function returns a mask of the colored object: a black-and-white image with white pixels at
the positions of the desired color.
3. Contour Detection
Detecting the position of the colored object at the fingertip and drawing a circle over it. Some
morphological operations are performed on the mask to free it from impurities and to detect
the contour easily. This is contour detection.
4. Frame Processing
Tracking the fingertip and drawing points at each position for the air-canvas effect. This is
frame processing.
5. Algorithmic Optimization
Making the code efficient so that the program runs smoothly. A brief sketch of this
color-tracking pipeline (modules 1-3) is shown below.
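
The following is a minimal sketch of modules 1-3 under stated assumptions (OpenCV 4.x, where cv2.findContours returns two values; the HSV bounds shown are illustrative placeholders, not the project's tuned values):

import cv2
import numpy as np

def find_marker_center(frame, lower_hsv=(64, 72, 49), upper_hsv=(153, 255, 255)):
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)                        # Module 1: color space
    mask = cv2.inRange(hsv, np.array(lower_hsv), np.array(upper_hsv))   # Module 2: mask
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.erode(mask, kernel, iterations=1)                        # remove impurities
    mask = cv2.dilate(mask, kernel, iterations=1)                       # restore the eroded mask
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,             # Module 3: contours
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    cnt = max(contours, key=cv2.contourArea)                            # largest colored blob
    (x, y), _radius = cv2.minEnclosingCircle(cnt)
    return int(x), int(y)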
1.5 OBJECTIVE

The main objective of an air canvas using OpenCV is to allow users to draw or write in the
air using gestures, and have their drawings or writing displayed in real-time on a screen or
projected surface. This can be achieved by using a camera to capture the gestures made by
the user, and then using computer vision algorithms to detect and track the gestures.

Other possible objectives of an air canvas using OpenCV might include:

1. Providing an intuitive and natural way for users to interact with technology

2. Allowing users to express themselves creatively through drawing or writing

3. Creating a fun and engaging experience for users

4. Demonstrating the capabilities of OpenCV and computer vision technology

5. Serving as a platform for further development and experimentation with computer vision
algorithms and interactive graphics.
CHAPTER – 2
LITERATURE SURVEY

A. Tracking of Brush Tip on Real Canvas: Silhouette-Based and Deep Ensemble
Network-Based Approaches

Authors - Joolekha Bibi Joolee, Ahsan Raza, Muhammad Abdullah, Seokhee Jeon.

Working - The proposed deep ensemble network is trained offline using data captured
through an external tracker (OptiTrack V120) and the silhouette-based approach. During
actual drawing, the trained network estimates the brush-tip position by taking the
brush-handle pose as input, allowing real material to be used with a real brush. During the
testing phase, the framework works continuously: it tracks the brush-handle pose (position
and orientation), and the proposed deep ensemble network takes this pose as input and
predicts the brush-tip position in real time. For data collection, various strokes were
performed for 60 seconds on the surface within the tracking area.

B. Augmented Airbrush for Computer Aided Painting (CAP):

Authors - Roy Shilkrot, Pattie Maes, Joseph A. Paradiso, and Amit Zoran.

Working - To operate the augmented airbrush, the user stands before the canvas, free to
work on any part of the composition, use any style, and consult the computer screen if he or
she wishes. The reference and canvas are aligned with a calibrated anchor point that
corresponds to the virtual origin. The user can move the device using a coordinated strategy,
a more intuitive one, or a blend of both. The computer intervenes only when the virtual
tracking corresponds to a paint projection that violates the virtual reference. In such a case,
the computer keeps the user from using the full capacity of the airbrush trigger and applying
paint where it is not needed. The device is based on a Grex Genesis.XT, a pistol-style
airbrush relieved of its rear paint-volume adjustment knob. Since this is a dual-action
airbrush, operating the trigger opens both the compressed-air valve and the paint fluid valve,
which is made of a needle and a nozzle, resulting in a stream of air mixed with paint
particles. The authors developed a custom-made augmentation mechanism to permit digital
control of the paint mixture. A Grex air compressor supplies compressed air at 20 PSI, and a
Polhemus Fastrak magnetic motion tracking system positions the device in 6DOF.

C. 3D Drawing with Augmented Reality:

Authors - Sharanya M, Sucheta Kolur, Sowmyashree B V, Sharadhi L, Bhanushree K J.

Working - A mobile application that runs on Android devices lets the user draw on the
world, treating it as a canvas, implements real-time synchronization of the drawing across all
instances of the application running in the same network room, and provides a tool for
creative content producers to quickly sketch their ideas in 3D space. The freehand technique
permits the user to draw continuously as directed by hand movements. To begin a line, the
user performs the air-tap gesture. The line is drawn continuously at the index cursor position
until the user ends the line by performing a second air-tap.
CHAPTER-3
PROBLEM DEFINITION
The existing system only works with fingers; there are no highlighters, paints, or similar
tools. Identifying and distinguishing an object such as a finger from an RGB image without a
depth sensor is a great challenge. Another problem is the lack of pen-up and pen-down
motion. The system uses a single RGB camera that writes from above; since depth detection
is impossible, pen-up and pen-down movements cannot be traced. As a result, the entire
finger path is drawn, and the resulting image is abstract and unrecognizable to the model.
Using real-time hand gestures to switch the process from one state to another requires a lot
of careful code. In addition, the user must know many gestures to control the system
adequately. The project focuses on solving some of the most important social problems.
First, many hearing-impaired people face problems in everyday life. While hearing and
listening are taken for granted, people with this disability communicate using sign language,
and most of the world cannot understand their feelings and emotions without a translator in
between. Second, overuse of smartphones causes accidents, stress, distractions, and other
illnesses that we are still discovering; although their portability and ease of use are very
popular, their drawbacks include life-threatening events. Third, paper wastage is not
uncommon. A lot of paper is wasted on writing, scribbling, drawing, and so on. Producing a
single A4 sheet requires about 5 litres of water, 93% of paper sources come from trees, 50%
of commercial waste is paper, 25% of landfill is paper, and the list goes on. Wasted paper
harms the environment through the use of water and trees and produces tons of waste.
Writing in the air can address these problems: it can serve as a communication tool for the
deaf, the generated text can be displayed in AR or translated into speech, one can write in the
air quickly and continue working without much interruption, and no paper is needed because
everything is stored electronically.
The project focuses on solving some major societal problems:
1. Hearing impairment: Although we take hearing and listening for granted, hearing-impaired
people communicate using sign languages, and most of the world cannot understand their
feelings and emotions without a translator in between.
2. Overuse of smartphones: Smartphones cause accidents, depression, distractions, and other
illnesses that we are still discovering. Although their portability and ease of use are widely
admired, the negatives include life-threatening events.
3. Paper wastage: Paper wastage is not scarce news. We waste a lot of paper in scribbling,
writing, drawing, etc. Some basic facts: about 5 litres of water on average are required to
produce a single A4 sheet.
CHAPTER-4
SOFTWARE AND HARDWARE REQUIREMENT

1. Hardware Requirements
 Dual-core CPU
 Minimum 1 GB of RAM
 Windows 7 or greater
 Web camera
2. Software Requirements
 Python
 NumPy module
 OpenCV module
 MediaPipe

1.HARDWARE REQUIREMENTS:

Dual Core CPU: A dual core CPU is a type of central processing unit (CPU) that has two
independent cores, or processing units, on the same chip. It is capable of processing two
streams of instructions simultaneously, which can increase the performance of certain types
of tasks.

Minimum 1 GB of RAM: RAM (random access memory) is a type of computer memory that
is used to store data that is actively being used or processed by the system. It is volatile
memory, meaning it is wiped clean when the power is turned off.
The amount of RAM that a computer has can affect its performance. In general, having more
RAM can allow a computer to run more programs concurrently and improve the speed at
which it can complete tasks.
A minimum of 1 GB (gigabyte) of RAM is often recommended for basic tasks such as web
browsing, word processing, and email. However, the actual amount of RAM that is required
for a particular task or application can vary depending on the specific needs of the software
and the operating system. For example, more resource-intensive tasks such as video editing
or gaming may require more RAM to run smoothly.
It's worth noting that the minimum requirements for RAM can vary depending on the specific
operating system and software being used. For example, the minimum RAM requirement for
running the latest version of Microsoft Windows is 2 GB, while the minimum requirement
for running macOS is 4 GB. It is always a good idea to check the system requirements for the
specific software or operating system that you are using to ensure that your computer has
enough RAM to run it effectively.

Windows 7 or greater: Windows 7 is a personal computer operating system that was


produced by Microsoft and released as part of the Windows NT family of operating systems.
It was released to manufacturing on July 22, 2009, and became generally available on
October 22, 2009. Windows 7 was succeeded by Windows 8, which was released in October
2012.
Some key features of Windows 7 include:
 A redesigned taskbar and start menu
 Improved support for multi-touch input
 Enhanced support for hardware acceleration
 Improved performance and boot time
 Improved security features, including BitLocker encryption and AppLocker
 The ability to create a home network and share files and printers with other computers
 Support for virtual hard disks
 Improved support for different languages and input methods
To use Windows 7, your computer must meet certain hardware and software requirements.
These requirements include:
 Processor: 1 GHz or faster 32-bit (x86) or 64-bit (x64) processor
 Memory: 1 GB RAM for 32-bit or 2 GB RAM for 64-bit
 Hard drive: 16 GB available hard disk space for 32-bit or 20 GB for 64-bit
 Graphics card: DirectX 9 graphics processor with WDDM 1.0 or higher driver
It's worth noting that Microsoft ended mainstream support for Windows 7 on January 13,
2015, and ended extended support on January 14, 2020. This means that Microsoft no longer
provides security updates or technical support for the operating system. If you are still using
Windows 7, it is recommended to upgrade to a newer version of Windows to receive ongoing
security updates and support.

Web Camera: A webcam, also known as a web camera, is a video camera that is used to
capture images and video for transmission over the internet. Webcams are typically small and
portable, making them convenient for use with computers and other devices.
Webcams can be used for a variety of purposes, such as video conferencing, live streaming,
and recording video for social media or other online platforms. They are often integrated into
laptops and desktop computers, but can also be purchased as standalone devices that can be
connected to a computer or other device via USB.
Most webcams have a built-in microphone, which allows them to capture audio as well as
video. Some webcams also have additional features such as a built-in LED light or the ability
to pan, tilt, and zoom to capture a wider field of view.
Webcams can vary in terms of their video quality, with higher-end models typically offering
higher resolution and a more detailed image. They can also vary in terms of their frame rate,
which is the number of frames captured per second. A higher frame rate can result in a
smoother, more realistic video, but may also require more bandwidth and processing power.
It's worth noting that webcams can be vulnerable to security risks, such as the potential for
unauthorized access or surveillance. If you are concerned about the security of your webcam,
you may want to consider using a physical cover to block the camera when it is not in use, or
disabling the webcam in your device's settings.
2.SOFTWARE REQUIREMENTS:

Python:
Python is a general-purpose interpreted, interactive, object-oriented, and high-level
programming language. It was created by Guido van Rossum during 1985- 1990. Like Perl,
Python source code is also available under the GNU General Public License (GPL).
Why to Learn Python?
Python is a high-level, interpreted, interactive and object-oriented scripting language.
Python is designed to be highly readable. It uses English keywords frequently, whereas other
languages use punctuation, and it has fewer syntactical constructions than other languages.
Python is a must for students and working professionals who want to become great software
engineers, especially when they are working in the web development domain. Python is also
one of the most widely used languages on the web. Some of its key advantages are listed
below:
 Easy-to-learn − Python has few keywords, simple structure, and a clearly defined
syntax. This allows the student to pick up the language quickly.
 Easy-to-read − Python code is more clearly defined and visible to the eyes.
 Easy-to-maintain − Python's source code is fairly easy-to-maintain.
 A broad standard library − Python's bulk of the library is very portable and cross-
platform compatible on UNIX, Windows, and Macintosh.
 Interactive Mode − Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
 Portable − Python can run on a wide variety of hardware platforms and has the same
interface on all platforms.
 Extendable − You can add low-level modules to the Python interpreter. These
modules enable programmers to add to or customize their tools to be more efficient.
 Databases − Python provides interfaces to all major commercial databases.
 GUI Programming − Python supports GUI applications that can be created and
ported to many system calls, libraries and windows systems, such as Windows MFC,
Macintosh, and the X Window system of Unix.
 Scalable − Python provides a better structure and support for large programs than
shell scripting.

Numpy
NumPy is the fundamental package for scientific computing in Python. It is a Python library
that provides a multidimensional array object, various derived objects (such as masked arrays
and matrices), and an assortment of routines for fast operations on arrays, including
mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier
transforms, basic linear algebra, basic statistical operations, random simulation and much
more.
At the core of the NumPy package, is the ndarray object. This encapsulates n-dimensional
arrays of homogeneous data types, with many operations being performed in compiled code
for performance. There are several important differences between NumPy arrays and the
standard Python sequences:
 NumPy arrays have a fixed size at creation, unlike Python lists (which can grow
dynamically). Changing the size of an ndarray will create a new array and delete the
original.
 The elements in a NumPy array are all required to be of the same data type, and thus
will be the same size in memory. The exception: one can have arrays of (Python,
including NumPy) objects, thereby allowing for arrays of different sized elements.
 NumPy arrays facilitate advanced mathematical and other types of operations on large
numbers of data. Typically, such operations are executed more efficiently and with
less code than is possible using Python’s built-in sequences.
 A growing plethora of scientific and mathematical Python-based packages are using
NumPy arrays; though these typically support Python-sequence input, they convert
such input to NumPy arrays prior to processing, and they often output NumPy arrays.
In other words, in order to efficiently use much (perhaps even most) of today’s
scientific/mathematical Python-based software, just knowing how to use Python’s
built-in sequence types is insufficient; one also needs to know how to use NumPy
arrays.
The points about sequence size and speed are particularly important in scientific computing.
As a simple example, consider the case of multiplying each element in a 1-D sequence with
the corresponding element in another sequence of the same length.
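
As an illustrative sketch of this point, the same element-wise multiplication can be written with a plain Python loop or as a single NumPy expression:

import numpy as np

a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]

# Pure-Python version: explicit loop over both sequences
c_loop = [a[i] * b[i] for i in range(len(a))]

# NumPy version: one vectorized expression, executed in compiled code
c_np = np.array(a) * np.array(b)

print(c_loop)   # [10.0, 40.0, 90.0, 160.0]
print(c_np)     # [ 10.  40.  90. 160.]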
OpenCV
OpenCV is a huge open-source library for computer vision, machine learning, and image
processing. OpenCV supports a wide variety of programming languages like Python, C++,
Java, etc. It can process images and videos to identify objects, faces, or even the handwriting
of a human. When it is integrated with various libraries, such as NumPy, which is a highly
optimized library for numerical operations, then the number of weapons in your arsenal
increases, i.e., whatever operations one can do in NumPy can be combined with OpenCV.
This OpenCV section covers image processing from basics to advanced, like operations on
images and videos, using a huge set of OpenCV programs and projects.

In OpenCV, "CV" is an abbreviation of computer vision, which is defined as a field of study
that helps computers understand the content of digital images such as photographs and
videos.
The purpose of computer vision is to understand the content of images. It extracts a
description from the pictures, which may be an object, a text description, a three-dimensional
model, and so on. For example, cars can be facilitated with computer vision, which will be
able to identify and differentiate objects around the road, such as traffic lights, pedestrians,
and traffic signs, and act accordingly.
Computer vision allows the computer to perform the same kinds of tasks as humans with the
same efficiency. There are two main tasks, which are defined below:
o Object Classification - In object classification, we train a model on a dataset of
particular objects, and the model classifies new objects as belonging to one or more of
the training categories.
o Object Identification - In object identification, the model identifies a particular instance
of an object - for example, parsing two faces in an image and tagging one as Virat Kohli
and the other as Rohit Sharma.
History
OpenCV stands for Open Source Computer Vision Library, which is widely used for image
recognition or identification. It was officially launched in 1999 by Intel. It was written in
C/C++ in the early stages, but now it is commonly used in Python for computer vision as
well.
The first alpha version of OpenCV was released for the common use at the IEEE Conference
on Computer Vision and Pattern Recognition in 2000, and between 2001 and 2005, five betas
were released. The first 1.0 version was released in 2006.
The second version of OpenCV was released in October 2009 with significant changes. The
second version contains a major change to the C++ interface, aiming at easier, more
type-safe patterns and better implementations. Currently, development is done by an
independent Russian team, and a newer version is released every six months.
How does a computer recognize an image?
Human eyes provide lots of information based on what they see. Machines, in contrast, see
everything as numbers: the vision is converted into numbers and stored in memory. The
question is how a computer converts images into numbers, and the answer is that pixel
values are used. A pixel is the smallest unit of a digital image or graphic that can be
displayed and represented on a digital display device.

The picture intensity at a particular location is represented by a number. For a grayscale
image, each pixel value consists of only one value: the intensity of black at that location.
There are two common ways to represent images (a short sketch follows):
1. Grayscale
Grayscale images contain only shades ranging from black to white. In the contrast
measurement of intensity, black is treated as the weakest intensity and white as the
strongest. When we use a grayscale image, the computer assigns each pixel a value
based on its level of darkness.
2. RGB
An RGB value is a combination of red, green, and blue components which together make
a new color. The computer retrieves the value from each pixel and puts the results
in an array to be interpreted.
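
A short sketch of the idea (assuming an image file named sample.jpg is available in the working directory; note that OpenCV loads colour images in BGR channel order):

import cv2

gray = cv2.imread("sample.jpg", cv2.IMREAD_GRAYSCALE)   # 2-D array, one value per pixel
color = cv2.imread("sample.jpg", cv2.IMREAD_COLOR)      # 3-D array: B, G, R per pixel

print(gray.shape, gray[0, 0])     # e.g. (480, 640) and a single intensity such as 127
print(color.shape, color[0, 0])   # e.g. (480, 640, 3) and a triplet such as [34 60 90]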

Why OpenCV is used for Computer Vision?


o OpenCV is available free of cost.
o Since the OpenCV library is written in C/C++, it is quite fast, and it can now
be used with Python.
o It requires little RAM, roughly 60-70 MB.
o OpenCV is portable and can run on any device that can run C and Python.

MediaPipe
It is a cross-platform framework for building multimodal applied machine learning
pipelines.
MediaPipe is a framework for building multimodal (e.g., video, audio, or any time-series
data), cross-platform (i.e., Android, iOS, web, edge devices) applied ML pipelines. With
MediaPipe, a perception pipeline can be built as a graph of modular components, including,
for instance, inference models (e.g., TensorFlow, TFLite) and media processing functions.
Cutting edge ML models
 Face Detection
 Multi-hand Tracking
 Hair Segmentation
 Object Detection and Tracking
 Objectron: 3D Object Detection and Tracking
 AutoFlip: Automatic video cropping pipeline
Cross-platform ML solutions
Build once, deploy anywhere. Works optimally across mobile (iOS, Android), desktop/server,
and the web.
On-device ML acceleration
Performance-optimized, end-to-end on-device inference with ML acceleration for mobile GPU
and EdgeTPU compute.
How Google uses MediaPipe
MediaPipe is used by many internal Google products and teams including: Nest, Gmail, Lens,
Maps, Android Auto, Photos, Google Home, and YouTube.
MediaPipe for Hand
MediaPipe Hands is a high-fidelity hand and finger tracking solution. It employs machine
learning (ML) to infer 21 3D landmarks of a hand from just a single frame. Whereas current
state-of-the-art approaches rely primarily on powerful desktop environments for inference,
our method achieves real-time performance on a mobile phone, and even scales to multiple
hands. We hope that providing this hand perception functionality to the wider research and
development community will result in an emergence of creative use cases, stimulating new
applications and new research avenues.
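
Below is a minimal sketch, assuming the mediapipe package is installed, of feeding webcam frames to MediaPipe Hands and marking the index-fingertip landmark (landmark 8), which an air canvas can use as the drawing point:

import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    frame = cv2.flip(frame, 1)
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # MediaPipe expects RGB
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark[8]             # index fingertip
        h, w, _ = frame.shape
        cv2.circle(frame, (int(lm.x * w), int(lm.y * h)), 8, (0, 255, 0), -1)
    cv2.imshow("MediaPipe Hands", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()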
CHAPTER-5
DESIGN AND IMPLEMENTATION
5.1 Architecture of the Proposed System
5.1.1 Architecture
5.1.2 Module Description

The collections module in Python provides different types of containers. A container is an
object that is used to store different objects, provide a way to access the contained objects,
and iterate over them. Some of the built-in containers are tuple, list, dictionary, etc. Beyond
that, the application can be organized into the following kinds of modules:
 Image processing modules: These modules may be used to perform operations on
images such as resizing, cropping, color space conversion, thresholding, and edge
detection.
 Video processing modules: These modules may be used to process video streams,
such as to extract frames, stabilize the video, or track objects.
 Machine learning modules: These modules may be used to train and use machine
learning models for tasks such as object detection or classification.
 User interface modules: These modules may be used to create a user interface for the
application, such as to display images or video, or to allow the user to input
commands or select options.
 Air canvas-specific modules: There may also be specific modules that are designed
specifically for the "air canvas" application, such as modules for tracking hand
movements or detecting gestures.
 Gesture recognition module: This module is responsible for detecting and interpreting
hand gestures or movements made by the user. It may use techniques such as motion
tracking, skeleton tracking, or machine learning algorithms to identify specific
gestures.
 Drawing module: This module is responsible for rendering the drawings or paintings
created by the user on the screen. It may use techniques such as image manipulation
or graphics rendering to create the final image.
 User interface module: This module is responsible for creating the interface that the
user interacts with, such as the canvas area, buttons for selecting colors or brush sizes,
and any other controls or options.
 Input module: This module is responsible for capturing input from the user, such as
hand gestures or movements, and passing it on to the appropriate modules for
processing. This may involve using computer vision techniques or sensors to track the
user's hand movements.
 Output module: This module is responsible for displaying the final result to the user,
such as the drawing or painting created using the user's hand gestures. It may use
techniques such as image rendering or video output to display the result on the screen.
 Communication module: This module is responsible for communicating with any
external devices or systems, such as a server or database. It may be used to save or
load drawings, or to share them with other users.

5.1.3 System Workflow

A workflow using OpenCV might involve the following steps (a brief sketch follows the list):

1. Importing and setting up the OpenCV library in your project.


2. Loading an image or video from a file or camera into the program.
3. Preprocessing the image or video to improve the accuracy of any subsequent analysis.
This might include steps such as resizing, cropping, or converting the image to a
different color space.
4. Applying image processing or computer vision techniques to the image or video. This
might include tasks such as object detection, face recognition, or feature extraction.
5. Analyzing the results of the image processing or computer vision algorithms to extract
information or make decisions based on the data.
6. Visualizing the results by displaying the processed image or video to the user or
saving the results to a file.
7. Optionally, repeating the process on multiple images or videos in a loop to perform
automated analysis.
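
As a small, hedged illustration of steps 1-6 (assuming an image file named sample.jpg exists in the working directory), the following sketch loads an image, preprocesses it, applies edge detection, extracts a simple measurement, and displays the result:

import cv2

img = cv2.imread("sample.jpg")                  # step 2: load an image
img = cv2.resize(img, (640, 480))               # step 3: preprocess (resize)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)    # step 3: convert colour space
edges = cv2.Canny(gray, 100, 200)               # step 4: apply a vision technique
print("edge pixels:", cv2.countNonZero(edges))  # step 5: extract simple information
cv2.imshow("edges", edges)                      # step 6: visualise the result
cv2.waitKey(0)
cv2.destroyAllWindows()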
Working
Here colour detection and tracking are used in order to achieve the objective. The colour
marker is detected and a mask is produced. Further morphological operations, erosion and
dilation, are applied to the mask: erosion reduces the impurities present in the mask, and
dilation then restores the eroded main mask. The air canvas detects blue colour in the camera
frame, and whichever blue object is detected becomes the pen/stylus used to draw. (Caution:
there should be no other blue-coloured object in the camera frame background for the air
canvas to work smoothly.) A pen or another blue-coloured object can be used to act as the
brush on the canvas.
[Images of the working application]
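
A hedged sketch of the mask clean-up described above, assuming a binary mask has already been obtained from cv2.inRange():

import cv2
import numpy as np

def clean_mask(mask):
    """Remove small impurities from a binary mask, then restore its main shape."""
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.erode(mask, kernel, iterations=1)            # erosion removes small noise
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # opening = erosion then dilation
    mask = cv2.dilate(mask, kernel, iterations=1)           # dilation restores the eroded object
    return mask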
5.1.4 Interaction Among All Modules

1. The input module captures the user's hand gestures or movements and passes them to
the gesture recognition module.
2. The gesture recognition module interprets the input and determines the appropriate
action to take, such as moving the brush or changing the brush size. It then sends this
information to the drawing module.
3. The drawing module uses the input from the gesture recognition module to update the
canvas image, adding lines or strokes as needed.
4. The output module receives the updated canvas image from the drawing module and
displays it to the user on the screen.
5. The user interface module receives input from the user, such as button clicks or
selections, and passes this input to the appropriate module for processing. For
example, if the user selects a different brush color, the user interface module would
pass this information to the drawing module, which would update the brush color
accordingly.
6. The communication module may also be involved in the process, for example if the
user wants to save their drawing or share it with others. In this case, the
communication module would handle the transfer of data to and from external devices
or systems.
5.2 ALGORITHM

1. Start reading the frames and convert the captured frames to the HSV color space (easy for
color detection).

2. Prepare the canvas frame and put the respective ink buttons on it.

3. Adjust the trackbar values for finding the mask of the colored marker.

4. Preprocess the mask with morphological operations (erosion and dilation).

5. Detect the contours, find the center coordinates of the largest contour, and keep storing
them in arrays for successive frames (arrays of drawing points on the canvas).

6. Finally, draw the points stored in the arrays on the frames and the canvas (see the sketch
below).
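
A brief sketch of steps 5 and 6 under stated assumptions (OpenCV 4.x and a mask already prepared in the earlier steps):

import cv2
import numpy as np
from collections import deque

points = deque(maxlen=1024)   # drawing points stored frame after frame

def update_and_draw(frame, canvas, mask, color=(255, 0, 0)):
    # Step 5: find the centre of the largest contour in the mask
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        cnt = max(contours, key=cv2.contourArea)
        M = cv2.moments(cnt)
        if M["m00"] != 0:
            center = (int(M["m10"] / M["m00"]), int(M["m01"] / M["m00"]))
            points.appendleft(center)
    # Step 6: draw the stored points on both the camera frame and the canvas
    for i in range(1, len(points)):
        cv2.line(frame, points[i - 1], points[i], color, 2)
        cv2.line(canvas, points[i - 1], points[i], color, 2)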
1. Writing Mode - In this state, the system traces the fingertip coordinates and stores them.
2. Colour Mode - The user can change the colour of the text among the various available
colours.
3. Backspace - If the user goes wrong, a gesture is needed to add a quick backspace.

The following algorithms may be involved:

1. Motion tracking: This algorithm is responsible for tracking the user's hand movements in
real time. It may use techniques such as frame differencing, optical flow, or feature
tracking to identify the location and movement of the hand.
2. Skeleton tracking: This algorithm is responsible for detecting the bones and joints in
the user's hand and arm, and creating a skeleton model of them. This allows the
application to more accurately interpret the user's hand gestures.
3. Gesture recognition: This algorithm is responsible for interpreting the user's hand
gestures and determining the appropriate action to take, such as moving the brush or
changing the brush size. It may use techniques such as hidden Markov models,
decision trees, or machine learning algorithms to classify the gestures.
4. Drawing: This algorithm is responsible for rendering the lines or strokes on the screen
as the user moves their hand. It may use techniques such as image manipulation or
graphics rendering to create the final image.
5. Machine learning: Depending on the complexity of the gestures that the application
needs to recognize, it may also use machine learning algorithms to train a model to
classify the gestures. This may involve using a dataset of labeled gestures to train the
model, and then using the trained model to classify new gestures in real-time.
5.3 System design

Here is a possible design for such a system:


1. The system would require a device with a camera, such as a laptop or smartphone, to
capture images of the user's hands.
2. The system would use OpenCV to process the images and track the movements of the
user's hands in real-time. This could be done using techniques such as object detection
or feature tracking.
3. The system would have a virtual canvas displayed on a screen or projected onto a
surface. As the user moves their hands in the air, the system would translate the hand
movements into digital strokes on the virtual canvas.
4. The system might also include additional features, such as the ability to select
different colors or brush sizes, or to undo or redo strokes.
The system could potentially be extended to support multiple users, allowing them to
collaborate on a single virtual canvas in real-time
5.3.1 ER Diagrams
5.3.2 DFD Diagrams
5.3.3 UML Diagrams
5.3.4 Database Design
5.4 Sample Code
CHAPTER-6
TESTING

Testing is the process of exercising software with the intent of finding errors and ultimately
correcting them. The following testing techniques have been used to make this project free of
errors.
Content Review
The whole content of the project has been reviewed thoroughly to uncover typographical
errors, grammatical errors, and ambiguous sentences.
Navigation Errors
Different users were allowed to navigate through the project to uncover the navigation
errors. The views of the user regarding the navigation flexibility and user friendliness were
taken into account and implemented in the project.
Unit Testing
Focuses on individual software units, groups of related units.
 Unit – smallest testable piece of software.
 A unit can be compiled /assembled / linked/loaded; and put under a test harness.
 Unit testing done to show that the unit does not satisfy the application and /or its
implemented software does not match the intended designed structure.
Integration Testing
Focuses on combining units to evaluate the interaction among them
 Integration is the process of aggregating components to create larger components.
 Integration testing done to show that even though components were individually
satisfactory, the combination is incorrect and inconsistent.
System testing
Focuses on a complete integrated system to evaluate compliance with specified requirements
(test characteristics that are only present when entire system is run)
 A system is a big component.
 System testing is aimed at revealing bugs that cannot be attributed to a component as such,
to inconsistencies between components or planned interactions between components.
 Concern: issues, behaviors that can only be exposed by testing the entire integrated system
(e.g., performance, security, recovery). Each form/window encapsulates controls (labels, text
fields, grids, etc.); hence, in such a project, the forms are the basic units, and each form is
tested thoroughly in terms of calculation, display, etc.
Regression Testing
Each time a new form is added to the project the whole project is tested thoroughly to rectify
any side effects. That might have occurred due to the addition of the new form. Thus
regression testing has been performed.
White-Box testing
White-box testing (also known as clear box testing, glass box testing, transparent box testing
and structural testing) tests internal structures or workings of a program, as opposed to the
functionality exposed to the end-user. In white-box testing an internal perspective of the
system, as well as programming skills, are used to design test cases. The tester chooses inputs
to exercise paths through the code and determine the appropriate outputs.
Black-box testing
Black-box testing treats the software as a “black box”, examining functionality without any
knowledge of internal implementation. The tester is only aware of what the software is
supposed to do, not how it does it. Black-box testing methods include: equivalence
partitioning, boundary value analysis, all-pairs testing, state transition tables, decision table
testing, fuzz testing, model-based testing, use case testing, exploratory testing and
specification-based testing. Specification-based testing aims to test the functionality of
software according to the applicable requirements. This level of testing usually requires
thorough test cases to be provided to the tester, who then can simply verify that for a given
input, the output value (or behavior) either “is” or “is not” the same as the expected value
specified in the test case. Test cases are built around specifications and requirements, i.e.,
what the application is supposed to do. It uses external descriptions of the software, including
specifications, requirements, and designs to derive test cases. These tests can be functional or
non-functional, though usually functional.
Alpha Testing
Alpha testing is simulated or actual operational testing by potential users/customers or an
independent test team at the developers’ site. Alpha testing is often employed for off-the-
shelf software as a form of internal acceptance testing, before the software goes to beta
testing.
Beta Testing
Beta testing comes after alpha testing and can be considered a form of external user
acceptance testing. Versions of the software, known as beta versions, are released to a limited
audience outside of the programming team. The software is released to groups of people so
that further testing can ensure the product has few faults or bugs. Sometimes, beta versions
are made available to the open public to increase the feedback field to a maximal number of
future users

import numpy as np

import cv2

from collections import deque

 numpy: A library for scientific computing with Python. It provides functions
for working with arrays, matrices, and numerical operations.

 cv2: The OpenCV (Open Source Computer Vision) library. It provides a wide
range of functions and tools for computer vision tasks, including image and
video processing, object detection, and machine learning.
 deque: A double-ended queue implementation from the collections module. It
allows you to add and remove elements from both ends of the queue
efficiently.
 The import numpy as np line imports the numpy library and assigns it the
alias np, which is a common convention. This allows you to refer to the library
using the shorter np name instead of typing out numpy every time.
 The import cv2 line imports the cv2 module, which provides access to the
functions and tools in the OpenCV library.
 The from collections import deque line imports the deque class from the
collections module. This allows you to create deque objects, which are double-
ended queues that you can add and remove elements from efficiently.

Step 2 – Read frames from a webcam:

# default callback function for the trackbars
def setValues(x):
    print("")

The setValues function you have defined takes a single parameter x, but it does not do anything
with it. It simply prints an empty string.

It is likely that this function is intended to be used as a callback function for a trackbar in the
OpenCV library. A trackbar is a graphical widget that allows the user to set a value by sliding a
knob along a range of values. When the trackbar is moved, the callback function is called with the
new value of the trackbar.

# Creating the trackbars needed for adjusting the marker colour
cv2.namedWindow("Color detectors")
cv2.createTrackbar("Upper Hue", "Color detectors", 153, 180, setValues)
cv2.createTrackbar("Upper Saturation", "Color detectors", 255, 255, setValues)

cv2.createTrackbar("Upper Value", "Color detectors", 255, 255,setValues)

cv2.createTrackbar("Lower Hue", "Color detectors", 64, 180,setValues)

cv2.createTrackbar("Lower Saturation", "Color detectors", 72, 255,setValues)


In this code snippet, you are creating a window called "Color detectors" and adding six
trackbars to it. The trackbars are used to adjust the upper and lower bounds of the hue,
saturation, and value channels of a color space.
The cv2.createTrackbar function is used to create a trackbar. It takes the following arguments:
The name of the trackbar.
The name of the window in which the trackbar will be displayed.
The initial value of the trackbar.
The maximum value of the trackbar.
The callback function to be called when the trackbar is moved.
In this case, the trackbars are given names like "Upper Hue" and "Lower Saturation", and
they have a range of values from 0 to 255. The setValues function is specified as the callback
function for each trackbar.
When the user moves any of these trackbars, the setValues function will be called with the
new value of the trackbar as an argument. You can then use this value to adjust the color
detection parameters in your code.
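
Continuing from the trackbar setup above, here is a hedged sketch of how those positions are typically read back inside the main loop with cv2.getTrackbarPos and combined into the HSV range used by cv2.inRange() (the hsv_frame variable is assumed to be the current webcam frame already converted to HSV):

# Inside the frame-processing loop: read the current trackbar positions
u_hue = cv2.getTrackbarPos("Upper Hue", "Color detectors")
u_sat = cv2.getTrackbarPos("Upper Saturation", "Color detectors")
u_val = cv2.getTrackbarPos("Upper Value", "Color detectors")
l_hue = cv2.getTrackbarPos("Lower Hue", "Color detectors")
l_sat = cv2.getTrackbarPos("Lower Saturation", "Color detectors")
l_val = cv2.getTrackbarPos("Lower Value", "Color detectors")

upper_hsv = np.array([u_hue, u_sat, u_val])
lower_hsv = np.array([l_hue, l_sat, l_val])
# mask = cv2.inRange(hsv_frame, lower_hsv, upper_hsv)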

# Giving different arrays to handle colour points of different colour

bpoints = [deque(maxlen=1024)]

gpoints = [deque(maxlen=1024)]

rpoints = [deque(maxlen=1024)]

ypoints = [deque(maxlen=1024)]

 This code appears to be defining four different arrays, each of which is a deque
(double-ended queue) with a maximum length of 1024. The deques are named
"bpoints", "gpoints", "rpoints", and "ypoints", and they are each associated with a
different color.
 A deque is a data structure that allows you to add and remove elements from both the
front and the back of the queue. It is similar to a list, but it has more efficient insert
and delete operations for elements at the beginning and end of the queue. The
"maxlen" parameter specifies the maximum number of elements that the deque can
hold. If the deque reaches its maximum length and a new element is added, the oldest
element will be automatically removed to make room for the new one.
 In this code, it looks like the four deques are being used to store points of different
colors. It is not clear from this code snippet how the deques are being used or what the
points represent. Without more context, it is difficult to provide a more detailed
explanation of this code.
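
As a tiny illustrative sketch of the maxlen behaviour described above:

from collections import deque

d = deque(maxlen=3)
for i in range(5):
    d.append(i)
    print(list(d))
# Output:
# [0]
# [0, 1]
# [0, 1, 2]
# [1, 2, 3]   <- oldest element dropped automatically
# [2, 3, 4]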

# These indexes will be used to mark the points in particular arrays of specific colour

blue_index = 0

green_index = 0

red_index = 0

yellow_index = 0

 This code appears to be defining four variables: "blue_index", "green_index",


"red_index", and "yellow_index". These variables are all integers with initial values
of 0.
 It looks like these variables are being used to keep track of the indexes of points in the
"bpoints", "gpoints", "rpoints", and "ypoints" arrays that were defined in the previous
code snippet. It is not clear from this code snippet how these variables are being used
or what the points represent. Without more context, it is difficult to provide a more
detailed explanation of this code.

# The kernel to be used for dilation purpose

kernel = np.ones((5,5),np.uint8)

colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (0, 255, 255)]
colorIndex = 0

"kernel" is a 2D array of ones with a shape of (5, 5) and a data type of "np.uint8". It is used as the structuring element for dilation. Dilation is a morphological operation in image processing that increases the size of features in an image: the structuring element is applied to the input image, and pixels are added to the boundaries of the features.

"colors" is a list of tuples that represent four different colors: blue, green, red, and yellow. Because OpenCV stores images in BGR order, each tuple gives the blue, green, and red components of the color. "colorIndex" keeps track of the color currently selected by the user, starting with 0 (blue).
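
To make the effect of dilation concrete, here is a small illustrative example (not part of the project code):

import numpy as np
import cv2

kernel = np.ones((5, 5), np.uint8)

# A toy binary "mask" with a single white pixel in the middle of a black image.
mask = np.zeros((9, 9), np.uint8)
mask[4, 4] = 255

dilated = cv2.dilate(mask, kernel, iterations=1)
print(cv2.countNonZero(mask), cv2.countNonZero(dilated))  # 1 -> 25: the white region grows to 5x5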

# Here is code for Canvas setup

paintWindow = np.zeros((471,636,3)) + 255

paintWindow = cv2.rectangle(paintWindow, (40,1), (140,65), (0,0,0), 2)

paintWindow = cv2.rectangle(paintWindow, (160,1), (255,65), colors[0], -1)

paintWindow = cv2.rectangle(paintWindow, (275,1), (370,65), colors[1], -1)

paintWindow = cv2.rectangle(paintWindow, (390,1), (485,65), colors[2], -1)

paintWindow = cv2.rectangle(paintWindow, (505,1), (600,65), colors[3], -1)


This code sets up the canvas by creating an image with a shape of (471, 636, 3) and filling it
with white pixels. The image is then modified by drawing five rectangles on top of it using
the "cv2.rectangle" function from the OpenCV library.

The "cv2.rectangle" function is used to draw a rectangle on an image. It takes several
parameters:

 The image on which to draw the rectangle

 The top-left corner and bottom-right corner of the rectangle, specified as (x, y)
coordinates

 The color of the rectangle, specified as a (B, G, R) tuple

 The thickness of the rectangle's outline, specified as a positive integer. A value of -1
indicates that the rectangle should be filled with the specified color.

In this code, five rectangles are drawn on the image. The first rectangle is a thin black
outline with a thickness of 2 and acts as the "CLEAR" button. The remaining four rectangles
are filled with the colors specified in the "colors" list and act as the color-selection buttons.
The position and size of each rectangle is determined by the coordinates of its top-left and
bottom-right corners. The resulting image, "paintWindow", is the white drawing canvas on
which the user's strokes are later painted.

cv2.putText(paintWindow, "CLEAR", (49, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0,


0, 0), 2, cv2.LINE_AA)

cv2.putText(paintWindow, "BLUE", (185, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5,


(255, 255, 255), 2, cv2.LINE_AA)

cv2.putText(paintWindow, "GREEN", (298, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5,


(255, 255, 255), 2, cv2.LINE_AA)

cv2.putText(paintWindow, "RED", (420, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255,


255, 255), 2, cv2.LINE_AA)
cv2.putText(paintWindow, "YELLOW", (520, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
(150,150,150), 2, cv2.LINE_AA)

cv2.namedWindow('Paint', cv2.WINDOW_AUTOSIZE) This code appears to be adding text


labels to the "paintWindow" image that was created in the previous code snippet. It is using
the "cv2.putText" function from the OpenCV library to draw text on the image.

The "cv2.putText" function is used to draw text on an image. It takes several parameters:

 The image on which to draw the text

 The text to be drawn, specified as a string

 The bottom-left corner of the text, specified as (x, y) coordinates

 The font to use for the text, specified using one of the constants from the
"cv2.FONT_HERSHEY_" family

 The font scale, specified as a floating-point value

 The color of the text, specified as a (B, G, R) tuple

 The thickness of the text, specified as a positive integer

 The line type, specified using one of the constants from the "cv2.LINE_" family

 In this code, five lines of text are drawn on the "paintWindow" image: "CLEAR",
"BLUE", "GREEN", "RED", and "YELLOW". Each line of text is positioned over its
corresponding button at a different (x, y) coordinate and has its own color. The text is
rendered using the "cv2.FONT_HERSHEY_SIMPLEX" font with a scale of 0.5 and a
thickness of 2.

 The final line of code uses the "cv2.namedWindow" function to create a window
named "Paint" with the "cv2.WINDOW_AUTOSIZE" flag, which indicates that the
window should automatically resize to fit the displayed image. This is the window in
which the drawing canvas ("paintWindow") is shown to the user.

# Loading the default webcam of PC.

cap = cv2.VideoCapture(0)
The "cv2.VideoCapture" function is used to open a video stream or a video file and create a
video capture object that can be used to read frames from the stream or file. It takes a single
parameter, which specifies the source of the video stream. In this case, the parameter is 0,
which indicates that the default webcam of the computer should be used as the source.

Once the video capture object has been created, it can be used to read frames from the video
stream using the "cap.read" method. The "cap.read" method returns a Boolean value
indicating whether a frame was successfully read, as well as the frame itself. The frames can
then be processed or displayed using various functions from the OpenCV library or other
image processing libraries.

In this project, the captured frames form the live video feed on which the color buttons are
drawn and in which the pointer (the colored marker or fingertip) is tracked, as shown in the
main loop below.
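
As a side note, the main loop below assumes that cap.read() always succeeds. A slightly more defensive pattern (illustrative only, not part of the project code) checks both the capture object and the returned flag:

import cv2

cap = cv2.VideoCapture(0)
if not cap.isOpened():
    raise RuntimeError("Could not open the default webcam")

ret, frame = cap.read()
if not ret:
    print("No frame received from the camera")

cap.release()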

# Keep looping
while True:
    # Reading the frame from the camera
    ret, frame = cap.read()
    # Flipping the frame to see same side of yours
    frame = cv2.flip(frame, 1)
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    u_hue = cv2.getTrackbarPos("Upper Hue", "Color detectors")
    u_saturation = cv2.getTrackbarPos("Upper Saturation", "Color detectors")
    u_value = cv2.getTrackbarPos("Upper Value", "Color detectors")
    l_hue = cv2.getTrackbarPos("Lower Hue", "Color detectors")
    l_saturation = cv2.getTrackbarPos("Lower Saturation", "Color detectors")
    l_value = cv2.getTrackbarPos("Lower Value", "Color detectors")

    Upper_hsv = np.array([u_hue, u_saturation, u_value])
    Lower_hsv = np.array([l_hue, l_saturation, l_value])
The main loop of the code is an infinite while loop that reads a frame from the camera (stored
in a variable called cap) using the read() method, and then flips the frame horizontally using
the flip() method. This is done so that the output video appears as if it were being viewed
from the same side as the viewer.

Next, the code converts the frame from the BGR color space to the HSV color space using
the cvtColor() method. This is done because it is often easier to perform color-based image
processing tasks in the HSV color space, as it separates the hue (color), saturation (intensity),
and value (brightness) channels.

The code then uses the getTrackbarPos() method to get the values of several trackbars,
which are GUI widgets that allow the user to adjust a value by sliding a thumb along a track.
These trackbars are being used to set upper and lower bounds for the HSV values of the
pixels in the frame. The upper bounds are stored in the Upper_hsv array, and the lower
bounds are stored in the Lower_hsv array

Finally, these upper and lower bounds are used later in the loop (with cv2.inRange) to build a
binary mask that isolates the pixels belonging to the pointer, which is how the colored marker
is detected in each frame.

# Adding the colour buttons to the live frame for colour access

frame = cv2.rectangle(frame, (40,1), (140,65), (122,122,122), -1)

frame = cv2.rectangle(frame, (160,1), (255,65), colors[0], -1)

frame = cv2.rectangle(frame, (275,1), (370,65), colors[1], -1)

frame = cv2.rectangle(frame, (390,1), (485,65), colors[2], -1)

frame = cv2.rectangle(frame, (505,1), (600,65), colors[3], -1)

cv2.putText(frame, "CLEAR ALL", (49, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 2, cv2.LINE_AA)

cv2.putText(frame, "BLUE", (185, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 2, cv2.LINE_AA)

cv2.putText(frame, "GREEN", (298, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 2, cv2.LINE_AA)

cv2.putText(frame, "RED", (420, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 2, cv2.LINE_AA)

cv2.putText(frame, "YELLOW", (520, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (150,150,150), 2, cv2.LINE_AA)

 The rectangle() function is being used to draw four-sided polygons (rectangles) on


the frame. The first argument to this function is the frame on which the rectangles will
be drawn, and the second argument is the coordinates of the top-left corner of the
rectangle. The third argument is the coordinates of the bottom-right corner of the
rectangle. The fourth argument is the color of the rectangle, and the final argument
specifies whether the rectangle should be filled in or just drawn as an outline.
 The code also uses the putText() function to add text labels to the frame. The first
argument to this function is the frame on which the text will be drawn, the second
argument is the text to be drawn, the third argument is the coordinates of the bottom-
left corner of the text, the fourth argument specifies the font to use, the fifth argument
specifies the font scale (relative to the font size), the sixth argument specifies the
color of the text, the seventh argument specifies the thickness of the text outline, and
the final argument specifies the line type (anti-aliased or not).
 The code adds a row of rectangles to the top of the frame, with each rectangle acting
as an on-screen button. The rectangles are labeled with text indicating their function.
The first rectangle is gray and labeled "CLEAR ALL"; hovering the pointer over it
clears the current drawing. The other rectangles are colored and labeled "BLUE,"
"GREEN," "RED," and "YELLOW"; hovering the pointer over one of them selects
that color for drawing, as handled later in the loop.
# Identifying the pointer by making its mask

Mask = cv2.inRange(hsv, Lower_hsv, Upper_hsv)

Mask = cv2.erode(Mask, kernel, iterations=1)

Mask = cv2.morphologyEx(Mask, cv2.MORPH_OPEN, kernel)

Mask = cv2.dilate(Mask, kernel, iterations=1)

The first line of code uses the inRange() function to create a binary mask from the
HSV frame (hsv) by thresholding the values of the pixels in the frame. This mask will
have pixels set to 255 (white) wherever the values of the pixels in the frame fall
within the specified range (Lower_hsv to Upper_hsv) and pixels set to 0 (black)
everywhere else. This mask will highlight or "mask out" the pixels in the frame that
fall within the specified range, which will be used to identify the pointer.

The next three lines of code apply morphological transformations to the mask to
refine it. The erode() function erodes away the boundaries of the white pixels in the
mask, reducing their size. The morphologyEx() function performs an "opening"
operation, which consists of an erosion followed by a dilation. This can be used to
remove small, isolated pixels from the mask. The dilate() function dilates the white
pixels in the mask, increasing their size.

These morphological transformations are applied using a kernel, a small matrix of
values that specifies the shape of the structuring element used in the transformations.
Here it is the 5x5 kernel of ones defined earlier in the code
(kernel = np.ones((5,5), np.uint8)).

In this project the "pointer" is the colored object (for example, a colored cap or
marker on the fingertip) whose HSV values fall inside the selected range; the mask
isolates it so that its position can be located in the video frame.
# Find contours for the pointer after identifying it

cnts, _ = cv2.findContours(Mask.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

center = None

 The findContours() function is used to locate the contours of the objects in the binary
mask (Mask). The first argument to this function is the mask itself, and the second
argument specifies the mode of contour retrieval. The RETR_EXTERNAL flag
specifies that only the extreme outer contours of the objects in the mask should be
returned. The third argument specifies the contour approximation method to be used.
The CHAIN_APPROX_SIMPLE flag specifies that the contours should be
approximated using a simple algorithm that removes redundant points and compresses
the contour coordinates.

 The findContours() function returns a list of contours (cnts) and a hierarchy of the
contours. The hierarchy is not being used in this code snippet, so it is discarded using
the underscore character (_) as a dummy variable.

 The center variable is initialized to None. If a contour is found in this frame it will be
set to the centroid of the pointer; otherwise it stays None, so no point is added to the
drawing buffers for this frame.

# If the contours are formed
if len(cnts) > 0:
    # Sorting the contours to find the biggest
    cnt = sorted(cnts, key=cv2.contourArea, reverse=True)[0]
    # Get the radius of the enclosing circle around the found contour
    ((x, y), radius) = cv2.minEnclosingCircle(cnt)
    # Draw the circle around the contour
    cv2.circle(frame, (int(x), int(y)), int(radius), (0, 255, 255), 2)
    # Calculating the center of the detected contour
    M = cv2.moments(cnt)
    center = (int(M['m10'] / M['m00']), int(M['m01'] / M['m00']))

The code first checks if there are any contours present in the binary mask (cnts). If there
are, it selects the largest contour (cnt) using the sorted() function and the contourArea()
method. This ensures that the pointer, which is assumed to be the largest object in the mask,
is selected.

Next, the code uses the minEnclosingCircle() function to fit a circle around the contour and
get the coordinates of the center of the circle (x, y) and its radius. The code then uses the
circle() function to draw the circle around the contour in the frame.

Finally, the code calculates the center of the contour using the moments() method and the
spatial moments of the contour. The moments() method returns a dictionary of moments that
can be used to calculate various properties of the contour, such as its area, centroid, and
orientation. In this case, the centroid (center of mass) of the contour is being calculated by
dividing the first and second spatial moments by the zero-order moment (area). The resulting
coordinates are stored in the center variable.

It is not clear from the code snippet exactly what the "pointer" is or how it is being used, but
it appears that the code is using the contours and moments of the pointer to identify its
position and shape in the video frame.
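
For reference, the following illustrative snippet (not part of the project code, assuming OpenCV 4.x) computes the centroid of a simple filled square in the same way:

import numpy as np
import cv2

# A toy binary image containing one filled square.
img = np.zeros((100, 100), np.uint8)
cv2.rectangle(img, (20, 20), (60, 60), 255, -1)

cnts, _ = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
M = cv2.moments(cnts[0])
cx, cy = int(M['m10'] / M['m00']), int(M['m01'] / M['m00'])
print(cx, cy)  # approximately (40, 40), the centre of the square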

# Now checking if the user wants to click on any button above the screen
if center[1] <= 65:
    if 40 <= center[0] <= 140: # Clear Button
        bpoints = [deque(maxlen=512)]
        gpoints = [deque(maxlen=512)]
        rpoints = [deque(maxlen=512)]
        ypoints = [deque(maxlen=512)]
        blue_index = 0
        green_index = 0
        red_index = 0
        yellow_index = 0
        paintWindow[67:,:,:] = 255
    elif 160 <= center[0] <= 255:
        colorIndex = 0 # Blue
    elif 275 <= center[0] <= 370:
        colorIndex = 1 # Green
    elif 390 <= center[0] <= 485:
        colorIndex = 2 # Red
    elif 505 <= center[0] <= 600:
        colorIndex = 3 # Yellow
else:
    if colorIndex == 0:
        bpoints[blue_index].appendleft(center)
    elif colorIndex == 1:
        gpoints[green_index].appendleft(center)
    elif colorIndex == 2:
        rpoints[red_index].appendleft(center)
    elif colorIndex == 3:
        ypoints[yellow_index].appendleft(center)

 The code first checks if the y coordinate of the pointer's center (center[1]) is less than
or equal to 65. If it is, then the pointer is considered to be within the top 65 pixels of
the frame, which corresponds to the area where the color buttons are located. In this
case, the code checks the x coordinate of the pointer's center (center[0]) to determine
which button the pointer is hovering over.

 If the pointer is over the "CLEAR" button (40 <= center[0] <= 140), the code resets
several variables that are used to store the points of the pointer's path (bpoints,
gpoints, rpoints, ypoints). These variables are lists of deques (double-ended queues)
that are used to store the coordinates of the pointer as it moves around the frame. The
code also resets several variables that are used to track the position of the pointer in
the lists (blue_index, green_index, red_index, yellow_index).
 If the pointer is over one of the color buttons (160 <= center[0] <= 255, 275 <=
center[0] <= 370, 390 <= center[0] <= 485, or 505 <= center[0] <= 600), the code
sets a variable called colorIndex to the index of the selected color.

 If the pointer is not within the top 65 pixels of the frame, then it is assumed to be
within the main area of the frame where the user can draw. In this case, the code
checks the value of colorIndex to determine which color the user has selected, and
then appends the coordinates of the pointer's center to the appropriate list of points
(bpoints, gpoints, rpoints, or ypoints).

 Here the pointer is the detected marker, and its position is what allows the user to
"press" the on-screen buttons and to draw. The paintWindow image created earlier
stores the accumulated drawing, and the line paintWindow[67:,:,:] = 255 above resets
everything below the button row to white when the clear button is selected.

# Append the next deques when nothing is detected to avoid messing up
else:
    bpoints.append(deque(maxlen=512))
    blue_index += 1
    gpoints.append(deque(maxlen=512))
    green_index += 1
    rpoints.append(deque(maxlen=512))
    red_index += 1
    ypoints.append(deque(maxlen=512))
    yellow_index += 1
 This "else" branch runs when no contour (and therefore no pointer) is detected in the
current frame. Each of the four point lists is a list of deques (double-ended queues)
that store the coordinates of the pointer as it moves around the frame. When the
pointer is not detected, it is assumed that the current stroke has ended, so a new deque
is appended to each list to hold the coordinates of the next stroke. The maxlen
argument to the deque() function specifies the maximum number of elements that
each deque can hold.

 The code also increments four variables (blue_index, green_index, red_index,
yellow_index) by 1. These variables are used to track the position of the pointer in the
lists, and they are incremented whenever a new deque is appended to the list.

 The effect is that each continuous movement of the pointer is stored as its own deque,
so separate strokes are drawn as separate lines rather than being joined together.

# Draw lines of all the colors on the canvas and frame
points = [bpoints, gpoints, rpoints, ypoints]
for i in range(len(points)):
    for j in range(len(points[i])):
        for k in range(1, len(points[i][j])):
            if points[i][j][k - 1] is None or points[i][j][k] is None:
                continue
            cv2.line(frame, points[i][j][k - 1], points[i][j][k], colors[i], 2)
            cv2.line(paintWindow, points[i][j][k - 1], points[i][j][k], colors[i], 2)
a separate "canvas" (paintWindow) based on the coordinates of the "pointer" stored in four
lists (bpoints, gpoints, rpoints, ypoints).
The points variable is a list of the four lists, and the outer for loop iterates over the elements
of this list This code appears to be using the OpenCV library to draw lines on a video frame
and. The inner for loop iterates over the elements of each list, and the nested for loop iterates
over the elements of each deque in the list.
For each element in the deque, the code uses the line() function to draw a line on the video
frame (frame) and the canvas (paintWindow) between the current element and the previous
element in the deque. The colors list is used to specify the color of the line, and the 2
argument specifies the thickness of the line.
The continue statement is used to skip over any elements in the deque that are None, which
may occur if the pointer was not detected in the previous frame.
It is not clear from the code snippet exactly what the "pointer" is or how it is being used, but
it appears that the code is using the stored coordinates of the pointer's movements to allow
the user todraw lines on the video frame and the canvas.
# Show all the windows
cv2.imshow("Tracking", frame)
cv2.imshow("Paint", paintWindow)
cv2.imshow("mask",Mask)
This code is using the OpenCV (cv2) library to display three images in separate windows.
The first window, "Tracking", shows the frame of a video or a single image. The second
window, "Paint", shows an image called "paintWindow". The third window, "mask", shows
an image called "Mask".
cv2.imshow() is a function that takes in two arguments: the name of the window and the
image to be displayed in that window. When the code is run, it will open three windows and
display the respective images in them. The windows will remain open until the user closes
them or until the program is stopped.
The "Tracking" window shows the live camera feed with the color buttons and the circle
drawn around the detected pointer, the "Paint" window shows the white canvas on which the
strokes accumulate, and the "mask" window shows the binary mask used to detect the
pointer, which is useful when tuning the HSV trackbars.

# If the 'q' key is pressed then stop the application
if cv2.waitKey(1) & 0xFF == ord("q"):
    break
This code is checking for user input in the form of a key press. It is using the cv2.waitKey()
function, which waits for a specified time in milliseconds for a user to press a key. If the user
presses a key, the ASCII value of the key is returned.
The code then checks if the key pressed was the letter "q" by comparing the ASCII value of
the "q" key (which is obtained using the ord() function) to the key pressed. If the key pressed
was "q", the code breaks out of the loop it is in.
This code is likely being used to allow the user to stop the program by pressing the "q" key.
When the "q" key is pressed, the program will break out of the loop and continue with the
rest of the code (if there is any). If a different key is pressed, the program will continue
running as normal.
# Release the camera and all resources
cap.release()
cv2.destroyAllWindows()

The "cap" object is a video capture object that is used to capture video frames from a camera or
a video file. The cap.release() method releases the camera and frees up any resources it was
using. This is important to do when you are finished using the camera or video file, as it ensures
that the resources are released and can be used by other programs.

The cv2.destroyAllWindows() function closes all windows that were created using the
cv2.imshow() function. This is useful when you have multiple windows open and want to close
them all at once.

This code is likely being used to clean up resources and close windows at the end of the
program. It is good practice to release resources and close windows when you are finished using
them to ensure that your program is not using unnecessary resources and is not cluttering up the
user's desktop with open windows.
Chapter-7
Results and Output Screens
Chapter-8
Conclusion & Future Work

The system has the potential to challenge traditional writing methods. It removes the need to
carry a mobile phone in hand to jot down notes, providing a simple on-the-go way to do the
same. It will also serve a great purpose in helping specially abled people communicate easily.
Even senior citizens, or people who find it difficult to use keyboards, will be able to use the
system effortlessly. Extending the functionality, the system can also be used to control IoT
devices in the near future, and drawing in the air can also be made possible. The system
would be an excellent software for smart wearables, using which people could better interact
with the digital world, and Augmented Reality could make the text come alive. There are
some limitations of the system which can be improved in the future. Firstly, using a
handwriting recognizer in place of a character recognizer would allow the user to write word
by word, making writing faster. Secondly, hand gestures with a pause can be used to control
the real-time system, as done by [1], instead of using the number of fingertips. Thirdly, our
system sometimes recognizes fingertips in the background and changes their state;
air-writing systems should only obey their master's control gestures and should not be misled
by people around. Also, we used the EMNIST dataset, which is not a proper air-character
dataset. Upcoming object detection algorithms such as YOLO v3 can improve fingertip
recognition accuracy and speed. In the future, advances in Artificial Intelligence will further
enhance the efficiency of air-writing.
Future Work:
Given more time to work on this project, we would improve hand contour recognition,
investigate our original Air Canvas objectives further, and attempt to understand the
multicore module. To improve hand gesture tracking, we would need to dig deeper into
OpenCV. There are various strategies for contour analysis, yet for this specific algorithm it
may be advantageous to investigate the color histogram used to create the contours in
question. Besides this, we could explore different interpolation methods. PyGame includes a
line drawing method (pygame.draw.line()) that could be valuable in creating smoother,
cleaner lines. In a similar vein, implementing a variety of brush shapes, textures, and even an
eraser would make Air Canvas more powerful as a drawing program. Allowing the user to
save their final work, or watch their drawing process as a playback animation, would
likewise be notable features found in genuine creative software. Perhaps there would even be
a way to interface Air Canvas with professional digital drawing programs such as Adobe
Photoshop, Clip Studio Paint, or GIMP. Finally, we could make significant strides by
working out how multicore processing interacts with in-order data processing.

