Air Canvas
Submitted by

Contents
1. Abstract
2. Introduction
3. Literature Review
 3.1 Research Gap
 3.2 Motivation
 3.3 Objectives
 3.4 Methodology
4. Engineering Knowledge and Resource Management
5. Environment and Sustainability
6. Dataset Description and Preprocessing
7. Model Architecture
 7.1 Hyperparameters
8. Prototype and Experimental results
 8.3 Output
9. Conclusions and Future Scope
10. References
Abstract
In the realm of computer vision and augmented reality, the fusion of technology and artistic
expression has given rise to innovative applications. One such advancement is the
development of Air Canvas, a virtual pen system that harnesses the power of MediaPipe and
OpenCV frameworks. This technology enables users to create digital artwork in the air,
transforming their gestures into vibrant, dynamic drawings.
Air Canvas leverages the capabilities of MediaPipe, a popular library for real-time hand and
gesture recognition, and OpenCV, a versatile computer vision library, to track the movement
of the user's hand in real-time. By capturing precise hand gestures and movements, Air
Canvas translates these actions into digital strokes on a virtual canvas. Users can draw,
doodle, and paint without the constraints of physical mediums, opening up endless
possibilities for artistic expression.
This report explores the technical aspects and implementation of Air Canvas, delving into
the algorithms and methodologies behind real-time hand tracking and gesture recognition. By
utilising the rich features provided by MediaPipe and OpenCV, Air Canvas creates an
immersive experience for users, bridging the gap between the physical and digital worlds.
The system's accuracy and responsiveness make it an ideal tool for artists, designers, and
enthusiasts, providing them with a novel platform to unleash their creativity.
Furthermore, this report discusses the potential applications of Air Canvas beyond the
realm of artistry. From educational tools that enhance learning experiences to interactive
presentations that engage audiences, the versatility of Air Canvas extends far beyond
traditional creative pursuits. Its intuitive interface and seamless integration with existing
technologies pave the way for future developments in the fields of virtual reality, education,
and interactive design.
In summary, Air Canvas represents a significant leap in the evolution of virtual pen
technologies. By combining the capabilities of MediaPipe and OpenCV, this innovative
system offers a unique and immersive way for users to express their creativity while pushing
the boundaries of what is possible in the digital realm.
Introduction
Air Canvas, often referred to as the Virtual Pen, is a cutting-edge project leveraging the power of
OpenCV and computer vision technology. Its primary aim is to transform any ordinary surface
into an engaging and interactive sketching area, redefining the way we create digital art and
design.
One of the core features of this project involves the implementation of sophisticated colour
recognition and tracking methods. These techniques allow users to wield a "virtual pen" and
create digital masterpieces by simply moving a coloured object in their hands. It offers a
remarkably immersive experience that closely emulates the sensation of painting, all made
possible through the fusion of technology and creativity.
The Air Canvas project stands at the intersection of cutting-edge technology and artistic
creativity, offering an innovative solution for digital expression. In a world increasingly shaped
by advanced computer vision and augmented reality, this project introduces a groundbreaking
virtual pen system. By harnessing the power of MediaPipe and OpenCV frameworks, Air Canvas
enables users to create digital artwork in the air, breaking free from the constraints of physical
mediums.
Traditionally, artistic expression has been confined to paper, canvas, or digital tablets. However,
with the advent of real-time hand tracking and gesture recognition technologies, the boundaries
of creativity are expanding. Air Canvas capitalises on these advancements to transform hand
gestures into dynamic digital strokes. Users can draw, paint, and visualise their ideas in real-time,
immersing themselves in a virtual artistic space.
This project dives deep into the realm of computer vision, exploring the intricacies of MediaPipe
and OpenCV to accurately track hand movements and gestures. By translating these movements
into digital form, Air Canvas provides users with a novel and intuitive way to create art. The
system's responsiveness and precision make it not only a tool for artists but also a platform for
interactive learning, engaging presentations, and beyond.
In the following sections, we will delve into the technical aspects of Air Canvas, exploring the
algorithms, methodologies, and potential applications of this virtual pen system.
Literature Review
Title: "Air Canvas Through Object Detection Using OpenCV in Python"
Authors: Harshit Rajput, Mudit Sharma, Twesha Mehrotra, Tanya Maurya
Journal/Conference, Year: International Journal of Creative Research Thoughts, 2023
Key learnings: The paper introduces an air canvas system that allows users to draw in mid-air with a stylus on a virtual canvas. The system utilises object detection techniques in OpenCV to track the stylus's position and enable real-time drawing.

Title: "Air Canvas Application Using OpenCV and NumPy in Python"
Authors: Prof. S. U. Saoji, Nishtha Dua, Akash Kumar Choudhary, Bharat Phogat
Journal/Conference, Year: International Research Journal of Engineering and Technology, 2021
Key learnings: The paper focuses on the development of a motion-to-text converter using hand gesture recognition for air writing. It explores the challenges in hand gesture recognition and addresses societal issues related to communication, smartphone dependence, and paper wastage. It outlines a methodology covering the creation of a fingertip recognition dataset and the use of deep learning algorithms for fingertip detection and recognition.
Title: "Real Time Object Detection and Tracking Using Deep Learning and OpenCV"
Authors: Chandan G., Ayush Jain, Harsh Jain, Mohana
Journal/Conference, Year: International Conference on Inventive Research in Computing Applications, 2018
Key learnings: The paper discusses the use of deep learning algorithms for object detection and tracking, focusing on Region-based Convolutional Neural Networks (RCNN), Faster-RCNN, the Single Shot Detector (SSD), and You Only Look Once (YOLO).
3.1 Research Gap
3.2 Motivation
The motivation behind the development of the Air Canvas virtual pen system stems from a
combination of technological innovation, creative empowerment, and the pursuit of enhancing
human-computer interaction. Several compelling factors drive the need for this project:
● Technological Innovation:
The rapid advancements in computer vision, particularly in the realms of hand tracking and
gesture recognition, have opened new possibilities for interactive digital experiences. Harnessing
these technologies allows us to create innovative tools that bridge the physical and digital
worlds, enabling users to express themselves in novel ways.
● Creative Empowerment:
Traditional artistic mediums have their limitations. The digital realm, on the other hand, offers
endless possibilities for creativity. Air Canvas aims to empower artists and enthusiasts by
providing them with a tool that allows for free-form expression without the constraints of
physical materials. By enabling users to draw and paint in the air, the project encourages
creativity to flow without boundaries.
● Fostering Innovation in Education:
Virtual pen systems have the potential to revolutionise education by providing interactive and
engaging learning experiences. Imagine students being able to visualise complex concepts by
drawing them in the air or educators creating interactive lessons with intuitive digital tools. By
exploring the educational applications of Air Canvas, the project aims to foster innovation in
teaching and learning methodologies.
● Multidisciplinary Collaboration:
The development of Air Canvas requires collaboration between experts in computer vision,
software engineering, and artistic design. This multidisciplinary approach not only enriches the
project but also promotes collaboration between diverse fields. Encouraging professionals from
different backgrounds to work together fosters creativity and drives innovation, leading to the
development of groundbreaking technologies.
3.3 Objectives
● To gather comprehensive user feedback and conduct behaviour analysis, refining the virtual pen system based on user interactions and preferences.
● To minimise latency and enhance real-time rendering, providing a responsive and immersive virtual drawing experience.
3.4 Methodology
● Framework Integration:
Seamlessly integrate MediaPipe and OpenCV frameworks, leveraging their respective strengths.
Develop algorithms to merge hand tracking and image processing, ensuring accurate and
real-time gesture recognition.
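The sketch below illustrates one minimal way this integration could look: OpenCV supplies the webcam frames and the display loop, while MediaPipe Hands performs the landmark detection. The confidence thresholds, window name, and single-hand limit are illustrative choices, not the project's confirmed settings.

```python
# Minimal sketch: OpenCV captures frames, MediaPipe Hands extracts landmarks.
# Thresholds and window names are illustrative, not the project's exact values.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)  # default webcam
with mp_hands.Hands(max_num_hands=1,
                    min_detection_confidence=0.7,
                    min_tracking_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.flip(frame, 1)  # mirror view feels natural for drawing
        # MediaPipe expects RGB input; OpenCV delivers BGR frames.
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            for hand in results.multi_hand_landmarks:
                mp_draw.draw_landmarks(frame, hand, mp_hands.HAND_CONNECTIONS)
        cv2.imshow("Air Canvas - tracking", frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()
```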
● Algorithm Optimization:
Refine hand gesture recognition algorithms to accommodate diverse hand shapes and sizes.
Implement machine learning techniques to enhance the system's ability to recognize intricate
gestures, ensuring inclusivity for all users.
● User-Centric Design:
Design an intuitive and ergonomic user interface, considering user preferences and comfort.
Conduct usability tests and gather feedback to iteratively refine the interface, ensuring a seamless
user experience.
● Exploration of Applications:
Investigate applications beyond artistry, such as interactive education, engaging presentations, and collaborative design, evaluating how the virtual pen system performs in each scenario.
Engineering Knowledge and Resource Management
● Software Engineering:
Proficiency in software engineering principles and programming languages (such as Python and
C++) is vital for developing the algorithms and software modules. Software engineers
collaborate with computer vision experts to implement, optimise, and integrate the MediaPipe
and OpenCV frameworks, ensuring seamless functionality and performance.
● Human-Computer Interaction:
Knowledge of HCI principles guides the user interface (UI) and user experience (UX) design.
HCI experts ensure that the virtual pen system is intuitive, ergonomic, and user-friendly. Their
insights contribute to the iterative refinement of the system, focusing on enhancing user
satisfaction and usability.
● Machine Learning and Data Analysis:
Engineering knowledge in machine learning is applied to train models for gesture recognition.
Data analysis skills are utilised to process user feedback and behaviour data, providing valuable
insights for system improvements. Machine learning algorithms might also be employed for
optimising gesture recognition accuracy based on user input.
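As one illustration of how machine learning could serve gesture recognition here, the sketch below fits a k-nearest-neighbour classifier over flattened landmark vectors. The model choice, label set, feature layout, and use of scikit-learn are assumptions for demonstration, not the project's confirmed pipeline.

```python
# Illustrative sketch: a k-nearest-neighbour classifier over flattened
# landmark vectors. The model, labels, and random placeholder data are
# assumptions for demonstration, not the project's confirmed pipeline.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# X: N samples x 63 features (21 landmarks x (x, y, z)); y: gesture labels.
X_train = np.random.rand(200, 63)                      # placeholder data
y_train = np.random.choice(["draw", "select", "clear"], size=200)

clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)

sample = np.random.rand(1, 63)                         # one frame's vector
print(clf.predict(sample))                             # e.g. ['draw']
```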
● Resource Management:
Hardware: personal computers with webcams.
Software: Python, OpenCV, MediaPipe, NumPy, and Git/GitHub for collaboration.
Environment and Sustainability
The Air Canvas project recognizes the importance of considering environmental and
sustainability aspects in its development and deployment. The following points highlight how the
project aligns with principles of environmental consciousness and sustainability:
● Energy Efficiency:
The virtual pen system is designed to operate on commonly available hardware, reducing the
need for specialised, resource-intensive devices. This not only increases accessibility but also
minimises the environmental impact associated with the production and disposal of electronic
devices.
● Cloud-Based Deployment:
Cloud computing resources are leveraged judiciously to enhance scalability and reduce the need
for individual users to own high-performance hardware. This approach promotes resource
sharing and optimises the use of computational power, contributing to a more sustainable
computing model.
● Open-Source Collaboration:
The Air Canvas project embraces open-source principles, encouraging collaboration and
knowledge sharing within the developer community. This collaborative approach not only fosters
innovation but also reduces redundancy in software development efforts, promoting a more
sustainable use of human and computational resources.
● End-User Awareness:
The project emphasises the importance of user awareness regarding energy consumption and
device sustainability. Educational components within the user interface or accompanying
documentation may provide tips on optimising settings for energy efficiency and responsible
device usage.
● Materials Consideration:
In the physical components associated with the project, such as hardware interfaces or input
devices, consideration is given to the environmental impact of materials used in manufacturing.
Efforts are made to choose materials with lower environmental footprints and to promote
recycling or responsible disposal practices.
● Longevity and Upgradability:
The Air Canvas software is designed with long-term maintenance and upgradability in mind.
Regular updates and improvements ensure that users can continue to benefit from the system
without the need for frequent replacements, reducing electronic waste and promoting a more
sustainable product lifecycle.
Dataset Description and Preprocessing
The dataset for the Air Canvas project is collected using the MediaPipe framework, which
provides robust hand tracking and gesture recognition capabilities. Video recordings capture
users interacting with the virtual pen system, with MediaPipe extracting key hand landmarks and
gesture information.
● Hand Landmark Extraction:
MediaPipe detects and annotates 21 key hand landmarks, including fingertips, knuckles, and the
palm. These landmarks serve as the basis for training the hand tracking and gesture recognition
models. Each frame in the dataset is enriched with the spatial coordinates of these landmarks.
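A minimal sketch of how each frame's landmarks might be flattened into such a dataset row is shown below. The function name and the 63-element layout (21 landmarks × x, y, z) are illustrative, and the commented usage assumes the tracking loop sketched earlier.

```python
# Sketch: flatten one MediaPipe hand result into a dataset row.
import numpy as np

def landmarks_to_row(hand_landmarks):
    """Flatten MediaPipe's 21 (x, y, z) landmarks into a 63-element vector."""
    return np.array([[lm.x, lm.y, lm.z] for lm in hand_landmarks.landmark],
                    dtype=np.float32).reshape(-1)

# Inside the capture loop from the earlier sketch (illustrative usage):
# if results.multi_hand_landmarks:
#     dataset.append(landmarks_to_row(results.multi_hand_landmarks[0]))
```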
● Gesture Annotation:
Ground truth annotations involve labelling specific gestures and movements performed by users.
This annotation process is crucial for training the model to recognize a diverse set of gestures,
such as drawing, pointing, and various artistic expressions.
● Gesture Diversity:
The dataset intentionally incorporates a variety of hand poses and movements to ensure the
model's adaptability to different artistic activities. Users are encouraged to perform gestures that
span the entire range of hand movements supported by MediaPipe.
● Environmental Variation:
The dataset includes recordings under different environmental conditions, mirroring the
real-world scenarios where the virtual pen system might be utilised. Variations in lighting,
background, and camera angles are considered to enhance the model's robustness.
● MediaPipe User Diversity:
To address diversity in hand shapes and sizes, the dataset includes samples from users with
varying demographics. This diversity ensures that the hand tracking and gesture recognition
models are trained to accommodate different user profiles effectively.
● Data Augmentation:
Data augmentation techniques, such as rotations, flips, and scaling, are applied specifically to the
extracted hand landmark data from MediaPipe. Augmentation increases the dataset's size and
introduces variability, improving the model's ability to generalise across different hand poses and
orientations.
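A minimal sketch of such landmark-level augmentation follows, assuming (x, y) landmark arrays; the rotation angle and scale factor shown are illustrative values, not the project's confirmed augmentation parameters.

```python
# Sketch: simple geometric augmentations on (21, 2) landmark arrays.
# The angle and scale values are illustrative choices.
import numpy as np

def augment(landmarks_xy, angle_deg=10.0, scale=1.1, flip=True):
    """Rotate, scale, and optionally mirror (x, y) hand landmarks."""
    centre = landmarks_xy.mean(axis=0)
    pts = landmarks_xy - centre                      # centre on the hand
    theta = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    pts = (pts @ rot.T) * scale                      # rotate, then scale
    if flip:
        pts[:, 0] = -pts[:, 0]                       # mirror left/right
    return pts + centre

hand = np.random.rand(21, 2)                         # placeholder landmarks
augmented = augment(hand)
```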
● Temporal Sequencing:
Given the dynamic nature of artistic expression, temporal sequencing is considered during
preprocessing. MediaPipe's hand landmarks over consecutive frames are used to create temporal
sequences, allowing the model to capture the flow and context of hand movements over time.
● Normalisation and Standardisation:
The spatial coordinates of hand landmarks extracted by MediaPipe are normalised and
standardised. This preprocessing step ensures consistency in scale and distribution, facilitating
convergence during model training and improving the model's generalisation capabilities.
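The sketch below shows one plausible form of this preprocessing: wrist-centred normalisation (landmark index 0 is the wrist in MediaPipe's hand model) followed by stacking consecutive frames into fixed-length temporal windows. The window length is an illustrative choice.

```python
# Sketch: wrist-centred normalisation and fixed-length temporal windows.
# The window size is an illustrative choice.
import numpy as np

def normalise(landmarks_xy):
    """Centre (21, 2) landmarks on the wrist and scale by the hand's extent."""
    centred = landmarks_xy - landmarks_xy[0]         # wrist at the origin
    scale = np.linalg.norm(centred, axis=1).max() or 1.0
    return centred / scale

def to_sequences(frames, window=16):
    """Stack consecutive normalised frames into overlapping windows."""
    frames = [normalise(f) for f in frames]
    return [np.stack(frames[i:i + window])           # shape: (window, 21, 2)
            for i in range(len(frames) - window + 1)]
```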
Model Architecture
7.1 Hyperparameters
Prototype and Experimental results
Hand-tracking:
The hand-tracking algorithm used in the Air Canvas project relies on the MediaPipe framework,
a robust library developed by Google that provides real-time hand tracking and pose estimation.
MediaPipe employs a machine learning-based approach to detect and track the landmarks of the human hand in images or video frames. Its underlying convolutional neural network (CNN) is trained on a vast dataset of annotated hand images, learning to recognize patterns and features indicative of hand presence and configuration. The algorithm then identifies and localises 21 key landmarks on the hand, including the fingertips, knuckles, and the palm's centre; these landmarks serve as spatial references for tracking hand movements. The entire hand-tracking process is optimised for real-time performance, making it suitable for interactive applications like Air Canvas. A sketch of how raised fingers can be derived from these landmarks follows.
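A minimal sketch, assuming MediaPipe's standard landmark indexing (fingertips at indices 8, 12, 16, and 20; the corresponding middle PIP joints at 6, 10, 14, and 18), of how raised fingers can be inferred from the landmark positions:

```python
# Sketch: decide which fingers are raised from MediaPipe's 21 landmarks.
# Indices follow MediaPipe's hand model: tips at 8/12/16/20, PIP joints at 6/10/14/18.
TIPS = [8, 12, 16, 20]   # index, middle, ring, pinky fingertips
PIPS = [6, 10, 14, 18]   # corresponding middle (PIP) joints

def fingers_up(hand_landmarks):
    """Return booleans for index..pinky; a tip above its PIP joint counts as up.
    Image y grows downward, so 'above' means a smaller y value."""
    lm = hand_landmarks.landmark
    return [lm[tip].y < lm[pip].y for tip, pip in zip(TIPS, PIPS)]
```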
Air Canvas:
HTML provides the structure and content of the webpage. It deals with the front end part of the
Air Canvas webpage.
Cascading Style Sheets (CSS) is used to style and layout the HTML elements.
Flutter is integrated into the HTML page by referencing the necessary Flutter JavaScript files, which contain the Flutter app's logic, including the initialization and configuration of the Air Canvas application.
A JavaScript function is triggered when the "Run Air Canvas" button is clicked; it initiates the Flutter app, launching the Air Canvas virtual pen system.
In summary, the HTML file structures the webpage, CSS styles the elements, and Flutter is
integrated to handle the dynamic and interactive aspects of the Air Canvas application. The
combination of these technologies allows for a seamless and visually appealing user interface for
initiating and interacting with the Air Canvas virtual pen system.
[Figure: the Air Canvas interface, with the virtual whiteboard on the left and the live hand-tracking camera view on the right.]
The virtual whiteboard occupies the left-hand side of the interface. On the right, hand tracking runs on the camera feed; the tracked movement is processed and reflected onto the canvas. The user can hover over the colour swatches with two fingers to change the pen colour, and hovering over the "clear" region wipes the canvas.
The user draws with the index finger. The choice of drawing finger is configurable, but the index finger is the default for ease of use. A sketch of this mode logic follows.
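The sketch below is a hedged illustration of this interaction logic: the index finger alone draws, while index and middle fingers together hover and select. The canvas buffer, toolbar layout, colours, and button regions are illustrative assumptions, and `fingers_up` refers to the helper sketched earlier.

```python
# Sketch of the interaction logic: index finger draws, two fingers select.
# The canvas buffer, colours, and button regions are illustrative assumptions.
import numpy as np
import cv2

canvas = np.zeros((480, 640, 3), dtype=np.uint8)   # virtual whiteboard
colour, prev = (255, 0, 255), None                 # current pen colour, last point

def update(frame_w, frame_h, index_tip, up):
    """`index_tip` is normalised landmark 8; `up` comes from fingers_up()."""
    global colour, prev
    x, y = int(index_tip.x * frame_w), int(index_tip.y * frame_h)
    if up[0] and up[1]:          # index + middle raised: selection mode
        prev = None              # lift the pen while hovering
        if y < 60:               # illustrative top toolbar region
            if x < 160:
                canvas[:] = 0    # "clear" button wipes the board
            else:
                colour = (0, 0, 255) if x < 400 else (255, 0, 0)
    elif up[0]:                  # index only: draw mode
        if prev is not None:
            cv2.line(canvas, prev, (x, y), colour, 4)
        prev = (x, y)
    else:
        prev = None              # no drawing finger raised
```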
Performance Analysis:
Performance analysis in Air Canvas involves evaluating key metrics such as latency, frames per
second (FPS), CPU usage, and memory usage to ensure a smooth and responsive user
experience. A graphical representation of these metrics is displayed once the user quits Air Canvas.
● Latency:
Latency in Air Canvas is minimised through efficient algorithms, real-time hand tracking, and
optimised rendering. Techniques such as predictive modelling may be employed to anticipate
user actions, reducing perceived latency.
● Frames Per Second (FPS):
Monitoring FPS is crucial for assessing the system's responsiveness. Techniques like hardware
acceleration and optimised rendering pipelines are implemented to achieve a high and consistent
FPS, ensuring a visually fluid experience.
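One simple way such latency and FPS measurements could be taken inside the capture loop is sketched below; the exponential smoothing factor is an illustrative choice, not the project's confirmed instrumentation.

```python
# Sketch: measure per-frame latency and smoothed FPS inside the capture loop.
# The smoothing factor `alpha` is an illustrative choice.
import time

fps, alpha = 0.0, 0.9
t_prev = time.perf_counter()

def tick():
    """Call once per frame; returns (frame latency in ms, smoothed FPS)."""
    global fps, t_prev
    now = time.perf_counter()
    dt = now - t_prev
    t_prev = now
    fps = alpha * fps + (1 - alpha) * (1.0 / dt if dt > 0 else 0.0)
    return dt * 1000.0, fps
```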
● CPU Usage:
To maintain optimal CPU usage, the algorithms for hand tracking and gesture recognition are
designed to be computationally efficient. Additionally, background processes and unnecessary
computations are minimised to prevent undue strain on the CPU.
● Memory Usage:
Effective memory management practices, such as efficient data structures and resource cleanup,
are employed to minimise memory usage. This ensures that the application remains lightweight
and responsive, even during prolonged usage.
Conclusions and Future Scope
The Air Canvas project represents a significant advancement in the realm of virtual pen systems,
leveraging the powerful combination of MediaPipe and OpenCV frameworks. The integration of
these technologies has enabled the creation of a responsive, intuitive, and versatile platform that
transcends traditional artistic boundaries. Through meticulous dataset collection, preprocessing,
and algorithm development, the project has achieved robust hand tracking and gesture
recognition capabilities, providing users with a unique and immersive digital drawing
experience. The user-centric design approach, considering factors such as inclusivity,
ergonomics, and diverse application scenarios, ensures that Air Canvas goes beyond being a
mere artistic tool. Its potential applications in education, healthcare, and collaborative design
open up new avenues for exploration and innovation.
Future Scope:
● Enhanced Gesture Recognition:
Future iterations of Air Canvas could focus on enhancing gesture recognition capabilities,
allowing users to perform a broader range of intricate and nuanced gestures. This could include
recognizing specific symbols or hand poses for more precise and detailed digital artwork.
● Integration with AR/VR:
Exploring integration with emerging technologies, such as augmented reality (AR) or virtual
reality (VR), could elevate Air Canvas to new heights. Immersive environments could provide
users with an even more engaging and interactive digital canvas.
● Collaborative Features:
Implementing collaborative features could enable multiple users to create art together in
real-time. This collaborative aspect could extend to virtual classrooms, enabling educators and
students to interact dynamically during lessons.
● Educational Modules:
Further development of educational modules within Air Canvas could provide users with guided
tutorials, interactive lessons, and skill-building exercises. This could position Air Canvas as a
valuable tool for both art education and skill development in various domains.
● Accessibility Features:
Future versions of Air Canvas could incorporate accessibility features to cater to users with
diverse abilities. This may include voice commands, adaptive interfaces, or gesture
customization, ensuring a more inclusive user experience.
● Cloud-Based Collaboration:
Exploring cloud-based collaboration features would enable users to access their artistic creations
from multiple devices seamlessly. This would enhance the flexibility and convenience of using
Air Canvas across different platforms.
● Community and Content Sharing:
Creating a platform for users to share, showcase, and collaborate on their creations could foster a
vibrant community around Air Canvas. This user-generated content platform could serve as a
hub for creativity, inspiration, and collaboration.
References
1. S. Guennouni, A. Ahaitouf, and A. Mansouri, "Multiple object detection using OpenCV on an embedded platform," 2014 Third IEEE International Colloquium in Information Science and Technology (CIST), 2014, pp. 374-377.
2. Chandan G., Ayush Jain, Harsh Jain, and Mohana, "Real Time Object Detection and Tracking Using Deep Learning and OpenCV," 2018 International Conference on Inventive Research in Computing Applications (ICIRCA), 2018, pp. 1305-1308.
3. Y. Huang, X. Liu, X. Zhang, and L. Jin, "A Pointing Gesture Based Egocentric Interaction System: Dataset, Approach, and Application," 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Las Vegas, NV, 2016, pp. 370-377.
4. P. Ramasamy, G. Prabhu, and R. Srinivasan, "An economical air writing system converting finger movements to text using a web camera," 2016 International Conference on Recent Trends in Information Technology (ICRTIT), Chennai, 2016, pp. 1-6.
5. Fan Zhang, Valentin Bazarevsky, Andrey Vakunov, Andrei Tkachenka, George Sung, Chuo-Ling Chang, et al., "MediaPipe Hands: On-device Real-time Hand Tracking," 18 June 2020.
6. Prof. S. U. Saoji, Nishtha Dua, Akash Kumar Choudhary, and Bharat Phogat, "Air canvas application using OpenCV and numpy in python," International Research Journal of Engineering and Technology (IRJET), Volume 08, Issue 08, e-ISSN: 2395-0056, p-ISSN: 2395-0072, Aug 2021.
7. Niharika M., Neha J., Mamatha Rao, and Vidyashree K. P., "Virtual Paint Application Using Hand Gestures," International Research Journal of Engineering and Technology (IRJET), 09(04), 2022, pp. 3090-3093.
8. A. Haria, A. Subramanian, N. Asokkumar, S. Poddar, and J. S. Nayak, "Hand Gesture Recognition for Human-Computer Interaction," Procedia Computer Science, 115, 2017, pp. 367-374.
9. V. Gajjar, V. Mavani, and A. Gurnani, "Hand gesture real time paint tool-box: Machine learning approach," 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI), 2017, pp. 856-860, DOI: 10.1109/ICPCSI.2017.8391833.