
STEREOPILOT: A WEARABLE TARGET LOCATION SYSTEM

A SEMINAR REPORT

submitted by

HARIPRIYA K S
(KME20CS029)

to
the APJ Abdul Kalam Technological University
in partial fulfillment of the requirements for the award of the Degree
of
Bachelor of Technology
in
Computer Science & Engineering

Department of Computer Science & Engineering


KMEA Engineering College Edathala, Aluva
683 561
December 2023
DECLARATION

I, the undersigned, hereby declare that the seminar report “StereoPilot: A Wearable Target Location System”, submitted in partial fulfilment of the requirements for the award of the Degree of Bachelor of Technology of the APJ Abdul Kalam Technological University, Kerala, is a bonafide academic document prepared under the supervision of Ms. Abeera V P, Assistant Professor, KMEA, Cochin.
I have not submitted the matter presented in this seminar report anywhere for the award of any other Degree.

Signature of student : ......................................


Name of student : HARIPRIYA K S

Place : ..........................
Date : ..........................

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
KMEA Engineering College Edathala, Aluva
683 561

CERTIFICATE

This is to certify that the report entitled StereoPilot: A Wearable Target Location System submitted by HARIPRIYA K S to the APJ Abdul Kalam Technological University in partial fulfillment of the requirements for the award of the Degree of Bachelor of Technology in Computer Science & Engineering is a bonafide record of the seminar work carried out by her under our guidance and supervision. This report in any form has not been submitted to any other University or Institute for any purpose.

Seminar Guide
Name : Ms. Abeera V P
Signature : .......................

Seminar Coordinator
Name : Ms. Vidya Hari
Signature : .......................

Head of Department
Name : Dr. Rekha Lakshmanan
Signature : .......................
ACKNOWLEDGMENT

First and foremost, I would like to express my thanks to the Almighty for the divine grace bestowed on me to complete this seminar successfully on time.
I would like to thank our respected Principal Dr. Amar Nishad T. M, the leading
light of our institution and Dr. Rekha Lakshmanan, Vice Principal and Head of
Department of Computer Science and Engineering for her suggestions and support
throughout my seminar. I also take this opportunity to express my profound grat-
itude and deep regards to my Seminar Coordinator Ms. Vidya Hari, for all her
effort, time and patience in helping me to complete the seminar successfully with
all her suggestions and ideas. A big thanks also goes to my Seminar Guide, Ms. Abeera V P of the Department of Computer Science & Engineering, for guiding me to the successful completion of the seminar. I also express my gratitude to all teachers for
their cooperation. I gratefully thank the lab staff of the Department of Computer
Science and Engineering for their kind cooperation. Once again I convey my grati-
tude to all those who had direct or indirect influence on my seminar.

HARIPRIYA K S
B. Tech. (Computer Science & Engineering)
Department of Computer Science & Engineering
KMEA Engineering College Edathala, Aluva

ABSTRACT

This paper introduces StereoPilot, a groundbreaking wearable target location system designed specifically for individuals who are blind or visually impaired.
The system leverages spatial audio rendering and computer vision to provide cru-
cial navigational cues, thereby enhancing spatial cognition and improving the over-
all navigation experience for users. By utilizing a head-mounted RGB-D camera,
StereoPilot is able to capture and process 3D spatial information, which is then
translated into auditory cues to assist users in perceiving and interacting with their
surroundings. The primary objective of this innovative system is to address the chal-
lenges faced by individuals with visual impairments when navigating unfamiliar or
dynamic environments.
The development of StereoPilot represents a significant advancement in as-
sistive technology, particularly in the realm of spatial perception and target location
for individuals with visual impairments. By integrating computer vision and spatial
audio rendering, the system offers a comprehensive solution to the complex nav-
igational needs of its users. Through a series of experiments and evaluations, the
paper demonstrates the system’s ability to enhance information transfer rate and
reduce positioning error during spatial navigation tasks. These findings underscore
the potential of StereoPilot to significantly improve the mobility and independence
of individuals with visual impairments in real-world scenarios.
The core functionality of StereoPilot revolves around the seamless integra-
tion of spatial perception, computer vision, and auditory feedback. The system’s
reliance on a head-mounted RGB-D camera enables it to capture detailed 3D spa-
tial information, which serves as the foundation for generating precise navigation
cues. These cues are then delivered to the user through spatial audio rendering, ef-
fectively providing them with essential environmental information in a non-visual

ii
format. By leveraging auditory feedback, StereoPilot empowers individuals with
visual impairments to perceive and interpret their surroundings with enhanced ac-
curacy and efficiency, thereby facilitating independent navigation and interaction
within various environments.
The paper extensively discusses the technical underpinnings of StereoPilot,
emphasizing its utilization of spatial audio rendering and computer vision-based
spatial perception. It delves into the comparative analysis of different feedback
methods employed by the system and evaluates their impact on information trans-
fer rate, positioning accuracy, and overall usability. Through rigorous experimenta-
tion, the paper substantiates the system’s efficacy in improving information transfer
efficiency when compared to alternative feedback methods, thereby highlighting
its potential to serve as a valuable tool for individuals with visual impairments in
spatial tasks and navigation.

CONTENTS

ACKNOWLEDGMENT i

ABSTRACT ii

LIST OF FIGURES iv

ABBREVIATIONS vi

Chapter 1. INTRODUCTION 1

Chapter 2. LITERATURE SURVEY 3

Chapter 3. METHODOLOGY 15
3.1 System Framework . . . . . . . . . . . . . . . . . . . . . . 15
3.2 Feasibility and Positioning Accuracy Testing . . . . . . . . 18
3.3 Comparison of Feedback Methods . . . . . . . . . . . . . . 19
3.4 Desktop Manipulation Experiment . . . . . . . . . . . . . . 21
3.5 ITR Evaluation Experiment . . . . . . . . . . . . . . . . . . 24

Chapter 4. ADVANTAGES 26

Chapter 5. CHALLENGES 27

Chapter 6. RESULTS AND DISCUSSION 29

Chapter 7. CONCLUSION 31

LIST OF FIGURES

3.1 The design concept of Stereopilot . . . . . . . . . . . . . . . . . . 16


3.2 The system framework . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3 The typical workflow of StereoPilot . . . . . . . . . . . . . . . . . 17
3.4 3D coordinate positioning accuracy of the RGB-D camera . . . . . 19
3.5 Scatter plot based on MT and ID and the linear regression curve.
For simplicity, only a portion of the sample points are shown . . . . 20
3.6 The experimental setup of the desktop manipulation experiment . . 22
3.7 (a) Success rate and (b) completion time in desktop manipulation
experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

ABBREVIATIONS

Abbreviation Expansion
SAR Spatial Audio Rendering
ITR Information Transfer Rate
RGB-D Red, Green, Blue plus Depth (camera)
UCS User Coordinate System
HRTF Head-Related Transfer Function
BVI Blind and Visually Impaired
VI Voice Instruction feedback
VB Vibrotactile feedback
NS Non-speech Sonification feedback
ID Index of Difficulty
MT Movement Time
CNN Convolutional Neural Network
BLE Bluetooth Low Energy
GPS Global Positioning System
WHO World Health Organization

Chapter 1

INTRODUCTION

This paper introduces StereoPilot, a groundbreaking wearable target location system designed specifically for individuals who are blind or visually impaired.
The system leverages spatial audio rendering and computer vision to provide cru-
cial navigational cues, thereby enhancing spatial cognition and improving the over-
all navigation experience for users. By utilizing a head-mounted RGB-D camera,
StereoPilot is able to capture and process 3D spatial information, which is then
translated into auditory cues to assist users in perceiving and interacting with their
surroundings. The primary objective of this innovative system is to address the chal-
lenges faced by individuals with visual impairments when navigating unfamiliar or
dynamic environments.
The development of StereoPilot represents a significant advancement in as-
sistive technology, particularly in the realm of spatial perception and target location
for individuals with visual impairments. By integrating computer vision and spatial
audio rendering, the system offers a comprehensive solution to the complex nav-
igational needs of its users. Through a series of experiments and evaluations, the
paper demonstrates the system’s ability to enhance information transfer rate and
reduce positioning error during spatial navigation tasks. These findings underscore
the potential of StereoPilot to significantly improve the mobility and independence
of individuals with visual impairments in real-world scenarios.
The core functionality of StereoPilot revolves around the seamless integra-
tion of spatial perception, computer vision, and auditory feedback. The system’s
reliance on a head-mounted RGB-D camera enables it to capture detailed 3D spa-
tial information, which serves as the foundation for generating precise navigation
cues. These cues are then delivered to the user through spatial audio rendering, ef-
fectively providing them with essential environmental information in a non-visual
format. By leveraging auditory feedback, StereoPilot empowers individuals with
visual impairments to perceive and interpret their surroundings with enhanced ac-
curacy and efficiency, thereby facilitating independent navigation and interaction
within various environments.
The paper extensively discusses the technical underpinnings of StereoPilot,
emphasizing its utilization of spatial audio rendering and computer vision-based
spatial perception. It delves into the comparative analysis of different feedback
methods employed by the system and evaluates their impact on information trans-
fer rate, positioning accuracy, and overall usability. Through rigorous experimenta-
tion, the paper substantiates the system’s efficacy in improving information transfer
efficiency when compared to alternative feedback methods, thereby highlighting
its potential to serve as a valuable tool for individuals with visual impairments in
spatial tasks and navigation.
Furthermore, the paper situates StereoPilot within the broader landscape of
assistive technology for individuals with visual impairments.

Chapter 2

LITERATURE SURVEY

Shabnam Mohamed Aslam et al., (2020) This paper describes a research ini-
tiative aimed at addressing the challenges visually impaired individuals face when
interacting with touch screen devices. Despite various innovative interaction meth-
ods like virtual keyboards, three-dimensional gestures, and RFID sensing, visually
impaired individuals encounter navigation difficulties on touch screens. The pri-
mary objective here is to develop a Braille-based interface for touch screen smart-
phones to facilitate easier access for visually impaired users. This initiative utilizes
Braille codes as the foundation for communication, enabling individuals with visual
impairments to comfortably interact with touch screens. The process involves opti-
mizing hand finger motions as input parameters, such as coordinate values on x and
y axes, swipe speed and distance, pixel rate, and X and Y axis speeds. To enhance
system performance, the researchers employ a technique that involves varying hid-
den layers and neurons using the Crow Search Algorithm (CSO) in Artificial Neu-
ral Networks (ANN). This approach aims to determine the Optimal Hidden Layer
and Neuron (OHLN) configuration for accurately predicting the intended gestures,
providing a solution for visually impaired individuals to effectively communicate
through hand signals with others. The proposed model is anticipated to offer high
precision and optimal performance metrics compared to existing models. It ad-
dresses the limitations faced by visually impaired individuals in using information
devices such as keyboards, smartphones, and other tech gadgets. The advancement
in information technology has notably improved Braille reading and writing, mak-
ing it more accessible for visually impaired individuals to interpret various materials
like bank statements, transport tickets, maps, and music notes. Mobile phones play
a pivotal role not only in the lives of the general population but also in the lives
of differently-abled individuals. However, communication remains a challenge for
the visually impaired. Consequently, this research aims to bridge this gap by estab-
lishing an interaction framework between visually impaired individuals and mobile
devices, freeing users from the challenges of accessing smartphones for various ac-
tivities. The research focuses on developing a Braille-based system for touch screen
mobiles, predicting different gestures through instructions. The primary contribu-
tion lies in optimizing the interaction framework using an Artificial Neural Network
(ANN) by adjusting hidden layers and neurons. The subsequent sections of the pa-
per outline a literature review, methodology, simulation results, and conclude by
discussing future scopes of this innovation. Overall, the research aims to signifi-
cantly enhance accessibility and communication capabilities for visually impaired
individuals using touch screen technology.
Mandhatya Singh et al., (2022) This paper delineates a comprehensive re-
search endeavor addressing the longstanding issue faced by blind and visually im-
paired individuals (BVIP) in recognizing Indian paper currency denominations. In
countries like India, where currency notes exhibit minimal size variations and lack
distinct tactile attributes, visually impaired individuals encounter challenges in dis-
tinguishing between different denominations. To mitigate this issue, this paper in-
troduces an innovative framework named IPCRNet—a lightweight neural network
designed for low/medium-level smartphones. IPCRNet employs Dense connection,
Multi-Dilation, and Depth-wise separable convolution layers to enhance recogni-
tion accuracy, aiming to assist BVIP in identifying Indian currency notes accurately.
The research team curated an extensive dataset, IPCD, comprising over 50,000 im-
ages representing various real-life scenarios of Indian paper currency. Addition-
ally, they developed an Android application called ’Roshni-Currency recognizer’
tailored specifically for BVIP, providing voice-based guidance and denomination
information, enabling hassle-free currency recognition. Recognizing the limita-
tions of existing models in resource-constrained environments, the research focuses
on IPCRNet’s lightweight design—less than four million parameters—making it
highly deployable on mobile devices. This model integrates MobileNet as the front-
end and employs a Contextual Block in the backend to optimize computations while
maintaining accuracy. The innovative multi-dilation scheme expands the network’s
receptive field without inflating the parameters, effectively integrating global and
semantic features for improved accuracy. To facilitate effective training and evalu-
ation of IPCRNet, the researchers conducted comprehensive quantitative and qual-
itative analyses using multiple publicly available datasets. Furthermore, they em-
phasized the importance of their BVIP-friendly android app, ’Roshni,’ which offers
a user-friendly interface and aids in real-time recognition of currency denomina-
tions. The distinctive contributions of this research lie in its novel lightweight CNN
model, the vast and diverse dataset of Indian currency images, thorough quanti-
tative and qualitative analyses, and the publicly available BVIP-oriented android
application, ’Roshni.’ These elements collectively form the proposed end-to-end
Indian paper currency recognition framework (IPCRF), offering a solution to ad-
dress the challenges faced by BVIP in recognizing currency notes. The paper’s
structure includes sections detailing the literature review on currency recognition,
the creation and characteristics of IPCD, the architecture and implementation de-
tails of IPCRNet, experimental setups and results, the development of the ’Roshni’
android application, discussions on accuracy and reliability, and concludes by out-
lining future research directions. Overall, the research provides a comprehensive
solution that amalgamates advanced technology with user-friendly applications to
aid visually impaired individuals in recognizing Indian paper currency.
Salvador Martinez-Cruz et al., The navigation challenges faced by visually
impaired and blind people (VIBP) in locating public transport and bus stops due to
their vision limitations have prompted the development of various assistance sys-
tems over the past decade. However, most existing solutions rely on the global
positioning system (GPS), which encounters issues with satellite coverage, partic-
ularly in indoor environments. Moreover, some prototypes designed to aid VIBP
in navigation tend to be cumbersome for the user, affecting their mobility and in-
dependence. Addressing these challenges, a novel assistance system for VIBP uti-
lizing Bluetooth Low Energy (BLE) technology has been introduced in this paper
to facilitate the use of public transportation. This innovative system integrates BLE
beacons installed on buses and bus stops, coupled with a mobile application for
seamless user interaction. The BLE beacons serve as location markers, tracked in
real-time by the mobile app, which subsequently provides pertinent information
to users through verbal instructions. Crucially, this includes details such as trans-
portation line, destination, next stop name, and current location, empowering users
to proactively select the desired bus in advance and alight at the correct destina-
tion stop. The effectiveness of this system has been rigorously tested in controlled
settings and real-world environments, demonstrating an impressive 97.6% success
rate for VIBP traveling independently between points. Participants reported en-
hanced confidence and independence compared to GPS-based systems, citing sev-
eral key advantages. Firstly, the system operates seamlessly with or without an
internet connection, addressing a critical limitation of GPS-based solutions. Sec-
ondly, it offers real-time information without the encumbrance of wearable devices,
alleviating concerns about impeding natural movements. Notably, the BLE-based
system does not encounter satellite coverage issues indoors, a significant advantage
over GPS systems, ensuring reliable functionality regardless of the environment. In
the broader context of public transportation management systems (PTMS), which
commonly provide data on arrival/departure times through digital screens at bus
stops—information inaccessible to VIBP—the introduction of this BLE-based sys-
tem represents a significant step towards inclusivity. By leveraging technology that
bypasses the limitations of GPS and addresses indoor coverage challenges, this
innovative system empowers visually impaired individuals to navigate public trans-
port confidently and independently. The positive feedback from participants under-
scores the system’s efficacy in enhancing user experience, bolstering their sense of
security and comfort. Ultimately, this BLE-based assistance system not only fills
a critical gap in accessibility for VIBP but also sets a benchmark for inclusive and
user-friendly solutions in the realm of public transportation navigation.
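
A common building block in beacon-based assistance apps of this kind (though not necessarily the exact method used in the system above) is estimating the user's distance from a beacon via the received signal strength (RSSI) and the log-distance path-loss model. The short Python sketch below is illustrative only; the calibrated 1 m power and the path-loss exponent are typical placeholder values, not figures from the paper.

```python
def beacon_distance_m(rssi_dbm, tx_power_dbm=-59.0, path_loss_exponent=2.0):
    """Rough distance estimate (in metres) to a BLE beacon from its RSSI,
    using the log-distance path-loss model. tx_power_dbm is the calibrated
    RSSI at 1 m; both defaults are illustrative assumptions."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))

# Example: an RSSI of -75 dBm with these defaults suggests roughly 6 m
print(round(beacon_distance_m(-75.0), 1))
```
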
Wafa M. et al., (2018) The paper presents a comprehensive overview of the
challenges faced by visually impaired (VI) individuals and the limitations of exist-
ing systems designed to aid their mobility. It introduces an intelligent framework
aimed at significantly improving the quality of life for the VI population by offering
a novel solution that integrates sensor-based and computer vision-based technolo-
gies. The objective is to create a cost-effective and accurate system that enhances
navigation and obstacle avoidance for VI individuals, particularly considering the
high prevalence of VI individuals in developing countries. The statistics highlighted
from the World Health Organization (WHO) underscore the magnitude of visual
impairment globally, emphasizing the urgency to address this issue. The challenges
faced by VI individuals in navigating their surroundings, detecting obstacles (both
static and dynamic), and ensuring safe mobility are discussed. Traditional aids
like white canes and guide dogs are acknowledged but deemed limited in provid-
ing comprehensive real-time information about the environment, especially con-
cerning head-level barriers, and their availability and affordability pose additional
challenges. The limitations of existing electronic devices aimed at aiding VI indi-
viduals, such as ultrasonic obstacle detection glasses, laser canes, and smartphone
applications, are outlined. These systems are noted for their high cost and restricted
functionalities, often falling short in providing a complete solution for VI indi-
viduals, particularly those from low-income backgrounds. The paper proposes an
innovative framework that integrates computer vision technology and sensor-based
solutions to address these limitations. It emphasizes the novel approach of using
image depth for proximity measurement, enhancing the system’s ability to detect
and avoid obstacles while providing real-time navigational guidance. The integra-
tion of multiple sensor data through a data fusion algorithm aims to improve the
system’s accuracy and performance. Real-time scenario testing has demonstrated
the system’s effectiveness, achieving high accuracy rates in obstacle detection and
avoidance while providing auditory warnings to users. This system is intended to
assist VI individuals in their daily navigation, offering more comprehensive support
than traditional aids and existing electronic devices. Overall, this paragraph empha-
sizes the need for an efficient and inclusive navigation assistant for VI individuals,
discussing the limitations of current solutions and proposing a novel framework that
integrates sensor technologies and computer vision to provide enhanced real-time
assistance and navigation for the visually impaired.
Sreenu Ponnada et al., (2018) This paper outlines a comprehensive prototype
designed to aid visually impaired individuals in recognizing and navigating obsta-
cles like staircases and manholes. Understanding an object is essential for individu-
als to categorize it correctly. However, this becomes challenging for blind individu-
als. Therefore, this prototype utilizes a combination of feature vector identification
and sensor-computed Arduino chips to empower visually impaired individuals with
more independence while traversing roads. The primary objective of this prototype
is to enhance the autonomy of the visually impaired by helping them recognize and
navigate obstacles through a lightweight stick integrated with technology. To de-
tect manholes and staircases, the chip embedded in the stick is programmed using
specific algorithms. For manhole detection, a code is embedded in the stick’s chip,
utilizing a bivariate Gaussian mixture model. Meanwhile, for staircase detection,
the system employs the speeded up robust features (SURF) algorithm for feature
extraction. Navigation in unfamiliar surroundings poses a significant challenge for
visually impaired individuals due to their visual impairment. In India alone, about
1.5 million people face these challenges, and globally, around 170 million indi-
viduals are visually impaired, with this number increasing by approximately 10%
annually. Staircases are a major concern in navigation for the visually impaired.
Various sensors, such as monocular and stereo cameras, depth sensors, and laser
scanning devices like LiDAR, have been used to detect staircases. Image-based
methods often identify staircases by recognizing non-ground plane regions and the
concurrent line patterns resembling staircases within those regions. Moreover, the
detection of open manholes, a critical risk in the Indian context, has been addressed
using ultrasonic sensors. Several systems, such as the Smart Cane and the Ultra-
Cane/Batcane, have relied on a white cane integrated with a single sonar sensor to
detect above-knee obstacles. This paper proposes a hybrid approach utilizing both
sensor and image-based algorithms to detect and classify upward and downward
staircases, employing an array of sonar sensors mounted on a white cane managed
by an Arduino processor. The system also utilizes median-based thresholds for pre-
cise manhole identification and vibro-feedback on the cane to alert the user about
obstacles. Importantly, this entire processing occurs on a smartphone without the
need for heavy computation devices or high-speed internet connectivity for cloud
computation, making it lightweight and cost-effective. The subsequent sections of
the paper provide an overview of ultrasonic sensors, vibrator mechanisms, and Ar-
duino processors. They elaborate on the methodology for identifying manholes and
staircases, feature extraction methods, experimental results, and a summary along
with suggestions for future directions.

Yunjia Lei et al., (2022) This paper focuses on a critical aspect of assistive navigation for visually impaired individuals—pedestrian lane
detection. This task is crucial for helping visually impaired people navigate safely
through environments by providing information about walkable areas, aiding in
staying within pedestrian lanes, and assisting in obstacle detection. However, de-
spite its significance, there has been limited attention given to pedestrian lane de-
tection in unstructured scenes within the research community. The goal of this
paper is to address this gap by conducting a comprehensive review and experimen-
tal evaluation of methods applicable to pedestrian lane detection, intending to pave
the way for future research in this area. The World Health Organization (WHO)
reports that there are approximately 253 million visually impaired individuals glob-
ally, with 217 million experiencing moderate to severe impairments and 36 million
being blind. Visual impairment significantly reduces mobility and increases the risk
of accidents like falls or collisions, making navigation in unfamiliar environments
extremely challenging for the visually impaired. Presently, traditional walking aids
like white canes or guide dogs assist visually impaired individuals, but they have
limitations. White canes have short detection ranges, while guide dogs require train-
ing and are effective primarily in familiar environments. Hence, there’s a growing
need to develop advanced assistive navigation systems. Pedestrian lane detection
plays a crucial role in these systems as it allows visually impaired users to navigate
within lanes, aiding their balance and mobility. An accurate, reliable, and real-time
pedestrian lane detection algorithm can immensely enhance the safety and mobility
of visually impaired individuals. Despite the significance of pedestrian lane detec-
tion in assistive navigation, research in this domain has been lacking. This survey
paper aims to lay the groundwork for assistive navigation research by reviewing
and assessing various methods, including those used for general road detection and
semantic segmentation. The methods’ design principles and performances on a spe-
cialized pedestrian lane detection dataset serve as valuable resources for developing
new methods. The paper highlights that methods designed for vehicle road detec-
tion aren’t optimized for pedestrian lane detection due to differences in structure
and environmental considerations. Pedestrian lanes have diverse shapes and surface
textures (e.g., bricks, concrete, grass), unlike vehicle roads with clearer boundaries
and asphalt surfaces. Moreover, pedestrian lane detection encompasses both indoor
and outdoor scenes, whereas road detection primarily deals with outdoor scenarios.
Paul Mejia et al., (2021) The challenges faced by visually impaired peo-
ple (VIPs) in accessing mathematical resources pose significant obstacles, particu-
larly in pursuing degrees in science-related fields. Traditional computational tools
like Computer Algebra Systems (CAS) are not designed to be user-friendly for the
visually impaired, making even simple mathematical problem-solving a daunting
task. To address this issue, a new system called Casvi has been developed. Casvi
functions as a specialized CAS tailored for individuals with visual disabilities, en-
abling them to perform basic and advanced numerical calculations using the Max-
ima mathematical engine. The system underwent testing by 25 VIPs to evaluate
its functionality and user-friendliness. Impressively, these individuals achieved a
92% accuracy rate in executing mathematical operations using Casvi. Addition-
ally, Casvi proved to be more efficient than the LAMBDA system in terms of the
time required for VIPs to perform mathematical operations accurately. Globally,
approximately 2.2 billion people grapple with visual impairment or blindness,
a statistic that highlights the magnitude of this challenge [1]. In the United States,
the dropout rate among high school students with disabilities hovers around 40%
[2]. Moreover, only 13.7% of students with visual disabilities pursuing higher ed-
ucation manage to obtain a degree [3]. In the context of Ecuador, where a portion
of this research was conducted, the population exceeds 17 million people [4], with
481,392 individuals registered as having some form of disability, equating to an an-
nual prevalence of 2.74%. Within this group, 11.60% (55,843 people) suffer from
visual disabilities. Specifically, 2,906 students with visual impairments are studying
in primary, middle, or high school, and 1,188 are enrolled in universities or poly-
technic schools. Additionally, 147 individuals with visual disabilities are registered
in Technical and Technological Institutes. For VIPs pursuing Bachelor of Science
majors, such as engineering, the lack of accessibility in essential resources like spe-
cialized software and math textbooks severely restricts their academic and career
options. Algebraic Computational Systems (CAS) like MATLAB, Wolfram Math-
ematica, and Maxima, which are crucial tools in engineering and related fields, are
inaccessible to the visually impaired. This inaccessibility renders even basic math-
ematical operations challenging for VIPs, despite the assistance of screen readers.
Moreover, the technical complexity of documents exacerbates difficulties for vi-
sually impaired individuals, reducing their access to crucial mathematical content.
The primary barrier for visually impaired individuals in grasping mathematical se-
mantics isn’t blindness itself but rather the lack of access to mathematical content.
Bridging this gap between existing CAS tools and VIPs becomes crucial, allowing
for the writing, editing, evaluation, and solving of mathematical expressions. Ad-
ditionally, as visually impaired students increasingly integrate into regular schools,
these tools must also be accessible to teachers who may not be proficient in braille
[5]. To address these challenges, the Casvi computational algebraic system emerges
as a promising solution, providing crucial support for individuals with varying de-
grees of visual impairment in their academic journey within engineering and exact
sciences.
Amit Kumar Jaiswal, (2021) The field of healthcare has witnessed a surge in
interest due to the integration of Deep Learning and IoT, particularly in addressing
real-time health concerns. Among these, Diabetic Eye Disease stands as a leading
cause of blindness among the working-age population, notably affecting populous
Asian countries like India and China, where the prevalence of diabetes is burgeon-
ing. The escalating number of diabetic patients presents a formidable challenge for
healthcare professionals to conduct timely medical screenings and diagnoses. The
objective at hand is to harness deep learning methodologies to automate the identi-
fication of blind spots in the eye and assess the severity of this condition. The pro-
posed solution in this paper introduces an optimized technique built upon the foun-
dation of recently introduced pre-trained EfficientNet models. This approach aims
to detect blindness indicators in retinal images, culminating in a comparative analy-
sis among various neural network models. Notably, the fine-tuned EfficientNet-B5
based model, evaluated using benchmark datasets comprising retina images cap-
tured through fundus photography across diverse imaging stages, demonstrates su-
perior performance compared to CNN and ResNet50 models. The convergence of
AI and IoT in smart healthcare systems has garnered attention, offering more effi-
cient detection and management of various health conditions. Diabetes, a prevalent
chronic ailment globally, arises due to insufficient insulin production or ineffective
utilization by the body. The World Health Organization (WHO) recorded over 1.6
million deaths attributable to diabetes in 2016, emphasizing its critical impact. Di-
abetic Retinopathy (DR) emerges as a severe complication of diabetes, potentially
leading to complete blindness, affecting a substantial proportion of diabetic indi-
viduals worldwide. Approximately 25% of diabetic patients suffer from DR exclu-
sively, highlighting its complexity and impact within this demographic. Long-term
diabetes poses a significant risk of DR, a progressive disease capable of causing
partial or permanent vision impairment. Notably, the majority of those affected by
DR belong to the working-age group, a crucial segment of any country’s workforce.
India, in particular, houses a considerable diabetic population, and this number is
rapidly escalating each year. Detecting DR at its early stages remains challeng-
ing, as initial symptoms are often subtle and may go unnoticed until irreversible
retinal damage occurs or is diagnosed via medical testing. However, the identifi-
cation of DR necessitates highly skilled professionals capable of evaluating digital
color fundus photographs of the retina. Fundus images, capturing the rear part of
the human eye, undergo assessment to pinpoint lesions linked to vascular abnor-
malities caused by diabetes. Deep learning methodologies, notably Convolutional
Neural Networks (ConvNets), have emerged as a prominent approach for exten-
sive medical image processing across various healthcare applications. EfficientNet
architecture, specifically utilized in this study, showcases its efficacy in analyzing
retina images to detect DR. The scalability of ConvNets’ parameters enhances their
accuracy, especially in domains prioritizing precision, such as the medical field.
Thus, this research employs EfficientNet architecture to scrutinize retina images
and identify indicators of DR, signifying a potential breakthrough in early detec-
tion and intervention for this debilitating condition.
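
To make the transfer-learning setup described above concrete, the snippet below shows one typical way to fine-tune a pre-trained EfficientNet-B5 for retinal image grading with the Keras applications API. It is a minimal sketch under assumed settings (five severity classes, 456x456 inputs, a frozen backbone with a new classification head), not the authors' actual configuration.

```python
import tensorflow as tf

NUM_CLASSES = 5          # assumed DR severity grades 0-4
IMG_SIZE = (456, 456)    # EfficientNet-B5's default input resolution

# Pre-trained backbone without its ImageNet classification head
base = tf.keras.applications.EfficientNetB5(
    include_top=False, weights="imagenet", input_shape=IMG_SIZE + (3,))
base.trainable = False   # first stage: train only the new head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)  # fundus datasets not shown
```
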

Chapter 3

METHODOLOGY

3.1 SYSTEM FRAMEWORK

The system framework of StereoPilot is a comprehensive and innovative approach designed to assist individuals who are blind or visually impaired in perceiving and interacting with their environment. The framework integrates a wearable
visual perception module with spatial audio rendering (SAR) to provide essential
navigational cues and location assistance. The wearable visual perception mod-
ule utilizes a head-mounted RGB-D camera for object recognition and tracking,
enabling the system to identify and locate objects in the user’s environment.
The system framework operates in a closed control loop, allowing real-time
interaction between the user and the environment. The RGB-D camera captures
the environmental video stream, which is then processed by the assistant controller
for object and hand recognition. The recognition results are relayed back to the
user through voice instruction feedback, providing auditory cues for object recog-
nition. Additionally, the spatial information obtained under the user coordinate sys-
tem (UCS) is fed into a virtual environment to generate spatial audio cues, which
are then delivered to the user through stereo earphones. This closed-loop system en-
ables users to interact with their environment in real-time, facilitating independent
navigation and interaction.
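
The closed loop above can be summarized as a per-frame pipeline: recognize the target and the hand, express their positions in the user coordinate system, and convert the hand-to-target offset into an audio cue. The Python sketch below is a simplified illustration of that flow; the helper names are hypothetical, the vision module is stubbed out, and the stereo panning used here is a crude stand-in for the HRTF-based spatial audio rendering the actual system employs.

```python
import numpy as np

def recognize_objects_and_hand(frame, depth):
    """Stub for the wearable visual perception module (object + hand
    recognition); the real system builds this on MediaPipe."""
    return {}, None  # {label: 3D point in camera frame}, hand point or None

def to_user_coordinates(point_cam, head_pose):
    """Transform a camera-frame point into the user coordinate system (UCS)
    given the head pose as a rotation matrix R and translation t."""
    R, t = head_pose
    return R @ point_cam + t

def stereo_gains(offset_ucs):
    """Crude left/right loudness cue from the horizontal bearing of the
    hand-to-target offset (a placeholder for true HRTF rendering)."""
    azimuth = np.arctan2(offset_ucs[0], offset_ucs[2])   # radians, + = right (assumed axes)
    pan = np.clip(azimuth / (np.pi / 2), -1.0, 1.0)
    return (1.0 - pan) / 2.0, (1.0 + pan) / 2.0           # (left gain, right gain)

def closed_loop_step(frame, depth, head_pose, target_label):
    """One iteration of the perception-to-audio loop."""
    objects, hand_cam = recognize_objects_and_hand(frame, depth)
    if target_label not in objects or hand_cam is None:
        return None                                       # no cue this frame
    target = to_user_coordinates(objects[target_label], head_pose)
    hand = to_user_coordinates(hand_cam, head_pose)
    return stereo_gains(target - hand)
```
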
The typical workflow of StereoPilot involves the user invoking the system
through a voice trigger command, followed by the system guiding the user to select
corresponding modes through synthetic speech. The user can then specify tasks
based on the recognition results, such as locating a specific object in the environ-
ment. The system provides spatialized navigation cues of the target object relative
to the user’s hand, allowing the user to follow the auditory cues to achieve local-
ization assistance. The system primarily interacts with users through voice com-
mands, allowing users to perceive the environment without the need to operate the
assistance device manually, thus freeing their hands to focus on environmental per-
ception.

Figure 3.1: The design concept of Stereopilot

The wearable visual perception module is based on computer vision technology, utilizing the MediaPipe framework for object and hand recognition. This framework leverages machine learning solutions and deep neural networks to identify and track objects and hands in the environment. The system is designed to handle close-distance environmental perception, making it suitable for various daily life tasks and practical life skills.
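
As a rough illustration of the hand-recognition half of this module, the snippet below uses the publicly available MediaPipe Hands Python API to find the index fingertip in a single RGB frame. It is a minimal sketch rather than the authors' code, the confidence threshold is an arbitrary choice, and the object-recognition part of the module is not shown.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def index_fingertip_pixel(image_bgr):
    """Return the (u, v) pixel of the index fingertip, or None if no hand
    is detected, using MediaPipe Hands on a single image."""
    with mp_hands.Hands(static_image_mode=True,
                        max_num_hands=1,
                        min_detection_confidence=0.5) as hands:
        result = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
        if not result.multi_hand_landmarks:
            return None
        h, w = image_bgr.shape[:2]
        tip = result.multi_hand_landmarks[0].landmark[
            mp_hands.HandLandmark.INDEX_FINGER_TIP]
        return int(tip.x * w), int(tip.y * h)
```
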
In summary, the system framework of StereoPilot integrates wearable visual
perception with spatial audio rendering to provide individuals with visual impair-
ments the ability to perceive and interact with their environment in a non-visual
format.

Figure 3.2: The system framework

Figure 3.3: The typical workflow of StereoPilot

3.2 FEASIBILITY AND POSITIONING ACCURACY TESTING

The feasibility and positioning accuracy testing of the wearable target lo-
cation system, StereoPilot, for blind and visually impaired individuals involved
evaluating the system’s ability to increase information transfer rate (ITR) and re-
duce positioning error during spatial navigation tasks. The system utilizes a head-
mounted RGB-D camera to capture 3D spatial information, which is then translated
into auditory cues through spatial audio rendering to assist users in perceiving and
interacting with their surroundings.
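
For context, turning an RGB-D pixel into a 3D position is usually done by back-projecting it through the pinhole camera model with the measured depth, and the positioning accuracy reported in Figure 3.4 reflects errors in exactly this kind of measurement. The sketch below shows the standard computation; the intrinsic parameters are placeholders, not the calibration of the camera actually used.

```python
import numpy as np

# Placeholder intrinsics (focal lengths and principal point in pixels);
# real values come from the RGB-D camera's calibration.
FX, FY, CX, CY = 600.0, 600.0, 320.0, 240.0

def backproject(u, v, depth_m):
    """Back-project pixel (u, v) with depth in metres into a 3D point
    expressed in the camera coordinate frame (pinhole model)."""
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return np.array([x, y, depth_m])

# Example: a pixel slightly right of and below the image centre at 0.8 m depth
print(backproject(330, 250, 0.8))   # approx. [0.013, 0.013, 0.8]
```
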
The testing process aimed to assess the effectiveness of the system in pro-
viding accurate target location information to assist individuals with visual impair-
ments in navigating their environment. The experimental results demonstrated that
the system significantly improved information transfer efficiency and reduced po-
sitioning error compared to other feedback methods. This indicates that the system
has the potential to assist visually impaired individuals in spatial tasks, thereby en-
hancing their spatial cognition and navigation experience.
The feasibility and positioning accuracy testing involved comparing the per-
formance of the wearable target location system with three other baseline feedback
strategies based on auditory and haptic display methods. The evaluation focused
on the information transfer rate (ITR) for the spatial audio rendering (SAR) on
blind and visually impaired individuals. The results of the testing indicated that
the system’s spatial audio rendering approach outperformed the alternative feed-
back methods, highlighting its efficacy in providing essential navigational cues and
location assistance to individuals with visual impairments.
The testing process also involved assessing the system’s ability to accurately
convey spatial information and provide precise navigation cues to users. By lever-
aging spatial audio rendering, the system empowered individuals with visual im-
pairments to perceive and interpret their surroundings with enhanced accuracy and
efficiency. This not only facilitated independent navigation and interaction within
various environments but also contributed to improving the overall mobility and
independence of individuals with visual impairments in real-world scenarios.
In summary, the feasibility and positioning accuracy testing of the wearable target lo-
cation system, StereoPilot, demonstrated its effectiveness in enhancing information
transfer rate and reducing positioning error during spatial navigation tasks for blind
and visually impaired individuals. The system’s ability to provide accurate target
location information and assist individuals with visual impairments in navigating
their environment underscores its potential as an innovative and valuable tool in the
field of assistive technology for individuals with visual impairments.

Figure 3.4: 3D coordinate positioning accuracy of the RGB-D camera

3.3 COMPARISON OF FEEDBACK METHODS

The paper discusses the development and evaluation of a wearable target lo-
cation system, StereoPilot, designed for individuals with blindness or visual impair-
ment. The system utilizes spatial perception based on computer vision and target
location based on spatial audio rendering to provide essential navigational cues and
location assistance. One of the key aspects of the evaluation involved comparing
different feedback methods and assessing their impact on the information transfer
rate for individuals with visual impairments.
The comparison of feedback methods aimed to evaluate the effectiveness of
the spatial audio rendering (SAR) approach employed by the system in providing
accurate target location information to assist individuals with visual impairments in
navigating their environment. The evaluation involved testing the system’s perfor-
mance against three representative auditory and haptic display methods, including
voice instruction feedback, vibrotactile feedback, and non-speech sonification feed-
back.
The results of the evaluation indicated that the system’s spatial audio render-
ing approach outperformed the alternative feedback methods in terms of informa-
tion transfer efficiency. This finding underscores the efficacy of the spatial audio
rendering approach in providing essential navigational cues and location assistance
to individuals with visual impairments. By leveraging spatial audio rendering, the
system empowered individuals with visual impairments to perceive and interpret
their surroundings with enhanced accuracy and efficiency, thereby facilitating inde-
pendent navigation and interaction within various environments.

Figure 3.5: Scatter plot based on MT and ID and the linear regression curve. For
simplicity, only a portion of the sample points are shown

The comparison of feedback methods also highlighted the potential of the
spatial audio rendering approach to significantly improve the mobility and indepen-
dence of individuals with visual impairments in real-world scenarios. The system’s
ability to convey spatial information and provide precise navigation cues to users
through spatial audio rendering demonstrated its potential as an innovative and valu-
able tool in the field of assistive technology for individuals with visual impairments.
The comparison of feedback methods in the evaluation of the wearable target location system thus emphasized the superiority of the spatial audio rendering approach in enhancing information transfer efficiency and providing accurate target location information to assist individuals with visual impairments in navigating
their environment. This underscores the potential of the system to serve as a valu-
able tool for individuals with visual impairments in spatial tasks and navigation.

3.4 DESKTOP MANIPULATION EXPERIMENT

The desktop manipulation experiment conducted in the study aimed to verify the feasibility of the wearable target location system, StereoPilot, for accurate
prehension tasks in a real environment. The experiment was designed to assess
the technology fusion of wearable visual perception and spatial information feed-
back, focusing on the system’s ability to assist individuals with blindness or visual
impairment in completing indoor target location tasks.
The experimental setup involved a desktop manipulation task where sub-
jects were required to wear a visual perception helmet and locate a specific colored
block based on the guidance of spatial information provided by the system. The
visual perception helmet utilized computer vision technology to identify the spatial
information and color of the blocks in the real environment, and the system pro-
vided real-time spatial location information of the target object to the subjects. The
experimenters shuffled the position of the blocks before each trial, and each subject
completed 30 trials for each spatial information feedback method.

Figure 3.6: The experimental setup of the desktop manipulation experiment

The experiment aimed to quantify the information transfer efficiency of different feedback methods in a normalized human-computer interface, using Fitts’
law as a model to assess the performance of the feedback methods. The position of
the target object and its physical properties were considered significant factors af-
fecting the user’s grasping success rate. The experiment focused on the technology
fusion of wearable visual perception and spatial information feedback, particularly
for objects with visual differences that require effective technical support of com-
puter vision and spatial information feedback.
The results of the desktop manipulation experiment demonstrated that the
spatial audio rendering (SAR) method significantly improved the completion time
and success rates of the subjects in accurately completing the tasks. The SAR, along
with other auditory and haptic feedback methods, assisted the subjects in complet-
ing tasks accurately, with very few failures. In contrast, the non-speech sonification
feedback method showed significant shortcomings compared to the other methods,
leading to its exclusion from further experiments.

Figure 3.7: (a) Success rate and (b) completion time in desktop manipulation ex-
periment

Overall, the desktop manipulation experiment provided valuable insights into the system’s performance in assisting individuals with blindness or visual impairment in completing indoor target location tasks. The results underscored the potential of the spatial audio rendering approach as an effective method for pro-
viding accurate spatial information and enhancing the navigation experience for
individuals with visual impairments.
The desktop manipulation experiment verified the feasibility of the wearable
target location system, StereoPilot, for accurate prehension tasks in a real environ-
ment. The results highlighted the effectiveness of the spatial audio rendering ap-
proach and its potential to enhance the spatial perception and navigation experience
for individuals with blindness or visual impairments.

3.5 ITR EVALUATION EXPERIMENT

The ITR (Information Transfer Rate) evaluation experiments conducted in the study aimed to quantify the information feedback performance of different ap-
proaches, particularly focusing on the effectiveness of the spatial audio rendering
(SAR) method employed by the wearable target location system, StereoPilot, for
individuals with blindness or visual impairment. The experiments were designed
based on Fitts’ law, a well-established model used in human-computer interactions
and ergonomics evaluations.
The Fitts’ law test involved a computer program where participants were
required to move a cursor to a target point using different feedback information.
The ITR was calculated based on the index of difficulty (ID) and the correspond-
ing movement time (MT). The evaluation experiments aimed to objectively assess
the performance of SAR in comparison to three other baseline feedback methods,
including voice instruction feedback, vibrotactile feedback, and non-speech sonifi-
cation feedback.
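
For reference, the Shannon formulation of Fitts' index of difficulty is ID = log2(D/W + 1), where D is the distance to the target and W is its width, and a per-trial information transfer rate can be taken as ITR = ID / MT in bits per second; MT is also commonly regressed linearly on ID across trials, as in Figure 3.5. The sketch below illustrates the computation with made-up trial values, not data from the paper.

```python
import numpy as np

def index_of_difficulty(distance, width):
    """Shannon formulation of Fitts' index of difficulty, in bits."""
    return np.log2(distance / width + 1.0)

def information_transfer_rate(distance, width, movement_time_s):
    """Per-trial ITR in bits per second: ID divided by movement time."""
    return index_of_difficulty(distance, width) / movement_time_s

# Illustrative trial: target 40 cm away and 5 cm wide, reached in 2.1 s
print(index_of_difficulty(0.40, 0.05))                 # ~3.17 bits
print(information_transfer_rate(0.40, 0.05, 2.1))      # ~1.51 bit/s
```
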
The results of the ITR evaluation experiments demonstrated that SAR signif-
icantly improved the ITR for individuals with blindness or visual impairment. The
experiments showed that SAR outperformed the alternative feedback methods, in-
dicating its efficacy in providing essential navigational cues and location assistance
to individuals with visual impairments. The spatial audio rendering approach was
found to enhance information transfer efficiency, thereby empowering individuals
with visual impairments to perceive and interpret their surroundings with enhanced
accuracy and efficiency.
The experiments also involved comparing the completion time, positioning
errors, and success rates of the different feedback methods. The results indicated
that SAR greatly shortened the completion time, contributing to a smooth user ex-
perience. Additionally, SAR, along with the other auditory and haptic feedback
methods, was able to assist the subjects in completing tasks accurately, with very
few failures. In contrast, the non-speech sonification feedback method showed sig-
nificant shortcomings compared to the other methods, leading to its exclusion from
further experiments.
Overall, the ITR evaluation experiments provided valuable insights into the
performance of different feedback methods, with SAR demonstrating superior in-
formation transfer efficiency and effectiveness in providing accurate target location
information to assist individuals with visual impairments in navigating their en-
vironment. The results underscored the potential of the spatial audio rendering
approach as an innovative and valuable tool in the field of assistive technology for
individuals with visual impairments. These experiments highlighted the effectiveness
of the spatial audio rendering approach employed by StereoPilot, emphasizing its
potential to enhance the spatial perception and navigation experience for individuals
with blindness or visual impairment.

Chapter 4

ADVANTAGES

1. Innovative Solution: The paper introduces StereoPilot, a wearable target location system using spatial audio rendering, providing an innovative solution for
transmitting environmental information to blind and visually impaired individuals.
2. Enhanced Information Transfer: The experimental results demonstrate that the
system increases information transfer rate and reduces positioning error for spatial
navigation, indicating its effectiveness in providing accurate target location infor-
mation to assist individuals with visual impairment in navigating their environment.
3. Comparative Evaluation: The paper compares different feedback methods and
evaluates the information transfer rate, providing a comprehensive analysis of the
system’s performance in assisting visually impaired individuals in spatial tasks.
4. Potential Impact: The results indicate that the system has the potential to assist
visually impaired individuals in spatial tasks, highlighting its potential to enhance
spatial perception and navigation experience for individuals with blindness or visual
impairment.
5. Comprehensive Coverage: The paper situates StereoPilot within a broad body of research on assistive technology for the visually impaired, covering a range of topics including mobility aids, environmental perception, object localization, and sensory substitution, and thus provides a comprehensive overview of the field.
Chapter 5

CHALLENGES

1. Spatial Audio Rendering Deviations: Some subjects reported deviations between the rendered spatial audio and the actual orientation, potentially due to the
incompatibility of the generalized Head-Related Transfer Function (HRTF) model
on certain subjects. This challenge may require the establishment of a subject-
specific HRTF model and the development of more realistic spatial audio rendering
technology.
2. Machine Vision Recognition Errors: The system faced challenges related to
errors in machine vision recognition during human-environment interactions. Oc-
clusion of hands and objects could cause positioning errors and result in incorrect
spatial information in virtual environments. Improving the robustness of object and
hand recognition in real-time video streams is identified as a key challenge for the
system.
3. Limitations of Existing Technologies: The paper highlights the limitations of
existing auditory display technologies, such as speech, sonification, and earcons,
which are not spatialized. This limitation makes it time-consuming and non-intuitive
to transmit spatial information to blind and visually impaired individuals. Overcom-
ing these limitations and developing effective spatialized auditory display technolo-
gies presents a significant challenge in the field of assistive technology for individ-
uals with visual impairments.
4. Object Recognition and Localization in Real Environments: The problem space
of the paper focuses on object recognition and localization in desktop scenes, par-
ticularly in real environments. This presents a challenge as there is limited research
on the hand prehension task of blind and visually impaired individuals in real en-
vironments. Overcoming this challenge requires the development of effective and
accurate object recognition and localization techniques for real-world scenarios.
5. Performance Optimization for Mobile Devices: The evaluation of the assistance
system running on a mobile device revealed challenges related to performance op-
timization. While the overall CPU usage and memory usage were reported, further
optimization may be required to ensure efficient operations on various mobile de-
vices.
These challenges underscore the complexity and multifaceted nature of de-
veloping assistive technologies for individuals with visual impairments, highlight-
ing the need for continued research and innovation in this field.

Chapter 6

RESULTS AND DISCUSSION

The paper presents the results and discussions of the StereoPilot wearable
target location system, focusing on the evaluation of the system’s performance and
its implications for individuals with blindness or visual impairment. The study
compared the Spatial Audio Rendering (SAR) feedback method with three other
baseline feedback strategies based on auditory and haptic display methods, namely
voice instruction feedback (VI), vibrotactile feedback (VB), and non-speech sonifi-
cation feedback (NS).
The results of the study demonstrated that SAR significantly improved the
Information Transfer Rate (ITR) for individuals with blindness or visual impair-
ment (BVI) compared to the other baseline feedback methods. The experimental evaluation based on Fitts’ law test showed that SAR greatly shortened the completion time, contributing to a smooth user experience. The study also involved
in-depth research on the wearable design of the assistance device and extensive
comparative experiments on target populations, demonstrating the feasibility and
positioning accuracy of the wearable visual perception module.
Furthermore, the study conducted desktop manipulation experiments to ver-
ify the feasibility of StereoPilot for accurate prehension tasks in real environments.
The results indicated that SAR, VI, and VB were able to assist the subjects in com-
pleting all tasks accurately, with very few failures, while NS had significant short-
comings compared to the other three information feedback methods. The study
also highlighted the impact of the physical properties of the target and adjacent
interfering objects on the user’s grasping success rate, emphasizing the need for
effective technical support of computer vision and spatial information feedback for
individuals with visual impairments.
The evaluation metrics used in the study included positioning errors, com-
pletion time, ITR, Pearson correlation coefficient between the Index of Difficulty
(ID) and Movement Time (MT), the root mean square error of the linear regression
curve, and the success rate. The results of the Fitts’ law test and the desktop ma-
nipulation experiments provided valuable insights into the performance of the SAR
feedback method and its potential impact on individuals with blindness or visual
impairment.
The discussions in the paper emphasized the significance of the experimental
results, highlighting the potential of SAR to improve the spatial perception and nav-
igation experience for individuals with blindness or visual impairment. The study
also identified challenges related to deviations in rendered spatial audio, machine
vision recognition errors, limitations of existing auditory display technologies, and
performance optimization for mobile devices, underscoring the need for continued
research and innovation in the field of assistive technology.

Chapter 7

CONCLUSION

The paper presents the development and evaluation of StereoPilot, a wearable target location system designed to assist blind and visually impaired individ-
uals in navigating their environment. The system utilizes spatial audio rendering
to provide essential navigational cues and location assistance. The innovative ap-
proach of using a head-mounted RGB-D camera to measure 3D spatial information
and transmitting navigation cues through spatial audio rendering offers several ad-
vantages.
The experimental results demonstrate that the system increases information
transfer rate and reduces positioning error for spatial navigation, indicating its ef-
fectiveness in providing accurate target location information to assist individuals
with visual impairment in navigating their environment. The improved information
transfer efficiency compared to other feedback methods highlights the potential of
the system to enhance the spatial perception and navigation experience for visually
impaired individuals.
The comparative evaluation of different feedback methods provides valuable
insights into the performance of the system. By assessing the information transfer
rate and comparing various feedback methods, the paper offers a comprehensive
analysis of the system’s effectiveness in assisting visually impaired individuals in
spatial tasks. This comparative evaluation contributes to the understanding of the
system’s capabilities and its potential impact on individuals with blindness or visual
impairment.
The paper also covers a range of topics related to assistive technology for the
visually impaired, including mobility aids, environmental perception, object local-
ization, and sensory substitution, providing a comprehensive overview of the field.
This comprehensive coverage enhances the understanding of the broader context in
which the wearable target location system operates, highlighting its relevance and
potential impact within the field of assistive technology for individuals with visual
impairments.
In conclusion, the paper presents an innovative and effective wearable target
location system, offering potential benefits for individuals with blindness or visual
impairment in navigating their environment and enhancing their spatial perception.
The system’s ability to increase information transfer rate and reduce positioning
error, as well as its comparative evaluation against other feedback methods, demon-
strates its potential to significantly improve the spatial navigation experience for
individuals with visual impairments. The comprehensive coverage of topics related
to assistive technology for the visually impaired further emphasizes the relevance
and potential impact of the wearable target location system within the field.

REFERENCES

[1] L. Zhao, C. Chen, and J. Huang, “Deep learning-based forgery attack on document images,” IEEE Trans. Image Process., vol. 30, pp. 7964–7979,
2021.

[2] W. Sun, Y. Song, C. Chen, J. Huang, and A. C. Kot, “Face spoofing detec-
tion based on local ternary label supervision in fully convolutional networks,”
IEEE Trans. Inf. Forensics Security, vol. 15, pp. 3181–3196, 2020.

[3] Y. Sun, R. Ni, and Y. Zhao, “MFAN: Multi-level features attention network
for fake certificate image detection,” Entropy, vol. 24, no. 1, p. 118, Jan. 2022.

[4] P. Roy, S. Bhattacharya, S. Ghosh, and U. Pal, “STEFANN: Scene text editor using font adaptive neural network,” in Proc. IEEE/CVF Conf. Comput.
Vis. Pattern Recognit. (CVPR), Jun. 2020, pp. 13228–13237.

[5] P. Zhuang, H. Li, S. Tan, B. Li, and J. Huang, “Image tampering localiza-
tion using a dense fully convolutional network,” IEEE Trans. Inf. Forensics
Security, vol. 16, pp. 2986–2999, 2021.

[6] Y. Gao, F. Wei, J. Bao, S. Gu, D. Chen, F. Wen, and Z. Lian, “High-
fidelity and arbitrary face editing,” in Proc. IEEE/CVF Conf. Comput. Vis.
Pattern Recognit. (CVPR), Jun. 2021, pp. 16115–16124.

[7] R. Chen, X. Chen, B. Ni, and Y. Ge, “SimSwap: An efficient framework for
high fidelity face swapping,” in Proc. 28th ACM Int. Conf. Multimedia, Oct.
2020, pp. 2003–2011.

[8] Y. Nirkin, I. Masi, A. Tran Tuan, T. Hassner, and G. Medioni, “On face
segmentation, face swapping, and face perception,” in Proc. 13th IEEE Int.
Conf. Autom. Face Gesture Recognit. (FG), May 2018, pp. 98–105.
