Department of AI & DS
Global Institute of Technology
Jaipur (Rajasthan)-302022
Batch: 2020-24
GLOBAL INSTITUTE OF TECHNOLOGY
ITS-1 & 2, IT Park, EPIP, Sitapura, Jaipur – 302022, Rajasthan – INDIA
Certificate
This is to certify that the work, which is being presented in the project entitled
“Drishti”, submitted by Chhotu Kumar Ram, a student of fourth year (VIII Sem),
B.Tech. in Computer Science & Engineering, in partial fulfilment of the requirements
for the award of the degree of Bachelor of Technology, is a record of the student's
work, carried out and found satisfactory for submission.
Acknowledgement
We take this opportunity to express our deep sense of gratitude to our project
coordinators, Ms. Manju Mathur, Assistant Professor, Department of Computer Science and
Engineering, and Dr. Reema Ajmera, Assistant Professor, Department of Computer Science
and Engineering, and to our project guide, Ms. Apoorva Joshi, Assistant Professor,
Department of Computer Science and Engineering, Global Institute of Technology, Jaipur,
for her valuable guidance and cooperation throughout the project work. She provided
constant encouragement and unceasing enthusiasm at every stage of the project.
We are grateful to Dr. I. C. Sharma, Principal, GIT, for guiding us. We express
our indebtedness to Dr. Pradeep Jha, Head of the Department of Computer Science and
Engineering, Global Institute of Technology, Jaipur, for providing us with ample support.
Without their support and timely guidance, the completion of our project would have seemed
a far-fetched dream. We find ourselves lucky to have mentors of such great potential.
Table of Contents
Certificate
Acknowledgement
Abstract
Table of Contents
List of Figures
List of Tables
List of Symbols
REFERENCES
Chapter-1
INTRODUCTION AND OBJECTIVE
Today there are nearly 2.2 billion people in the world who are visually impaired. Most
everyday tasks rely on visual information, so visually impaired people are at a
disadvantage: crucial information about their surroundings is simply unavailable to them.
A blind person will often require an aid or an assistant to accompany them or support them
with daily tasks. Visual impairment can have significant effects on an individual's life,
including physical, emotional, and social impacts. Physically, visual impairment can make
it challenging to carry out daily tasks such as reading, writing, and navigating the
environment. It can also lead to decreased mobility, falls, and injuries. Several
solutions are available to help individuals with visual impairment overcome these
challenges, including assistive technology, rehabilitation services, and education.
Assistive technology such as screen readers, location detection, and image detection can
help individuals with visual impairment access information, communicate with others, and
perform daily tasks. With modern technologies, much of this assistance can now be
provided automatically.
So, this virtual assistant will be helpful for visually impaired people. The application
contains an OCR (Optical Character Recognition) reader and a location detector. The app
can read a given text aloud to the user and tell them their current location, giving them
the essential information about their surroundings.
Our application “Drishti” can help improve the quality of life of visually impaired
people. Since it is not always possible for someone to accompany a blind person 24 hours
a day, this app will prove to be a very useful tool for those who are blind. It will make
reading very simple for them, whether they prefer fiction, newspapers, or school
textbooks. There are no barriers to entry: a person of any age who has a smartphone can
use it.
OBJECTIVE:
Building an Android app that serves as a virtual assistant for visually impaired
individuals can be a highly impactful project. By providing a tool that enables
individuals with visual impairments to navigate the world more independently, we can
empower them to live more fulfilling lives. Such an app can offer features like voice
commands, text-to-speech, and location tracking, making it easier for users to perform
daily tasks, communicate with others, and access information. Developing this app is a
way to leverage technology for the greater good and make a positive difference in
people's lives.
The prerequisites are:
Android Studio and Firebase should be installed on your local machine.
Knowledge of Java
Knowledge of TensorFlow
Knowledge of ML libraries
Knowledge about APIs
1.1 Present Standards
Sight is one of the most important human senses, and its absence has a profound effect on
almost every decision a person makes throughout his or her life. Visually impaired people
frequently experience discrimination in social settings and at their place of employment,
because they are not expected to advance in their profession as much as a sighted person.
So, by organizing campaigns and offering education with new tools and technologies, the
government and civil society can play a significant role in making the lives of visually
impaired people easier and safer.
There are various apps available on the internet for assisting visually impaired people, like:
Be My Eyes: a free app that connects sighted volunteers with blind or visually impaired
people so the volunteers can help them in their daily lives. Through OpenAI's GPT-4, the
app has created the first-ever virtual volunteer, enabling users to send images and
receive thorough descriptions and instructions for a variety of tasks.
OneStep Reader App: With the touch of a button on the iPhone, the OneStep Reader converts
printed text into clear speech to offer precise, quick, and effective access to both single-page and
multi-page documents.
TapTapSee: The purpose of TapTapSee is to assist the blind and visually impaired in
recognizing objects in their daily environment. Double tapping the screen will enable users to
take pictures from any angle of anything, then the app will speak the identification to the user.
Cash Reader: People who have visual impairments can quickly and easily identify and count
bills with the help of the Cash Reader app, which speaks the denomination and instantly
recognizes the currency.
Despite the availability of so many apps, a sizable number of visually impaired people are
still unable to benefit from them. This may be due to a lack of awareness, the fact that
some apps are not free to use, and the fact that some work only on iPhones.
Below we can see the basic data flow diagram of virtual assistant apps:
Fig.1.1
The user enters their choice, and the application takes in that input. The input is then
used to perform an activity: the provided information is checked against a database, and
if pertinent information is found, it is returned to the user as output or feedback.
1.2 Proposed Standards
The proposed system is a simple Android application with an improved user interface that
acts as a voice assistant for visually impaired users. The assistant carries out tasks
from simple to complex with little to no internet connection. Any basic smartphone with a
minimal interface can run it, and the user has the option of giving voice commands as
input.
This project is an Android app called “Drishti” which can assist the blind. It extracts
the text content from a PDF report and reads it to the user using text-content
recognition and speech synthesis. A text document or a .ppt file can be converted into a
.pdf file by searching for a specific set of words. Being built on Android, the
application uses pre-defined APIs for text-to-speech conversion, making the process even
more efficient. Google's ML Kit Vision API is used to recognize text in images. Blind
people make up 3.44% of the overall population, of whom 53.1% use Android smartphones.
System Architecture
The system provides the following features:
1. OCR Reader: with this feature, users can listen to the text of a PDF by giving
voice commands.
2. Location: here too the user gives a voice command to ask where they are, and the
app announces their present location as spoken output.
3. Object Detection: with this feature, the user can learn about the objects present
in their surroundings.
A simplified sketch of how such voice commands can be captured and routed is shown below.
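The sketch captures one spoken phrase with Android's built-in RecognizerIntent and
dispatches it to the three features. It is a hedged illustration rather than the
project's actual code: the handler methods startOcrReader(), startLocationTracker(), and
startObjectDetection() are hypothetical placeholders, and the snippet is assumed to live
inside an Activity.

import android.content.Intent;
import android.speech.RecognizerIntent;
import java.util.Locale;

private static final int REQUEST_SPEECH = 100;

// Launch the platform speech recognizer to capture one spoken command.
private void listenForCommand() {
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    startActivityForResult(intent, REQUEST_SPEECH);
}

@Override
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    super.onActivityResult(requestCode, resultCode, data);
    if (requestCode == REQUEST_SPEECH && resultCode == RESULT_OK && data != null) {
        // The recognizer returns a ranked list of transcriptions; use the best one.
        String command = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
                .get(0).toLowerCase(Locale.US);
        if (command.contains("read")) startOcrReader();                // feature 1
        else if (command.contains("location")) startLocationTracker(); // feature 2
        else if (command.contains("object")) startObjectDetection();   // feature 3
    }
}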
Fig.1.2
Fig.1.4
Fig.1.5
A location tracker using Google API is a powerful tool that enables users to track the real-time
location of a device or individual. This technology leverages the power of Google Maps to
provide accurate and up-to-date location data, making it an essential tool for businesses,
individuals, and even law enforcement agencies. By integrating with Google API, a location
tracker can access a wealth of data, including traffic patterns, local landmarks, and business
listings, all of which can be used to provide a comprehensive understanding of a user's location.
Additionally, this technology can be used to monitor the movements of individuals, providing an
extra layer of security for loved ones or employees. Overall, a location tracker using Google API
can help businesses and individuals save time, increase efficiency, and improve safety.
One of the key benefits of a location tracker using Google API is its ability to provide accurate
location data in real time. This technology can be used to track the movements of a device or
individual, providing users with an up-to-date understanding of their location. Additionally, this
technology can be used to monitor the location of a device or individual over time, providing
insights into patterns and behaviors. For businesses, this can be an invaluable tool for optimizing
operations, as it can help identify inefficiencies and opportunities for improvement. Additionally,
individuals can use location trackers to keep tabs on loved ones or ensure their own safety when
traveling or exploring new areas. Overall, a location tracker using Google API is a powerful and
versatile tool that can be used for a wide range of applications.
Fig.1.6
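To ground this in code, the hedged sketch below shows one way such a tracker could be
implemented on Android with the Fused Location Provider from Google Play services plus
the platform Geocoder. It assumes the play-services-location dependency is present, that
location permission has already been granted, and that textToSpeech is an initialized
TextToSpeech instance; it is an illustration, not the project's exact implementation.

import android.location.Address;
import android.location.Geocoder;
import android.speech.tts.TextToSpeech;
import com.google.android.gms.location.FusedLocationProviderClient;
import com.google.android.gms.location.LocationServices;
import java.io.IOException;
import java.util.List;
import java.util.Locale;

// Inside an Activity: fetch the last known position and speak the address.
private void announceCurrentLocation() {
    FusedLocationProviderClient client =
            LocationServices.getFusedLocationProviderClient(this);
    client.getLastLocation().addOnSuccessListener(location -> {
        if (location == null) return; // no fix available yet
        try {
            List<Address> addresses = new Geocoder(this, Locale.getDefault())
                    .getFromLocation(location.getLatitude(),
                            location.getLongitude(), 1);
            if (!addresses.isEmpty()) {
                String spoken = "You are near " + addresses.get(0).getAddressLine(0);
                textToSpeech.speak(spoken, TextToSpeech.QUEUE_FLUSH, null, "loc");
            }
        } catch (IOException e) {
            e.printStackTrace(); // reverse geocoding needs network access and can fail
        }
    });
}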
1. JAVA
Java is a high-level programming language based on the concepts of object-oriented
programming initially developed by Sun Microsystems and now owned by Oracle Corporation.
It was designed to be platform-independent and portable, meaning that once a Java program is
written, it can run on any computer or device with a Java Virtual Machine (JVM) installed.
Fig.1.7
One of the main features of Java is its "write once, run anywhere" philosophy, which allows
developers to create a single codebase that can be used on multiple platforms without the need
for modification. This makes it a popular choice for developing applications that can run on a
variety of operating systems, including Windows, macOS, Linux, and mobile devices such as
Android.
Java also provides a wide range of libraries and tools for developers, making it easier to build
complex applications. It is commonly used in enterprise applications, web development, mobile
app development, and game development.
Design and Architecture
Java's design centers around robustness, portability, and high performance. Its syntax draws
heavily from C++, which eases the learning curve for developers familiar with that language.
However, Java eschews the complexity and security issues associated with direct pointer
manipulation, making it a safer programming language that is less prone to bugs and security
vulnerabilities.
Central to Java’s architecture is the Java Virtual Machine (JVM), an engine that executes Java
bytecode. This bytecode is a translation of Java source code, allowing Java programs to run on
any device that has a JVM. This layer of abstraction not only ensures operational consistency
across diverse hardware but also facilitates security, as the JVM can contain and manage code
execution within a virtual sandbox.
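A minimal example makes this model concrete: the class below is compiled once into
bytecode, and the resulting Hello.class then runs unchanged on any platform that has a
JVM. The class name and message are illustrative.

public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello, Drishti!");
    }
}

From a terminal, javac Hello.java produces the platform-neutral Hello.class, and java
Hello executes it on the local JVM, whether that JVM runs on Windows, macOS, or Linux.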
Development Tools and Libraries
Java developers have access to the Java Development Kit (JDK), which includes all the
necessary tools for Java application development, such as the compiler (javac), runtime
environment (JRE), and numerous utility programs. Over the years, Java has expanded its library
of classes, which now includes tools for GUI development, networking, security, and database
access, among many other functions, simplifying the development process significantly.
Enterprise and Web Applications
For enterprise environments, Java offers Java Enterprise Edition (Java EE), which extends Java
Standard Edition (Java SE) with specifications for scalable, multi-tiered business applications.
Java EE includes APIs for object-relational mapping, distributed computing, and web services,
which are essential for modern enterprise solutions.
In web development, Java’s servlets and JavaServer Pages (JSP) allow for the creation of
dynamic, data-driven web applications. Java’s strong security features make it an excellent
choice for web applications that require reliable security protocols to handle sensitive data.
2. TENSORFLOW
TensorFlow, developed by the Google Brain team, is a versatile open-source software library
designed for high-performance numerical computation. It has become synonymous with machine
learning and deep learning due to its powerful and flexible capabilities for building and training
complex neural networks. The software excels in handling dataflow and differentiable
programming across various computing devices, including CPUs, GPUs, and TPUs (Tensor
Processing Units), making it highly scalable and efficient for both research and production.
Fig.1.8
Core Features and Functionalities
At its core, TensorFlow allows users to create advanced machine learning models through an
intuitive and flexible architecture that supports defining, optimizing, and computing multi-
dimensional arrays, or tensors. This functionality is crucial for developing sophisticated
algorithms that can learn from and make decisions based on large sets of data.
TensorFlow's architecture is built to be modular, which means developers can use and reuse
components as needed. The framework supports a range of tasks from simple regression models
to complex neural networks involving deep learning. Its ability to automate the differentiation of
complex expressions enables developers to focus more on the architecture of their models rather
than the calculus behind them.
Tooling and Libraries
To aid in the development and deployment of machine learning models, TensorFlow provides a
rich ecosystem of tools, including TensorFlow Lite for mobile and embedded devices,
TensorFlow Extended (TFX) for production environments, and TensorBoard for visualization
and monitoring of model training. These tools are designed to streamline the process of model
development, from initial data preprocessing and model design to training, evaluation, and
deployment.
TensorFlow also offers a comprehensive library, TensorFlow Hub, which is a repository for
reusable machine learning models and parts. This allows developers to import and implement
pre-built models and layers in their projects, significantly speeding up the development process
and encouraging best practices in machine learning.
Community and Support
The TensorFlow community is a robust and vibrant network of developers, researchers, and
technology enthusiasts who contribute to its continuous development. This community plays a
critical role in the iterative improvement of the library, by providing feedback, sharing
innovative uses, and contributing code. TensorFlow’s support for multiple programming
languages, including Python—the most popular language for machine learning—as well as C++
and Java, makes it accessible to a broad audience.
Applications and Impact
TensorFlow's impact is widespread across industries—from healthcare, where it enables better
diagnostics and predictive analytics, to automotive, where it is used for vehicle recognition and
autonomous driving technologies. In the field of natural language processing, TensorFlow
powers systems that can understand and generate human-like text, and in computer vision, it
helps in creating systems that can identify objects, faces, and emotions from images and videos.
In academics and research, TensorFlow is used to push the boundaries of machine learning and
artificial intelligence, helping researchers to uncover new possibilities and innovations that could
lead to further breakthroughs in the field.
3. ML LIBRARIES
Machine learning (ML) is a subfield of artificial intelligence (AI) that involves the use
of algorithms and statistical models to enable computer systems to learn from data, identify
patterns, and make predictions without being explicitly programmed. It has a wide range of
applications, including image and speech recognition, natural language processing,
recommendation systems, and predictive analytics. ML algorithms can be supervised,
unsupervised, or semi-supervised, and they require large amounts of data to be trained
effectively. As the field continues to grow, new algorithms and techniques are constantly
being developed, making ML an exciting and dynamic area of research and innovation.
Machine learning (ML) libraries are software tools that enable developers and data
scientists to build and train machine learning models. These libraries provide a set of pre-
built algorithms, functions, and tools that make it easy for developers to implement complex
ML models without having to write extensive code from scratch. Popular ML libraries
include TensorFlow, PyTorch, and Scikit-learn, among others. Each library has its unique
features, advantages, and disadvantages that suit different use cases.
ML Kit Vision APIs, developed by Google, offer a versatile platform for integrating
computer vision capabilities into mobile applications. One notable feature is the
Optical Character Recognition (OCR) module, which provides seamless text-recognition
functionality. These APIs are designed to simplify complex computer vision tasks,
allowing developers to leverage pre-trained models without extensive machine learning
expertise.
In the context of an OCR Reader project, developers can effortlessly integrate ML Kit's OCR
capabilities. By incorporating the necessary dependencies into the project and initializing the
OCR detector, developers gain access to a robust toolset for extracting textual information
from images. This includes support for multiple languages and the flexibility to choose
between on-device or cloud-based processing, offering a balance between real-time
responsiveness and computational efficiency.
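As a rough illustration, a minimal on-device text-recognition call with the current ML
Kit API looks like the sketch below. This is hedged: it is not the project's exact code
(Chapter 5 shows the older Mobile Vision TextRecognizer), and bitmap is assumed to hold
the captured page while textToSpeech is assumed to be already initialized.

import android.speech.tts.TextToSpeech;
import com.google.mlkit.vision.common.InputImage;
import com.google.mlkit.vision.text.TextRecognition;
import com.google.mlkit.vision.text.TextRecognizer;
import com.google.mlkit.vision.text.latin.TextRecognizerOptions;

// Wrap the captured frame; the second argument is the image's rotation in degrees.
InputImage image = InputImage.fromBitmap(bitmap, 0);
TextRecognizer recognizer =
        TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS);
recognizer.process(image)
        .addOnSuccessListener(result ->
                // Speak everything that was recognized on the page.
                textToSpeech.speak(result.getText(),
                        TextToSpeech.QUEUE_FLUSH, null, "ocr"))
        .addOnFailureListener(Throwable::printStackTrace);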
The TensorFlow Lite Object Detection API is a framework that enables efficient deployment
of object detection models on mobile and edge devices. It is an extension of TensorFlow
Lite designed specifically for on-device object detection tasks. The API allows
developers to integrate and run pre-trained models for real-time object detection in
applications, balancing accuracy and speed for resource-constrained environments.
Single Shot MultiBox Detector (SSD): SSD is a popular object detection framework that
enables the simultaneous prediction of multiple bounding boxes and class scores in a single
forward pass. It operates at different scales to capture objects of varying sizes.
Feature Extractor: MobileNet V1 serves as the feature extractor for the COCO SSD model.
It transforms input images into a set of feature maps, capturing hierarchical features at
different spatial resolutions.
Anchor Boxes: SSD employs anchor boxes at different scales and aspect ratios to predict
bounding boxes efficiently. These anchor boxes serve as reference boxes for predicting
object locations.
Output Layers: The model's output layers provide predictions for bounding box coordinates
and associated class scores. The SSD architecture generates predictions across multiple
scales, contributing to its versatility in detecting objects of various sizes.
COCO (Common Objects in Context) Dataset: The COCO dataset is a widely used
benchmark in computer vision that encompasses a diverse range of object categories. The
COCO SSD MobileNet V1 model is trained on this dataset, enabling it to recognize and
classify a broad spectrum of objects.
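To make this concrete, the following minimal sketch uses the TensorFlow Lite Task Library
to run a COCO-trained SSD MobileNet model and speak the top label for each detection. The
model filename ssd_mobilenet_v1.tflite is an assumed asset name, and context, bitmap, and
textToSpeech are assumed to exist; this is illustrative, not the project's exact code.

import org.tensorflow.lite.support.image.TensorImage;
import org.tensorflow.lite.task.vision.detector.Detection;
import org.tensorflow.lite.task.vision.detector.ObjectDetector;
import org.tensorflow.lite.task.vision.detector.ObjectDetector.ObjectDetectorOptions;
import java.io.IOException;
import java.util.List;

ObjectDetectorOptions options = ObjectDetectorOptions.builder()
        .setMaxResults(3)           // report at most three objects per frame
        .setScoreThreshold(0.5f)    // ignore low-confidence detections
        .build();
try {
    ObjectDetector detector = ObjectDetector.createFromFileAndOptions(
            context, "ssd_mobilenet_v1.tflite", options); // assumed asset name
    List<Detection> results = detector.detect(TensorImage.fromBitmap(bitmap));
    for (Detection d : results) {
        // Each detection carries ranked categories; speak the most likely label.
        String label = d.getCategories().get(0).getLabel();
        textToSpeech.speak(label, TextToSpeech.QUEUE_ADD, null, label);
    }
} catch (IOException e) {
    e.printStackTrace(); // the model file could not be loaded
}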
Chapter-2
System Design
2. System Design
System design is the process of designing the architecture, components, and interfaces of a
system to meet the requirements of the end user. It is also a staple of technical
interviews: almost every IT giant, including Facebook, Amazon, Google, and Apple, asks
questions based on system design concepts such as scalability, load balancing, caching,
and more.
It is a broad field of engineering study that includes a variety of concepts and
principles to help design scalable systems. These concepts are widely asked about during
interviews for SDE 2 and SDE 3 positions at various technology companies. These senior
roles require a better understanding of how to solve specific design challenges, how to
respond when the system is expected to receive more traffic, how to design the system's
database, and so on. All of these decisions must be made carefully, taking into account
scalability, reliability, availability, and maintainability.
Approaching a Design Problem
Breaking Down the Problem: Given a design task, start by breaking it down into smaller
components. These components can be services or functions that the system must implement.
At first, the system may appear to need a lot of features, and in an interview you don't
need to design everything; ask the interviewer which features you should implement.
Communicating your Ideas: Communicate well with the interviewer. Keep them up to date as
you develop your system, and discuss the process out loud. Visualize your design on the
board using flowcharts and diagrams, and explain your ideas: how you would solve
scalability problems, how you would design the database, and so on.
Making Reasonable Assumptions: Make some reasonable assumptions when designing your
system. Say you need to estimate the number of queries your system will handle per day,
the number of database hits per month, or the efficiency of your caching system: keep
these numbers as reasonable as possible, and back up your estimates with some compelling
facts and figures.
The degree of availability required varies from system to system. If you're developing a
social networking application, you don't strictly need high availability: a delay of
several seconds is acceptable, and it does no harm to see your favourite celebrity's
Instagram post with a 5-10 second delay. However, if you are developing a system for a
hospital, data center, or banking institution, you must ensure that the system is highly
available, because service delays can lead to huge losses.
There are various principles you should follow to ensure the availability of your system.
Above all, there should be no single point of failure: the system should not rely on a
single service to handle all requests, because if that service goes down, the entire
system may become unusable. Detect failures as they occur and eliminate them promptly.
Data Flowcharts
As its name implies, this kind of flowchart is used to evaluate data; more specifically,
it aids in the analysis of a project's structural information. A data flowchart makes it
simple to understand how data enters and leaves the system, and it is most frequently
used to manage data or evaluate information moving into and out of a system.
Various kinds of boxes are used to create flowcharts, linked to one another by arrow
lines that show the flow of control. Let's explore each box briefly.
1. Terminal
This oval-shaped box is used to signal the beginning or end of the program. Every flowchart
diagram has two oval shapes, one to represent the beginning of an algorithm and the other to
represent its conclusion.
2. Data
The inputs and outputs are entered into a parallelogram-shaped box. This box
essentially depicts the information entering and leaving the system or algorithm.
3. Process
The main logic of the algorithm or the major body of the program is written inside this
rectangular box by the programmer. The primary processing codes are written inside this
box, making it the most important part of the flowchart.
4. Decision
This is a rhombus-shaped box, and inside it are control statements like if and conditions like
a > 0. There are two ways to go from this one; one is "yes," and the other is "no." These are
the possibilities in this box, just as there are just two options for any decision: yes or no.
5. Flow
This arrow line depicts the flow of the algorithm or process and stands for the direction
of the process flow. In the preceding examples, arrows were added at each stage to show
how the program flows; they make the flowchart easier to read.
6. Delay
Any waiting interval that is a component of a process is represented by the Delay flowchart
symbol. Process mapping frequently uses delay shapes.
Chapter-3
3. Software Details
1. Android Studio:
Fig.3.1
Android Studio is the official integrated development environment (IDE) for Android app
development. It is developed by Google and is based on the popular IntelliJ IDEA software.
Android Studio provides a comprehensive suite of tools for developing Android apps, including
a code editor, visual layout editor, debugger, and performance analysis tools. It also includes a
variety of templates and sample code to help developers get started with their projects quickly.
Some key features of Android Studio include:
A Gradle-based build system that automates the building and packaging of app code and
resources.
A layout editor that allows developers to drag and drop UI components and preview the
design of their app in real-time.
A rich code editor that supports features like code completion, refactoring, and
debugging.
Integration with Google Play services and other libraries, allowing developers to easily
add features like Google Maps, Firebase, and AdMob to their apps.
Support for multiple programming languages, including Kotlin and Java.
Requires 8 GB or more of RAM.
2. Firebase
Fig.3.2
Firebase is a mobile and web application development platform owned by Google. It provides a
wide range of services that help developers build, test, and deploy apps more quickly and easily.
Firebase includes a number of different features, such as real-time database, cloud storage,
authentication, hosting, analytics, and more. These features are designed to work seamlessly
together, allowing developers to create complex applications with ease.
One of the key advantages of Firebase is that it is a serverless platform, meaning that developers
don't have to worry about managing servers or infrastructure. Instead, Firebase takes care of all
the backend services, allowing developers to focus on building the frontend and user experience
of their applications.
Firebase also has a strong community of developers and resources available, including
documentation, code samples, and support forums. This makes it easier for developers to get
started with Firebase and troubleshoot any issues that may arise.
Overall, Firebase is a powerful platform that enables developers to build high-quality mobile and
web applications quickly and easily.
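As a small illustration of this serverless model, the hedged sketch below writes a value
to the Firebase Realtime Database from Java. It assumes the google-services plugin and
the firebase-database dependency are configured, and the database path shown is a
hypothetical example, not one from the project.

import com.google.firebase.database.DatabaseReference;
import com.google.firebase.database.FirebaseDatabase;

// Persist the user's last announced location under a demo path.
DatabaseReference ref = FirebaseDatabase.getInstance()
        .getReference("users/demo-user/lastLocation");
ref.setValue("Sitapura, Jaipur")
        .addOnSuccessListener(unused -> android.util.Log.d("Drishti", "saved"))
        .addOnFailureListener(e -> android.util.Log.e("Drishti", "save failed", e));

No server code is involved: the SDK handles connectivity, retries, and offline caching on
the client.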
3. Text-to-Speech (TTS) Software
Fig.3.3
Text-to-speech (TTS) software is a type of computer software that converts written text into
spoken words. It uses natural language processing (NLP) and speech synthesis technology to
convert written text into audio output, which can then be played through speakers or headphones.
TTS software can be useful for individuals with visual impairments or reading difficulties, as it
allows them to listen to text rather than reading it. It can also be helpful for language learning or
for individuals who prefer listening to reading.
TTS software has come a long way in recent years, with advancements in NLP and machine
learning making it more accurate and natural-sounding. Some TTS software allows users to
customize the voice and speed of the spoken output, and some can even generate multiple voices
and accents. Overall, TTS software has many practical applications and has the potential to make
information more accessible to a wider range of individuals.
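Since Drishti leans heavily on spoken output, the small helper below sketches how
Android's built-in TextToSpeech engine can be wrapped for reuse. It is a minimal sketch:
the class name Speaker is our own invention rather than anything from the project, and
the locale and speech rate are illustrative defaults.

import android.content.Context;
import android.speech.tts.TextToSpeech;
import java.util.Locale;

public class Speaker {
    private final TextToSpeech tts;

    public Speaker(Context context) {
        // The engine initializes asynchronously; configure it once ready.
        tts = new TextToSpeech(context, status -> {
            if (status == TextToSpeech.SUCCESS) {
                tts.setLanguage(Locale.US); // choose the voice locale
                tts.setSpeechRate(1.0f);    // users may prefer slower or faster speech
            }
        });
    }

    public void say(String text) {
        // QUEUE_FLUSH interrupts any utterance already in progress.
        tts.speak(text, TextToSpeech.QUEUE_FLUSH, null, "drishti-utterance");
    }

    public void shutdown() {
        tts.shutdown(); // release the engine when the screen is destroyed
    }
}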
Chapter-4
4. Work Details
Details are important in the workplace because they make a lasting impression on
colleagues, customers, and bosses. Attention to detail shows that you are organized and
attentive to your responsibilities, and accuracy and thoroughness in your work are a
great way to earn trust and respect; people look for attentive employees for good reason.
Sensory perception is the ability to perceive information through the senses. Paying
attention to detail is a skill everyone needs from time to time, and when it becomes part
of your nature, it helps you develop your sensory perception. It is important to develop
sensory perception at work and in life, because a lack of attention to detail has
negative consequences: if you don't pay attention to the details, you won't know what
needs to be fixed or improved. Attention to detail develops sensory skills, helping you
better deal with distractions and stay focused.
A virtual assistant for visually impaired individuals can be a game-changer in many real-life
applications. For example, in the workplace, a virtual assistant can assist visually impaired
employees by reading out important documents, emails, and messages. This can help them stay
on top of their work and reduce the need for assistance from others. Additionally, a virtual
assistant can track the location of important objects, such as office supplies, and guide visually
impaired individuals to them. This can help improve their efficiency and independence at work.
In daily life, a virtual assistant can also be incredibly useful for visually impaired individuals. It
can read out labels on food items, medication, and household products, helping them to identify
what they are using or consuming. A virtual assistant can also provide information about the
location of objects within their home, such as keys, wallets, and phones, reducing the amount of
time spent searching for them. Furthermore, a virtual assistant can guide visually impaired
individuals through unfamiliar environments, such as public transportation systems or airports,
ensuring they arrive at their destination safely and on time.
Overall, we can say that a virtual assistant can help visually impaired individuals with everyday
tasks such as shopping and running errands. The assistant can read product labels and scan
barcodes, making it easier for individuals to identify items they need. The virtual assistant can
also track the location of items in stores, making it easier for individuals to navigate and find
what they need. Overall, a virtual assistant can significantly improve the quality of life for
visually impaired individuals, enabling them to be more independent and self-sufficient.
deploymentTargetDropDown.xml
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="deploymentTargetDropDown">
<targetSelectedWithDropDown>
<Target>
<type value="QUICK_BOOT_TARGET" />
<deviceKey>
<Key>
<type value="VIRTUAL_DEVICE_PATH" />
<value value="C:\Users\DELL\.android\avd\
Pixel_2_XL_API_25.avd" />
</Key>
</deviceKey>
</Target>
</targetSelectedWithDropDown>
<timeTargetWasSelectedWithDropDown value="2023-03-
16T08:49:30.845948600Z" />
</component>
</project>
gradle.xml
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="GradleMigrationSettings" migrationVersion="1" />
<component name="GradleSettings">
<option name="linkedExternalProjectsSettings">
<GradleProjectSettings>
<option name="testRunner" value="GRADLE" />
<option name="distributionType" value="DEFAULT_WRAPPED" />
<option name="externalProjectPath" value="$PROJECT_DIR$" />
<option name="modules">
<set>
<option value="$PROJECT_DIR$" />
<option value="$PROJECT_DIR$/app" />
</set>
</option>
</GradleProjectSettings>
</option>
</component>
</project>
misc.xml
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="ExternalStorageConfigurationManager" enabled="true" />
<component name="ProjectRootManager" version="2" languageLevel="JDK_11"
default="true" project-jdk-name="Android Studio default JDK" project-jdk-
type="JavaSDK">
<output url="file://$PROJECT_DIR$/build/classes" />
</component>
<component name="ProjectType">
<option name="id" value="Android" />
</component>
</project>
workspace.xml
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="AndroidLayouts">
<shared>
<config />
</shared>
</component>
<component name="AutoImportSettings">
<option name="autoReloadType" value="NONE" />
</component>
<component name="ChangeListManager">
<list default="true" id="8b0fc1cc-7412-4c3d-9cc3-680e76b249a7"
name="Changes" comment="" />
<option name="SHOW_DIALOG" value="false" />
<option name="HIGHLIGHT_CONFLICTS" value="true" />
<option name="HIGHLIGHT_NON_ACTIVE_CHANGELIST" value="false" />
<option name="LAST_RESOLUTION" value="IGNORE" />
</component>
<component name="ExecutionTargetManager"
SELECTED_TARGET="device_and_snapshot_combo_box_target[C:\Users\DELL\.android\
avd\Pixel_2_XL_API_25.avd]" />
<component name="ExternalProjectsData">
<projectState path="$PROJECT_DIR$">
<ProjectState />
</projectState>
</component>
<component name="MarkdownSettingsMigration">
<option name="stateVersion" value="1" />
</component>
<component name="ProjectId" id="2N520S5DO2z8jAXIFM7Sgok4tKW" />
<component name="ProjectViewState">
<option name="hideEmptyMiddlePackages" value="true" />
<option name="showLibraryContents" value="true" />
</component>
<component name="PropertiesComponent"><![CDATA[{
"keyToString": {
"RunOnceActivity.OpenProjectViewOnStart": "true",
"RunOnceActivity.ShowReadmeOnStart": "true",
"RunOnceActivity.cidr.known.project.marker": "true",
"cidr.known.project.marker": "true",
"last_opened_file_path": "C:/Users/DELL/AndroidStudioProjects/Drishti",
"project.structure.last.edited": "Dependencies",
"project.structure.proportion": "0.17",
"project.structure.side.proportion": "0.2"
}
}]]></component>
<component name="RunManager">
<configuration name="app" type="AndroidRunConfigurationType"
factoryName="Android App">
<module name="Drishti.app.main" />
<option name="DEPLOY" value="true" />
<option name="DEPLOY_APK_FROM_BUNDLE" value="false" />
<option name="DEPLOY_AS_INSTANT" value="false" />
<option name="ARTIFACT_NAME" value="" />
<option name="PM_INSTALL_OPTIONS" value="" />
<option name="ALL_USERS" value="false" />
<option name="ALWAYS_INSTALL_WITH_PM" value="false" />
<option name="CLEAR_APP_STORAGE" value="false" />
<option name="DYNAMIC_FEATURES_DISABLED_LIST" value="" />
<option name="ACTIVITY_EXTRA_FLAGS" value="" />
<option name="MODE" value="default_activity" />
<option name="CLEAR_LOGCAT" value="false" />
<option name="SHOW_LOGCAT_AUTOMATICALLY" value="false" />
<option name="INSPECTION_WITHOUT_ACTIVITY_RESTART" value="false" />
<option name="TARGET_SELECTION_MODE"
value="DEVICE_AND_SNAPSHOT_COMBO_BOX" />
<option name="SELECTED_CLOUD_MATRIX_CONFIGURATION_ID" value="-1" />
<option name="SELECTED_CLOUD_MATRIX_PROJECT_ID" value="" />
<option name="DEBUGGER_TYPE" value="Auto" />
<Auto>
<option name="USE_JAVA_AWARE_DEBUGGER" value="false" />
<option name="SHOW_STATIC_VARS" value="true" />
<option name="WORKING_DIR" value="" />
<option name="TARGET_LOGGING_CHANNELS" value="lldb process:gdb-remote
packets" />
<option name="SHOW_OPTIMIZED_WARNING" value="true" />
</Auto>
<Hybrid>
<option name="USE_JAVA_AWARE_DEBUGGER" value="false" />
<option name="SHOW_STATIC_VARS" value="true" />
<option name="WORKING_DIR" value="" />
<option name="TARGET_LOGGING_CHANNELS" value="lldb process:gdb-remote
packets" />
<option name="SHOW_OPTIMIZED_WARNING" value="true" />
</Hybrid>
<Java />
<Native>
<option name="USE_JAVA_AWARE_DEBUGGER" value="false" />
<option name="SHOW_STATIC_VARS" value="true" />
<option name="WORKING_DIR" value="" />
<option name="TARGET_LOGGING_CHANNELS" value="lldb process:gdb-remote
packets" />
<option name="SHOW_OPTIMIZED_WARNING" value="true" />
</Native>
<Profilers>
<option name="ADVANCED_PROFILING_ENABLED" value="false" />
<option name="STARTUP_PROFILING_ENABLED" value="false" />
<option name="STARTUP_CPU_PROFILING_ENABLED" value="false" />
<option name="STARTUP_CPU_PROFILING_CONFIGURATION_NAME"
value="Java/Kotlin Method Sample (legacy)" />
<option name="STARTUP_NATIVE_MEMORY_PROFILING_ENABLED" value="false"
/>
<option name="NATIVE_MEMORY_SAMPLE_RATE_BYTES" value="2048" />
</Profilers>
<option name="DEEP_LINK" value="" />
<option name="ACTIVITY_CLASS" value="" />
<option name="SEARCH_ACTIVITY_IN_GLOBAL_SCOPE" value="false" />
<option name="SKIP_ACTIVITY_VALIDATION" value="false" />
<method v="2">
<option name="Android.Gradle.BeforeRunTask" enabled="true" />
</method>
</configuration>
</component>
<component name="SpellCheckerSettings" RuntimeDictionaries="0" Folders="0"
CustomDictionaries="0" DefaultDictionary="application-level"
UseSingleDictionary="true" transferred="true" />
<component name="TaskManager">
<task active="true" id="Default" summary="Default task">
<changelist id="8b0fc1cc-7412-4c3d-9cc3-680e76b249a7" name="Changes"
comment="" />
<created>1678939398967</created>
<option name="number" value="Default" />
<option name="presentableId" value="Default" />
<updated>1678939398967</updated>
</task>
<servers />
</component>
</project>
Chapter-5
Source Code
5. Source Code
In computing, source code is any collection of code, with or without comments, written in
a human-readable programming language, usually as plain text. A program's source code
specifies what the computer should do, in a form designed to be read and written by
programmers.
AndroidManifest.xml:
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:tools="http://schemas.android.com/tools"
package="com.example.drishti">
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"/>
<uses-permission
android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
<uses-permission android:name="android.permission.CAMERA" />
<application
android:allowBackup="true"
android:dataExtractionRules="@xml/data_extraction_rules"
android:fullBackupContent="@xml/backup_rules"
android:icon="@mipmap/ic_launcher"
android:label="Drishti"
android:roundIcon="@mipmap/ic_launcher_round"
android:supportsRtl="true"
android:theme="@style/Theme.Drishti"
tools:targetApi="31">
<meta-data
android:name="com.google.mlkit.vision.DEPENDENCIES"
android:value="ocr" />
<activity
android:name=".MainActivity"
android:exported="true">
<intent-filter>
    <action android:name="android.intent.action.MAIN" />
    <!-- closing tags reconstructed; LAUNCHER is the standard pairing for MAIN -->
    <category android:name="android.intent.category.LAUNCHER" />
</intent-filter>
</activity>
</application>
</manifest>
Build.gradle:
plugins {
id 'com.android.application'
}
android {
namespace 'com.example.drishti'
compileSdk 33
defaultConfig {
applicationId "com.example.drishti"
minSdk 24
targetSdk 33
versionCode 1
versionName "1.0"
testInstrumentationRunner "androidx.test.runner.AndroidJUnitRunner"
}
buildTypes {
release {
minifyEnabled false
proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
}
}
compileOptions {
sourceCompatibility JavaVersion.VERSION_1_8
targetCompatibility JavaVersion.VERSION_1_8
}
}
dependencies {
implementation 'androidx.appcompat:appcompat:1.6.1'
implementation 'com.google.android.material:material:1.8.0'
implementation 'androidx.constraintlayout:constraintlayout:2.1.4'
implementation 'com.google.android.gms:play-services-mlkit-text-recognition:18.0.2' // version added so the build resolves
implementation files('libs\\itextpdf-5.4.0.jar')
testImplementation 'junit:junit:4.13.2'
androidTestImplementation 'androidx.test.ext:junit:1.1.5'
androidTestImplementation 'androidx.test.espresso:espresso-core:3.5.1'
}
MainActivity.java:
package com.example.drishti;
import android.Manifest;
import android.content.Context;
import android.content.pm.PackageManager;
import android.graphics.Paint;
import android.graphics.pdf.PdfDocument;
import android.os.Bundle;
import android.os.Environment;
import android.os.Handler;
import android.os.Looper;
import android.speech.tts.TextToSpeech;
import android.util.SparseArray;
import android.view.SurfaceHolder;
import android.view.SurfaceView;
import android.view.View;
import android.widget.Button;
import android.widget.TextView;
import android.widget.Toast;
import android.location.Location;
import android.location.LocationListener;
import android.location.LocationManager;
import androidx.annotation.NonNull;
import androidx.appcompat.app.AppCompatActivity;
import androidx.core.app.ActivityCompat;
import com.google.android.gms.vision.CameraSource;
import com.google.android.gms.vision.Detector;
import com.google.android.gms.vision.text.TextBlock;
import com.google.android.gms.vision.text.TextRecognizer;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Locale;
public class MainActivity extends AppCompatActivity {

    // Fields referenced by the excerpts below (declarations reconstructed):
    private Button button, button1, button2, button3, microid;
    private TextToSpeech textToSpeech;
    private CameraSource cameraSource;
    private SurfaceView surfaceView;
    private PdfDocument myPdfDocument;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
button2 = findViewById(R.id.btn2);
button3 = findViewById(R.id.btn3);
button = findViewById(R.id.btn);
button1 = findViewById(R.id.btn1);
button1.setVisibility(View.GONE);
ActivityCompat.requestPermissions(this,
        new String[]{Manifest.permission.CAMERA}, 1); // request code 1
textToSpeech = new TextToSpeech(this, new
TextToSpeech.OnInitListener() {
@Override
public void onInit(int status) {
textToSpeech.setLanguage(Locale.US);
}
});
}
@Override
protected void onDestroy() {
    super.onDestroy();
    if (cameraSource != null) cameraSource.release(); // guard: may not be initialized
    if (textToSpeech != null) textToSpeech.shutdown(); // free the TTS engine
}
// (Excerpt) Camera-preview setup, also inside onCreate:
surfaceView = findViewById(R.id.surfaceView);
Context context = this;
surfaceView.getHolder().addCallback(new SurfaceHolder.Callback() {
@Override
public void surfaceCreated(SurfaceHolder holder) {
try {
if (ActivityCompat.checkSelfPermission(MainActivity.this,
Manifest.permission.CAMERA) != PackageManager.PERMISSION_GRANTED) {
return;
}
cameraSource.start(surfaceView.getHolder());
} catch (IOException e) {
e.printStackTrace();
}
}
@Override
public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
    // No-op: the preview configuration does not change at runtime
}

@Override
public void surfaceDestroyed(SurfaceHolder holder) {
    cameraSource.stop();
}
});
}
// (Excerpt) Inside the TextRecognizer's Detector.Processor callback:
SparseArray<TextBlock> sparseArray = detections.getDetectedItems();
StringBuilder stringBuilder = new StringBuilder();
// ... each detected TextBlock is appended to stringBuilder, spoken aloud,
// and written into the PDF; when the page is complete the document is closed:
myPdfDocument.close();
// ... after processing, the microphone button is re-enabled for the next command:
microid.setEnabled(true);
});
Emulator
Fig.5.1
Fig.5.1.1
Fig.5.2
App:
Fig.5.3
The assistant announces that there are three options. We grant the required Camera,
Audio, and Microphone permissions.
Fig.5.3.1
Fig.5.3.2
Reader: Input
Fig.5.4
Output
Fig.5.5
Text is read aloud by the assistant.
Location Tracker:
Fig.5.6
Object Detector:
Fig.5.7
Chapter-7
System Testing
System testing (ST) is a "black box" testing method performed to evaluate whether an
entire system meets its specified requirements. In system testing, the functionality of
the system is tested on an end-to-end basis.
System testing is usually performed by a team independent of the development team, to
impartially measure the quality of the system. It includes both functional and
non-functional tests.
Black box testing, which develops test cases from the GUI or the user's perspective, and
white box testing, which uses the internal code to construct test cases, are the two most
commonly used approaches to software testing.
White box testing
Black box testing
White box testing:
White box testing, also known as transparent, clear box, structural, or code-based testing,
involves an intimate examination of the internal workings of an application. Developers
typically carry out this testing method before the software is sent to a testing team, which
may perform other types of testing such as black-box testing. The main objective of white
box testing is to analyze the internal structures and workings of the application, focusing
primarily on the code, including its paths, conditions, loops, and branches.
This type of testing is fundamental at the early stages of software development as it
encompasses both unit testing, where individual units or components of the software are
tested, and integration testing, where it is checked how well the individual units work
together. By focusing on both the inputs and outputs, white box testing ensures the
functional correctness of the software, enhances its security, and helps in optimizing code
structure. Moreover, white box testing requires a significant level of programming skills
and a deep understanding of the codebase, making it crucial for validating algorithmic
effectiveness in the software.
Black box testing:
Black box testing contrasts sharply with white box testing by focusing solely on the
software functionality without any regard to the internal workings of the application. It
does not require knowledge of the code or structure of the program and is thus accessible
to testers who may not have programming skills. This method uses external descriptions
of the software, including specifications, requirements, and design elements, to develop
test cases.
Each test case evaluates the system's responses to inputs and checks the output against
expected results. If the output matches the expected results, the test is passed; otherwise,
it is considered a failure. This method is particularly effective for validating business
processes and user requirements as it tests from the user's perspective, ensuring the
software meets the established criteria for functionality and user interface behavior.
Although black box testing may seem less comprehensive than white or grey box testing,
it is crucial for confirming that the software performs as intended from the end-user's
standpoint and typically requires less time than the more in-depth techniques.
A customer-stated requirement definition serves as the main source for black-box testing.
It is a different kind of manual test: a kind of software testing that examines the
software's functionality without knowing anything about its code or internal structure.
Software programming skills are not necessary. Every test case is created by considering
the input and output of a specific function. The test engineer compares the software to
the requirements, finds any flaws or bugs, and then returns it to the development team.
In this method, the tester chooses a function, provides an input value to test its
functionality, and determines whether or not the function produces the desired result.
The function passes the test if it gives the expected result; otherwise, it fails.
Compared to white box and grey box testing techniques, black box testing is less
thorough, and of all the testing steps it takes the least amount of time. Black box
testing mostly serves the purpose of validating consumer or business requirements.
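Applied to Drishti itself, a black-box check can be automated with Espresso, which drives
the UI purely from the user's perspective with no knowledge of the internals. The sketch
below is a hedged example built on the test dependencies already declared in
build.gradle; the expectation that R.id.btn is visible is illustrative, not a real
project test case.

import static androidx.test.espresso.Espresso.onView;
import static androidx.test.espresso.action.ViewActions.click;
import static androidx.test.espresso.assertion.ViewAssertions.matches;
import static androidx.test.espresso.matcher.ViewMatchers.isDisplayed;
import static androidx.test.espresso.matcher.ViewMatchers.withId;

import androidx.test.ext.junit.rules.ActivityScenarioRule;
import org.junit.Rule;
import org.junit.Test;

public class MainActivityTest {
    @Rule
    public ActivityScenarioRule<MainActivity> rule =
            new ActivityScenarioRule<>(MainActivity.class);

    @Test
    public void readerButtonIsVisibleAndClickable() {
        // Black-box check: the reader button is on screen and accepts a tap.
        onView(withId(R.id.btn)).check(matches(isDisplayed()));
        onView(withId(R.id.btn)).perform(click());
    }
}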
The following are the processes for testing a web app. In web-based testing, several
areas must be checked for potential errors and bugs.
App Functionality: In web-based testing, we must verify that the application's specified
functionality, features, and operational behaviour match its specifications. For
instance: testing all mandatory fields, ensuring that every mandatory field displays an
asterisk, ensuring that optional fields do not trigger an error message, and checking
that external, internal, anchor, and mailing links work, with broken links removed.
Functional testing lets us test the app against its functional requirements and
requirements specifications.
Usability: While testing usability, developers encounter problems with scalability and
interactivity. Because different user populations will access the website, developers
must form a team to test the application using various hardware configurations and
various browsers. When a person browses an online store, for instance, a number of
questions may cross his or her mind, such as whether the website is legitimate and
whether shipping costs apply.
Browser Compatibility: We test the web application to see whether its content is
displayed appropriately across all browsers, in order to determine whether the website
works the same way in different browsers.
Security: Every website that is accessible over the internet must consider security.
Security testers examine questions such as whether unauthorised access to secure pages is
prohibited and whether files restricted to certain users can be downloaded without the
appropriate access.
Load Issues: We carry out this testing to see how the system responds to a particular
load, in order to quantify some crucial transactions. The load on the database,
application server, and so on is also monitored.
Storage and Database: Testing its storage and database is a crucial component of testing
any web application, and we must ensure that the database is tested thoroughly. We check
for problems while running database queries, measure each query's response time, and
verify that the data obtained from the database is displayed appropriately on the
website.
Chapter-8
Conclusion
8.1 Limitations
Delayed updates due to differences in time zones.
Due to linguistic and cultural barriers, instructions may be misunderstood and output
quality can decrease.
For individuals who are not tech-savvy and are unfamiliar with smartphones, there may
be obstacles.
8.2 Future Scope
The future scope of an Android application depends on various factors such as the type of app,
its functionality, target audience, market trends, and technological advancements.
Some possible future directions for our app include integration with emerging
technologies like computer vision, AI, and AR/VR to provide more help and support to
blind people. We plan to improve security and enhance user engagement using these
features:
1. Advanced location tracking: The app can use GPS and other advanced technologies to
provide real-time location tracking, with voice-guided directions and alerts when users
are approaching a destination.
2. Integration with smart home devices: The app can be integrated with various smart home
devices such as Alexa or Google Home, allowing visually impaired users to control their
home appliances through voice commands.
3. Integration with wearable devices: The app can be integrated with wearable devices such
as smartwatches or fitness trackers, allowing users to receive notifications, track their
activity, and navigate using vibrations and voice commands.
4. Gesture recognition: The app can incorporate gesture recognition technology, allowing
visually impaired individuals to control their device using simple hand gestures.
5. Improved Language Support: Finally, the app could be expanded to support additional
languages, allowing visually impaired individuals from all around the world to benefit
from its features. This could involve partnering with organizations or experts in different
countries to ensure that the app is tailored to the specific needs of each region.
6. Integration with AI Assistants: Another future scope for the app is to integrate with
popular AI assistants like Google Assistant or Amazon Alexa. This could allow visually
impaired users to perform tasks like setting reminders, making phone calls, or sending
text messages using voice commands.
7. Enhanced Navigation and Route Planning: Another potential future scope for the app is
to expand its navigation and route planning features. For example, the app could use
machine learning algorithms to analyze real-time traffic data and suggest the fastest or
most efficient route for users to take. The app could also incorporate features like voice-
guided navigation or 3D mapping to provide more detailed information about the user's
surroundings.
REFERENCES
[1] "Creating Accessible Mobile Apps for the Visually Impaired", Nielsen Norman Group, 2020. [Accessed: 12-Jan-2020].
[2] "Designing Mobile Applications for Visually Impaired Users", UX Planet, 2021. [Accessed: 25-Mar-2021].
[3] "10 Features Every Visual Impaired App Should Have", American Foundation for the Blind, 2019. [Accessed: 8-Aug-2019].
[5] "Developing Accessible Apps for Blind and Visually Impaired Users", Apple Developer, 2023. [Accessed: 17-Jul-2023].
[6] "Mobile Accessibility: How to Design an Accessible App", Interaction Design Foundation, 2021. [Accessed: 10-Oct-2021].
[7] "Best Practices for Designing Accessible Apps", Google Accessibility, 2023. [Accessed: 22-May-2023].
[9] "The Importance of Accessibility in Mobile App Development", Smashing Magazine, 2022. [Accessed: 7-Mar-2022].
[10] "Ensuring Your Mobile App is Accessible to Everyone", Microsoft Accessibility, 2023. [Accessed: 30-Sep-2023].