
A

Major Project Report


on
“Drishti”
Submitted in partial fulfillment for the award of degree of
Bachelor of Technology
In
Computer Science & Engineering

Submitted to:
Dr. Pradeep Jha
HOD-CSE/IT/AI&DS Dept.

Submitted by:
Chhotu Kumar Ram
20EGJCS030

Department of AI & DS
Global Institute of Technology
Jaipur (Rajasthan)-302022
Batch: 2020-24
GLOBAL INSTITUTE OF TECHNOLOGY
ITS-1 & 2, IT Park, EPIP, Sitapura, Jaipur – 302022, Rajasthan – INDIA

Certificate

This is to certify that the work, which is being presented in the project entitled
“Drishti” submitted by Chhotu Kumar Ram, a student of fourth year (VIII Sem),
B.Tech. in Computer Science & Engineering, in partial fulfilment for the award of
degree of Bachelor of Technology is a record of student’s work carried out and
found satisfactory for submission.

Chhotu Kumar Ram


20EGJCS031

Project Guide:
Ms. Apoorva Joshi

Project Coordinators:
Ms. Manju Mathur

Dr. Pradeep Jha
(HOD, CSE/IT/AI & DS Dept.)
Acknowledgement

We take this opportunity to express our deep sense of gratitude to our project coordinators, Ms.
Manju Mathur, Assistant Professor, Department of Computer Science and Engineering, and Dr.
Reema Ajmera, Assistant Professor, Department of Computer Science and Engineering, and to our
project guide, Ms. Apoorva Joshi, Professor/Assistant Professor, Department of Computer
Science and Engineering, Global Institute of Technology, Jaipur, for their valuable guidance
and cooperation throughout the project work. They provided constant encouragement and
unceasing enthusiasm at every stage of the project work.
We are grateful to our respected Principal, Dr. I. C. Sharma, GIT, for guiding us. We express
our indebtedness to Dr. Pradeep Jha, Head of the Department of Computer Science and
Engineering, Global Institute of Technology, Jaipur, for providing us with ample support.
Without their support and timely guidance, the completion of our project would have seemed a
far-fetched dream. We find ourselves lucky to have mentors of such great potential.

Place: GIT, Jaipur

Chhotu Kumar Ram


20EGJCS030
B.Tech. VIII Semester, IV Year
Computer Science & Engineering
Abstract

"Drishti" is a groundbreaking application designed to enhance accessibility and independence for


individuals with visual impairments. This project integrates state-of-the-art OCR (Optical
Character Recognition) readers and location detection systems into a user-friendly interface
accessible via smartphones. Through the conversion of printed text into audio and real-time
navigation assistance, "Drishti" empowers users to overcome barriers in accessing information
and navigating their surroundings. Drawing upon a comprehensive literature review, this
abstract highlights the transformative potential of "Drishti" in addressing the diverse needs of
visually impaired individuals. By leveraging the ubiquity of smartphones and the power of
assistive technologies, this project seeks to bridge the accessibility gap, promoting educational,
professional, and social inclusion. Key findings from the literature review underscore the
profound impact of assistive technologies on the quality of life for visually impaired individuals.
OCR readers revolutionize access to printed materials, while location detection systems offer
invaluable support in navigating both indoor and outdoor environments. Moreover, user-centered
design principles are essential in ensuring that "Drishti" meets the unique needs and preferences
of its users. As technology continues to evolve, "Drishti" stands as a testament to the ongoing
efforts to create a more accessible and equitable world. By fostering innovation, collaboration,
and empathy, this project paves the way for greater inclusivity and empowerment for individuals
with visual impairments.
Table of Contents

Certificate
Acknowledgement
Abstract
Table of Contents
List of Figures
List of Tables
List of Symbols

Introduction and Objective
    Present Standards
    Proposed Standards
        System Architecture
        Design of Location Tracker
        Design of Object Detector
        The Flow Control in the OCR Reader
        Location Tracker
    Technologies Used
        Java
            Design and Architecture
            Development Tools and Libraries
        TensorFlow
            Core Features and Functionalities
        ML Libraries
        TensorFlow Lite Object Detection API
            COCO SSD
System Design
    Approaching a Design Problem
    Reliability in System Design
    System Flowchart
    Basic Ideas
Software Details
    Android Studio
        Basic System Requirements for Android Studio
    Firebase
    TTS Software
Implementation Work Details
    Real-Life Applications
    Program Execution
Source Code
    AndroidManifest.xml
    Build.gradle
    Main Activity
    Model and Dataset
Input/Output Screen
    Emulator
    App Icon
    App
    Reader Input
    Reader Output
    Location Tracker
    Object Detector
System Testing
    Storage Database
Result and Conclusion
    Conclusion and Future Work

REFERENCES
List of Figures

Fig. 1.1 Present Standard
Fig. 1.2 OCR Reader
Fig. 1.3 Location Tracker
Fig. 1.4 Object Detector
Fig. 1.5 API
Fig. 1.6 TensorFlow Lite
Fig. 1.7 Java
Fig. 1.8 TensorFlow
Fig. 3.1 Android Studio
Fig. 3.2 Firebase
Fig. 3.3 TTS Software
Fig. 5.1 Emulator
Fig. 5.2 App Icon
Fig. 5.3 App
Fig. 5.4 Input
Fig. 5.5 Output
Fig. 5.6 Location Tracker
Fig. 5.7 Object Detector

Chapter-1
INTRODUCTION AND OBJECTIVE

Today there are nearly 2.2 billion people in the world who are visually impaired. Most tasks rely
on visual information; thus, visually impaired people are at a disadvantage, because crucial
information about their surroundings is unavailable to them. A blind person will always require an aid or
assistance to accompany them or support them with daily tasks. Visual impairment can have
significant effects on an individual's life, including physical, emotional, and social impacts.
Physically, visual impairment can make it challenging to carry out daily tasks such as reading,
writing, and navigating the environment. It can also lead to a decrease in mobility, falls, and
injuries. Several solutions are available to individuals with visual impairment to help them
overcome the challenges they face. These solutions include assistive technology, rehabilitation
services, and education. Assistive technology such as screen readers, location detection, and image
detection can help individuals with visual impairment to access information, communicate with
others, and perform daily tasks. The assistance given to those with visual impairments can now
be provided with the use of modern technologies.
So, this virtual assistant will be helpful for visually impaired people. This application contains an
OCR (Optical Character Recognition) reader and a location detector. The app will be able to
read a specific text to the user as audio and tell them their exact current location, so that they can
get the information they need about their surroundings.
Our application “Drishti” can help to improve the quality of life of visually impaired people.
Since it is not always possible for someone to be with a blind person 24 hours a day, this
app will prove to be a very useful tool for those who are blind. It will make reading very simple
for them, whether they prefer to read fiction, newspapers, or school textbooks.
There are no barriers to adoption: a person of any age who has a smartphone can use it.
OBJECTIVE:
Building an Android app that serves as a virtual assistant for visually impaired
individuals can be a highly impactful project. By providing a tool that enables individuals with
visual impairments to navigate the world more independently, we can empower them to live
more fulfilling lives. Such an app can offer features like voice commands, text-to-speech, and
location tracking, making it easier for users to perform daily tasks, communicate with others, and
access information. Developing this app can be a way to leverage technology for the greater
good and make a positive difference in people's lives.
There are some prerequisites which are:
• Android Studio and Firebase should be installed on your local machine.
• Knowledge of Java
• Knowledge of TensorFlow
• Knowledge of ML libraries
• Knowledge about APIs
1.1 Present Standards
One of the most important human senses is the ability to see, and the
absence of this ability has a profound effect on all the decisions a person is likely to
make throughout his or her life. Visually impaired people frequently experience discrimination in social
settings and at their place of employment because they are not expected to advance in their
profession as much as a sighted person. So, by organizing campaigns and offering
education with new tools and technologies, the government and civil society can play a
significant role in making the lives of visually impaired people easier and safer.
There are various apps available on the internet for assisting visually impaired people, like:
Be My Eyes: a free app that connects sighted volunteers with blind or visually impaired people so
they can help them out in their daily lives. Through OpenAI's GPT-4, the app has created the
first-ever virtual volunteer, enabling users to send images and receive thorough descriptions and
instructions for a variety of tasks.
OneStep Reader App: With the touch of a button on the iPhone, the OneStep Reader converts
printed text into clear speech to offer precise, quick, and effective access to both single-page and
multi-page documents.
TapTapSee: The purpose of TapTapSee is to assist the blind and visually impaired in
recognizing objects in their daily environment. Double tapping the screen will enable users to
take pictures from any angle of anything, then the app will speak the identification to the user.
Cash Reader: People who have visual impairments can quickly and easily identify and count
bills with the help of the Cash Reader app, which speaks the denomination and instantly
recognizes the currency.

Despite the availability of so many apps, a sizable number of visually impaired people are still
unable to benefit from them. This may be due to a lack of awareness, the fact that some apps are not free
to use, and the fact that some work only on iPhones.
Below we can see the basic data flow diagram of virtual assistant apps:

Fig.1.1
The user enters their choice, and the application takes in that information. The
given input is then used to perform an activity and is verified against a database. If
pertinent information is found for the input, it is returned to the user as output or feedback.
1.2 Proposed Standards
The proposed system is to create a simple Android application with an improved user interface
that will act as their voice assistant. This assistant will carry out all their tasks, from simple to
complex, with little to no internet connection. Any basic smartphone with a minimal interface
can use this. The user will have the option of giving voice commands as input.

This project is an Android app called “Drishti” which can assist the blind. It extracts text
from a PDF document and synthesizes it for the user using speech synthesis and text-content
recognition. A text document or a .ppt file can be converted into a .pdf file by searching for a specific
set of words. Being built on Android, the application uses pre-defined APIs for text-to-speech
conversion, making the process even more efficient. Google's Vision API is used to
recognize text in images. The overall percentage of blind people in the population is
3.44%, of which 53.1% use Android smartphones and the rest do not.

System Architecture
The system proposes the following features:
1. OCR Reader: With the help of this feature, users can listen to the text from a PDF by
giving voice commands.
2. Location: Here too, the user can give voice commands to ask for their location, and
the app will speak their present location as output.
3. Object Detection: With the help of this feature, the user can learn about the objects present
in their surroundings.

Design of OCR Reader

Fig.1.2

Design of Location Tracker


Fig.1.3

Design of Object Detector

Fig.1.4

The Flow control in the OCR reader


The requirements are arranged in two groups: user interface and functional requirements.
1. User interface
• Simple to use.
• Flexibility of voice control.
2. Functional requirements
• Reads the text from images (OCR Reader).
• Tracks the present location of the user.
• Detects and recognizes objects in images.
• Exit: closes the app.

OCR Reader: Optical Character Reader


It is a technology that converts printed or handwritten text into machine-encoded text, allowing
computers to read and analyze the information. OCR Reader works by using a scanner or camera
to capture an image of the text, and then the software analyses the image and translates it into
digital characters. OCR Readers are widely used in document management systems, digital
archives, and automated data entry processes, where large volumes of printed or handwritten text
need to be digitized and processed quickly and accurately.
OCR Readers have revolutionized the way we handle printed and handwritten documents,
allowing us to digitize vast amounts of information in a matter of seconds. OCR technology has
significantly reduced the time and cost associated with manual data entry and document
processing, making it an essential tool in various industries, including healthcare, finance, and
legal. Moreover, OCR Readers have made documents more accessible, as digital copies can be
easily stored, shared, and searched, making it easier to retrieve information quickly and
accurately. OCR technology is constantly evolving, with improvements in accuracy, speed, and
versatility, ensuring that it will continue to play an essential role in document management and
data processing.

Location Tracker: Using Google API


API: API stands for Application Programming Interface. When discussing APIs, any software with a specific
function is referred to as an application. The interface can be compared to
a service agreement between two programs. This agreement specifies the requests and responses
that the two parties will use to communicate. APIs are required to connect apps and carry out a
predefined function that is based on sharing data and running pre-defined operations. They serve
as a go-between, enabling programmers to create fresh programmatic interactions across the
many programs that consumers and companies use on a daily basis.

Fig.1.5

A location tracker using Google API is a powerful tool that enables users to track the real-time
location of a device or individual. This technology leverages the power of Google Maps to
provide accurate and up-to-date location data, making it an essential tool for businesses,
individuals, and even law enforcement agencies. By integrating with Google API, a location
tracker can access a wealth of data, including traffic patterns, local landmarks, and business
listings, all of which can be used to provide a comprehensive understanding of a user's location.
Additionally, this technology can be used to monitor the movements of individuals, providing an
extra layer of security for loved ones or employees. Overall, a location tracker using Google API
can help businesses and individuals save time, increase efficiency, and improve safety.
One of the key benefits of a location tracker using Google API is its ability to provide accurate
location data in real time. This technology can be used to track the movements of a device or
individual, providing users with an up-to-date understanding of their location. Additionally, this
technology can be used to monitor the location of a device or individual over time, providing
insights into patterns and behaviors. For businesses, this can be an invaluable tool for optimizing
operations, as it can help identify inefficiencies and opportunities for improvement. Additionally,
individuals can use location trackers to keep tabs on loved ones or ensure their own safety when
traveling or exploring new areas. Overall, a location tracker using Google API is a powerful and
versatile tool that can be used for a wide range of applications.
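The report does not reproduce the location code itself, so the following is a minimal sketch of how a one-shot "where am I?" request could be served with the fused location provider from Google Play services and the Android Geocoder, assuming the ACCESS_FINE_LOCATION permission has already been granted and a recent play-services-location dependency is on the classpath. The class name LocationHelper and the speakLocation() placeholder are illustrative, not taken from the project sources.

LocationHelper.java (illustrative):
// Minimal sketch: fetch the current location once and turn it into a
// spoken address. Assumes ACCESS_FINE_LOCATION is already granted and
// play-services-location 21+ is used (for the Priority class).
import android.annotation.SuppressLint;
import android.content.Context;
import android.location.Address;
import android.location.Geocoder;
import android.location.Location;
import com.google.android.gms.location.FusedLocationProviderClient;
import com.google.android.gms.location.LocationServices;
import com.google.android.gms.location.Priority;
import com.google.android.gms.tasks.CancellationTokenSource;
import java.util.List;
import java.util.Locale;

public class LocationHelper {

    // Permission is checked by the caller before this method is invoked.
    @SuppressLint("MissingPermission")
    public static void announceCurrentLocation(Context context) {
        FusedLocationProviderClient client =
                LocationServices.getFusedLocationProviderClient(context);

        // One-shot request for a fresh, high-accuracy fix.
        client.getCurrentLocation(Priority.PRIORITY_HIGH_ACCURACY,
                        new CancellationTokenSource().getToken())
              .addOnSuccessListener(location -> {
                  if (location != null) {
                      speakLocation(context, describe(context, location));
                  }
              });
    }

    // Convert latitude/longitude into a human-readable address line.
    private static String describe(Context context, Location location) {
        try {
            Geocoder geocoder = new Geocoder(context, Locale.getDefault());
            List<Address> results = geocoder.getFromLocation(
                    location.getLatitude(), location.getLongitude(), 1);
            if (results != null && !results.isEmpty()) {
                return results.get(0).getAddressLine(0);
            }
        } catch (Exception ignored) {
            // Fall through to raw coordinates if reverse geocoding fails.
        }
        return "latitude " + location.getLatitude()
                + ", longitude " + location.getLongitude();
    }

    private static void speakLocation(Context context, String text) {
        // Placeholder: hand the text to the app's TextToSpeech instance.
    }
}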

Object Detection – Using TensorFlow


Our Android application integrates TensorFlow's SSD model for robust object detection,
complemented by the Android Text-to-Speech (TTS) API to provide real-time voice output for
visually impaired users. This dynamic fusion enables the app to accurately identify and vocally
communicate detected objects, promoting accessibility and independence. Through usability
testing and user feedback, we have fine-tuned the feature to ensure clear and informative voice
output, marking a significant stride in leveraging technology to enhance the daily experiences of
visually impaired individuals.

Fig.1.6

Important Steps involved in Object Detection:


1. Model Integration:
The TensorFlow library is integrated into the Android project, allowing seamless utilization of
the SSD (Single Shot Multibox Detector) model for object detection.
2. SSD Model Loading:
The pre-trained SSD model and its label dataset are stored in the project's assets folder. During
runtime, the model is loaded into the Android application, enabling efficient and accurate object
detection.
3. Input Image Processing:
Images, either from the camera or gallery, undergo preprocessing to align with the SSD model's
requirements. This step involves resizing, normalization, and formatting the image data
appropriately.
4. Object Detection Inference:
Inference is performed on the pre-processed image using the loaded SSD model. The model
outputs bounding boxes and class predictions, effectively identifying objects within the image.
5. Post-processing:
Post-processing extracts relevant information from the model's output, including bounding box
coordinates and class labels, facilitating a structured representation of detected objects.
6. Text-to-Speech Integration:
The Android Text-to-Speech (TTS) API converts the names or labels of detected objects into
spoken words, providing real-time, audible feedback to users.
7. User Interaction:
The user interface is designed to offer visual feedback, such as drawing rectangles around
detected objects. Object detection is triggered based on user interactions, such as tapping a button
or capturing an image.
8. Usability Considerations:
Usability testing with visually impaired users ensures the clarity and informativeness of the voice
output. Additional contextual information about detected objects may be incorporated to enhance
the user experience. A hedged sketch of this detection-to-speech pipeline is shown below.
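This sketch uses the TensorFlow Lite Task Library's ObjectDetector together with the Android TextToSpeech API. The model file name detect.tflite, the score threshold, and the class name SpokenObjectDetector are illustrative assumptions; the actual app may wire these steps differently (for example, through the plain Interpreter API, sketched later in the COCO SSD section).

SpokenObjectDetector.java (illustrative):
// Minimal sketch of the detection-to-speech pipeline described above.
// Assumes a TFLite SSD model with metadata, named "detect.tflite", in
// the assets folder; the file name is an assumption for illustration.
import android.content.Context;
import android.graphics.Bitmap;
import android.speech.tts.TextToSpeech;
import java.io.IOException;
import java.util.List;
import java.util.Locale;
import org.tensorflow.lite.support.image.TensorImage;
import org.tensorflow.lite.task.vision.detector.Detection;
import org.tensorflow.lite.task.vision.detector.ObjectDetector;

public class SpokenObjectDetector {

    private final ObjectDetector detector;
    private final TextToSpeech tts;

    public SpokenObjectDetector(Context context) throws IOException {
        // Steps 1-2: load the SSD model (assumed file name) from assets.
        ObjectDetector.ObjectDetectorOptions options =
                ObjectDetector.ObjectDetectorOptions.builder()
                        .setMaxResults(5)
                        .setScoreThreshold(0.5f)   // illustrative threshold
                        .build();
        detector = ObjectDetector.createFromFileAndOptions(
                context, "detect.tflite", options);

        // Step 6: prepare text-to-speech for voice output.
        tts = new TextToSpeech(context, status -> {
            if (status == TextToSpeech.SUCCESS) {
                tts.setLanguage(Locale.ENGLISH);
            }
        });
    }

    // Steps 3-7: preprocess the camera/gallery bitmap, run inference,
    // and speak the label of every detected object.
    public void detectAndSpeak(Bitmap bitmap) {
        TensorImage image = TensorImage.fromBitmap(bitmap);
        List<Detection> results = detector.detect(image);
        for (Detection detection : results) {
            String label = detection.getCategories().get(0).getLabel();
            tts.speak(label, TextToSpeech.QUEUE_ADD, null, label);
        }
    }
}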
1.3 Technologies Used
This project is built using the Java programming language, TensorFlow, and machine learning libraries.

1. JAVA
Java is a high-level programming language based on the concepts of object-oriented
programming initially developed by Sun Microsystems and now owned by Oracle Corporation.
It was designed to be platform-independent and portable, meaning that once a Java program is
written, it can run on any computer or device with a Java Virtual Machine (JVM) installed.

Fig.1.7

One of the main features of Java is its "write once, run anywhere" philosophy, which allows
developers to create a single codebase that can be used on multiple platforms without the need
for modification. This makes it a popular choice for developing applications that can run on a
variety of operating systems, including Windows, macOS, Linux, and mobile devices such as
Android.
Java also provides a wide range of libraries and tools for developers, making it easier to build
complex applications. It is commonly used in enterprise applications, web development, mobile
app development, and game development.
Design and Architecture
Java's design centers around robustness, portability, and high performance. Its syntax draws
heavily from C++, which eases the learning curve for developers familiar with that language.
However, Java eschews the complexity and security issues associated with direct pointer
manipulation, making it a safer programming language that is less prone to bugs and security
vulnerabilities.
Central to Java’s architecture is the Java Virtual Machine (JVM), an engine that executes Java
bytecode. This bytecode is a translation of Java source code, allowing Java programs to run on
any device that has a JVM. This layer of abstraction not only ensures operational consistency
across diverse hardware but also facilitates security, as the JVM can contain and manage code
execution within a virtual sandbox.
Development Tools and Libraries
Java developers have access to the Java Development Kit (JDK), which includes all the
necessary tools for Java application development, such as the compiler (javac), runtime
environment (JRE), and numerous utility programs. Over the years, Java has expanded its library
of classes, which now includes tools for GUI development, networking, security, and database
access, among many other functions, simplifying the development process significantly.
Enterprise and Web Applications
For enterprise environments, Java offers Java Enterprise Edition (Java EE), which extends Java
Standard Edition (Java SE) with specifications for scalable, multi-tiered business applications.
Java EE includes APIs for object-relational mapping, distributed computing, and web services,
which are essential for modern enterprise solutions.
In web development, Java’s servlets and JavaServer Pages (JSP) allow for the creation of
dynamic, data-driven web applications. Java’s strong security features make it an excellent
choice for web applications that require reliable security protocols to handle sensitive data.

2. TENSORFLOW
TensorFlow, developed by the Google Brain team, is a versatile open-source software library
designed for high-performance numerical computation. It has become synonymous with machine
learning and deep learning due to its powerful and flexible capabilities for building and training
complex neural networks. The software excels in handling dataflow and differentiable
programming across various computing devices, including CPUs, GPUs, and TPUs (Tensor
Processing Units), making it highly scalable and efficient for both research and production.

Fig.1.8
Core Features and Functionalities
At its core, TensorFlow allows users to create advanced machine learning models through an
intuitive and flexible architecture that supports defining, optimizing, and computing multi-
dimensional arrays, or tensors. This functionality is crucial for developing sophisticated
algorithms that can learn from and make decisions based on large sets of data.
TensorFlow's architecture is built to be modular, which means developers can use and reuse
components as needed. The framework supports a range of tasks from simple regression models
to complex neural networks involving deep learning. Its ability to automate the differentiation of
complex expressions enables developers to focus more on the architecture of their models rather
than the calculus behind them.
Tooling and Libraries
To aid in the development and deployment of machine learning models, TensorFlow provides a
rich ecosystem of tools, including TensorFlow Lite for mobile and embedded devices,
TensorFlow Extended (TFX) for production environments, and TensorBoard for visualization
and monitoring of model training. These tools are designed to streamline the process of model
development, from initial data preprocessing and model design to training, evaluation, and
deployment.
TensorFlow also offers a comprehensive library, TensorFlow Hub, which is a repository for
reusable machine learning models and parts. This allows developers to import and implement
pre-built models and layers in their projects, significantly speeding up the development process
and encouraging best practices in machine learning.
Community and Support
The TensorFlow community is a robust and vibrant network of developers, researchers, and
technology enthusiasts who contribute to its continuous development. This community plays a
critical role in the iterative improvement of the library, by providing feedback, sharing
innovative uses, and contributing code. TensorFlow’s support for multiple programming
languages, including Python—the most popular language for machine learning—as well as C++
and Java, makes it accessible to a broad audience.
Applications and Impact
TensorFlow's impact is widespread across industries—from healthcare, where it enables better
diagnostics and predictive analytics, to automotive, where it is used for vehicle recognition and
autonomous driving technologies. In the field of natural language processing, TensorFlow
powers systems that can understand and generate human-like text, and in computer vision, it
helps in creating systems that can identify objects, faces, and emotions from images and videos.
In academics and research, TensorFlow is used to push the boundaries of machine learning and
artificial intelligence, helping researchers to uncover new possibilities and innovations that could
lead to further breakthroughs in the field.

3. ML LIBRARIES

ML: Machine learning (ML) is a subfield of artificial intelligence (AI) that involves the use
of algorithms and statistical models to enable computer systems to learn from data, identify
patterns, and make predictions without being explicitly programmed. It has a wide range of
applications, including image and speech recognition, natural language processing,
recommendation systems, and predictive analytics. ML algorithms can be supervised,
unsupervised, or semi-supervised, and they require large amounts of data to be trained
effectively. As the field continues to grow, new algorithms and techniques are constantly
being developed, making ML an exciting and dynamic area of research and innovation.
Machine learning (ML) libraries are software tools that enable developers and data
scientists to build and train machine learning models. These libraries provide a set of pre-
built algorithms, functions, and tools that make it easy for developers to implement complex
ML models without having to write extensive code from scratch. Popular ML libraries
include TensorFlow, PyTorch, and Scikit-learn, among others. Each library has its unique
features, advantages, and disadvantages that suit different use cases.

ML Kit Vision APIs, developed by Google, offer a versatile platform for integrating
computer vision capabilities into mobile applications. One notable feature is the Optical
Character Recognition (OCR) module, which provides seamless text recognition
functionalities. This API is designed to simplify complex computer vision tasks, allowing
developers to leverage pre-trained models without extensive machine learning expertise.

In the context of an OCR Reader project, developers can effortlessly integrate ML Kit's OCR
capabilities. By incorporating the necessary dependencies into the project and initializing the
OCR detector, developers gain access to a robust toolset for extracting textual information
from images. This includes support for multiple languages and the flexibility to choose
between on-device or cloud-based processing, offering a balance between real-time
responsiveness and computational efficiency.
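As an illustration of that integration, here is a minimal sketch of on-device text recognition with ML Kit, assuming the ML Kit text-recognition dependency has been added to the Gradle file. The class name OcrHelper and the speakText() placeholder are illustrative rather than the project's actual code.

OcrHelper.java (illustrative):
// Minimal sketch: recognize text in a captured image with ML Kit and
// hand the result to the app's voice output.
import android.graphics.Bitmap;
import com.google.mlkit.vision.common.InputImage;
import com.google.mlkit.vision.text.Text;
import com.google.mlkit.vision.text.TextRecognition;
import com.google.mlkit.vision.text.TextRecognizer;
import com.google.mlkit.vision.text.latin.TextRecognizerOptions;

public class OcrHelper {

    private final TextRecognizer recognizer =
            TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS);

    public void readTextFrom(Bitmap bitmap) {
        // Wrap the camera/gallery bitmap; 0 means no extra rotation is applied.
        InputImage image = InputImage.fromBitmap(bitmap, 0);

        recognizer.process(image)
                .addOnSuccessListener(this::handleResult)
                .addOnFailureListener(e -> speakText("Sorry, no text could be read."));
    }

    private void handleResult(Text result) {
        String recognized = result.getText(); // all recognized lines, joined
        speakText(recognized.isEmpty() ? "No text detected." : recognized);
    }

    private void speakText(String text) {
        // Placeholder: pass the string to the app's TextToSpeech instance.
    }
}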
The TensorFlow Lite Object Detection API is a framework that enables efficient deployment of
object detection models on mobile and edge devices. It is an extension of TensorFlow Lite,
designed specifically for on-device object detection tasks. The API allows developers to
integrate and run pre-trained models for real-time object detection in applications, balancing
accuracy and speed for resource-constrained environments.

COCO SSD MobileNet V1 Model in TensorFlow:


The COCO SSD (Common Objects in Context Single Shot MultiBox Detector) MobileNet
V1 model is a specific instance of the SSD architecture paired with MobileNet as the base
feature extractor. Here is a breakdown of its components:

Single Shot MultiBox Detector (SSD): SSD is a popular object detection framework that
enables the simultaneous prediction of multiple bounding boxes and class scores in a single
forward pass. It operates at different scales to capture objects of varying sizes.

MobileNet V1: MobileNet is a lightweight and efficient convolutional neural network
architecture designed for mobile and edge devices. MobileNet V1, the first version of
MobileNet, is known for its ability to achieve a good balance between model size and
accuracy.

Feature Extractor: MobileNet V1 serves as the feature extractor for the COCO SSD model.
It transforms input images into a set of feature maps, capturing hierarchical features at
different spatial resolutions.
Anchor Boxes: SSD employs anchor boxes at different scales and aspect ratios to predict
bounding boxes efficiently. These anchor boxes serve as reference boxes for predicting
object locations.

Output Layers: The model's output layers provide predictions for bounding box coordinates
and associated class scores. The SSD architecture generates predictions across multiple
scales, contributing to its versatility in detecting objects of various sizes.

COCO (Common Objects in Context) Dataset: The COCO dataset is a widely used
benchmark in computer vision that encompasses a diverse range of object categories. The
COCO SSD MobileNet V1 model is trained on this dataset, enabling it to recognize and
classify a broad spectrum of objects.
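As a lower-level alternative to the Task Library sketch shown earlier, the publicly released COCO SSD MobileNet V1 TFLite model (300 x 300 input, up to 10 detections per image) exposes exactly the four output arrays described above. The sketch below, whose tensor shapes and labels list are assumptions based on that released model, shows how they are typically read with the plain TensorFlow Lite Interpreter.

SsdPostProcessor.java (illustrative):
// Sketch of reading the COCO SSD MobileNet V1 output tensors with the
// plain TFLite Interpreter. Shapes assume the standard released model.
import android.util.Log;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.tensorflow.lite.Interpreter;

public class SsdPostProcessor {

    private static final int NUM_DETECTIONS = 10; // fixed by the released model

    // imgData: preprocessed 300x300 image buffer; labels: COCO class names.
    public static void run(Interpreter tflite, ByteBuffer imgData, List<String> labels) {
        float[][][] outputLocations = new float[1][NUM_DETECTIONS][4]; // boxes
        float[][] outputClasses = new float[1][NUM_DETECTIONS];        // class ids
        float[][] outputScores = new float[1][NUM_DETECTIONS];         // confidences
        float[] numDetections = new float[1];

        Object[] inputs = {imgData};
        Map<Integer, Object> outputs = new HashMap<>();
        outputs.put(0, outputLocations);
        outputs.put(1, outputClasses);
        outputs.put(2, outputScores);
        outputs.put(3, numDetections);

        tflite.runForMultipleInputsOutputs(inputs, outputs);

        for (int i = 0; i < (int) numDetections[0]; i++) {
            if (outputScores[0][i] < 0.5f) continue; // drop low-confidence boxes
            // Some label files keep a background entry at index 0, in which
            // case an offset of 1 must be added to the class id.
            String label = labels.get((int) outputClasses[0][i]);
            float[] box = outputLocations[0][i]; // [top, left, bottom, right], 0..1
            Log.d("SSD", label + " (" + outputScores[0][i] + "): "
                    + box[0] + ", " + box[1] + ", " + box[2] + ", " + box[3]);
        }
    }
}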

Chapter-2
System Design

2. System Design
System design is the process of designing the architecture, components, and interfaces of a
system to meet the requirements of the end user. System design is also a staple of technical
interviews: almost every IT giant, including Facebook, Amazon, Google, and Apple,
asks questions based on system design concepts such as
scalability, load balancing, caching, and more.
It is a broad field of engineering study that includes a variety of concepts and
principles to help design scalable systems. These concepts are widely requested during
interviews for SDE 2 and SDE 3 positions in various technology companies. These senior
roles require better understanding of how to solve specific design challenges, how to respond
when the system is expected to have more traffic, how to design the system's database, and so on.
All of these decisions must be made carefully, taking into account scalability, reliability,
availability, and maintainability.
Approaching a Design Problem
• Breaking Down the Problem: Given a design task, start by breaking it down into
smaller components. These components can be services or functions that the system must
implement. At first, the system may appear to have a lot of features,
and you do not need to design everything if it is an interview. Ask the interviewer which
features they want implemented in the system.
• Communicating Your Ideas: Communicate well with the interviewer. Keep them up to
date as you develop your system. Discuss the process out loud. Visualize your designs on
the board using flowcharts and diagrams. Explain your ideas to the interviewer:
how you will solve scalability problems, how you will design the databases, and so on.
• Assumptions That Make Sense: Make some reasonable assumptions when designing
your system. Let's say you need to estimate the number of queries your system
will handle per day, the number of database hits per month, or the efficiency level of
your caching system. These are numbers to consider when designing. Keep these
estimates as reasonable as possible and back up your guesses with some compelling facts and
figures.

There are three main features of a system design:


Reliability in System Design: A system that can meet the needs of end users is reliable. When
designing a system, you need to plan the implementation of a set of functions and services
in the system. A system can be considered reliable if it can perform all these functions without
wear and tear. A fault-tolerant system is one that can continue to function reliably in
the event of a fault. A fault is a defect that occurs in one or another component of the
system; a fault does not guarantee system failure. A failure is a condition in which the
system cannot function properly and can no longer provide certain services to end users.
Availability in System Design: Availability is the ability of a system to provide a
consistent level of performance, also referred to as uptime. It is important that
the system provides high availability to process user requests.

The degree of availability varies from system to system. If you're developing a social networking
application, you don't really need high availability. A delay of several seconds is acceptable. It's
not hard to see your favourite celebrity's Instagram posts with a 5-10 second delay. However, if
you are developing a system for a hospital, data center, or banking institution, you must ensure
that the system is highly available, because service delays can lead to huge losses.
There are various principles you should follow in order to ensure the availability of your system:
There should be no single point of failure in the system. Essentially, the system should
not rely on a single service to handle all requests. This is because if this service is interrupted,
the entire system may become corrupted and eventually become unusable. Faults should be
detected and eliminated as they occur.

Scalability in System Design: Scalability refers to the ability of a system to cope
with increasing loads. When designing a system, the loads experienced must be taken into
account. It is said that if you have to design a system for load X, you should plan for 10 times that
load and test for 100 times that load. There may be situations where the load on the system increases.
If you are developing an e-commerce application, for example, you may experience a spike in
load when you are running a flash sale or when a new product is released for sale.
In this case, the system needs to be smart enough to handle the increasing load efficiently,
allowing it to scale.
In order to ensure scalability, you should be able to compute the load that your system will
experience.

2.1 System Flowchart


Flowcharts can be used to represent the algorithms graphically. It is frequently used by
programmers as a technique for planning programs to address issues. It uses interconnected
symbols to represent the movement of information and processing.
"Flowcharting" is the process of creating a flowchart for an algorithm. The path that data takes
through the system and the decisions made at various levels are depicted in a system flowchart.
To depict the flow of data, including what is occurring to it and where it is going, many symbols
are linked together. There are several primary categories of flowcharts when considering user
groups. The sole difference between the flowchart types is how this exercise is managed.
The following flowchart kinds exist:
Process Flowcharts
This kind of flowchart displays every step that goes into producing a good. In essence, it offers a
tool to examine the final result. The most popular tool in process engineering for displaying the
relationships between the major and minor components of a product is a flowchart. It is
employed in business product modelling to aid in the comprehension of project needs by
personnel and to obtain some understanding of the project.

Data Flowcharts
As its name implies, this flowchart is used to evaluate data, and more specifically, it aids in the
analysis of project-related structural information. This flowchart makes it simple to comprehend
how data enters and leaves the system. Most frequently, it is utilized to manage data or evaluate
information moving in and out of the system.

A Business Process Modelling Diagram


Using this flowchart or diagram, one can describe a business process analytically and simplify
ideas that are necessary to comprehend business operations and information flow. This flowchart
graphically represents the business process and opens the way for process improvement.

Boxes that can be used to create a Flowchart

There are various box kinds that can be used to create flowcharts. Arrow lines link each of the
many types of boxes to the others. Arrow lines are used to show control flow. Let's explore each
box in brief.

1. Terminal

This oval-shaped box is used to signal the beginning or end of the program. Every flowchart
diagram has two oval shapes, one to represent the beginning of an algorithm and the other to
represent its conclusion.

2. Data

The inputs and outputs are entered into a parallelogram-shaped box. The information
entering the system or algorithm and information leaving the system or algorithm is
essentially depicted like this.
3. Process

The main logic of the algorithm or the major body of the program is written inside this
rectangular box by the programmer. The primary processing codes are written inside this
box, making it the most important part of the flowchart.
4. Decision
This is a rhombus-shaped box, and inside it are control statements like if and conditions like
a > 0. There are two ways to go from this one; one is "yes," and the other is "no." These are
the possibilities in this box, just as there are just two options for any decision: yes or no.

5. Flow

The algorithm or process's flow is depicted by this arrow line. It stands for the process flow's
direction. Arrows are added between stages to show how the program
flows, which makes the flowchart easier to read.

6. Delay

Any waiting interval that is a component of a process is represented by the Delay flowchart
symbol. Process mapping frequently uses delay shapes.

The basic idea/design of our project:


Chapter-3
Software Details/Standards

3. Software Details
1. Android Studio:

Fig.3.1
Android Studio is the official integrated development environment (IDE) for Android app
development. It is developed by Google and is based on the popular IntelliJ IDEA software.
Android Studio provides a comprehensive suite of tools for developing Android apps, including
a code editor, visual layout editor, debugger, and performance analysis tools. It also includes a
variety of templates and sample code to help developers get started with their projects quickly.
Some key features of Android Studio include:
• A Gradle-based build system that automates the building and packaging of app code and
resources.
• A layout editor that allows developers to drag and drop UI components and preview the
design of their app in real time.
• A rich code editor that supports features like code completion, refactoring, and
debugging.
• Integration with Google Play services and other libraries, allowing developers to easily
add features like Google Maps, Firebase, and AdMob to their apps.
• Support for multiple programming languages, including Kotlin and Java.

Basic system requirements for Android Studio


Operating System Version:
  - Microsoft Windows: Windows 8/10 (64-bit)
  - macOS: macOS 10.14 Mojave or newer
  - Linux: any 64-bit Linux distribution that supports GNOME, KDE, or Unity; GNU C Library (glibc) 2.31 or later

Required RAM: 8 GB or more

Free space: 8 GB of available disk space minimum

Minimum screen resolution: 1280 x 800

2. Firebase

Fig.3.2
Firebase is a mobile and web application development platform owned by Google. It provides a
wide range of services that help developers build, test, and deploy apps more quickly and easily.

Firebase includes a number of different features, such as real-time database, cloud storage,
authentication, hosting, analytics, and more. These features are designed to work seamlessly
together, allowing developers to create complex applications with ease.

One of the key advantages of Firebase is that it is a serverless platform, meaning that developers
don't have to worry about managing servers or infrastructure. Instead, Firebase takes care of all
the backend services, allowing developers to focus on building the frontend and user experience
of their applications.

Firebase also has a strong community of developers and resources available, including
documentation, code samples, and support forums. This makes it easier for developers to get
started with Firebase and troubleshoot any issues that may arise.

Overall, Firebase is a powerful platform that enables developers to build high-quality mobile and
web applications quickly and easily.
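How Firebase is wired into "Drishti" is not shown in this chapter. As one hedged example, if the Realtime Database were used to store the user's last spoken location (an assumption, not a confirmed design decision), a write could look like the sketch below; the users/<id>/lastLocation path is illustrative.

LocationStore.java (illustrative):
// Hedged sketch: store the last spoken location in the Firebase
// Realtime Database under an illustrative path.
import com.google.firebase.database.DatabaseReference;
import com.google.firebase.database.FirebaseDatabase;

public class LocationStore {

    public static void saveLastLocation(String userId, String addressLine) {
        DatabaseReference ref = FirebaseDatabase.getInstance()
                .getReference("users")     // illustrative path
                .child(userId)
                .child("lastLocation");

        // Writes are queued locally by Firebase while offline and synced later.
        ref.setValue(addressLine);
    }
}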

3. TTS (Text-to-speech) Software

Fig.3.3
Text-to-speech (TTS) software is a type of computer software that converts written text into
spoken words. It uses natural language processing (NLP) and speech synthesis technology to
convert written text into audio output, which can then be played through speakers or headphones.
TTS software can be useful for individuals with visual impairments or reading difficulties, as it
allows them to listen to text rather than reading it. It can also be helpful for language learning or
for individuals who prefer listening to reading.
TTS software has come a long way in recent years, with advancements in NLP and machine
learning making it more accurate and natural-sounding. Some TTS software even allows users to
customize the voice and speed of the spoken output, and some can even generate multiple voices
and accents. Overall, TTS software has many practical applications and has the potential to make
information more accessible to a wider range of individuals.
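Since both the OCR reader and the object detector ultimately speak their results, here is a minimal sketch of Android's built-in TextToSpeech engine as it could be used for that voice output. The locale, speech rate, and class name Speaker are illustrative defaults rather than the project's actual settings.

Speaker.java (illustrative):
// Minimal sketch of Android's built-in TTS engine: initialize once,
// then speak arbitrary strings (e.g. OCR results or detected objects).
import android.content.Context;
import android.speech.tts.TextToSpeech;
import java.util.Locale;

public class Speaker {

    private TextToSpeech tts;
    private boolean ready = false;

    public Speaker(Context context) {
        // The engine initializes asynchronously; speak only once it is ready.
        tts = new TextToSpeech(context, status -> {
            if (status == TextToSpeech.SUCCESS) {
                tts.setLanguage(Locale.US);   // illustrative default
                tts.setSpeechRate(1.0f);
                ready = true;
            }
        });
    }

    public void say(String text) {
        if (ready) {
            tts.speak(text, TextToSpeech.QUEUE_FLUSH, null, "drishti-utterance");
        }
    }

    public void shutdown() {
        tts.shutdown(); // release the engine when the activity is destroyed
    }
}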

Chapter-4

Implementation Work Details

4. Work Details
Details are important in the workplace because they make a lasting impression on colleagues,
customers, and bosses.
Attention to detail shows that you are organized and attentive to your responsibilities. Also,
accuracy and thoroughness of work are a great way to earn trust and respect. People look for
attentive employees for good reason.
Sensory perception is the ability to perceive information through the senses. Paying attention to
detail is a skill everyone needs from time to time. When attention to detail becomes part of your
nature, it helps you develop your sensory perception.
It is important to develop sensory perception at work and in life, as a lack of attention to detail has
negative consequences.
If you don't pay attention to the details, you won't know what needs to be
fixed or improved. Attention to detail develops sensory skills, helping you better deal with
distractions and focus.

4.1 Real Life Applications

A virtual assistant for visually impaired individuals can be a game-changer in many real-life
applications. For example, in the workplace, a virtual assistant can assist visually impaired
employees by reading out important documents, emails, and messages. This can help them stay
on top of their work and reduce the need for assistance from others. Additionally, a virtual
assistant can track the location of important objects, such as office supplies, and guide visually
impaired individuals to them. This can help improve their efficiency and independence at work.

In daily life, a virtual assistant can also be incredibly useful for visually impaired individuals. It
can read out labels on food items, medication, and household products, helping them to identify
what they are using or consuming. A virtual assistant can also provide information about the
location of objects within their home, such as keys, wallets, and phones, reducing the amount of
time spent searching for them. Furthermore, a virtual assistant can guide visually impaired
individuals through unfamiliar environments, such as public transportation systems or airports,
ensuring they arrive at their destination safely and on time.

Overall, we can say that a virtual assistant can help visually impaired individuals with everyday
tasks such as shopping and running errands. The assistant can read product labels and scan
barcodes, making it easier for individuals to identify items they need. The virtual assistant can
also track the location of items in stores, making it easier for individuals to navigate and find
what they need. Overall, a virtual assistant can significantly improve the quality of life for
visually impaired individuals, enabling them to be more independent and self-sufficient.

4.2 Data Implementation and Program Execution


Implementation is the execution or practice of a plan, method, design, idea, model,
specification, standard, or policy to achieve something. Thus, implementation is an action that
must follow prior planning in order for something to actually happen.
The schemas used are described below:
1. XML files:
compiler.xml -
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="CompilerConfiguration">
<bytecodeTargetLevel target="11" />
</component>

</project>

deploymentTargetDropDown.xml
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="deploymentTargetDropDown">
<targetSelectedWithDropDown>
<Target>
<type value="QUICK_BOOT_TARGET" />
<deviceKey>
<Key>
<type value="VIRTUAL_DEVICE_PATH" />
<value value="C:\Users\DELL\.android\avd\
Pixel_2_XL_API_25.avd" />
</Key>
</deviceKey>
</Target>
</targetSelectedWithDropDown>
<timeTargetWasSelectedWithDropDown value="2023-03-
16T08:49:30.845948600Z" />
</component>
</project>

gradle.xml
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="GradleMigrationSettings" migrationVersion="1" />
<component name="GradleSettings">
<option name="linkedExternalProjectsSettings">
<GradleProjectSettings>
<option name="testRunner" value="GRADLE" />
<option name="distributionType" value="DEFAULT_WRAPPED" />
<option name="externalProjectPath" value="$PROJECT_DIR$" />
<option name="modules">
<set>
<option value="$PROJECT_DIR$" />
<option value="$PROJECT_DIR$/app" />
</set>
</option>
</GradleProjectSettings>
</option>
</component>
</project>

misc.xml
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="ExternalStorageConfigurationManager" enabled="true" />
<component name="ProjectRootManager" version="2" languageLevel="JDK_11"
default="true" project-jdk-name="Android Studio default JDK" project-jdk-
type="JavaSDK">
<output url="file://$PROJECT_DIR$/build/classes" />
</component>
<component name="ProjectType">
<option name="id" value="Android" />
</component>
</project>

workspace.xml
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="AndroidLayouts">
<shared>
<config />
</shared>
</component>
<component name="AutoImportSettings">
<option name="autoReloadType" value="NONE" />
</component>
<component name="ChangeListManager">
<list default="true" id="8b0fc1cc-7412-4c3d-9cc3-680e76b249a7"
name="Changes" comment="" />
<option name="SHOW_DIALOG" value="false" />
<option name="HIGHLIGHT_CONFLICTS" value="true" />
<option name="HIGHLIGHT_NON_ACTIVE_CHANGELIST" value="false" />
<option name="LAST_RESOLUTION" value="IGNORE" />
</component>
<component name="ExecutionTargetManager"
SELECTED_TARGET="device_and_snapshot_combo_box_target[C:\Users\DELL\.android\
avd\Pixel_2_XL_API_25.avd]" />
<component name="ExternalProjectsData">
<projectState path="$PROJECT_DIR$">
<ProjectState />
</projectState>
</component>
<component name="MarkdownSettingsMigration">
<option name="stateVersion" value="1" />
</component>
<component name="ProjectId" id="2N520S5DO2z8jAXIFM7Sgok4tKW" />
<component name="ProjectViewState">
<option name="hideEmptyMiddlePackages" value="true" />
<option name="showLibraryContents" value="true" />
</component>
<component name="PropertiesComponent"><![CDATA[{
"keyToString": {
"RunOnceActivity.OpenProjectViewOnStart": "true",
"RunOnceActivity.ShowReadmeOnStart": "true",
"RunOnceActivity.cidr.known.project.marker": "true",
"cidr.known.project.marker": "true",
"last_opened_file_path": "C:/Users/DELL/AndroidStudioProjects/Drishti",
"project.structure.last.edited": "Dependencies",
"project.structure.proportion": "0.17",
"project.structure.side.proportion": "0.2"
}
}]]></component>
<component name="RunManager">
<configuration name="app" type="AndroidRunConfigurationType"
factoryName="Android App">
<module name="Drishti.app.main" />
<option name="DEPLOY" value="true" />
<option name="DEPLOY_APK_FROM_BUNDLE" value="false" />
<option name="DEPLOY_AS_INSTANT" value="false" />
<option name="ARTIFACT_NAME" value="" />
<option name="PM_INSTALL_OPTIONS" value="" />
<option name="ALL_USERS" value="false" />
<option name="ALWAYS_INSTALL_WITH_PM" value="false" />
<option name="CLEAR_APP_STORAGE" value="false" />
<option name="DYNAMIC_FEATURES_DISABLED_LIST" value="" />
<option name="ACTIVITY_EXTRA_FLAGS" value="" />
<option name="MODE" value="default_activity" />
<option name="CLEAR_LOGCAT" value="false" />
<option name="SHOW_LOGCAT_AUTOMATICALLY" value="false" />
<option name="INSPECTION_WITHOUT_ACTIVITY_RESTART" value="false" />
<option name="TARGET_SELECTION_MODE"
value="DEVICE_AND_SNAPSHOT_COMBO_BOX" />
<option name="SELECTED_CLOUD_MATRIX_CONFIGURATION_ID" value="-1" />
<option name="SELECTED_CLOUD_MATRIX_PROJECT_ID" value="" />
<option name="DEBUGGER_TYPE" value="Auto" />
<Auto>
<option name="USE_JAVA_AWARE_DEBUGGER" value="false" />
<option name="SHOW_STATIC_VARS" value="true" />
<option name="WORKING_DIR" value="" />
<option name="TARGET_LOGGING_CHANNELS" value="lldb process:gdb-remote
packets" />
<option name="SHOW_OPTIMIZED_WARNING" value="true" />
</Auto>
<Hybrid>
<option name="USE_JAVA_AWARE_DEBUGGER" value="false" />
<option name="SHOW_STATIC_VARS" value="true" />
<option name="WORKING_DIR" value="" />
<option name="TARGET_LOGGING_CHANNELS" value="lldb process:gdb-remote
packets" />
<option name="SHOW_OPTIMIZED_WARNING" value="true" />
</Hybrid>
<Java />
<Native>
<option name="USE_JAVA_AWARE_DEBUGGER" value="false" />
<option name="SHOW_STATIC_VARS" value="true" />
<option name="WORKING_DIR" value="" />
<option name="TARGET_LOGGING_CHANNELS" value="lldb process:gdb-remote
packets" />
<option name="SHOW_OPTIMIZED_WARNING" value="true" />
</Native>
<Profilers>
<option name="ADVANCED_PROFILING_ENABLED" value="false" />
<option name="STARTUP_PROFILING_ENABLED" value="false" />
<option name="STARTUP_CPU_PROFILING_ENABLED" value="false" />
<option name="STARTUP_CPU_PROFILING_CONFIGURATION_NAME"
value="Java/Kotlin Method Sample (legacy)" />
<option name="STARTUP_NATIVE_MEMORY_PROFILING_ENABLED" value="false"
/>
<option name="NATIVE_MEMORY_SAMPLE_RATE_BYTES" value="2048" />
</Profilers>
<option name="DEEP_LINK" value="" />
<option name="ACTIVITY_CLASS" value="" />
<option name="SEARCH_ACTIVITY_IN_GLOBAL_SCOPE" value="false" />
<option name="SKIP_ACTIVITY_VALIDATION" value="false" />
<method v="2">
<option name="Android.Gradle.BeforeRunTask" enabled="true" />
</method>
</configuration>
</component>
<component name="SpellCheckerSettings" RuntimeDictionaries="0" Folders="0"
CustomDictionaries="0" DefaultDictionary="application-level"
UseSingleDictionary="true" transferred="true" />
<component name="TaskManager">
<task active="true" id="Default" summary="Default task">
<changelist id="8b0fc1cc-7412-4c3d-9cc3-680e76b249a7" name="Changes"
comment="" />
<created>1678939398967</created>
<option name="number" value="Default" />
<option name="presentableId" value="Default" />
<updated>1678939398967</updated>
</task>
<servers />
</component>
</project>

Chapter-5

Source Code

5. Source Code
In computing, source code is any collection of code, with or without comments, written in a human-readable
programming language, usually as plain text. Source code is written so that programmers can read and
maintain it while specifying precisely what the computer should do. The main source files of the Drishti
application are given below.
AndroidManifest.xml:
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:tools="http://schemas.android.com/tools"
package="com.example.drishti">
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"/>
<uses-permission
android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
<uses-permission android:name="android.permission.CAMERA" />

<application
android:allowBackup="true"
android:dataExtractionRules="@xml/data_extraction_rules"
android:fullBackupContent="@xml/backup_rules"
android:icon="@mipmap/ic_launcher"
android:label="Drishti"
android:roundIcon="@mipmap/ic_launcher_round"
android:supportsRtl="true"
android:theme="@style/Theme.Drishti"
tools:targetApi="31">
<meta-data
android:name="com.google.mlkit.vision.DEPENDENCIES"
android:value="ocr" />
<activity
android:name=".MainActivity"
android:exported="true">
<intent-filter>
<action android:name="android.intent.action.MAIN" />

<category android:name="android.intent.category.LAUNCHER" />


</intent-filter>
</activity>
</application>

</manifest>

Build.gradle:
plugins {
id 'com.android.application'
}

android {
namespace 'com.example.drishti'
compileSdk 33

defaultConfig {
applicationId "com.example.drishti"
minSdk 24
targetSdk 33
versionCode 1
versionName "1.0"
testInstrumentationRunner "androidx.test.runner.AndroidJUnitRunner"
}

buildTypes {
release {
minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
}
}
compileOptions {
sourceCompatibility JavaVersion.VERSION_1_8
targetCompatibility JavaVersion.VERSION_1_8
}
}

dependencies {

implementation 'androidx.appcompat:appcompat:1.6.1'
implementation 'com.google.android.material:material:1.8.0'
implementation 'androidx.constraintlayout:constraintlayout:2.1.4'
    implementation 'com.google.android.gms:play-services-mlkit-text-recognition'
implementation files('libs\\itextpdf-5.4.0.jar')
testImplementation 'junit:junit:4.13.2'
androidTestImplementation 'androidx.test.ext:junit:1.1.5'
androidTestImplementation 'androidx.test.espresso:espresso-core:3.5.1'
}

MainActivity.java:
package com.example.drishti;

import static android.Manifest.permission.CAMERA;

import android.Manifest;
import android.content.Context;
import android.content.pm.PackageManager;
import android.graphics.Paint;
import android.graphics.pdf.PdfDocument;

import android.os.Bundle;
import android.os.Environment;
import android.os.Handler;
import android.os.Looper;
import android.speech.tts.TextToSpeech;
import android.util.SparseArray;
import android.view.SurfaceHolder;
import android.view.SurfaceView;
import android.view.View;
import android.widget.Button;
import android.widget.TextView;
import android.widget.Toast;
import android.location.Location;
import android.location.LocationListener;
import android.location.LocationManager;

import androidx.annotation.NonNull;
import androidx.appcompat.app.AppCompatActivity;
import androidx.cardview.widget.CardView;   // needed for the assistant helper below
import androidx.core.app.ActivityCompat;

import com.google.android.gms.vision.CameraSource;
import com.google.android.gms.vision.Detector;
import com.google.android.gms.vision.text.TextBlock;
import com.google.android.gms.vision.text.TextRecognizer;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Locale;

public class MainActivity extends AppCompatActivity {


private TextView textView;
private SurfaceView surfaceView;
private LocationManager locationManager;
private LocationListener locationListener;

private CameraSource cameraSource;


private TextRecognizer textRecognizer;
Button button;
Button button1;
Button button2;
Button button3;

private TextToSpeech textToSpeech;


private String stringResult = null;

@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
button2 = findViewById(R.id.btn2);
button3 = findViewById(R.id.btn3);
button = findViewById(R.id.btn);
button1 = findViewById(R.id.btn1);
button1.setVisibility(View.GONE);
        // Request the camera permission up front; the second argument is an
        // app-defined request code (0 here), not a permission result.
        ActivityCompat.requestPermissions(this, new String[]{CAMERA}, 0);
textToSpeech = new TextToSpeech(this, new
TextToSpeech.OnInitListener() {
@Override
public void onInit(int status) {
textToSpeech.setLanguage(Locale.US);

}
});
}
    @Override
    protected void onDestroy() {
        super.onDestroy();
        // The camera source is created lazily, so guard against null here.
        if (cameraSource != null) {
            cameraSource.release();
        }
    }

    // Sets up the Mobile Vision text recogniser and a camera source that
    // streams preview frames into it.
    private void textRecognizer() {
        textRecognizer = new TextRecognizer.Builder(getApplicationContext()).build();
        cameraSource = new CameraSource.Builder(getApplicationContext(), textRecognizer)
                .setRequestedPreviewSize(1280, 1024)
                .setAutoFocusEnabled(true)
                .build();

surfaceView = findViewById(R.id.surfaceView);
Context context = this;
surfaceView.getHolder().addCallback(new SurfaceHolder.Callback() {
@Override
public void surfaceCreated(SurfaceHolder holder) {

try {

if (ActivityCompat.checkSelfPermission(MainActivity.this,
Manifest.permission.CAMERA) != PackageManager.PERMISSION_GRANTED) {
return;
}
cameraSource.start(surfaceView.getHolder());

} catch (IOException e) {
e.printStackTrace();
}
}

            @Override
            public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
                // Nothing to do when the preview surface changes.
            }

            @Override
            public void surfaceDestroyed(SurfaceHolder holder) {
                cameraSource.stop();
            }
        });
}

    // Attaches a processor to the recogniser: each batch of detections is
    // concatenated into one string and posted back to the UI thread.
    private void capture() {
        textRecognizer.setProcessor(new Detector.Processor<TextBlock>() {
@Override
public void release() {
}
@Override
public void receiveDetections(@NonNull
Detector.Detections<TextBlock> detections) {

SparseArray<TextBlock> sparseArray =
detections.getDetectedItems();
StringBuilder stringBuilder = new StringBuilder();

                for (int i = 0; i < sparseArray.size(); ++i) {
                    TextBlock textBlock = sparseArray.valueAt(i);
                    // Append each recognised block of text on its own line.
                    if (textBlock != null && textBlock.getValue() != null) {
                        stringBuilder.append(textBlock.getValue());
                        stringBuilder.append("\n");
                    }
                }

final String stringText = stringBuilder.toString();

Handler handler = new Handler(Looper.getMainLooper());


handler.post(new Runnable() {
@Override
public void run() {
stringResult = stringText;
resultObtained();
}
});
}
});
}

    // Returns to the main layout and displays the recognised text.
    private void resultObtained() {
        setContentView(R.layout.activity_main);
        textView = findViewById(R.id.textView);
        button1 = findViewById(R.id.btn1);   // re-bind after the layout has been replaced
        button1.setVisibility(View.VISIBLE);
        textView.setText(stringResult);
    }

    // onClick handler that switches to the camera preview layout and starts
    // the recogniser; the Capture button then triggers text extraction.
    public void buttonStart(View view) {
        setContentView(R.layout.surface);
        Button capture = findViewById(R.id.capture);
        capture.setOnClickListener(v -> capture());
        textRecognizer();
    }

    // onClick handler that writes the recognised text to a simple PDF file on
    // external storage and confirms the action by voice and toast.
    public void createMyPDF(View view) {
        textToSpeech.speak("PDF is generated", TextToSpeech.QUEUE_FLUSH, null);
        Toast.makeText(getApplicationContext(), "PDF is generated",
                Toast.LENGTH_LONG).show();
PdfDocument myPdfDocument = new PdfDocument();
PdfDocument.PageInfo myPageInfo = new
PdfDocument.PageInfo.Builder(399, 660, 1).create();
PdfDocument.Page myPage = myPdfDocument.startPage(myPageInfo);

Paint myPaint = new Paint();


String myString = textView.getText().toString();
int x = 15, y = 40;

for (String line : myString.split("\n")) {


myPage.getCanvas().drawText(line, x, y, myPaint);
y += myPaint.descent() - myPaint.ascent();
}
myPdfDocument.finishPage(myPage);
String myFilePath =
Environment.getExternalStorageDirectory().getPath() + "/myPDFFile.pdf";

File myFile = new File(myFilePath);


int i = 0;
while (myFile.exists()) {
i++;
myFile = new File(Environment.getExternalStorageDirectory(),
"myPDFFile(" + i + ").pdf");
}
try {
myPdfDocument.writeTo(new FileOutputStream(myFile));
} catch (Exception e) {
e.printStackTrace();
textView.setText("ERROR");
}

myPdfDocument.close();
}

    // The block below originally duplicated onCreate() and referenced fields
    // (microid, mtts) and a startApp() routine belonging to the voice-assistant
    // screen of the project. It is kept here as a separate helper so that the
    // class remains syntactically valid; the missing fields are declared here.
    private CardView microid;      // "listen" card on the assistant layout
    private TextToSpeech mtts;     // text-to-speech engine for the assistant

    private void initAssistant() {
        locationManager = (LocationManager) this.getSystemService(LOCATION_SERVICE);

        microid = (CardView) findViewById(R.id.microid);

        mtts = new TextToSpeech(this, new TextToSpeech.OnInitListener() {
            @Override
            public void onInit(int status) {
                if (status == TextToSpeech.SUCCESS) {
                    // Reconstructed: check language support before enabling the UI.
                    int result = mtts.setLanguage(Locale.US);
                    if (result == TextToSpeech.LANG_MISSING_DATA
                            || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                        // Speech data missing: leave the listen card disabled.
                    } else {
                        microid.setEnabled(true);
                        // startApp();  // assistant start-up routine defined elsewhere in the project
                    }
                }
            }
        });
    }

    // onClick handler (named "listner" in the layout XML) that reads the
    // recognised text aloud.
    public void listner(View view) {
        textToSpeech.speak(stringResult, TextToSpeech.QUEUE_FLUSH, null, null);
    }

    @Override
    public void onPause() {
        // Stop any ongoing speech when the activity goes to the background.
        if (textToSpeech != null) {
            textToSpeech.stop();
        }
        super.onPause();
    }

    // onClick handler that stops the current speech output.
    public void stop(View view) {
        textToSpeech.stop();
    }
}

MODEL AND DATASET:


Chapter-6

Input/ Output Screens

Emulator

Fig.5.1
Fig.5.1.1

Connecting a physical device through USB: the application launched on the phone.


APP ICON:

Fig.5.2
App:

Fig.5.3

The assistant announces that there are three options. We grant the required Camera, Audio, and
Microphone permissions.
Fig.5.3.1
Fig.5.3.2
Reader: Input

Fig.5.4

Output

Fig.5.5
Text is read aloud by the assistant.

Location Tracker:

Fig.5.6

Object Detector:

Fig.5.7
Chapter-7

System Testing

System testing (ST) is a "black box" testing method performed to evaluate whether an
entire system meets specified requirements. In system testing, the functionality of a
system is tested on an end-to-end basis.
System testing is usually performed by a team independent of the development team
to impartially measure the quality of the system. This includes both functional and non-
functional tests.

Black box testing, which derives test cases from the GUI and the user's perspective, and white box
testing, which derives test cases from the internal code, are the two most commonly used approaches
to software testing:
• White box testing
• Black box testing
White box testing:
White box testing, also known as transparent, clear box, structural, or code-based testing,
involves an intimate examination of the internal workings of an application. Developers
typically carry out this testing method before the software is sent to a testing team, which
may perform other types of testing such as black-box testing. The main objective of white
box testing is to analyze the internal structures and workings of the application, focusing
primarily on the code, including its paths, conditions, loops, and branches.
This type of testing is fundamental at the early stages of software development as it
encompasses both unit testing, where individual units or components of the software are
tested, and integration testing, where it is checked how well the individual units work
together. By focusing on both the inputs and outputs, white box testing ensures the
functional correctness of the software, enhances its security, and helps in optimizing code
structure. Moreover, white box testing requires a significant level of programming skills
and a deep understanding of the codebase, making it crucial for validating algorithmic
effectiveness in the software.
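As an illustration of white box testing at the unit level, the following sketch exercises a small piece of logic with JUnit, which build.gradle already declares (testImplementation 'junit:junit:4.13.2'). The helper class TextJoiner and its joinBlocks method are hypothetical names introduced only for this example; they mirror the way capture() concatenates recognised text blocks before the result is spoken or written to a PDF.

package com.example.drishti;

import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.List;

import org.junit.Test;

// Hypothetical helper mirroring the text-concatenation logic used in capture().
class TextJoiner {
    // Joins recognised text blocks into one string, one block per line,
    // skipping null entries.
    static String joinBlocks(List<String> blocks) {
        StringBuilder builder = new StringBuilder();
        for (String block : blocks) {
            if (block != null) {
                builder.append(block).append("\n");
            }
        }
        return builder.toString();
    }
}

public class TextJoinerTest {

    @Test
    public void joinsBlocksLineByLineAndSkipsNulls() {
        // White box testing: we know the internal path that skips nulls,
        // so we deliberately include one.
        List<String> blocks = Arrays.asList("EXIT", null, "Platform 2");
        assertEquals("EXIT\nPlatform 2\n", TextJoiner.joinBlocks(blocks));
    }
}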
Black box testing:
Black box testing contrasts sharply with white box testing by focusing solely on the
software functionality without any regard to the internal workings of the application. It
does not require knowledge of the code or structure of the program and is thus accessible
to testers who may not have programming skills. This method uses external descriptions
of the software, including specifications, requirements, and design elements, to develop
test cases.
Each test case evaluates the system's responses to inputs and checks the output against
expected results. If the output matches the expected results, the test is passed; otherwise,
it is considered a failure. This method is particularly effective for validating business
processes and user requirements as it tests from the user's perspective, ensuring the
software meets the established criteria for functionality and user interface behavior.
Although black box testing may seem less comprehensive than white or grey box testing,
it is crucial for confirming that the software performs as intended from the end-user's
standpoint and typically requires less time than the more in-depth techniques.
A customer-stated requirement specification serves as the main input for black box testing. It is a
form of manual testing that examines the software's functionality without any knowledge of its code
or internal structure, so programming skills are not required. Each test case is created by considering
the input and expected output of a specific function. The test engineer compares the software against
the requirements, records any flaws or bugs, and returns it to the development team.
In this method, the tester chooses a function, provides an input value to exercise it, and checks
whether the function produces the desired result. The function passes the test if it gives the expected
result; otherwise, it fails.
Compared with the white box and grey box techniques, black box testing is less thorough, but of all
the testing stages it takes the least time. Its main purpose is to validate consumer or business
requirements.
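Black box testing of the app itself can be automated with the instrumentation libraries already listed in build.gradle (androidx.test.ext:junit and espresso-core). The sketch below is a minimal example under that assumption: it drives the UI through the same views a user touches, relies only on the btn view id that MainActivity binds in onCreate(), and simply asserts that the main button is visible when the app launches.

package com.example.drishti;

import static androidx.test.espresso.Espresso.onView;
import static androidx.test.espresso.assertion.ViewAssertions.matches;
import static androidx.test.espresso.matcher.ViewMatchers.isDisplayed;
import static androidx.test.espresso.matcher.ViewMatchers.withId;

import androidx.test.ext.junit.rules.ActivityScenarioRule;
import androidx.test.ext.junit.runners.AndroidJUnit4;

import org.junit.Rule;
import org.junit.Test;
import org.junit.runner.RunWith;

// Black box style instrumented test: it exercises the app from the user's
// perspective, without looking at the code behind the screen.
@RunWith(AndroidJUnit4.class)
public class MainScreenTest {

    @Rule
    public ActivityScenarioRule<MainActivity> activityRule =
            new ActivityScenarioRule<>(MainActivity.class);

    @Test
    public void captureButtonIsVisibleOnLaunch() {
        // From the user's perspective, the "start reading" button must be
        // visible as soon as the app opens.
        onView(withId(R.id.btn)).check(matches(isDisplayed()));
    }
}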
The following are the processes for testing a web app; in web-based testing, several areas
must be checked for potential errors and bugs.
• App Functionality: In web-based testing, we must verify that the application's specified
functionality, features, and operational behaviour match its specifications. For instance, we test all
mandatory fields, ensure that every mandatory field is marked with an asterisk, ensure that optional
fields do not trigger error messages, and check external, internal, anchor, and mailto links, removing
any broken links that are found. Functional testing lets us validate the app against its functional
requirements and specifications.

• Usability: Developers encounter problems with scalability and interactivity while testing
usability. Because different user populations will access the application through different hardware
configurations and browsers, the developers must form a team to test it across those combinations.
When a person browses an online store, for instance, several questions may cross his or her mind,
such as whether the website is legitimate and whether shipping costs apply.

• Browser Compatibility: We test the web application to see whether its content is displayed
correctly in all browsers, in order to determine whether the website behaves consistently across
them.

• Security: Every website that is accessible over the internet must consider security. Security
testers check, for example, that unauthorised access to secure pages is blocked and that files
restricted to particular users cannot be downloaded without the appropriate access rights.
• Load Issues: We carry out this testing to see how the system responds to a particular load and to
time some crucial transactions. The load on the database, the application server, and so on is also
monitored.

• Storage and Database: Testing the storage or database is a crucial component of testing any web
application, and we must ensure that the database is exercised thoroughly. We check for problems
while running database queries, measure each query's response time, and verify that data obtained
from the database is displayed correctly on the website.
Chapter-8

Conclusion

8.1 Limitations
• Delayed updates due to differences in time zones.
• Linguistic and cultural barriers may make briefings insufficient and reduce output.
• Individuals who are not tech-savvy or are unfamiliar with smartphones may face obstacles.

8.2 Future Scope
The future scope of an Android application depends on various factors such as the type of app,
its functionality, target audience, market trends, and technological advancements.
Possible future directions for our app include integration with emerging technologies such as
computer vision, AI, and AR/VR to provide more help and support to blind people. We also plan to
improve security and enhance user engagement through the following features:
1. Advanced location tracking: The app can use GPS and other advanced technologies to
provide real-time location tracking, with voice-guided directions and alerts when users
are approaching a destination (a minimal sketch of this idea is given after this list).
2. Integration with smart home devices: The app can be integrated with various smart home
devices such as Alexa or Google Home, allowing visually impaired users to control their
home appliances through voice commands.
3. Integration with wearable devices: The app can be integrated with wearable devices such
as smartwatches or fitness trackers, allowing users to receive notifications, track their
activity, and navigate using vibrations and voice commands.
4. Gesture recognition: The app can incorporate gesture recognition technology, allowing
visually impaired individuals to control their device using simple hand gestures.
5. Improved Language Support: Finally, the app could be expanded to support additional
languages, allowing visually impaired individuals from all around the world to benefit
from its features. This could involve partnering with organizations or experts in different
countries to ensure that the app is tailored to the specific needs of each region.
6. Integration with AI Assistants: Another future scope for the app is to integrate with
popular AI assistants like Google Assistant or Amazon Alexa. This could allow visually
impaired users to perform tasks like setting reminders, making phone calls, or sending
text messages using voice commands.
7. Enhanced Navigation and Route Planning: Another potential future scope for the app is
to expand its navigation and route planning features. For example, the app could use
machine learning algorithms to analyze real-time traffic data and suggest the fastest or
most efficient route for users to take. The app could also incorporate features like voice-
guided navigation or 3D mapping to provide more detailed information about the user's
surroundings.
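As a pointer for item 1 above, the following is a minimal sketch of voice-guided proximity alerts, assuming the LocationManager and TextToSpeech classes already used in MainActivity. The class name DestinationAlerter, the update interval, and the 50-metre threshold are illustrative assumptions rather than part of the current implementation; a production version would also need to check the ACCESS_FINE_LOCATION runtime permission and let the user choose the destination.

package com.example.drishti;

import android.annotation.SuppressLint;
import android.location.Location;
import android.location.LocationListener;
import android.location.LocationManager;
import android.speech.tts.TextToSpeech;

import androidx.annotation.NonNull;

// Hypothetical helper sketching voice-guided proximity alerts.
public class DestinationAlerter implements LocationListener {

    private final LocationManager locationManager;
    private final TextToSpeech textToSpeech;
    private final Location destination;   // destination chosen by the user

    public DestinationAlerter(LocationManager locationManager,
                              TextToSpeech textToSpeech,
                              Location destination) {
        this.locationManager = locationManager;
        this.textToSpeech = textToSpeech;
        this.destination = destination;
    }

    // Caller must already hold the ACCESS_FINE_LOCATION permission.
    @SuppressLint("MissingPermission")
    public void start() {
        // Ask for a GPS fix roughly every 5 seconds or every 10 metres.
        locationManager.requestLocationUpdates(
                LocationManager.GPS_PROVIDER, 5000, 10, this);
    }

    public void stop() {
        locationManager.removeUpdates(this);
    }

    @Override
    public void onLocationChanged(@NonNull Location current) {
        float metres = current.distanceTo(destination);
        if (metres < 50) {
            // Announce when the user gets close to the destination.
            textToSpeech.speak("You are about " + Math.round(metres)
                            + " metres from your destination",
                    TextToSpeech.QUEUE_FLUSH, null, "dest-alert");
        }
    }
}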
