11 IV April 2023

https://doi.org/10.22214/ijraset.2023.50278
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 11 Issue IV Apr 2023- Available at www.ijraset.com

Voice Assistant Notepad


Shyam Nayan Tirupathi¹, Bhavani M², Samba Siva Naidu Etamsetti³, Shreya Anumukonda⁴
GITAM Visakhapatnam

Abstract: The notepad is built on real-time voice-to-text conversion technology that translates spoken words
into text exactly as the user pronounces them. We developed a real-time speech recognition system and tested it in ordinary
surroundings. The system is made up of two parts: the first processes the acoustic signal acquired by a microphone, and
the second interprets the processed signal and translates it into words. We aimed for a voice recognition system that is reliable,
inexpensive and efficient. The system lets users take notes faster, which helps increase
productivity while maintaining a good work-life balance. It helps people of all age groups: children who find writing
difficult can take down notes, adults can note down paragraphs or important points more easily, and elderly people who find
typing hard can make use of the technology. The software was created using an object-oriented analysis and design
methodology, and it accomplishes speech recognition by detecting and capturing audio through the microphone on the
device. Along with additional advantages, the proposed system reduced note-making time by 50% or more, depending
on the user's speed. We undertook this project to address the present issues with note taking.

I. INTRODUCTION
People document important points from day-to-day activities or observations, which makes note taking an essential part of any data
documentation. Most people use technology to take notes more efficiently, and how convenient the software is to use
determines how simple it is to take notes.
The earliest voice recognition systems concentrated on numbers rather than words. Bell Laboratories created the "Audrey" system in
1952, which could recognise a single voice speaking digits aloud. About a decade later, IBM introduced the "Shoebox", which
recognised and responded to 16 English words. These systems led to speech-to-text technology, which identifies the
words spoken. It employs speech processing techniques to extract speech from raw microphone input.
To convert the raw input audio into words, such a system typically includes a microphone, a processor and an application that can
perform advanced speech recognition. Using the above technique, the words are extracted and used to collect information
about their features, and a monitor shows the processed output.
Effective system implementations have made note taking quicker and simpler. Users can now
record information at a much faster rate, which supports a productive workflow.

II. LITERATURE REVIEW


A. Speech Recognition
Speech recognition is the procedure by which spoken words are recorded and recognised. It employs speech processing
algorithms to turn the raw audio into processed data.
The system typically consists of two parts: a microphone for recording raw audio, and software that uses speech
recognition to convert the raw audio input into readable text. The following are the
essential components of this process:

B. Audio Capture
Audio capture is the first stage. A microphone records the user's speech.
The audio is then pre-processed. The key steps are creating a baseline version of the audio, eliminating noise and enhancing
the key features. Audio filtering is typically used for this pre-processing.
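
As an illustration of the filtering idea, the sketch below applies two common pre-processing steps to an array of audio samples: a pre-emphasis filter, which boosts the higher frequencies that carry much of the detail in speech, and peak normalisation. The paper does not say which filters the authors used, so the function names `preEmphasis` and `normalizePeak` and the coefficient 0.97 are assumptions chosen for the example.

```javascript
// Pre-emphasis filter: y[n] = x[n] - alpha * x[n-1].
// alpha = 0.97 is a conventional choice, assumed here (not taken from the paper).
function preEmphasis(samples, alpha = 0.97) {
  const out = new Float32Array(samples.length);
  if (samples.length === 0) return out;
  out[0] = samples[0]; // the first sample has no predecessor
  for (let n = 1; n < samples.length; n++) {
    out[n] = samples[n] - alpha * samples[n - 1];
  }
  return out;
}

// Peak normalisation: scale the signal so its largest absolute sample is 1.
function normalizePeak(samples) {
  let peak = 0;
  for (const s of samples) peak = Math.max(peak, Math.abs(s));
  if (peak === 0) return Float32Array.from(samples); // silent input stays silent
  return Float32Array.from(samples, (s) => s / peak);
}
```

In a browser, the sample array could come from the Web Audio API; here the functions are kept free of browser dependencies so they can be tried on any numeric array.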

©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 1037
C. Feature Extraction
The feature extraction step, which comes next, performs a number of tasks, including scaling the audio to a workable range
and turning the speech into a set of objects.

D. Feature Segmentation
Feature segmentation is a technique that separates phrases or words into individual units. The goal of feature
segmentation is to break the audio for a string of letters down into smaller objects, one for each constituent symbol.[6]
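
One simple way to picture segmentation is to cut the audio wherever it goes quiet. The sketch below splits a sample array into fixed-size frames, measures the root-mean-square (RMS) energy of each frame, and reports every run of loud frames as one segment. This energy-threshold approach is a stand-in chosen for illustration and is not the method described in the paper; the name `segmentByEnergy`, the frame size and the threshold are all assumptions.

```javascript
// Returns [{ start, end }] sample ranges whose frames have RMS energy >= threshold.
// frameSize = 400 corresponds to 25 ms at a 16 kHz sampling rate (an assumed setup).
function segmentByEnergy(samples, frameSize = 400, threshold = 0.1) {
  const segments = [];
  let start = -1; // sample index where the current segment began, -1 if none is open

  for (let i = 0; i < samples.length; i += frameSize) {
    const end = Math.min(i + frameSize, samples.length);

    // RMS energy of the current frame
    let sumSquares = 0;
    for (let j = i; j < end; j++) sumSquares += samples[j] * samples[j];
    const rms = Math.sqrt(sumSquares / (end - i));

    if (rms >= threshold && start < 0) {
      start = i; // a loud frame opens a new segment
    } else if (rms < threshold && start >= 0) {
      segments.push({ start, end: i }); // a quiet frame closes the open segment
      start = -1;
    }
  }

  // Close a segment that runs to the end of the audio
  if (start >= 0) segments.push({ start, end: samples.length });
  return segments;
}
```

Each returned range can then be passed on to the classification stage as a separate unit.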

E. Feature Classification
Feature classification is the process of extracting letters from a given audio sample, identifying them, and converting them
into readable text in a standard computer representation or another machine-editable format. In other words, it assigns
each extracted letter to one of an established set of letter classes.

III. PROBLEM IDENTIFICATION AND OBJECTIVES


A. The Problem
Documentation and note making in day-to-day life pose several problems:
1) Noting down a large amount of data in a short amount of time can reduce a person's efficiency.
2) Users may become tired or record inaccurate data.
3) Data recorded as a hard copy takes more time to note down.

B. The Proposed Solution


1) Users can use a voice-assisted notepad, which records and processes speech, identifies the text and lets them take notes
faster.
2) The software leads to a productive workflow and allows more consistency in note making.
3) The data lasts longer than any hard copy, as it can be downloaded and stored as files on the system itself.

IV. USAGE REPRESENTATION

The flowchart shows how the speech recognition function used in this project works. Using this
as a base, the front end of the system was later developed both as a web application and as a mobile application. The audio is first
acquired from the device's microphone, then converted step by step into editable text and displayed to the
user.

V. OVERVIEW OF TECHNOLOGIES
A. Hardware Technologies
1) Microphone: Microphones are used in the audio recording phase of the process, mainly to capture
the user's speech.
2) System: The physical processing system serves as the main processing unit of the application and applies the different
filtering algorithms. In this project, both a computer and a smartphone are used to run the application.

B. Software Technologies
1) In Computer
a) Speech Recognition Software: This program's speech recognition features enable it to extract the required audio sample from
raw audio input.
b) HTML: HTML is an abbreviation for Hyper Text Markup Language. HTML is the industry standard markup language for
developing Web pages. The structure of a Web page is described in HTML. HTML is made up of a number of elements. HTML
elements instruct the browser on how to render the material.
c) CSS: CSS is an abbreviation for Cascading Style Sheets.
CSS specifies how HTML elements should appear on screen, on paper or in other media.
CSS saves a significant amount of time, because it can control the layout of multiple web pages at once.
External stylesheets are stored in CSS files.

d) JS: JavaScript is the programming language of the Web.
JavaScript can update and modify both HTML and CSS.
JavaScript can calculate, manipulate and validate data.

2) In Smartphone App Development


a) Java: Java lets us create classes, functions and the UI/UX based on file templates. It is used, with the help of the Java
Development Kit, to write and maintain the code of the Android application.
b) XML: XML (Extensible Markup Language) is used to describe data, whereas HTML is used to display
it. It is very adaptable and is used for a variety of things, such as designing the interface of the Android app.
c) Gradle: Android Studio includes Gradle, a modern build toolkit that handles the build and
execution of the Android application according to the required settings.
d) Android Studio: This is the most important part of developing any Android application, as it is the Integrated Development
Environment (IDE) used for the development process, essentially a code editor for Android.

VI. IMPLEMENTATION
A. In Computer Application
The user first opens the application and selects the required language.

The user then grants the application permission to use the microphone and clicks the start recording button to take notes through speech.

If the text is not what was wanted, the user clicks clear. To keep it, the user clicks the download button and saves the note as a
text file in the desired location.
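
The select-language, record, clear and download steps above could be wired together roughly as follows. This is a minimal sketch that assumes the browser's Web Speech API (`SpeechRecognition`, exposed as `webkitSpeechRecognition` in some browsers); the paper does not list its source code, so the helper names `createNotepad` and `downloadNote` and the default file name are hypothetical.

```javascript
// Wraps a SpeechRecognition-style constructor in a small notepad object.
// The constructor is passed in so the logic does not depend on one browser.
function createNotepad(RecognitionCtor, lang = "en-US") {
  const recognition = new RecognitionCtor();
  recognition.lang = lang;            // language chosen by the user
  recognition.continuous = true;      // keep listening across pauses
  recognition.interimResults = false; // deliver only finalised phrases

  let text = "";
  recognition.onresult = (event) => {
    // Append every newly finalised phrase to the note
    for (let i = event.resultIndex; i < event.results.length; i++) {
      if (event.results[i].isFinal) {
        text += event.results[i][0].transcript.trim() + " ";
      }
    }
  };

  return {
    recognition,
    start: () => recognition.start(), // "start recording" button
    stop: () => recognition.stop(),
    clear: () => { text = ""; },      // "clear" button
    getText: () => text.trim(),
  };
}

// Browser-only helper: offers the note to the user as a downloadable text file.
function downloadNote(text, filename = "note.txt") {
  const url = URL.createObjectURL(new Blob([text], { type: "text/plain" }));
  const link = document.createElement("a");
  link.href = url;
  link.download = filename;
  link.click();
  URL.revokeObjectURL(url);
}
```

In a browser, the notepad would be created with `createNotepad(window.SpeechRecognition || window.webkitSpeechRecognition, "en-IN")`, and the browser asks for microphone permission the first time `start()` is called.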

B. In Smartphone
In the Android application, the user first taps the mic icon and speaks. The app listens and then displays the text for the user to
edit freely.

VII. RESULTS
We examined the note-taking procedure to understand the issue better and measured how long it takes to write notes
while collecting textual data.
Having identified the problems, we addressed them by greatly decreasing the data collection time, as
shown in the results table below.
The time was shortened by a wide margin, and the saving grows with longer input. The accuracy is above 90%.

| S no. | Input text | Original method time | Software testing time |
|---|---|---|---|
| 1 | hello how are you | 6 s | 3 s |
| 2 | The leather jacket showed the scars of being his favorite for years. It wore those scars with pride, feeling that they enhanced his presence rather than diminishing it. The scars gave it character and had not overwhelmed to the point that it had become ratty. The jacket was in its prime and it knew it. | 1 min 20 s | 24 s |
| 3 | He scolded himself for being so tentative. He knew he shouldn't be so cautious | 22 s | 7 s |
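
The time saving in each row of the results table can be checked with a short calculation. The sketch below uses the three pairs of durations from the table, with 1 min 20 s written as 80 s; the variable and function names are chosen for the example.

```javascript
// Durations in seconds, taken from the three rows of the results table
const trials = [
  { manualSeconds: 6, voiceSeconds: 3 },
  { manualSeconds: 80, voiceSeconds: 24 }, // 1 min 20 s = 80 s
  { manualSeconds: 22, voiceSeconds: 7 },
];

// Percentage of time saved relative to the original (manual) method
function percentReduction(manualSeconds, voiceSeconds) {
  return ((manualSeconds - voiceSeconds) / manualSeconds) * 100;
}

const reductions = trials.map((t) =>
  Math.round(percentReduction(t.manualSeconds, t.voiceSeconds))
);

console.log(reductions); // [ 50, 70, 68 ]
```

The shortest input saves 50% of the time and the two longer inputs save about 70%, which is consistent with the saving growing as the input gets longer.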

VIII. CONCLUSION
This study was motivated by the issues and difficulties related to data entry in notepads. Its major objective was to
create an Android application with an automatic voice assistant in order to manage notes.
We proposed a complete method for addressing the difficulties faced during this textual data collection procedure.

The completed system achieved the following benefits:
1) Data collection has been made virtual, which lessens the upkeep of physical notebooks or records.
2) The note-taking procedure is faster, which shortens the time required.
3) Note data is documented thoroughly.
4) The system offers a simple method for information backup and exchange.
5) Information is shared with the user in real time.
6) The recorded data is easier to examine.

REFERENCES
[1] Nikhil Jain, Manya Goyal, Agravi Gupta, Vivek Kumar, Speech to text conversion for using sentiment analysis (v-3, June 2021)
[2] Android Studio software development kit, Tutorialspoint
[3] Pranab Das, Voice Recognition System, ResearchGate (Nov 2015)
[4] JavaScript Languages: Speech recognition, GeeksforGeeks.com
[5] Dr. Arbana Kadriu, Automatic Speech Recognition Survey (2020)
[6] HTML, CSS, JS basics, W3Schools
