
Voice Recognition Using Python

Uploaded by

MA SHAIK SHOYEB



1. ABSTRACT
2. INTRODUCTION
2.1 Python (Programming Language)
2.2 Introduction of Voice Recognition Using Python

3. THEORY AND METHODOLOGY

3.1 Primary Goals
3.2 The Python Programming Language
3.3 Uses of Python
3.4 Existing System
3.5 Proposed System
3.6 Methodology
3.7 Advantages of Voice Recognition System
3.8 Disadvantages and the Challenges Faced by Voice Recognition System
3.9 Software System
3.10 Hardware System
3.11 Block Diagram of Voice Recognition System

4. DESCRIPTION OF ORGANIZATION
5. DESCRIPTION OF MY ACTIVITIES
5.1 System Testing
5.2 User Acceptance Testing
6. RESULTS, CONCLUSIONS AND DISCUSSION
6.1 Code: Input Code
6.2 Output Code
6.3 Useful applications of voice recognition system
6.4 Conclusion
6.5 Future Scope
7. RECOMMENDATIONS
8. SELF-ASSESSMENT
9. APPENDIX

1. ABSTRACT

Voice recognition is the process of converting spoken words to text. It is also
known as speech-to-text or speech recognition. Voice recognition saves time
compared with typing, and the speed at which speech is converted to text is
impressive. In this project we use tkinter and pytesseract. Speech
recognition is a machine's ability to listen to spoken words and identify them. You
can then use speech recognition in Python to convert the spoken words into text,
make a query or give a reply. You can even program some devices to respond to
these spoken words. Speech recognition in Python is done with programs that
take input from the microphone, process it, and convert it into a suitable form.

Speech recognition seems highly futuristic, but it is present all around you.
Automated phone systems let you speak the query you wish to be assisted with,
and virtual assistants such as Siri and Alexa also use speech recognition to
talk to you.

2. INTRODUCTION

2.1 Python (programming language)

History
Python is a high-level, general-purpose and very popular programming
language. Python (currently Python 3) is used in web development, machine
learning applications, and much of the cutting-edge technology in the software
industry. It is well suited to beginners and also to experienced programmers
coming from other programming languages such as C++ and Java.
Python is a widely used, interpreted, object-oriented, high-level programming
language with dynamic semantics, used for general-purpose programming. It was
created by Guido van Rossum and first released on February 20, 1991. While you
may know the python as a large snake, the name of the Python programming
language comes from an old BBC television comedy sketch series called Monty
Python’s Flying Circus.
One of the amazing features of Python is the fact that it is actually one person’s
work. Usually, new programming languages are developed and published by large
companies employing lots of professionals, and due to copyright rules, it is very
hard to name any of the people involved in the project. Python is an exception.
Of course, van Rossum did not develop and evolve all the Python components
himself. The speed with which Python has spread around the world is a result of
the continuous work of thousands of (very often anonymous) programmers, testers,
users (many of whom aren’t IT specialists) and enthusiasts, but it must be said that
the very first idea (the seed from which Python sprouted) came to one head –
Guido’s.

2.2 Introduction of Voice Recognition Using Python

• I selected this project because voice recognition is one of the
fastest-growing technologies.
• Nearly 20% of the world's population lives with a disability such as
blindness or the loss of hands; voice recognition helps these people because
it converts speech to text.
• This project deals with “Voice Recognition”.
• Voice recognition is technically the process of converting an acoustic
signal captured by a microphone to a set of words.
• Voice recognition is an important feature in applications such as home
automation, artificial intelligence and marketing.
• In this project we show how to use the speech recognition and pyttsx3
libraries of Python.

3. THEORY AND METHODOLOGY

3.1 Primary goals

In 1999, Guido van Rossum defined his goals for Python:

• an easy and intuitive language just as powerful as those of the major competitors;
• open source, so anyone can contribute to its development;
• code that is as understandable as plain English;
• suitable for everyday tasks, allowing for short development times.

About 20 years later, it is clear that all these intentions have been fulfilled. Some sources say
that Python is the third-most popular programming language in the world, while others claim
it’s the fifth.

Either way, it still occupies a high rank in the top ten of the TIOBE Programming Community
Index and the PYPL PopularitY of Programming Language Index.

Python isn’t a young language. It is mature and trustworthy. It’s not a one-hit wonder. It’s a
bright star in the programming firmament, and time spent learning Python is a very good
investment.

3.2 The Python Programming Language:

Python is a high-level, dynamically typed language and one of the most
popular general-purpose programming languages. It is among the world’s
fastest growing languages and is used by software engineers,
mathematicians, data analysts, scientists, network engineers, students, and
accountants.
Python is an interpreted, object-oriented, high-level programming language.
It is called an interpreted language because its source code is first compiled
to bytecode, which is then executed by the interpreter.
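
The compile-then-interpret behaviour described above can be observed with the standard library. The following is a minimal sketch, assuming the CPython implementation; the function name add and the label "<example>" are made up for illustration, and the exact instructions printed by dis differ between Python versions.

```python
import dis


def add(a, b):
    """A tiny function whose bytecode we can inspect."""
    return a + b


# compile() turns source text into a code object that holds bytecode.
code = compile("x = 1 + 2", "<example>", "exec")
print(type(code.co_code))  # the raw bytecode is stored as a bytes object

# exec() hands the code object to the interpreter, which runs it.
namespace = {}
exec(code, namespace)
print(namespace["x"])

# dis.dis() prints a human-readable listing of a function's bytecode.
dis.dis(add)
```

The listing printed by dis.dis is what the interpreter actually executes; the source text itself is never run directly.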

3.3 Uses of Python:
The uses of Python are varied and quite impactful. Here is a list of fields where
Python is commonly used:
Web Development:
As a web developer, you can choose from a wide range of web frameworks when
using Python as a server-side programming language. Django and Flask are both
popular among Python programmers. Django is a full-stack web framework for
developing large, complex web applications, whereas Flask is a lightweight and
extensible web framework for building simple web applications. Flask is easy
to learn and more Pythonic, which makes it a good start for beginners.
Large services such as YouTube, Spotify, Mozilla, Dropbox and Instagram are
reported to use Django, whereas Airbnb, Netflix, Uber and Samsung are reported
to use Flask.
Machine Learning:
Python is a very accessible language, and many great libraries built on top of
it make your work easier. The large number of existing Python libraries lets
you focus on more exciting things than reinventing the wheel. Python is also an
excellent wrapper language for working with more efficient C/C++
implementations of algorithms and with CUDA/cuDNN, which is why existing
machine learning and deep learning libraries run efficiently in Python. This is
very important for work in the fields of machine learning and AI.

[Figure: Architecture of Python]

3.4 Existing System

• The very first prototype of voice recognition was in fact a toy named Radio Rex,
which appeared in the 1920s.

• In 1962, the Shoebox machine was able to recognize isolated words and to perform a few
arithmetic operations as well.

• In the 1970s, Harpy was able to recognize connected speech from a vocabulary of about
1000 words.

• Since around 2006, deep neural networks have come into use; most current voice
recognition models work on deep neural networks.

• PyQt5 is a Python interface for Qt, developed by Riverbank Computing. Qt is one of the
most powerful and popular cross-platform GUI libraries.

• gTTS is a very easy to use tool which converts entered text into audio that can be
saved as an mp3 file.

Drawbacks of the existing system:

• Less Accuracy

• More Time consumption
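
Accuracy of a speech recognizer is commonly reported as word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the recognized text into the reference text, divided by the number of reference words. The sketch below is illustrative only; the function name word_error_rate and the sample sentences are assumptions and are not part of the project.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Hypothetical example: one word out of four was recognized wrongly.
print(word_error_rate("turn on the light", "turn on the night"))  # 0.25
```

A lower value means higher accuracy; 0.0 means the recognized text matches the reference word for word.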

3.5 Proposed System

• Tkinter

• pytesseract

Tkinter:

Tkinter is the standard GUI library for Python. Combined with Tkinter, Python
provides a fast and easy way to create GUI applications. Tkinter provides a
powerful object-oriented interface to the Tk GUI toolkit. GUI stands for Graphical User
Interface.

3.6 Methodology

Voice recognition is technically the process of converting an acoustic signal captured by a
microphone to a set of words.
Voice recognition involves four steps:
1. First, the machine captures the words, phrases and sentences we speak.
2. Second, the captured audio is processed so that the machine can detect what was said.
3. Third, the program relies on the toolkits tkinter and pytesseract, the GUI and OCR libraries
of Python, to produce the output.
4. Finally, the output appears as text.
Voice recognition in Python works with algorithms that perform linguistic and acoustic
modelling.
Example: Siri is Apple's voice assistant, which recognises speech and translates it to text.
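
The acoustic signal in the first step reaches the program as a sequence of numeric samples. As an illustration only, the sketch below uses a synthetic 440 Hz tone in place of a microphone recording, stores it as 16-bit mono PCM with the standard wave module, and reads it back. The sample rate, the tone and the function names are assumptions chosen for the example.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16000  # samples per second (assumed; a common rate for speech)


def synth_tone(freq_hz, seconds, amplitude=0.5):
    """Return 16-bit PCM samples of a sine tone (a stand-in for recorded speech)."""
    count = int(SAMPLE_RATE * seconds)
    return [
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
        for n in range(count)
    ]


def write_wav(samples):
    """Pack samples as mono 16-bit PCM and return the WAV file as bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buffer.getvalue()


def read_wav(data):
    """Return (sample_rate, samples) from WAV bytes produced by write_wav."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.readframes(wav.getnframes())
        count = len(frames) // 2
        return wav.getframerate(), list(struct.unpack("<%dh" % count, frames))


tone = synth_tone(440, 0.5)
rate, samples = read_wav(write_wav(tone))
# Prints the sample rate, the number of samples and the duration in seconds.
print(rate, len(samples), len(samples) / rate)
```

A recognizer works on exactly this kind of data: the later steps turn the list of samples into features and then into words.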

3.7 Advantages of Voice Recognition System

• Voice recognition captures speech much faster than we can type.

• Voice recognition is a great example of machine learning in real life.

• It helps those who have problems with sight and also those who do not
know how to type.

• A voice recognition system helps to overcome the barrier of illiteracy as well. It
serves both the literate and the illiterate, as it is focused on spoken words.

• Many industries now use voice recognition to help with everyday processes.

3.8 Disadvantages and the Challenges Faced by Voice Recognition System

• 1. It is difficult to build a voice recognition system because there are so many sources of
variability. For example, different people have different styles of speaking.

• 2. Pronunciation also makes it difficult for a voice recognition system to transcribe speech
completely.

• 3. Environment. For example, an isolated room and an auditorium have different
background noise.

• 4. Echo also contributes a lot of noise.

• 5. Language constraints and task specifiers.
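
The effect of background noise in the environment is often quantified as a signal-to-noise ratio, SNR_dB = 10 * log10(P_signal / P_noise), where P is the mean squared amplitude. The sketch below uses synthetic data only; the tone standing in for speech, the two noise levels and the function names are assumptions for illustration.

```python
import math
import random


def mean_power(samples):
    """Mean squared amplitude of a list of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    return 10 * math.log10(mean_power(signal) / mean_power(noise))


random.seed(0)  # fixed seed so the example is repeatable
# Synthetic "speech": a 200 Hz tone sampled at 8 kHz for one second.
speech = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(8000)]
quiet_room = [random.gauss(0, 0.01) for _ in range(8000)]  # low background noise
auditorium = [random.gauss(0, 0.2) for _ in range(8000)]   # louder background noise

print(round(snr_db(speech, quiet_room), 1), "dB in the quiet room")
print(round(snr_db(speech, auditorium), 1), "dB in the auditorium")
```

The same voice yields a much lower ratio in the noisier setting, which is one reason recognition accuracy drops there.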

3.9 Software System

• Operating system : Windows 10

• Technology : Python 3.6.8

• IDE : Python IDE (3.6.8), Visual Studio Code

• Libraries : SpeechRecognition, pyttsx3, PyAudio

3.10 Hardware System

• Processor : Dual core

• RAM : 4GB

• Hard disk : 100GB

• Input devices : microphone, headset, sound card

3.11 Block Diagram of Voice Recognition System

4. DESCRIPTION OF ORGANIZATION

Pantech Solutions Private Limited is a private incorporated company. It is classified as a
non-government company and is registered at the Registrar of Companies, Hyderabad. The
company's status is Active. It is a company limited by shares, with an authorized capital of
Rs 1.00 lakh and a paid-up capital of Rs 0.50 lakh as per MCA. Three directors are associated
with the organization; Srinivasan N. is presently a director. Pantech Solutions is a community
dedicated to supporting enthusiastic and eager-to-learn students. It is a place to share your
thoughts and ideas and to seek knowledge. We regularly conduct seminars and events where you
can meet professionals from a plethora of backgrounds. Meeting professionals not only increases
your chances of getting hired, but also helps you in scoring internships and gaining domain
knowledge practically. We believe no time is better spent than that spent in the service of
your fellow man.

5. DESCRIPTION OF ACTIVITIES
Verification of machinery and equipment usually consists of design qualification (DQ),
installation qualification (IQ), operational qualification (OQ), and performance qualification
(PQ). DQ is usually the vendor's job. However, DQ can also be performed by the user, by
confirming through review and testing that the equipment meets the written acquisition
specification. If the relevant documents or manuals for the machinery/equipment are provided
by the vendor, the latter three qualifications (IQ, OQ and PQ) need to be thoroughly performed
by the users who work in an industrial regulatory environment. Otherwise, the process of IQ,
OQ and PQ is the task of validation. A typical example of such a case is the loss or absence
of the vendor's documentation for legacy equipment or do-it-yourself (DIY) assemblies (e.g.,
cars, computers etc.); users should therefore endeavour to acquire the DQ documents beforehand.
Templates for DQ, IQ, OQ and PQ can usually be found on the internet, whereas DIY qualification
of machinery/equipment can be assisted either by the vendor's training course materials and
tutorials or by published guidance books, such as step-by-step series, if the acquisition of
the machinery/equipment is not bundled with on-site qualification services. This kind of DIY
approach is also applicable to the qualification of software, computer operating systems and
manufacturing processes. The most important and critical task, as the last step of the
activity, is generating and archiving machinery/equipment qualification reports for auditing
purposes, if regulatory compliance is mandatory.

Qualification of machinery/equipment is venue dependent, in particular for items that are
shock sensitive and require balancing or calibration, and re-qualification needs to be conducted
once the objects are relocated. The full scale of some equipment qualifications is even time
dependent, as consumables are used up (e.g. filters) or springs stretch out, requiring
recalibration, and hence re-certification is necessary when a specified due time lapses.
Re-qualification of machinery/equipment should also be conducted when parts are replaced, when
the equipment is coupled with another device, or when new application software is installed or
the computer is restructured in a way that affects the pre-settings, such as the BIOS, registry,
disk drive partition table, dynamically linked (shared) libraries, or an ini file. In such a
situation, the specifications of the parts/devices/software and the restructuring proposals
should be appended to the qualification document, whether the parts/devices/software are
genuine or not.

Torres and Hyman have discussed the suitability of non-genuine parts for clinical use
and provided guidelines for equipment users to select appropriate substitutes which are capable
of avoiding adverse effects. When genuine parts/devices/software are demanded by
regulatory requirements, re-qualification does not need to be conducted on the
non-genuine assemblies. Instead, the asset has to be recycled for non-regulatory purposes.

When machinery/equipment qualification is conducted by a standard-endorsed third
party, such as an ISO-accredited company for a particular division, the process is
called certification. Currently, the coverage of ISO/IEC 15408 certification by an ISO/IEC
27001 accredited organization is limited; the scheme requires a fair amount of effort to
become popular.

5.1 System testing

System testing of software or hardware is testing conducted on a complete, integrated
system to evaluate the system's compliance with its specified requirements. System testing falls
within the scope of black box testing, and as such, should require no knowledge of the inner
design of the code or logic.

As a rule, system testing takes, as its input, all of the "integrated" software components
that have passed integration testing and also the software system itself integrated with any
applicable hardware system(s). The purpose of integration testing is to detect any
inconsistencies between the software units that are integrated together (called assemblages) or
between any of the assemblages and the hardware. System testing is a more limited type of
testing; it seeks to detect defects both within the "inter-assemblages" and also within the system
as a whole.

System testing is performed on the entire system in the context of a Functional
Requirement Specification (FRS) and/or a System Requirement Specification (SRS). System
testing tests not only the design, but also the behavior and even the believed expectations of the
customer. It is also intended to test up to and beyond the bounds defined in the
software/hardware requirements specification.
Types of tests to include in system testing
The following are different types of testing that should be considered during system
testing:

• Graphical user interface testing
• Usability testing
• Software performance testing
• Compatibility testing
• Exception handling
• Load testing
• Volume testing
• Stress testing
• Security testing
• Scalability testing
• Sanity testing
• Smoke testing
• Exploratory testing
• Ad hoc testing
• Regression testing
• Installation testing
• Maintenance testing
• Recovery testing and failover testing
• Accessibility testing, including compliance with:
    • Americans with Disabilities Act of 1990
    • Section 508 Amendment to the Rehabilitation Act of 1973
    • Web Accessibility Initiative (WAI) of the World Wide Web Consortium (W3C)

Although different testing organizations may prescribe different tests as part of System testing,
this list serves as a general framework or foundation to begin with.

Structure Testing:
It is concerned with exercising the internal logic of a program and traversing particular
execution paths.
Output Testing:
• The output of each test case is compared with the expected result created during the
design of the test cases.
• The output generated or displayed by the system under consideration is tested by asking
the users about the format they require.
• The output format is considered in two ways: on screen and in printed form.
• The output on the screen is found to be correct, as the format was designed in the system
design phase according to user needs.
• The printed (hard copy) output also matches the requirements specified by the user.
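
Comparing actual output with expected results, as described above, can be automated. The following is a minimal sketch: format_transcript is a hypothetical helper modelled on the lower-casing and the "i am saying:- " prefix used by the program in Section 6.1, and the test cases are invented for illustration.

```python
def format_transcript(recognized_text):
    """Format recognized speech the way the program prints it (hypothetical helper)."""
    return "i am saying:- " + recognized_text.strip().lower()


# Each test case pairs an input with the expected output decided at design time.
TEST_CASES = [
    ("Hello World", "i am saying:- hello world"),
    ("  Open the DOOR ", "i am saying:- open the door"),
    ("", "i am saying:- "),
]


def run_output_tests():
    """Return a list of (input, expected, actual) for every failing case."""
    failures = []
    for given, expected in TEST_CASES:
        actual = format_transcript(given)
        if actual != expected:
            failures.append((given, expected, actual))
    return failures


failures = run_output_tests()
print("all cases passed" if not failures else failures)
```

Keeping the expected results in one table makes it easy to add a case whenever a user reports a formatting problem.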

5.2 User Acceptance Testing:

• This is the final stage before handing over to the customer. It is usually carried out by
the customer, and the test cases are executed with actual data.
• The system under consideration is tested for user acceptance by keeping in constant
touch with the prospective system users during development and making changes
whenever required.
• It involves planning and executing various types of tests in order to demonstrate that
the implemented software system satisfies the requirements stated in the requirement
document.
Two sets of acceptance tests are to be run:
1. Those developed by the quality assurance group.
2. Those developed by the customer.

6. RESULT, CONCLUSION AND DISCUSSION

6.1 Code: Input Code
# Python program to translate
# speech to text and text to speech
# pip install pyttsx3
# pip install SpeechRecognition
# pip install pyaudio

import speech_recognition as sr
import pyttsx3

# Initialize the recognizer
r = sr.Recognizer()


# Function to convert text to speech
def SpeakText(command):
    # Initialize the engine
    engine = pyttsx3.init()
    engine.say(command)
    engine.runAndWait()


# Loop indefinitely so the user can keep speaking
while True:
    # Handle exceptions raised at runtime
    try:
        # Use the microphone as the source for input
        with sr.Microphone() as source2:
            # Sample the surroundings for 0.2 seconds so the recognizer
            # can adjust its energy threshold to the noise level
            r.adjust_for_ambient_noise(source2, duration=0.2)

            # Listen for the user's input
            audio2 = r.listen(source2)

            # Use Google's speech recognition service to recognize the audio
            MyText = r.recognize_google(audio2)
            MyText = MyText.lower()

            print("i am saying:- " + MyText)
            SpeakText(MyText)

    except sr.RequestError as e:
        # The recognition service could not be reached
        print("Could not request results; {0}".format(e))

    except sr.UnknownValueError:
        # The speech could not be understood
        print("unknown error occurred")
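
The loop above runs until the program is interrupted. One possible refinement, shown purely as a sketch, is to leave the loop when the recognized text is a stop phrase. The helper name is_stop_command and the phrases are assumptions and are not part of the original program; simulated text stands in for the recognizer output so the example runs without a microphone.

```python
STOP_PHRASES = ("stop listening", "exit", "quit")  # assumed phrases, adjust as needed


def is_stop_command(text):
    """Return True when the recognized text matches one of the stop phrases."""
    return text.strip().lower() in STOP_PHRASES


# Simulated recognizer results stand in for r.recognize_google(...) here.
for heard in ["hello there", "what time is it", "Exit", "never reached"]:
    if is_stop_command(heard):
        print("stopping")
        break
    print("i am saying:- " + heard.lower())
```

In the real program the check would sit right after the call that produces MyText, before the text is spoken back.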

[Figure: The input code and its output in Visual Studio Code for voice detection using
Python]

[Figure: Installation of the libraries from the command prompt for voice detection using
Python]

6.2 Output:

[Figure: Output of voice detection using Python]

The input and the output show the detection of voice by Python.

6.3 Useful applications of voice recognition system

• Turning on live captions in Google Meet.
• Voice search and voice-to-text on mobile phones.
• Telephony and other domains.
• Voice commands to smart home devices.
• Voice recognition for translation applications.
• Virtual assistants on our phones, such as Siri and Google Assistant.
• Voice commands to a vehicle.
• Smart speakers at home, such as Amazon Echo with Alexa.
• Learning languages.
• Medical transcription.
• Voice identification for security.
• Mobile payments with voice recognition.
• AI assistants that recognize your specific voice.

6.4 Conclusion
• Speech is the most common means of communication, and a voice recognition system
builds on it.
• Background noise is handled by the Recognizer class, which has a built-in method called
adjust_for_ambient_noise that also takes a duration parameter.

• Voice recognition gives us the power to communicate with our devices without even
writing one line of code. This makes technological devices more accessible and easier
to use.
• The majority of the world's population relies on speech to communicate with one another.
• Voice recognition helps us to save time by speaking instead of typing.
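
To make the point about background noise concrete, the sketch below shows one way an energy threshold could be calibrated from a short stretch of ambient sound: measure the RMS energy of each chunk and move the threshold towards a multiple of it. This is a simplified illustration and not the SpeechRecognition library's actual implementation; the function names, the starting value, the ratio of 1.5 and the smoothing factor are all assumptions.

```python
import math
import random


def rms(chunk):
    """Root-mean-square energy of one chunk of samples."""
    return math.sqrt(sum(s * s for s in chunk) / len(chunk))


def calibrate_threshold(chunks, start=300.0, ratio=1.5, smoothing=0.5):
    """Move the threshold towards ratio * ambient energy, one chunk at a time."""
    threshold = start
    for chunk in chunks:
        target = rms(chunk) * ratio
        threshold = smoothing * threshold + (1 - smoothing) * target
    return threshold


def is_speech(chunk, threshold):
    """Treat a chunk as speech when its energy exceeds the threshold."""
    return rms(chunk) > threshold


random.seed(1)  # fixed seed so the example is repeatable
# Synthetic data: ambient noise around amplitude 100, "speech" around amplitude 2000.
ambient = [[random.gauss(0, 100) for _ in range(1024)] for _ in range(20)]
speech_chunk = [random.gauss(0, 2000) for _ in range(1024)]

threshold = calibrate_threshold(ambient)
print(round(threshold), is_speech(ambient[0], threshold), is_speech(speech_chunk, threshold))
```

A longer calibration period gives the threshold more chunks to settle on, at the cost of a longer wait before listening starts.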

6.5 Future Scope

• In the future, users will routinely speak to voice assistants; the idea of telling
Alexa to boil the water, and much more, will be a prominent part of our
everyday lives sooner than we think.
• Some of the applications which can be developed in future are security based on
voice biometrics, voice assistants in the workplace, and catching criminals using voice.
• This is thanks to smart devices such as Amazon’s ‘Alexa’ and Google’s ‘Home Hub’.
• Streamlined conversations: Google and Amazon recently announced that their
voice assistants will stop requiring the user to say ‘wake’ words such as ‘Alexa’
or ‘Google’ to start a conversation. This makes interacting with these assistants
more natural for users, not to mention much more convenient.
Such devices are also expected to get better at understanding contextual
factors.
• Focus on security: Some 41% of voice device users say that they are concerned
about confidentiality while using their devices. This is why Amazon and Google
introduced a number of security measures (including speaker ID and verification) in
their voice assistant technologies. New solutions are also in the pipeline to make it
more secure for customers to buy things using voice.

7. RECOMMENDATION

I am writing to recommend the marketing services of Pantech Solutions. Pantech
has created and implemented many successful sessions for students across various
age groups since its inception. I have worked here as an intern for a period of
one month and I can tell that their knowledge and attention to detail have helped
me in learning my basics and further topics in a proper and systematic manner. I
feel confident in recommending Pantech Solutions. The atmosphere created here
is perfect for learning new things. The guide here is not only thorough with his
concepts but also provides a simple and easy explanation for the most complex
topics. The guide is always willing to take time to discuss concerns and
responds to questions and queries. He is detail-oriented, organized, and
always open to constructive feedback, making our business relationship both
effortless and pleasant.

8. SELF ASSESSMENT

Finally, I learned that there is no right or wrong in our application design, and there is no perfect
way to achieve a design solution. The only way to reach a design goal and perfect our product
is through iteration and incorporation of user feedback. Iteration makes my design better
and better, and each iteration keeps me focused on the problem I want to solve. Input from
design critique makes me look at the problem from different directions and helps shape a better
product.
Questioning: My question is clear, well-focused and requires high-level thinking skills
in order to research.
Planning: I made really good use of my time. I was able to remain focused
on the tasks and make changes when I needed to. I was able to develop a clear method to
organize my information. I was able to make revisions in my plan when needed.
Gathering: I used a variety of resources and carefully selected only the information that
answered my question. I was able to continually revise my search based on information I found.
Sorting: I thoroughly selected and organized information that answered my question in an
organized way. I selected information that was appropriate.
Synthesizing: My product answers the question in a way that reflects learning using some
detail and accuracy.

9. APPENDIX
C:\Users\Office\OneDrive\Desktop\project11asif.pptx