Speech Recognition in Python using CMU Sphinx
Last Updated :
28 Apr, 2025
"Hey, Siri!", "Okay, Google!" and "Alexa, play some music" are phrases that have become an integral part of our lives, as giving voice commands to virtual assistants makes life a lot easier. But have you ever wondered how these devices take commands by voice?
Do applications really understand your voice? How does a computer decode speech if it only understands 0s and 1s?
The answer is simple: it uses speech recognition software to decode the input received as speech through the device's microphone. The task of this software is to convert the speech to a string (text) so that the computer can then process it.
One such toolkit is CMU Sphinx, an open-source toolkit for speech recognition. It includes a lightweight recognizer library called Pocketsphinx, which is used here to recognize speech. This library is especially useful when you are offline. When you have internet access, an online service such as the Google speech recognition API is usually preferable because of its higher accuracy, but when you are building a project that works offline or runs on an offline embedded device, use Pocketsphinx.
Recognition Process
Let's discuss how this library works behind the scenes to recognize speech. It takes a waveform and splits it into utterances at the silences between them. It then tries to work out what is being said in each utterance: it considers all possible combinations of words, tries to match them against the audio, and chooses the best-matching combination.
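To make the first step concrete, here is a minimal sketch of splitting a signal into utterances at silences. It is only an illustration, not the algorithm CMU Sphinx actually uses: the function name split_on_silence, the amplitude threshold, and the min_silence length are all assumptions chosen for this example, and real recognizers typically work on frame energy rather than individual samples.

```python
def split_on_silence(samples, threshold=500, min_silence=3):
    """Return (start, end) index pairs of the non-silent runs in samples.

    A sample counts as quiet when its absolute value is below threshold.
    An utterance ends once min_silence quiet samples occur in a row.
    The end index is exclusive, so samples[start:end] is the utterance.
    """
    utterances = []
    start = None  # index where the current utterance began
    quiet = 0     # consecutive quiet samples seen inside an utterance
    for i, sample in enumerate(samples):
        if abs(sample) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence:
                # Close the utterance just after its last loud sample.
                utterances.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # The signal ended while an utterance was still open.
        utterances.append((start, len(samples) - quiet))
    return utterances


# Toy signal: two bursts of sound separated by three quiet samples.
signal = [0, 0, 900, 800, 0, 700, 0, 0, 0, 600, 650, 0]
print(split_on_silence(signal))  # [(2, 6), (9, 11)]
```

Note that the single quiet sample inside the first burst does not split it, because the pause is shorter than min_silence.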
Installation of modules
Since pocketsphinx is an external library, i.e. it is not built into Python, we install it with pip and then use import to access its functionality.
Open your terminal and run the commands below.
NOTE: make sure you have the latest version of pip installed. If not, run the following:
python -m pip install --upgrade pip setuptools wheel
If you already have the latest version of pip, proceed directly and run the following command in your terminal:
pip install pocketsphinx
Now that pocketsphinx is installed on your machine, let's move on.
Prerequisites
Two prerequisite libraries are used alongside pocketsphinx:
- SpeechRecognition - used for speech recognition, with support for several engines and APIs, online and offline.
- PyAudio - used to play and record audio in Python.
It is recommended to install these two libraries with pip. The brew command applies to macOS (Homebrew) and installs PortAudio, which PyAudio depends on:
pip install SpeechRecognition
brew install portaudio
pip install pyaudio
All the required external libraries are now installed, so let's move on to the code.
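Before going further, it can help to confirm that the three packages are importable. The sketch below uses only the standard library; the helper name missing_modules is made up for this example. Note that the import names differ from the pip package names: SpeechRecognition is imported as speech_recognition and PyAudio as pyaudio.

```python
from importlib.util import find_spec


def missing_modules(names=("pocketsphinx", "speech_recognition", "pyaudio")):
    """Return the module names from names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

If any name is reported as missing, repeat the corresponding pip install command above.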
LiveSpeech
LiveSpeech is an iterator class provided by pocketsphinx that can be used for continuous recognition or keyword search from a microphone.
Here is the code for continuous recognition.
Python3
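# import LiveSpeech
from pocketsphinx import LiveSpeech
# LiveSpeech() listens to the microphone and yields one phrase per utterance
for phrase in LiveSpeech():
    # each recognized phrase is stored in phrase and printed as text
    print(phrase)
else:
    # a for-else branch runs only if the loop ends without a break,
    # i.e. when the audio stream stops
    print("Sorry! could not recognize what you said")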
We use LiveSpeech in a basic for-in loop to fetch continuous speech input from the user through the device microphone. Each recognized utterance is converted to a string, stored in phrase, and printed.
Keyword searching
We create a variable named speech of type pocketsphinx.LiveSpeech by calling the LiveSpeech class with the arguments keyphrase (the keyword to search for) and kws_threshold (the detection threshold). We then run a for-in loop over speech, which continuously listens for voice input. Whenever the user utters the word 'forward', it is printed along with its segments.
Python3
# import LiveSpeech
from pocketsphinx import LiveSpeech
# listen only for the keyphrase 'forward'
speech = LiveSpeech(keyphrase='forward', kws_threshold=1e-20)
# a for-in loop that iterates over speech
for phrase in speech:
    # print the detected keyword together with its segment details
    print(phrase.segments(detailed=True))
Test program
First, import speech_recognition under a short alias, here aud.
Next, create an object a of type speech_recognition.Recognizer, which captures audio and converts it to text. Then declare the microphone as the input source and call listen, storing the captured speech in a variable named audio. Finally, inside a try block, call recognize_sphinx with audio as its argument; this method converts what the user said to text, which we print to the console. This conversion is what we call recognition.
If the speech cannot be understood, for example because it is unclear, recognize_sphinx raises UnknownValueError. If Sphinx itself cannot be used, for example because it is not installed correctly, it raises RequestError. We catch both.
Python3
import speech_recognition as aud
# create a Recognizer, which captures audio and converts it to text
a = aud.Recognizer()
# declare the device microphone as the source of audio input
with aud.Microphone() as source:
    print("Say something!")
    # listen until the speaker pauses and store the recording in audio
    audio = a.listen(source)
# invoke Sphinx for speech recognition
try:
    # print the recognized text
    print("You said " + a.recognize_sphinx(audio))
except aud.UnknownValueError:
    # the speech was unintelligible
    print("Could not understand")
except aud.RequestError as e:
    # Sphinx is missing or not installed correctly
    print("Error; {0}".format(e))
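The same Recognizer can also read from a recorded file instead of the microphone, which is handy for testing without speaking each time. The following is a minimal sketch, assuming SpeechRecognition and pocketsphinx are installed; the function name transcribe_wav is a placeholder chosen for this example and is not part of either library.

```python
def transcribe_wav(path):
    """Transcribe a WAV file offline with Sphinx; return None if unclear."""
    # Imported here so that merely defining the function does not
    # require the package to be installed.
    import speech_recognition as aud

    recognizer = aud.Recognizer()
    # AudioFile plays the same role as Microphone in the program above
    with aud.AudioFile(path) as source:
        # record() reads the whole file into an AudioData object
        audio = recognizer.record(source)
    try:
        return recognizer.recognize_sphinx(audio)
    except aud.UnknownValueError:
        # the speech in the file was unintelligible
        return None
```

A call such as transcribe_wav("command.wav"), where command.wav stands for any recording of your own, returns the recognized text. RequestError is deliberately left uncaught in this sketch so that installation problems remain visible to the caller.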
Conclusion
This wraps up our discussion of speech recognition using CMU Sphinx. There are many more applications of this useful library.