Sphinx Speech Recognition
“Hey, Siri!”, “Okay, Google!” and “Alexa, play some music” are some of
the phrases that have become an integral part of our lives, because giving
voice commands to our virtual assistants makes life a lot easier. But have
you ever wondered how these devices take commands via voice/speech?
Do applications understand your voice? How does the computer even
decode this if it only understands 0s and 1s?
The answer is simple: it uses speech recognition software to decode the
user input received as speech/voice through the device’s microphone. The
task of this software is to convert the speech to a string (text) so that
the computer can then process it.
One such toolkit is CMU Sphinx, an open-source toolkit for speech
recognition. It also has a lightweight recognizer library called
Pocketsphinx, which we will use to recognize speech. This library is
especially useful when you are offline. When you have internet access, you
should generally prefer the Google API for speech recognition because of
its higher accuracy, but when you are building a project that works offline
or uses speech on an offline embedded device, use Pocketsphinx.
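The choice between the two engines can be written as a small helper. This is a minimal sketch, assuming the SpeechRecognition package introduced in the Prerequisites section, whose Recognizer offers recognize_google (online) and recognize_sphinx (offline, backed by Pocketsphinx). The transcribe function and its online flag are hypothetical names used only for illustration.

```python
def transcribe(recognizer, audio, online):
    """Return the text for `audio`, picking the engine by connectivity.

    `recognizer` is assumed to be a speech_recognition.Recognizer and
    `audio` the audio data it recorded. `online` is a hypothetical flag
    that the caller sets after checking for internet access.
    """
    if online:
        # Google Web Speech API: needs a network connection
        return recognizer.recognize_google(audio)
    # CMU Sphinx through Pocketsphinx: works fully offline
    return recognizer.recognize_sphinx(audio)
```

Keeping the decision in one place means the rest of a project does not need to know which engine produced the text.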
Recognition Process
Let’s discuss how this library works behind the scenes to recognize our
voice. It takes a waveform and splits it into utterances at the silences.
It then tries to work out what is being said in each utterance. To do
this, it takes all possible combinations of words and tries to match them
with the audio, choosing the best-matching combination.
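The first step, splitting a waveform into utterances at the silences, can be illustrated with a toy example. This is only a sketch of the idea and not how Pocketsphinx itself does it: it assumes the waveform is a plain list of amplitudes between -1 and 1, and split_utterances, frame_size, threshold and min_silence_frames are hypothetical names and values chosen for illustration. The word-matching step is not modeled here.

```python
def split_utterances(samples, frame_size=4, threshold=0.1, min_silence_frames=2):
    """Return (start, end) sample index pairs for each stretch of speech."""
    utterances = []
    start = None       # first sample of the utterance in progress
    end = None         # one past the last voiced sample seen so far
    silent_frames = 0  # consecutive silent frames since the last voiced one
    for i in range(0, len(samples), frame_size):
        frame = samples[i:i + frame_size]
        # average absolute amplitude as a crude loudness measure
        energy = sum(abs(s) for s in frame) / len(frame)
        if energy >= threshold:
            if start is None:
                start = i
            end = i + len(frame)
            silent_frames = 0
        elif start is not None:
            silent_frames += 1
            # enough silence in a row closes the current utterance
            if silent_frames >= min_silence_frames:
                utterances.append((start, end))
                start = None
                silent_frames = 0
    # the waveform may end while an utterance is still open
    if start is not None:
        utterances.append((start, end))
    return utterances


# two bursts of "speech" separated by a pause
wave = [0.0] * 4 + [0.5] * 8 + [0.0] * 8 + [0.6] * 4 + [0.0] * 4
print(split_utterances(wave))  # [(4, 12), (20, 24)]
```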
Installation of modules
Since pocketsphinx is an external library, i.e. it is not built into
Python, we install it on our machine with pip and then use import to
access its functionality.
Now open your terminal.
NOTE: make sure that you have the latest version of pip installed. If not,
type the following command:
python -m pip install --upgrade pip setuptools wheel
If you already have the latest version of pip, proceed directly and type
the following command into your terminal:
pip install pocketsphinx
Now that you have installed pocketsphinx on your machine, let’s move
on.
Prerequisites
Two prerequisite libraries are used alongside pocketsphinx:
1. SpeechRecognition – used for speech recognition, with support for
several engines and APIs, online and offline.
2. PyAudio – used to play and record audio in Python.
It is recommended to install these two libraries with pip. PyAudio depends
on the PortAudio library, which is installed here with Homebrew (the brew
command, for macOS):
pip install SpeechRecognition
brew install portaudio
pip install pyaudio
Now that all the required external libraries are installed, let’s move
on to the code.
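Before running the examples, it can help to confirm that the three packages are importable. The following is a minimal sketch using only the standard library. The missing_packages helper and the REQUIRED mapping are hypothetical names. The module names on the right are the ones each package is imported under.

```python
from importlib.util import find_spec

# pip package name -> name used in an import statement
REQUIRED = {
    "pocketsphinx": "pocketsphinx",
    "SpeechRecognition": "speech_recognition",
    "PyAudio": "pyaudio",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Still to install: " + ", ".join(missing))
else:
    print("All required libraries are installed")
```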
LiveSpeech
# import LiveSpeech
from pocketsphinx import LiveSpeech

for phrase in LiveSpeech():
    # each recognized utterance is stored in phrase,
    # and printing it displays the recognized words
    print(phrase)
else:
    # runs only if the audio stream ends and the loop finishes
    print("Sorry! could not recognize what you said")
Keyword searching
# importing livespeech
from pocketsphinx import LiveSpeech
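The listing above stops after the import. As a rough sketch of how a keyword search might continue, the function below follows the pattern shown in the README of the older pocketsphinx Python releases, where LiveSpeech accepts lm, keyphrase and kws_threshold arguments. Treat those arguments as an assumption to check against the version you have installed. The function name watch_for_keyphrase, the keyphrase "forward" and the threshold value are purely illustrative.

```python
def watch_for_keyphrase(keyphrase="forward", threshold=1e-20):
    """Print the keyphrase each time it is heard on the default microphone."""
    # imported here so this file can be loaded even without pocketsphinx
    from pocketsphinx import LiveSpeech

    # lm=False is assumed to switch off the general language model so that
    # only the keyphrase is searched for. The threshold trades missed
    # detections against false alarms and usually needs tuning.
    speech = LiveSpeech(lm=False, keyphrase=keyphrase, kws_threshold=threshold)
    for phrase in speech:
        print(phrase)
```

Calling watch_for_keyphrase() starts listening and keeps running until the program is interrupted.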
Test program
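import speech_recognition as aud

a = aud.Recognizer()
with aud.Microphone() as source:
    # audio holds what the user said
    audio = a.listen(source)
try:
    # print what the user said in text format
    print(a.recognize_sphinx(audio))
except aud.UnknownValueError:
    # if the voice is unclear
    print("Could not understand")
except aud.RequestError as e:
    print("Error; {0}".format(e))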
Conclusion