Speech Recognition in Python using Google Speech API
Speech recognition means converting spoken words into text. It is used in various artificial intelligence applications such as home automation and speech-to-text tools. In this article, you’ll learn how to do basic speech recognition in Python using the Google Speech Recognition API.
Step 1: Install Required Library
We’ll use the SpeechRecognition library in Python. To install it, run the command below in a Google Colab or Jupyter notebook cell (in a regular terminal or command prompt, drop the leading !):
!pip install SpeechRecognition
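One detail that is easy to miss: the package is installed as SpeechRecognition but imported as speech_recognition. As a minimal sketch, the check below confirms that the module can be found after installation. The helper name is_installed is made up for this example and is not part of the library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# The import name differs from the pip package name (SpeechRecognition).
print("speech_recognition available:", is_installed("speech_recognition"))
```

If this prints False, re-run the install command in the same environment that runs your notebook.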
Step 2: Upload your Audio File
When you run the code below, Colab will ask you to upload a file from your computer. Use a clear .wav file for best results. You can download a sample audio file from here.
from google.colab import files
uploaded = files.upload()
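Before sending a file for recognition, it can help to confirm that it really is a readable WAV file. The sketch below uses Python's built-in wave module to report the basic properties of a file. The helper name describe_wav and the demo file silence.wav are assumptions made for this example; in Colab you would pass the uploaded file name, for example list(uploaded.keys())[0].

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


# Demo only: write one second of 16-bit mono silence, then inspect it.
# "silence.wav" is a placeholder name used for this illustration.
demo_path = os.path.join(tempfile.gettempdir(), "silence.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)       # mono
    out.setsampwidth(2)       # 2 bytes per sample = 16-bit
    out.setframerate(16000)   # 16 kHz
    out.writeframes(b"\x00\x00" * 16000)

info = describe_wav(demo_path)
print(info)
```

If wave.open raises wave.Error, the file is probably not an uncompressed PCM WAV and may need converting first.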
Step 3: Convert Audio to Text
Once the file is uploaded, load it with sr.AudioFile and pass the audio to recognize_google, which sends it to Google for transcription:
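import speech_recognition as sr

recognizer = sr.Recognizer()

# Name of the first uploaded file
filename = list(uploaded.keys())[0]

# Read the whole file into memory as audio data
with sr.AudioFile(filename) as source:
    print("Reading audio...")
    audio_data = recognizer.record(source)

try:
    # Send the audio to Google's speech recognition service
    text = recognizer.recognize_google(audio_data)
    print("\nRecognized Text:")
    print(text)
except sr.UnknownValueError:
    # The service could not produce a transcription
    print("Sorry, could not understand the audio.")
except sr.RequestError:
    # The request failed, for example because of a network problem
    print("Could not connect to Google API.")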
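If you want to reuse these steps or transcribe audio in another language, they can be wrapped in a small function. This is a minimal sketch rather than a definitive implementation: the function name transcribe_wav is made up for this example, while the language keyword is an existing parameter of recognize_google that defaults to "en-US".

```python
def transcribe_wav(path, language="en-US"):
    """Return the transcript of a WAV file, or None if recognition fails."""
    # Imported here so the helper can be defined before the library is
    # installed; a top-level import works just as well.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio_data = recognizer.record(source)

    try:
        # language takes a language tag such as "en-US" or "hi-IN"
        return recognizer.recognize_google(audio_data, language=language)
    except sr.UnknownValueError:
        # The service could not produce a transcription
        return None
    except sr.RequestError as err:
        # The request failed, for example because of a network problem
        print(f"Request to the speech service failed: {err}")
        return None
```

For example, transcribe_wav(filename, language="hi-IN") would request a Hindi transcription of the uploaded file, and a return value of None signals that no text was recognized.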
Output:

Speech to text
With the Google Speech API, basic speech recognition in Python takes only a few lines of code. You can build on it to control programs with your voice, take notes, or even create voice assistants.
You can download the source code from here.