
Speech Recognition in Python using Google Speech API

Last Updated : 06 May, 2025

Speech recognition means converting spoken words into text. It is used in various artificial intelligence applications such as home automation and speech-to-text dictation. In this article, you’ll learn how to do basic speech recognition in Python using the Google Speech Recognition API.

Step 1: Install Required Library

We’ll use the SpeechRecognition library in Python. To install it, run the following command in a notebook cell (for example, in Google Colab). In a terminal or command prompt, omit the leading "!":

!pip install SpeechRecognition
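
To confirm that the installation worked before moving on, a quick check along the following lines may help. This is a minimal sketch using only the standard library; the helper name is_installed is hypothetical and not part of SpeechRecognition.

```python
import importlib.util


def is_installed(module_name="speech_recognition"):
    """Return True if the module can be located, without importing it."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if is_installed():
    print("SpeechRecognition is available.")
else:
    print("SpeechRecognition is missing; run: pip install SpeechRecognition")
```

Note that the package is installed as SpeechRecognition but imported as speech_recognition.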

Step 2: Upload Your Audio File

The code below runs in Google Colab. When you run it, it prompts you to upload a file from your computer. Use a clear .wav file for best results. You can download a sample audio file from here.

from google.colab import files
uploaded = files.upload()
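
Before sending a file for recognition, it can be useful to check that it really is a readable WAV file and to see how long it is. The sketch below uses Python's built-in wave module, which reads uncompressed PCM WAV files; the helper name describe_wav is hypothetical and chosen only for illustration.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate_hz": rate,
            "sample_width_bytes": wav_file.getsampwidth(),
            # Duration is the number of frames divided by frames per second.
            "duration_s": frames / rate,
        }
```

For example, describe_wav("sample.wav") would return a dictionary of these properties, where sample.wav is a placeholder file name. If the file is not a valid WAV file, wave.open raises wave.Error.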

Step 3: Convert Audio to Text

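Once the .wav file is uploaded, load it with sr.AudioFile and pass the recorded audio to recognize_google(). This call sends the audio to Google's speech recognition service, so an internet connection is required: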

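import speech_recognition as sr

recognizer = sr.Recognizer()
# Name of the first file uploaded in Step 2
filename = list(uploaded.keys())[0]

# Read the entire audio file into memory
with sr.AudioFile(filename) as source:
    print("Reading audio...")
    audio_data = recognizer.record(source)

try:
    # Send the audio to the Google speech recognition service
    text = recognizer.recognize_google(audio_data)
    print("\nRecognized Text:")
    print(text)
except sr.UnknownValueError:
    # The speech could not be transcribed
    print("Sorry, could not understand the audio.")
except sr.RequestError as e:
    # The service was unreachable or the request failed
    print(f"Could not connect to Google API: {e}")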
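
The same steps can be wrapped in a reusable function. The following is a minimal sketch, not the article's own method: the function name transcribe_wav and its (text, error) return convention are assumptions made for illustration. It relies on the language parameter of recognize_google(), which takes a language tag such as "en-US".

```python
def transcribe_wav(path, language="en-US"):
    """Return (text, error); exactly one of the two is None."""
    # Imported here so the helper can be defined even if the
    # library is not installed yet.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio_data = recognizer.record(source)

    try:
        text = recognizer.recognize_google(audio_data, language=language)
        return text, None
    except sr.UnknownValueError:
        return None, "could not understand the audio"
    except sr.RequestError as exc:
        return None, f"request failed: {exc}"
```

Returning the error as a value instead of printing it lets the caller decide what to do, for example retrying later when the request fails.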

Output:

[Figure: Speech to text]

With the SpeechRecognition library and the Google Speech API, basic speech recognition in Python takes only a few lines of code. The same approach can be used to control programs by voice, take notes or build voice assistants.

You can download source code from here.

