
Now that you have a basic chatbot, we're going to learn about interacting with it through voice. To achieve this, we'll be calling the IBM Watson Speech to Text and Text to Speech APIs from Python. The idea behind this lab is to show you how to interface with different Watson services, specifically leveraging the Python SDK. Before we begin, create a new Speech to Text service and Text to Speech service from within IBM Cloud, and copy your API Keys for the services somewhere.

To begin, head to https://labs.cognitiveclass.ai just like you did in Lab 4. Then, open a new Python notebook, and follow these steps:

1. Install packages - In this case, you're only going to need one extra Python package: the
`ibm_watson` package. Instead of calling the IBM Watson REST APIs manually, this package acts
as a wrapper. It removes a lot of the hard work, especially for the speech services. Type the following
code into the cell:

!pip install ibm_watson
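
If you'd like to confirm the package is available before moving on, one quick check (assuming Python 3.8 or newer, which ships importlib.metadata in the standard library) is to print the installed version:

from importlib.metadata import version

# Should print the installed ibm-watson release, e.g. "5.x.x"
print(version("ibm-watson"))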

2. Import the right modules - For this lab, you'll need to import the following:

1. os - to run commands in the environment via "os.popen".
2. glob.glob - to find audio files.
3. IPython - to play synthesized audio inside the notebook.
4. ibm_cloud_sdk_core.authenticators.IAMAuthenticator - to help with API Key-based
authentication.
5. ibm_watson:
a) SpeechToTextV1 - the Speech to Text service wrapper.
b) AssistantV2 - the Assistant service wrapper.
c) TextToSpeechV1 - the Text to Speech service wrapper.

To do so, type the following code into the next cell, and run the code:

import os
from glob import glob

import IPython

from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson import SpeechToTextV1
from ibm_watson import AssistantV2
from ibm_watson import TextToSpeechV1

3. Implementing Speech to Text - In order to implement the Speech to Text service, you need to
first instantiate your service wrapper. To do so, create a new instance of `SpeechToTextV1`. You'll
need to pass your API key through the IAMAuthenticator type, as well as the endpoint URL which
you can find just under the API Key on the service instance page on IBM Cloud.

You'll also need to define two more constants:

1. "SPEECH_EXTENSION" - the extension of the audio files that Speech to Text will need to
analyze.

2. "SPEECH_AUDIOTYPE" - the type of audio that Speech to Text will analyze - Watson
supports these formats.

Then, define another function called "recognize_audio()". This function is simple: it waits for a new
audio file to appear in the current working directory (matching `SPEECH_EXTENSION`). As soon as one
appears, it reads the file, deletes it from the filesystem, and then passes its contents to Watson.
Once the file is sent to Watson through the "recognition_service.recognize()" function, Watson
returns a JSON object that can be accessed through the "get_result()" function.

To parse this JSON, you navigate the hierarchy to get to the transcription that Watson is most
confident in. This is how it's done:

1. "["results"][0]" - this will get the first set of results from Watson's response.

2. "["alternatives"][0]" - of all the alternative transcriptions, it'll get the first (most likely) one.

3. "["transcript"]" - of all the data Watson returns, only take the transcript string ("str" type in Python).

To implement all of this, you'll use the following code in a new cell:

recognition_service = SpeechToTextV1(IAMAuthenticator('{YOUR_APIKEY}'))
recognition_service.set_service_url('{YOUR_ENDPOINT}')

SPEECH_EXTENSION = "*.webm"
SPEECH_AUDIOTYPE = "audio/webm"

def recognize_audio():
    # Wait until an audio file shows up in the working directory
    while len(glob(SPEECH_EXTENSION)) == 0:
        pass
    filename = glob(SPEECH_EXTENSION)[0]
    audio_file = open(filename, "rb")
    # Remove the file so the next recording starts fresh
    os.popen("rm " + filename)
    result = recognition_service.recognize(
        audio=audio_file, content_type=SPEECH_AUDIOTYPE).get_result()
    # Return the transcript Watson is most confident in
    return result["results"][0]["alternatives"][0]["transcript"]
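
One thing to be aware of: if Watson hears no transcribable speech (say, a silent recording), "results" comes back as an empty list and the indexing above raises an IndexError. Below is a defensive sketch, using a hypothetical name "recognize_audio_safe" and reusing the service and constants from the cell above:

def recognize_audio_safe():
    # Same flow as recognize_audio(), but returns "" instead of
    # raising IndexError when Watson finds no transcribable speech.
    while len(glob(SPEECH_EXTENSION)) == 0:
        pass
    filename = glob(SPEECH_EXTENSION)[0]
    with open(filename, "rb") as audio_file:
        result = recognition_service.recognize(
            audio=audio_file, content_type=SPEECH_AUDIOTYPE).get_result()
    os.remove(filename)  # portable alternative to os.popen("rm ...")
    results = result.get("results", [])
    if not results:
        return ""
    return results[0]["alternatives"][0]["transcript"]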

Since you're running this code in a JupyterLab Notebook, you'll need to record your audio via a special
method. On the very left of your screen, click the Palette option (the icon is a Color Palette). Then,
from the resulting list, click "Record Audio".

You'll be greeted with a little window; click the microphone when you're ready.
When you're done recording, click the stop button, and a "webm" file should appear in the
current working directory.

4. Conversing with Watson Assistant - In order to facilitate the communication with the Assistant
service, let's define a helper function! This function will take some text from the user, and return
Watson's response. Before this function can be defined, we need to instantiate the wrapper around
the Assistant service itself. In order to do so, create a new instance of "AssistantV2". You'll need to
provide your API Key via an IAMAuthenticator through the "authenticator" argument. You'll also need
to provide a version of the AssistantV2 service - in this case, we're using "2019-02-28"; check the
documentation for the current version. You'll also need to define the Assistant ID of your
assistant. Finally, you'll need to specify your endpoint URL, which you can find on your service
instance page right under the API Key.

Next, we'll ask the Assistant to create a new "session". With a session, Watson can
automatically keep track of the context of a conversation. This means you don't need to handle the
context and pass it back and forth with Watson manually. To differentiate between sessions, you have
a session ID, which we store in "session_id". You can now define the "message_assistant" function.
This function does two things:

1. Message the assistant with the user's utterance and the current session ID, and get a JSON
response.

2. Return the first text response Watson sends back.

To implement this, you'll use the following code in a new cell:

assistant = AssistantV2(version='2019-02-28',
                        authenticator=IAMAuthenticator('{YOUR_APIKEY}'))
assistant.set_service_url('{YOUR_ENDPOINT}')

ASSISTANT_ID = "{YOUR_ASSISTANT_ID}"
# Create a session so Watson tracks conversation context for us
session_id = assistant.create_session(
    assistant_id=ASSISTANT_ID).get_result()["session_id"]

def message_assistant(text):
    # Send the user's utterance within the current session
    response = assistant.message(assistant_id=ASSISTANT_ID,
                                 session_id=session_id,
                                 input={'message_type': 'text', 'text': text}
                                 ).get_result()
    # Return the first generic (text) response from Watson
    return response["output"]["generic"][0]["text"]
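
Before adding voice, it's worth sanity-checking the round trip with plain text; the greeting below is just an example utterance:

# Quick text-only test of the Assistant round trip
print(message_assistant("Hello"))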

5. Hearing Watson's response - To enable a truly intuitive, end-to-end interactive experience,
let's use Text to Speech to synthesize audio and have Watson speak! Start by initializing the
"TextToSpeechV1" wrapper. Pass it your API Key through an IAMAuthenticator, and your API
endpoint, which you can find right under the API Key in your service dashboard on IBM Cloud. Then,
define a new function called "speak_text". This is what it'll do:

1. Open a new file "temp.wav".

2. Take the text that Watson needs to speak and pass it to the "synthesis_service.synthesize()"
function. Tell it we want WAV audio back, and that we want the "en-US_AllisonV3Voice" voice. You
can find more voices in the Text to Speech documentation.

3. Write Watson's response to the "temp.wav" file.

4. Play the "temp.wav" file.

This is the code you'll use to implement Text to Speech:

synthesis_service = TextToSpeechV1(IAMAuthenticator('{YOUR_APIKEY}'))
synthesis_service.set_service_url('{YOUR_ENDPOINT}')

def speak_text(text):
    with open('temp.wav', 'wb') as audio_file:
        # Ask Watson for WAV audio of the text, spoken by Allison
        response = synthesis_service.synthesize(
            text, accept='audio/wav',
            voice="en-US_AllisonV3Voice").get_result()
        audio_file.write(response.content)
    # Play the synthesized audio inline in the notebook
    return IPython.display.Audio("temp.wav", autoplay=True)
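
As a quick test (the sentence itself is arbitrary), synthesize a fixed line; the returned Audio object plays automatically when it's the last expression in a cell:

# You should hear Watson speak this line
speak_text("Hello! I am Watson.")
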
6. Putting the pieces together - Because of the way these functions work, the whole pipeline is
as simple as chaining them together! By calling "recognize_audio()", you're waiting for the user to
provide some input. That input is then passed to the "message_assistant()" function, and its
output is passed to "speak_text()", which plays the response for the user. To interact with the chatbot in
this lab, simply run this cell for every utterance. To be specific:

1. Run the following cell.

2. Record audio.

3. Wait until you hear Watson's response.

4. Repeat until you're done.

This is the simple code you'll need in the last cell:

speak_text(message_assistant(recognize_audio()))

That's all! Now, by running this cell every time you wish to speak to Watson, you'll be able to interact
in a natural, vocal manner.
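
As an optional variation (a sketch, not part of the lab), you can wrap the chain in a loop so you don't have to re-run the cell for each turn. Since only a cell's last expression renders automatically, the playback needs an explicit display() call, and you stop the loop with the notebook's interrupt button:

from IPython.display import display

# Loop until interrupted (Kernel > Interrupt Kernel in JupyterLab)
while True:
    display(speak_text(message_assistant(recognize_audio())))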
