Building Applications With The Furhat Robot
Introduction
Dialog framework
Multimodal utterances
Wizard-of-Oz
Logging interactions
Summary
Furhat Robotics
Introduction
The Furhat Robot is a state-of-the-art conversational robot and
Building applications with the Furhat Robot
Since the potential use cases of the Furhat Robot are so many, we
believe that the best way to accommodate developers is to provide
different ways of programming the robot. Currently, we provide three
ways of doing this, as illustrated in Figure 2:
• The Remote API, by which you can access the core Robot I/O
functionality, using any programming language.
• The Kotlin Skill API, by which you have access to the Skill Framework
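from furhat_remote_api import FurhatRemoteAPI
# Connect to the Remote API of a robot (or Virtual Furhat) on this machine
furhat = FurhatRemoteAPI("localhost")
furhat.set_voice(name='Matthew')
# Speak, then listen for the user's reply
furhat.say(text="Hi there!")
result = furhat.listen()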
The core I/O functionality that the Remote API supports includes:
• Speech synthesis: We use the Amazon Polly TTS and Acapela TTS,
with support for 210 voices and 43 languages. You can send a text to
synthesize to Furhat, and synchronized lip movements will be added
automatically. By using SSML tags, you can modify speaking rate,
stress, etc. You can also send a pointer to an audio file to play. If it
contains speech, lip movements will also be added.
• Face tracking: The users in front of Furhat are tracked with the built-in
camera. You can get access to the location of these users, as well as
their head pose and facial expressions.
• LED ring: You can control the color of the LED ring under Furhat.
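As a minimal sketch of the SSML point above, the following Python helpers build a marked-up string that changes speaking rate and stress. The helper names (ssml_prosody, ssml_emphasis, ssml_speak) are hypothetical and not part of any Furhat API. The tags themselves come from the SSML standard, but which tags a given voice honors, and whether the enclosing <speak> element is required, is an assumption to check against your TTS voice.

```python
from xml.sax.saxutils import escape, quoteattr


def ssml_prosody(text: str, rate: str = "medium") -> str:
    """Wrap text in an SSML <prosody> element that sets the speaking rate."""
    return f"<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>"


def ssml_emphasis(text: str, level: str = "strong") -> str:
    """Wrap text in an SSML <emphasis> element to stress it."""
    return f"<emphasis level={quoteattr(level)}>{escape(text)}</emphasis>"


def ssml_speak(*parts: str) -> str:
    """Join already marked-up parts inside a single <speak> root element."""
    return "<speak>" + " ".join(parts) + "</speak>"


# Hypothetical utterance: a slow greeting followed by a stressed instruction.
utterance = ssml_speak(
    ssml_prosody("Welcome to the lab.", rate="slow"),
    ssml_emphasis("Please wait here."),
)
print(utterance)
```

The resulting string is what you would send as the text to synthesize. Escaping the plain text first keeps characters such as "<" or "&" from being read as markup.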
The Skill Framework
While the basic I/O functionality might be suitable if you already
You access the Skill Framework through the Skill API, which is
implemented in the Kotlin programming language. Kotlin is a modern
language that runs on the Java Virtual Machine (JVM), which means that
you can use all existing Java libraries directly from Kotlin. One of the
strengths of Kotlin is that it is statically typed, which means that you
will get very good code completion when developing in an IDE, to
explore all the different methods provided, get documentation, and
verify that you use the API correctly. We recommend developing your
skill in the IntelliJ IDE, which has native support for Kotlin.
You can run and debug your skill directly from the IDE, either against
a Furhat Robot or against a Virtual Furhat running on your computer. When
you want to deploy your skill, you can package it as a single skill file,
upload it to your robot, and distribute it to others if you want.
This way, the robot can run your skill completely stand-alone, without
any additional computer connected.
Note that you can also use the Skill API even if you are only interested
in the basic Robot I/O functionality. For example, if you have built
an application in Google DialogFlow and want to integrate it with
the Furhat Robot, it is very easy to connect your skill to a Google
DialogFlow agent, and make the Furhat Robot ask questions and read
out the responses.
Dialog framework
framework for managing the interaction. Basically, you can say that
the Flow defines how the robot should react to various events (such as
sensory input), depending on which state it is in. In dialog systems, this
is often referred to as dialog management, but since human-robot
interaction is multimodal, the Flow handles not only verbal input and
output, but also users entering and leaving, people shifting attention
and making facial gestures, etc.
The states are defined in a hierarchical fashion, which means that you
can define general behaviors (such as what should happen if a new
user enters the interaction) in a parent state, and then more specific
behaviors in the leaf states.
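The idea of hierarchical states can be illustrated with a small, framework-independent Python sketch. It is not the Furhat Flow implementation: the State class, the event names and the returned strings are all hypothetical, chosen only to show how an event that a leaf state does not handle falls through to its parent.

```python
from typing import Callable, Dict, Optional


class State:
    """A dialog state whose unhandled events are delegated to its parent."""

    def __init__(self, name: str, parent: Optional["State"] = None) -> None:
        self.name = name
        self.parent = parent
        self.handlers: Dict[str, Callable[[], str]] = {}

    def on(self, event: str, handler: Callable[[], str]) -> "State":
        """Register a handler for an event and return self for chaining."""
        self.handlers[event] = handler
        return self

    def handle(self, event: str) -> Optional[str]:
        """Run the nearest handler, searching this state and then its ancestors."""
        state: Optional[State] = self
        while state is not None:
            if event in state.handlers:
                return state.handlers[event]()
            state = state.parent
        return None  # no state in the hierarchy handles this event


# General behavior lives in the parent state ...
interaction = State("Interaction").on("user_enter", lambda: "glance at the new user")
# ... and more specific behavior lives in the leaf state.
order_fruit = State("OrderFruit", parent=interaction).on(
    "response", lambda: "confirm the order"
)

print(order_fruit.handle("response"))
print(order_fruit.handle("user_enter"))
```

The leaf state answers "response" itself, while "user_enter" is resolved by the parent, so the general behavior only has to be written once.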
One of the strengths of Kotlin is its support for DSLs (domain-specific
languages), which allow you to write well-structured code in
a declarative way. The Flow is an example of this, and it allows you to
easily define Furhat’s behavior at a high level of abstraction. Here is a
very simple example of how two states within a Flow can be defined:
val OrderFruitState = state {
    onEntry {
        // ...
    }
    // Triggered when the user's reply matches the BuyFruit intent
    onResponse<BuyFruit> {
        // ...
        users.current.fruits += fruits
        furhat.say {
            +"Okay, ${fruits.text}"
            +Gestures.Smile
            // ...
        }
    }
    onResponse<No> {
        goto(EndState)
    }
    // Any other response: enter this state again
    onResponse {
        reentry()
    }
    onNoResponse {
        reentry()
    }
}

val EndState = state {
    onEntry {
        furhat.say("Goodbye!")
        furhat.attendNobody()
    }
}
Natural language understanding (NLU)
is also integrated with the NLU engine. For example, the trigger
onResponse<No> will be triggered if the user says something that
can be interpreted as the “intent” No, such as “nope” or “don’t think
so”, regardless of which of the supported languages was used. The
developer can use any of the built-in intents, or define new ones that
are specific to the domain. In the example above, the BuyFruit intent
would be an example of that. The intents also have “entities” in them,
which would be the fruit that is being ordered in this example. The
platform also comes with a set of built-in entities, such as time and
date expressions. If you have used other modern NLU frameworks,
such as DialogFlow, LUIS or RASA, you will find the basic principles to
be familiar. However, here the NLU is tightly integrated with the flow.
This means that the potential intents that can be recognized depend
on which state the dialog is currently in, and will be automatically
classified. Since the intents and entities are defined programmatically
in Kotlin, you can also create integrations with your own database or
other backends for creating these dynamically.
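To make the state-dependent recognition concrete, here is a toy Python sketch. It uses plain phrase matching rather than the platform's NLU engine, and every name in it (INTENT_EXAMPLES, STATE_INTENTS, FRUITS, classify, the "Confirm" state) is hypothetical. It only shows that the same utterance can be classified differently, or not at all, depending on which intents the current state listens for.

```python
import re
from typing import Dict, List, Optional

# Example phrases per intent (illustrative only; a real NLU engine generalizes).
INTENT_EXAMPLES: Dict[str, List[str]] = {
    "No": ["no", "nope", "don't think so"],
    "Yes": ["yes", "sure", "yeah"],
    "BuyFruit": ["i want", "i would like", "can i have"],
}

# Entity values that the BuyFruit intent can carry.
FRUITS: List[str] = ["apple", "banana", "orange"]

# Which intents each dialog state listens for.
STATE_INTENTS: Dict[str, List[str]] = {
    "OrderFruit": ["BuyFruit", "No"],
    "Confirm": ["Yes", "No"],
}


def contains_phrase(text: str, phrase: str) -> bool:
    """True if the phrase occurs in the text as whole words."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text) is not None


def classify(state: str, utterance: str) -> Optional[dict]:
    """Return the first active intent that matches, with any fruit entities."""
    text = utterance.lower().replace("\u2019", "'")  # normalize curly apostrophes
    for intent in STATE_INTENTS.get(state, []):
        if any(contains_phrase(text, p) for p in INTENT_EXAMPLES[intent]):
            fruits = [f for f in FRUITS if contains_phrase(text, f)]
            return {"intent": intent, "fruits": fruits if intent == "BuyFruit" else []}
    return None  # nothing the current state listens for was recognized


print(classify("OrderFruit", "I would like an apple and a banana"))
print(classify("OrderFruit", "yes"))  # "Yes" is not active in this state
print(classify("Confirm", "yes"))
```

Because the candidate intents are derived from the state, adding a new response handler to a state is all it takes to make that intent recognizable there.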
Multimodal utterances
In many cases, you want to combine facial expressions and other behaviors
with the robot’s speech. In the example above, a smile was inserted
in the middle of the robot’s response after the user triggered the
intent BuyFruit. You can mix in any behaviors, including gaze shifts or
arbitrary code to be executed, to form multimodal utterances.
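One way to picture a multimodal utterance is as an ordered list of parts, where each part is speech, a gesture, or a piece of code. The Python sketch below is a conceptual model only, not the Skill API: the Speech and Gesture classes and the perform function are hypothetical, and the parts run strictly one after another, whereas a real robot would overlap gestures with speech.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Speech:
    text: str


@dataclass
class Gesture:
    name: str


# A part is speech, a gesture, or arbitrary code to run at that point.
Part = Union[Speech, Gesture, Callable[[], None]]


def perform(parts: List[Part], trace: List[str]) -> None:
    """Execute the parts in order, recording what the robot would do."""
    for part in parts:
        if isinstance(part, Speech):
            trace.append(f"say: {part.text}")
        elif isinstance(part, Gesture):
            trace.append(f"gesture: {part.name}")
        else:
            part()  # arbitrary code mixed into the utterance


basket: List[str] = []
trace: List[str] = []
perform(
    [
        Speech("Okay, an apple"),
        Gesture("Smile"),
        lambda: basket.append("apple"),  # side effect in the middle of the utterance
        Speech("Anything else?"),
    ],
    trace,
)
print(trace)
print(basket)
```

Keeping the parts in a single sequence makes the intended timing explicit: the smile and the basket update happen between the two spoken segments.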
Wizard-of-Oz
The Skill Framework has very powerful built-in support for Wizard-of-
Oz. You can simply add onButton triggers in your flow with associated
robot behaviors. These buttons will then appear in the dashboard of
the web console where you can monitor and manage the Furhat Robot,
as shown in Figure 3. This means that you can have different buttons
appearing, depending on the current state of the dialog. And you can
mix autonomous behavior with controlled behavior. As you can see in
the figure, the buttons can also be organized with colors and grouped.
The camera feed from the robot is shown to the left. If you click in the
camera view, you can make Furhat attend a specific location or a user
(and then automatically follow that user as they move around).
Logging interactions
very useful to be able to log the interactions. The SDK offers a Log
Viewer tool, where you can see logs from your interactions, as shown
in Figure 4. There you can see detailed timestamps of events, read
the transcriptions of what has been said, and also listen to the user’s
speech. Note that your interactions are not automatically logged, but it
is very easy to start and stop logging from anywhere within your skill.
Figure 4: The Log Viewer tool
Using Blockly is a very good (and fun!) way to familiarize yourself with
the basic principles of skill development and the functionality of the
Furhat Robot. It is also very good if you want to build simple Wizard-
of-Oz interactions (as shown in Figure 5), if you want to prototype a
skill, or to make stage performances. It can also be used in education.
However, it should be noted that there is a limit to how complex the
interactions you build with this tool can be, and you will not be able
to integrate your interactions with external components, such as
Summary
The Furhat Robot is a state-of-the-art social and conversational robot
Remote API
Strengths:
• Support for 50+ programming languages.
• Easy to get started and integrate with existing software.
Limitations:
• Limited to basic robot I/O.
• Requires another dialog system running on a separate system.
Skill Framework & Skill API
Strengths:
• Support for all functionality in the Furhat platform, including tools such as dialog flow, NLU, Wizard-of-Oz and Logging.
• Can be used to build complete interactions (Skills).
Limitations:
• You have to learn Kotlin.
• If other components or frameworks you want to use are not Kotlin or Java-based, it will require a more advanced integration.
Blockly
Strengths:
• Very easy to get started and learn.
Limitations:
• Can only be used for simpler applications.
@furhatrobotics
More info
www.furhatrobotics.com
The contents of this document are subject to continuous improvement and revision, in line with the evolution of Furhat products
Copyright Furhat Robotics AB © 2021 - Commercial In Confidence