
Designing (and Learning From) a Teachable Machine
UX insights on designing simple, accessible interfaces
for teaching computers
By Barron Webster
Machine learning increasingly affects our digital lives—from recommending music to
translating foreign phrases and curating photo albums. It’s important for everyone
using this technology to understand how it works, but doing so isn’t always easy.
Machine learning is defined by its ability to “learn” from data without being explicitly
programmed, and ML-driven products don’t typically let users peek behind the
curtain to see why the system is making the decisions it does. Machine learning can
help find your cat photos, but it won’t really tell you how it does so.
Last October, Google's Creative Lab released Teachable Machine, a free online
experiment that lets you play with machine learning in your web browser, using your
webcam as an input to train a machine learning model of your very own—no
programming experience required. The team—a collaborative effort by Creative Lab
and PAIR team members, plus friends from Støj and Use All Five—wanted people
to get a feel for how machine learning actually “learns,” and to make the process
itself more understandable.
OK, but what are inputs? Or models? An input is just some piece of data—a picture,
video, song, soundbite, or article—that you feed into a machine learning model in
order to teach it. A model is a small computer program that learns from the
information it’s given in order to evaluate other information. For example, a feature
that recognizes faces in the photos stored on your phone probably uses a model
that was trained on inputs of many photos containing faces.
Let’s say you train a model by showing it a bunch of pictures of oranges (the inputs).

Later, you could show that model a new picture (of say, a kiwi). It would use the
pictures (of oranges) you showed it earlier to decide how likely it is that the new
picture contains an orange.  

You can also teach the model about different classes, or categories. In the case of
fruit, you might have one class that’s really good at recognizing oranges, and another
that’s good at recognizing apples (or kiwi).
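To make those ideas concrete, here's a minimal sketch of training by example in TypeScript, using TensorFlow.js's MobileNet and KNN classifier packages. This is an illustrative assumption on our part, not Teachable Machine's actual source code:

```ts
import * as mobilenet from '@tensorflow-models/mobilenet';
import * as knnClassifier from '@tensorflow-models/knn-classifier';

async function trainFruitModel(
  orangePhotos: HTMLImageElement[],
  applePhotos: HTMLImageElement[],
  newPhoto: HTMLImageElement
) {
  // A pre-trained network turns each image into a feature vector.
  const net = await mobilenet.load();
  const classifier = knnClassifier.create();

  // "Training" here is just storing labeled examples for each class.
  for (const img of orangePhotos) {
    classifier.addExample(net.infer(img, true), 'orange');
  }
  for (const img of applePhotos) {
    classifier.addExample(net.infer(img, true), 'apple');
  }

  // The model compares the new picture to the stored examples and
  // returns a confidence per class, e.g. { orange: 0.8, apple: 0.2 }.
  const result = await classifier.predictClass(net.infer(newPhoto, true));
  console.log(result.label, result.confidences);
}
```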
With Teachable Machine, we set out to let people easily train their own ML models—
and we hope to inspire more designers to build interfaces for experimenting with
machine learning. Just as we were inspired by the lessons of many fun, accessible
ML projects (like Rebecca Fiebrink’s Wekinator), we hope these lessons will help
other designers get through the process quickly and easily.
Training ≠ output
Instead of training a model to do something very narrow (for example, recognize
cats), we wanted Teachable Machine to feel as though you’re training a tool, like the
keyboard on your computer or phone. The model you train could be used for any
digital output. To emphasize the principle that input and output can be mixed and
matched, we broke the interface into three different blocks.

The “input” block comes first. This is where digital media enters the system (pictured
above, left). Then there’s the “learning” block (above, middle), where the model
learns and interprets the input. The “output” block (above, right) is where the model’s
interpretation does something like play a GIF or make a sound. Connections run
between each block, just like the wires leading from your keyboard to your computer
to your monitor. You can swap out your monitor for a new one, but your keyboard
and computer will still work, right? The same principle holds true for the outputs in
Teachable Machine—even when you switch them, the input and learning blocks still
work.
This means that anyone can change the output of their model. After spending a few
minutes on training, you can use it to show different GIFs and then switch to the
speech output. The model works just as well.
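In code, that separation boils down to a small contract between the learning block and the output block. Here's a hypothetical sketch; the interface and names are ours, not Teachable Machine's:

```ts
// An output block only needs the per-class confidences; it never
// knows (or cares) how the learning block produced them.
interface OutputBlock {
  render(confidences: { [className: string]: number }): void;
}

const gifOutput: OutputBlock = {
  render: (c) => { /* show the GIF mapped to the most confident class */ },
};

const speechOutput: OutputBlock = {
  render: (c) => { /* speak the phrase mapped to the most confident class */ },
};

// Swapping the "monitor" leaves the "keyboard" and "computer" untouched.
let currentOutput: OutputBlock = gifOutput;
currentOutput = speechOutput; // the input and learning blocks keep working
```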
Train by example, not by rule-making
Another principle core to the capabilities of machine learning: training the model by
example, not by instruction. You teach the model by showing it a lot of examples—in
our case, individual image frames. When it’s running, or “inferring,” your model looks
at each new image frame, compares it to all the examples you’ve taught each class
(or category) and returns a confidence rating—a measure of how confident it is that
the new frame is similar to the classes you’ve trained before.
Our interface displays each class’s examples individually as they’re recorded. This
also helps communicate an important distinction—that the model is not seeing
motion, but just the still images it captures every few milliseconds.
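Continuing the hypothetical TensorFlow.js sketch from above, the inference loop grabs one still frame at a time, so the model literally never sees motion:

```ts
import * as mobilenet from '@tensorflow-models/mobilenet';
import * as knnClassifier from '@tensorflow-models/knn-classifier';

async function inferLoop(
  net: mobilenet.MobileNet,
  classifier: knnClassifier.KNNClassifier,
  webcam: HTMLVideoElement,
  output: OutputBlock
) {
  if (classifier.getNumClasses() > 0) {
    // Embed the current still frame and compare it to every class's
    // stored examples; confidences are per-class similarity scores.
    const result = await classifier.predictClass(net.infer(webcam, true));
    output.render(result.confidences);
  }
  requestAnimationFrame(() => inferLoop(net, classifier, webcam, output));
}
```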

Models are soft!


A third principle is that machine learning models behave “softly.” We’re
used to computers following very strict rules: if this, then that. If you press
the shutter, the picture is taken. If you press delete, the character is
removed. However, machine learning models don’t follow the same strict
logic.
There are many scenarios when a model can be uncertain. To a computer,
maybe a tennis ball and banana look really similar from a particular angle
—just a bunch of yellow-ish pixels. Most products that employ machine
learning today try to hide this uncertainty from users, presenting results
as certain and discrete. This can cause a lot of confusion if those results
are incorrect. By honestly embracing the “softness” of machine learning,
people can start to understand how algorithms make the mistakes that
they do.
Teachable Machine visualizes the confidence of each class, so you can
see for yourself what makes the model confident and what makes it
uncertain. Each class operates on a spectrum, not a binary.
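A sketch of what that visualization could look like in code (the element IDs are hypothetical; this mirrors the spirit of the interface, not its actual implementation):

```ts
// Draw every class as a bar on a spectrum instead of announcing a
// single winner, so the model's uncertainty stays visible.
function renderConfidenceBars(confidences: { [className: string]: number }) {
  for (const [name, confidence] of Object.entries(confidences)) {
    const bar = document.getElementById(`bar-${name}`); // assumed markup
    if (bar) bar.style.width = `${(confidence * 100).toFixed(0)}%`;
  }
}
```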
Debatable interactions
In addition to working towards the principles above, we wrestled with a few key
decisions while prototyping.
Training versus inference
One big question was whether “training” and “inference” should be separate interaction
modes. Traditionally, the training of a machine learning model is done by teaching it
with a large dataset of examples. You might train the model on a ton of pictures of
oranges (as shown before) and switch to inference mode—asking the model to
“infer,” or make a guess, about a new picture. You could show it a lemon, and see
how confident it is that it’s similar to an orange.
Early in the design process, we considered displaying these two modes in the
interface. The model would start in “training” mode and learn from the inputs before
users switched it to “inference” mode to test the model.
While this clearly communicated what training and inference mean when creating a
model, it took longer to get a model up and running. Plus, the words “training” and
“inference”—while useful to know if you work in the field—aren’t crucial for someone
just getting a feel for machine learning. The final version of Teachable Machine is
always in “inference” mode unless it’s being actively trained. You can hold the button
to train for a second, release it to see how the model is working, and then start
training it again. This quick feedback loop, and the ability to get a model up and
running fast, were higher priorities than explicitly delineating the two modes in the
interface.
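Sketched in code, that single-mode interaction is just a flag flipped by the button, reusing the hypothetical names from the sketches above:

```ts
const trainButton = document.getElementById('train-class1')!;
let trainingClass: string | null = null; // null means we're inferring

// Hold the button to record examples; release it to return to inference.
trainButton.addEventListener('mousedown', () => { trainingClass = 'class1'; });
trainButton.addEventListener('mouseup', () => { trainingClass = null; });

async function loop() {
  if (trainingClass !== null) {
    // Training: every webcam frame becomes a new example for the held class.
    classifier.addExample(net.infer(webcam, true), trainingClass);
  } else if (classifier.getNumClasses() > 0) {
    // Inference: the model immediately reflects what it just learned.
    const result = await classifier.predictClass(net.infer(webcam, true));
    output.render(result.confidences);
  }
  requestAnimationFrame(loop);
}
```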
Thresholds versus gradients
We also debated how to demonstrate the model’s effect on the output. Should
an output be triggered by the confidence level reaching a certain threshold (say, play
a GIF when the confidence of a class reaches 90%)? Or does it interpolate between
different states based on the confidence of each class (e.g. shifting between red and
blue so that if the model is 50% confident in both classes, it shows purple)? One
could imagine an interface that interpolates between two different sounds based on
how high your hand is, or one that just switches between the two sounds when your
hand crosses the confidence barrier.
The outputs we found to be familiar and easy to get started with were discrete
pieces of media—GIFs, sound bites, and lines of text. Because switching between
those pieces of media is clearer than blending between them, we ended up switching
the outputs discretely. When the model switches which class has the highest
confidence level, we switch to the output associated with that class.
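The shipped behavior reduces to an argmax over the class confidences; the gradient alternative would interpolate instead. A hedged sketch of both options:

```ts
// Discrete switching (what we shipped): the most confident class owns
// the output, full stop.
function topClass(confidences: { [name: string]: number }): string {
  return Object.entries(confidences)
    .reduce((best, cur) => (cur[1] > best[1] ? cur : best))[0];
}

// Gradient alternative (not shipped): blend two states by confidence,
// e.g. 50% red + 50% blue reads as purple.
function blendRedBlue(confidences: { red: number; blue: number }): string {
  const r = Math.round(255 * confidences.red);
  const b = Math.round(255 * confidences.blue);
  return `rgb(${r}, 0, ${b})`;
}
```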
Neutral class versus no neutral class
For many of the outputs we originally designed, having a “neutral” or “do-nothing”
class made it easier to create games. For example, if you were playing Pong, you
might want one class to make the paddle move up, one class to make it move down,
and a “neutral” state where it does nothing. We considered having a dedicated
“neutral” state in our interface that always acted as a “don’t do anything” catch-all
class.
Having a dedicated neutral state might help you build a mental model for how to
control a specific game, but it’s not quite an honest representation of how the model
works. The model doesn’t care which class is neutral—from its perspective, “neutral”
is a class just like any other, just with a different name and design on the surface. We
didn’t want people thinking that a “neutral” state was intrinsically part of training an
ML model. To that end, we eliminated some outputs that were frustrating or
confusing without a neutral state. Our final outputs, GIFs, sound, and text, work just by
switching the output when you switch classes—no need for a neutral state. Just an
honest one.
Design your own teachable experience
The more accessible machine learning becomes, the more we hope people will be
able to easily train their own custom models on all sorts of data, from all kinds of
sensors. If you’re interested in designing your own teachable experience, you can
start by using our Teachable Machine boilerplate, which provides simple access
to the underlying technology we used to build Teachable Machine. Good luck!
Barron Webster is a New York City-based designer in Google’s Creative Lab.
He works across a range of design disciplines to bring access and
understanding of useful technology to as many people as possible.
For more on how to take a human-centered approach to designing with ML
and AI, visit our collection of practical insights from Google's People + AI
Research (PAIR) team.
