Smart speaker fundamentals: Weighing the many design trade-offs

Wenchau Albert Lo
System Engineer, Personal Electronics
Texas Instruments

Mike Gilbert
End Equipment Lead, Personal Electronics
Texas Instruments
There’s no denying that voice-enabled speakers
– better known as smart speakers – are a hot
consumer product.
According to market research firm eMarketer, 35.6 million U.S. consumers used a
voice-activated device at least once a month in 2017, with that number growing at a
compound annual growth rate of nearly 50%.

Future market predictions are also optimistic. Juniper Research predicts that smart devices like the Amazon Echo, Google Home, Apple HomePod and Sonos One will be installed in a majority of U.S. households by the year 2022. It adds that 70 million households will have at least one of these smart speakers in their home, with the total number of installed devices topping 175 million. This is especially impressive for a product category that didn’t exist before November 2014.

But there’s a lot more behind these simple-looking devices than microphones and speakers combined with an internet interface. Smart speakers incorporate many electronic functions implemented by dozens of sophisticated integrated circuits (ICs). Original equipment manufacturers (OEMs) entering the smart-speaker market with a differentiated product must make decisions about what to include; how to include it; and what trade-offs are acceptable in such a small, low-power device.

What does a smart speaker actually do, and how is it used in a home? In the simplest terms, it captures and digitizes the end user’s voice command, passes results up to a web-based cloud service for interpretation, and then responds to end users by acting on the command or responding with results. Smart speakers can also search and play audio content from the web or from a Bluetooth®-connected device. As Figure 1 illustrates, many smart speakers can now interact with other devices in the home, such as lights, door locks and climate-control systems.

Figure 1. As a media player, smart speakers must be simple and elegant with quality sound. As a smart home hub, they must provide accurate voice recognition and connectivity to the entire suite of smart devices in the household.

It’s not just for this cycle that OEMs want to differentiate their product; instead, there’s a battle to control the access and flow of information within the room, if not within the entire home, as the sole media and home automation hub.
Smart speaker fundamentals: Weighing the many design trade-offs (January 2019)
Figure 2. TI’s smart speaker system block diagram. Subsystems shown: isolated AC/DC power supply, input power protection, energy storage, non-isolated DC/DC power supply, wireless interface, memory, LED indicator, input user interface, digital processing, audio output, voice recognition, wired interface, self-diagnostics/monitoring, and logic and control.

Making the smart speaker real

Smart speakers require a considerable amount of circuitry to work – and to work well. It’s a complex array and interconnection of analog, digital, mixed-signal and power-management subsystems, interfaces and more (Figure 2).

There are numerous design issues to address, including the number and type of microphones, audio output and speakers, power management, the user interface and wireless connectivity.

For OEMs, the first question is whether to use a “black-box” chipset that includes a system-on-a-chip (SoC) for audio decoding and signal processing, a combination Wi-Fi® and Bluetooth radio with a microcontroller (MCU), and sometimes, a custom power-management IC (PMIC). This “canned” solution does not allow for much product differentiation, however. So let’s look at some of the design areas and challenges.

Microphones

When choosing a microphone technology, the trade-offs may not be all that clear. It’s a choice between:
• A microelectromechanical systems (MEMS)-based “analog” microphone with an integrated preamplifier, paired with an external 24-bit audio analog-to-digital converter (ADC) that outputs formatted digital code to the SoC.
• A MEMS-based “digital” microphone with an integrated single-bit, first-order delta-sigma modulator ADC that outputs a pulse-density modulation (PDM) digital bitstream that requires further filtering to create the formatted digital code. Either the SoC or a digital signal processor (DSP) specialized for voice recognition would have to handle this filtering. A stand-alone voice DSP offloads significant processing from the SoC, but also adds cost.
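
To make the "further filtering" step for a digital microphone concrete, the following is a minimal sketch, not a production filter chain. It models a first-order delta-sigma modulator producing a 1-bit stream and recovers multi-bit samples with the simplest possible decimator, a block average. The function names, the 3.072-MHz clock, the 1-kHz tone and the decimation ratio of 64 are all assumptions chosen for illustration; real designs typically use multi-stage decimation filters rather than a plain average.

```python
import math


def pdm_encode(samples):
    """First-order delta-sigma modulator: floats in [-1, 1] -> list of 0/1 bits."""
    bits = []
    integrator = 0.0
    feedback = 0.0  # previous output bit mapped to -1.0 or +1.0
    for x in samples:
        integrator += x - feedback           # accumulate the quantization error
        bit = 1 if integrator >= 0.0 else 0  # 1-bit quantizer
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def pdm_to_pcm(bits, decimation=64):
    """Boxcar decimator: average each block of bits and rescale to [-1, 1]."""
    pcm = []
    for start in range(0, len(bits) - decimation + 1, decimation):
        density = sum(bits[start:start + decimation]) / decimation
        pcm.append(2.0 * density - 1.0)
    return pcm


if __name__ == "__main__":
    PDM_RATE = 3_072_000  # assumed PDM clock in Hz (gives 48 kHz after /64)
    TONE_HZ = 1_000       # assumed test tone
    n = 64 * 48           # one tone period at the assumed rates
    tone = [0.5 * math.sin(2 * math.pi * TONE_HZ * i / PDM_RATE) for i in range(n)]
    pcm = pdm_to_pcm(pdm_encode(tone))
    print(len(pcm), "PCM samples, peak", round(max(pcm), 2))
```

The density of ones in each block tracks the input level, which is why averaging recovers the signal. Whichever device runs this step (SoC or voice DSP) has to do so for every microphone in the array.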

A digital microphone is more expensive than its analog counterpart, but the analog microphone requires an additional ADC in front of the SoC. Compared to an analog microphone with a separate ADC, a digital microphone also has a lower signal-to-noise ratio (SNR) and dynamic range, given both limitations on the transducer size to accommodate the ADC inside the microphone package and the performance limitations of the integrated ADC itself. Common digital microphones have an SNR on the order of 65 dB and dynamic range on the order of 104 dB, and since the ADC is integrated, there is no opportunity to enhance the SNR or dynamic range further with filtering and oversampling.

The analog microphone, on the other hand, combined with an external ADC, can achieve an SNR or dynamic range (the two terms are synonymous for an ADC) of up to 120 dB. This external ADC is often a multi-channel, precision 24-bit audio ADC, employing third- or fourth-order delta-sigma modulators with high oversampling capability. Such ADCs also integrate programmable, complex digital decimation filters; PGAs with configurable automatic gain control; and a mini DSP for additional noise filtering and equalization. Given that a typical crowded room, or one with music playing, could easily have 60 dB of ambient sound, the lower dynamic range of the digital microphone may result in an inability to properly recognize voice commands unless they are significantly louder than the ambient sound (meaning that the end user would have to get closer to the microphones or the smart speaker would need to use more microphones).

Going from a 104-dB dynamic range to 120 dB has some amazing benefits that need to be put into perspective. A 6-dB improvement in dynamic range can double the voice-recognition range. At some point, increasing range is not practical or useful, but you have more dynamic range to work with. Adding another 14 dB of dynamic range will allow you to save cost by reducing the number of microphones required. In addition to being more costly, adding digital microphones may also be prohibitive because of the layout complexity of routing three signal traces (data and clock) for each microphone pair to the SoC, and because of the limited number of PDM inputs available on the SoC itself. Add to that the fact that each trace can pick up and/or radiate noise, making electromagnetic interference a greater concern. Lastly, the clock lines that run to each digital microphone can introduce challenges with routing and with jitter. Today’s analog microphones have differential outputs, enabling common-mode rejection of noise picked up along the signal traces. The ADC also provides a bias supply for each microphone, reducing the complexity of the power tree for the array.

The combination of increasing the microphone range and sensitivity by using analog microphones with precision ADCs not only reduces cost and complexity, it can dramatically reduce frustrating command-recognition errors across a variety of noisy environments. As second-generation smart speakers start rolling out, this error rate will increasingly become an important market differentiator.

There’s no need to reinvent the wheel when it comes to implementing multi-microphone designs and speech recognition. TI’s PCM1864-based circular microphone board (CMB) reference design, shown in Figure 3, uses two 4-channel audio ADCs to interface with an array of up to eight analog microphones, and can extract clear user voice commands within noisy environments.

Speaker amplifiers and power

For the speaker amplifiers, there are trade-offs between output power (typically between 5 W and 25 W), power consumption, heat, size, speaker protection and sound fidelity.
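
As a side note on the microphone figures quoted earlier (104 dB versus 120 dB of dynamic range, and 6 dB doubling the recognition range), the arithmetic can be sketched as follows. This assumes an idealized free-field model in which sound pressure falls off as 1/distance; real rooms reverberate, so the numbers are an upper-bound illustration only. The function names are hypothetical.

```python
import math


def range_multiplier(extra_db):
    """Distance factor bought by `extra_db` of headroom, assuming 1/r pressure decay."""
    return 10.0 ** (extra_db / 20.0)


def db_per_doubling():
    """Level drop per doubling of distance under the same 1/r assumption."""
    return 20.0 * math.log10(2.0)


if __name__ == "__main__":
    print("dB per doubling of distance:", round(db_per_doubling(), 2))
    for extra in (6.0, 120.0 - 104.0):
        print(extra, "dB ->", round(range_multiplier(extra), 2), "x distance")
```

Under this assumption each doubling of distance costs about 6 dB, which is where the "6 dB doubles the range" rule of thumb comes from.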

Figure 3. Circular microphone board reference design. Diagram labels: 24.576-MHz crystal; PCM1864 (master mode) for microphones 1–4 and PCM1864 (slave mode) for microphones 5–8, each with DOUT1 and DOUT2 outputs; LRCK and BCK clock lines.

A simple speaker system with a single mid-range tweeter and woofer can produce good sound, while multiple speakers, combined with the latest audio-processing techniques, can offer a 360-degree audio experience.

You also have a choice between implementing a one-time room calibration to tune and optimally match the speaker’s spectral characteristics, or taking a more complicated adaptive-tuning approach that compensates for movement within the sound area. The TI PurePath™ Console graphical development suite provides easy one-time tuning with impressive results.

On the power consumption and heat side, one approach to reducing ongoing power drain is to combine amplifier pulse-width modulation schemes with adaptive power supplies to reduce the power requirements of the speaker. This technique uses a variable (not fixed) switching frequency for the Class-D output, with the frequency change based on the audio content. In other words, more content results in more switching, and less content results in less switching.

To add efficiency, you can also dynamically adjust the amplifier’s output power-supply voltage based on content. This technique is called envelope tracking. It tracks the audio content and boosts the voltage (output power) only when the music needs a boost in power, especially in bass-heavy parts (with lots of peaks in the signal content).

The stereo evaluation module reference design of the digital input, class-D, IV sense audio amplifier shown in Figure 4 not only accepts digital inputs in multiple formats and delivers high-quality audio, but its Class-D topology includes additional features that minimize power consumption across a range of output levels without diminishing fidelity and performance.

Figure 4. Stereo evaluation module reference design. Diagram labels: VBAT; 3.3-V, 1.8-V and 1.0-V rails; TL760M33, TPS73618 and TPS62085 regulators; audio MCU with USB, I2S and I2C; level shift (SN74AVC4T774); TCA9406; SN74AVC2T244; SN74CBQ3257 MUX; LVC1G12; SN74LVC266 with jumper address control; two TAS2770 digital Class-D amplifiers with SDZ inputs.

Power management

As with most electronic systems, power management plays a significant role in system design. The ultimate goal is to provide power efficiently to dissipate less heat, allowing for a smaller and lower-cost system and, in the case of portable systems, extending battery run time. SoC and Wi-Fi chipsets are sometimes bundled with a dedicated PMIC, but you may still prefer the added board layout and supplier flexibility of a discrete implementation using individual DC/DC converters, low-dropout regulators and voltage supervisors to modify functions such as sequencing, change board layout, and reduce noise and/or cost.

You may also want to optimize the design beyond what a fixed, integrated solution offers, such as operating with lower quiescent current or using a higher switching frequency (such as 1.4 MHz up to 4 MHz) to achieve a smaller footprint, given the need for smaller inductors. Or you may want to use pulse skipping or eco mode to save power under light loads, while at the same time staying out of the audio band by not switching below 20 kHz (which may result in audible noise). Further, you may also desire system input voltage flexibility. These amplifiers require a 12-V to 24-V power supply that comes from an internal power supply or an external power adapter.
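
The link between a higher switching frequency and a smaller inductor can be illustrated with the textbook relation for an ideal buck converter in continuous conduction, L = (Vin − Vout) · D / (f_sw · ΔI), with D = Vout / Vin. The sketch below is a rough sizing aid under that idealization; the 5-V input, 1.8-V output and 0.3-A ripple target are assumed example values, and the function name is hypothetical.

```python
def buck_inductance(v_in, v_out, f_sw, ripple_a):
    """Inductance in henries for a target peak-to-peak ripple in an ideal CCM buck."""
    if not 0.0 < v_out < v_in:
        raise ValueError("a buck converter needs 0 < v_out < v_in")
    if f_sw <= 0.0 or ripple_a <= 0.0:
        raise ValueError("switching frequency and ripple must be positive")
    duty = v_out / v_in
    return (v_in - v_out) * duty / (f_sw * ripple_a)


if __name__ == "__main__":
    V_IN, V_OUT, RIPPLE = 5.0, 1.8, 0.3  # assumed example operating point
    for f_sw in (1.4e6, 4.0e6):          # the two frequencies named in the text
        l_uh = buck_inductance(V_IN, V_OUT, f_sw, RIPPLE) * 1e6
        print(f"{f_sw / 1e6:.1f} MHz -> {l_uh:.2f} uH")
```

For a fixed ripple target the required inductance scales inversely with frequency, so moving from 1.4 MHz to 4 MHz cuts it by a factor of about 2.9. Switching losses rise with frequency, which is the other side of the trade-off.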

An internal AC/DC supply may provide the main power, but an external AC/DC wall adapter with 12- or 5-V outputs is more popular, depending on the speaker power required. This main power can be supplied through a Micro USB connector for low-power speakers or the newer streamlined USB Type-C™ for higher-power speakers, replacing the bulky traditional wall AC/DC adapter and barrel jack. Since these adapters can have different power levels, implementing USB Type-C requires some level of handshake from the speaker to the adapter, or the use of input USB current-limit switches or battery chargers with integrated overcurrent and overvoltage protection.

For portable speakers, a technique called power-path management allows an external AC/DC wall adapter to charge the battery while powering the speaker “live” through an integrated regulator. If you need a higher speaker amplifier power rail (such as 12 V or 18 V), one option is to use a two-cell battery at 8 V, then boost it as needed for the speaker amplifier. The battery charger will need to boost the input to the higher battery voltage (if the adapter output is 5 V), and you’ll need an additional boost converter for the speaker amplifier rail to achieve higher voltages during peak power conditions. In addition, the portable smart-speaker system must have a low standby-power rating and efficient step-down converters to provide a longer run time between recharge cycles when batteries are the sole power source.

Since the speakers are the dominant power consumer, a power supply that is closely integrated with the needs of its amplifiers results in a power- and cost-efficient design. The envelope-tracking power supply reference design for audio power amplifiers shown in Figure 5 is a good example of such a solution: It operates from a 5.4-V to 8.4-V input-voltage rail and delivers 2 × 20 W into an 8-Ω load (using a 7.2-V rail). In addition, it maintains high efficiency across the output-voltage range by changing the output voltage in accordance with the peak-to-peak envelope of the audio signal. Thus, it dynamically adapts the power amplifier’s supply based on audio content, optimizing its power consumption.

Figure 5. Envelope-tracking power supply reference design. Diagram labels: battery (6–8.4 V); TPS61178 boost converter (VIN, VO, 10–18 V); TPA3128D2 power amplifier; OPA4377 operational amplifier; envelope audio signal; audio signal.

User interface

You must decide what type of user interface to offer based on the desired end-user experience, since the human machine interface is a major factor in a smart speaker’s market differentiation. The interface can range from lower-cost simple buttons and single-indicator LEDs, to an array of rotating LEDs, to a small LCD display, to an LCD display with touch inputs and haptic feedback.

LEDs are used at minimum to indicate status, and more recently, to enhance the end-user experience by generating moving colors in various patterns.
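
As an illustration of what "moving colors in various patterns" means computationally, here is a minimal sketch of a rotating rainbow for a ring of RGB LEDs. It only computes the per-LED color values for each animation step; it does not talk to any hardware. The ring size of eight LEDs (24 color channels), the 8-bit channel depth and the function name are assumptions for the example.

```python
import colorsys


def ring_frame(step, num_leds=8, brightness=1.0):
    """Return one animation frame as a list of (r, g, b) tuples, 0-255 per channel.

    Hues are spread evenly around the ring and shifted by `step` positions.
    """
    frame = []
    for led in range(num_leds):
        hue = ((led + step) % num_leds) / num_leds
        r, g, b = colorsys.hsv_to_rgb(hue, 1.0, brightness)
        frame.append((round(r * 255), round(g * 255), round(b * 255)))
    return frame


if __name__ == "__main__":
    for step in range(3):  # first three frames of the rotation
        print(step, ring_frame(step))
```

Something has to produce a new frame like this many times per second, which is the workload that either the host processor or a dedicated LED driver ends up carrying.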

Although simpler systems may use single-color LEDs, most use red-green-blue (RGB) LEDs. If you choose a multicolor option, you will need to decide how many RGB LEDs to include, and whether the system processor, MCU or a newer multi-LED driver with integrated LED engines will control them; each choice brings cost, power and system-burden trade-offs. Using an integrated LED pattern engine offloads the processor as it manages pattern generation and drives an array of RGB LEDs even when the processor or MCU goes into low-power idle modes.

As shown in Figure 6, the various LED ring lighting patterns reference design illustrates how to design a multicolor RGB LED ring pattern subsystem using new multichannel RGB LED drivers with an integrated LED engine. An ambient light sensor IC automatically controls the LED brightness.

Figure 6. Various LED ring lighting patterns reference design. Diagram labels: MCU; two LP5024 24-channel RGB LED drivers with pattern engine (VCC, EN, SDA, SCL, ADDR0, ADDR1, VCAP, IREF, GND; outputs CH0 through CH23 from VLED); OPT3001 ambient light sensor (VCC, SDA, SCL, ADDR, GND).

Corresponding panel push-buttons may be inexpensive, but they are prone to mechanical failure and limited to a single function. They require that end users “push and hold” to effect action (up, down, scroll), an operation that is now dated and counterintuitive in the world of smartphones. In contrast, a capacitive touch-sensitive surface allows for more interaction and enhances the user interface. No physical force is required, and the same surface can detect end-user proximity and enable a backlight for easier use in the dark. A touch-sensitive surface can implement a more familiar interface by supporting “swipe” or “spin” instead of simple push, and offering it can help differentiate a smart speaker. A properly designed capacitive-touch controller works on a variety of surfaces (plastic, glass or metal) and can be designed flush with the speaker case surface.

The gesture-based capacitive touch interface reference design for speakers shown in Figure 7 provides an easy-to-use evaluation system for a multi-gesture capacitive-touch interface for smart speakers using TI’s capacitive-touch MCU. The design allows for tap, swipe, slide and rotate gestures.

Wireless connectivity

Finally, there is the issue of literally getting outside the box. Without a connection to the internet, a smart speaker cannot function as intended. There are design decisions here regarding the best way to connect, given speed requirements and power constraints.

The most common form of smart speaker connects to the internet directly via Wi-Fi. Here, the bandwidth of IEEE 802.11n is more than sufficient, and it also allows for a multi-room wireless speaker mesh connection. However, Wi-Fi power amplifiers consume significant power and may limit the play time of battery-operated smart speakers. For this reason, Wi-Fi-enabled speakers are often plugged directly into wall power outlets or have AC adapters for continuous operation.
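
To see how radio power draw feeds into play time on battery, a back-of-the-envelope estimate is battery energy divided by total average load. The sketch below uses placeholder values throughout (pack size, audio, processing and radio power are illustrative assumptions, not measurements of any product), and the function name is hypothetical.

```python
def play_time_hours(battery_wh, loads_w):
    """Estimated run time: battery energy (Wh) over the sum of average loads (W)."""
    total_w = sum(loads_w.values())
    if total_w <= 0.0:
        raise ValueError("total load must be positive")
    return battery_wh / total_w


if __name__ == "__main__":
    battery_wh = 7.2 * 2.6  # assumed two-cell pack: 7.2 V nominal, 2.6 Ah
    base = {"audio": 2.0, "processing": 0.5}                 # placeholder watts
    for name, radio_w in (("radio A", 0.8), ("radio B", 0.1)):  # placeholder radio draws
        hours = play_time_hours(battery_wh, {**base, "radio": radio_w})
        print(f"{name}: about {hours:.1f} h")
```

The estimate ignores converter losses and battery derating, so it is useful for comparing options against each other rather than for predicting absolute run time.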

Figure 7. Gesture-based capacitive-touch interface reference design. Diagram labels: CAPT-BSWP board; CAPT-FR2633 CapTIvate™ MCU board using the MSP430FR2633 (TIDM-02004); I2C; MSP430F5529 LaunchPad™ kit; USB HID keyboard interface; gestures converted to keyboard commands; USB; favorite music player.

End users who want to use multiple smart-speaker units (for better room coverage or stereo sound quality) will need IEEE 802.11n/s support to implement a mesh network. In a mesh network, any one speaker can become the master (connected to the cloud) while the others act as slaves. If a speaker acting as master is powered off or loses the network, the mesh automatically assigns another speaker as master. The biggest challenge in a multi-speaker mesh network is synchronization. Wi-Fi controllers in a mesh network must have robust synchronization schemes to avoid significant user frustration.

Battery-powered portable speakers may offload Wi-Fi cloud connectivity to nearby mobile devices. For connectivity to mobile devices for indirect cloud connection and/or to listen to content stored on a mobile device, Bluetooth Classic (or Bluetooth Basic Rate) is required for continuous connection to stream the audio content, due to the bandwidth limitations and power schemes of Bluetooth low energy. When used in conjunction with Bluetooth Classic, Bluetooth low energy can control communication between devices.

Home automation is another function that currently exists in many homes as a separate entity – a stand-alone hub connected to the internet via Wi-Fi, as well as to specialized lights and thermostats via a wireless mesh network set up for home automation implemented by such standards as Zigbee®, Thread, Z-wave, etc. Smart speakers can legitimately lay claim to providing home automation via the internet as long as this additional stand-alone hub is also implemented.

However, to eliminate the need for end users to purchase this additional wireless hub, smart speakers can become the home automation hub by simply adding a multiband wireless MCU with an integrated RF power amplifier. The wireless MCU handles processing of the protocol stack and controls the radio, preventing the need to burden the existing SoC or Wi-Fi network processor while enabling communication through any of the popular long-range home-automation protocols, both in the 2.4-GHz and Sub-1 GHz bands. Because Wi-Fi and Bluetooth also use the 2.4-GHz band, you’ll need to ensure co-existence through a combination of hardware and software built into the integrated wireless MCU.

Looking to the future

The smart speakers of the future will be more than stand-alone, audio-only units. As flat-panel TVs become increasingly thinner, their speakers need to be smaller, which negatively affects the TV’s sound. As a result, soundbars (which enhance the sound of flat-panel TVs) are increasing in popularity. Adding voice recognition is the obvious next step of soundbar evolution.
to stream the audio content, due to the bandwidth

To complete the picture, smart soundbars will incorporate a set-top box for wireless video streaming with only a single HDMI cable connected to the TV, which then acts as an extremely large display monitor. As flat-panel TVs become even thinner, the TV control circuitry and power supply may also be implemented into smart soundbars. Then, the smart speaker and smart soundbar will compete to be the hub for the overall home entertainment system. With the added connectivity for home automation, these devices will also compete to become the automation hub for smart homes.

Another added feature is a smart speaker display. Adding a display to the smart speaker is a natural extension of its functionality. Just as center console displays are proliferating in automobiles, consumers will demand the additional visual experience from a home informational/entertainment device as well. Additionally, the way in which content will be requested and displayed differs from that of a more personal, handheld smartphone or tablet experience. Since voice commands are the primary mode of requesting content and control, simplified search and control applications will be necessary to facilitate quick and accurate results. Further, displayed images can be simplified, with minimal need for touch interaction while also providing a large enough image suitable for distance viewing. This will allow consumers a more pleasurable experience when interacting with the smart speaker and will deliver crisp visual content.

With this added display functionality, smart speakers may then concede to the smart soundbar in the living room and focus on areas outside of the living room. Smart speakers may provide smaller personal displays, ranging from integrated LCD screens to larger ultra-short-throw high-definition projection by using TI DLP® technologies to create large displays on any surface. Smart devices located near high-traffic areas like the kitchen or family room will need to be aesthetically pleasing and nonintrusive. The addition of a tablet-size or larger flat-panel display may not always meet these criteria. Projection display technology gives end users a more interactive experience when asking the smart speaker for information (weather, recipe, traffic) and puts a face to the anonymous voice. In this way, the smart speaker’s role and importance in the home continues to morph and grow, bringing with it opportunities for designers to start new trends and differentiate their designs.

For more information

• Explore TI solutions and design resources for smart speakers.

Important Notice: The products and services of Texas Instruments Incorporated and its subsidiaries described herein are sold subject to TI’s standard terms and
conditions of sale. Customers are advised to obtain the most current and complete information about TI products and services before placing orders. TI assumes
no liability for applications assistance, customer’s applications or product designs, software performance, or infringement of patents. The publication of information
regarding any other company’s products or services does not constitute TI’s approval, warranty or endorsement thereof.
The platform bar, PurePath, LaunchPad, and CapTIvate are trademarks of Texas Instruments and DLP is a registered trademark of Texas Instruments. All other trademarks
are the property of their respective owners.

© 2019 Texas Instruments Incorporated SLAY053


IMPORTANT NOTICE AND DISCLAIMER

TI PROVIDES TECHNICAL AND RELIABILITY DATA (INCLUDING DATASHEETS), DESIGN RESOURCES (INCLUDING REFERENCE
DESIGNS), APPLICATION OR OTHER DESIGN ADVICE, WEB TOOLS, SAFETY INFORMATION, AND OTHER RESOURCES “AS IS”
AND WITH ALL FAULTS, AND DISCLAIMS ALL WARRANTIES, EXPRESS AND IMPLIED, INCLUDING WITHOUT LIMITATION ANY
IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT OF THIRD
PARTY INTELLECTUAL PROPERTY RIGHTS.
These resources are intended for skilled developers designing with TI products. You are solely responsible for (1) selecting the appropriate
TI products for your application, (2) designing, validating and testing your application, and (3) ensuring your application meets applicable
standards, and any other safety, security, or other requirements. These resources are subject to change without notice. TI grants you
permission to use these resources only for development of an application that uses the TI products described in the resource. Other
reproduction and display of these resources is prohibited. No license is granted to any other TI intellectual property right or to any third
party intellectual property right. TI disclaims responsibility for, and you will fully indemnify TI and its representatives against, any claims,
damages, costs, losses, and liabilities arising out of your use of these resources.
TI’s products are provided subject to TI’s Terms of Sale (www.ti.com/legal/termsofsale.html) or other applicable terms available either on
ti.com or provided in conjunction with such TI products. TI’s provision of these resources does not expand or otherwise alter TI’s applicable
warranties or warranty disclaimers for TI products.

Mailing Address: Texas Instruments, Post Office Box 655303, Dallas, Texas 75265
Copyright © 2018, Texas Instruments Incorporated
