Chips For Artificial Intelligence
A look under the hood of any major search, commerce, or social-networking site today will reveal a profusion of “deep-learning” algorithms. Over the past decade, these powerful artificial intelligence (AI) tools have been increasingly and successfully applied to image analysis, speech recognition, translation, and many other tasks. Indeed, the computational and power requirements of these algorithms now constitute a major and still-growing fraction of datacenter demand.
Designers often offload many of the highly parallel calculations to commercial hardware, especially graphics-processing units (GPUs) originally developed for rapid image rendering. These chips are especially well-suited to the computationally intensive “training” phase, which tunes system parameters using many validated examples. The “inference” phase, in which deep learning is deployed to process novel inputs, requires greater memory access and fast response, but has also historically been implemented with GPUs.

[Figure: Google’s tensor processing unit is designed for high throughput of low-precision arithmetic. Image courtesy of NextPlatform.com.]
In response to the rapidly growing demand, however, companies are racing to develop hardware that more directly empowers deep learning, most urgently for inference but also for training. Most efforts focus on “accelerators” that, like GPUs, rapidly perform their specialized tasks under the loose direction of a general-purpose processor, although complete dedicated systems are also being explored. Most of the companies contacted for this article did not respond or declined to discuss their plans in this rapidly evolving and competitive field.

Deep Neural Networks

Neural networks, in use since the 1980s, were inspired by a simplified model of the human brain. Deep learning techniques take neural networks to much higher levels of complexity, their growing success enabled by enormous increases in computing power plus the availability of large databases of validated examples needed to train the systems in a particular domain.
The “neurons” in neural networks are simple computer processes that can be explicitly implemented in hardware, but are usually simulated digitally. Each neuron combines tens or hundreds of inputs, either from the outside world or the activity of other neurons, assigning higher weights to some than to others. The output activity of the neuron is computed based on a nonlinear function of how this weighted combination compares to a chosen threshold.
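As a concrete illustration, such a neuron can be simulated in a few lines of Python. This is a minimal sketch, not drawn from the article or from any particular system; the function name and example values are hypothetical, and the sigmoid stands in for whatever nonlinearity a real network uses.

    import math

    def neuron(inputs, weights, threshold):
        # Weighted combination of the inputs: some inputs
        # count more than others, depending on their weights.
        combined = sum(w * x for w, x in zip(weights, inputs))
        # Nonlinear output based on how the combination compares
        # to the chosen threshold (here, a sigmoid).
        return 1.0 / (1.0 + math.exp(-(combined - threshold)))

    # Hypothetical example: three inputs, the second weighted most heavily.
    activity = neuron([0.2, 0.9, 0.4], [0.1, 2.0, 0.5], threshold=1.0)

In modern practice the smooth sigmoid is often replaced by a rectified-linear unit, but the structure, a weighted sum passed through a nonlinearity, is the same.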
“Deep” neural networks arrange the neurons into layers (as many as tens of layers) that “infer” successively more abstract representations of the input data, ultimately leading to its result; for example, a translated text, or the recognition of whether an image contains a pedestrian.
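Continuing the sketch above (again hypothetical, and assuming fully connected layers for simplicity), stacking such neurons gives the layered structure:

    def layer(inputs, weight_rows, thresholds):
        # One layer: every neuron sees the previous layer's
        # activities as its inputs. Reuses neuron() from above.
        return [neuron(inputs, w, t) for w, t in zip(weight_rows, thresholds)]

    def network(inputs, layers):
        # Feed the input forward one layer at a time; each stage
        # yields a more abstract representation of the input.
        activities = inputs
        for weight_rows, thresholds in layers:
            activities = layer(activities, weight_rows, thresholds)
        return activities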
The number of layers, the specific interconnections within and between layers, the precise values of the weights, and the threshold behavior combine to give the response of the entire network to an input. As many as tens of millions of weights are required to specify the extensive interconnections between neurons. These parameters are determined during an exhaustive “training” process in which a model network is given huge numbers of examples with a known “correct” output.
When the networks are ultimately used for inference, the weights are generally kept fixed as the system is exposed to new inputs. Each of the many neurons in a layer performs an