
news

COMMUNICATIONS OF THE ACM | APRIL 2018 | VOL. 61 | NO. 4

Technology | DOI:10.1145/3185523  Don Monroe

Chips for Artificial Intelligence

Companies are racing to develop hardware that more directly empowers deep learning.

A LOOK UNDER the hood of any major search, commerce, or social-networking site today will reveal a profusion of "deep-learning" algorithms. Over the past decade, these powerful artificial intelligence (AI) tools have been increasingly and successfully applied to image analysis, speech recognition, translation, and many other tasks. Indeed, the computational and power requirements of these algorithms now constitute a major and still-growing fraction of datacenter demand.

Designers often offload many of these highly parallel calculations to commercial hardware, especially graphics-processing units (GPUs) originally developed for rapid image rendering. These chips are especially well suited to the computationally intensive "training" phase, which tunes system parameters using many validated examples. The "inference" phase, in which deep learning is deployed to process novel inputs, requires greater memory access and fast response, but has also historically been implemented with GPUs.

In response to the rapidly growing demand, however, companies are racing to develop hardware that more directly empowers deep learning, most urgently for inference but also for training. Most efforts focus on "accelerators" that, like GPUs, rapidly perform their specialized tasks under the loose direction of a general-purpose processor, although complete dedicated systems are also being explored. Most of the companies contacted for this article did not respond or declined to discuss their plans in this rapidly evolving and competitive field.

[Image caption: Google's tensor processing unit is designed for high throughput of low-precision arithmetic. Image courtesy of NextPlatform.com.]

Deep Neural Networks

Neural networks, in use since the 1980s, were inspired by a simplified model of the human brain. Deep learning techniques take neural networks to much higher levels of complexity, their growing success enabled by enormous increases in computing power plus the availability of large databases of validated examples needed to train the systems in a particular domain.

The "neurons" in neural networks are simple computer processes that can be explicitly implemented in hardware, but are usually simulated digitally. Each neuron combines tens or hundreds of inputs, either from the outside world or the activity of other neurons, assigning higher weights to some than to others. The output activity of the neuron is computed based on a nonlinear function of how this weighted combination compares to a chosen threshold.
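The weighted-sum-and-nonlinearity behavior just described can be sketched in a few lines. This is a hedged illustration, not any particular chip's arithmetic; the sigmoid is one common choice of nonlinear function, with the bias playing the role of the threshold:

```python
import math

def neuron(inputs, weights, bias):
    # weighted combination of inputs, with the bias acting as the threshold
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # nonlinear "activation": the sigmoid squashes the sum into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))
```

When the weighted inputs exactly cancel, the sum is zero and the sigmoid sits at its midpoint, 0.5; strongly positive sums drive the output toward 1.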
"Deep" neural networks arrange the neurons into layers (as many as tens of layers) that "infer" successively more abstract representations of the input data, ultimately leading to its result; for example, a translated text, or the recognition of whether an image contains a pedestrian.

The number of layers, the specific interconnections within and between layers, the precise values of the weights, and the threshold behavior combine to give the response of the entire network to an input. As many as tens of millions of weights are required to specify the extensive interconnections between neurons. These parameters are determined during an exhaustive "training" process in which a model network is given huge numbers of examples with a known "correct" output.

When the networks are ultimately used for inference, the weights are generally kept fixed as the system is exposed to new inputs. Each of the many neurons in a layer performs an independent calculation (multiplying each of its inputs by an associated weight, adding the products, and doing a nonlinear computation to determine the output). Much of this computation can be framed as a matrix multiplication, which allows many steps to be done in parallel, said Christopher Fletcher, a computer scientist at the University of Illinois at Urbana-Champaign, and "looks like problems that we've been solving on GPUs and in high-performance computing for a very long time."
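Fletcher's point can be seen directly: stacking every neuron's weight vector into a matrix turns a whole layer into one matrix product followed by an elementwise nonlinearity. A minimal pure-Python sketch (the names and the ReLU choice are illustrative, not from the article):

```python
def layer_forward(X, W, b):
    # X: batch of input vectors; W: one weight vector per neuron; b: biases.
    # Each output element is one dot product, so the whole loop nest is
    # exactly the matrix multiplication that accelerators parallelize.
    out = []
    for x in X:
        z = [sum(xi * wij for xi, wij in zip(x, wrow)) + bj
             for wrow, bj in zip(W, b)]
        out.append([max(0.0, v) for v in z])   # ReLU nonlinearity
    return out
```

In production systems the same computation is dispatched as one batched matrix multiply, which is what makes GPUs and dedicated matrix units such a natural fit.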
Customizing Hardware

During inference, unlike in offline training, rapid response is critical, whether in self-driving cars or in web applications. "Latency is the most important thing for cloud providers," Fletcher noted. In contrast, he said, traditional "GPUs are designed from the ground up for people who don't care about latency, but have so much work that as long as they get full throughput everything will turn out OK."

Recognizing the importance of response time and anticipating increasing power demands by neural-network applications, cloud behemoth Google developed its own application-specific integrated circuit (ASIC) called a "tensor processing unit," or TPU, for inference. Google reported in 2017 that, in its datacenters, the TPU ran common neural networks 15 to 30 times faster than a contemporary CPU or GPU, and used 30 to 80 times less power for the same computational performance (operations per second). To guarantee low latency, the designers streamlined the hardware and omitted common features that keep modern processors busy but also demand more power. The critical matrix-multiplication unit uses a "systolic" design in which data flows between operations without being returned to memory.
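The systolic idea can be illustrated with a small simulation. This sketch models only the scheduling, not the TPU's actual microarchitecture: in a weight-stationary array, processing-element (PE) row k sees input row m at cycle m + k, and partial sums accumulate in place instead of round-tripping to memory.

```python
def systolic_matmul(A, B):
    # Weight-stationary sketch: PE row k permanently holds row k of B.
    # Input row m enters the array skewed, reaching PE row k at cycle
    # t = m + k; each PE multiplies and adds into the partial sum flowing
    # past it, so intermediate results never return to memory.
    M, K, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(M)]
    for t in range(M + K - 1):        # one iteration per clock cycle
        for k in range(K):            # in hardware, all PE rows fire at once
            m = t - k                 # the input row at PE row k this cycle
            if 0 <= m < M:
                for n in range(N):
                    C[m][n] += A[m][k] * B[k][n]
    return C
```

The diagonal "wavefront" (the m = t - k schedule) is what lets every PE stay busy once the pipeline fills, with only the array's edges touching memory.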
So far, Google seems to be unusual among Web giants in designing its own chip, rather than adapting commercially available alternatives. Microsoft, for example, has been using field-programmable gate arrays (FPGAs), which can be rewired after deployment to perform specific circuit functions. Facebook is collaborating with Intel to evaluate its ASIC, called the Neural Network Processor. That chip, aimed at artificial-intelligence applications, started life in Nervana, a startup that Intel acquired in 2016. Unsurprisingly, Nvidia, already the dominant vendor of GPUs, has released updated designs that it says will better support neural network applications, in both inference and training.

These chips follow a strategy that is familiar from other specialized applications, like gaming. Farming out the heavy calculations to a specialized accelerator chip sharing a bus with a general processor and memory allows rapid implementation of new ideas, and lets chip designers focus on dedicated circuits, assuming all needed data will be at hand. However, the memory burdens posed by this "simplest" approach are likely to lead to systems with tighter integration, Fletcher said, such as bringing accelerator functions on-chip with the processor. "I think we will inevitably see the world move in that direction."

Neuromorphic Hardware

One technique exploited by the new chips is using low-precision, often fixed-point data, eight bits or even fewer, especially for inference. "Precision is the wild, wild west of deep learning research right now," said Illinois's Fletcher. "One of the major open questions in all of this as far as hardware accelerators are concerned is how far can you actually push this down without losing classification accuracy?" Results from Google, Intel, and others show that such low-precision computations can be very powerful when the data is prepared correctly, which also opens opportunities for novel electronics.
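A toy version of such low-precision arithmetic is symmetric linear quantization: scale floating-point values into a small signed integer range, compute there, and scale back. The scheme below is a generic textbook sketch, not the calibration procedure of any chip named here:

```python
def quantize(xs, bits=8):
    # symmetric linear quantization to a signed fixed-point range,
    # e.g. [-127, 127] for 8 bits
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(x) for x in xs) / qmax or 1.0   # guard against all-zero input
    return [max(-qmax, min(qmax, round(x / scale))) for x in xs], scale

def dequantize(qs, scale):
    return [q * scale for q in qs]
```

Rounding costs at most half a quantization step per value; the open question Fletcher describes is how far `bits` can shrink before that accumulated error hurts classification accuracy.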


Indeed, neural networks were inspired by biological brains, and researchers in the 1980s implemented them with specialized hardware that mimicked features of brain architecture. Even within the last decade, large government-funded programs in both the U.S. and Europe pursued "neuromorphic" chips that operate on biology-inspired principles to improve performance and increase energy efficiency. Some of these projects, for example, directly hard-wire many inputs to a single electronic neuron, while others communicate using short, asynchronous voltage spikes like biological neurons.
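The spiking behavior mentioned above is often abstracted as a leaky integrate-and-fire neuron. The following is a standard textbook model, not the circuit of any chip discussed here; the time constant and threshold are illustrative values:

```python
def lif_neuron(currents, tau=10.0, threshold=1.0):
    # Leaky integrate-and-fire: the membrane potential leaks toward zero,
    # integrates input current, and emits a short, asynchronous spike
    # (then resets) whenever it crosses the threshold.
    v, spikes = 0.0, []
    for i in currents:
        v += -v / tau + i            # leak plus integration, one time step
        if v >= threshold:
            spikes.append(1)
            v = 0.0                  # reset after spiking
        else:
            spikes.append(0)
    return spikes
```

Because such a neuron is silent between spikes, computation and communication happen only when there is something to signal, which is the source of the energy-efficiency claims for spiking hardware.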
Despite this history, however, the new AI chips all use traditional digital circuitry. Qualcomm, for example, which sells many chips for cellphones, explored spiking networks under the U.S. Defense Advanced Research Projects Agency (DARPA) program SyNAPSE, along with startup Brain Corporation (in which Qualcomm has a financial stake). But Jeff Gehlhaar, Qualcomm's vice president for technology, said by email that those networks "had some limitations, which prevented us from bringing them to commercial status." For now, Qualcomm's Artificial Intelligence Platform aims to help designers exploit digital circuits for these applications. Still, Gehlhaar noted the results are being studied by others, as "this field is getting a second look."

Indeed, although its NNP chip does not use the technology, Intel also announced a test chip called Loihi that uses spiking circuitry. IBM exploited its SyNAPSE work to develop powerful neuromorphic chip technology it called TrueNorth, and demonstrated its power in image recognition and other tasks.

Gill Pratt, a leader for SyNAPSE at DARPA and now at Toyota, said even though truly neuromorphic circuitry has not been adopted commercially yet, some of the lessons from that project are being leveraged in current designs. "Traditional digital does not mean lack of neuromorphic ideas," he stressed. In particular, "sparse computation" achieves dramatically higher energy efficiency by leaving large sections of the chip underused.

"Any system that is very power efficient will tend to be very sparse," Pratt said, the best example being the phenomenal computational power that our brains achieve with less than 20 watts of power. Although power is critical to datacenters, and especially for handheld devices, Pratt noted that even cars can face serious power challenges. Prototype advanced safety and self-driving features require thousands of watts, but would need much more to approach human capabilities, and Pratt thinks hardware will eventually need to exploit more neuromorphic principles. "I am extremely optimistic that is going to happen," he said. "It hasn't happened yet, because there have been a lot of performance improvements, both in terms of efficiency and raw compute horsepower, to be mined with traditional methods, but we are going to run out."

Further Reading

Jouppi, N.P., et al.
In-Datacenter Performance Analysis of a Tensor Processing Unit
44th International Symposium on Computer Architecture (ISCA), Toronto, Canada, June 26, 2017
https://arxiv.org/ftp/arxiv/papers/1704/1704.04760.pdf

Monroe, D.
Neuromorphic Computing Gets Ready for the (Really) Big Time
Communications of the ACM, June 2014, pp. 13-15
https://cacm.acm.org/magazines/2014/6/175183-neuromorphic-computing-gets-ready-for-the-really-big-time/fulltext

U.S. Defense Advanced Research Projects Agency
DARPA SyNAPSE Program
http://www.artificialbrains.com/darpa-synapse-program

Don Monroe is a science and technology writer based in Boston, MA, USA.

ACM Member News

USING SPINTRONICS AFTER MOORE'S LAW

"The work I do is at the interface of computer science and electrical engineering," says Sachin Sapatnekar, a professor in the Department of Electrical and Computer Engineering at the University of Minnesota, where he holds the Robert and Marjorie Henle Chair and the Distinguished McKnight University Professorship. His research interests lie in developing efficient techniques for the computer-aided design of integrated circuits.

Sapatnekar received his undergraduate degree in electrical engineering from the Indian Institute of Technology, Bombay; his master's degree in computer engineering from Syracuse University; and his Ph.D. in electrical engineering from the University of Illinois at Urbana-Champaign.

In recent years, as Moore's Law has matured, there has been concern about circuits growing old and degrading over time. Sapatnekar spent time looking at the reliability of integrated systems, and developed algorithms that allow the design of chips that operate reliably even as they degrade with age.

His current interest is exploring what happens generally after Moore's Law ends, and specifically what will happen to complementary metal-oxide semiconductor (CMOS) technology. "There are some interesting directions there, with new architectures coming up," he asserts. Sapatnekar says he has been using spintronics technology (an emerging field that utilizes electron spin to improve efficiencies and create new functionalities in electronic devices) to look at building logic and computer memory structures. "It is really exciting, because as Moore's Law ends, it is creating a rejuvenation in the way people think about design, and that brings a lot of new technical problems."

—John Delaney
Although power is critical to data- © 2018 ACM 0001-0782/18/4 $15.00 —John Delaney
