Computing beyond Moore’s Law
John M. Shalf, Lawrence Berkeley National Laboratory
Robert Leland, Sandia National Laboratories
In 1965, Gordon Moore famously observed that the number of components on an integrated circuit (IC) had doubled every year on average since the introduction of this technology in 1959 [1]. He predicted that this trend, driven by economic considerations of cost and yield, would continue for at least a decade, although later the integration pace was moderated to doubling approximately every 18 months. He also noted that “shrinking the dimensions on an integrated structure makes it possible to operate the structure at higher speed for the same power per unit area,” an observation that Robert Dennard of IBM formalized nearly a decade later as Dennard scaling: the ability to reduce device operating voltages and scale clock frequencies exponentially each generation [2].

This mutually reinforcing scaling of feature size, frequency, and power meant that chip functionality would improve exponentially with time at a roughly constant cost per generation, and Moore predicted that this improvement, in turn, would lead to a cornucopia of societal benefits that would flow from semiconductor microelectronics technology. The serendipitous scaling effects Moore predicted did indeed persist, lasting 40 years longer than he predicted. However, Dennard scaling came to an end in 2004, which led to a power-efficiency crisis for CMOS logic and which poses an even more fundamental challenge for traditional technology scaling in the mid-2020s.

Within that decade, the magical growth process Moore described will come to an end as 2D lithography capability approaches the atomic realm. The end of conventional scaling will impact all computing technologies that depend on improvements in cost, energy efficiency, and storage capacity, from large-scale systems to the smallest consumer electronic devices.
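
The scaling argument above can be illustrated numerically. The following is a minimal sketch, not taken from the article: it assumes the standard CMOS switching-power model P = C·V²·f and an idealized linear shrink factor k per generation. The `Device` class, the `scale` helper, and all numeric values are hypothetical and use arbitrary units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    capacitance: float  # switched capacitance (arbitrary units)
    voltage: float      # supply voltage (arbitrary units)
    frequency: float    # clock frequency (arbitrary units)
    area: float         # device footprint (arbitrary units)

    def dynamic_power(self) -> float:
        # Assumed switching-power model: P = C * V^2 * f
        return self.capacitance * self.voltage ** 2 * self.frequency

    def power_density(self) -> float:
        return self.dynamic_power() / self.area


def scale(dev: Device, k: float, voltage_scales: bool = True) -> Device:
    """Shrink linear dimensions by a factor k (> 1) for one generation.

    With voltage_scales=True this follows idealized Dennard scaling;
    with voltage_scales=False the supply voltage stays fixed.
    """
    return Device(
        capacitance=dev.capacitance / k,
        voltage=dev.voltage / k if voltage_scales else dev.voltage,
        frequency=dev.frequency * k,
        area=dev.area / k ** 2,
    )


base = Device(capacitance=1.0, voltage=1.0, frequency=1.0, area=1.0)
k = 2.0  # illustrative shrink factor, chosen for easy arithmetic

dennard = scale(base, k)                        # voltage shrinks with size
fixed_v = scale(base, k, voltage_scales=False)  # voltage no longer shrinks

print("baseline power density:     ", base.power_density())
print("Dennard-scaled power density:", dennard.power_density())
print("fixed-voltage power density: ", fixed_v.power_density())
```

Under these assumptions, the Dennard-scaled device runs k times faster at the same power per unit area, whereas holding the voltage fixed raises power density by a factor of k². That difference is one way to picture why the end of voltage scaling produced a power-efficiency crisis.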
Authorized licensed use limited to: Chengdu University of Technology. Downloaded on July 17,2024 at 16:21:08 UTC from IEEE Xplore. Restrictions apply.
[Figure labels: Spintronics; Neuromorphic; Quantum; Carbon nanotubes and graphene; Analog; New models of computation; Adiabatic]
DECEMBER 2015 15
REBOOTING COMPUTING
that implement computation through direct physical principles;
›› neuro-inspired computing (NC), which includes devices based on the principles of brain operation and general neuronal computation; and
›› quantum computing (QC), which could in theory be used to solve some problems with combinatorial complexity through the selection of a desired state from a superposition of all possible answers to a problem.

The authors also underline the importance of distinguishing between new paradigms for computation and new technological implementations of existing paradigms, making several key observations. One is that AC can be simpler than some digital approximation, but does not lend itself to general-purpose computing because the device is specialized for a particular computation. Computational precision is also difficult to maintain and can be sensitive to the device's environment.

Another observation is that digital computers are good at deterministic/algorithmic calculation, but poor at simple reasoning and recognition. NC devices have proven inherently resilient and very good at problems that CDC handles poorly. Many unexplored opportunities exist for such computational models, but much is still not understood about how the brain actually computes.

Finally, the authors note that quantum information processing theoretically could enable the efficient solution of some combinatorial and NP-hard problems (problems not known to be solvable in polynomial time using digital computation) or could be used to simulate the electronic state of complex molecules, as Richard Feynman proposed. However, QC is not a suitable replacement for CDC in domains where CDC excels.

These technology options create the possibility of approaches that go well beyond what CMOS and digital electronics technologies have traditionally performed effectively. However, we do not believe that they are suitable as replacements for digital electronics in tasks that digital computing already performs well. For that reason, we choose to focus on new technological implementations of the CDC model because we view it as the most immediately relevant to a broad set of societal concerns associated with the end of Moore’s law.

EVALUATING CDC CANDIDATES

In the past, a competitor to CMOS or CDC would need to keep pace with a relentless improvement schedule in which CMOS technology doubled its performance every 18 months or so and leveraged tremendous economies of scale. This combination proved unbeatable except in relatively narrow niches. With the end of CMOS technology scaling, these competitive conditions have changed. A come-from-behind competitor to CMOS is not yet apparent, but metrics are in place to assess the fitness of potential CMOS replacements. Shekhar Borkar of Intel has developed three metrics (gain, signal-to-noise immunity, and scalability) [7], to which we have added a fourth, scalable manufacturability:

›› gain: the energy required to switch the device state from on to off must be less than the energy the device controls;
›› signal-to-noise immunity: the signal must be far enough above the background noise level to enable detection;
›› scalability: the technology must allow density increases and corresponding energy reductions as it improves; and
›› scalable manufacturability: the technology must be producible with a process capable of industrial-scale implementation.

Although we did not assess potential post-CMOS technologies in detail, we used these four metrics and IARPA’s criteria of timescale, complexity, risk, and opportunity to evaluate their merit. The results are shown in Table 1.

The list of options in the table is by no means comprehensive, but is meant as a glimpse of those most commonly debated in and outside the literature. No option is clearly superior in all respects, so we believe that one or more of them will reach mainstream use through integration with conventional silicon and CMOS platforms. Indeed, chip stacking is already enabling the stacking of photonics technology directly on conventional silicon logic and memory circuits.

Packaging and computer architecture do not require fundamentally new materials and underlying process technology, which can extend the same underlying silicon/CMOS technology. New devices, on the other hand, require fundamentally new materials and even new data and computational representations, a far deeper and less predictable revision of the digital computing paradigm.

ARCHITECTURE AND SOFTWARE ADVANCES

Architectural schemes to extend digital computing aim to better manage
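Table 1 (excerpt). Column headings are inferred; the rating columns follow the order of the criteria named in the text.

Improvement class                         Technology                                             Timescale  Complexity  Risk    Opportunity
3D integration and packaging              Chip stacking in 3D using through-silicon vias (TSVs)  Near-term  Medium      Low     Medium
                                          Metal layers                                           Mid-term   Medium      Medium  Medium
Millivolt switches (a better transistor)  Tunnel field-effect transistors (TFETs)                Mid-term   Medium      Medium  High
                                          Heterogeneous semiconductors/strained silicon          Mid-term   Medium      Medium  Medium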
energy, decrease power consumption, lower overall chip cost, and improve error detection and response.

Energy management

Current energy-management technologies are ubiquitous and typically coarse grained. Dynamic voltage and frequency scaling (DVFS) and thermal throttling lower both clock frequencies and voltages when computing demands do not require peak performance. Coarse-grained DVFS can save significant power in current consumer electronics devices, which are mostly idle. However, it only marginally benefits devices that operate near 100 percent utilization. Finer-grained power management might provide additional potential to recover energy, enabling faster transitions between power states by having the software direct state changes.

Circuit design

Studies have demonstrated approaches that enable wires to operate at a lower voltage for long-haul connections and then reamplify efficiently at the endpoints, although with some loss from reamplification. A recent NVIDIA paper estimated an opportunity for a two- to threefold improvement using such advanced circuit design techniques with current technologies [8].

A more aggressive path to performance enhancement is clockless (or domino logic) design. Clock distribution consumes a large fraction of system power and constrains a circuit to the operating speed of its slowest component. Practical and effective clockless designs have proven elusive, but recent examples show that this approach could be a viable way to lower dynamic power consumption for both neuromorphic and digital applications [9].

System-on-chip (SoC) specialization

The core precept of SoC technology is that chip cost is dominated by component design and verification costs. Therefore, tailoring chips to include only the circuit components of value to the application is more economically efficient than designing one chip that serves a broad application range, which is the current commodity design practice. This tailoring strategy is common practice for cell-phone chips, such as that in the Apple iPhone, which combines commodity embedded processor cores in a specialized SoC design,
but is only just being applied in server and high-performance computing (HPC) chips.

Logic specialization

Field-programmable gate arrays (FPGAs) and reconfigurable computing hold promise for improving performance by creating tailored circuits for each problem, but they are not efficient to implement. In a typical FPGA implementation, most of the available reconfigurable wires remain unused to maximize the use of lookup tables [10]. A custom application-specific integrated circuit (ASIC) design delivers a tenfold performance improvement over the FPGA design of the same circuit because the ASIC design eliminates redundant wiring.

Unfortunately, tailoring in either case requires substantial hardware design expertise, and circuit design in general is much more expensive than software design. Possibly, the economic disincentive of designing and verifying custom circuits will be overcome by the reality of having no performance scaling at all.

The most extreme proposals for customizing logic are intended for use in dark silicon: areas on the ASIC that remain turned off when not in use. The idea is to trade off increased ASIC surface area for more efficient specialized circuits with the aim of having performance gains offset the cost of extra area. By turning off specialized circuits that are not required, this approach is energy neutral (in the sense that it increases performance without increasing power consumption), and it is already being used in some specialized consumer electronics applications. However, its utility in general-purpose computation is unproven.

Near-threshold voltage operation

The mainstream computing community has traditionally shunned further reductions in device voltage because such measures would reduce transistors’ signal-to-noise immunity and subject circuits to wider statistical performance variation. Both effects would result in the unreliable performance of individual circuits and present daunting problems to software and hardware development.

From a software standpoint, conventional bulk-synchronous approaches to scaling parallel computing performance would become untenable, forcing a move to entirely asynchronous software-execution models and a corresponding reformulation of algorithms and infrastructure. Applications and algorithm developers would need to substantially rewrite software to accommodate this unpredictable performance heterogeneity.

In hardware, increased unreliability would require more pervasive error detection and corresponding software infrastructure to respond; the cost of this is unknown. Clock frequencies would be substantially lower, putting more pressure on parallelism to gain performance improvements, already a daunting software burden.

Near-threshold voltage (NTV) circuit operation provides the opportunity to reduce operating voltages and hence increase device energy efficiency (along with its usable performance and scalability) by an order of magnitude. NTV is still an active research focus, with efforts to determine whether the software challenges posed by reliability, performance heterogeneity, and increased parallelism will detract from raw potential performance improvement [11].

3D INTEGRATION AND PACKAGING

3D integration and packaging has been used successfully in mainstream devices to increase logic density and reduce data-movement distances. Most memory devices involve some form of chip stacking, which will be critical in increasing the density of future devices.

The primary challenges to scaling 3D lithographic layering are improving defect tolerance and managing the thermal densities and intrinsic resistance. Stacking cool technologies such as emerging nonvolatile memory cells (magnetoresistive RAM, memristors, and so on) provides substantial opportunity for deeper lithographic layering and potentially a few orders of magnitude improvement in component density in terms of both increased functionality and memory capacity. Although 3D stacking will substantially reduce data-movement requirements, a major contributor to thermal density, it is unclear how much additional room it affords for deeply stacking logic layers.

3D memory technologies will pave the way to 3D integration. Technologies that reduce operating voltages for digital circuits (a development effort that has been stalled since 2004) could provide further room to build circuits out vertically.

Stacking with through-silicon vias

In this approach, holes are drilled through silicon chips to provide electrical connections between the stacked layers. Chip stacks of up to eight layers are already available, and engineering costs are lower relative to adding layers lithographically or through epitaxial deposition. Through-silicon vias (TSVs) offer much higher bandwidth
and energy efficiency than conventional chip packages, such as ball grid arrays and other pin packages. However, relative to adding chip layers with photolithography, TSVs do not offer as much bandwidth, efficiency, and connectivity between layers.

Metal layers

CMOS has traditionally been built out in 2D planar form with modest improvements in 3D. Modern chips have up to 11 metal layers. The number of metal layers could be increased, but these provide additional connectivity among components only on the 2D surface.

Epitaxial deposition

Lithographic layering yields only the bottom silicon layer, which is still 2D planar. More active transistors require adding layers of semiconductor material on top of each other. Epitaxial deposition meets that requirement through a chemical, molecular beam, or vapor deposition process. Challenges remain in depositing high-quality, single-crystal active layers, but there has been substantial progress in studying approaches that go beyond standard silicon, such as those that use processes other than epitaxial deposition to directly transfer very thin layers of bulk crystalline material.

RESISTANCE REDUCTION

Most ICs use a copper-based interconnect to reduce resistance because copper is a particularly good conductor, and at room temperature few options are capable of lower electrical resistance. Two alternatives are superconductors and crystalline metals.

Superconductors

Superconductivity could be a way to advance HPC system performance, but it will force a departure from the mainstream because cooling is not likely to be practical for consumer devices. Even cuprate-based high-temperature superconductors have impractical cooling and magnetic shielding requirements for such devices. The viability of cryogenically cooled electronics in standard phones or laptops is doubtful.

Using cryogenically cooled electronics to extend HPC performance is technically feasible, but would entail a departure from the traditional leveraging of commodity component technology. There could be severe repercussions for the US in HPC competitiveness and system affordability, which critically depend on that leveraging ability.

Crystalline metals

Although copper is an excellent conductor, in a typical polycrystalline configuration, electrons still scatter off the boundaries between neighboring crystalline grains. Metal layers’ conductivity could be improved by as much as five times by creating larger grain sizes. Techniques to create larger crystal grains in a scalable chip-manufacturing process are still not well understood or perhaps are not being shared because of proprietary concerns.

MILLIVOLT SWITCHES

Millivolt switches are essentially transistors that can operate at much lower voltages. Many 3D stacking approaches eventually fail to scale because stacking energy-intensive logic layers creates energy-density limits. Any future electronic system will need material that reduces switching power for the logic and resistive losses from information transfer within each constituent logic layer. Examples of structures and materials that might improve device performance are tunneling field-effect transistors (TFETs), heterogeneous semiconductors, carbon nanotubes and graphene, and piezo-electric transistors (PETs).

Tunneling field-effect transistors

With conventional FETs, device performance is limited by the voltage swing required to turn them completely on or off (gain). A TFET uses a channel material that modulates the quantum tunneling effect, rather than the classical metal-oxide semiconductor (MOS) FET modulation of thermionic emission, which creates a switch that is more sensitive to gate voltage when turning on or off and can thus operate at a lower voltage. Because the device’s power dissipation is proportional to voltage squared, there is substantial opportunity to improve energy efficiency.

Different materials systems are being investigated, but thermal sensitivity, speed, obstacles to reliable manufacturability, and other scalability issues challenge current devices. Without lithography improvements, successful development of TFET devices could enable one or two additional generations of improvement in technology performance scaling, but it will take a decade to translate laboratory advances to mainstream mass production.

Other technologies involve new gate designs to improve transistor sensitivity, such as ferroelectric gate FETs. All have similar challenges in manufacturing, and offer similar opportunities to extend technology energy efficiency (and hence performance) through lower operating voltages.

Heterogeneous semiconductors

Silicon has become the primary semiconductor material for ICs because of
its favorable chemical properties and physical robustness. Semiconductors formed from III-V materials (so named for their source in columns 3 and 5 of the periodic table), such as gallium arsenide (GaAs), provide much higher performance, but are more susceptible to cracking from low-quality oxides and consequently require more involved chemical processing. These manufacturability problems have kept III-V materials on the margins of mainstream digital electronics.

Recent dramatic improvements in the ability to integrate islands of III-V materials into bulk silicon substrates address these problems by enabling a marriage of silicon’s manufacturing, chemical, and electrical benefits to the performance benefits of embedded III-V materials. Heterogeneity is achieved through silicon straining, a process that alters the silicon substrate so that its atomic spacing aligns with that of the III-V material and dopes the III-V materials with additional impurities so that the atomic spacing aligns with that of silicon when it is vapor-deposited onto the silicon substrate. Alternatively, the silicon can be deposited on top of a substrate material with slightly larger lattice spacing, such as silicon germanium (SiGe). The atomic bonding between the layers stretches out the silicon lattice and can substantially improve charge-carrier mobility through the silicon.

Challenges in this area are due mostly to the cost and complexity of producing crystallographically perfect structures in a manner that integrates into a scaled-up production lithography process. However, recent advances in ultra-high vacuum chemical vapor deposition (UHV-CVD), molecular beam epitaxy (MBE), and other epitaxial growth techniques have made reliable large-scale production more practical.

Strained silicon is a novel processing approach that, although promising, is still finding its way into mainstream lithography processes, and heterogeneous semiconductors face obstacles stemming from the required materials. For example, gallium arsenide suffers from unbalanced P- and N-gate performance, which in turn affects its efficiency in CMOS devices. Combining these materials with silicon using epitaxial deposition could solve some of these problems and provide an order-of-magnitude improvement in some device functions. However, to date, many III-V materials are not candidates for an exact CMOS replacement.

Carbon nanotubes and graphene

The band gap of carbon nanotubes is much smaller than that of silicon, a characteristic that translates to less energy in operating carbon nanotube–based devices. These devices also present lower resistance to electron movement, which increases noise susceptibility. Experiments have shown that transistors based on carbon nanotubes can deliver higher current densities than silicon-based devices [12], which in principle would enable them to operate at much higher switching rates and energy efficiency. Other studies show that nanotube devices have gain (a steeper subthreshold slope) and noise-rejection properties that can compete with those of classical semiconductors for individual devices [13].

Despite these favorable properties, mundane issues like contact resistance are stalling progress in carbon nanotubes, and gate dielectric materials have yet to be fully engineered and optimized for nanotubes. Furthermore, the distribution of nanotube diameters and band gaps leads to problematic device variation, and high-purity nanotubes with uniform diameter are hard to manufacture. The primary challenge for nanotubes lies in finding a scalable manufacturing process, as current devices require precise tube placement to form transistors and circuits. Although recent advances in self-assembly processes for nanotube-based circuits have been significant [14], a competitive, commercially scalable process is still a long way off.

Graphene is a planar matrix of carbon atoms with no band gap, making it unsuitable for digital switches, which must turn off and have very low current leakage. One way around this is to fashion graphene into very narrow ribbons, as graphene rolled into a perfectly smooth tube is in effect a nanotube.

The challenge is to manufacture uniformly wide graphene nanoribbons with atomically smooth edges, and generally, graphene nanoribbons are less well developed than nanotubes. However, breakthrough synthesis techniques are emerging that might produce pure ribbons of uniform width more efficiently and economically than the techniques used to produce nanotubes. In addition to graphene, there has been rapid development of other 2D materials systems such as molybdenum disulfide (MoS2) and phosphorene [15].

Piezo-electric transistors

Piezo-electric transistors (PETs) use the piezo-electric effect, in which an electric field induces mechanical stress by changing material size. The most common use of PETs has been in micromechanical systems and force sensors, but if such piezo materials
can be successfully miniaturized, the technology could also be used to form an extremely fast (multi-gigahertz) microscale electronic relay [16]. This strategy is one of many micromechanical approaches to developing higher-performance switches.

BEYOND TRANSISTORS: NEW LOGIC PARADIGMS

The devices and techniques described so far aim to improve device performance with familiar digital computing architectures and computational models. However, technologies are being proposed that are more than just better transistors; they change how bits are stored and transformed. These radical performance-enhancement paths represent new logic paradigms, including spintronics, topological insulators, nanophotonics, and biological and chemical computing.

Spintronics

Computing on information and communicating it through the manipulation of magnetic domains takes so much less energy than moving electrons that it is nearly inconsequential to overall power consumption. Kevin Cummings of SEMATECH, a global consortium of semiconductor device, equipment, and materials manufacturers, stated in an email to us that spin materials could benefit technologies aiming to provide dual functionality (logic and memory) as well as new circuit designs, such as static RAM.

For applications involving memory technologies, there is little impact on standard paradigms for computation, but broader use of spintronic devices for general-purpose computing applications (fully replacing CMOS) would require an adiabatic or a reversible computing model. Such models can be highly restrictive and would fundamentally disrupt the current digital computing model.

Topological insulators

Topological insulators confine energy to a 2D space. Relative to conventional wires, these confined energy states can provide more efficient (higher noise margin) information transport and storage, but the proper approach to implementing logic is uncertain. One method is to apply 2D image-analysis algorithms that use a photogalvanic effect to program the initial state for quantum bits (qubits) embedded in the topological insulator. According to SEMATECH’s Cummings, the electronics industry is considering these 2D semiconductors as well as other new semiconductors with unique properties.

Nanophotonics

Photonic technology has obvious advantages for scalable communications, although using nanophotonics at subwavelength scale as a replacement for computing and transistor technologies is problematic because of scale incompatibilities: available optical transistors have a low gain, and optical wavelengths are large compared to currently realizable photolithographic scales.

Unlike standard electrical wires, which have a strong distance-dependent energy cost, photonics’ energy costs are nearly independent of data distance, allowing them to overcome the wire-resistance limitation. Unfortunately, although it has steadily decreased, the energy cost to activate the laser to send information over a photonic connection is still far higher than the wire cost [17]. Even so, photonics will be essential in overcoming wire limits and the disparity in on-chip and off-chip communications costs [5]. An effective high-gain optical transistor would make nanophotonics a competitor in CMOS replacement, but the technology for a high-performance, optically controlled switch requires further development.

Biological and chemical computing

Computing devices based on the animal brain aim to emulate the most complex machine known. The principal challenges to biologically based computing devices include low gain, a poor signal-to-noise ratio, and exotic operating conditions.

The search continues for a chemical switching mechanism that offers sufficient gain and noise rejection to compete with silicon. Good candidates exist, but in addition to requiring complex operating environments, they are difficult to scale.

Society relies heavily on the benefits that Moore’s law provides: cheap technology that continues to scale almost effortlessly. From this point, the energy cost of data movement will dominate both technical and economic issues because the energy cost to compute data is decreasing faster than the cost to move it to computing operations. Increasing the use of parallelism in software is a short-term fix that will require massive commercial effort. Longer term, the current computation-centric model might need to give way to a data-centric model.

Evolving technology in the Moore’s law vacuum will require an investment
now in basic sciences, including materials science, to study candidate replacement materials and alternative device physics to foster continued technology scaling. Judging by the history of the silicon FinFET, it takes about 10 years for an advance in basic device physics to reach mainstream use. Any new technology will require a long lead time and sustained R&D of one to two decades. Options abound, the race outcome is undecided, and the prize is invaluable. The winner not only will influence chip technology, but will define a new direction for the entire computing industry.

ACKNOWLEDGMENTS

We thank Shekhar Borkar, Justin Rattner, Steve Pawlowski, and Al Gara of Intel; Catherine Jenkins and Jeff Bokor of Lawrence Berkeley National Laboratory/University of California, Berkeley; Erik DeBenedictis of Sandia National Laboratories; Thomas Theis of SRI; and Kevin Cummings of SEMATECH for their helpful input.

REFERENCES

1. G.E. Moore, “Cramming More Components onto Integrated Circuits,” Electronics, vol. 38, no. 8, 1965, pp. 114–117.
2. R.H. Dennard et al., “Design of Ion-Implanted MOSFETs with Very Small Physical Dimensions,” IEEE J. Solid-State Circuits, vol. SC-9, no. 5, 1974, pp. 256–268.
3. R. Colwell, “The Chip Design Game at the End of Moore’s Law,” Proc. IEEE/ACM Symp. High-Performance Chips (HC25), 2013; www.hotchips.org/wp-content/uploads/hc_archives/hc25/HC25.15-keynote1-Chipdesign-epub/HC25.26.190-Keynote1-ChipDesignGame-Colwell-DARPA.pdf.
4. R.E. Fontana, S.R. Hetzler, and G. Decad, “Technology Roadmap Comparisons for TAPE, HDD, and NAND Flash: Implications for Data Storage Applications,” IEEE Trans. Magnetics, vol. 48, no. 5, 2012, pp. 1692–1696.
5. D.A.B. Miller, “Device Requirements for Optical Interconnects to Silicon Chips,” Proc. IEEE, vol. 97, no. 7, 2009, pp. 1166–1185.
6. L. Joneckis, D. Koester, and J. Alspector, An Initial Look at Alternative Computing Technologies for the Intelligence Community, tech. report, Inst. for Defense Analysis, Jan. 2014; http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA610103.
7. S. Borkar, “Electronics Beyond Nano-Scale CMOS,” Proc. 43rd Ann. ACM Design Automation Conf. (DAC 06), 2006, pp. 807–808.
8. O. Villa et al., “Scaling the Power Wall: A Path to Exascale,” Proc. IEEE Supercomputing Conf., 2014, pp. 830–841.
9. N. Imam et al., “Neural Spiking Dynamics in Asynchronous Digital Circuits,” Proc. Int’l Joint Conf. Neural Networks (IJCNN 13), 2013, pp. 1–8.
10. A. DeHon, Reconfigurable Architectures for General-Purpose Computing, AI tech. report 1586, MIT Artificial Intelligence Lab., Sept. 1996; www.seas.upenn.edu/~andre/abstracts/dehon_phd.html.
11. H. Kaul et al., “Near-Threshold Voltage (NTV) Design: Opportunities and Challenges,” Proc. 49th ACM/EDAC/IEEE Design Automation Conf. (DAC 12), 2012, pp. 1149–1154.
12. J. Appenzeller, “Carbon Nanotubes for High-Performance Electronics: Progress and Prospect,” Proc. IEEE, vol. 96, no. 2, 2008, pp. 201–211.
13. A.D. Franklin et al., “Sub-10 nm Carbon Nanotube Transistor,” Nanotechnology Letters, vol. 12, 2012, pp. 758–762.
14. H. Park et al., “High-Density Integration of Carbon Nanotubes via Chemical Self-Assembly,” Nature Nano, vol. 7, no. 12, 2012, pp. 787–791.
15. R.F. Service, “Beyond Graphene,” Science, vol. 348, no. 6234, 2015, pp. 490–492.
16. T.N. Theis, “In Quest of a Fast, Low-Voltage Digital Switch,” ECS Trans., vol. 45, no. 6, 2012, pp. 3–11.
17. A.F. Benner et al., “Exploitation of Optical Interconnects in Future Server Architectures,” IBM J. Research and Development, vol. 49, nos. 4–5, 2005, pp. 755–776.

ABOUT THE AUTHORS

JOHN M. SHALF is CTO of the National Energy Research Supercomputing Center and head of the Computer Science Department at Lawrence Berkeley National Laboratory. His research interests include parallel computing software and high-performance computing technology. Shalf received an MS in electrical and computer engineering from Virginia Tech. He is a member of the American Association for the Advancement of Science, IEEE, and the Optical Society of America, and coauthor of the whitepaper “The Landscape of Parallel Computing Research: A View from Berkeley” (UC Berkeley, 2006). Contact him at [email protected].

ROBERT LELAND is vice president of science and technology and chief technical officer (CTO) of Sandia National Laboratories. His research interests include parallel algorithm development, sparse iterative methods, and applied graph theory. Leland received a PhD in parallel computing from Oxford University. Contact him at [email protected].