
INVITED PAPER

The SpiNNaker Project


This paper describes the design of a massively parallel computer that is suitable for computational neuroscience modeling of large-scale spiking neural networks in biological real time.

By Steve B. Furber, Fellow, IEEE, Francesco Galluppi, Steve Temple, and Luis A. Plana, Senior Member, IEEE

ABSTRACT | The spiking neural network architecture (SpiNNaker) project aims to deliver a massively parallel million-core computer whose interconnect architecture is inspired by the connectivity characteristics of the mammalian brain, and which is suited to the modeling of large-scale spiking neural networks in biological real time. Specifically, the interconnect allows the transmission of a very large number of very small data packets, each conveying explicitly the source, and implicitly the time, of a single neural action potential or "spike." In this paper, we review the current state of the project, which has already delivered systems with up to 2500 processors, and present the real-time event-driven programming model that supports flexible access to the resources of the machine and has enabled its use by a wide range of collaborators around the world.

KEYWORDS | Brain modeling; multicast algorithms; multiprocessor interconnection networks; neural network hardware; parallel programming

Manuscript received October 2, 2013; revised January 4, 2014; accepted February 2, 2014. Date of publication February 27, 2014; date of current version April 28, 2014. This work was supported by the Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/G015740/01. The authors are with the School of Computer Science, University of Manchester, Manchester M13 9PL, U.K. (e-mail: [email protected]; [email protected]; [email protected]; [email protected]). Digital Object Identifier: 10.1109/JPROC.2014.2304638

I. INTRODUCTION

The spiking neural network architecture (SpiNNaker) project is motivated by the grand challenge of understanding how information is represented and processed in the brain [1]. Most of the frontiers of science are concerned with the very small, such as subatomic particles, or the very large, such as exploring the outer regions of the universe. Yet there remains a great unsolved scientific mystery at a very human scale: how does the brain, an organ that we could readily hold in our hands and observe with the naked eye, perform its role that is so central to all of our lives?

"Wet" neuroscience has told us a great deal about the basic component, the neuron, from which the brain is constructed. Brain imaging tells us yet more about how activity moves around the brain as we perform certain mental functions. The former is concerned with individual neurons up to groups of tens or perhaps hundreds; the latter looks at the collective activity of many millions of neurons. But between these scales there are a few orders of magnitude of scale for which there exists no scientific instrument except the computer model, and it is at these intermediate scales, we suggest, that all the interesting information processing takes place.

Our conclusion is that, if we wish to fully understand how the brain represents and processes information, we need to build computer models to test hypotheses of how the brain works.

A. Neurons and Spikes

What sort of computer is required for such brain modeling to work?

The human brain is generally viewed as comprising somewhat under 100 billion neurons, where each neuron is a multiple-input, single-output device. There is some debate about the role of the more numerous glial cells that form the structure upon which the neurons build the brain, and, in particular, the role of astrocyte cells in synaptic plasticity [2], so any general-purpose system should aim to accommodate these issues in case they prove to be important.

Neurons communicate principally through action potentials, or "spikes." These are simply asynchronous impulses where, as a result of the electrochemical regeneration process used to ensure the reliable propagation of these signals along long biological "wires," information is conveyed only in the identity of the neuron that spiked and the time at which it spiked. The height and the width of the impulse are largely invariant at the receiving synapse. This has led to the widespread adoption of the address event representation (AER) encoding of neural activity [3], [4], where the information flow in a network is represented as a time series of neural identifiers.

There are some notable exceptions to the completeness of the AER view of information flow. Some neurons transport and emit neuromodulators, such as dopamine, that have a global effect on neurons within a neighborhood region; other neurons make direct contact through "gap" junctions that make an electrical connection from one neuron to its neighbor. However, in much of the brain, the primary real-time information flow is in the spikes that, in a model, are represented by AER. A general-purpose computer-modeling platform should offer mechanisms to support these other information flows while giving first-class support to AER "spikes."

B. Computer Models

What computer power and architecture are required to support a real-time model of the human brain?

The simplest estimate of an answer to this question suggests that there are around 10^15 synapses in the brain, with inputs firing at an average rate of 10^1 Hz, and each synaptic event requires perhaps 10^2 instructions to update the state of the postsynaptic neuron and implement any synaptic plasticity algorithm. These figures lead to an estimate of 10^18 operations per second, the performance of an exascale machine. Exascale high-performance computers do not yet exist, though recently the Chinese Tianhe 2 machine has achieved 3 x 10^16 floating-point operations per second [5], so exascale computing is not too far away.
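Written out, the arithmetic behind this estimate is simply:

    10^15 synapses x 10^1 Hz x 10^2 instructions per synaptic event = 10^18 instructions per second.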
However, raw computer performance is not the only issue here. The communication patterns in the brain are based on sending very small "packets" of information through complex paths to many targets. High-performance computers, on the other hand, are generally optimized for point-to-point communication of large data packets. This mismatch leads to significant inefficiency in the mapping of brain-scale spiking neural networks onto conventional cluster machines and high-performance computers.

C. SpiNNaker

The SpiNNaker machine is a computer designed specifically to support the sorts of communication found in the brain. Recognizing the huge computational requirements of the task, SpiNNaker is based on massively parallel computation, and the architecture will accommodate up to a million microprocessor cores, the limit being defined by budget and architectural convenience rather than anything fundamental.

The key innovation in the SpiNNaker architecture is the communications infrastructure, which is optimized to carry very large numbers of very small packets, in contrast to conventional cluster and high-performance computer communications systems which, as noted above, are optimized for large data packets. Each packet carries a single neural "spike" event in a 40-b packet, 32 b of which are the AER identifier of the neuron that spiked and 8 b are management bits identifying the packet type, and such like. (The choice of a 32-b AER identifier is not a fundamental limitation of the architecture, and could be increased in a future implementation to accommodate larger neural models.) The time of the AER spike is implicit; the communications infrastructure can deliver a packet in much less than a millisecond, which is the requirement for real-time neural modeling.

Although SpiNNaker's design is centered on packet-switched support for AER "spikes," it can also support non-AER information flows through the same communication mechanism, delivering discrete (typically 1 ms) updates to continuously variable parameters.

In order to achieve efficient massively parallel operation, SpiNNaker's design accepts certain compromises, one of which is giving up the requirement for deterministic operation. The asynchronous nature of the communications system leads to nondeterministic ordering of packet reception, and occasionally packets may be dropped to avoid communication deadlock. It is possible to reimpose deterministic, lockstep operation to match a conventional sequential model under certain conditions, but this is not the natural or most efficient way to operate the machine.
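As a concrete illustration, the split between the management bits and the AER identifier can be sketched as follows (a minimal sketch in C; the field packing shown is illustrative, not the exact bit layout used by the router):

    #include <stdint.h>

    /* Sketch of a SpiNNaker multicast (MC) spike packet: 8 management
       bits identifying the packet type plus a 32-b AER key naming the
       neuron that spiked, 40 b in total on the wire. */
    typedef struct {
        uint8_t  control;  /* packet type, parity, payload-present flag */
        uint32_t key;      /* AER identifier of the source neuron */
    } mc_spike_packet_t;

    /* The spike time is implicit: the fabric delivers the packet in
       much less than 1 ms, so arrival time stands in for spike time. */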
D. Paper Organization

This paper is a review of the SpiNNaker project and a tutorial on the use of the machine. The contributions and structure of the paper are as follows.
• We present an overview of the architecture (Section II) and of the hardware implementation (Section III).
• We present the system software (Section IV), describe the event-driven software model (Section V), the API that supports this (Section VI), and a simple example program that runs on top of the API (Section VII).
• We present the partitioning and configuration manager (PACMAN, Section VIII) that conceals the physical structure of the machine.
• Finally, we describe some typical applications that run on the machine (Section IX), our future plans for larger scale machines (Section X), discuss related work (Section XI), and draw our conclusions (Section XII) from our experience with the machine at this stage in its development.

II. ARCHITECTURE OVERVIEW

A detailed description of the architecture of the machine has been presented earlier [6], so here we present the key features of the architecture that are germane to what follows (see Fig. 1).

Fig. 1. Principal architectural components of a SpiNNaker node.

A SpiNNaker machine is a homogeneous 2-D multiple-instruction, multiple-data array of processing nodes, where each node incorporates 18 ARM968 processor cores, each with 96 kB of local memory, 128 MB of shared memory, a packet router, and general system support peripherals.


Each processor core is a general-purpose 200-MHz 32-b integer processor with no floating-point hardware, so arithmetic is generally implemented as fixed point.

A. Achievable Performance

Each core can model a few hundred point neuron models, such as leaky integrate and fire or Izhikevich's model, with on the order of 1000 input synapses to each neuron. In practice, a number of different constraints may limit the number of neurons a processor core can support in real time, but often the compute budget is dominated by input connections (an incoming spike passing through an individual synapse), which imposes an upper limit on the product (number of neurons) x (number of inputs per neuron) x (mean input firing rate). In principle, a processor core can support up to 10 million connections/s, though the current software implementation saturates at about half this throughput, and plastic synapse models reduce it considerably further.

B. Spikes and Packets

The key innovation in the SpiNNaker architecture is a lightweight multicast packet-routing mechanism that supports the very high connectivity found in biological brains. The mechanism is an extension of conventional AER [3], [4]. When the software running on a processor identifies that a neuron should emit a spike, it simply issues a packet that identifies the spiking neuron. The issuing processor has no idea of where that packet will be conveyed to: that is entirely the responsibility of the routing fabric.

Each node incorporates a packet router that inspects each packet to look at its source, and routes it accordingly to any subset of its 18 local processors and/or any subset of its six neighbor nodes using multicast transmission (which has been shown to be optimal for neural applications [7]) in a 2-D triangular mesh. The selected routes are determined by tables in the router that are initialized when the application is loaded into the machine.

As the packet source identifier is 32 b, it is infeasible to implement full routing tables for every possible source, so a number of optimizations are employed to keep the table sizes reasonable.
• The tables are implemented using content-addressable memory (CAM), and entries are required only for those packets that pass through a node.
• The CAM uses four states: match 0, match 1, match all, and no match. This allows a single CAM entry to route all of those neurons in a population with common routing requirements.
• Where no CAM entry matches a source identifier, a default routing mechanism allows the packet to pass straight through the node.
These optimizations allow a routing table with 1024 entries to be sufficient at each node. We will return to the matter of initializing these tables in Section VIII.
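In software terms, each table entry can be modeled as a (key, mask, route) triple, where the mask bits select between "match 0/1" and "match all"; a minimal sketch in C follows (the entry layout is illustrative, not the hardware register format):

    #include <stdint.h>

    /* Sketch of a ternary routing entry: a mask bit of 1 means the
       corresponding key bit must match exactly (match 0 or match 1);
       a mask bit of 0 means "match all," so one entry can cover a
       whole population of neurons with common routing requirements. */
    typedef struct {
        uint32_t key;    /* expected values of the masked bits */
        uint32_t mask;   /* which of the 32 source-identifier bits to compare */
        uint32_t route;  /* one bit per destination: 6 links + 18 local cores */
    } rtr_entry_t;

    /* Return the route for a source identifier, falling back to default
       routing (pass straight through the node) when no entry matches.
       The hardware CAM checks all entries in parallel. */
    uint32_t route_lookup(const rtr_entry_t *table, int entries,
                          uint32_t src, uint32_t default_route)
    {
        for (int i = 0; i < entries; i++)
            if ((src & table[i].mask) == table[i].key)
                return table[i].route;
        return default_route;
    }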

C. Processor Disposition

Each SpiNNaker node selects one of its 18 processor cores to act as "monitor processor." This selection is flexible for fault-tolerance reasons. Once selected, the monitor is assigned an operating system support role. Sixteen of the remaining processors are assigned application support roles, and the 18th processor is held in reserve as a fault-tolerance spare, though on a proportion of nodes the 18th processor may be faulty, as nodes with only 17 functional processors are accepted in production to enhance yield.

III. CHIPS, PACKAGES, BOARDS, AND SYSTEMS

The physical implementation of the SpiNNaker architecture has also been described in detail elsewhere [8], so again we will restrict ourselves here to the relevant highlights.

A. Chips and Packages

Each SpiNNaker node is implemented in a single 19-mm square 300 ball grid array package. The package houses a custom-designed multiprocessor system-on-chip integrated circuit that includes the 18 ARM968 processors, each with its local 32-kB instruction memory and 64-kB data memory, interconnected through a self-timed network on chip to various on-chip shared resources, and a second chip, a 128-MB low-power mobile dual-data-rate (DDR) SDRAM. The two chips are stacked onto the package substrate and interconnected using gold wire bonding (Fig. 3). The aggregate SDRAM bandwidth has been measured to be 900 MB/s [8].

Fig. 3. Inside a SpiNNaker package. The SpiNNaker chip is mounted on the substrate, then a 128-MB mobile DDR SDRAM is stacked on top of it, and the connections are made inside the package with gold wire bonding. The packaging was carried out by Unisem Europe Ltd.

B. Boards

The packages are then assembled onto printed circuit boards (PCBs; see Fig. 2). The chip-to-chip connections on the PCB are direct wired connections using a self-timed 2-of-7 non-return-to-zero protocol to transmit 4-b symbols with two wire transitions, plus one wire transition for the acknowledge response.

In principle, these direct connections could be used to build a SpiNNaker machine of arbitrary size, but for practical reasons the machine is constructed from 48-node PCBs, and the PCB-to-PCB connections use high-speed serial links where eight chip-to-chip links are multiplexed through each serial link using Xilinx Spartan6 field-programmable gate arrays (FPGAs).
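The symbol coding can be sketched in a few lines of C; this is a minimal illustration, and the code table below is invented (the real link uses a fixed codebook of sixteen 2-of-7 codes for the data symbols, plus an end-of-packet code):

    #include <stdint.h>

    /* Each 4-b symbol is sent non-return-to-zero by flipping exactly
       two of the seven data wires; the receiver acknowledges with a
       single transition on an eighth wire. Every entry below has
       exactly two bits set (an illustrative code assignment only). */
    static const uint8_t two_of_seven[16] = {
        0x03, 0x05, 0x06, 0x09, 0x0A, 0x0C, 0x11, 0x12,
        0x14, 0x18, 0x21, 0x22, 0x24, 0x28, 0x30, 0x41
    };

    /* NRZ: the new wire state is the old state with two wires toggled. */
    uint8_t send_symbol(uint8_t wire_state, uint8_t nibble)
    {
        return wire_state ^ two_of_seven[nibble & 0xF];
    }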
C. Systems

SpiNNaker systems of varying sizes can then be assembled from one or more of the 48-node PCBs. There is also a smaller four-node board that is very convenient for training, development, and mobile robotics. The largest machine, incorporating over a million ARM processor cores, will comprise 1200 48-node boards in ten machine room cabinets and will require up to 75 kW of electrical power (peak).

Fig. 2. A 48-node SpiNNaker PCB. This circuit board incorporates 48 SpiNNaker packages (center) with a total of 864 ARM968 processor cores, three FPGAs (top) for high-speed inter-PCB communications through serial advanced technology attachment connectors (top left and right), with onboard power regulation (bottom).

IV. SPINNAKER SYSTEM SOFTWARE

SpiNNaker software can be categorized into that which runs on the SpiNNaker system itself and that which runs on other systems, some of which may interact with SpiNNaker. The majority of software that runs on the SpiNNaker chips is written in C. This software can be subdivided into control software (a primitive operating system) and application software which performs the user's computations.

The primary interface between SpiNNaker systems and the outside world is Ethernet and IP-based protocols. Every SpiNNaker chip has an Ethernet interface, and typically one chip per PCB uses this interface. This is used to download code and data to SpiNNaker and to gather results from applications. For some applications, this (100 Mb/s) interface is a bottleneck on getting data to and from SpiNNaker, and we are investigating the use of gigabit links provided by FPGAs on SpiNNaker PCBs to improve this.

A. SpiNNaker Software

The control software that runs on SpiNNaker systems is known as the SpiNNaker Control and Monitor Program (SC&MP). The SpiNNaker chips contain primary bootstrap code which allows the loading of code via the Ethernet interface or the interchip links, and this is used to load SC&MP, initially via an Ethernet interface to a single chip. SC&MP is then propagated to the entire system over the interchip links; it runs continuously on the core that has been selected as the monitor processor and provides a range of services to the outside world to allow applications to be loaded on the remaining 16 or 17 application cores on each chip.


A simple packet protocol known as SpiNNaker Datagram Protocol (SDP) is used within the SpiNNaker system. SC&MP acts as a router for SDP packets, allowing them to be sent to or from any core in the system and also via Ethernet to external endpoints. This protocol forms the basis for application loading and high-level communication between SpiNNaker chips and/or external machines. Within individual chips, SDP packets are exchanged between cores using a shared-memory interface. Between chips, SDP is transported as sequences of point-to-point packets conveyed by the interchip links. To carry SDP out of the system, the packets are embedded in UDP/IP packets and sent via the Ethernet interface to external endpoints.
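The addressing such a protocol needs can be pictured with a small sketch (field names and widths here are assumptions for illustration, not the normative SC&MP definition):

    #include <stdint.h>

    /* Sketch of an SDP header: enough routing information to deliver
       a message to any core in the machine, or out to an external
       UDP/IP endpoint via an Ethernet-attached chip. */
    typedef struct {
        uint8_t  flags;      /* e.g., whether a reply is expected */
        uint8_t  tag;        /* maps the message to an external IP endpoint */
        uint8_t  dest_port;  /* destination core (and port within the core) */
        uint8_t  srce_port;  /* source core and port */
        uint16_t dest_addr;  /* destination chip (x, y) packed into 16 b */
        uint16_t srce_addr;  /* source chip (x, y) */
    } sdp_hdr_t;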
A SpiNNaker "application" is a program that runs on one or many of the application cores on a SpiNNaker system. It will typically be written in C and utilize either SDP or multicast packets for its communication needs. Because of the limited code and data size provided by the on-chip memories in SpiNNaker, there is little room for operating system support, and so only minimal ancillary code can be loaded along with the application. Each application is linked with a support library known as the SpiNNaker Application Runtime Kernel (SARK). SARK provides startup code for the application core to set up the runtime environment for the application. It also provides a library of functions for the application, such as memory allocation and interrupt control. SARK also maintains a communications interface with SC&MP running on the monitor processor that allows the application to communicate with and be controlled by other SpiNNaker chips or external systems. Protocols running on top of SDP are used to achieve this functionality.

An application is built using an ARM cross compiler and linked with SARK and any other runtime libraries that it requires. The output file from the linking stage is converted to a format known as application load and execute (APLX), which is understood by a simple loader which is part of SC&MP. The APLX file can then be downloaded to the SpiNNaker system, where it is loaded into the appropriate parts of memory of the relevant application cores by the SC&MP loader.

Most SpiNNaker applications make use of an event management library known as the Spin1 API. This provides facilities for associating common interrupts with event-handling code and for managing queues of events. While the processor is not processing events it is in a low-power sleep mode. This API can be viewed as a software layer between the user's application and the underlying hardware. To facilitate SpiNNaker program development using the API, an emulator has been developed which provides the same set of library calls as the Spin1 API but which runs on a Linux workstation. This allows users without SpiNNaker hardware to develop and debug SpiNNaker applications and to familiarize themselves with the programming model.

B. Host Software

We refer to the workstation that controls a SpiNNaker system as the "host." A variety of SpiNNaker-related host software has been developed within the project. A number of tools have been developed to download applications to SpiNNaker systems. The "ybug" program provides a command line interface for this function and also allows scripted control of the system. A number of application programmer interfaces (APIs) that implement interfaces based on SDP have been developed in C, Perl, and Python. These allow programmed control of a SpiNNaker system, so that applications can be downloaded and controlled and results uploaded.

In addition, a number of "visualizer" applications have been produced which allow the results of SpiNNaker applications to be viewed on the host system. The simplest of these just allows plain text output to be displayed on the host, while more sophisticated visualizers [9] display data in graphical form, such as the raster plot of firing spikes in a neural network simulation or the potentials inside a single neuron (Fig. 4).

Fig. 4. Example output from a SpiNNaker visualizer.

The provision of input data to SpiNNaker applications can also require host software to provide these data. One such application is the "spike server," which is used to provide spikes (neural events) in real time to a neural simulation running on SpiNNaker.

A significant part of the SpiNNaker software effort has been the development of programs that map complex problems onto the SpiNNaker hardware. A typical example is a neural network simulation where individual neurons or groups of neurons have to be allocated to cores in the system and the routing tables set up to allow them to communicate appropriately for the connectivity of the network. The "PACMAN" program, which is described in Section VIII, is typical of this class of program [10].

Fig. 5 shows the arrangement of the various software components which make up a SpiNNaker system.


Fig. 5. The various software components running on the host machine, the root node, and other SpiNNaker nodes.
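On the host side, the essence of the SDP-based control path is a UDP datagram aimed at an Ethernet-attached chip; a hedged sketch in C follows (the SC&MP UDP port number and the SDP framing are assumptions here; in practice the project's C, Perl, and Python APIs handle them):

    #include <arpa/inet.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Send an already-framed SDP message to a SpiNNaker board over UDP. */
    int send_sdp_udp(const char *board_ip, const void *msg, size_t len)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        if (fd < 0)
            return -1;

        struct sockaddr_in dst;
        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_port   = htons(17893);          /* assumed SC&MP UDP port */
        inet_pton(AF_INET, board_ip, &dst.sin_addr);

        ssize_t sent = sendto(fd, msg, len, 0,
                              (struct sockaddr *)&dst, sizeof(dst));
        close(fd);
        return (sent == (ssize_t)len) ? 0 : -1;
    }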

V. EVENT-DRIVEN SOFTWARE MODEL

The programming model employed on SpiNNaker is that of a real-time event-driven system. The application processors have a base state, which is halted and waiting for an interrupt, contributing to the overall energy efficiency of the system. In the standard neural modeling application, there are three principal events that cause the processor to wake up.
1) An incoming spike packet. This will usually cause the processor to initiate a direct memory access (DMA) transfer from SDRAM of the synaptic data structures associated with the source of this spike.
2) DMA complete. Once the synaptic data have been transferred, the processor must process the data.
3) One-millisecond timer tick. Each processor has a local timer that marks the passage of time, and each millisecond (the interval is programmable; 1 ms is typical) the processor will compute a further integration step in the neuron dynamics.
Of course, these events are asynchronous and unpredictable, so the software running on the processor must be capable of prioritizing the events and handling multiple overlapping requests. This is achieved through the use of a real-time kernel that underpins the event-driven operation of each application processor, and presents a straightforward API to the user, who can build applications on top of the API entirely in C.
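These three events map directly onto callbacks in the API described in Section VI. A minimal sketch of the standard neural pipeline follows; the helper routines lookup_row, process_synaptic_row, and update_neurons_and_send_spikes are hypothetical placeholders for application code:

    #include "spin1_api.h"

    #define DMA_TAG_ROW 0

    static uint32_t row_buffer[256];   /* local buffer for one synaptic row */

    extern void *lookup_row(uint key);                 /* hypothetical */
    extern void process_synaptic_row(uint32_t *row);   /* hypothetical */
    extern void update_neurons_and_send_spikes(void);  /* hypothetical */

    /* 1) Spike packet arrival: find this source neuron's synaptic row
          in SDRAM and start a DMA transfer into local memory. */
    void spike_rx(uint key, uint payload)
    {
        spin1_dma_transfer(DMA_TAG_ROW, lookup_row(key), row_buffer,
                           DMA_READ, sizeof(row_buffer));
    }

    /* 2) DMA complete: the row is now local; apply each synaptic
          weight to the corresponding neuron's input state. */
    void dma_done(uint tag, uint unused)
    {
        process_synaptic_row(row_buffer);
    }

    /* 3) Timer tick: advance the neuron dynamics by one integration
          step and emit a multicast packet for each neuron that fired. */
    void timer_tick(uint ticks, uint unused)
    {
        update_neurons_and_send_spikes();
    }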
VI. SPINNAKER APPLICATION PROGRAMMING INTERFACE

The SpiNNaker application programming interface (spin1 API) [11] provides an execution environment that supports a lightweight, event-driven programming model. A central goal of the model is to save energy by keeping the cores in a low-power state, only responding to events of interest. To this effect, application programs do not control execution flow; they can only indicate the functions, referred to as callbacks, to be executed when specific events, such as the arrival of a packet, the completion of a DMA transfer, or the lapse of a periodic time interval, occur. The callback mechanism is also used to hide the details of the interrupt subsystem, which is handled directly and efficiently by the API.

Fig. 6 shows the basic architecture of the event-driven framework. Application developers write callback routines that are associated with events of interest and register them with the API at a priority level, which defines them as queueable or non-queueable. When the corresponding event occurs, the scheduler either executes the callback immediately and atomically (in the case of a non-queueable callback) or places it into a scheduling queue at a position according to its priority (in the case of a queueable callback). When control is returned to the dispatcher (following the completion of a callback), the highest priority queueable callback is executed. Queueable callbacks do not necessarily execute atomically: they may be preempted by non-queueable callbacks if a corresponding event occurs during their execution.

The dispatcher goes to sleep (in the low-power-consumption "wait for interrupt" state, where the processor core clock is turned off) when the callback queues are empty, and will be awakened by any event. Application developers can designate one non-queueable callback as the preeminent callback, which has the highest priority and can preempt other non-queueable callbacks as well as all queueable ones. The API provides support for callbacks to control entry and exit from critical sections to prevent higher priority callbacks interrupting them at a bad time, e.g., during access to a shared resource.

This real-time kernel is scalable to very large numbers of processors, but is best suited to relatively simple models running on each processor. Clearly, the system will come to a halt if no events are generated, and real-time performance will be lost if a processor is overwhelmed by incoming events. In practice, careful mapping of a model onto the system can avoid both eventualities.
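As an illustration of such a critical section, a queueable callback can bracket access to a shared resource as follows (a minimal sketch; shared_counter is a hypothetical shared variable, and the interrupt-masking pair shown is one plausible choice from the spin1 API):

    #include "spin1_api.h"

    static volatile uint shared_counter;   /* hypothetical shared resource */

    /* Interrupts are masked so that no higher priority callback can
       preempt the update; the previous state is then restored. */
    void queueable_callback(uint arg0, uint arg1)
    {
        uint cpsr = spin1_int_disable();   /* enter critical section */
        shared_counter++;                  /* cannot be preempted here */
        spin1_mode_restore(cpsr);          /* leave critical section */
    }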


Fig. 6. Event-driven software framework.

VII. EXAMPLE LOW-LEVEL APPLICATION

As a simple example of a parallel program that runs on top of the SpiNNaker API, here are the key features of a simple example that implements Conway's Life cellular automaton. First, the program should include the API calls:

    #include <spin1_api.h>

Then, we need routines to set up the initial state of the automaton and the routing tables. In this case, setting up the routing tables is by far the most complex aspect of the programming task, as the Life neighbor connections must be established between processors across chip boundaries.

    void set_up_route_tables(uint chip, uint core) {...}
    void init_Life_state(uint chip, uint core) {...}

Now we must define the event-driven callback routines. In this example, the relevant events are the timer tick and an incoming packet:

    void tick_callback(uint ticks, uint dummy) {...}
    void pkt_in(uint key, uint data) {...}

The simulation is started on each processor from c_main. The chip and core addresses are found, then the initialization routines are called:

    void c_main(void)
    {
        uint chip = spin1_get_chip_id();
        uint core = spin1_get_core_id();
        set_up_route_tables(chip, core);
        init_Life_state(chip, core);

The timer period is set to 1 ms, and the event callbacks are set up with appropriate priorities (packet received is usually at the highest priority):

        spin1_set_timer_tick(1000);
        spin1_callback_on(TIMER_TICK, tick_callback, 1);
        spin1_callback_on(MC_PACKET_RECEIVED, pkt_in, -1);

Finally, the simulation is started:

        spin1_start();
    }
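For completeness, the elided {...} callback bodies might be filled in along the following lines. This is a hedged sketch, not the paper's actual implementation: the grid partitioning, the counters, and the helpers is_neighbour and key_for_cell are invented for illustration (spin1_api.h is already included above):

    #define CELLS_PER_CORE 16                     /* illustrative grid share */

    static uint cell_state[CELLS_PER_CORE];       /* 0 = dead, 1 = alive */
    static uint neighbour_count[CELLS_PER_CORE];  /* live neighbours this tick */

    extern uint is_neighbour(uint key, uint cell);   /* hypothetical */
    extern uint key_for_cell(uint cell);             /* hypothetical */

    /* A packet announces that the cell identified by key is alive:
       credit every local cell adjacent to it. */
    void pkt_in(uint key, uint data)
    {
        for (uint i = 0; i < CELLS_PER_CORE; i++)
            if (is_neighbour(key, i))
                neighbour_count[i]++;
    }

    /* On each tick, apply the Life rules, announce surviving cells to
       the neighbours reached via the routing tables, and reset. */
    void tick_callback(uint ticks, uint dummy)
    {
        for (uint i = 0; i < CELLS_PER_CORE; i++) {
            uint n = neighbour_count[i];
            cell_state[i] = (n == 3) || (cell_state[i] && n == 2);
            neighbour_count[i] = 0;
            if (cell_state[i])
                spin1_send_mc_packet(key_for_cell(i), 0, 0);  /* no payload */
        }
    }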


VIII. PARTITIONING AND CONFIGURATION MANAGER

The example described in Section VII shows how API-based applications can set up the simulation parameters, SDRAM content, and routing tables with an algorithmic process. While for simple or highly structured problems this is possible, modeling networks with arbitrary interconnectivity and arbitrary neural types is a problem where a further level of abstraction can be introduced. Configuring a million-core machine, with each core modeling up to a thousand neurons and a million synapses, rapidly becomes an intractable problem: one billion neurons need to be mapped and one trillion synapses need to be routed to implement a user-specified model.

To solve this problem, we introduce PACMAN [10], a software layer that enables users to write their model using a standardized interface, translate it, and run it on SpiNNaker. The software is designed to keep different concerns separated: users interface with the platform through domain-specific neural languages already present in the scientific milieu, such as PyNN [12] or Nengo [13]. PACMAN is the set of algorithms that translate a model into machine-executable code. Such algorithms operate on data representing the network model, information about the system (topology, fault status, etc.), and methods for data structure translation.

PACMAN maps, routes, and translates network models using populations of neurons and projections between them, rather than single neurons and synapses. This approach reduces the complexity of the algorithms involved in the translation process by exploiting the hierarchies present in a neural network. This choice is justified by studies on the structure of the central nervous system, where functionally segregated areas are interconnected by axonal pathways [14], and where cortical areas show a remarkably regular laminar structure, with different layers of neurons stereotypically connected in a canonical circuit [15]. Finally, many neural languages [12], [13], [16], [17] use this abstraction natively, making it a natural choice.

Using a neural language as a user interface makes the platform more accessible to nonexperts, giving the users a familiar environment to develop models and analyze results, while hiding the complexity of configuring a parallel system and encouraging model sharing across different platforms. The translation process is performed by PACMAN as illustrated in Fig. 7, which shows the flow of the algorithms used to translate and execute the models (left) and the data representations they work on (right).

Fig. 7. The flow of algorithms (left) and the data representations they work on (right) within PACMAN.

The model is represented in terms of populations and projections in the model view. It is then partitioned, splitting populations while preserving their interconnectivity structure, according to machine-specific constraints that depend on the neural and synaptic capacity of each core. The model is represented in a digraph-like structure (PACMAN view), and then mapped and routed on a physical machine instance (system view), using the information present in the system library. Finally, the whole model is translated into machine-executable code for each component (ARM cores, SDRAM, routers), using the translation mechanism stored in the model library, loaded onto the system, and executed.

A simple example network is illustrated in Fig. 8 (left): excitatory and inhibitory populations are recurrently interconnected. The ratio of excitatory to inhibitory neurons is set to 4 : 1 to keep a balance between excitation and inhibition.

Fig. 8. Example network (left) with one excitatory and one inhibitory population with a size ratio of 4 : 1 is mapped by PACMAN onto 60 processors on four SpiNNaker chips (right).

The network can be represented in PyNN [12], first by creating the two populations of neurons, for a total of n neurons, with a set of parameters:

    cell_params = {'tau_refrac': 5.0, 'v_thresh': -50.0,
                   'v_reset': -60.0, 'tau_m': 20.0, 'tau_syn_E': 5.0,
                   'tau_syn_I': 10.0, 'v_rest': -49.0, 'cm': 0.2}
    ex  = Population(n - n/5, IF_curr_exp, cell_params)
    inh = Population(n/5, IF_curr_exp, cell_params)

The resting potential is located above the threshold potential to induce spontaneous firing in all cells. Populations are interconnected by means of a FixedProbabilityConnector, which connects all the neurons in the presynaptic population to all the neurons in the postsynaptic population with a probability p, weight w, and delay d.


to all the neurons in the postsynaptic population with a pro- I X. TYPICAL APPLICATIONS
bability p, weight w, and delay d. In this section, we review some scenarios highlighting the
con ¼ FixedProbabilityConnector ðp connect ¼ flexibility of the SpiNNaker platform, and present an
p; w; d) experiment running on a robot equipped with AER sensors
e e ¼ Projection ðex; ex; con; target ¼ and a 48-node SpiNNaker board.
‘excitatory’) With the hardware and software infrastructure pre-
e i ¼ Projection ðex; in; con; target ¼ sented in the previous sections we have simulated networks
‘excitatory’) with up to 250 000 neurons and 80 million synapses in real
i i ¼ Projection ðin; in; con; target ¼ time on a 48-node SpiNNaker board (as shown in Fig. 2)
‘inhibitory’) within a power budget of 1 W per SpiNNaker package
i e ¼ Projection ðin; ex; con; target ¼ (containing a SpiNNaker chip and a 128-MB SDRAM; see
‘inhibitory’) Fig. 3). In terms of spike delivery (the dominant cost in
neural simulations [18]) and power consumption, these
All the projections coming from the excitatory population
experiments show 1.8 billion connections per second, using
target excitatory synapses; conversely, all the projections
a few nanojoules per event and per neuron [19], and
coming from the inhibitory population target inhibitory
represent the maximum sustainable throughput of the
synapses.
system with the current software infrastructure.
PACMAN automatically partitions and maps the
Good power efficiency has also been demonstrated in a
network as illustrated in Fig. 8 (right), which shows an
biologically plausible model of cortical microcircuitry
example where the total number of neurons n is 6000, and
inspired by previous work [15], [20], comprising 10 000
each core maps 100 neurons. As a result, the model needs
Izhikevich neurons, replicating spiking dynamics found in
to be partitioned into 48 excitatory and 12 inhibitory
the cortex, and 40 million synapses in real time [21], while
subgroups, each to be allocated to a single core of a
the flexibility of the platform can be used to explore novel
physical machine, with the system library providing the
algorithms for learning [22].
geometry (in this case, a four-chip board) and the
functional status of the platform. The model library
provides the translation methods for the IF_curr_exp A. Interface With Nengo
neuron type (a leaky integrate and fire with exponential While with PyNN it is possible to define arbitrary
decaying synapses), its parameters, and its synapses. Fig. 9 network structures, using the neural engineering frame-
shows results of 1 s of simulation in the form of a raster work (NEF) [23], it is possible to encode functions and
plot, where each dot represents a spike from a neuron dynamical systems in networks of spiking neurons. Using
(ordinate) in time (abscissa). Red (blue) dots represent the NEF, it is possible to build complex cognitive
spikes from excitatory (inhibitory) neurons; the inter- architectures such as SPAUN [24], a spike-based functional
connectivity parameters are set to give rise to the model of the brain that makes comparisons with human
oscillatory activity shown in the figure. neural and behavioral data possible. SpiNNaker has,
therefore, been interfaced with Nengo [25], the software
that implements the NEF, enabling users to create neural
networks by specifying the functions to be computed [13].
Nengo translates the functions into neural circuitry by
calculating neuronal and connectivity parameters, while
PACMAN distributes and configures the model on the
board. Through the use of the NEF, SpiNNaker becomes a
‘‘neural computational box’’: input values and vectors are
encoded in spiking activity using the NEF principles
directly on the SpiNNaker board. The desired computation
is performed in real time by spiking neurons, and output
values and vectors are decoded from spiking activity.
Interfacing with Nengo shows how different front–ends
can be interfaced with PACMAN and how flexibly the
platform can be programmed with specialized neural
kernels, such as the ones performing the NEF encoding
and decoding processes.

Fig. 9. Raster plot of the results of running the simulation of the


network shown in Fig. 8. Each dot represents one neural spike; red dots B. Interface With AER sensors
are excitatory neurons, and blue dots are inhibitory neurons. Biological inspiration is not confined to the exploration
Oscillatory activity is visible across the network. of computational architectures and methods, but is also


Millisecond-precise pulse encoding has been used to explain the ability of the visual system to process information and to recognize complex, dynamical scenes quickly [27]. With the first observable differences in the temporal lobe starting 150 ms after the stimulus onset, and with several synaptic stages required to arrive at the infero-temporal cortex (IT, the visual area where object recognition takes place), neurons can emit at most one spike to encode the information, and are believed to encode it in the spike timing [28].

AER sensors can be used to exploit the temporal characteristics of sensory information with event-based approaches. Silicon retinae [29]-[31], for example, take inspiration from their biological counterparts to implement an alternative approach to frame-based image processing on a neuromorphic substrate. Each pixel operates asynchronously, sending an AER message within a few microseconds of a local light intensity change without having to wait for a complete frame to be scanned, resulting in a reduction of latency and redundancy in visual information transmission. For an example showing the benefits of event-based over frame-based systems, see the European Union "Convolution Address Event Representation (AER) Vision Architecture for Real-Time" project [32].

These sensors use native event-based processing and AER representation to encode sensory information, and can, therefore, be interconnected directly to SpiNNaker, which acts as an event-based computing platform. In collaboration with the Instituto de Microelectronica de Sevilla (Seville, Spain), we have connected a silicon retina to SpiNNaker using an FPGA [33], which translates incoming retinal AER events to the self-timed 2-of-7 protocol used by SpiNNaker interchip links, directly injecting spikes (MC packets) into the packet-switched network fabric. Using this mechanism, the sensor is represented on SpiNNaker as a "virtual chip." At the model level, the silicon retina can be instantiated in PyNN as:

    pol_0, pol_1 = p.instantiate_retina()

creating two populations ("pol_0" and "pol_1", one for each "polarity," encoding increasing and decreasing luminance, respectively) where neurons are topographically organized in a 2-D visual field. These populations produce spikes whenever the silicon retina emits an event, and can arbitrarily be interconnected to other populations in the model. PACMAN automatically maps each population to a specific model instantiation, preserving the connectivity information.

Analogous interfaces with AER sensors have been developed in collaboration with the Institute of Neuroinformatics (Zurich, Switzerland; using the DVS sensor [30] and the "silicon cochlea" [34]), with the Biology Group at the University of Osaka (Osaka, Japan; using a sensor inspired by the sustained and transient responses of the retina [35]), and with the Institute of Vision (Paris, France; using the ATIS silicon retina [36]).

C. Integration With Robotic Platforms

While integration with AER sensors exploits the event-driven nature of the system, interfacing it with robotic platforms in real environments shows SpiNNaker's real-time characteristics.

As with AER sensors, the robotic platform becomes available at the model level using PyNN or Nengo, while the system is configured automatically using PACMAN, enabling message transmission to and from the robot and the sensors through a small customized interface board [37]. The robot is a custom omnidirectional mobile platform, with embedded low-level motor control and elementary sensory systems, developed by the Neuroscientific System Theory group of the Technische Universität München (Munich, Germany). The overall system is a standalone, autonomous, reconfigurable robotic platform with no personal computer in the loop.

We demonstrate a closed perception-action loop in an example where the robot agent has to discriminate between two different stimuli and move toward the preferred one (a "+"), while backing off from the detractor (an "x"). This is a small model that uses less than 10% of the resources on the 48-node board, but it serves to illustrate a number of the capabilities of the system.

The network structure used is represented in Fig. 10. The two populations representing the different polarities of a 128 x 128 silicon retina are instantiated, as illustrated in Section IX-B. These populations are connected to four different feature maps, representing the result of the convolution between the retinal input and a kernel represented as the white insert in the four feature maps in Fig. 10, where the black lines represent excitatory connections while the white surround represents inhibitory flanks. This operation, computed in parallel by all feature maps by means of spiking events, is similar to the one performed by the mammalian primary visual cortex, where cells are selectively active according to the stimulus orientation [38], as previously done in a model of visual attention running on a four-node SpiNNaker board [33]. Different feature maps inhibit each other in order to enhance response contrast. The following layer behaves as a local combination of oriented edge detectors, similar to the first layers of the HMAX model, a model of object recognition inspired by the visual cortex [39]. If the "+" is recognized (as a combination of vertical and horizontal edges), the agent is driven forward toward the preferred stimulus; conversely, if an "x" is detected as a combination of +/-45 degree oriented lines, the robot moves backward.

Robot movements are controlled by the output population, comprising two "motor" neurons (one for moving forward and one for moving backward), represented by the two vertical bars in Fig. 10.

The retina and the robot are accessible through PyNN, which is also used to describe the rest of the network model, performing different steps of visual processing and orienting its response to the location where a preferred stimulus is detected.
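The event-driven convolution underlying these feature maps can be sketched in a few lines of C; the sizes and names below are illustrative assumptions, not the model's actual parameters:

    #define K 5                        /* kernel side, assumed */
    #define W 128                      /* retina and feature-map width */

    static int input_current[W][W];    /* accumulated input per neuron */
    static int kernel[K][K];           /* centre excitatory, flanks inhibitory */

    /* One retinal event at (x, y): add the kernel, centred on (x, y),
       into the feature map's input currents; neurons then spike when
       their accumulated input crosses threshold. */
    void retina_event(int x, int y)
    {
        for (int dy = -(K / 2); dy <= K / 2; dy++)
            for (int dx = -(K / 2); dx <= K / 2; dx++) {
                int u = x + dx, v = y + dy;
                if (u >= 0 && u < W && v >= 0 && v < W)
                    input_current[v][u] += kernel[dy + K / 2][dx + K / 2];
            }
    }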


Fig. 10. Example robotic closed perception-action loop. A "+" is shown to the robot, which extracts and combines the vertical and horizontal lines, moving forward. Gray kernels and dashed lines represent the fact that the pathways for the "x" detection are not activated, as a "+" is presented.

X. FUTURE PLANS

Current SpiNNaker hardware has seen use across the computational neuroscience and neurorobotic communities. All of the major hardware functions required to build larger machines have now been developed and tested, and the remaining tasks to build larger machines are now primarily related to the manufacture of further packages and PCBs.

A major commitment over the next two years is to deliver a machine with at least half a million processors as a contribution to the European Union Flagship Human Brain Project (HBP), where SpiNNaker will be one of the neuromorphic "platforms" offered to the wider HBP community.

An earlier, less formal, commitment is to demonstrate the capability of SpiNNaker to support a real-time implementation of the University of Waterloo (Waterloo, ON, Canada) SPAUN model [24]. This is expected to require a system of around 36 48-node SpiNNaker boards, or 30 000 processors, though this estimate should come down with Nengo support for sparse connectivity and reduced firing rates, and will be a solid demonstration of the capability of the SpiNNaker machine as a platform to support large-scale real-time spiking neural models.

XI. RELATED WORK

While SpiNNaker represents a particular combination of digital many-core computing with a lightweight communications infrastructure tuned to modeling large-scale spiking neural networks in biological real time, there are a number of other designs that take a different approach to achieve similar end goals [40].


number of other designs that take a different approach to problem domain and developing the architecture, silicon,
achieve similar end goals [40]. These various approaches and software infrastructure. While the software develop-
can be classified according to whether they use digital or ment will be ongoing, the architecture and silicon are now
analog technology to model the neurons and synapses, the working reliably and delivering very much as originally
communications topology employed, and the support for anticipated [1].
synaptic plasticity. The process of delivering the potential of the SpiNNaker
Digital models may be implemented on conventional platform is now underway, and early indications are largely
general-purpose computers, including cluster machines positive. The platform is proving flexible, relatively easy to
and high-performance computers, or on special-purpose use (though there is always room for improvement in this
hardware such as FPGAs [41], [42], graphics professor dimension), and capable of delivering useful results across a
units [43], or custom silicon [44]. Analog models [45] may wide range of application areas.
be subthreshold [46], whereupon biological real-time As the platform is scaled up toward the ultimate
performance is achievable, or above threshold [47], where million-core machine, new challenges will emerge, partic-
the circuits are likely to be much faster than biological real ularly in the area of management, application mapping and
time. Notable large-scale projects include the following. loading performance, the observability of activity within
the machine, and most notably with debugging large-scale
• The Stanford Neurogrid [46] employs subthresh- models running on the machine. All of these are ongoing
old analog circuits with digital spanning tree AER areas of research and development, but with help and
communications [48] for real-time neural model- feedback from a growing (and so far very forgiving)
ing. Neurogrid can model a million neurons in real community of users, and secure funding within the HBP
time while consuming only 3 W. It combines alongside a number of other funded projects that will
unicast and multicast digital routing with analog support extensive use of the platform at the University of
signaling across a local ‘‘diffusion network.’’ Manchester [including a European Research Council
• The IBM neurosynaptic core [49] employs custom Advanced Grant and several Engineering and Physical
digital circuits to achieve a one-to-one correspon- Sciences Research Council (EPSRC)-funded collaborations],
dence between the hardware and software simulation we are committed to continued improvement of the
models. It is intended to form a generic cognitive capabilities of the platform.
subsystem [44]. It uses AER communication. The time is right to scale up our ambition to under-
• The Heidelberg HICANN system [47] employs stand the information processing principles at work in the
wafer-scale above threshold analog circuits that brain, and the SpiNNaker platform has been designed to
operate at 104 x biological real time using a two- deliver a broad capability to support this ambition. The
layer AER protocol, one layer for intrawafer next five years will be crucial in determining the extent to
communication and a second layer for interwafer which we can succeed in delivering a platform with the
communication. capabilities required to support the global brain research
• The Cambridge BlueHive system [41] employs digital program. h
circuits on FPGAs to deliver real-time performance.
The communication is not pure AER; multicast is
implemented using a set of ‘‘fan-out’’ messages that
carry the destination, weight, and delay. Acknowledgment
The SpiNNaker project has benefited from contr-
These examples illustrate the diversity of approaches ibutions from many people in the team at the University
taken to address the problem of modeling large-scale of Manchester, collaborators at the universities of
systems of spiking neurons in real time or faster. There Southampton, Cambridge, and Sheffield, industry partners,
are arguments on both sides of the analog/digital divide and many external collaborators, only some of whom has
(for example, energy-efficiency favors analog, whereas there been space to mention in the text, but the authors
flexibility and repeatability favors digital), and on most would like to note here the contribution of J. Conradt of
other design decisions, so the area is still wide open to new the Technische Universität München (Munich, Germany)
ideas, and rather lacking in robust benchmarks that can who developed the robot platform shown in Fig. 10. They
be used to make quantitative comparisons between alter- particularly wish also to acknowledge the benefits accrued
native approaches. from participation in the Capo Caccia and Telluride
neuromorphic workshops, and they are grateful to the
organizers for the opportunities for collaborations that have
XII. CONCLUSION emerged from these workshops. The authors would also
The SpiNNaker project has been 15 years since conception like to acknowledge the helpful comments and feedback
and eight years in (funded) execution. Much time and from the anonymous reviewers of the first draft of this
effort has gone into understanding the brain-modeling paper.


REFERENCES

[1] S. Furber and S. Temple, "Neural systems engineering," J. Roy. Soc. Interface, vol. 4, no. 13, pp. 193-206, 2007.
[2] J. J. Wade, L. J. McDaid, J. Harkin, V. Crunelli, and J. A. S. Kelso, "Bidirectional coupling between astrocytes and neurons mediates learning and dynamic coordination in the brain: A multiple modeling approach," PLoS ONE, vol. 6, no. 12, 2011, DOI: 10.1371/journal.pone.0029445.
[3] M. Mahowald, An Analog VLSI System for Stereoscopic Vision. Boston, MA, USA: Kluwer, 1994.
[4] K. Boahen, "Point-to-point connectivity between neuromorphic chips using address events," IEEE Trans. Circuits Syst., vol. 47, no. 5, pp. 416-434, May 2000.
[5] Top 500 Supercomputers, Jun. 2013. [Online]. Available: http://top500.org/lists/2013/06/
[6] S. B. Furber, D. R. Lester, L. A. Plana, J. D. Garside, E. Painkras, S. Temple, and A. D. Brown, "Overview of the SpiNNaker system architecture," IEEE Trans. Comput., vol. 62, no. 12, pp. 2454-2467, Dec. 2013.
[7] D. Vainbrand and R. Ginosar, "Scalable network-on-chip architecture for configurable neural networks," Microprocessors Microsyst., vol. 35, no. 2, pp. 152-166, 2011.
[8] E. Painkras, L. A. Plana, J. D. Garside, S. Temple, F. Galluppi, C. Patterson, D. R. Lester, A. D. Brown, and S. B. Furber, "SpiNNaker: A 1 W 18-core system-on-chip for massively-parallel neural network simulation," IEEE J. Solid-State Circuits, vol. 48, no. 8, pp. 1943-1953, Aug. 2013.
[9] C. Patterson, F. Galluppi, A. D. Rast, and S. B. Furber, "Visualising large-scale neural network models in real-time," in Proc. Int. Joint Conf. Neural Netw., Brisbane, Australia, Jun. 10-15, 2012, DOI: 10.1109/IJCNN.2012.6252490.
[10] F. Galluppi, S. Davies, A. D. Rast, T. Sharp, L. A. Plana, and S. B. Furber, "A hierarchical configuration system for a massively parallel neural hardware platform," in Proc. 9th Conf. Comput. Front., 2012, pp. 183-192.
[11] T. Sharp, L. A. Plana, F. Galluppi, and S. B. Furber, "Event-driven simulation of arbitrary spiking neural networks on SpiNNaker," Neural Information Processing, Lecture Notes in Computer Science, vol. 7064, Berlin, Germany: Springer-Verlag, 2011, pp. 424-430.
[12] A. Davison, D. Brüderle, J. Eppler, J. Kremkow, E. Muller, D. Pecevski, L. Perrinet, and P. Yger, "PyNN: A common interface for neuronal network simulators," Front. Neuroinf., vol. 2, 2008, DOI: 10.3389/neuro.11.011.2008.
[17] D. Goodman, "Brian: A simulator for spiking neural networks in Python," Front. Neuroinf., vol. 2, 2008, DOI: 10.3389/neuro.11.005.2008.
[18] H. Plesser, J. Eppler, A. Morrison, M. Diesmann, and M. O. Gewaltig, "Efficient parallel simulation of large-scale neuronal networks on clusters of multiprocessor computers," Euro-Par 2007 Parallel Processing, vol. 4641, Berlin, Germany: Springer-Verlag, 2007, pp. 672-681.
[19] E. Stromatias, F. Galluppi, C. Patterson, and S. Furber, "Power analysis of large-scale, real-time neural networks on SpiNNaker," in Proc. Int. Joint Conf. Neural Netw., 2013, pp. 1570-1577.
[20] E. Izhikevich and G. Edelman, "Large-scale model of mammalian thalamocortical systems," Proc. Nat. Acad. Sci. USA, vol. 105, no. 9, pp. 3593-3598, Mar. 2008.
[21] T. Sharp, F. Galluppi, A. D. Rast, and S. B. Furber, "Real-time simulation of detailed cortical microcircuits on SpiNNaker," J. Neurosci. Methods, vol. 210, no. 1, pp. 110-118, 2012.
[22] S. Davies, F. Galluppi, A. D. Rast, and S. B. Furber, "A forecast-based STDP rule suitable for neuromorphic implementation," J. Neural Netw., vol. 32, pp. 3-14, 2012.
[23] C. Eliasmith and C. H. Anderson, Neural Engineering: Computation, Representation, and Dynamics in Neurobiological Systems. Cambridge, MA, USA: MIT Press, 2003.
[24] C. Eliasmith, T. Stewart, X. Choo, T. Bekolay, T. DeWolf, Y. Tang, and D. Rasmussen, "A large-scale model of the functioning brain," Science, vol. 338, no. 6111, pp. 1202-1205, 2012.
[25] F. Galluppi, S. Davies, S. Furber, T. Stewart, and C. Eliasmith, "Real time on-chip implementation of dynamical systems with spiking neurons," in Proc. Int. Joint Conf. Neural Netw., 2012, DOI: 10.1109/IJCNN.2012.6252706.
[26] C. A. Mead, Analog VLSI and Neural Systems. Reading, MA, USA: Addison-Wesley, 1989.
[27] S. Thorpe, D. Fize, and C. Marlot, "Speed of processing in the human visual system," Nature, vol. 381, pp. 520-522, 1996.
[28] R. Van Rullen and S. Thorpe, "Rate coding versus temporal order coding: What the retinal ganglion cells tell the visual cortex," Neural Comput., vol. 13, no. 6, pp. 1255-1283, 2001.
[29] C. Mead and M. Mahowald, "A silicon model of early visual processing," Neural Netw., vol. 1, no. 1, pp. 91-97, 1988.
[30] P. Lichtsteiner, C. Posch, and T. Delbruck, "A 128 x 128 120 dB 15 us latency asynchronous temporal contrast vision sensor," IEEE J. Solid-State Circuits, vol. 43, no. 2, pp. 566-576, Feb. 2008.
[32] R. Serrano-Gotarredona et al., "CAVIAR: A 45k neuron, 5M synapse, 12G connects/s AER hardware sensory-processing-learning-actuating system for high-speed visual object recognition and tracking," IEEE Trans. Neural Netw., vol. 20, no. 9, pp. 1417-1438, Sep. 2009.
[33] F. Galluppi, K. Brohan, S. Davidson, T. Serrano-Gotarredona, J. Carrasco, B. Linares-Barranco, and S. B. Furber, "A real-time, event-driven neuromorphic system for goal-directed attentional selection," in Neural Information Processing, vol. 7664, Berlin, Germany: Springer-Verlag, 2012, pp. 226-233.
[34] A. van Schaik and S. Liu, "AER EAR: A matched silicon cochlea pair with address event representation interface," in Proc. IEEE Int. Symp. Circuits Syst., 2005, pp. 4213-4216.
[35] S. Kameda and T. Yagi, "An analog VLSI chip emulating sustained and transient response channels of the vertebrate retina," IEEE Trans. Neural Netw., vol. 14, no. 5, pp. 1405-1412, Sep. 2003.
[36] C. Posch, D. Matolin, and R. Wohlgenannt, "A QVGA 143 dB dynamic range asynchronous address-event PWM dynamic image sensor with lossless pixel-level video compression," in Proc. Int. Solid-State Circuits Conf., 2010, pp. 400-401.
[37] C. Denk, F. Llobet-Blandino, F. Galluppi, L. Plana, S. Furber, and J. Conradt, "Real-time interface board for closed-loop robotic tasks on the SpiNNaker neural computing system," Artificial Neural Networks and Machine Learning - ICANN 2013, vol. 8131, Berlin, Germany: Springer-Verlag, 2013, pp. 467-474.
[38] D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture of the cat's cortex," J. Physiol., vol. 160, pp. 106-154, 1962.
[39] T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, and T. Poggio, "Robust object recognition with cortex-like mechanisms," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 3, pp. 411-426, Mar. 2007.
[40] R. Cattell and A. Parker, "Challenges for brain emulation: Why is it so difficult?" Natural Intell., vol. 1, no. 3, pp. 17-31, 2012.
[41] P. J. Fox, S. W. Moore, S. J. T. Marsh, A. T. Markettos, and A. Mujumdar, "Bluehive - A field-programmable custom computing machine for extreme-scale real-time neural network simulation," in Proc. IEEE 20th Int. Symp. Field-Programmable Custom Comput. Mach., 2012, pp. 133-140.
[42] A. S. Cassidy, J. Georgiou, and A. G. Andreou, "Design of silicon brains in the nano-CMOS era: Spiking neurons, learning synapses and neural architecture optimization," Neural Netw., vol. 45, pp. 4-26, Jun. 2013.
[43] A. K. Fidjeland and M. P. Shanahan, "Accelerated simulation of spiking neural networks using GPUs," in Proc. Int. Joint Conf. Neural Netw., 2010.
neuro.11.011.2008. temporal contrast vision sensor,’’ IEEE J. Solid
Neural Netw., Jul. 2010, DOI: 10.1109/IJCNN.
[13] T. C. Stewart, B. Tripp, and C. Eliasmith, State Circuits, vol. 43, no. 2, pp. 566–576,
2010.5596678.
‘‘Python scripting in the Nengo simulator,’’ Feb. 2008.
[44] J. V. Arthur, P. A. Merolla, F. Akopyan,
Front. Neuroinf., vol. 3, 2009, DOI: 10.3389/ [31] J. Lenero-Bardallo, T. Serrano-Gotarredona,
R. Alvarez-Icaza, A. Cassidy, S. Chandra,
neuro.11.007.2009. and B. Linares-Barranco, ‘‘A 3.6s latency
S. K. Esser, N. Imam, W. Risk, D. Rubin,
[14] P. Hagmann, L. Cammoun, X. Gigandet, asynchronous frame-free event-driven
R. Manohar, and D. S. Modha, ‘‘Building
R. Meuli, C. J. Honey, V. J. Wedeen, and dynamic-vision-sensor,’’ IEEE J. Solid State
block of a programmable neuromorphic
O. Sporns, ‘‘Mapping the structural core of Circuits, vol. 46, no. 6, pp. 1443–1455,
substrate: A digital neurosynaptic core,’’ in
human cerebral cortex,’’ PLoS Biol., vol. 6, Jun. 2011.
Proc. Int. Joint Conf. Neural Netw., Jun. 2012,
no. 7, Jul. 2008, DOI: 10.1371/journal.pbio. [32] R. Serrano-Gotarredona, M. Oster, DOI: 10.1109/IJCNN.2012.6252637.
0060159. P. Lichtsteiner, A. Linares-Barranco,
[45] G. Indiveri, B. Linares-Barranco, T. Hamilton,
[15] T. Binzegger, R. Douglas, and K. Martin, ‘‘A R. Paz-Vicente, F. Gomez-Rodriguez,
A. Van Schaik, R. Etienne-Cummings,
quantitative map of the circuit of cat primary L. Camunas-Mesa, R. Berner,
T. Delbruck, S. C. Liu, P. Dudek, P. Hafliger,
visual cortex,’’ J. Neurosci., vol. 24, no. 39, M. Rivas-Perez, T. Delbruck, S. Liu,
S. Renaud, J. Schemmel, G. Cauwenberghs,
pp. 8441–8453, Sep. 2004. R. Douglas, P. Hafliger, G. Jimenez-Moreno,
J. Arthur, K. Hynna, F. Folowosele, S. Saighi,
A. Civit Ballcels, T. Serrano-Gotarredona,
[16] O. Gewaltig, M. Diesmann, and T. Serrano-Gotarredona, J. Wijekoon,
A. Acosta-Jimenez, and B. Linares-Barranco,
M.-O. Gewaltig, ‘‘NEST (NEural Simulation Y. Wang, and K. Boahen, ‘‘Neuromorphic
‘‘CAVIAR: A 45 k neuron, 5 M synapse,
Tool),’’ Scholarpedia, vol. 2, no. 4, 2007, DOI: silicon neuron circuits,’’ Front. Neurosci.,
12 G connects/s AER hardware
10.4249/scholarpedia.1430. sensory-processing- learning-actuating system

664 Proceedings of the IEEE | Vol. 102, No. 5, May 2014


Furber et al: The SpiNNaker Project

vol. 5, no. 23, May 2011, DOI: 10.3389/fnins. neuromorphic hardware system for [49] P. Merolla, J. Arthur, F. Akopyan, N. Imam,
2011.00073. large-scale neural modeling,’’ in Proc. Int. R. Manohar, and D. S. Modha, ‘‘A digital
[46] R. Silver, K. Boahen, S. Grillner, N. Kopell, Symp. Circuits Syst., 2010, pp. 1947–1950. neurosynaptic core using embedded crossbar
and K. Olsen, ‘‘Neurotech for neuroscience: [48] P. Merolla, J. Arthur, R. Alvarez, J.-M. Bussat, memory with 45 pj per spike in 45 nm,’’ in
Unifying concepts, organizing principles, and K. Boahen, ‘‘A multicast tree router for Proc. Custom Integr. Circuits Conf., Sep. 2011,
and emerging tools,’’ J. Neurosci., multichip neuromorphic systems,’’ IEEE DOI: 10.1109/CICC.2011.6055294.
pp. 11807–11819, Oct. 2007. Trans. Circuits Syst., 2013, DOI: 10.1109/TCSI.
[47] J. Schemmel, D. Bruderle, A. Grubl, M. Hock, 2013.2284184.
K. Meier, and S. Millner, ‘‘A wafer-scale

ABOUT THE AUTHORS


Steve B. Furber (Fellow, IEEE) was born in Manchester, U.K., in 1953. He received the B.A. degree in mathematics and the Ph.D. degree in aerodynamics from the University of Cambridge, Cambridge, U.K., in 1974 and 1980, respectively, and honorary doctorates from Edinburgh University, Edinburgh, U.K., in 2010 and Anglia Ruskin University, Cambridge, U.K., in 2012.

From 1978 to 1981, he was Rolls Royce Research Fellow in Aerodynamics at Emmanuel College, Cambridge, U.K., and from 1981 to 1990, he was at Acorn Computers Ltd., Cambridge, U.K., where he was a principal architect of the BBC Microcomputer, which introduced computing into most U.K. schools, and the ARM 32-bit RISC microprocessor, over 40 billion of which have been shipped by ARM Ltd.'s partners. In 1990, he moved to the ICL Chair in Computer Engineering at the University of Manchester, Manchester, U.K., where his research interests include asynchronous digital design, low-power systems on chip, and neural systems engineering.

Prof. Furber is a Fellow of the Royal Society, the Royal Academy of Engineering, the British Computer Society, the Institution of Engineering and Technology, and the Computer History Museum (Mountain View, CA). He was a Millennium Technology Prize Laureate (2010) and holds an IEEE Computer Society Computer Pioneer Award (2013).

Steve Temple received the B.A. degree in computer science and the Ph.D. degree for research into local area networks from the University of Cambridge, Cambridge, U.K., in 1980 and 1984, respectively.

He was subsequently employed as a Research Fellow at the University of Cambridge Computer Laboratory. He was a self-employed computer consultant from 1986 to 1993, when he took up his current post of Research Fellow in the School of Computer Science, University of Manchester, Manchester, U.K.

Luis A. Plana (Senior Member, IEEE) received the Ingeniero Electrónico degree (cum laude) from Universidad Simón Bolívar, Caracas, Venezuela, in 1978, the M.Sc. degree in electrical engineering from Stanford University, Stanford, CA, USA, in 1980, and the Ph.D. degree in computer science from Columbia University, New York, NY, USA, in 1998.

He was with Universidad Politécnica, Venezuela, for over 20 years, where he was a Professor of Electronic Engineering. Currently, he is a Research Fellow with the School of Computer Science, University of Manchester, Manchester, U.K. His research interests include embedded system design and on-chip interconnect.
Francesco Galluppi received the B.Sc. degree in electronic engineering from the University of Rome "Roma Tre," Rome, Italy, in 2005, the M.Sc. degree in cognitive psychology from the University of Rome "La Sapienza," Rome, Italy, in 2010, and the Ph.D. degree from the University of Manchester, Manchester, U.K., in 2013, working on neurally inspired hardware systems and sensors.

He visited the Departamento de Tecnología Electrónica, University of Málaga, Málaga, Spain, in 2007, working on robotic assistive technologies. His interests lie in studying the biological basis of human behavior, and how technology interacts with it.
