Homogeneous and Heterogeneous Multicore Systems
Priyadharshini S
Engineer, ADAS Continental Autonomous Mobility,
Bangalore, India
Multicore systems can be categorized as homogeneous multicore systems and heterogeneous multicore systems [1][3].

Homogeneous (Symmetric) Multicore Systems
Homogeneous cores are uniform in their attributes: every core can execute the same functions and has an equivalent set of capabilities.

Personal computer processors have homogeneous cores, which means that the power consumed by a task does not depend on which of the processor's cores carries it out. A task is expected to take the same time to complete irrespective of which core it is scheduled on. Homogeneous multicore systems can be operated with a separate OS scheduler per core or with a common OS scheduler for all cores in the system [5]. A popular example of a homogeneous system is the ARM quad-core Cortex-A53 system [17].

Heterogeneous (Asymmetric) Multicore Systems
Heterogeneous cores are not identical. They can differ in capabilities and speed, may lack certain features, or may otherwise perform a task differently.

Modern high-end mobile phones tend to have heterogeneous cores. Each core has its own specified functionality, with different performance and power dissipation. Heterogeneous multicore systems can be operated with a separate OS scheduler per core. The Renesas R-Car-M3Ne uses a heterogeneous architecture.
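The difference between the two categories can be illustrated with a minimal sketch. It assumes a simple model in which each core has a single `relative_speed` figure; the `Core` class, the function names and the 2:1 speed ratio are hypothetical and are not taken from any specific processor.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Core:
    name: str
    relative_speed: float  # hypothetical: work units completed per time unit


def completion_times(work_units: float, cores: List[Core]) -> Dict[str, float]:
    """Time the same task would take on each core, keyed by core name."""
    return {core.name: work_units / core.relative_speed for core in cores}


def is_placement_independent(work_units: float, cores: List[Core]) -> bool:
    """True when the task finishes at the same time on every core."""
    return len(set(completion_times(work_units, cores).values())) == 1


# Homogeneous: four identical cores.
homogeneous = [Core(f"core{i}", 1.0) for i in range(4)]

# Heterogeneous: two faster and two slower cores (the 2:1 ratio is made up).
heterogeneous = [Core("fast0", 2.0), Core("fast1", 2.0),
                 Core("slow0", 1.0), Core("slow1", 1.0)]

print(completion_times(8.0, homogeneous))
print(completion_times(8.0, heterogeneous))
```

In this model the scheduler can place a task on any homogeneous core without affecting its completion time, whereas on the heterogeneous set the placement decision changes the result.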
The private peripheral bus is an APB (Advanced Peripheral Bus) that is used to connect additional on-chip peripherals such as debug components. Fig. 4 [21] shows the AMBA bus architecture devised for ARM processors.

B. CoreConnect
CoreConnect is an on-chip silicon bus that was developed by IBM for FPGA or ASIC designs [12]. It comprises three levels, namely the Processor Local Bus (PLB), the On-chip Peripheral Bus (OPB) and the Device Control Register (DCR) bus, as shown in Fig. 5.

The PLB is the main on-chip bus and is used to connect processors with on-chip memory, memory controllers and other high-speed peripherals, including DMA controllers. It is a synchronous, high-performance, arbitrated bus. It supports concurrent read and write operations for the same master. A central arbitration mechanism is used to grant access.
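The idea of a central arbiter granting bus access to one of several masters can be sketched as follows. The round-robin policy, the class name `RoundRobinArbiter` and its interface are assumptions chosen for illustration; they are not taken from the CoreConnect specification, which defines its own arbitration scheme.

```python
from typing import Optional, Sequence


class RoundRobinArbiter:
    """Grants the bus to at most one requesting master per cycle.

    Assumed policy: priority rotates so that the master granted most
    recently is considered last in the next cycle.
    """

    def __init__(self, num_masters: int) -> None:
        if num_masters <= 0:
            raise ValueError("num_masters must be positive")
        self.num_masters = num_masters
        # Start so that master 0 has the highest priority in the first cycle.
        self._last_granted = num_masters - 1

    def grant(self, requests: Sequence[bool]) -> Optional[int]:
        """Return the index of the granted master, or None if nobody requests."""
        if len(requests) != self.num_masters:
            raise ValueError("one request flag is needed per master")
        for offset in range(1, self.num_masters + 1):
            candidate = (self._last_granted + offset) % self.num_masters
            if requests[candidate]:
                self._last_granted = candidate
                return candidate
        return None


arbiter = RoundRobinArbiter(num_masters=3)
# Masters 0 and 1 request the bus in every cycle; master 2 stays idle.
grants = [arbiter.grant([True, True, False]) for _ in range(4)]
print(grants)
```

With two masters requesting continuously, the grants alternate between them, so neither master is starved.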
The OPB is designed to support slower peripherals. It is implemented as a straightforward multimaster, arbitrated bus. It is fully synchronous, with a 32-bit address bus and a 32-bit data bus. Data transactions between master and slave take place in a single clock cycle, and burst operations are supported. The masters compete for the bus through arbitration. PLB transactions are converted into OPB transactions by the PLB-to-OPB bridge.

The DCR bus is used for configuring on-chip devices. It offers an alternative path to the system for setting the individual device control registers. A bus arbiter decides which one of the multiple masters will be allowed to control the bus for each bus cycle.

C. Wishbone
Wishbone is a portable interface for semiconductor IP cores [13][12]. Design reusability is made possible by establishing a shared, logical interface between IP cores. The system is therefore more reliable and portable, and the end user experiences a shorter time to market. Wishbone is a standard for building IP cores, not an IP core in and of itself. The handshaking protocol implemented in Wishbone allows each IP core to regulate its data transfer speed [14].

Wishbone supports various IP core interconnections, including:
Point-to-point
Data flow interconnection
Shared bus interconnection
Crossbar switch
Off-chip

Fig. 6 shows possible Wishbone topologies and how masters and slaves can be connected. Point-to-point is a simple way of connecting a master and a slave [19]. Data flow interconnection is a connection in which the data flows through a set of IP cores in a prearranged sequential order. In a shared bus interconnection, only one master can use the interconnection at a time, as it initiates the addressable bus cycles for a slave. A crossbar switch architecture allows several channels to operate in parallel, which increases the data transfer rate. In Fig. 6, dotted lines denote a possible connection. Similar connections can be made to establish communication channels. Off-chip interconnection is used when an interface extends off-chip [14].
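The throughput difference between a shared bus and a crossbar switch can be made concrete with a small sketch. It assumes an idealized model in which every transfer takes exactly one bus cycle and the crossbar is scheduled optimally; the function names and the example transfer list are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Transfer = Tuple[str, str]  # (master, slave)


def shared_bus_cycles(transfers: List[Transfer]) -> int:
    """Only one master may own a shared bus at a time: one transfer per cycle."""
    return len(transfers)


def crossbar_cycles(transfers: List[Transfer]) -> int:
    """Cycles needed on an ideal crossbar.

    Transfers between distinct master/slave pairs run in parallel, so the
    busiest single master or slave determines the total. With ideal
    scheduling this bound is reachable.
    """
    if not transfers:
        return 0
    master_load = Counter(master for master, _ in transfers)
    slave_load = Counter(slave for _, slave in transfers)
    return max(max(master_load.values()), max(slave_load.values()))


# Hypothetical workload: M0 and M2 both target slave S0, M1 targets S1.
workload = [("M0", "S0"), ("M1", "S1"), ("M2", "S0")]
print(shared_bus_cycles(workload), crossbar_cycles(workload))
```

In this workload the crossbar still serializes the two transfers to the same slave, which shows that its advantage depends on how evenly traffic is spread across endpoints.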
A. Isolation
Isolation physically and logically separates the resources and software programs between the multiple cores [7][11].

Physical isolation involves physically separating the resources, such as memory and I/O devices, between different cores. Each core has its own dedicated resources and there is no sharing between cores. This approach ensures that there is no interference between tasks or threads running on different cores. Fig. 7 shows how two cores in a multicore system are physically isolated from each other but may share some resources.

High availability is often valued more than performance. To achieve high system availability, redundant hardware is commonly used to detect errors. Chip multiprocessors with many identical resources, such as cores, memory and interconnection networks, would seem to be the best starting point for developing high-availability solutions on chips. On the contrary, doing so poses significant challenges with respect to error containment and replacement of the faulty component. Increasing silicon and transient fault rates with future technology scaling make the issue worse.

Temporal Isolation
Temporal isolation ensures that the execution of software on one core does not impact the temporal behavior of software running on another core.

When moving to multicore, the assumption of sequential execution that holds in a single-core system can no longer be safely made. Applications now run simultaneously and have non-exclusive access to resources. Within a multicore environment, applications contend for access to system resources, and this contention is typically mediated implicitly by the hardware. As a result, execution is subject to non-deterministic temporal delays. A subset of the effects described in this paper might be present in each processor architecture.

B. Energy Efficiency
Architects can use multicore processors to reduce the number of embedded computers. Multicore systems help overcome the excessive heat generation associated with Moore's law scaling, which reduces the need for cooling. Because less energy is released as heat when using multicore processing, power usage is reduced and battery life is increased [2][9][10].

Various energy efficiency techniques can be implemented in a multicore architecture, depending on the application. They are:
Voltage and frequency scaling [15]
Cache configuration
Individual core control
Low-power and high-power domain configuration
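The effect of voltage and frequency scaling can be estimated with the standard first-order model of CMOS dynamic power, P = a * C * V^2 * f, where a is the switching activity, C the effective switched capacitance, V the supply voltage and f the clock frequency. The sketch below applies that model; the numeric values are hypothetical and static (leakage) power is ignored.

```python
def dynamic_power(c_eff: float, voltage: float, frequency: float,
                  activity: float = 1.0) -> float:
    """First-order CMOS dynamic power in watts: a * C * V^2 * f."""
    return activity * c_eff * voltage ** 2 * frequency


def energy_for_workload(c_eff: float, voltage: float, frequency: float,
                        cycles: float, activity: float = 1.0) -> float:
    """Dynamic energy in joules to execute a fixed number of clock cycles."""
    runtime = cycles / frequency  # seconds
    return dynamic_power(c_eff, voltage, frequency, activity) * runtime


# Hypothetical operating points: nominal, and both V and f scaled to 80 %.
C_EFF = 1e-9      # farads (made-up value)
CYCLES = 1e9      # fixed workload size in clock cycles

nominal_power = dynamic_power(C_EFF, 1.0, 1.0e9)
scaled_power = dynamic_power(C_EFF, 0.8, 0.8e9)

nominal_energy = energy_for_workload(C_EFF, 1.0, 1.0e9, CYCLES)
scaled_energy = energy_for_workload(C_EFF, 0.8, 0.8e9, CYCLES)

print(scaled_power / nominal_power)    # power ratio
print(scaled_energy / nominal_energy)  # energy ratio for the same work
```

Under this model, scaling both voltage and frequency to 80 % cuts dynamic power to roughly half and the energy for the same workload to roughly two thirds, at the cost of a 25 % longer runtime.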
C. Concurrency
Multicore processing provides intrinsic support for real (as opposed to virtual) parallel processing, both within individual software applications and across multiple applications, by allocating applications to various cores as illustrated in Fig. 8. Concurrency is the capacity for many tasks to be carried out simultaneously and in any sequence without affecting the output.
Fig 8: Temporal Isolation
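The definition of concurrency given above, tasks carried out simultaneously and in any sequence without affecting the output, can be demonstrated with a short sketch. The `checksum` task and the function names are hypothetical. Note that CPython threads illustrate concurrency rather than true multicore parallelism for CPU-bound work; a process pool would be the closer analogue of allocating work to separate cores.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional


def checksum(block: bytes) -> int:
    """An independent task: it reads only its own input block."""
    return sum(block) % 65521


def process_in_any_order(blocks: List[bytes], workers: int = 4,
                         seed: Optional[int] = None) -> List[int]:
    """Submit the tasks in a shuffled order and collect results by index."""
    order = list(range(len(blocks)))
    random.Random(seed).shuffle(order)
    results: List[int] = [0] * len(blocks)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(checksum, blocks[i]): i for i in order}
        for future, index in futures.items():
            results[index] = future.result()
    return results


blocks = [bytes([i] * 10) for i in range(8)]
sequential = [checksum(block) for block in blocks]

# Different submission orders give the same output as the sequential run.
print(process_in_any_order(blocks, seed=1) == sequential)
print(process_in_any_order(blocks, seed=2) == sequential)
```

The output is order-independent because the tasks share no mutable state; tasks that do share state would need explicit synchronization to keep this property.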
Performance Metrics
Performance metrics are used to evaluate the performance of multicore CPUs for applications.
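As one illustration, two widely used multicore metrics are speedup and parallel efficiency. Their choice here is an assumption, since no specific metrics are listed in this section, and the timing values in the sketch are hypothetical rather than measured.

```python
def speedup(t_single: float, t_multi: float) -> float:
    """Ratio of single-core execution time to multicore execution time."""
    return t_single / t_multi


def parallel_efficiency(t_single: float, t_multi: float, cores: int) -> float:
    """Speedup per core; 1.0 would mean perfectly linear scaling."""
    return speedup(t_single, t_multi) / cores


# Hypothetical timings: 10 s on one core, 4 s on four cores.
print(speedup(10.0, 4.0))
print(parallel_efficiency(10.0, 4.0, cores=4))
```

An efficiency well below 1.0, as in this example, typically points to serial sections or contention for shared resources.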
Considering all these multicore evaluation parameters, a few examples are discussed with respect to the ARM architecture. ARM Cortex-A cores are used in devices that run powerful operating systems such as Linux or Android, which require high performance and computational power. For real-time applications that require deterministic interrupt response to meet hard real-time requirements, Cortex-R cores with features such as tightly coupled memory are recommended. Cortex-M cores are optimized for low-power, cost-sensitive embedded systems. Examples of multicore architectures that can be considered for power optimization and for performance, safety or real-time requirements are shown in the figures below.

Fig. 10 shows an ARM-based multicore architecture that can be considered where optimum power utilization is the main use case. There is always a tradeoff between power and performance in any system.

Fig. 11 represents a multicore architecture in which the cores are selected based on their performance. The cores used for an application can also form a core cluster, in which two or more homogeneous cores are used.

Fig 12: Performance and Realtime ARM based Multicore Architecture
VI. SUMMARY

In modern automotive applications like ADAS, both homogeneous and heterogeneous categories of multicore processors are widely used, depending on the application. Using a multicore system, we can increase the performance of a system without increasing the clock frequency. In this paper, different parameters, considerations and factors, such as core interconnect buses, isolation and energy efficiency, are discussed with reference to homogeneous and heterogeneous multicore systems. In general, where performance optimization, system reliability and reduction of power consumption are the focus areas despite architectural and design complexity, heterogeneous systems are beneficial. However, if architectural and design simplicity is the key criterion, homogeneous systems are a better choice. A combination of both heterogeneous and homogeneous cores can also be considered for system designs. All these aspects need to be evaluated in a balanced way to select an appropriate multicore system based on the application needs when migrating from an existing single-core system to a multicore system. These topics can be part of further study by considering specific multicore SoCs.

ACKNOWLEDGEMENTS

We sincerely wish to thank Mr. Sudeepth Puthumana, Head of Market Segments, Autonomous Mobility (AM) India, Continental Tech Center India (TCI), Bangalore; Mr. Vinayaka Nagaraja, Head of Market Japan, Korea and India; and Mr. Praveen Kumar BL, Head of Application Projects, Japan, Korea and India Segment, for providing this opportunity and giving valuable support to us for working on this study. We would like to thank all the members of the Market Japan, Korea and India SW Architecture team and Mr. Srinath Murthy, Head of Market Europe, and we extend our sincere thanks to the Corporate Communications Team and the Technical Paper Publishing team at TCI for their cooperation.

REFERENCES

[1]. A. S. Radhamani, "Performance Analysis of Homogeneous and Heterogeneous Multicore Processor Using Static and Dynamic Schedulers," Asian Journal of Information Technology.
[2]. C. Leech and T. J. Kazmierski, "Energy Efficient Multicore Processing," Electronics.
[3]. S. Saponara and L. Fanucci, "Homogeneous and Heterogeneous MPSoC Architectures with Network-On-Chip Connectivity for Low-Power and Real-Time
[4]. IntervalZero, "How to Optimize the Scalability & Performance of a Multicore Operating System," IntervalZero.com.
[5]. A. Naithani, S. Eyerman, and L. Eeckhout, "Reliability-Aware Scheduling on Heterogeneous Multicore Processors."
[6]. M. Mitić and M. Stojčev, "A Survey of Three System-on-Chip Buses: AMBA, CoreConnect and Wishbone," International Journal of Electrical and Computer Engineering.
[7]. J. Zamorano and J. A. de la Puente, "Memory Isolation in Many-Core Embedded Systems."
[8]. N. Aggarwal, P. Ranganathan, N. P. Jouppi, and J. E. Smith, "Configurable Isolation: Building High Availability Systems with Commodity Multicore Processors."
[9]. J. Cong and B. Yuan, "Energy-Efficient Scheduling on Heterogeneous Multicore Architectures."
[10]. A. Merkel and F. Bellosa, "Memory-aware Scheduling for Energy Efficiency on Multicore Processors."
[11]. H. Omar, H. Dogan, B. Kahne, and O. Khan, "Multicore Resource Isolation for Deterministic, Resilient, and Secure Concurrent Execution of Safety-Critical Applications," IEEE Computer Architecture Letters, vol. 17, no. 2, July-Dec. 2018.
[12]. R. Usselmann, "OpenCores SoC Bus Review," Rev. 1.0, January 9, 2001.
[13]. "Specification for the: WISHBONE System-on-Chip (SoC) Interconnection Architecture for Portable IP Cores," Revision: B.3, Released: September 7, 2002.
[14]. "Wishbone B4, WISHBONE System-on-Chip (SoC) Interconnection Architecture for Portable IP Cores."
[15]. H.-C. Wang and A. Anpalagan, "Energy-efficient tasks scheduling algorithm for real-time multiprocessor embedded systems," Journal of Systems Architecture, vol. 57, no. 5, pp. 498-505, May 2011. doi: 10.1016/j.sysarc.2010.10.003.
[16]. N. Jedrzejewski, "Three Reasons Why Embedded Heterogeneous Systems Are More Efficient," NXP Blog, Jan. 30, 2019.
[17]. ARM Cortex-A53 MPCore Processor Technical Reference Manual, Ver: r0P4, https://developer.arm.com/documentation/ddi0500/j/Introduction/About-the-Cortex-A53-processor.
[18]. "A Comparative Study of Different System-on-Chip Buses Based on Industry Standards: AMBA, CoreConnect and Wishbone," IJSRD - International Journal for Scientific Research and Development, May 24, 2014.
[19]. M. Kolte, "Design and Verification Point-to-Point Architecture of Wishbone Bus for System On Chip," IJEERT - International Journal of Emerging Engineering Research and Technology, vol. 2, pp. 155-159, 2014.
[20]. ARM, "AMBA," https://developer.arm.com/ip-products/system-ip/amba.
[21]. J. Yiu, "Chapter 6 - Cortex-M3 Implementation Overview," The Definitive Guide to the ARM Cortex-M3 (Second Edition), Newnes, 2010, pp. 99-108.