Enablement

Running AMP, SMP or BMP Mode for Multicore Embedded Systems
QNX Software Systems
Introduction

Multicore processors have become mainstream, introducing advanced levels of performance to servers, desktops, netbooks and even the latest generation of tablets. The potential benefits for embedded systems are even greater. Networking elements, medical devices and defense and aerospace applications are all growing in complexity and demanding ever-increasing computational power. At the same time, many of these systems must continue to address the thermal dissipation and low power constraints inherent to embedded devices. Freescale QorIQ processors directly address these requirements by providing much better processing capacity per watt and per square inch than conventional single-core processors.

Multicore processors such as Freescale QorIQ platforms are, in effect, multiprocessing systems-on-chip (SoC). Many Freescale SoCs have separate L1 and L2 caches per core, but use a shared L3 cache, memory subsystem, interrupt subsystem and peripherals. To take advantage of these processors, embedded developers must graduate from a serial execution model, where software tasks take turns running on a single processor, to a parallel execution model, where multiple software tasks can run simultaneously. The more parallelism developers can achieve, the better their multicore systems will perform.

To address these challenges, developers must find tools that can analyze the complex system-level behavior that occurs in a multicore chip. At any instant, threads can be migrating across cores, communicating with threads on other cores or sharing resources with threads on other cores: complex interactions that conventional debug tools were never designed to analyze. Fortunately, vendors such as QNX Software Systems have introduced system tracing tools that provide a comprehensive view of multicore behavior, allowing the developer to visualize interactions between cores and eliminate a variety of performance bottlenecks. Using the information that these tools generate, the developer can reduce resource contention, optimize thread migration, identify opportunities for parallelism and achieve maximum utilization of every processor core.

Multiprocessing Modes and the Role of the OS

Developers must also choose the appropriate form of multiprocessing for their application requirements. This choice will determine how easily both new and existing code can achieve maximum concurrency. As Table 1 illustrates, developers have three basic forms to choose from: asymmetric multiprocessing (AMP), symmetric multiprocessing (SMP) and bound multiprocessing (BMP).

Table 1: Three Approaches to Multiprocessing

Model | How it Works | Key Advantages
Asymmetric multiprocessing (AMP) | A separate OS, or a separate copy of the same OS, manages each core. Typically, each software process is locked to a single core (e.g. process A runs only on core 1, process B runs only on core 2, etc.). | Provides an execution environment similar to that of uniprocessor systems, allowing simple migration of legacy code. Also allows developers to manage each core independently.
Symmetric multiprocessing (SMP) | A single OS manages all processor cores simultaneously. The OS can dynamically schedule any process on any core, enabling full utilization of all cores. | Provides greater scalability and parallelism than AMP, along with simpler shared resource management.
Bound multiprocessing (BMP) | A single OS manages all cores simultaneously. As in SMP, the OS can dynamically schedule processes on any core. However, the developer can also lock any process (and all of its associated threads) to a specific core. | Combines the developer control of AMP with the transparent resource management of SMP. The option to lock threads to any core simplifies migration of legacy code and allows designers to dedicate cores to specific operations.
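The placement rules summarized in Table 1 can be restated as a small sketch. This is only an illustrative model, not an OS interface: the names Model, eligible_cores and home_core are hypothetical and introduced here for the example. It returns the set of cores on which a process is allowed to run under each mode.

```python
from enum import Enum


class Model(Enum):
    AMP = "amp"
    SMP = "smp"
    BMP = "bmp"


def eligible_cores(model, n_cores, home_core=None):
    """Return the set of cores a process may run on under a given model.

    home_core is a hypothetical parameter: the core whose OS owns the
    process (AMP) or the core the process has been locked to (BMP).
    None means "not locked".
    """
    all_cores = set(range(n_cores))
    if home_core is not None and home_core not in all_cores:
        raise ValueError("home_core is not a valid core index")

    if model is Model.SMP:
        # One OS, any process may be scheduled on any core.
        return all_cores
    if model is Model.AMP:
        # Each core has its own OS; a process lives on exactly one core.
        if home_core is None:
            raise ValueError("an AMP process always belongs to one core")
        return {home_core}
    # BMP: behaves like SMP unless the designer locks the process.
    return all_cores if home_core is None else {home_core}
```

For a four-core part, an unlocked BMP process gets the same answer as an SMP process, while a locked BMP process gets the same answer as an AMP process. That is the sense in which BMP combines the two.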

Beyond Bits Power Architecture Edition

Asymmetric Multiprocessing (AMP)

AMP provides an execution environment similar to that of conventional uniprocessor systems, which most developers already know and understand. Consequently, it offers a relatively straightforward path for porting legacy code. It also provides a direct mechanism for controlling how the CPU cores are used. And, in most cases, it lets developers work with standard debugging tools and techniques.

AMP can be either homogeneous, where each core runs the same type and version of OS, or heterogeneous, where each core runs either a different OS or a different version of the same OS. In a homogeneous environment, developers can make best use of the multiple cores by choosing an OS that offers a distributed programming model, such as the QNX® Neutrino® RTOS. Properly implemented, the model will allow applications running on one core to communicate transparently with applications and system services (e.g. device drivers, protocol stacks) on other cores, but without the high CPU utilization imposed by traditional forms of interprocessor communication.

A heterogeneous environment has somewhat different requirements. In this case, the developer must either implement a proprietary communications scheme or choose two OSs that share a common infrastructure (likely IP based) for interprocessor communications. To help avoid resource conflicts, the OSs should also provide standardized mechanisms for accessing shared hardware components. In virtually all cases, OS support for a lean and easy-to-use communications protocol will greatly enhance core-to-core operation. In particular, an OS built with the distributed programming paradigm in mind can take greater advantage of the parallelism provided by the multiple cores.

In the homogeneous example shown in Figure 1, one core of the P2020 processor handles ingress traffic from a hardware interface while the other handles the egress traffic. Because the traffic exists as two independent streams, the two cores don't need to communicate or share data with each other. As a result, the OS doesn't have to provide core-to-core IPC. It must, however, provide the real-time performance needed to manage the traffic flows.

Figure 1: Using Homogeneous AMP to Handle Both Ingress and Egress Traffic
[Diagram: Data Plane (Half-Duplex Mode); QNX Neutrino on Core 0, QNX Neutrino on Core 1]

Figure 2 shows another homogeneous example, but this time both e500 cores implement a distributed control plane, with each core handling different aspects of a data plane. To control the data plane correctly, applications running on the multiple cores must function in a coordinated fashion. To enable this coordination, the OS should provide strong IPC support, such as a shared memory infrastructure for routing table information.

Figure 2: Using Homogeneous AMP to Implement a Distributed Control Plane
[Diagram: Distributed Control Plane; QNX Neutrino on Core 0 and QNX Neutrino on Core 1, linked by IPC, above the Data Plane Hardware]

In the heterogeneous example shown in Figure 3, one core implements the control plane, while the other handles all the data plane traffic, which has real-time performance requirements. In this case, the OSs running on the two cores both need to provide a consistent IPC mechanism, such as the transparent inter-process communication (TIPC) protocol, that allows the cores to communicate efficiently, possibly through shared data structures.

Figure 3: AMP Control/Data Plane
[Diagram: Control Plane/Data Plane; Linux on Core 0 and QNX Neutrino on Core 1, linked by IPC]
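The Figure 2 scenario mentions a shared memory infrastructure for routing table information. The following is a minimal sketch of that idea using Python's standard multiprocessing.shared_memory module, not the QNX or TIPC mechanisms the article refers to. The entry layout (IPv4 prefix, prefix length, egress port) and the function names publish_routes and read_routes are assumptions made for illustration. The sketch also omits the locking or versioning a real design would need when a writer updates the table while readers are active.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical fixed-size route entry: 4-byte IPv4 prefix,
# 1-byte prefix length, 2-byte egress port (little-endian, unpadded).
ROUTE = struct.Struct("<IBH")
HEADER = struct.Struct("<I")  # number of entries stored in the segment


def publish_routes(routes):
    """Writer side: create a shared segment and copy the route entries in.

    The caller owns the returned segment and must close() and unlink() it.
    """
    size = HEADER.size + ROUTE.size * len(routes)
    segment = shared_memory.SharedMemory(create=True, size=size)
    HEADER.pack_into(segment.buf, 0, len(routes))
    for i, entry in enumerate(routes):
        ROUTE.pack_into(segment.buf, HEADER.size + i * ROUTE.size, *entry)
    return segment


def read_routes(name):
    """Reader side: attach to the segment by name and decode the entries."""
    segment = shared_memory.SharedMemory(name=name)
    try:
        (count,) = HEADER.unpack_from(segment.buf, 0)
        return [
            ROUTE.unpack_from(segment.buf, HEADER.size + i * ROUTE.size)
            for i in range(count)
        ]
    finally:
        segment.close()
```

A writer would call publish_routes once and pass segment.name to the readers. The entry count is stored in a header because some platforms round the segment size up to a page boundary, so the reader cannot infer the count from the segment length.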

freescale.com

Figure 4: AMP Multicore System
[Diagram: four CPUs, each running its own applications on its own OS (OS 1 to OS 4), connected through a system interconnect to four I/O blocks and a memory controller. Memory is divided into OS 1, OS 2, OS 3 and OS 4 regions plus shared memory. Callout: user management of shared resources complicates design.]

CPU Utilization in AMP Mode

In AMP mode, a process and all of its threads are locked to a single processor core. While this approach is useful for running legacy code, it can result in underutilization of processor cores. For instance, if one core becomes busy, applications running on that core cannot, in most cases, migrate to a core that has more CPU cycles available (refer to Figure 4). Though such dynamic migration is possible, it typically involves complex checkpointing of the application's state and can result in a service interruption while the application is stopped on one core and restarted on another. This migration becomes even more difficult, if not impossible, if the cores use different OSs.
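The underutilization described for AMP mode can be made concrete with a toy calculation. The sketch below is an assumption-laden model, not a measurement: work items are independent, cannot be preempted, and cost nothing to move. It compares work that is locked to fixed cores with work that may be placed on whichever core becomes free first, which is what a single OS spanning all cores can do. The function names are hypothetical.

```python
import heapq


def amp_makespan(pinned):
    """Finish time when work is locked per core.

    pinned maps a core id to the list of work amounts locked to that core.
    """
    return max(sum(work) for work in pinned.values())


def smp_makespan(work_items, n_cores):
    """Finish time when each item goes to whichever core frees up first."""
    finish_times = [0] * n_cores
    heapq.heapify(finish_times)
    for work in work_items:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + work)
    return max(finish_times)


def utilization(total_work, n_cores, makespan):
    """Fraction of available core time that was spent doing work."""
    return total_work / (n_cores * makespan)
```

With three 4-unit jobs locked to core 0 and one 1-unit job locked to core 1, the locked layout finishes at time 12 and keeps the two cores about 54% busy. Letting the same four jobs float across both cores finishes at time 8 with about 81% utilization.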
Symmetric Multiprocessing (SMP) Mode
Allocating resources in a multicore
design can be difficult, especially when
multiple software components are
unaware of how other components
are employing those resources. SMP
addresses many of the issues by
running only one copy of an OS across
all the chip’s cores. Because the OS
has insight into all system elements
at all times, it can allocate resources
on multiple cores with little or no input
from the application designer. By
running only one copy of the OS, SMP
can dynamically allocate resources
to specific applications rather than to
CPU cores, thereby enabling greater
utilization of available processing power.


Figure 5: SMP Multicore System
[Diagram: applications running on a single OS that spans four CPUs, connected through a system interconnect to four I/O blocks and a memory controller with a single OS memory region. Callout: OS transparently manages all resource sharing and arbitration issues.]

Because a single OS controls every core, all intercore IPC is local. This reduces the memory footprint and improves performance dramatically as the system no longer needs a heavy networking protocol to implement communication between applications running on different cores. Communication and synchronization can take the simple form of POSIX primitives (e.g. semaphores) or a native lightweight local-transport capability such as QNX distributed processing.

A single instance of the OS across all cores simplifies optimization and debugging. Visualization tools such as the system profiler in the QNX Momentics® Tool Suite can track thread migration from core to core, scheduling events, application-to-application messaging, CPU utilization and other events, all with high-resolution timestamping.

A well-designed SMP OS such as the QNX Neutrino RTOS allows the threads of execution within an application to run concurrently on any core. This concurrency makes the majority of the compute power of the chip available to applications at nearly all times. If the OS provides appropriate preemption and thread prioritization capabilities, it can also help the application designer ensure that CPU cycles go to the application that needs them the most.

In the control plane scenario in Figure 5, SMP allows all of the threads in the various processes to run on any core. For instance, the command-line interface (CLI) process can run at the same time that the routing application performs a compute-intensive calculation.

Once designed, a process can run equally well on a single-core, dual-core, or N-core system, the only potential change being the number of threads that the application needs to create to maximize performance. In full SMP mode, an RTOS like QNX Neutrino will schedule the highest-priority ready thread to execute on the first available CPU core. As a result, application threads can utilize the full extent of available CPU power rather than being restricted to a single CPU.
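The rule that the highest-priority ready thread runs on the first available core can be sketched as a small dispatch function. This is a simplified illustration and not the QNX Neutrino scheduler: it ignores preemption, time slicing and affinity. It assumes that a larger number means higher priority and that equal priorities are served in arrival order. The name dispatch and its argument shapes are hypothetical.

```python
import heapq


def dispatch(ready, idle_cores):
    """Place the highest-priority ready threads on idle cores.

    ready: list of (priority, name) pairs in arrival order; a larger
        priority value is assumed to mean "more urgent".
    idle_cores: core ids in the order they became available.
    Returns (placement, waiting): a dict mapping core id to thread name,
    and the names still waiting, most urgent first.
    """
    # heapq is a min-heap, so negate the priority; the arrival index
    # breaks ties in first-come, first-served order.
    heap = [(-prio, seq, name) for seq, (prio, name) in enumerate(ready)]
    heapq.heapify(heap)

    placement = {}
    for core in idle_cores:
        if not heap:
            break
        _, _, name = heapq.heappop(heap)
        placement[core] = name

    waiting = [name for _, _, name in sorted(heap)]
    return placement, waiting
```

Given ready threads cli (priority 10), routing (30) and oam (20), and cores 1 and 0 becoming idle in that order, routing is placed on core 1, oam on core 0, and cli stays in the ready list. Nothing in the thread itself names a core, which is why the same process can run unchanged on one, two or N cores.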

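The SMP discussion notes that communication and synchronization can use simple primitives such as semaphores once a single OS spans every core. The sketch below shows the classic pattern with Python's threading.Semaphore, which stands in here for a POSIX semaphore and is not the same API. The BoundedQueue class and run_demo function are hypothetical names for this example. In CPython the two threads do not execute bytecode in parallel, so the sketch illustrates the synchronization pattern rather than multicore speed-up.

```python
import threading
from collections import deque


class BoundedQueue:
    """Fixed-capacity message queue guarded by two counting semaphores."""

    def __init__(self, capacity):
        self._items = deque()
        self._lock = threading.Lock()                # protects the deque
        self._free = threading.Semaphore(capacity)   # counts empty slots
        self._used = threading.Semaphore(0)          # counts filled slots

    def put(self, item):
        self._free.acquire()          # block while the queue is full
        with self._lock:
            self._items.append(item)
        self._used.release()          # signal that one more item is ready

    def get(self):
        self._used.acquire()          # block while the queue is empty
        with self._lock:
            item = self._items.popleft()
        self._free.release()          # signal that one more slot is free
        return item


def run_demo(n_messages=10, capacity=2):
    """Send n_messages from a producer thread to a consumer thread."""
    queue = BoundedQueue(capacity)
    received = []

    def producer():
        for i in range(n_messages):
            queue.put(i)

    def consumer():
        for _ in range(n_messages):
            received.append(queue.get())

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received
```

With one producer and one consumer, messages arrive in the order they were sent, and the producer never runs more than capacity items ahead of the consumer.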

Figure 6: Using BMP for Half-Duplex Mode
[Diagram: Data Plane (Half-Duplex Mode); a single copy of QNX Neutrino spans Core 0 and Core 1, with Rx on Core 0 and Tx on Core 1]

Figure 7: Using BMP for Both Control-Plane and Data-Plane Operations
[Diagram: Control Plane/Data Plane; a single copy of QNX Neutrino spans Core 0 and Core 1, with CLI, OAM and DPM on Core 0 and Rx and Tx on Core 1]

Bound Multiprocessing (BMP) Mode

BMP, an approach pioneered by QNX Software Systems, offers the benefits of SMP's transparent resource management, but gives designers the ability to lock any application (and all of its threads) to a specific core to help migrate uniprocessor code to a multicore environment.

As with SMP, a single copy of the OS maintains an overall view of all system resources, allowing them to be dynamically allocated and shared among applications. But, during application initialization, a setting determined by the system designer forces all of an application's threads to execute only on a specified core.
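The start-up binding described for BMP has a rough analogue on Linux, shown below as a sketch. It uses Python's os.sched_setaffinity, which is available only on some platforms (notably Linux) and is not the QNX Neutrino mechanism, which is not shown here. The function name bind_to_core is hypothetical. The call is made before any worker threads exist, on the assumption that threads created afterwards inherit the restriction.

```python
import os


def bind_to_core(core):
    """Restrict the calling process to a single core at start-up.

    Call this during initialization, before creating worker threads, so
    that threads started later inherit the same restriction.
    Returns the resulting set of allowed cores.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity is not exposed on this platform")

    allowed = os.sched_getaffinity(0)   # 0 means "the caller"
    if core not in allowed:
        raise ValueError(f"core {core} is not available to this process")

    os.sched_setaffinity(0, {core})
    return os.sched_getaffinity(0)
```

An application would call bind_to_core once at the top of its main routine. Applications that never make the call keep floating across all cores, which corresponds to the mixed BMP and SMP operation the article describes.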
Compared to full, floating SMP operation, BMP offers several advantages:
• Allows legacy applications written for uniprocessor environments to run correctly in a concurrent multicore environment, without modifications
• Eliminates the processor-cache "thrashing" that can sometimes reduce performance in an SMP system
• Enables simpler application debugging than traditional SMP by running all execution threads within an application on a single core
• Supports simultaneous BMP and SMP operation, allowing legacy applications to coexist with applications that take full advantage of the parallelism of multicore hardware

In the example shown in Figure 6, a BMP system is running in half-duplex mode. A receive process with multiple threads runs on core 0 and a transmit process, also with multiple threads, runs on core 1. As in SMP, the OS is fully aware of what all the cores are doing, making operational and performance information for the system as a whole readily available. This approach spares developers the onerous task of having to gather information from each of the cores separately and then somehow combine that information for analysis.

In the example shown in Figure 7, control plane applications (command-line interface; operations, administration, and maintenance; data plane management) run on core 0, while data plane ingress and egress applications run on core 1. Developers can easily implement the IPC for this scenario, using either local OS mechanisms or synchronized protected shared memory structures.

A Solid Foundation

Making the leap to multicore processors may seem daunting at first, but the benefits can far outweigh any potential misgivings. The choice of hardware and software is not to be taken lightly and may dictate many design decisions and ultimately the success of the project. Selecting an OS that supports processing models which allow for both migration of legacy code and full symmetric operation while minimizing complexity provides a solid foundation on which to build new products. Tools play a key role as well. The ability to visualize thread-level interaction, CPU utilization and other key variables across multiple cores provides the developer with a white-box view into system operation.

QNX Software Systems and Freescale have jointly supported many development programs using multiprocessing in their products. From dual MPC744x processors combined with a discrete SMP system controller more than 10 years ago to the latest QorIQ processors, QNX and Freescale have been at the forefront of multicore development.


For more information, visit freescale.com/power


Freescale, the Freescale logo and QorIQ are trademarks of Freescale Semiconductor, Inc.,
Reg. U.S. Pat. & Tm. Off. The Power Architecture and Power.org word marks and the Power and
Power.org logos and related marks are trademarks and service marks licensed by Power.org.
All other product or service names are the property of their respective owners.
© 2012 Freescale Semiconductor, Inc.
Document Number: PWRARBYNDBITSRAS REV 0
