Lecture 07-Real-Time Linux
Real-Time in Handheld & Embedded Systems
• Cost / Performance / Power / Weight Compromise
– Competitive, High-Volume, Low-margin Markets
– Maximum Feature-set, Add-ons, Responsive UI feel
– Device specs: minimal CPU & Memory & Battery Powered
– Minimal CPU = High CPU utilization
– High CPU load + Time-Critical functionality = RT specs
– Real-time Requirements will never be alleviated by
Improvements in Hardware Performance / Efficiency
• Software utilizing the latest hardware technologies easily
keeps up with, and usually outpaces, advances in
hardware technology
• If you don't believe that, go shopping (for a mobile
phone)
Beginning of Linux – as a UNIX clone
Linux System
• Linux kernel
– In Sept 1991, Linus Torvalds, a second-year student of
Computer Science at the University of Helsinki, developed the
preliminary kernel of Linux, known as Linux version 0.01, and
released it on the Internet.
– Version 1.0, 2.0, 2.2, 2.4, and 2.6 were released in 1994, 1996,
1999, 2001, and 2003, respectively.
– Current version is 2.6.37.2 released on 24 Feb 2011.
• Linux system
– Software from the GNU project + Linux kernel = Linux system
– Redhat, Fedora, CentOS, Ubuntu, etc.
Evolution of Linux
• Early Linux Was Not Designed for Real-Time Processing
– Early Linux (1.x Kernel) installations on retired Windows PCs
• Old/Obsolete hardware useful under Linux due to efficiency of O/S
• Linux outperformed Windows in reliability and uptime (still does)
– Linux Design: Fairness, Throughput and Resource-Sharing
• Basic Unix development design principles applied in Kernel
• Heavily (over)-loaded systems continue to make progress
• Does not drop network connections or starve users / applications
– Fairness- and Resource-Sharing Design is Linux's Strength
• contributed to make Linux competitive and popular in the
enterprise-server and development-application environments
• Gave rise to RedHat and others.
• Essential to the evolution of Linux, endemic to the UNIX legacy
Real time computing requirements
• Timing
• Scheduling
• Hardware Interface
• Configuration
• Communication and Synchronization
• External Communications
• Memory Management
• Error Reporting
• Task Management
• Embedded Features
• File system
• Reliability
Linux as a Real-Time OS
• Traditional RT systems used custom-built systems, which
were not extensible, i.e. tough to develop new applications for
• However, as technology improved, generic real-time OSes
became acceptable
• Among OSes suited for extensible development, Linux looks more
appealing
• Linux as your real-time solution?
– Could increase priority for “real-time” tasks and assume
they get scheduled
– Problem: Linux optimizes the average case, whereas an
RTOS should work under worst-case assumptions
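The gap between average-case and worst-case behaviour can be made concrete with a small sketch. The names below are hypothetical and the latency samples (in microseconds) are assumed to come from the caller's own measurements. A throughput-oriented kernel tends to look good on `mean_us`, while a real-time deadline has to be checked against `max_us`.

```c
#include <stddef.h>
#include <stdint.h>

/* Summary of measured latencies, all in microseconds. */
struct latency_summary {
    uint64_t mean_us;  /* average case: what a general-purpose OS optimizes */
    uint64_t max_us;   /* worst case: what a real-time deadline must cover */
};

/* Summarize n latency samples; returns zeros for an empty set. */
struct latency_summary summarize_latency(const uint64_t *samples_us, size_t n)
{
    struct latency_summary s = {0, 0};
    uint64_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        sum += samples_us[i];
        if (samples_us[i] > s.max_us)
            s.max_us = samples_us[i];
    }
    if (n > 0)
        s.mean_us = sum / n;
    return s;
}

/* A deadline is met only if the worst observed latency fits within it. */
int meets_deadline(const struct latency_summary *s, uint64_t deadline_us)
{
    return s->max_us <= deadline_us;
}
```

A single slow outlier barely moves the mean but decides whether the deadline holds, which is why raising a task's priority alone is not enough.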
Why and why not Linux?
Pros
• Linux (and its real-time versions) is free and open source
• Easy for developing RT applications
Cons
• Linux didn’t have any corporate support until now
• Linux is a very good general-purpose operating system, but not as
good as a real-time OS
• Fairness, progress and resource-sharing conflict with the
requirements of time-critical applications
• This is because the design goals of a conventional OS and an RTOS
differ
• UNIX-legacy operating systems were designed with operating
principles focused on throughput and progress
– User tasks should not stall under heavy load
– System resources must be shared fairly between users
Linux – A Simplified View
Linux – conflicts with RT constraints
• Coarse-grained synchronization: long intervals during which a task
has exclusive use of data (fine-grained synchronization adds
overhead that reduces average-case performance)
• Linux batches operations for efficient use of hardware
– E.g. freeing a whole list of pages when memory is full,
which worsens worst-case performance
• Linux doesn't preempt a low-priority task during a system call
• Linux will make high-priority tasks wait for low-priority tasks to
release resources
Real-Time Linux
• A class of Linux systems which could meet the real-time
requirements.
• There are a number of such real-time Linux systems available:
– RTLinux – hard real-time
– RTAI – hard real-time
– MontaVista – soft real-time
– and many more …
• Many of them were developed as research projects and
released freely to the public.
• Some of them have been commercialized, though.
Real Time Linux approaches
• Modify the current Linux kernel to guarantee
RT constraints
– Used by KURT
• Make the standard Linux kernel run as a task
of the real-time kernel
– Used by RT-Linux, RTAI
Modifying Linux kernel
• Advantages
– Most problems, such as interrupt handling,
already solved
– Less initial labor
• Disadvantages
– No guaranteed performance
– RT tasks don’t always have precedence over non-
RT tasks.
Linux as a process of a RT kernel
• Advantages
– Can make hard real time guarantees
– Easy to implement a new scheduler
• Disadvantages
– Initial port difficult, must know a tremendous
amount about underlying hardware
– Running a small real-time executive is not a
substitute for a full fledged RTOS
RTLinux or RTCore
• The first real-time Linux system developed as a
research project over a decade ago.
• Subsequently commercialized to become a real-time
core (RTCore) of a commercial Linux called Wind River
Linux.
• RTLinux or RTCore is a hard real-time microkernel that
runs the entire Linux operating system as a fully
preemptable process.
• The technologies developed to enable the real-time
operation are patented and can’t be used freely.
RTLinux / RTAI Architecture
[Figure: architecture diagram; recoverable labels: User’s task, HARDWARE]
RTAI – A Simplified View
RTAI Overview
• Based on Real-Time Hardware Abstraction Layer (HAL)
(also used in Windows NT)
• HAL exports Linux data and functions related to hardware
• HAL defines an interface between RTAI and Linux
• Software architecture
– Interface to Linux hardware management (HAL)
– 3 basic components: dispatcher, scheduler, fifos
– 1 interface used in user tasks to initialize and start the
components
• RTAI is basically an interrupt dispatcher (it reroutes
interrupts to Linux if necessary), e.g. a disk interrupt
• The Windows NT hardware abstraction layer (HAL) refers to a layer of software that deals directly with your
computer hardware.
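The dispatcher idea can be sketched as a small simulation. This is not RTAI or RTLinux code, and every name in it is hypothetical. It assumes the usual scheme in which real-time interrupts are handled at once, while interrupts meant for Linux are passed on only when Linux has its (virtual) interrupt flag enabled and are otherwise recorded and replayed later.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_IRQS 32u

struct dispatcher {
    uint32_t rt_mask;        /* bit i set: IRQ i belongs to a real-time handler */
    uint32_t linux_pending;  /* bit i set: IRQ i is waiting for Linux */
    bool     linux_irqs_on;  /* Linux's virtual interrupt-enable flag */
    unsigned rt_handled;     /* counters stand in for real handlers */
    unsigned linux_handled;
};

/* Called for every hardware interrupt. */
void dispatch_irq(struct dispatcher *d, unsigned irq)
{
    if (irq >= NUM_IRQS)
        return;
    uint32_t bit = UINT32_C(1) << irq;

    if (d->rt_mask & bit)
        d->rt_handled++;          /* real-time handler runs immediately */
    else if (d->linux_irqs_on)
        d->linux_handled++;       /* rerouted straight to Linux */
    else
        d->linux_pending |= bit;  /* Linux "disabled" interrupts: defer */
}

/* Linux's disable request only clears a flag; hardware stays enabled. */
void linux_disable_irqs(struct dispatcher *d)
{
    d->linux_irqs_on = false;
}

/* Re-enabling replays everything that arrived in the meantime. */
void linux_enable_irqs(struct dispatcher *d)
{
    d->linux_irqs_on = true;
    for (unsigned irq = 0; irq < NUM_IRQS; irq++) {
        uint32_t bit = UINT32_C(1) << irq;
        if (d->linux_pending & bit) {
            d->linux_pending &= ~bit;
            d->linux_handled++;
        }
    }
}
```

Because Linux can never really mask the hardware in this scheme, the time it spends with interrupts "off" no longer adds to the latency of the real-time handlers.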
RT-Linux
• Open source Linux project, Supports x86, PowerPC, Alpha
• Patch of the regular Linux kernel (simply install the patch
and recompile the kernel)
• Provides an RT API for developers
• Runs Linux kernel as lowest priority process
• RT tasks are coded as modules; modules are inserted and
removed at the user's discretion
• Extremely good at handling periodic tasks
• Communicates with non-RT kernel and other RT tasks via
fifo queues
• Tools are provided for graphical analysis of RT execution
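The FIFO queues mentioned above are typically fixed-size ring buffers, so that the real-time side never blocks or allocates memory. The following is a minimal single-producer, single-consumer sketch in portable C11. It is not the RT-Linux FIFO API; the names and the buffer size are illustrative assumptions.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_SIZE 4u  /* must be a power of two */

struct spsc_fifo {
    uint8_t buf[FIFO_SIZE];
    atomic_size_t head;  /* next slot to write; only the producer stores it */
    atomic_size_t tail;  /* next slot to read; only the consumer stores it */
};

void fifo_init(struct spsc_fifo *f)
{
    atomic_init(&f->head, 0);
    atomic_init(&f->tail, 0);
}

/* Producer side (the RT task): never blocks; false means the FIFO is full. */
bool fifo_put(struct spsc_fifo *f, uint8_t byte)
{
    size_t head = atomic_load_explicit(&f->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&f->tail, memory_order_acquire);

    if (head - tail == FIFO_SIZE)
        return false;
    f->buf[head & (FIFO_SIZE - 1)] = byte;
    atomic_store_explicit(&f->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side (the non-RT reader): false means the FIFO is empty. */
bool fifo_get(struct spsc_fifo *f, uint8_t *out)
{
    size_t tail = atomic_load_explicit(&f->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&f->head, memory_order_acquire);

    if (head == tail)
        return false;
    *out = f->buf[tail & (FIFO_SIZE - 1)];
    atomic_store_explicit(&f->tail, tail + 1, memory_order_release);
    return true;
}
```

When the FIFO is full the producer drops or retries instead of waiting, which keeps the real-time task's execution time bounded.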
RT-Linux
[Figure: RT-Linux structure; recoverable labels: Task Structure, Interrupt Dispatcher]
Problems with RT-Linux
• Currently no support for aperiodic tasks
• Not very useful for complex RT systems
• Currently limited to simple problems
Linux kernel improvements for RT
• Features and performance enhancements that upgrade
the standard Linux kernel to compete with real-time
products and applications:
– Enhanced schedulers
– Virtual memory
– Shared memory
– Portable operating system interface X (POSIX) timers
– Real-time signals
– POSIX asynchronous I/O
– POSIX threads
– Quality of service capabilities
– Low latency/preemptable kernel modifications
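Periodic work on a stock kernel usually builds on the POSIX clock interfaces listed above. The sketch below assumes a Linux/POSIX system; `run_periodic` and `timespec_add_ns` are hypothetical helper names, while `clock_gettime` and `clock_nanosleep` are the standard POSIX calls. It sleeps until an absolute deadline each cycle, so lateness in one cycle does not accumulate as drift.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance an absolute deadline by period_ns (assumed non-negative),
 * keeping tv_nsec normalized to [0, 1e9). */
void timespec_add_ns(struct timespec *t, long period_ns)
{
    t->tv_sec  += period_ns / NSEC_PER_SEC;
    t->tv_nsec += period_ns % NSEC_PER_SEC;
    if (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Call work() `cycles` times, one activation every period_ns nanoseconds.
 * Returns 0 on success and -1 on a clock error. */
int run_periodic(void (*work)(void), long period_ns, int cycles)
{
    struct timespec next;

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    for (int i = 0; i < cycles; i++) {
        int rc;

        timespec_add_ns(&next, period_ns);
        do {  /* resume the same absolute sleep if a signal interrupts it */
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);
        if (rc != 0)
            return -1;
        work();
    }
    return 0;
}
```

A control loop would pass its step function as `work`, for example `run_periodic(control_step, 1000000L, 1000)` for one thousand 1 ms cycles, where `control_step` is whatever the application defines.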
Low-Latency patches for Linux
Low-Latency Linux (Red-Linux, some Real-Time Linux)
• This approach corrects the monolithic structure by
inserting explicit rescheduling points inside the kernel. In
this approach, when a thread is executing inside the
kernel it can explicitly decide to yield the CPU to some
other thread
• In this way, the size of non-preemptable sections is
reduced, thus decreasing the latency
• The consistency of kernel data is enforced by using
cooperative scheduling
• Since the low-latency patch has been carefully hand-
tuned for quite a long time, it performs surprisingly well
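The effect of explicit rescheduling points can be illustrated with a user-space analogy. This is only a sketch, not kernel code: `sched_yield()` is the standard POSIX call, while the function name and the interval are assumptions made for the example. A long loop voluntarily offers the CPU at fixed intervals, so no stretch of uninterrupted work is longer than one interval.

```c
#include <sched.h>
#include <stddef.h>

#define RESCHED_INTERVAL 64  /* items processed between voluntary yields */

/* Sum a buffer, offering the CPU to other runnable threads after every
 * RESCHED_INTERVAL items. The number of yield points taken is stored in
 * *yields so the effect can be observed. */
long sum_with_resched_points(const int *data, size_t n, size_t *yields)
{
    long total = 0;

    *yields = 0;
    for (size_t i = 0; i < n; i++) {
        total += data[i];
        if ((i + 1) % RESCHED_INTERVAL == 0 && i + 1 < n) {
            sched_yield();  /* explicit rescheduling point */
            (*yields)++;
        }
    }
    return total;
}
```

Data consistency stays simple because the code only gives up the CPU at points it chooses, which is the cooperative aspect the slide refers to.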
Preemptable Linux (used in most real-time
systems)
• Removes the constraint of a single execution flow
inside the kernel. It is not necessary to disable
preemption when an execution flow enters the
kernel
• To support full kernel preemptability, kernel data
must be explicitly protected using mutexes or
spinlocks
• The Linux preemptable kernel patch is maintained by
Robert Love and sponsored by MontaVista
MontaVista
• MontaVista Inc. provides a Linux solution for
embedded systems
• The solution’s aim is to make the Linux kernel
fully preemptable
• It identifies the points where priority inversion
occurs in Linux and makes those points fully
preemptable
• A good embedded solution, but not a complete RT
solution.
Real-Time Linux 2.6 Enablers
Real-Time and Linux Kernel Evolution
Critical Section Locking
• Linux 2.6 Kernel Critical Sections are Non-Preemptible
– Critical sections protect shared resources, e.g. hardware registers, I/O
ports, and data in RAM
– Critical sections are shared by Processes, Interrupts and CPUs.
– Effective protection is provided by the Spin-Lock Subsystem
– Critical sections must be locked and unlocked
– Locked critical sections are not preemptible
– Linux 2.6 Kernel has 11,000 critical sections
– Exhaustive Kernel testing to identify worst-case code paths
– Labour-intensive cleanup of critical sections
– No control over 3rd party drivers
– Worst-case after cleanup still not acceptable
– Maintenance, community education, policing / regression testing
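A spin-lock protected critical section can be sketched in portable C11 as follows. This is a user-space illustration with hypothetical `demo_` names, not the kernel's spin-lock subsystem. It shows why a locked section is costly for latency: a contender makes no progress until the holder unlocks.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct demo_spinlock {
    atomic_flag locked;
};

void demo_spin_init(struct demo_spinlock *l)
{
    atomic_flag_clear(&l->locked);
}

/* One acquisition attempt; true if the lock was taken. */
bool demo_spin_trylock(struct demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Busy-wait until the lock is free: the waiter burns CPU time and
 * cannot do anything else in the meantime. */
void demo_spin_lock(struct demo_spinlock *l)
{
    while (!demo_spin_trylock(l))
        ;
}

void demo_spin_unlock(struct demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The worst-case wait equals the longest time any holder keeps the lock, which is why the slide's cleanup effort targets the longest critical sections.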
Interrupt Handlers
• Linux 2.6 Kernel: Unbounded IRQ subsystem latencies
– Task-Preemption latency increases with hardware-interrupt load
– Interrupts cannot be preempted
– No Priorities for Interrupts
• IRQ Subsystem always preempts tasks unconditionally
– Unbounded SoftIRQ subsystem (“Bottom Half Processing”)
• Activated by HW IRQs (Timers, SCSI, Network)
• SoftIRQs re-activate, iterate
– Driver-level adaptations
• Network Driver NAPI adaptation reduces denial of service (D.o.S.) effects of
high packet loads
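The idea behind bounding bottom-half work can be shown with a budgeted polling loop. This is a simplified simulation and not the kernel's NAPI interface; the structure and function names are hypothetical. Each call processes at most `budget` packets and then returns, so a flood of packets cannot monopolize the CPU in a single pass.

```c
#include <stddef.h>

struct rx_queue {
    size_t pending;    /* packets waiting in the (simulated) device ring */
    size_t processed;  /* packets handed to the network stack so far */
};

/* Process at most `budget` packets, then return so other work can run.
 * A return value equal to `budget` means more packets may be waiting and
 * another poll should be scheduled instead of looping here. */
size_t poll_rx(struct rx_queue *q, size_t budget)
{
    size_t done = 0;

    while (done < budget && q->pending > 0) {
        q->pending--;
        q->processed++;
        done++;
    }
    return done;
}
```

With a budget of 64, a burst of 150 packets is drained in three bounded passes rather than one unbounded one.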
Legacy Locking
• Existing Locking Subsystems are not Priority-Aware
– System semaphore
• Counting semaphore used to wake multiple waiting tasks
• No support for priority inheritance
• No priority ordering of waiters
– Big Kernel Lock (BKL)
• Originally non-preemptible, now preemptible using system semaphore
• Can be released by blocking tasks, re-acquired upon wake-up
• No priority-awareness, or priority inheritance for contending tasks
– RCU (Read-Copy-Update) Locks in Network subsystem
• Read-optimized cached locking requiring race-free invalidation
– Read – Write Locks
• Classical blocking / starvation issues with no priority awareness
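What priority inheritance adds to a lock can be shown with a small model. This is a sketch with hypothetical names, and it assumes that a larger number means a higher priority. While a high-priority task waits, the lock owner temporarily runs at the waiter's priority, so a medium-priority task cannot preempt the owner and delay the release.

```c
/* Model of one lock with priority inheritance.
 * Assumption of this sketch: larger number = higher priority. */
struct pi_lock {
    int held;
    int owner_base_prio;  /* priority the owner was assigned */
    int owner_eff_prio;   /* priority the owner currently runs at */
};

/* A task of priority `prio` takes the free lock. */
void pi_acquire(struct pi_lock *l, int prio)
{
    l->held = 1;
    l->owner_base_prio = prio;
    l->owner_eff_prio = prio;
}

/* A task of priority waiter_prio blocks on the lock: boost the owner
 * if the waiter is more important than the owner currently is. */
void pi_block_on(struct pi_lock *l, int waiter_prio)
{
    if (l->held && waiter_prio > l->owner_eff_prio)
        l->owner_eff_prio = waiter_prio;
}

/* Releasing the lock drops the boost; returns the restored priority. */
int pi_release(struct pi_lock *l)
{
    l->held = 0;
    l->owner_eff_prio = l->owner_base_prio;
    return l->owner_eff_prio;
}
```

The legacy semaphores and the BKL described above have no equivalent of `pi_block_on`, so the owner keeps its low priority while important tasks wait.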
The Fully Preemptible Linux Kernel
• Dramatic Reduction in 2.6 Preemption Latencies
– Multiple Concurrent Tasks in Independent Critical Sections
– Generally Fully Preemptible “No Delays”
• Non-preemptible: Interrupt off paths and lowest-level interrupt management
• Non-preemptible: Scheduling and context switching code
• Design Flexibility
– Provides Full Access to Kernel Resources to RT Tasks
– Supports existing driver and application code
– User-space Real-Time
• Optimization Flexibility
– RT Tasks designed to use Kernel-resources in managed ways can reduce
or eliminate Priority-Inheritance delays
• Adequate Instrumentation
– Latency timing, latency triggers & stack tracing, histograms
Kernel Evolution: Preemptible Code
[Figure: share of preemptible vs. non-preemptible kernel code for Kernel 2.0, Kernels 2.2-2.4, Preemptible Kernel 2.4, Kernel 2.6 and Real-Time Kernel 2.6]
Real-Time Linux 2.6 Performance
• Real-Time Linux 2.6 Kernel Performance
– Far exceeds most stringent Audio performance requirements
– Enables sub-millisecond control-loop response
– Enables Hard Real Time for qualified RT-aware Applications
Real-Time Linux 2.6 Performance (Cont)
[Figure: performance plots; panel labels: No Preemption, Preemptible]
References
• RT-Linux: http://www.rtlinux.org
• RTAI: http://www.aero.polimi.it/projects/rtai/contrib.htm
• MontaVista: http://www.mvista.com
• Linux as a real-time operating system,
David Beal, Freescale Semiconductor,
Nov. 2005
Software Environments for Embedded
Systems
SW: Embedded Software Tools
[Figure: tool-flow diagram; labels: user, application source code, compiler, a.out, simulator, debugger, application software, RTOS, CPU, ROM, RAM, ASIC]
Another View of Microprocessor Architecture
• Machines (Nano-Machines)
• Analog Signals
• Anything that communicates
• Lots of stuff in our cars
• Our Bodies
– Today - Pacemakers
– Soon - De-Fibrillators, Insulin Dispensers
– We can all be the $6M Person, for a lot cheaper
• All sorts of interfaces
– Speech, DNI, etc.
Embedded Microprocessor Evolution
> 500k transistors 2+M transistors 5+M transistors 22+M transistors
1 - 0.8 0.8 - 0.5 0.5 - 0.35 0.25 - 0.18 (feature size, presumably in µm)
33 MHz 75 - 100 MHz 133 - 167 MHz 500 - 600 MHz
• Embedded CPU cores are getting smaller; ~2 mm² for up to 400 MHz
– Less than 5% of CPU size
• Higher Performance by:
– Faster clock, deeper pipelines, branch prediction, ...
• Trend is towards higher integration of processors with:
– Devices that were on the board now on chip: “system on a chip”
– Adding more compute power by add-on DSPs, ...
– Much larger L1 / L2 caches on silicon
Microprocessor Chaos
[Figure (J. Fiddler, WRS): columns of supported processor families; labels include 68000, 680x0, CPU32, 80x86, SPARC, MIPS R3k, i960, 29k, PowerPC, MIPS 3k/4k/5k, SH 1/2/3, RAD 6k, Siemens C16x, NEC V8xx, PARISC, 563xx, ST 20, M32 R/D, StrongARM, ARM, SH-DSP, SH 4, MCORE. Annotations: increasing time-to-market pressure, more applications, embedded software crisis.]
Outline on RTOS
• Introduction
• VxWorks
– General description
• System
• Supported processors
– Details
• Kernel
• Custom hardware support
• Closely coupled multiprocessor support
• Loosely coupled multiprocessor support
• pSOS
• eCos
• Conclusion
Embedded Development: Generation 0
• Development: Sneaker-net
• Attributes:
– No OS
– Painful!
– Simple software only
Embedded Development: Generation 1
• Hardware: SBC, minicomputer
• Development: Native
• Attributes:
– Full-function OS
• Non-Scalable
• Non-Portable
– Turnkey
– Very primitive
Embedded Development: Generation 2
• Hardware: Embedded
• Development: Cross, serial line
• Attributes
– Kernel
– Originally no file sys, I/O, etc.
– No development environment
– No network
– Non-portable, in assembly
Embedded Development: Generation 3
• Hardware: SBC, embedded
• Development: Cross, Ethernet
– Integrated, text-based, Unix
• Attributes
– Scalable, portable OS
• Includes network, file & I/O sys, etc.
– Tools on target
• Network required
• Heavy target required for development
– Closed development environment
Embedded Development: Generation 4
• Hardware: Embedded, SBC
• Development: Cross
– Any tool - Any connection - Any target
– Integrated GUI, Unix & PC
• Attributes
– Tools on host
• No target resources required
• Far More Powerful Tools (WindView, CodeTest, …)
– Open dev. environment, published API
– Internet is part of dev. environment
• Support, updates, manuals, etc.
Embedded Development: Generation 5???
• Super-scalable
• Communications-centric
• Virtual application platform
– Java?
• Multi-media
• Way-cool development environment
– Much easier to create, debug & re-use code
– Easy for non-programmers to contribute
The RTOS Evolution
[Figure: stacked-block diagram in four stages marked 10%*, 30%*, 75%* and 90%*; block labels: Kernel, Networking, File System, Multiprocessing, Memory Management, WindNet, X Windows, Java, Browser / GUI, Advanced Interconnect, Advanced Networking, Distributed Objects, Fault Tolerance, Application]
[Figure; labels: WindNet Networking, Core OS, Wind Microkernel]
• PowerPC
• 68K, CPU 32
• ColdFire
• MCORE
• 80x86 and Pentium
• i960
• ARM and StrongARM
• MIPS
• SH
• SPARC
• NEC V8xx
• M32 R/D
• RAD6000
• ST 20
• TriCore
Wind microkernel
• Task management
– multitasking, unlimited number of tasks
– preemptive scheduling and round-robin scheduling (static
scheduling)
– fast, deterministic context switch
– 256 priority levels
Wind microkernel
• Fast, flexible inter-task communication
– binary, counting and mutual exclusion semaphores
with priority inheritance
– message queue
– POSIX pipes, counting semaphores, message
queues, signals and scheduling
– control sockets
– shared memory
Wind microkernel
• High scalability
• Incremental linking and loading of components
• Fast, efficient interrupt and exception handling
• Optimized floating-point support
• Dynamic memory management
• System clock and timing facilities
“Board Support Package”
• BSP = Initializing code for hardware device +
device driver for peripherals
• BSP Developer’s Kit
[Figure: BSP layering; labels: hardware-independent code, processor-dependent code, device-dependent code, BSP]
VxMP
• A closely coupled multiprocessor support accessory for VxWorks.
• Capabilities:
– Support up to 20 CPUs
– Binary and counting semaphores
– FIFO message queues
– Shared memory pools and partitions
– VxMP data structure is located in a shared memory area accessible to all CPUs
– Name service (translate symbol name to object ID)
– User-configurable shared memory pool size
– Support heterogeneous mix of CPUs
VxMP
• Hardware requirements:
– Shared memory
– Indivisible hardware read-modify-write mechanism across the shared
memory bus
– CPU interrupt capability for best performance
– Supported architectures:
• 680x0 and 683xx
• SPARC
• SPARClite
• PPC6xx
• MIPS
• i960
VxFusion
• VxWorks accessory for loosely coupled configurations and standard IP
networking
• An extension of the VxWorks message queue: the distributed message queue
• Features:
– Media-independent design
– Group multicast/unicast messaging
– Fault-tolerant, locale-transparent operations
– Heterogeneous environment
• Supported targets:
– Motorola: 68K, CPU32, PowerPC
– Intel x86, Pentium, Pentium Pro
[Figure: layer diagram; labels: App1, App2, VxFusion, Adapter Layer, Transport]
pSOS
[Figure: pSOS 2.5 structure; labels: pSOS+ Kernel, I/O system, BSPs, Memory Management, POSIX Library]
Supported processors
• PowerPC
• 68K
• ColdFire
• MIPS
• M32/R
• m.core
• NEC v8xx
• ST20
• SPARClite
[Figure: layer diagram; labels: Kernel (pluggable schedulers, mem alloc, synchronization, timers, interrupts, threads), HAL]
Supported processors
• Advanced RISC Machines ARM7
• Fujitsu SPARClite
• Matsushita MN10300
• Motorola PowerPC
• Toshiba TX39
• Hitachi SH3
• NEC VR4300
• MB8683x series
• Intel strong ARM
Kernel
• No notion of a task; supports multiple threads
• Interrupt and exception handling
• Preemptive scheduling: time-slice scheduler,
multi-level queue scheduler, bitmap scheduler
and priority inheritance scheduling
• Counters and clocks
• Mutex, semaphores, condition variable,
message box
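A bitmap scheduler of the kind named above keeps one bit per priority level and picks the most urgent ready level by scanning for the first set bit. The sketch below is illustrative only and is not eCos source. It assumes 32 levels with priority 0 as the highest; the names are hypothetical.

```c
#include <stdint.h>

#define NUM_PRIOS 32u  /* assumption: priority 0 is the highest */

struct ready_bitmap {
    uint32_t bits;  /* bit p set: a thread at priority p is ready to run */
};

void mark_ready(struct ready_bitmap *r, unsigned prio)
{
    if (prio < NUM_PRIOS)
        r->bits |= UINT32_C(1) << prio;
}

void mark_blocked(struct ready_bitmap *r, unsigned prio)
{
    if (prio < NUM_PRIOS)
        r->bits &= ~(UINT32_C(1) << prio);
}

/* Return the highest ready priority (lowest set bit), or -1 if none.
 * Real kernels usually replace this loop with a single
 * count-trailing-zeros instruction. */
int highest_ready(const struct ready_bitmap *r)
{
    for (unsigned p = 0; p < NUM_PRIOS; p++)
        if (r->bits & (UINT32_C(1) << p))
            return (int)p;
    return -1;
}
```

The lookup cost does not depend on how many threads exist, which gives the scheduler a fixed, predictable decision time.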
Hardware Abstraction Layer
• Architecture HAL abstracts basic CPU, including:
– interrupt delivery
– context switching
– CPU startup, etc.
• Platform HAL abstracts current platform, including
– platform startup
– timer devices
– I/O register access
– interrupt control
• Implementation HAL abstracts properties that lie between the above,
– architecture variants
– on-chip devices
• The boundaries among them blur.
Summary on RTOS
                              VxWorks                  pSOS             eCos
Task                          Y                        Y                Only thread
Scheduler                     Preemptive, static       Preemptive       Preemptive
Synchronization mechanism     No condition variable    Y                Y
POSIX support                 Y                        Y                Linux
Scalable                      Y                        Y                Y
Custom hw support             BSP                      BSP              HAL, I/O package
Kernel size                   -                        16KB             -
Multiprocessor support        VxMP / VxFusion          PSOS+m kernel    None
                              (accessories)
Recall the “Board Support Package”
• BSP = Initializing code for hardware device +
device driver for peripherals
• BSP Developer’s Kit
[Figure: BSP layering; labels: hardware-independent code, processor-dependent code, device-dependent code, BSP]
Introduction to Device Drivers
• What are device drivers?
– Make the attached device work.
– Insulate the complexities involved in I/O handling.
[Figure: layer stack, top to bottom: Application, RTOS, Device driver, Hardware]
Proliferation of Interfaces
• New Connections
– USB
– 1394
– IrDA
– Wireless
• New Models
– JetSend
– Jini
– HTTP / HTML / XML / ???
– Distributed Objects (DCOM, CORBA)
Leads to Proliferation of Device Drivers
Device Driver Characterization
• Device Drivers’ Functionalities
– initialization
– data access
– data assignment
– interrupt handling
Device Characterization
• Block devices
– devices with fixed data block sizes
• Character devices
– byte-stream devices
• Network device
– manage local area network and wide area network interconnections
I/O Processing Characteristics
• Initialization
– make itself known to the kernel
– initialize the interrupt handling
– optional: allocate the temporary memory for device
driver
– initialize the hardware device
• Front-End Processing
– initiation of an I/O request
• Back-End Processing
– handles the completion of I/O operations
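The three phases above can be sketched against a simulated device. Everything here is hypothetical: the structure, the function names, and the single-outstanding-request rule are assumptions made for illustration, not any RTOS's driver interface. The front end starts a request and returns; the back end, which a real driver would call from its interrupt handler, completes it.

```c
#include <stdbool.h>

/* Simulated device with one outstanding request at a time. */
struct sim_device {
    bool     initialized;
    bool     busy;
    int      pending_value;   /* value currently being "transferred" */
    int      last_completed;  /* result delivered by the back end */
    unsigned completions;
};

/* Initialization: set up driver state and the (simulated) hardware. */
void drv_init(struct sim_device *dev)
{
    dev->initialized = true;
    dev->busy = false;
    dev->pending_value = 0;
    dev->last_completed = 0;
    dev->completions = 0;
}

/* Front end: start an I/O request and return without waiting.
 * Returns 0 on success, -1 if the device is not ready. */
int drv_start_io(struct sim_device *dev, int value)
{
    if (!dev->initialized || dev->busy)
        return -1;
    dev->busy = true;
    dev->pending_value = value;
    return 0;
}

/* Back end: run when the device signals completion. */
void drv_complete_io(struct sim_device *dev)
{
    if (!dev->busy)
        return;  /* spurious completion: nothing was in flight */
    dev->last_completed = dev->pending_value;
    dev->completions++;
    dev->busy = false;
}
```

Splitting the work this way keeps the caller from blocking on the hardware and keeps the interrupt-time code short.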
Commercial Resources
• Aisys DriveWay 3DE
– Motorola MPC860, MC68360, MC68302, AMD E86, Philips
XA, 8C651, PIC 16/17
• Stenkil MakeApp
– Hitachi H8, SH1, SH3, SH7x, HCAN
• Intel’s ApBuilder
• Motorola MCUnit
• GO DSP Code Composer
– TI DSPs
• CoWare
Aisys 3DE DriveWay Features
• Extensive documentation: KB help along the
way as detailed as a chip manual: traffic.ext,
traffic.dwp
• CNFG for configuring the chip such as memory
and clock. Gives warning if necessary
• Can generate test function
• Can insert user code
• One file for each peripheral
DriveWay Design Methodology
[Figure: DriveWay flow; labels: GUI, user data, .DWP, code “generator” (little generation, more manipulation), manipulation of K.B. database, chip-specific K.B. (.DLL), output files]
K.B. Database
• A specific K.B. per chip family
• Family of chips
– chip
• peripherals
– functional objects (timer, PWM counter)
» functions
» physicals (register setting, values, clock rate)
» actual code
DriveWay Builder
• Add chip
• Add peripheral
• Create skeleton, link to other things such as GUI
• Code reuse in adding a new chip in an existing
family, e.g., use code in MPC 860 for MPC 821
• Easy to create infrastructure but specifics have
to be written
About the code generator (1)
• Cut and paste K.B. database
• Areas where we can use automation for
device driver generation:
– model user specification
– extract useful information for drivers from HDL
description of the chip
• MAP registers
• interrupt
About the code generator (2)
[Figure: hardware/software boundary; labels: Hardware, Software, Device, Glue Logic, Driver]
Interface Synthesis Example: Memory Mapped I/O
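[Figure: software compiled on the processor reaches the hardware through a device driver and glue logic; the driver's store *FFA3 = value; targets memory address FFA3; labels: SW, Processor, HW, Glue Logic, Device Driver, Memory, Address FFA3]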
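In C, a memory-mapped register access such as the FFA3 store is normally written through a `volatile` pointer, so the compiler neither removes nor reorders the access. The sketch below assumes an 8-bit register; the helper names are hypothetical, and the address macro simply reuses the FFA3 value from the slide.

```c
#include <stdint.h>

/* Register address taken from the slide's FFA3 example; it is only
 * meaningful on a target where a device is mapped there. */
#define DEVICE_REG_ADDR ((uintptr_t)0xFFA3u)

/* Write one byte to a device register. */
void reg_write8(volatile uint8_t *reg, uint8_t value)
{
    *reg = value;
}

/* Read one byte from a device register. */
uint8_t reg_read8(const volatile uint8_t *reg)
{
    return *reg;
}

/* Set the bits in `mask` without disturbing the others. */
void reg_set_bits8(volatile uint8_t *reg, uint8_t mask)
{
    *reg = (uint8_t)(*reg | mask);
}

/* On the real target (not on a hosted PC) the slide's store becomes:
 *   reg_write8((volatile uint8_t *)DEVICE_REG_ADDR, value);
 */
```

Taking the register as a pointer parameter lets the same driver code be exercised on a development host against an ordinary variable.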
SW: Embedded Software Tools
[Figure: tool-flow diagram; labels: user, application source code, compiler, a.out, simulator, debugger, application software, RTOS, CPU, ROM, RAM, ASIC]
ASIC Value Proposition
[Figure: chip floorplan; labels: S/P, RAM, DMA, µC, ASIC logic, DSP core]
Area vs. Program Instructions (Killian, Tensilica)
[Figure: plot of processor + code RAM area (mm²), 0.0 to 5.0, against program size (instructions), 0 to 8000, for Xtensa, MIPS-4Kc, ARC, ARM9 and ARM9-Thumb]
• Based on base 0.18 µm implementation plus code RAM or cache
• Xtensa code ~10% smaller than ARM9 Thumb, ~50% smaller than MIPS-Jade, ARM9 and ARC
• ARM9-Thumb has reduced performance
• RAM/cache density = 8 KB/mm²
SW Compiler Value Proposition
• 20% area decrease in RAM portion
• 25% higher performance
• move to higher level - C rather than assembler
[Figure: chip floorplan; labels: S/P, RAM, DMA, µC, ASIC logic, DSP core, Compaq/Digital StrongARM]
Compiler Support
• BUT, few companies focused on compiler
support for embedded systems:
– Cygnus => RedHat
– Tartan => TI
– Green Hills
• Why?
• Bad “buying behaviors” – few seats,
low ASPs
Current Status on Compiler Support
• Adequate compiler and debugger support in breadth and quality for embedded
microprocessors/microcontrollers
– ARM
– MIPS
– PowerPC
– Mot family
• From
– Cygnus/RedHat
– Manufacturer
– Green Hills
• DSP’s still poorly supported
– Tartan acquired by Texas Instruments
– WHY????
• NO support for growing generation of special purpose processors:
– TMS320C80
– IXP1200
Recall: Architectural Features of DSPs
• Data path configured for DSP
– Fixed-point arithmetic
– MAC- Multiply-accumulate
• Multiple memory banks and buses -
– Harvard Architecture
– Multiple data memories
• Specialized addressing modes
– Bit-reversed addressing
– Circular buffers
• Specialized instruction set and execution control
– Zero-overhead loops
– Support for MAC
• Specialized peripherals for DSP
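Two of these features, the multiply-accumulate datapath and circular-buffer addressing, can be seen in a small fixed-point FIR filter. The code below is a plain C sketch with hypothetical names, using Q15 samples and four taps as assumptions; on a DSP the modulo index update and the MAC would each be handled by dedicated hardware.

```c
#include <stdint.h>

#define TAPS 4u

struct fir {
    int16_t  coeff[TAPS];  /* Q15 filter coefficients */
    int16_t  delay[TAPS];  /* circular buffer of the most recent samples */
    unsigned head;         /* index of the newest sample */
};

/* Push one Q15 input sample and return one Q15 output sample.
 * The result is not saturated; coefficients are assumed to keep it in
 * the int16_t range, and >> on a negative value is assumed arithmetic. */
int16_t fir_step(struct fir *f, int16_t sample)
{
    int64_t acc = 0;  /* wide accumulator, like a DSP's guard bits */
    unsigned idx;

    f->head = (f->head + 1u) % TAPS;  /* circular addressing */
    f->delay[f->head] = sample;

    idx = f->head;
    for (unsigned k = 0; k < TAPS; k++) {
        acc += (int32_t)f->coeff[k] * f->delay[idx];  /* multiply-accumulate */
        idx = (idx + TAPS - 1u) % TAPS;               /* step back one sample */
    }
    return (int16_t)(acc >> 15);  /* Q30 sum back to Q15 */
}
```

With all four coefficients set to 8192 (0.25 in Q15) the filter is a moving average, so a constant input of 4000 ramps the output up in steps of 1000 until the buffer is full.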
Example: IXP1200
[Figure: IXP1200 block diagram; labels: Host CPU (optional), PCI MAC devices, PCI bus 66 MHz, FIFO bus 66 MHz, bus widths 32 and 64, peripherals, SDRAM Ctrl, SRAM Ctrl, PCI interface, hash engine, IX bus interface, six MicroEng blocks, SA core with ICache, DCache and Mini DCache, scratch pad SRAM]
• 6 micro-engines
– RISC engines
– 4 contexts/eng
– 24 threads total
• IX Bus Interface
– packet I/O
– connect IXPs
• scalable
• StrongARM
– less critical tasks
• Hash engine
– level 2 lookups
• PCI interface
Summary
• Embedded software support for microcontrollers and microprocessors is
broadly available and of adequate quality
– RTOS
– Device drivers
– Compilers
– Debuggers
• Embedded software support for DSP processors is inadequate:
– Patchy support – many parts lack support
– Quality poor – lags hand coding by 20-100%
• Embedded software support for special purpose processors often non-
existent
• Still in a “build the hardware, then write the software” world
• Alternatives?
ASIP/Extensible micro DESIGN FLOW
[Figure (Killian, Tensilica): design-flow diagram; labels: application code, µarchitecture designer, configure base µP, describe new instructions in TIE, processor generator, processor RTL (Verilog), ASIC flow, software generator, software tools, compile, software, application, µP, memory]
Tensilica TIE Design Cycle
[Figure (Killian, Tensilica): flowchart; steps: develop application in C/C++, identify potential new instructions, describe new instructions, measure hardware impact; decision “Acceptable?” with Y and N branches]