Lecture 07-Real-Time Linux

Uploaded by

Doug Lion

Real-Time Linux

Real-Time in Handheld & Embedded Systems
• Cost / Performance / Power / Weight Compromise
– Competitive, High-Volume, Low-Margin Markets
– Maximum Feature-set, Add-ons, Responsive UI feel
– Device specs: minimal CPU & Memory, Battery Powered
– Minimal CPU = High CPU utilization
– High CPU load + Time-Critical functionality => RT specs
– Real-time requirements will never be alleviated by improvements in hardware performance / efficiency
• Software that uses the latest hardware easily keeps up with, and usually outpaces, advances in hardware technology
• If you don't believe that, go shopping (for a mobile phone)
Beginning of Linux – as a UNIX clone

Linux System
• Linux kernel
– In September 1991, Linus Torvalds, a second-year Computer Science student at the University of Helsinki, developed the first Linux kernel, version 0.01, and released it on the Internet.
– Versions 1.0, 2.0, 2.2, 2.4, and 2.6 were released in 1994, 1996, 1999, 2001, and 2003, respectively.
– Current version (at the time of this lecture) is 2.6.37.2, released on 24 Feb 2011.
• Linux system
– Software from the GNU project + Linux kernel = Linux system
– Distributions: RedHat, Fedora, CentOS, Ubuntu, etc.
Evolution of Linux
• Early Linux Was Not Designed for Real-Time Processing
– Early Linux (1.x kernel) was often installed on retired Windows PCs
• Old/obsolete hardware remained useful under Linux due to the efficiency of the O/S
• Linux outperformed Windows in reliability and uptime (still does)
– Linux Design: Fairness, Throughput and Resource-Sharing
• Basic Unix design principles applied in the kernel
• Heavily (over-)loaded systems continue to make progress
• Does not drop network connections or starve users / applications
– Fairness- and Resource-Sharing Design is Linux's Strength
• Helped make Linux competitive and popular in enterprise-server and development-application environments
• Gave rise to RedHat and others
• Essential to the evolution of Linux, inherited from its UNIX legacy
Real time computing requirements
• Timing
• Hardware Interface
• Communication and Synchronization
• Memory Management
• Task Management
• File system
• Reliability
• Scheduling
• Configuration
• External Communications
• Error Reporting
• Embedded Features
Linux as a Real-Time OS
• Traditional RT systems were custom built and not extensible, i.e. it was tough to develop new applications for them
• As technology improved, generic real-time OSes became acceptable
• Among OSes suited to extensible development, Linux looks more appealing
Linux as your real-time solution?
• Could increase the priority of “real-time” tasks and assume they get scheduled
• Problem: Linux optimizes the average case, whereas an RTOS should work under worst-case assumptions
Why and why not Linux?
Pros
• Linux (and its real-time versions) is free and Open Source!!
• Easy for developing RT applications
Cons
• Linux didn’t have any corporate support until now
• Linux is a very good general-purpose operating system, but not such a good real-time OS
• Fairness, progress and resource-sharing conflict with the requirements of time-critical applications
• This is because the design motives of a conventional OS and an RTOS are different
• UNIX-legacy operating systems were designed with operating principles focused on throughput and progress
– User tasks should not stall under heavy load
– System resources must be shared fairly between users
Linux – A Simplified View

Linux – conflicts with RT constraints
• Coarse-grained synchronization: long intervals when a task has exclusive use of data (fine-grained synchronization adds a lot of overhead, reducing average-case performance)
• Linux batches operations for efficient use of H/W, which hurts worst-case performance
– E.g. freeing a list of pages when memory is full
• Linux doesn't preempt a low-priority task during a system call
• Linux will make high-priority tasks wait for low-priority tasks to release resources
Real-Time Linux
• A class of Linux systems that can meet real-time requirements.
• A number of such real-time Linux systems are available:
– RTLinux – hard real-time
– RTAI – hard real-time
– MontaVista – soft real-time
– and many more …
• Many of them were developed as research projects and released to the public for free.
• Some of them have been commercialized, though.
Real Time Linux approaches
• Modify the current Linux kernel to guarantee RT constraints
– Used by KURT
• Make the standard Linux kernel run as a task of the real-time kernel
– Used by RT-Linux, RTAI
Modifying Linux kernel
• Advantages
– Most problems, such as interrupt handling, already solved
– Less initial labor
• Disadvantages
– No guaranteed performance
– RT tasks don’t always have precedence over non-RT tasks.
Linux as a process of a RT kernel
• Advantages
– Can make hard real time guarantees
– Easy to implement a new scheduler
• Disadvantages
– Initial port difficult, must know a tremendous amount about underlying hardware
– Running a small real-time executive is not a substitute for a full-fledged RTOS
RTLinux or RTCore
• The first real-time Linux system, developed as a research project over a decade ago.
• Subsequently commercialized to become the real-time core (RTCore) of a commercial Linux called Wind River Linux.
• RTLinux or RTCore is a hard real-time microkernel that runs the entire Linux operating system as a fully preemptable process.
• The technologies developed to enable the real-time operation are patented and can’t be used freely.
RTLinux / RTAI Architecture
[Figure: layered architecture. User tasks and system libraries sit on the Linux kernel and its drivers, which receive I/O through software interrupts. Real-time tasks sit on the real-time scheduler / real-time plugin, which receives I/O and hardware interrupts from the HARDWARE.]
• Hardware abstraction layer
• Real-time interrupt dispatcher, real-time scheduler
• Inter-process communication services
RTAI
• RTAI stands for Real-Time Application Interface and is a Linux extension for hard real-time applications.
• It was initially developed as a research project in Italy to enhance the RTLinux functionality of the time.
• Subsequently, new technologies were invented to replace all the code adopted from the initial RTLinux prototype, to overcome any patent problems.
• The new technologies are collectively referred to as ADEOS.
• Current version (at the time of this lecture) is 3.8.1, released in May 2010.
• Still actively developed and maintained by the original developers in Italy and the Internet community.
RTAI – A Simplified View

RTAI Overview
• Based on a Real-Time Hardware Abstraction Layer (HAL) (a HAL is also used in Windows NT)
• HAL exports Linux data and functions related to hardware
• HAL defines an interface between RTAI and Linux
• Software architecture
– Interface to Linux hardware management (HAL)
– 3 basic components: dispatcher, scheduler, fifos
– 1 interface used in user tasks to initialize and start the components
• RTAI is basically an interrupt dispatcher (reroutes interrupts to Linux if necessary), e.g. disk interrupts
• The Windows NT hardware abstraction layer (HAL) refers to a layer of software that deals directly with your computer hardware.
RT-Linux
• Open source Linux project; supports x86, PowerPC, Alpha
• Patch of the regular Linux kernel (simply install the patch and recompile the kernel)
• Provides an RT API for developers
• Runs the Linux kernel as the lowest-priority process
• RT tasks are coded as modules; modules are inserted and removed at the user's discretion
• Extremely good at handling periodic tasks
• Communicates with the non-RT kernel and other RT tasks via FIFO queues
• Tools are provided for graphical analysis of RT execution
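The FIFO link between an RT task and the non-RT side can be sketched as a bounded queue whose producer never blocks. This is only a toy model in Python, not the RT-Linux API: the class `RtFifo`, its methods and the capacity are hypothetical names chosen for illustration.

```python
from collections import deque

class RtFifo:
    """Bounded FIFO: the real-time side never blocks (hypothetical model)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def put(self, sample):
        # Called from the RT task: constant time, never waits for the reader.
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(sample)
        return True

    def drain(self):
        # Called from the non-RT (Linux) side whenever it gets to run.
        out = list(self.items)
        self.items.clear()
        return out

fifo = RtFifo(capacity=4)
for tick in range(6):          # RT task produces 6 samples before Linux runs
    fifo.put(tick)
received = fifo.drain()
```

The design point is that a slow non-RT reader costs the RT task dropped samples, never missed deadlines.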
RT-Linux
[Figure: RT-Linux task structure and interrupt dispatcher]
Problems with RT-Linux
• Currently no support for aperiodic tasks
• Not very useful for complex RT systems
• Currently limited to simple problems

Linux kernel improvements for RT
• Features and performance enhancements that let the standard Linux kernel compete with real-time products and applications:
– Enhanced schedulers
– Virtual memory
– Shared memory
– POSIX (Portable Operating System Interface) timers
– Real-time signals
– POSIX asynchronous I/O
– POSIX threads
– Quality of service capabilities
– Low latency/preemptable kernel modifications
Low-Latency patches for Linux
• Kernel latency is a quantity used to measure the difference between the theoretical schedule and the actual one
• In a standard Linux kernel, the maximum latency is equal to the maximum length of a system call plus the processing time of all the interrupts that fire before returning to user mode.
• This value can be as large as 100 ms
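As a rough illustration of this definition, the sketch below computes per-activation latency for a periodic task as actual wake-up time minus ideal wake-up time. The function name and the sample timestamps are invented for the example; they are not measurements.

```python
def latencies(period_us, actual_wakeups_us):
    """Latency of each activation: actual wake-up time minus the ideal time k * period."""
    return [actual - k * period_us for k, actual in enumerate(actual_wakeups_us)]

# Hypothetical timestamps for a 1000 us periodic task (not real data).
actual = [12, 1008, 2150, 3005, 4020]
lat = latencies(1000, actual)
worst = max(lat)                  # what a real-time design must bound
average = sum(lat) / len(lat)     # what a general-purpose OS tends to optimize
```

A single slow activation barely moves the average but sets the worst case, which is the figure a real-time system has to guarantee.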

Low-Latency Linux (Red-Linux, some Real-time Linux)
• This approach corrects the monolithic structure by inserting explicit rescheduling points inside the kernel. A thread executing inside the kernel can then explicitly decide to yield the CPU to some other thread
• In this way, the size of non-preemptable sections is reduced, thus decreasing the latency
• The consistency of kernel data is enforced by using cooperative scheduling
• Since the low-latency patch has been carefully hand-tuned for quite a long time, it performs surprisingly well
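The effect of explicit rescheduling points can be sketched with a Python generator standing in for a long kernel routine. Everything here (function names, unit counts, spacing of the yield points) is a hypothetical model, not kernel code.

```python
def kernel_call(total_units, resched_every):
    """A long kernel routine; each `yield` is an explicit rescheduling point."""
    for unit in range(1, total_units + 1):
        # ... one unit of non-preemptable kernel work happens here ...
        if resched_every and unit % resched_every == 0 and unit < total_units:
            yield unit               # offer the CPU to a waiting thread

def wait_for_cpu(total_units, resched_every, arrival):
    """Units a thread arriving at time `arrival` waits before it can run."""
    for point in kernel_call(total_units, resched_every):
        if point >= arrival:
            return point - arrival   # it gets the CPU at the first point after arrival
    return total_units - arrival     # no point reached: wait for the call to finish

# Worst case over every possible arrival time during a 100-unit kernel call.
worst_without = max(wait_for_cpu(100, 0, a) for a in range(100))
worst_with = max(wait_for_cpu(100, 10, a) for a in range(100))
```

With a rescheduling point every 10 units, the worst-case wait in this model drops from the full length of the call to the spacing between points.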
Preemptable Linux (used in most real-time systems)
• Removes the constraint of a single execution flow inside the kernel. It is not necessary to disable preemption when an execution flow enters the kernel
• To support full kernel preemptability, kernel data must be explicitly protected using mutexes or spinlocks
• Example: the Linux preemptable kernel patch, maintained by Robert Love and sponsored by MontaVista
MontaVista
• MontaVista Inc. provides a Linux solution for embedded systems
• The solution’s aim is to make the Linux kernel fully preemptable
• It identifies the points where priority inversion occurs in Linux and makes those points fully preemptable
• A good embedded solution, but not a complete RT solution.
Real-Time Linux 2.6 Enablers
• Pro-Audio Performance Requirements
– Audio Community Involved in Kernel-Preemption since 2.2
– Audio Community strongly Endorsing RT technology
• Embedded Application Domain
– Single-Chip, Mobile Applications (Wireless / Cellular Handsets)
– Predictable OS performance eliminates HW design uncertainty
• Reliable Prototyping and Improved Product Scheduling
• Multimedia Carrier (QOS) Application Domain
– Telephony, Audio / Video / Multimedia / Home Entertainment
• Fine-Granular Preemption improves SMP scalability
– Mainstreaming of SMP Technology
• Dual / Quad / Octa - Core Intel, AMD, PPC, Arm
Real-Time and Linux Kernel Evolution
• Gradual Kernel Optimizations over Time
– SMP Critical sections (Linux 2.x)
– Low-Latency Patches (Linux 2.2)
– Preemption Points / Kernel Tuning (Linux 2.2 / 2.4)
– Preemptible Kernel Patches (Linux 2.4)
– Fixed-time “O(1)” Scheduler (Linux 2.6)
– Voluntary Preemption (Linux 2.6)
• In 2003-04 Linux 2.6 RT Technology Regressed
– Early Linux 2.6 Real-Time Performance was worse than 2.4 Kernel Performance
– Audio Community and others balked at moving to 2.6 Kernel Base
Critical Section Locking
• Linux 2.6 Kernel Critical Sections are Non-Preemptible
– Critical sections protect shared resources, e.g. hardware registers, I/O
ports, and data in RAM
– Critical sections are shared by Processes, Interrupts and CPUs.
– Effective protection is provided by the Spin-Lock Subsystem
– Critical sections must be locked and unlocked
– Locked critical sections are not preemptible
– Linux 2.6 Kernel has 11,000 critical sections
– Exhaustive Kernel testing to identify worst-case code paths
– Labour-intensive cleanup of critical sections
– No control over 3rd party drivers
– Worst-case after cleanup still not acceptable
– Maintenance, community education, policing / regression testing

Interrupt Handlers
• Linux 2.6 Kernel: Unbounded IRQ subsystem latencies
– Task-Preemption latency increases with hardware-interrupt load
– Interrupts cannot be preempted
– No Priorities for Interrupts
• IRQ Subsystem always preempts tasks unconditionally
– Unbounded SoftIRQ subsystem (“Bottom Half Processing”)
• Activated by HW IRQs (Timers, SCSI, Network)
• SoftIRQs re-activate, iterate
– Driver-level adaptations
• Network Driver NAPI adaptation reduces denial of service (D.o.S.) effects of high packet loads
Legacy Locking
• Existing Locking Subsystems are not Priority-Aware
– System semaphore
• Counting semaphore used to wake multiple waiting tasks
• No support for priority inheritance
• No priority ordering of waiters
– Big Kernel Lock (BKL)
• Originally non-preemptible, now preemptible using system semaphore
• Can be released by blocking tasks, re-acquired upon wake-up
• No priority-awareness, or priority inheritance for contending tasks
– RCU (Read-Copy-Update) Locks in Network subsystem
• Read-optimized cached locking requiring race-free invalidation
– Read – Write Locks
• Classical blocking / starvation issues with no priority awareness
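To see why the lack of priority inheritance matters, the sketch below simulates three tasks on one CPU, tick by tick. The workload (task names, arrival times, execution times) is hypothetical, and the scheduler is a deliberately minimal model rather than the kernel's.

```python
def time_high_blocked(inherit):
    """Ticks the high-priority task H waits for a lock held by low-priority L."""
    # Hypothetical workload: L holds the lock for 4 ticks from t=0,
    # H arrives at t=1 and needs the lock, M arrives at t=2 and needs 5 ticks of CPU.
    prio = {"L": 1, "M": 2, "H": 3}
    remaining = {"L": 4, "M": 5}
    arrival = {"L": 0, "M": 2, "H": 1}
    t = 0
    while remaining["L"] > 0:                    # the lock is released when L finishes
        ready = [n for n in remaining if remaining[n] > 0 and arrival[n] <= t]
        h_waiting = t >= arrival["H"]

        def effective(name):
            # With inheritance, L runs at H's priority while H is blocked on the lock.
            if inherit and name == "L" and h_waiting:
                return prio["H"]
            return prio[name]

        running = max(ready, key=effective)      # highest effective priority runs
        remaining[running] -= 1
        t += 1
    return t - arrival["H"]

blocked_plain = time_high_blocked(inherit=False)
blocked_inherit = time_high_blocked(inherit=True)
```

Without inheritance, the medium-priority task M preempts L and so indirectly delays H (priority inversion); with inheritance, H waits only for the rest of L's critical section.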

The Fully Preemptible Linux Kernel
• Dramatic Reduction in 2.6 Preemption Latencies
– Multiple Concurrent Tasks in Independent Critical Sections
– Generally Fully Preemptible => “No Delays”
• Non-preemptible: Interrupt off paths and lowest-level interrupt management
• Non-preemptible: Scheduling and context switching code
• Design Flexibility
– Provides Full Access to Kernel Resources to RT Tasks
– Supports existing driver and application code
– User-space Real-Time
• Optimization Flexibility
– RT Tasks designed to use Kernel-resources in managed ways can reduce or eliminate Priority-Inheritance delays
• Adequate Instrumentation
– Latency timing, latency triggers & stack tracing, histograms
Kernel Evolution: Preemptible Code
[Figure: share of preemptible vs. non-preemptible kernel code for Kernel 2.0, Kernels 2.2-2.4, Preemptible Kernel 2.4, Kernel 2.6 and Real-Time Kernel 2.6]
Real-Time Linux 2.6 Performance
• Real-Time Linux 2.6 Kernel Performance
– Far exceeds most stringent Audio performance requirements
– Enables sub-millisecond control-loop response
– Enables Hard Real Time for qualified RT-aware Applications
• SMP Kernel Performance
– SMP-safe code is by definition preemptible
– Any code that allows concurrent execution by multiple CPUs also allows context switching and therefore preemption
– Increased preemptible code surface in the Kernel also increases SMP throughput / efficiency
Real-Time Linux 2.6 Performance (Cont)
[Figure: MontaVista Linux 4.0 results for the No Preemption, Preemptible and Real-Time Preemption kernels]
• Target machine:
– Intel® Celeron® 800 MHz
• CPU utilization during test:
– 100% most of the time
• Test Duration:
– 20 hours
References
• RT-Linux: http://www.rtlinux.org
• RTAI: http://www.aero.polimi.it/projects/rtai/contrib.htm
• MontaVista: http://www.mvista.com
• Linux as a real-time operating system – Freescale Semiconductor, David Beal, Nov/2005
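Software Environments for Embedded Systems
SW: Embedded Software Tools
[Figure: embedded software tools. The USER's application source code goes through the compiler to become application software (a.out) running on an RTOS; a simulator and a debugger support development; the target comprises CPU, ROM, RAM and ASICs.]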
Another View of Microprocessor Architecture
• Let’s look at current architectural evolution from the standpoint of the software developers …, in particular Jerry Fiddler
Fiddler’s Predictions for the Next Ten Years
(2010)
• End of the “Age of the PC”
• Lots of Exciting Applications
• Development Will Continue To Be Hard
– Even as we and our competitors continue to make
incredible efforts
• Chips - No predictions
• MEMS / Nano-technology & Sensors Will
Impact Us
Fundamental Principles
• Computers are, and will be, everywhere
• The world itself is becoming more intelligent
• Our infrastructure will have major software content
• Most of our access to information will be through embedded systems
• Economics will inexorably drive deployment of embedded systems
• The Internet is one important factor in this trend
• Reliability is a critical issue
• EVERY tech and mfg. business will need to become good at embedded software
What Will Be Embedded in Ten Years?
• Everything That is Now Electro-Mechanical
• Machines (Nano-Machines)
• Analog Signals
• Anything that communicates
• Lots of stuff in our cars
• Our Bodies
– Today - Pacemakers
– Soon - De-Fibrillators, Insulin Dispensers
– We can all be the $6M Person, for a lot cheaper
• All sorts of interfaces
– Speech, DNI, etc.
• In short: EVERYTHING
Embedded Microprocessor Evolution
Year          1989       1993           1995            1999
Transistors   > 500k     2+M            5+M             22+M
Process (µ)   1 - 0.8    0.8 - 0.5      0.5 - 0.35      0.25 - 0.18
Clock         33 MHz     75 - 100 MHz   133 - 167 MHz   500 - 600 MHz

• Embedded CPU cores are getting smaller; ~ 2 mm² for up to 400 MHz
– Less than 5% of CPU size
• Higher Performance by:
– Faster clock, deeper pipelines, branch prediction, ...
• Trend is towards higher integration of processors with:
– Devices that were on the board now on chip: “system on a chip”
– Adding more compute power by add-on DSPs, ...
– Much larger L1 / L2 caches on silicon
Microprocessor Chaos
J. Fiddler - WRS
[Figure: embedded microprocessor architectures by year]
• 1980: 68000
• 1990: 29k, 680x0, CPU32, 80x86, SPARC, MIPS R3k, i960
• 1996: 680x0, CPU32, PowerPC, 80x86, MIPS 3k/4k/5k, SPARC, SH 1/2/3, 29k, RAD 6k, Siemens C16x, NEC V8xx, PARISC, i960, 563xx
• 1998: ST 20, M32 R/D, StrongARM, ARM, SH-DSP, SH 4, MCORE, 680x0, CPU32, PowerPC, 80x86, MIPS 3k/4k/5k, SPARC, SH 1/2/3, 29k, RAD 6k, Siemens C16x, NEC V8xx, PARISC, i960, 563xx

A Challenging Environment
• Expanding Functional Demands Of Embedded Applications
• “And keep it small, stupid!”
• Numerous Microprocessor Architectures
• Derivative Processors
• Application-Specific CPUs
• Systems On A Chip
New Hardware Challenges Software Development
• More & More Architectures
– User-Customizable µprocessors
• More Power Demands More Software Functionality
– Software is not following Moore’s law (yet)
• System-on-a-chip
• DSP
Embedded Software Crisis
[Figure: forces behind the “Embedded Software Crisis”: cheaper, more powerful microprocessors; increasing time-to-market pressure; more applications; bigger, more complex applications]
SW: Embedded Software Tools
[Figure: embedded software tools (application source code, compiler, application software / a.out, RTOS, simulator, debugger, CPU, ROM, RAM, ASICs), repeated as a section marker]
Outline on RTOS
• Introduction
• VxWorks
– General description
• System
• Supported processors
– Details
• Kernel
• Custom hardware support
• Closely coupled multiprocessor support
• Loosely coupled multiprocessor support
• pSOS
• eCos
• Conclusion
Embedded Development: Generation 0
• Development: Sneaker-net
• Attributes:
– No OS
– Painful!
– Simple software only
Embedded Development: Generation 1
• Hardware: SBC, minicomputer
• Development: Native
• Attributes:
– Full-function OS
• Non-Scalable
• Non-Portable
– Turnkey
– Very primitive
Embedded Development: Generation 2
• Hardware: Embedded
• Development: Cross, serial line
• Attributes
– Kernel
– Originally no file sys, I/O, etc.
– No development environment
– No network
– Non-portable, in assembly
Embedded Development: Generation 3
• Hardware: SBC, embedded
• Development: Cross, Ethernet
– Integrated, text-based, Unix
• Attributes
– Scalable, portable OS
• Includes network, file & I/O sys, etc.
– Tools on target
• Network required
• Heavy target required for development
– Closed development environment
Embedded Development: Generation 4
• Hardware: Embedded, SBC
• Development: Cross
– Any tool - Any connection - Any target
– Integrated GUI, Unix & PC
• Attributes
– Tools on host
• No target resources required
• Far More Powerful Tools (WindView, CodeTest, …)
– Open dev. environment, published API
– Internet is part of dev. environment
• Support, updates, manuals, etc.
Embedded Development: Generation 5???

• Super-scalable
• Communications-centric
• Virtual application platform
– Java?
• Multi-media
• Way-cool development environment
– Much easier to create, debug & re-use code
– Easy for non-programmers to contribute
The RTOS Evolution
[Figure: software stack of a typical embedded device, by year]
• 1980: Application; Kernel (10%*)
• 1990: Application; File System; Networking; Kernel (30%*)
• 1996: Application; X Windows; WindNet; Memory Management; Multiprocessing; File System; Networking; Kernel (75%*)
• 1998: Application; Browser / GUI; Java; Advanced Interconnect; Advanced Networking; Distributed Objects; Fault Tolerance; Multiprocessing; File System; Networking; Kernel (90%*)
*Percent of total software supplied by RTOS vendor in a typical embedded device

Introduction to RTOS
• Wind River Systems Inc.: VxWorks
– http://www.wrs.com
• Integrated Systems Inc.: pSOS
– http://www.isi.com
• Cygnus Inc. => RedHat: eCos
– http://www.cygnus.com => www.redhat.com
VxWorks
[Figure: VxWorks 5.4 Scalable Run-Time System, top to bottom]
• Real-Time Embedded Applications
• Graphics, Multiprocessing support, Internet support
• Java support, POSIX Library, File system
• WindNet Networking
• Core OS
• Wind Microkernel
Supported Processors
• PowerPC
• 68K, CPU 32
• ColdFire
• MCORE
• 80x86 and Pentium
• i960
• ARM and Strong ARM
• MIPS
• SH
• SPARC
• NEC V8xx
• M32 R/D
• RAD6000
• ST 20
• TriCore
Wind microkernel
• Task management
– multitasking, unlimited number of tasks
– preemptive scheduling and round-robin scheduling (static scheduling)
– fast, deterministic context switch
– 256 priority levels
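A minimal sketch of how preemptive priority scheduling and round-robin can coexist over 256 levels: one FIFO per priority, with time slicing applied only within a level. The class and task names are hypothetical, and treating 0 as the highest priority is an assumption of this model.

```python
from collections import deque

NUM_PRIORITIES = 256                       # priority levels, as stated on the slide

class ReadyQueues:
    """One FIFO per priority level; 0 is treated as the highest priority (assumption)."""
    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]

    def make_ready(self, task, priority):
        self.queues[priority].append(task)

    def pick_next(self):
        # Preemptive priority scheduling: always take the highest non-empty level.
        for priority, queue in enumerate(self.queues):
            if queue:
                return queue.popleft(), priority
        return None

    def time_slice_expired(self, task, priority):
        # Round-robin: the task goes to the back of its own priority level only.
        self.queues[priority].append(task)

rq = ReadyQueues()
rq.make_ready("logger", 200)
rq.make_ready("motor_a", 10)
rq.make_ready("motor_b", 10)

order = []
for _ in range(4):
    task, priority = rq.pick_next()
    order.append(task)
    rq.time_slice_expired(task, priority)
```

The two equal-priority tasks alternate, while the lower-priority task never runs as long as either of them is ready.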
Wind microkernel
• Fast, flexible inter-task communication
– binary, counting and mutual exclusion semaphores
with priority inheritance
– message queue
– POSIX pipes, counting semaphores, message
queues, signals and scheduling
– control sockets
– shared memory
Wind microkernel
• High scalability
• Incremental linking and loading of components
• Fast, efficient interrupt and exception handling
• Optimized floating-point support
• Dynamic memory management
• System clock and timing facilities
“Board Support Package”
• BSP = Initializing code for hardware device + device driver for peripherals
• BSP Developer’s Kit
[Figure: hardware-independent code, processor-dependent code and device-dependent code, with the BSP marked]
VxMP
• A closely coupled multiprocessor support accessory for VxWorks.
• Capabilities:
– Support up to 20 CPUs
– Binary and counting semaphores
– FIFO message queues
– Shared memory pools and partitions
– VxMP data structure is located in a shared memory area accessible to all CPUs
– Name service (translate symbol name to object ID)
– User-configurable shared memory pool size
– Supports a heterogeneous mix of CPUs
VxMP
• Hardware requirements:
– Shared memory
– Indivisible hardware read-modify-write mechanism across the shared memory bus
– CPU interrupt capability for best performance
– Supported architectures:
• 680x0 and 683xx
• SPARC
• SPARClite
• PPC6xx
• MIPS
• i960
VxFusion
• VxWorks accessory for loosely coupled configurations and standard IP networking
• An extension of the VxWorks message queue: the distributed message queue
• Features:
– Media-independent design
– Group multicast/unicast messaging
– Fault-tolerant, locale-transparent operations
– Heterogeneous environment
• Supported targets:
– Motorola: 68K, CPU32, PowerPC
– Intel x86, Pentium, Pentium Pro
[Figure: App1 and App2 on top of VxFusion, the Adapter Layer and the Transport]
pSOS
[Figure: pSOS 2.5 components: Loader, Debug, C/C++, File System, I/O system, BSPs, Memory Management and POSIX Library, on top of the pSOS+ Kernel]
Supported processors
• PowerPC
• 68K
• ColdFire
• MIPS
• ARM and Strong ARM
• X86 and Pentium
• i960
• SH
• M32/R
• m.core
• NEC v8xx
• ST20
• SPARClite
pSOS+ kernel
• Small Real Time multi-tasking kernel;
• Preemptive scheduling;
• Support memory region for different tasks;
• Mutex semaphores and condition variables (priority ceiling)
• No interrupt handling is included
Board Support Package
• BSP = skeleton device driver code + code for the low-level system functions each particular device requires
pSOS+m kernel
• Tightly coupled or distributed processors;
• pSOS API + communication and coordination
functions;
• Fully heterogeneous;
• Connection can be any one of shared memory,
serial or parallel links, Ethernet
implementations;
• Dynamic create/modify/delete OS object;
• Completely device independent
eCos
[Figure: eCos architecture, top to bottom]
• ISO C Library, Native Kernel C API, ITRON 3.0 API
• Internal Kernel API
• Device Drivers
• Kernel: pluggable schedulers, mem alloc, synchronization, timers, interrupts, threads
• HAL
Supported processors
• Advanced RISC Machines ARM7
• Fujitsu SPARClite
• Matsushita MN10300
• Motorola PowerPC
• Toshiba TX39
• Hitachi SH3
• NEC VR4300
• MB8683x series
• Intel strong ARM
Kernel
• No definition of task; supports multiple threads
• Interrupt and exception handling
• Preemptive scheduling: time-slice scheduler, multi-level queue scheduler, bitmap scheduler and priority inheritance scheduling
• Counters and clocks
• Mutexes, semaphores, condition variables, message boxes
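The idea behind a bitmap scheduler can be sketched as follows: the set of ready priorities is one integer, and picking the next thread is a find-lowest-set-bit operation. This is an illustrative model, not eCos source; the class name, the 32 levels and the convention that a lower number means higher priority are assumptions.

```python
class BitmapScheduler:
    """Ready set kept as one integer; bit n set means priority n is ready."""
    def __init__(self, levels=32):
        self.levels = levels
        self.ready = 0

    def set_ready(self, priority):
        if not 0 <= priority < self.levels:
            raise ValueError("priority out of range")
        self.ready |= 1 << priority

    def set_blocked(self, priority):
        self.ready &= ~(1 << priority)

    def highest_ready(self):
        if self.ready == 0:
            return None
        # Isolate the lowest set bit; lower number = higher priority (assumption).
        lowest = self.ready & -self.ready
        return lowest.bit_length() - 1

sched = BitmapScheduler()
for p in (7, 3, 20):
    sched.set_ready(p)
first = sched.highest_ready()
sched.set_blocked(3)
second = sched.highest_ready()
```

The selection cost does not depend on how many threads are ready, which is what makes this style of scheduler attractive for deterministic kernels.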
Hardware Abstraction Layer
• Architecture HAL abstracts the basic CPU, including:
– interrupt delivery
– context switching
– CPU startup, etc.
• Platform HAL abstracts the current platform, including:
– platform startup
– timer devices
– I/O register access
– interrupt control
• Implementation HAL abstracts properties that lie between the above:
– architecture variants
– on-chip devices
• The boundaries among them are blurred.
Summary on RTOS
                            VxWorks                 pSOS            eCos
Task                        Y                       Y               Only Thread
Scheduler                   Preemptive, static      Preemptive      Preemptive
Synchronization mechanism   No condition variable   Y               Y
POSIX support               Y                       Y               Linux
Scalable                    Y                       Y               Y
Custom hw support           BSP                     BSP             HAL, I/O package
Kernel size                 -                       16KB            -
Multiprocessor support      VxMP / VxFusion         pSOS+m kernel   None
                            (accessories)
Recall the “Board Support Package”
• BSP = Initializing code for hardware device + device driver for peripherals
• BSP Developer’s Kit
[Figure: hardware-independent code, processor-dependent code and device-dependent code, with the BSP marked]
Introduction to Device Drivers
• What are device drivers?
– Make the attached device work.
– Insulate the complexities involved in I/O handling.

[Figure: layering, top to bottom: Application, RTOS, Device driver, Hardware]
Proliferation of Interfaces
• New Connections
– USB
– 1394
– IrDA
– Wireless
• New Models
– JetSend
– Jini
– HTTP / HTML / XML / ???
– Distributed Objects (DCOM, CORBA)
Leads to Proliferation of Device Drivers
Device Driver Characterization
• Device Drivers’ Functionalities
– initialization
– data access
– data assignment
– interrupt handling
Device Characterization
• Block devices
– devices with fixed data block sizes
• Character devices
– byte-stream devices
• Network device
– manage local area network and wide area network interconnections
I/O Processing Characteristics
• Initialization
– make itself known to the kernel
– initialize the interrupt handling
– optional: allocate the temporary memory for device
driver
– initialize the hardware device
• Front-End Processing
– initiation of an I/O request
• Back-End Processing
– handles the completion of I/O operations
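The three phases above can be sketched with a toy driver and a simulated device. `FakeDevice`, `Driver` and the request names are hypothetical; a real driver would register with the kernel and be invoked from a hardware interrupt rather than from `step()`.

```python
from collections import deque

class FakeDevice:
    """Stand-in for hardware: completes one transfer each time step() is called."""
    def __init__(self):
        self.busy_with = None
        self.irq_handler = None

    def start(self, request):
        self.busy_with = request

    def step(self):
        if self.busy_with is not None:
            done, self.busy_with = self.busy_with, None
            self.irq_handler(done)           # raise the completion interrupt

class Driver:
    def __init__(self, device):
        # Initialization: hook the interrupt, allocate buffers, reset the device.
        self.device = device
        self.pending = deque()
        self.completed = []
        device.irq_handler = self.back_end

    def front_end(self, request):
        # Front-end: accept an I/O request and start it if the device is idle.
        self.pending.append(request)
        self._kick()

    def back_end(self, finished):
        # Back-end: runs on the completion interrupt, then starts the next request.
        self.completed.append(finished)
        self._kick()

    def _kick(self):
        if self.device.busy_with is None and self.pending:
            self.device.start(self.pending.popleft())

dev = FakeDevice()
drv = Driver(dev)
for name in ("read A", "read B", "write C"):
    drv.front_end(name)
while dev.busy_with is not None:
    dev.step()
```

Requests queued while the device is busy are started from the back-end path, so the caller of `front_end` never waits on the hardware.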
Commercial Resources
• Aisys DriveWay 3DE
– Motorola MPC860, MC68360, MC68302, AMD E86, Philips
XA, 8C651, PIC 16/17
• Stenkil MakeApp
– Hitachi H8, SH1, SH3, SH7x, HCAN
• Intel’s ApBuilder
• Motorola MCUnit
• GO DSP Code Composer
– TI DSPs
• CoWare
Aisys 3DE DriveWay Features
• Extensive documentation: K.B. help along the way, as detailed as a chip manual: traffic.ext, traffic.dwp
• CNFG for configuring the chip, such as memory and clock; gives warnings if necessary
• Can generate test functions
• Can insert user code
• One file for each peripheral
DriveWay Design Methodology
[Figure: design flow linking the GUI, user data, the .DWP file, the code “generator” (little generation, more manipulation), the chip-specific .DLL K.B. (manipulation of K.B. database) and the output files]
K.B. Database
• A specific K.B. per chip family
• Family of chips
– chip
• peripherals
– functional objects (timer, PWM counter)
» functions
» physicals (register setting, values, clock rate)
» actual code
DriveWay Builder
• Add chip
• Add peripheral
• Create skeleton, link to other things such as the GUI
• Code reuse when adding a new chip to an existing family, e.g., use code for the MPC 860 for the MPC 821
• Easy to create the infrastructure, but the specifics have to be written
About the code generator (1)
• Cut and paste K.B. database
• Areas where we can use automation for
device driver generation:
– model user specification
– extract useful information for drivers from HDL
description of the chip
• MAP registers
• interrupt
About the code generator (2)
• Why is Aisys not using automation?
– Commercial efficiency
• e.g., it is easier to capture the user specification from the GUI than from a model such as UML or a state machine
– HDL code is too low level, making information hard to extract
CoWare Interface Synthesis™
• System suggests hardware/software interface protocols
– Handshaking, memory mapped I/O, interrupt scheme, DMA…
• Designer selects communication protocols & memory
• System synthesizes efficient device drivers and glue logic
[Figure: hardware/software boundary, with Glue Logic on the hardware side and the Device Driver on the software side]
Interface Synthesis Example: Memory Mapped I/O
[Figure: a port between SW and HW is refined into a device driver (SW) and glue logic (HW). The abstract write "Port = value;" is compiled on the processor into the device-driver store "*FFA3 = value;", and the glue logic responds to memory address FFA3.]
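The refinement in this example can be mimicked with a toy address space: the driver turns the abstract port write into a store to address FFA3, and the address decoder (standing in for the glue logic) routes that store to the device. The `Bus` class and its methods are hypothetical; only the address comes from the slide.

```python
class Bus:
    """Toy address space: RAM everywhere except addresses claimed by devices."""
    def __init__(self):
        self.ram = {}
        self.devices = {}                    # address -> callable(value)

    def map_device(self, address, on_write):
        self.devices[address] = on_write

    def write(self, address, value):
        # The "glue logic": decode the address and route the store.
        if address in self.devices:
            self.devices[address](value)
        else:
            self.ram[address] = value

PORT_ADDR = 0xFFA3                           # address shown on the slide

received = []                                # values seen by the simulated device
bus = Bus()
bus.map_device(PORT_ADDR, received.append)

def port_write(value):
    # Device driver: "Port = value;" becomes a plain store to the mapped address.
    bus.write(PORT_ADDR, value)

port_write(0x5A)
bus.write(0x1000, 7)                         # ordinary memory is unaffected
```

From the software side the device is indistinguishable from memory, which is why the synthesized driver can be a single store instruction.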
SW: Embedded Software Tools
[Figure: embedded software tools (application source code, compiler, application software / a.out, RTOS, simulator, debugger, CPU, ROM, RAM, ASICs), repeated as a section marker]
ASIC Value Proposition
[Figure: chip floorplan with S/P, RAM, DMA, µC, ASIC LOGIC and DSP CORE blocks]
• 20% area decrease in ASIC portion
• 25% higher performance
• move to higher level - HDL description at RTL
The Importance of Code Size
Killian - Tensilica
[Figure: “Area vs. Program Instructions”. Y axis: Processor + Code RAM area (mm²), 0.0 to 5.0. X axis: Program Size (Instructions), 0 to 8000. Curves: Xtensa, MIPS-4Kc, ARC, ARM9, ARM9-Thumb.]
• Based on base 0.18µm implementation plus code RAM or cache
• Xtensa code ~10% smaller than ARM9 Thumb, ~50% smaller than MIPS-Jade, ARM9 and ARC
• ARM9-Thumb has reduced performance
• RAM/cache density = 8KB/mm²
SW Compiler Value Proposition
• 20% area decrease in RAM portion
• 25% higher performance
• move to higher level - C rather than assembler
[Figure: chip floorplan with S/P, RAM, DMA, µC, ASIC LOGIC and DSP CORE blocks, annotated “20% area decrease over ASIC portion”]
Memory? StrongARM Processor
[Figure: Compaq/Digital StrongARM]
Compiler Support
• BUT, few companies focused on compiler support for embedded systems:
– Cygnus => RedHat
– Tartan => TI
– Green Hills
• Why?
• Bad “buying behaviors”: few seats, low ASPs

Current Status on Compiler Support
Adequate compiler and debugger support in breadth and quality for embedded
microprocessors/microcontrollers
– ARM
– MIPS
– Power PC
– Mot family
• From
– Cygnus/RedHat
– Manufacturer
– Green Hills
• DSP’s still poorly supported
– Tartan acquired by Texas Instruments
– WHY????
• NO support for growing generation of special purpose processors:
– TMS320C80
– IXP1200
Recall: Architectural Features of DSPs
• Data path configured for DSP
– Fixed-point arithmetic
– MAC- Multiply-accumulate
• Multiple memory banks and buses -
– Harvard Architecture
– Multiple data memories
• Specialized addressing modes
– Bit-reversed addressing
– Circular buffers
• Specialized instruction set and execution control
– Zero-overhead loops
– Support for MAC
• Specialized peripherals for DSP
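Two of these features can be made concrete in a few lines: bit-reversed addressing (the index order used to reorder FFT data) and a FIR filter built on a circular buffer with one multiply-accumulate per tap. This is a plain-software sketch with hypothetical names; a DSP performs the index wrap-around and the MAC in hardware, with no loop overhead.

```python
def bit_reverse(index, bits):
    """Bit-reversed addressing: reverse the low `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

class CircularFir:
    """FIR filter using a circular delay line and a multiply-accumulate loop."""
    def __init__(self, coeffs):
        self.coeffs = coeffs
        self.delay = [0] * len(coeffs)
        self.head = 0                         # position of the newest sample

    def step(self, sample):
        n = len(self.delay)
        self.head = (self.head - 1) % n       # wrap-around instead of shifting data
        self.delay[self.head] = sample
        acc = 0
        for k, c in enumerate(self.coeffs):
            acc += c * self.delay[(self.head + k) % n]   # one MAC per tap
        return acc

order = [bit_reverse(i, 3) for i in range(8)]
fir = CircularFir([1, 2, 3])
impulse_response = [fir.step(x) for x in (1, 0, 0, 0)]
```

Feeding a unit impulse returns the coefficients in order, which is a quick check that the circular indexing lines up each tap with the right delayed sample.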
Example: IXP1200
[Figure: IXP1200 block diagram. An optional host CPU and PCI MAC devices attach over a 32-bit, 66 MHz PCI bus to the PCI Bus Unit. On chip: StrongARM core, Microengines 1 to 6, SDRAM Memory Unit (64-bit, SDRAM up to 256 MB), SRAM Memory Unit (32-bit, SRAM up to 8 MB, Boot ROM up to 8 MB, peripherals) and IX Bus Interface Unit. The 64-bit, 66 MHz FIFO bus connects to Ethernet MAC, ATM, T1/E1, or another IXP1200.]

IXP1200 Network Processor
• 6 micro-engines
– RISC engines
– 4 contexts/eng
– 24 threads total
• IX Bus Interface
– packet I/O
– connect IXPs
• scalable
• StrongARM
– less critical tasks
• Hash engine
– level 2 lookups
• PCI interface
[Figure: block diagram with SDRAM Ctrl, SRAM Ctrl, PCI Interface, six MicroEngs, Hash Engine, IX Bus Interface, Scratch Pad SRAM, and the SA Core with ICache, DCache and Mini DCache]
Summary
• Embedded software support for microcontrollers and microprocessors is
broadly available and of adequate quality
– RTOS
– Device drivers
– Compilers
– Debuggers
• Embedded software support for DSP processors is inadequate:
– Patchy support – many parts lack support
– Quality poor – lags hand coding by 20-100%
• Embedded software support for special purpose processors often non-
existent
• Still in a “build the hardware, then write the software” world
• Alternatives?
ASIP/Extensible micro DESIGN FLOW
[Figure: design flow. Blocks: APPLICATION CODE (APPLICATION_1, APPLICATION_2, ... APPLICATION_7), µARCHITECTURE DESIGNER, INSTRUCTION SET, RETARGETABLE COMPILER, OBJECT CODE, SIMULATION MODEL, PERFORMANCE ANALYSIS]
Tensilica TIE Overview
Killian - Tensilica
[Figure: the designer configures the base uP and describes new instructions in TIE; the Processor Generator produces processor RTL (Verilog) for the ASIC flow, and the Software Generator produces the software tools used to compile the application]
Tensilica TIE Design Cycle
Killian - Tensilica
[Figure: design-cycle flowchart with the following boxes and Y/N decision points]
• Develop application in C/C++
• Run cycle-accurate ISS
• Profile and analyze (Acceptable?)
• Identify potential new instructions
• Describe new instructions
• Generate new software tools
• Compile and run application (Correct?)
• Measure hardware impact (Acceptable?)
• Build the entire processor
Conclusions
• Full embedded software support will be a requirement for future embedded system “platforms”
• Companies evolving hardware and software together will have a significant competitive advantage
• A few examples are beginning to emerge: Tensilica, ST Microelectronics