Unit 1-ERTS

This document provides an introduction to embedded systems and real-time systems. It discusses key characteristics of embedded systems like sophisticated functionality, real-time operation, low cost and power. Common examples are given. Object-oriented design principles and techniques for modeling systems like state machines, classes, relationships between objects are also covered at a high level.

EC2042

EMBEDDED AND REAL TIME SYSTEMS

04/28/2021 1
UNIT I INTRODUCTION TO EMBEDDED COMPUTING
 Complex systems and microprocessors – Design example: Model train controller – Embedded system design process – Formalism for system design – Instruction sets preliminaries – ARM Processor – CPU: Programming input and output – Supervisor mode, exception and traps – Coprocessor – Memory system mechanism – CPU performance – CPU power consumption.

DEFINITION
 Embedded computing system: any device that
includes a programmable computer but is not itself
a general-purpose computer.
 Take advantage of application characteristics to
optimize the design
EMBEDDING A COMPUTER

[Figure: an embedded computer (CPU plus memory, "mem") with analog input and analog output connections to the surrounding system.]
EXAMPLES
 Cell phone.
 Printer.
 Automobile: engine, brakes, dash, etc.
 Airplane: engine, flight controls, nav/comm.
 Digital television.
 Household appliances.

CHARACTERISTICS OF EMBEDDED
SYSTEMS
 Sophisticated functionality.
 Real-time operation.
 Low manufacturing cost.
 Low power.
 Designed to tight deadlines by small teams.

FUNCTIONAL COMPLEXITY
 Often have to run sophisticated algorithms or multiple
algorithms.
 Cell phone, laser printer.
 Often provide sophisticated user interfaces.
REAL-TIME OPERATION
 Must finish operations by deadlines.
 Hard real time: missing deadline causes failure.
 Soft real time: missing deadline results in degraded
performance.
Many systems are multi-rate: must handle operations
at widely varying rates.
NON-FUNCTIONAL REQUIREMENTS
 Many embedded systems are mass-market items that
must have low manufacturing costs.
 Limited memory, microprocessor power, etc.
 Power consumption is critical in battery-powered
devices.
 Excessive power consumption increases system cost even
in wall-powered devices.
WHAT DOES
“PERFORMANCE” MEAN?
 In general-purpose computing, performance often
means average-case, may not be well-defined.
 In real-time systems, performance means meeting
deadlines.
 Missing the deadline by even a little is bad.
 Finishing ahead of the deadline may not help.

CHARACTERIZING
PERFORMANCE
 We need to analyze the system at several levels of
abstraction to understand performance:
 CPU.
 Platform.
 Program.
 Task.
 Multiprocessor.

DESIGN GOALS
 Performance.
 Overall speed, deadlines.
 Functionality and user interface.
 Manufacturing cost.
 Power consumption.
 Other requirements (physical size, etc.)

LEVELS OF ABSTRACTION
requirements
specification
architecture
component design
system integration
TOP-DOWN VS. BOTTOM-UP
 Top-down design:
 start from most abstract description;
 work to most detailed.
 Bottom-up design:
 work from small components to big system.
 Real design uses both techniques.

STEPWISE REFINEMENT
 At each level of abstraction, we must:
 analyze the design to determine characteristics of the
current state of the design;
 refine the design to add detail.

REQUIREMENTS

 Plain language description of what the user wants and
expects to get.
 May be developed in several ways:
 talking directly to customers;
 talking to marketing representatives;
 providing prototypes to users for comment.
FUNCTIONAL VS. NON-
FUNCTIONAL REQUIREMENTS
 Functional requirements:
 output as a function of input.
 Non-functional requirements:
 time required to compute output;
 size, weight, etc.;
 power consumption;
 reliability;
 etc.

OBJECT-ORIENTED DESIGN
 Object-oriented (OO) design: A generalization of
object-oriented programming.
 Object = state + methods.
 State provides each object with its own identity.
 Methods provide an abstract interface to the object.

OBJECTS AND CLASSES


 Class: object type.
 Class defines the object’s state elements but state
values may change over time.
 Class defines the methods used to interact with all
objects of that type.
 Each object has its own state.
OO DESIGN PRINCIPLES
 Some objects will closely correspond to real-world
objects.
 Some objects may be useful only for description or
implementation.
 Objects provide interfaces to read/write state,
hiding the object’s implementation from the rest of
the system.

UML OBJECT
d1: Display (object name: class name)
attributes:
pixels: array[] of pixels (comment: pixels is a 2-D array)
elements
menu_items
UML CLASS

Display (class name)
attributes: pixels, elements, menu_items
operations: mouse_click(), draw_box()
THE CLASS INTERFACE
 The operations provide the abstract interface
between the class’s implementation and other
classes.
 Operations may have arguments, return values.
 An operation can examine and/or modify the
object’s state.

RELATIONSHIPS BETWEEN
OBJECTS AND CLASSES
 Association: objects communicate but one does
not own the other.
 Aggregation: a complex object is made of several
smaller objects.
 Composition: aggregation in which owner does not
allow access to its components.
 Generalization: define one class in terms of
another.

CLASS DERIVATION
 May want to define one class in terms of another.
 Derived class inherits attributes, operations of base class.

[Figure: UML generalization arrow drawn from Derived_class to Base_class.]
CLASS DERIVATION EXAMPLE
Display (base class)
attributes: pixels, elements, menu_items
operations: pixel(), set_pixel(), mouse_click(), draw_box()
BW_display, Color_map_display (derived classes)
MULTIPLE INHERITANCE
Base classes: Speaker, Display.
Derived class: Multimedia_display (inherits from both base classes).
LINKS AND ASSOCIATIONS
 Link: describes relationships between objects.
 Association: describes relationship between
classes.

LINK EXAMPLE
 Link defines the contains relationship:
message set: count = 2
message: msg = msg1, length = 1102
message: msg = msg2, length = 2114
ASSOCIATION EXAMPLE
message (msg: ADPCM_stream, length: integer)
message set (count: integer)
Association "contains": each message set contains 0..* messages (# contained messages); each message belongs to 1 message set (# containing message sets).

STATE MACHINES
[Figure: two states, each labeled with its state name (a and b), joined by a transition arrow from a to b.]
EVENT-DRIVEN STATE
MACHINES
 Behavioral descriptions are written as event-driven
state machines.
 Machine changes state when receiving an input.
 An event may come from inside or outside of the
system.
TYPES OF EVENTS
 Signal: asynchronous event.
 Call: synchronized communication.
 Timer: activated by time.

SIGNAL EVENT

Declaration: <<signal>> mouse_click, with attributes leftorright: button and x, y: position.
Event description: transition from state a to state b labeled mouse_click(x,y,button).
CALL EVENT

[Figure: transition from state c to state d labeled draw_box(10,5,3,2,blue).]

TIMER EVENT
[Figure: transition from state e to state f labeled tm(time-value).]
EXAMPLE STATE MACHINE
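Transitions are labeled input/output:
start: mouse_click(x,y,button) / find_region(region), go to region found
region found: region = menu / which_menu(i), go to got menu item
got menu item: call_menu(i), go to called menu item, then finish
region found: region = drawing / find_object(objid), go to found object
found object: highlight(objid), go to object highlighted, then finish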
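The state machine on this slide can be sketched as a transition table. This is only an illustration: the state names with underscores, the `"next"` pseudo-event used for transitions that have no input, and the `run` helper are assumptions made for the sketch, not part of the slides.

```python
# (state, event) -> (output action, next state); "next" is a hypothetical
# placeholder event for transitions that fire without an external input.
TABLE = {
    ("start", "mouse_click"): ("find_region", "region_found"),
    ("region_found", "menu"): ("which_menu", "got_menu_item"),
    ("region_found", "drawing"): ("find_object", "found_object"),
    ("got_menu_item", "next"): ("call_menu", "called_menu_item"),
    ("found_object", "next"): ("highlight", "object_highlighted"),
    ("called_menu_item", "next"): (None, "finish"),
    ("object_highlighted", "next"): (None, "finish"),
}


def run(events, state="start"):
    """Feed events to the machine; return the final state and the actions taken."""
    actions = []
    for event in events:
        action, state = TABLE[(state, event)]
        if action is not None:
            actions.append(action)
    return state, actions
```

For example, `run(["mouse_click", "menu", "next", "next"])` walks the menu branch from start to finish.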

SEQUENCE DIAGRAM
EXAMPLE
Objects: m: Mouse, d1: Display, u: Menu (time runs downward).
1. m to d1: mouse_click(x,y,button)
2. d1 to u: which_menu(x,y,i)
3. d1 to u: call_menu(i)
INTRODUCTION
 Example: model train controller.

PURPOSES OF EXAMPLE
 Follow a design through several levels of
abstraction.
 Gain experience with UML.

MODEL TRAIN SETUP

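[Figure: a console and power supply drive the track; each train carries a receiver (rcvr) and a motor. A message on the track consists of header, address, command, and ECC fields.]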
REQUIREMENTS
 Console can control 8 trains on 1 track.
 Throttle has at least 63 levels.
 Inertia control adjusts responsiveness with at least
8 levels.
 Emergency stop button.
 Error detection scheme on messages.

REQUIREMENTS FORM
name                  model train controller
purpose               control speed of <= 8 model trains
inputs                throttle, inertia, emergency stop, train #
outputs               train control signals
functions             set engine speed w. inertia; emergency stop
performance           can update train speed at least 10 times/sec
manufacturing cost    $50
power                 wall powered
physical size/weight  console comfortable for 2 hands; < 2 lbs.
DIGITAL COMMAND
CONTROL
 DCC created by model railroad hobbyists, picked
up by industry.
 Defines way in which model trains, controllers
communicate.
 Leaves many system design aspects open, allowing
competition.
 This is a simple example of a big trend:
 Cell phones, digital TV rely on standards.

DCC ELECTRICAL STANDARD
 Voltage moves around the power supply voltage; adds no DC component.
 1 is 58 µs, 0 is at least 100 µs.
[Figure: track voltage versus time, showing a logic 1 pulse (58 µs) followed by a logic 0 pulse (>= 100 µs).]
DCC COMMUNICATION
STANDARD
 Basic packet format: PSA(sD)+E.
 P: preamble = 1111111111.
 S: packet start bit = 0.
 A: address data byte.
 s: data byte start bit.
 D: data byte (data payload).
 E: packet end bit = 1.

DCC PACKET TYPES
 Baseline packet: minimum packet that must be
accepted by all DCC implementations.
 Address data byte gives receiver address.
 Instruction data byte gives basic instruction.
 Error correction data byte gives ECC.
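As a rough sketch, the PSA(sD)+E format and the three-byte baseline packet can be assembled as a bit string. Several details here are assumptions not stated on the slides: the data byte start bit s is taken to be 0, bytes are written most significant bit first, and the error byte is computed as the XOR of the address and instruction bytes. The function names are hypothetical.

```python
PREAMBLE = "1" * 10  # P: preamble, as given on the slide


def byte_bits(value):
    """Return one byte as 8 characters, most significant bit first (assumed order)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("byte out of range")
    return format(value, "08b")


def dcc_packet(address, data_bytes):
    """Build P S A (s D)+ E as a string of '0'/'1' characters."""
    if not data_bytes:
        raise ValueError("at least one data byte is required")
    bits = PREAMBLE + "0" + byte_bits(address)  # S = 0, then address byte A
    for data in data_bytes:
        bits += "0" + byte_bits(data)  # s assumed to be 0, then data byte D
    return bits + "1"  # E: packet end bit


def baseline_packet(address, instruction):
    """Address, instruction, and an XOR check byte (assumed error scheme)."""
    error = address ^ instruction
    return dcc_packet(address, [instruction, error])
```

A baseline packet built this way is 38 bits long: 10 preamble bits, the start bit, the address byte, two data bytes with their start bits, and the end bit.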

BASIC SYSTEM COMMANDS

command name   parameters
set-speed      speed (positive/negative)
set-inertia    inertia-value (non-negative)
estop          none
TYPICAL CONTROL
SEQUENCE

Messages sent from :console to :train_rcvr, in order: set-inertia, set-speed, set-speed, estop, set-speed.
MESSAGE CLASSES

command (base class)
set-speed: value: integer
set-inertia: value: unsigned-integer
estop: no attributes
ROLES OF MESSAGE CLASSES
 Implemented message classes derived from
message class.
 Attributes and operations will be filled in for detailed
specification.
 Implemented message classes specify message
type by their class.
 May have to add type as parameter to data structure in
implementation.

SUBSYSTEM
COLLABORATION DIAGRAM
Shows relationship between console and receiver (ignores
role of track):

:console sends messages 1..n: command to :receiver.
MAJOR SUBSYSTEM ROLES
 Console:
 read state of front panel;
 format messages;
 transmit messages.
 Train:
 receive message;
 interpret message;
 control the train.

CONSOLE SYSTEM CLASSES
[Figure: class diagram. console contains 1 panel, 1 formatter, and 1 transmitter; the physical-object classes receiver* and sender* attach below with 1-to-1 links.]
CONSOLE CLASS ROLES
 panel: describes analog knobs and interface
hardware.
 formatter: turns knob settings into bit streams.
 transmitter: sends data on track.

TRAIN SYSTEM CLASSES
[Figure: class diagram. train set contains 1..t trains; each train contains 1 receiver, 1 controller, and 1 motor interface; the physical-object classes detector* and pulser* attach below with 1-to-1 links.]
TRAIN CLASS ROLES
 receiver: digitizes signal from track.
 controller: interprets received commands and
makes control decisions.
 motor interface: generates signals required by
motor.

DETAILED SPECIFICATION
 We can now fill in the details of the conceptual
specification:
 more classes;
 behaviors.
 Sketching out the spec first helps us understand the
basic relationships in the system.

TRAIN SPEED CONTROL
 Motor controlled by pulse width modulation:

[Figure: pulse-width-modulated waveform of voltage V versus time, with the duty cycle marked.]
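A small sketch of why pulse width controls speed: with an ideal square wave, the average voltage seen by the motor is the supply voltage scaled by the duty cycle. The function name and the ideal-waveform assumption are illustrative and not taken from the slides.

```python
def average_voltage(v_supply, t_on, period):
    """Average output of an ideal PWM waveform (assumes instant edges, no losses)."""
    if period <= 0 or not 0 <= t_on <= period:
        raise ValueError("need 0 <= t_on <= period and period > 0")
    duty_cycle = t_on / period  # fraction of each period the pulse is high
    return duty_cycle * v_supply
```

For instance, a 12 V supply switched on for one quarter of each period averages 3 V.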

CONSOLE PHYSICAL OBJECT
CLASSES
knobs*: train-knob: integer; speed-knob: integer; inertia-knob: unsigned-integer; emergency-stop: boolean
pulser*: pulse-width: unsigned-integer; direction: boolean
sender*: send-bit()
detector*: read-bit() : integer
PANEL AND MOTOR
INTERFACE CLASSES

panel: train-number() : integer; speed() : integer; inertia() : integer; estop() : boolean; new-settings()
motor-interface: speed: integer
TRANSMITTER AND
RECEIVER CLASSES

transmitter: send-speed(adrs: integer, speed: integer); send-inertia(adrs: integer, val: integer); set-estop(adrs: integer)
receiver: current: command; new: boolean; read-cmd(); new-cmd() : boolean; rcv-type(msg-type: command); rcv-speed(val: integer); rcv-inertia(val: integer)
CLASS DESCRIPTIONS
 transmitter class has one behavior for each type of
message sent.
 receiver class provides methods to:
 detect a new message;
 determine its type;
 read its parameters (estop has no parameters).

FORMATTER CLASS

formatter

current-train: integer
current-speed[ntrains]: integer
current-inertia[ntrains]: unsigned-integer
current-estop[ntrains]: boolean

send-command()
panel-active() : boolean
operate()

FORMATTER CLASS
DESCRIPTION
 Formatter class holds state for each train, setting
for current train.
 The operate() operation performs the basic
formatting task.

CONTROL INPUT CASES
 Use a soft panel to show current panel settings for
each train.
 Changing train number:
 must change soft panel settings to reflect current train’s
speed, etc.
 Controlling throttle/inertia/estop:
 read panel, check for changes, perform command.

CONTROL INPUT SEQUENCE
DIAGRAM
[Figure: sequence diagram among :knobs, :panel, :formatter, and :transmitter. On a change in speed/inertia/estop, panel-active reads the panel, the panel settings are returned, and send-command issues send-speed, send-inertia, or send-estop to the transmitter. On a change in train number, the panel is read, the panel settings are returned, and new-settings invokes set-knobs.]
FORMATTER OPERATE
BEHAVIOR

[Figure: state machine. From idle, panel-active() is evaluated; a new train number leads to update-panel(), any other change leads to send-command(); both return to idle.]
PANEL-ACTIVE BEHAVIOR

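[Figure: flowchart. After panel*:read-train(), if the train knob changed (T): current-train = train-knob, update-screen, changed = true; otherwise (F) continue. After panel*:read-speed(), if the throttle changed (T): current-speed = throttle, changed = true. The remaining controls follow the same pattern.]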
CONTROLLER CLASS

controller

current-train: integer
current-speed[ntrains]: integer
current-direction[ntrains]: boolean
current-inertia[ntrains]: unsigned-integer

operate()
issue-command()

SEQUENCE DIAGRAM FOR
SET-SPEED COMMAND
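[Figure: sequence diagram among :receiver, :controller, :motor-interface, and :pulser*. The messages new-cmd, cmd-type, and rcv-speed pass between receiver and controller; the controller then issues set-speed to the motor interface, which sends a series of set-pulse messages to the pulser.]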
CONTROLLER OPERATE
BEHAVIOR
[Figure: state machine. Wait for a command from the receiver, then receive-command(), then issue-command(), and repeat.]
REFINED COMMAND CLASSES
command: type: 3-bits; address: 3-bits; parity: 1-bit
set-speed: type=010; value: 7-bits
set-inertia: type=001; value: 3-bits
estop: type=000
VON NEUMANN
ARCHITECTURE
 Memory holds data, instructions.
 Central processing unit (CPU) fetches instructions
from memory.
 Separate CPU and memory distinguishes
programmable computer.
 CPU registers help out: program counter (PC),
instruction register (IR), general-purpose registers,
etc.

CPU + MEMORY

[Figure: CPU connected to memory by address and data lines. The PC holds 200; memory location 200 holds ADD r5,r1,r3, which is fetched into the IR.]
HARVARD ARCHITECTURE

[Figure: CPU (with its PC) connected by separate address and data lines to a data memory and to a program memory.]
VON NEUMANN VS. HARVARD
 Harvard can’t use self-modifying code.
 Harvard allows two simultaneous memory fetches.
 Most DSPs use Harvard architecture for streaming
data:
 greater memory bandwidth;
 more predictable bandwidth.

RISC VS. CISC
 Complex instruction set computer (CISC):
 many addressing modes;
 many operations.
 Reduced instruction set computer (RISC):
 load/store;
 pipelinable instructions.

INSTRUCTION SET
CHARACTERISTICS
 Fixed vs. variable length.
 Addressing modes.
 Number of operands.
 Types of operands.

ARM - INTRODUCTION
 Advanced RISC Machines (now known as ARM) was established
as a joint venture between Acorn, Apple and VLSI in November
1990
 ARM is the industry's leading provider of 16/32-bit embedded
RISC microprocessor solutions
 Architectural simplicity allows very small implementations,
which result in very low power consumption
 The company licenses its high-performance, low-cost, power-
efficient RISC processors, peripherals, and system-chip designs to
leading international electronics companies
 ARM provides comprehensive support required in developing a
complete system

ARM - INTRODUCTION
Currently available processors:
 Arm 1
 This was the very first ARM processor and was superseded by the
ARM2 fairly quickly
 Arm 2
 The ARM2 chip, and processor cell, features 27 registers of which
16 are accessible at any one time. Four processor modes are
available:
 USR : user mode
 IRQ : interrupt mode (with a private copy of R13 and R14)
 FIQ : fast interrupt mode (private copies of R8 to R14)
 SVC : supervisor mode. (private copies of R13 and R14)

ARM - INTRODUCTION
 Arm 3
 This is an upgraded ARM2 cell, with a cache and dedicated coprocessor
interface added
 Arm 6
 This processor cell is the first of the commercially available ARMs to
have a full 32bit addressing capability
 Additionally the processor now has 31 registers in it along with six new
processor modes:
 User32 - 32 bit USR mode
 Supervisor32 - 32 bit SVC mode (private SPSR register)
 IRQ32 - 32 bit IRQ mode (private SPSR register)
 FIQ32 - 32 bit FIQ mode (private SPSR register)
 Abort32 - Memory fetch abort mode (private SPSR register)
 Undefined32 - Undefined instruction mode (private SPSR register)
ARM - INTRODUCTION
 Arm 7
 The ARM7 cell is functionally identical to the ARM6 cell in capabilities
but may be clocked faster than the ARM6
 A variant of the ARM7 cell offers an improved hardware multiply,
suitable for DSP work
 Arm 8
 Includes a five stage pipeline, a speculative instruction fetcher and
internal tweaks to the processor to allow a higher clock speed
 StrongARM
 This is the high speed variant of the ARM chip family
 Architecturally it is similar to the ARM8 core, sharing the five stage
pipeline with that processor
 A further difference is the change from a unified data and instruction cache
to split instruction and data caches (a Harvard architecture)
ARM - INTRODUCTION

 ARM9
 An incremental improvement over the ARM8, this chip features
the same five stage pipeline but is now a Harvard architecture
chip, like the StrongARM
 ARM 10
 300 MHz
 400 MIPS
 600 mW
ARM ASSEMBLY LANGUAGE
EXAMPLE
label1  ADR r4,c
        LDR r0,[r4]   ; a comment
        ADR r4,d
        LDR r1,[r4]
        SUB r0,r0,r1  ; comment
ARM INSTRUCTION SET
 ARM versions.
 ARM assembly language.
 ARM programming model.
 ARM memory organization.
 ARM data operations.
 ARM flow of control.

ARM REGISTERS (4)
System & User  FIQ       Supervisor  Abort     IRQ       Undefined
R0             R0        R0          R0        R0        R0
R1             R1        R1          R1        R1        R1
R2             R2        R2          R2        R2        R2
R3             R3        R3          R3        R3        R3
R4             R4        R4          R4        R4        R4
R5             R5        R5          R5        R5        R5
R6             R6        R6          R6        R6        R6
R7             R7        R7          R7        R7        R7
R8             R8_fiq    R8          R8        R8        R8
R9             R9_fiq    R9          R9        R9        R9
R10            R10_fiq   R10         R10       R10       R10
R11            R11_fiq   R11         R11       R11       R11
R12            R12_fiq   R12         R12       R12       R12
R13            R13_fiq   R13_svc     R13_abt   R13_irq   R13_und
R14            R14_fiq   R14_svc     R14_abt   R14_irq   R14_und
R15 (PC)       R15 (PC)  R15 (PC)    R15 (PC)  R15 (PC)  R15 (PC)
CPSR           CPSR      CPSR        CPSR      CPSR      CPSR
               SPSR_fiq  SPSR_svc    SPSR_abt  SPSR_irq  SPSR_und
ARM PROGRAMMING MODEL

[Figure: general-purpose registers r0 through r15 (r15 is the PC), each 32 bits wide (bits 31 to 0), and the CPSR, whose top bits hold the N, Z, C, V flags.]
ENDIANNESS
 Relationship between bit and byte/word ordering
defines endianness:

little-endian: bit 31 down to bit 0 holds byte 3, byte 2, byte 1, byte 0 (byte 0 in the low-order bits).
big-endian: the same word holds byte 0, byte 1, byte 2, byte 3 (byte 0 in the high-order bits).
ARM DATA TYPES
 Word is 32 bits long.
 Word can be divided into four 8-bit bytes.
 ARM addresses can be 32 bits long.
 Address refers to byte.
 Address 4 starts at byte 4.
 Can be configured at power-up as either little- or
big-endian mode.
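To make the two byte orders concrete, the sketch below lays out one 32-bit word in memory both ways using Python's standard `struct` module. The example value and the helper name `word_bytes` are assumptions chosen for illustration.

```python
import struct


def word_bytes(word, little_endian):
    """Return the four bytes of a 32-bit word in increasing address order."""
    fmt = "<I" if little_endian else ">I"  # "<" little-endian, ">" big-endian
    return list(struct.pack(fmt, word))
```

With the word 0x0A0B0C0D, little-endian storage places 0x0D at the lowest address, while big-endian storage places 0x0A there.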

ARM STATUS BITS
 Arithmetic, logical, and shifting operations can set the
CPSR bits (comparison instructions always do; other data operations do so when the S suffix is used):
N (negative), Z (zero), C (carry), V (overflow).
 Examples:
 -1 + 1 = 0: NZCV = 0110.
 $2^{31}-1+1 = -2^{31}$: NZCV = 1001.

ARM INSTRUCTION SET
 Features:
 Load/Store architecture
 3-address data processing instructions
 Conditional execution
 Load/Store multiple registers
 Shift & ALU operation in single clock cycle

ARITHMETIC OPERATIONS

LOGICAL, SHIFT/ROTATE

COMPARISON, MOVE
INSTRUCTIONS

Load/Store Instructions

FLOW OF CONTROL
B – Branch
BL – Branch and Link
 Conditional execution:
 Each data processing instruction
prefixed by condition code
 Result – smooth flow of instructions through pipeline
 16 condition codes:
EQ  equal
NE  not equal
CS  unsigned higher or same
CC  unsigned lower
MI  negative
PL  positive or zero
VS  overflow
VC  no overflow
HI  unsigned higher
LS  unsigned lower or same
GE  signed greater than or equal
LT  signed less than
GT  signed greater than
LE  signed less than or equal
AL  always
NV  special purpose
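Each condition code is a test on the N, Z, C, V flags. The table below is a sketch of the standard ARM flag tests; the dictionary and the `condition_passed` helper are hypothetical names, and NV is left out because it is reserved for special purposes.

```python
# Condition mnemonic -> test over the flags (each flag is 0 or 1).
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "CS": lambda n, z, c, v: c == 1,
    "CC": lambda n, z, c, v: c == 0,
    "MI": lambda n, z, c, v: n == 1,
    "PL": lambda n, z, c, v: n == 0,
    "VS": lambda n, z, c, v: v == 1,
    "VC": lambda n, z, c, v: v == 0,
    "HI": lambda n, z, c, v: c == 1 and z == 0,
    "LS": lambda n, z, c, v: c == 0 or z == 1,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
    "AL": lambda n, z, c, v: True,
}


def condition_passed(cond, n, z, c, v):
    """Return True if an instruction with this condition would execute."""
    return CONDITIONS[cond](n, z, c, v)
```

After comparing 2 with 3, the flags are N=1, Z=0, C=0, V=0, so LT and CC pass while GE and HI do not.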

• Register Addressing
eg : ADD r0, r1, r2
• Immediate Addressing
eg : ADD r0, r1, #2
• Register-indirect Addressing
eg : with r1 = 0x100, LDR r0, [r1] loads r0 from address 0x100
• Variations
- LDR r0, [r1, -r2] loads r0 from the address given by r1 - r2
- LDR r0, [r1, #4] loads r0 from the address r1 + 4
ADDRESSING MODES IN ARM

• Absolute Addressing - an ARM instruction cannot hold a full 32-bit address, so the address is formed relative to the PC (r15): adding to or subtracting from the PC a constant equal to the distance between the current instruction and the desired location generates the desired address without performing a load.
ADDRESSING MODES IN ARM

• Base-plus-offset Addressing - related to indirect addressing; the register value is added to another value to form the address.

Eg : LDR r0, [r1, #16] loads r0 with the value stored at location r1 + 16. r1 is the base and the immediate value is the offset.

Variations : Auto-indexing updates the base register.
eg : LDR r0, [r1, #16]! adds 16 to the value of r1 and uses the new value as the address. The ! operator causes the base register (r1) to be updated.

Post-indexing : does not perform the offset calculation until the fetch has been performed.
eg : LDR r0, [r1], #16 loads r0 with the value stored at the memory location pointed to by r1, then adds 16 to r1 and writes the new value back to r1.
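The three base-plus-offset variants differ only in which address is used and whether the base register is written back. The toy model below illustrates that difference; the `ldr` function, the dictionary-based registers and memory, and the mode names are all assumptions made for the sketch.

```python
def ldr(memory, regs, rd, rn, offset=0, mode="offset"):
    """Model LDR rd,[rn,#offset] and its auto-indexed / post-indexed forms.

    Returns the address that was read.
    """
    base = regs[rn]
    if mode == "offset":      # LDR rd,[rn,#offset]   base unchanged
        address = base + offset
    elif mode == "pre":       # LDR rd,[rn,#offset]!  base updated, new value used
        address = base + offset
        regs[rn] = address
    elif mode == "post":      # LDR rd,[rn],#offset   old base used, then updated
        address = base
        regs[rn] = base + offset
    else:
        raise ValueError("unknown addressing mode")
    regs[rd] = memory[address]
    return address
```

Starting from r1 = 0x100, the offset and auto-indexed forms both read address 0x110, but only the auto-indexed form leaves r1 = 0x110; the post-indexed form reads 0x100 and then sets r1 = 0x110.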

C ASSIGNMENTS IN ARM
INSTRUCTIONS
x = (a+b) - c

The C statement y = a*(b+c) can be
coded as follows

IMPLEMENTING AN IF STATEMENT IN ARM
if (a < b) {
    x = 5;
    y = c + d;
}
else x = c - d;
FIR FILTER
IMPLEMENTATION IN ARM

The C code for the FIR filter, using a for loop:
for (i = 0, f = 0; i < N; i++)
    f = f + c[i] * x[i];

Using a while loop:
i = 0;
f = 0;
while (i < N) {
    f = f + c[i] * x[i];
    i++;
}
ARM CODE FOR FIR FILTER

I/O DEVICES
 Usually includes some non-digital component.
 Typical digital interface to CPU:

[Figure: the CPU connects to the device through a status register (status reg) and a data register (data reg), which front the device mechanism.]
PROGRAMMING I/O
 Two types of instructions can support I/O:
 special-purpose I/O instructions;
 memory-mapped load/store instructions.
 Intel x86 provides in, out instructions. Most
other CPUs use memory-mapped I/O.
 I/O instructions do not preclude memory-mapped
I/O.

INTERRUPT I/O
 Busy/wait is very inefficient.
 CPU can’t do other work while testing device.
 Hard to do simultaneous I/O.
 Interrupts allow a device to change the flow of
control in the CPU.
 Causes subroutine call to handle device.

INTERRUPT INTERFACE
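[Figure: the device's status register and data register connect to the CPU over the data/address lines; the device raises intr request and the CPU answers with intr ack. The CPU's PC and IR are shown.]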
PRIORITIES AND VECTORS
 Two mechanisms allow us to make interrupts more
specific:
 Priorities determine what interrupt gets CPU first.
 Vectors determine what code is called for each type of
interrupt.
 Mechanisms are orthogonal: most CPUs provide
both.

PRIORITIZED INTERRUPTS

[Figure: device 1, device 2, through device n each drive their own request line (L1, L2, .., Ln) into the CPU and share an interrupt acknowledge line.]
INTERRUPT PRIORITIZATION
 Masking: interrupt with priority lower than current
priority is not recognized until pending interrupt is
complete.
 Non-maskable interrupt (NMI): highest-priority,
never masked.
 Often used for power-down.
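The masking rule can be sketched as a selection function. The assumptions here are that a larger number means a higher priority and that pending requests arrive as a name-to-priority mapping; `next_interrupt` is a hypothetical helper, not an API from any particular CPU.

```python
def next_interrupt(pending, current_priority, nmi_pending=False):
    """Pick the interrupt to service next, or None if all pending ones are masked.

    pending: dict mapping device name -> priority (higher number = more urgent).
    """
    if nmi_pending:
        return "NMI"  # non-maskable: always recognized
    eligible = {name: prio for name, prio in pending.items()
                if prio > current_priority}  # lower or equal priority is masked
    if not eligible:
        return None
    return max(eligible, key=eligible.get)
```

With A at priority 3 and B at priority 2, A is chosen while the CPU runs at priority 0, and neither is recognized while the CPU is already at priority 3.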

EXAMPLE: PRIORITIZED I/O
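[Figure: sequence diagram with lifelines :interrupts, :foreground, :A, :B, :C; one of the labeled events is the simultaneous arrival of interrupts A,B.]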
INTERRUPT VECTOR
ACQUISITION
[Figure: sequence diagram between :CPU and :device. The CPU receives the request, the device receives the ack, and the CPU receives the vector.]
GENERIC INTERRUPT
MECHANISM
[Figure: flowchart, assuming priority selection is handled before this point.]
1. intr? If N, continue execution.
2. intr priority > current priority? If N, ignore.
3. If Y, ack.
4. vector? If Y, call table[vector]. If N, check timeout?; if Y, bus error.
INTERRUPT SEQUENCE
 CPU acknowledges request.
 Device sends vector.
 CPU calls handler.
 Software processes request.
 CPU restores state to foreground program.

ARM INTERRUPTS
 ARM7 supports two types of interrupts:
 Fast interrupt requests (FIQs).
 Interrupt requests (IRQs).
 Interrupt table starts at location 0.

ARM INTERRUPT PROCEDURE


 CPU actions:
 Save PC. Copy CPSR to SPSR.
 Force bits in CPSR to record interrupt.
 Force PC to vector.
 Handler responsibilities:
 Restore proper PC.
 Restore CPSR from SPSR.
 Clear interrupt disable flags.
SUPERVISOR MODE
 May want to provide protective barriers between
programs.
 Avoid memory corruption.
 Need supervisor mode to manage the various
programs.
 C55x does not have a supervisor mode.
ARM SUPERVISOR MODE
 Use SWI instruction to enter supervisor mode,
similar to subroutine:
SWI CODE_1
 Sets PC to 0x08.
 Argument to SWI is passed to supervisor mode
code.
 Saves CPSR in SPSR.
EXCEPTION
 Exception: internally detected error.
 Exceptions are synchronous with instructions but
unpredictable.
 Build exception mechanism on top of interrupt
mechanism.
 Exceptions are usually prioritized and vectorized.

TRAP
 Trap (software interrupt): an exception generated
by an instruction.
 Call supervisor mode.
 ARM uses SWI instruction for traps.
 SHARC offers three levels of software interrupts.
 Called by setting bits in IRPTL register.

CO-PROCESSOR
 Co-processor: added function unit that is called by
instruction.
 Floating-point units are often structured as co-
processors.
 ARM allows up to 16 designer-selected co-
processors.
 Floating-point co-processor uses units 1, 2.
 C55x uses co-processors as well.

CACHES AND CPUS

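[Figure: the CPU sends addresses to the cache controller, which sits between the cache and main memory; address and data lines connect the controller to both, and data returns to the CPU.]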
CACHE OPERATION
 Many main memory locations are mapped onto one
cache entry.
 May have caches for:
 instructions;
 data;
 data + instructions (unified).

 Memory access time is no longer deterministic.

Terms
 Cache hit: required location is in cache.
 Cache miss: required location is not in cache.
 Working set: set of locations used by program in a time
interval.

TYPES OF MISSES
 Compulsory (cold): location has never been
accessed.
 Capacity: working set is too large.
 Conflict: multiple locations in working set map to
same cache entry.

MEMORY SYSTEM PERFORMANCE


 h = cache hit rate.
 $t_{cache}$ = cache access time, $t_{main}$ = main memory
access time.
 Average memory access time:
 $t_{av} = h\,t_{cache} + (1-h)\,t_{main}$
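A quick numeric check of the average access time formula; the function name and the sample numbers are illustrative assumptions.

```python
def average_access_time(h, t_cache, t_main):
    """Average memory access time for a single-level cache with hit rate h."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("hit rate must be between 0 and 1")
    return h * t_cache + (1.0 - h) * t_main
```

With a 75% hit rate, a 2-cycle cache, and a 10-cycle main memory, the average is 0.75 * 2 + 0.25 * 10 = 4 cycles.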
MULTIPLE LEVELS OF CACHE

[Figure: the CPU is connected to the L1 cache, which is backed by the L2 cache.]
MULTI-LEVEL CACHE ACCESS
TIME
 $h_1$ = hit rate on L1.
 $h_2$ = fraction of accesses that hit in L1 or L2 (cumulative hit rate through L2).
 Average memory access time:
 $t_{av} = h_1 t_{L1} + (h_2 - h_1) t_{L2} + (1 - h_2) t_{main}$
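A sketch of the two-level calculation, under the assumption that h2 counts every access satisfied by either L1 or L2. With that reading, h2 - h1 is the fraction that misses L1 but hits L2, 1 - h2 is the fraction that goes to main memory, and the three weights sum to one. The function name and sample values are hypothetical.

```python
def two_level_access_time(h1, h2, t_l1, t_l2, t_main):
    """Average access time with L1 and L2 caches.

    h1: fraction of accesses that hit in L1.
    h2: fraction of accesses that hit in L1 or L2 (assumed cumulative, so h2 >= h1).
    """
    if not 0.0 <= h1 <= h2 <= 1.0:
        raise ValueError("need 0 <= h1 <= h2 <= 1")
    return h1 * t_l1 + (h2 - h1) * t_l2 + (1.0 - h2) * t_main
```

For h1 = 0.5, h2 = 0.75 and access times of 1, 4, and 20 cycles, the average is 0.5 + 1 + 5 = 6.5 cycles.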

 Replacement policy: strategy for choosing which
cache entry to throw out to make room for a new
memory location.
 Two popular strategies:
 Random.
 Least-recently used (LRU).
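As an illustration of the LRU policy, the sketch below models a tiny fully-associative cache that evicts the least-recently used entry when full. The class name and its `access` method are assumptions for the example; it tracks only which addresses are resident, not their data.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully-associative cache with least-recently-used replacement."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest (least recently used) first

    def access(self, address):
        """Return True on a hit; on a miss, install the address and return False."""
        if address in self.entries:
            self.entries.move_to_end(address)  # mark as most recently used
            return True
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used
        self.entries[address] = True
        return False
```

With two entries and the access sequence 1, 2, 1, 3, 2, only the third access hits: loading 3 evicts 2, so the final access to 2 misses again.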
CACHE ORGANIZATIONS
 Fully-associative: any memory location can be
stored anywhere in the cache (almost never
implemented).
 Direct-mapped: each memory location maps onto
exactly one cache entry.
 N-way set-associative: each memory location can
go into one of n entries, one in each of n direct-mapped banks (called sets here, more commonly ways).

CACHE PERFORMANCE
BENEFITS
 Keep frequently-accessed locations in fast cache.
 Cache retrieves more than one word at a time.
 Sequential accesses are faster after first access.

DIRECT-MAPPED CACHE

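[Figure: direct-mapped cache. The address is split into tag, index, and offset fields. The index selects a cache block holding a valid bit (1), a tag (0xabcd), and data bytes; comparing the stored tag with the address tag produces the hit signal, and the offset selects the byte returned as the value.]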
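The tag/index/offset split and the hit test can be sketched as follows. The field widths are parameters because the slides do not give them, and the dictionary-based cache and both function names are assumptions for illustration.

```python
def split_address(address, offset_bits, index_bits):
    """Split an address into (tag, index, offset) fields."""
    offset = address & ((1 << offset_bits) - 1)
    index = (address >> offset_bits) & ((1 << index_bits) - 1)
    tag = address >> (offset_bits + index_bits)
    return tag, index, offset


def lookup(cache, address, offset_bits, index_bits):
    """Return the cached byte on a hit, or None on a miss.

    cache: dict mapping index -> (valid, tag, block_bytes).
    """
    tag, index, offset = split_address(address, offset_bits, index_bits)
    valid, stored_tag, block = cache.get(index, (False, None, b""))
    if valid and stored_tag == tag:
        return block[offset]  # hit: offset selects the byte within the block
    return None               # miss: empty entry or tag mismatch
```

With 4 offset bits and 8 index bits, address 0xABCD1234 splits into tag 0xABCD1, index 0x23, and offset 0x4.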

WRITE OPERATIONS
 Write-through: immediately copy write to main
memory.
 Write-back: write to main memory only when
location is removed from cache.
SET-ASSOCIATIVE CACHE
 A set of direct-mapped caches:

[Figure: n direct-mapped banks, Set 1, Set 2, ... Set n, side by side; their outputs combine into the hit and data signals.]
SCRATCH PAD MEMORIES
 Alternative to cache:
 Software determines what is stored in scratch pad.
 Provides predictable behavior at the cost of
software control.
 C55x cache can be configured as scratch pad.

MEMORY MANAGEMENT
UNITS
 Memory management unit (MMU) translates addresses:

[Figure: the CPU issues a logical address to the memory management unit, which sends the corresponding physical address to main memory.]
MEMORY MANAGEMENT
TASKS
 Allows programs to move in physical memory
during execution.
 Allows virtual memory:
 memory images kept in secondary storage;
 images returned to main memory on demand during
execution.
 Page fault: request for location not resident in
memory.

ADDRESS TRANSLATION
 Requires some sort of register/table to allow
arbitrary mappings of logical to physical
addresses.
 Two basic schemes:
 segmented;
 paged.
 Segmentation and paging can be combined (x86).

SEGMENTS AND PAGES
[Figure: a memory map containing page 1, page 2, segment 1, and segment 2.]
SEGMENT ADDRESS
TRANSLATION

[Figure: the segment base address and the logical address are combined to form the physical address; a range check against the segment lower bound and segment upper bound signals a range error.]
PAGE ADDRESS TRANSLATION

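[Figure: the logical address is split into a page number and an offset; the page number selects the page i base, which is concatenated with the offset to form the physical address.]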
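Page translation by concatenation can be sketched in a few lines. The flat dictionary page table, the `translate` name, and the use of `LookupError` to stand in for a page fault are assumptions for illustration only.

```python
def translate(page_table, logical_address, offset_bits):
    """Translate a logical address using a flat page table.

    page_table: dict mapping logical page number -> physical page number.
    """
    page = logical_address >> offset_bits
    offset = logical_address & ((1 << offset_bits) - 1)
    if page not in page_table:
        raise LookupError("page fault")  # page is not resident in memory
    # Concatenate the physical page base with the unchanged offset.
    return (page_table[page] << offset_bits) | offset
```

With 4 KB pages (12 offset bits) and logical page 2 mapped to physical page 5, logical address 0x2ABC translates to 0x5ABC.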

PAGE TABLE ORGANIZATIONS

[Figure: two page table organizations, flat and tree; each table entry is a page descriptor.]
ARM MEMORY
MANAGEMENT
 Memory region types:
 section: 1 Mbyte block;
 large page: 64 kbytes;
 small page: 4 kbytes.
 An address is marked as section-mapped or page-
mapped.
 Two-level translation scheme.

ARM ADDRESS TRANSLATION
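[Figure: two-level translation. The logical address is split into 1st index, 2nd index, and offset. The translation table base register is concatenated with the 1st index to select a descriptor in the 1st level table; that descriptor is concatenated with the 2nd index to select a descriptor in the 2nd level table; that descriptor is concatenated with the offset to form the physical address.]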
PIPELINING
 Several instructions are executed simultaneously at
different stages of completion.
 Various conditions can cause pipeline bubbles that
reduce utilization:
 branches;
 memory system delays;
 etc.

PERFORMANCE MEASURES
 Latency: time it takes for an instruction to get
through the pipeline.
 Throughput: number of instructions executed per
time period.
 Pipelining increases throughput without reducing
latency.
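The throughput gain can be illustrated with a cycle count, assuming an ideal pipeline in which every stage takes one cycle. Both function names and the optional stall parameter are assumptions for the sketch.

```python
def pipelined_cycles(n_instructions, n_stages, stall_cycles=0):
    """Cycles to finish n instructions on an ideal pipeline (one cycle per stage)."""
    if n_instructions == 0:
        return 0
    # The first instruction needs n_stages cycles; each later one finishes
    # one cycle after its predecessor, plus any stall (bubble) cycles.
    return n_stages + (n_instructions - 1) + stall_cycles


def unpipelined_cycles(n_instructions, n_stages):
    """Cycles when each instruction must finish before the next one starts."""
    return n_instructions * n_stages
```

Three instructions on a 3-stage pipeline finish in 5 cycles instead of 9, even though each instruction still spends 3 cycles in the pipeline.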

ARM7 PIPELINE
 ARM 7 has 3-stage pipe:
 fetch instruction from memory;
 decode opcode and operands;
 execute.

ARM PIPELINE EXECUTION

time            1        2        3        4        5
add r0,r1,#5    fetch    decode   execute
sub r2,r3,r6             fetch    decode   execute
cmp r2,#3                         fetch    decode   execute
PIPELINE STALLS
 If every step cannot be completed in the same
amount of time, pipeline stalls.
 Bubbles introduced by stall increase latency,
reduce throughput.

ARM MULTI-CYCLE LDMIA
INSTRUCTION

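[Figure: pipeline timing for ldmia r0,{r2,r3} followed by sub r2,r3,r6 and cmp r2,#3. The ldmia passes through fetch and decode and then occupies the execute stage for two cycles (ex ld r2, ex ld r3), so the following sub and cmp reach their execute steps (ex sub, ex cmp) one cycle later than they would otherwise.]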
CONTROL STALLS
 Branches often introduce stalls (branch penalty).
 Stall time may depend on whether branch is taken.
 May have to squash instructions that already
started executing.
 Don’t know what to fetch until condition is
evaluated.

MEMORY SYSTEM PERFORMANCE
 Caches introduce indeterminacy in execution time.
 Depends on order of execution.
 Cache miss penalty: added time due to a cache
miss.

TYPES OF CACHE MISSES


 Compulsory miss: location has not been referenced
before.
 Conflict miss: two locations are fighting for the
same block.
 Capacity miss: working set is too large.

CPU POWER CONSUMPTION
 Most modern CPUs are designed with power
consumption in mind to some degree.
 Power vs. energy:
 heat depends on power consumption;
 battery life depends on energy consumption.

CMOS POWER CONSUMPTION


 Voltage drops: power consumption proportional to
$V^2$.
 Toggling: more activity means more power.
 Leakage: basic circuit characteristics; can be
eliminated by disconnecting power.
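The voltage and toggling effects are commonly combined in the textbook dynamic power model, power = activity x capacitance x voltage squared x frequency. That model, the function name, and the parameter names are assumptions brought in for illustration; the slides state only the dependence on voltage squared and on activity.

```python
def dynamic_power(activity, capacitance, voltage, frequency):
    """Textbook CMOS dynamic power estimate (ignores leakage).

    activity: fraction of clock cycles on which the node toggles (0 to 1).
    capacitance in farads, voltage in volts, frequency in hertz.
    """
    return activity * capacitance * voltage ** 2 * frequency
```

Because voltage enters squared, halving the supply voltage cuts dynamic power to one quarter, whereas halving the clock frequency only halves it.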

CPU POWER-SAVING STRATEGIES
 Reduce power supply voltage.
 Run at lower clock frequency.
 Disable function units with control signals when
not in use.
 Disconnect parts from power supply when not in
use.
POWER MANAGEMENT STYLES
 Static power management: does not depend on
CPU activity.
 Example: user-activated power-down mode.
 Dynamic power management: based on CPU
activity.
 Example: turning off function units.
ARM7 - APPLICATIONS
 The ARM7 is ideally suited to those applications
requiring RISC performance from a compact, power-
efficient processor
 Telecomms - GSM terminal controller
 Datacomms - Protocol conversion
 Portable Computing - Palmtop computer
 Portable Instrument - Handheld data acquisition unit
 Automotive - Engine management unit
 Information systems - Smart cards
 Imaging - JPEG controller
