EC8791-Embedded and Real Time Systems UNITS NOTES
Objectives:
To understand the concept of embedded system design and analysis.
To learn the architecture of the ARM processor.
To learn the programming of the ARM processor.
To expose the basic concepts of embedded programming.
To learn real-time operating systems.
UNIT I INTRODUCTION TO EMBEDDED SYSTEM DESIGN
Complex systems and microprocessors– Embedded system design process –Design example:
Model train controller- Design methodologies- Design flows - Requirement Analysis –
Specifications-System analysis and architecture design – Quality Assurance techniques -
Designing with computing platforms – consumer electronics architecture – platform-level
performance analysis.
Course Outcomes:
TEXT BOOK:
UNIT I
INTRODUCTION TO EMBEDDED SYSTEM DESIGN
Introduction-Embedded Systems
⚫ An embedded system is an electronic system that has software embedded in
computer hardware.
⚫ It is a collection of components that executes a task according to a
program or the commands given to it.
⚫ Examples: microwave ovens, washing machines, telephone answering
machines, elevator controllers, printers, automobiles, cameras, etc.
www.rejinpaul.com
Embedded designer-skills
⚫ The designer needs knowledge in the following fields:
⚫ Microcontrollers, data communication, motors, sensors, measurements,
C programming, RTOS programming.
Levels of Microprocessor
1. 8-bit microcontrollers are used for low-cost applications and include
on-board memory and I/O devices.
2. 16-bit microcontrollers are used for more sophisticated applications that
may require either longer word lengths or off-chip I/O and memory.
3. 32-bit RISC microprocessors offer very high performance
for computation-intensive applications.
Microprocessor Uses/Applications
⚫ Microwave oven has at least one microprocessor to control oven operation
⚫ Thermostat systems, which change the temperature level at various times
during the day
⚫ The modern camera is a prime example of the powerful features that can
be added under microprocessor control.
⚫ Digital television makes extensive use of embedded processors.
2) SPECIFICATION
⚫ The specification must be carefully written so that it accurately reflects
the customer’s requirements and can be clearly followed during design.
3) Architecture Design
⚫ The architecture is a plan for the overall structure of the system.
⚫ It is usually given as a block diagram that shows the major operations and data flows.
4) Designing Hardware and Software Components
⚫ The architectural description tells us what components we need. These include
both hardware (FPGAs, boards) and software modules.
5) System Integration
⚫ Only after the components are built can we put them together and see
a working system.
⚫ Bugs are found during system integration, and good planning can help us find
the bugs quickly.
Design Process Steps
1. Requirements analysis of a GPS moving map
⚫ The moving map is a handheld device that displays for the user a map of
the terrain around the user’s current position.
⚫ The map display changes as the user and the map device change position.
⚫ The moving map obtains its position from the GPS, a satellite-based
navigation system.
Name: GPS moving map
Purpose: Consumer-grade moving map for driving use
Inputs: Power button, two control buttons
Outputs: Back-lit LCD display, 400 × 600
Functions: Uses 5-receiver GPS system; three user-selectable resolutions; always displays current latitude and longitude
Performance: Updates screen within 0.25 seconds upon movement
Manufacturing cost: $30
Power: 100 mW
Physical size and weight: No more than 2" × 6", 12 ounces
5) Cost: The selling cost of the unit should be no more than $100.
6) Physical size and weight: The device should fit comfortably in the palm of
the hand.
7) Power consumption: The device should run for at least 8 hours on four AA batteries.
8) Specification
1. Data received from the GPS satellite constellation.
2. Map data.
3. User interface.
4. Operations that must be performed to satisfy customer requests.
5. Background actions required to keep the system running, such
as operating the GPS receiver.
Block Diagram
Hardware architecture
• One central CPU is surrounded by memory and I/O devices.
• It uses two memories: a frame buffer for the pixels to be displayed
and a separate program/data memory for general use by the CPU.
Software architecture
A timer controls when we read the buttons on the user interface and render data onto the screen.
Units in the software block diagram will be executed in the hardware block diagram and when opera
Sequence diagram in UML
⚫ A sequence diagram is similar to a hardware timing diagram, although time
flows vertically in a sequence diagram, whereas time typically flows horizontally
in a timing diagram.
⚫ It is designed to show a particular choice of events; it is not convenient
for showing a number of mutually exclusive possibilities.
REQUIREMENTS
⚫ The console shall be able to control up to eight trains on a single track.
⚫ The speed of each train shall be controllable by a throttle to at least 63
different levels in each direction (forward and reverse).
⚫ There shall be an inertia control to adjust the speed of train.
⚫ There shall be an emergency stop button.
⚫ An error detection scheme will be used to transmit messages.
Requirements: Chart Format
⚫ Name: Model train controller
⚫ Purpose: Control speed of up to eight model trains
⚫ Inputs: Throttle, inertia setting, emergency stop, train number
⚫ Outputs: Train control signals
⚫ Functions: Set engine speed based upon inertia settings; respond
Baseline packet
⚫ The minimum packet that must be accepted by all DCC implementations.
⚫ It has three data bytes.
⚫ Address data byte gives the intended receiver of the packet
⚫ Instruction data byte provides a basic instruction
⚫ Error correction data byte is used to detect and correct transmission errors.
Data byte (speed and direction instruction)
Bits 0–3 provide a 4-bit speed value.
Bit 4 has an additional speed bit.
Bit 5 gives direction, with 1 for forward and 0 for reverse.
Bits 6–7 are set to 01 to indicate that this instruction provides speed and direction.
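The bit layout above can be sketched in C. This is a minimal illustration, not the notes' own code: the names `dcc_packet_t`, `dcc_make_baseline` and `dcc_packet_ok` are hypothetical, and the error byte is assumed to be the XOR of the address and instruction bytes, as in the DCC standard (the notes do not state how it is computed).

```c
#include <stdint.h>

/* Baseline packet: three data bytes (address, instruction, error). */
typedef struct {
    uint8_t address;      /* intended receiver of the packet */
    uint8_t instruction;  /* 01DCSSSS: direction, extra speed bit, 4-bit speed */
    uint8_t error;        /* assumption: XOR of the two bytes above */
} dcc_packet_t;

/* Build a speed-and-direction packet from the fields described in the text. */
dcc_packet_t dcc_make_baseline(uint8_t address, uint8_t speed4,
                               int extra_bit, int forward)
{
    dcc_packet_t p;
    p.address = address;
    p.instruction = (uint8_t)(0x40u                       /* bits 7-6 = 01 */
                  | ((forward ? 1u : 0u) << 5)            /* bit 5: direction */
                  | ((extra_bit ? 1u : 0u) << 4)          /* bit 4: extra speed bit */
                  | (speed4 & 0x0Fu));                    /* bits 3-0: speed */
    p.error = (uint8_t)(p.address ^ p.instruction);
    return p;
}

/* A receiver accepts the packet only if the XOR of all three bytes is zero. */
int dcc_packet_ok(const dcc_packet_t *p)
{
    return (uint8_t)(p->address ^ p->instruction ^ p->error) == 0;
}
```

For example, train 3 moving forward with speed value 5 and the extra speed bit set gives the instruction byte 0x75; flipping any single bit of the packet makes `dcc_packet_ok` fail.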
Conceptual Specification
⚫ Conceptual specification allows us to understand the system a little better.
⚫ A train control system turns commands into packets.
⚫ A command comes from the command unit while a packet is transmitted over
the rails.
⚫ Commands and packets may not be generated in a 1-to-1 ratio
Panel
• The Panel class defines a behavior for each of the controls on the panel.
• The new-settings behavior uses the set-knobs behavior of the Knobs* class
to change the knob settings whenever the train number setting is changed.
• The Motor-interface defines an attribute for speed that can be set by
other classes.
⚫ The formatter holds the current control settings for all of the trains.
⚫ The send-command serves as the interface to the transmitter.
⚫ The operate function performs the basic actions for the object.
⚫ The panel-active behavior returns true whenever the panel’s values do
not correspond to the current values.
1.5.1) Waterfall model
⚫ The waterfall development model consists of five major phases.
⚫ Requirements analysis determines the basic characteristics of the system.
⚫ Architecture design decomposes the functionality into major components.
⚫ Coding implements the pieces and integrates them.
⚫ Testing uncovers bugs.
⚫ Maintenance entails deployment in the field, bug fixes, and upgrades.
⚫ In the waterfall model, work and information flow from higher levels of abstraction to
more detailed design steps.
1.7) SPECIFICATIONS
⚫ Specifications are detailed descriptions of the system that can be used to
create the architecture.
Control-oriented specification languages
⚫ SDL specifications include states, actions, and both conditional and
unconditional transitions between states.
⚫ SDL is an event-oriented state machine model.
⚫ The Statechart notation adds some important concepts.
⚫ Statecharts allow states to be grouped together to show common functionality.
⚫ Basic groupings (OR)
⚫ The traditional state machine specifies that the machine goes to state s4 from any of
s1, s2, or s3 when it receives the input i2.
⚫ The Statechart denotes this commonality by drawing an OR state around s1, s2, and s3.
⚫ A single transition out of the OR state s123 specifies that the machine goes to s4 when
it receives the i2 input while in any state included in s123.
⚫ There are still multiple ways to get into s123 (via s1 or s2), and there are transitions
between states within the OR state (from s1 to s3 or s2 to s3).
⚫ The OR state is simply a tool for specifying some of the transitions relating to these
states.
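The OR grouping can be illustrated with a small C transition function. This is a sketch under stated assumptions: the names `state_t`, `input_t`, `in_s123` and `step` are hypothetical, and the input that triggers the internal s1-to-s3 and s2-to-s3 transitions is assumed to be i1, since the notes do not name it.

```c
typedef enum { S1, S2, S3, S4 } state_t;
typedef enum { I1, I2 } input_t;

/* Membership test for the OR state s123. */
static int in_s123(state_t s)
{
    return s == S1 || s == S2 || s == S3;
}

/* One step of the machine. */
state_t step(state_t s, input_t in)
{
    /* The single transition out of the OR state replaces three
       separate s1->s4, s2->s4 and s3->s4 transitions. */
    if (in_s123(s) && in == I2)
        return S4;

    /* Transitions inside the OR state (trigger i1 is an assumption). */
    if ((s == S1 || s == S2) && in == I1)
        return S3;

    return s; /* no transition defined: stay in the current state */
}
```

Writing the exit rule once, against the group rather than against each member, is the same economy the OR state gives in the diagram.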
⚫ Basic groupings (AND)
⚫ In the Statechart, the AND state sab is decomposed into two components, sa and sb.
⚫ When the machine enters the AND state, it simultaneously inhabits the state s1
of component sa and the state s3 of component sb.
⚫ Once it has entered sab, knowing the complete state of the machine requires examining both sa and sb.
⚫ State s1-3 of the traditional machine corresponds to the Statechart machine having its sa
component in s1 and its sb component in s3.
⚫ We exit this cluster of states to go to s5 only when, in the traditional specification, we are
in state s2-4 and receive input r.
Ex:Elevator system
1. One passenger requests a car on a floor, gets in the car when it arrives,
requests another floor, and gets out when the car reaches that floor.
2. One passenger requests a car on a floor, gets in the car when it arrives,
and requests the floor that the car is currently on.
3. A second passenger requests a car while another passenger is riding in
the elevator.
4. Two people push floor buttons on different floors at the same time.
5. Two people push car control buttons in different cars at the same time.
SOFTWARE
Run Time components
⚫ Run-time components are a critical part of the platform.
⚫ An operating system is required to control the CPU and its
multiple processes.
⚫ A file system is used in many embedded systems to organize
internal data and to interface with other systems.
Support components
Support components are needed to make use of a complex hardware platform.
Without proper code development tools and an operating system, the hardware itself is useless.
ARM evaluation board
1.10.2)The PC as a Platform
1.10.5)Debugging Challenges
⚫ Logical errors in software can be hard to track down, and they
create many problems in real-time code.
⚫ Real-time programs are required to finish their work within a
certain amount of time.
⚫ If real-time programs run too long, they can create very unexpected behavior.
⚫ Missed deadlines make the debugging process difficult.
⚫ It is a two-processor architecture.
⚫ If more computation is required, more DSPs and CPUs may be added.
⚫ The RISC CPU runs the operating system, runs the user interface,
maintains the file system, etc.
⚫ The DSP is programmable and performs signal processing.
⚫ The operating system that runs on the CPU must maintain processes and the
file system.
⚫ Depending on the complexity of the device, the operating system may not need
to create tasks dynamically.
If all tasks can be created using initialization code, the operating system can be made smaller and sim
For example, if we widen the bus to carry four bytes (32 bits) per transfer, we reduce
the transfer time to 0.058 s. If we also increase the bus clock rate to 2 MHz,
we reduce the transfer time to 0.029 s, which is within our time
budget for the transfer.
The transfer time is t = TP, where
t is the real time taken by the transfer,
T is the number of bus cycles, and
P is the bus clock period.
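The relation t = TP can be checked with a short C sketch. The data size is an assumption: the notes do not state it, but 230,400 bytes (one 320 × 240 frame at 3 bytes per pixel) with a 1 MHz baseline clock reproduces the 0.058 s and 0.029 s figures. The function name `bus_transfer_time` is hypothetical, and one transfer per bus cycle with no overhead is assumed.

```c
/* Transfer time t = T * P for a block of data.
   T = number of bus cycles (one transfer per cycle, no overhead assumed)
   P = bus clock period = 1 / clock frequency */
double bus_transfer_time(unsigned long bytes,
                         unsigned bytes_per_transfer,
                         double clock_hz)
{
    /* Round up so a partial final word still costs one cycle. */
    unsigned long cycles =
        (bytes + bytes_per_transfer - 1) / bytes_per_transfer;   /* T */
    double period = 1.0 / clock_hz;                               /* P */
    return (double)cycles * period;                               /* t */
}
```

With these assumed numbers, `bus_transfer_time(230400, 4, 1e6)` gives about 0.0576 s and `bus_transfer_time(230400, 4, 2e6)` gives about 0.0288 s, matching the rounded values in the text.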
1.12.1 Parallelism
⚫ Direct memory access (DMA) is an example of parallelism.
⚫ DMA was designed to off-load memory transfers from the CPU.
⚫ The CPU can do other useful work while the DMA transfer is running.
UNIT II
ARM7 family
⚫ The ARM7 core has a Von Neumann style architecture.
⚫ The ARM7TDMI was the first of a new range of processors introduced
by ARM in 1995.
⚫ It provides a very good performance-to-power ratio.
⚫ The ARM7TDMI-S is the synthesizable version of the ARM7TDMI.
⚫ The ARM720T is the most flexible member of the ARM7 family
because it includes an MMU. The MMU allows it to run
platform operating systems such as Linux and Windows.
⚫ It has a unified 8 KB cache, and its vector table
can be relocated.
ARM9 family
⚫ The ARM9 family was announced in 1997.
⚫ ARM9 has a five-stage pipeline and runs at high
clock frequencies.
⚫ The memory system was redesigned to follow the Harvard architecture.
⚫ ARM9 processors include a cache and an MMU, so they can run
operating systems requiring virtual memory support.
⚫ The ETM (Embedded Trace Macrocell) allows a
developer to trace instruction and data execution in
real time, so that debugging can be done
during time-critical segments.
ARM10 FAMILY
⚫ The ARM10, announced in 1999, was designed for
performance.
⚫ It extends the ARM9 pipeline to six stages.
⚫ It supports an optional vector floating-point (VFP) unit, which adds a seventh
stage to the ARM10 pipeline.
⚫ The VFP complies with the IEEE 754.1985 floating-point standard.
⚫ The ARM1020E includes the enhanced E instructions. It has a
cache, a VFP unit, and an MMU.
⚫ The ARM1026EJ-S is similar to the ARM926EJ-S; it combines ARM10
performance with the flexibility of the ARM926EJ-S.
ARM11
⚫ The ARM1136J-S, announced in 2003, was designed for
high-performance and power-efficient applications.
⚫ The ARM1136J-S was the first processor to
execute ARMv6 architecture instructions.
⚫ It has an eight-stage pipeline with separate load-store and
arithmetic pipelines.
⚫ The ARMv6 instructions include single instruction, multiple data (SIMD) extensions.
ARM classic processor
ARM Data Instruction
⚫ Example Program:
int a, b, c, X;
X = a + b - c;
Move instruction
⚫ MOV: Rd = operand2
⚫ MVN: Rd = NOT operand2
⚫ MOVS: as MOV, and also updates the flags in the status register
Syntax:
<Operation>{<cond>}{S} Rd, Operand2
Examples:
MOV r0, r1
MOVS r2, #10
Rotate Right Extended (RRX)
• This operation uses the CPSR C flag as a 33rd bit.
• It rotates right by 1 bit: the old C flag moves into bit 31, and bit 0 moves into the C flag.
• It is encoded as ROR #0.
[Figure: Rotate Right, showing Destination and CF]
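The effect of RRX on a 32-bit value can be modelled in C. This is only a software model of the 33-bit rotation described above, with a hypothetical function name `rrx`; on an ARM core the operation is done by the barrel shifter as part of operand2.

```c
#include <stdint.h>

/* Model of RRX: rotate the 33-bit quantity {C, value} right by one bit.
   carry_in  : C flag before the operation (0 or 1)
   carry_out : receives the C flag after the operation (old bit 0) */
uint32_t rrx(uint32_t value, unsigned carry_in, unsigned *carry_out)
{
    *carry_out = value & 1u;                       /* bit 0 falls into C */
    return (value >> 1)                            /* shift right by one */
         | ((uint32_t)(carry_in & 1u) << 31);      /* old C enters bit 31 */
}
```

For instance, with the C flag set, `rrx(0x00000003, 1, &c)` returns 0x80000001 and leaves `c` equal to 1.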
Arithmetic instruction
Example
⚫ ADD r0, r1, r2
R0 = R1 + R2
⚫ SUB r5, r3, #10
R5 = R3 − 10
⚫ RSB r2, r5, #0xFF00
R2 = 0xFF00 − R5
Logical instruction
Comparison instruction
Multiply instruction
2. Branch Instructions
Swap instruction
Program Status Register Instructions
Coprocessor Instruction
⚫ Calling A subroutine
⚫ Parameter passing
⚫ Software delay
2.7 PERIPHERALS:
Software abstraction layers executing on hardware
⚫ Timer 0 registers
1. T0IR (Timer 0 Interrupt Register)
2.10 UART
⚫ UART: Universal Asynchronous Receiver/Transmitter
ARM9TDMI Pipeline Process
DATA FLOW
COMPARISON SUMMARY
LPC2148
⚫ It has a 32-bit timer/counter, PWMTC.
⚫ The counter counts cycles of the peripheral clock (PCLK).
⚫ It has a 32-bit prescale register (PWMPR).
⚫ It has 7 match registers (PWMMR0 to PWMMR6).
⚫ It can produce 6 different PWM signals in single-edge-controlled
mode or 3 different PWM signals in double-edge-controlled mode.
⚫ When the timer/counter matches a match register, the timer/counter
can be reset or stopped.
PWM Registers
1. PWMTCR (PWM Timer Control Register)
⚫ It is an 8-bit register.
⚫ It is used to control the operation of the PWM Timer Counter.
⚫ Bit 0 – Counter Enable
When 1, the PWM Timer Counter and PWM Prescale Counter are enabled.
When 0, the counters are disabled.
⚫ Bit 1 – Counter Reset
When 1, the PWM Timer Counter and PWM Prescale Counter are synchronously
reset on the next positive edge of PCLK.
The counters remain reset until this bit is returned to 0.
⚫ Bit 3 – PWM Enable
This bit must always be 1 for PWM operation; otherwise the PWM block operates as a
normal timer.
When 1, PWM mode is enabled and the shadow registers operate along with the match
registers.
A write to a match register has no effect until the corresponding bit in
PWMLER is set.
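The control bits above map naturally onto C bit masks. The sketch below only computes register values on the host; it does not touch hardware. The macro and function names are hypothetical, and the reset-then-run sequence shown is a common pattern assumed here rather than one given in the notes.

```c
#include <stdint.h>

/* Bit positions taken from the register description above. */
#define PWMTCR_COUNTER_ENABLE (1u << 0)
#define PWMTCR_COUNTER_RESET  (1u << 1)
#define PWMTCR_PWM_ENABLE     (1u << 3)

/* Value that holds the timer counter and prescale counter in reset. */
uint8_t pwmtcr_reset_value(void)
{
    return (uint8_t)PWMTCR_COUNTER_RESET;
}

/* Value that releases reset, starts the counters and enables PWM mode. */
uint8_t pwmtcr_run_value(void)
{
    return (uint8_t)(PWMTCR_COUNTER_ENABLE | PWMTCR_PWM_ENABLE);
}

/* True when the counters run in PWM mode rather than as a plain timer. */
int pwmtcr_is_pwm_running(uint8_t tcr)
{
    return (tcr & PWMTCR_COUNTER_ENABLE) != 0
        && (tcr & PWMTCR_PWM_ENABLE) != 0
        && (tcr & PWMTCR_COUNTER_RESET) == 0;
}
```

On the target, these values would be written to the PWMTCR register defined in the device header; the run value works out to 0x09 and the reset value to 0x02.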
⚫ Bit 2 – PWMSEL2 (in PWMPCR, the PWM Control Register)
0 = Single-edge-controlled mode for PWM2
1 = Double-edge-controlled mode for PWM2
UNIT III
EMBEDDED PROGRAMMING
Syllabus
1.3 QUEUES
• Queues are also used in signal processing and
event processing.
• Queues are used whenever data may arrive and depart at
somewhat unpredictable times or when variable amounts
of data may arrive.
• A queue is often referred to as an Elastic buffer.
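A fixed-size circular buffer is one common way to build such an elastic buffer. The sketch below is a minimal single-threaded version; the names `queue_t`, `queue_init`, `queue_enqueue` and `queue_dequeue` and the capacity of 8 are assumptions for illustration. Sharing it between an interrupt handler and a task would also require protecting the updates.

```c
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8   /* assumption: small fixed capacity */

typedef struct {
    int    data[QUEUE_CAPACITY];
    size_t head;    /* index of the oldest element */
    size_t tail;    /* index where the next element is written */
    size_t count;   /* number of elements currently stored */
} queue_t;

void queue_init(queue_t *q)
{
    q->head = q->tail = q->count = 0;
}

/* Add a value at the tail; returns false if the queue is full. */
bool queue_enqueue(queue_t *q, int value)
{
    if (q->count == QUEUE_CAPACITY)
        return false;
    q->data[q->tail] = value;
    q->tail = (q->tail + 1) % QUEUE_CAPACITY;   /* wrap around */
    q->count++;
    return true;
}

/* Remove the oldest value; returns false if the queue is empty. */
bool queue_dequeue(queue_t *q, int *out)
{
    if (q->count == 0)
        return false;
    *out = q->data[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;   /* wrap around */
    q->count--;
    return true;
}
```

The producer calls `queue_enqueue` whenever data arrives and the consumer calls `queue_dequeue` when it is ready, so the two sides do not need to run at matching times.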
2. MODELS OF PROGRAMS
• A program is a collection of instructions that executes a
specified task.
• Models for programs are more general than source code.
• Source code is not convenient to use directly because it comes in
different types, such as assembly language and C code.
• A single model can describe all of them.
• The control/data flow graph (CDFG) is the fundamental
model for programs.
⚫ A basic block in C
An extended data flow graph for our sample basic block
while (a < b) {
    a = proc1(a,b);
    b = proc2(a,b);
}
3. ASSEMBLY, LINKING AND LOADING
• Assembly and linking are the last steps in the compilation process.
• They convert a list of instructions into an image of the
program’s bits in memory.
• Loading puts the program in memory so that it can
be executed.
3.1 Assemblers
• The assembler translates assembly code into object code.
It must translate opcodes, format the bits in
each instruction, and translate labels into addresses.
• Labels are an abstraction provided by the assembler.
• Labels free the programmer from having to know the locations of instructions and
data. Label processing requires making two passes:
1. The first pass scans the code to determine the address of each label.
2. The second pass assembles the instructions using the label
values computed in the first pass.
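The two passes can be sketched on a toy input in C. This is an illustration under several assumptions, not a real assembler: a line ending in ':' is a label, every other line is one fixed-size 4-byte instruction, and the only label use handled is a branch written as "B name". All type and function names (`symtab_t`, `pass1`, `pass2`, `lookup`) are hypothetical.

```c
#include <string.h>

#define MAX_LABELS 16
#define INSTR_SIZE 4u             /* assumption: fixed 4-byte instructions */
#define NO_TARGET  0xFFFFFFFFu    /* marks "no label referenced" */

typedef struct {
    const char *name;   /* points at the label text in the source line */
    size_t      len;    /* length without the trailing ':' */
    unsigned    addr;   /* address assigned in pass 1 */
} label_t;

typedef struct {
    label_t labels[MAX_LABELS];
    int     count;
} symtab_t;

static int is_label(const char *line)
{
    size_t n = strlen(line);
    return n > 1 && line[n - 1] == ':';
}

/* Pass 1: advance a location counter and record each label's address. */
void pass1(const char *const *lines, int nlines, unsigned origin, symtab_t *tab)
{
    unsigned pc = origin;
    tab->count = 0;
    for (int i = 0; i < nlines; i++) {
        if (is_label(lines[i])) {
            if (tab->count < MAX_LABELS) {
                label_t *l = &tab->labels[tab->count++];
                l->name = lines[i];
                l->len  = strlen(lines[i]) - 1;
                l->addr = pc;
            }
        } else {
            pc += INSTR_SIZE;   /* only instructions occupy memory */
        }
    }
}

static unsigned lookup(const symtab_t *tab, const char *name)
{
    size_t n = strlen(name);
    for (int i = 0; i < tab->count; i++)
        if (tab->labels[i].len == n &&
            strncmp(tab->labels[i].name, name, n) == 0)
            return tab->labels[i].addr;
    return NO_TARGET;
}

/* Pass 2: for each instruction, resolve a branch target using the
   label values from pass 1. Returns the number of instructions. */
int pass2(const char *const *lines, int nlines, const symtab_t *tab,
          unsigned *targets)
{
    int k = 0;
    for (int i = 0; i < nlines; i++) {
        if (is_label(lines[i]))
            continue;
        targets[k++] = (strncmp(lines[i], "B ", 2) == 0)
                     ? lookup(tab, lines[i] + 2)
                     : NO_TARGET;
    }
    return k;
}
```

The first pass is what makes forward references work: a branch to a label defined later in the file can only be resolved because every label address is already known when the second pass runs.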
EXAMPLE
3.2) LINKING
• A linker allows a program to be stitched together out of several smaller
pieces.
• The linker operates on the object files and makes the necessary links between files.
• Some labels will be both defined and used in the same file.
• Other labels will be defined in a single file but used elsewhere.
• The place in the file where a label is defined is known as an entry point.
• The place in the file where the label is used is called an external
reference.
Phases of linker
• First phase: it determines the address of the start of each object file.
• Second phase: the loader merges all symbol tables from the object
files into a single, large table.
4. PROGRAM-LEVEL PERFORMANCE ANALYSIS
• The techniques we use to analyze program execution time are
also helpful in analyzing properties such as power
consumption.
• The goal is for the CPU to execute the entire program at the rate we desire.
• The execution time of a program often varies with the input
data values.
• The cache has a major effect on program performance.
• Cache’s behavior depends in part on the data values input to
the program.
• The execution time of an instruction in a pipeline depends
not only on that instruction but on the instructions around it
in the pipeline.
Execution time of a program
4.1. Program Performance Measuring techniques
1. Simulator
• It runs on a PC, takes as input an executable for the
microprocessor along with input data, and simulates the
program.
2. Timer
Cyclomatic Complexity
• It is a software metric.
• It is used to measure the control complexity of a program:
M = e – n + 2p
• e is the number of edges in the flow graph
• n is the number of nodes in the flow graph
• p is the number of components in the graph
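The formula is simple enough to evaluate directly. The C sketch below uses a hypothetical function name, and the two worked graphs in the note that follows are illustrative examples rather than ones taken from the notes.

```c
/* Cyclomatic complexity M = e - n + 2p of a control flow graph.
   edges      : e, number of edges
   nodes      : n, number of nodes
   components : p, number of connected components (1 for a single routine) */
int cyclomatic_complexity(int edges, int nodes, int components)
{
    return edges - nodes + 2 * components;
}
```

A straight-line sequence of three nodes has 2 edges, giving M = 2 - 3 + 2 = 1. An if/else has 4 nodes (test, then-branch, else-branch, join) and 4 edges, giving M = 4 - 4 + 2 = 2, which reflects the two independent paths through the code.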
2. Regression Tests
• Tests created for earlier versions of the system should be saved
and applied to later versions of the system.
• They simply exercise the current version of the code and may
uncover different bugs.
• In digital signal processing systems, signal processing algorithms
are often implemented with limited-range arithmetic to save hardware costs.
• Data sets can be generated to stress the numerical accuracy of the
system.
• These tests can often be generated from the original
formulas without reference to the source code.
UNIT IV
REAL TIME SYSTEMS
Operating System
⚫ An operating system performs all the basic tasks,
such as managing files, processes, and memory.
⚫ The operating system thus acts as the manager of all the
resources, i.e. as a resource manager.
⚫ The operating system thereby becomes an interface between the user and the machine.
Time-Sharing Operating Systems
⚫ Example 2
⚫ Accounting of Pipeline
⚫ Cache Memory
⚫ Virtual Memory
⚫ Release Time
⚫ The release time of a task is the time at which all the data that are required
to begin executing the task are available.
⚫ Deadline
⚫ The deadline is the time by which the task must complete
its execution.
⚫ The deadline may be hard or soft.
Tasks are classified as
Periodic
Sporadic
Aperiodic
⚫ Periodic Task
⚫ A task Ti is periodic if it is released periodically,
say every Pi seconds. Pi is called the period of task Ti.
⚫ Sporadic Task
⚫ A sporadic task is not periodic, but may be invoked at
irregular intervals.
⚫ Sporadic tasks are characterized by an upper bound on the rate at
which they may be invoked.
⚫ Aperiodic Task
⚫ Aperiodic tasks are those tasks which are not periodic and which also have no upper bound on
A schedule may be
Precomputed (offline scheduling)
Dynamic (online scheduling)
⚫ Precomputed
⚫ The schedule is computed in advance of operation, specifying when the periodic
tasks will be run and leaving slots for the sporadic / aperiodic tasks in the event
that they are invoked.
⚫ Dynamic
⚫ Tasks are scheduled as they arrive in the system.
⚫ The algorithm used in online scheduling must be
fast; an algorithm that takes longer to schedule the tasks than the
tasks have to meet their deadlines is clearly useless.
⚫ Two types of priority algorithms are used:
⚫ Static priority algorithm
⚫ Dynamic priority algorithm
UNIT V
PROCESSES AND OPERATING
SYSTEMS
Introduction – Multiple tasks and multiple processes – Multirate
systems – Preemptive real-time operating systems – Priority-based
scheduling – Interprocess communication mechanisms – Evaluating
operating system performance – Power optimization strategies for
processes – Example real-time operating systems: POSIX, Windows CE.
Distributed embedded systems – MPSoCs and shared memory
multiprocessors. Design examples – Audio player, Engine control
unit, Video accelerator.
5.1) INTRODUCTION
⚫ Simple applications can be programmed on a microprocessor by writing a
single piece of code.
⚫ But for a complex application, multiple operations must be performed at
widely varying times.
⚫ Two fundamental abstractions allow us to build complex applications
on microprocessors:
1. The process defines the state of an executing program.
2. The operating system (OS) provides the mechanism for switching execution between the processes.
5.2.2) Process
⚫ A process is a single execution of a program.
⚫ If we run the same program two different times, we have created
two different processes.
⚫ Each process has its own state that includes not only its registers but
all of its memory.
⚫ In some OSs, the memory management unit is used to keep
each process in a separate address space.
In others, particularly lightweight RTOSs, the processes run in the
same address space.
Processes that share the same address space are often called threads.
⚫ The simplest automotive engine controllers, such as the ignition controller for
a basic motorcycle engine, perform only one task: timing the firing of the
spark plug, which takes the place of a mechanical distributor.
Spark Plug
⚫ The spark plug must be fired at a certain point in the combustion cycle.
Microcontroller
⚫ Using a microcontroller that senses the engine crankshaft position allows
the spark timing to vary with engine speed.
⚫ Firing the spark plug is a periodic process.
Engine controller
⚫ Automobile engine controllers use additional sensors, including the gas
pedal position and an oxygen sensor used to control emissions.
⚫ They also use a multimode control scheme. One mode may be used for
engine warm-up, another for cruise, and yet another for climbing steep hills.
⚫ The engine controller takes a variety of inputs that determine the state of
the engine.
⚫ It then controls two basic engine parameters: the spark plug firings and
the fuel/air mixture.
2. Deadline
⚫ The deadline specifies when a computation must be finished.
⚫ The deadline for an aperiodic process is generally measured from the
release time or initiation time.
⚫ The deadline for a periodic process may occur at the end of the period.
⚫ The period of a process is the time between successive executions.
⚫ The process’s rate is the inverse of its period.
⚫ In a multirate system, each process executes at its own distinct rate.
•In this case, the initiation interval is equal to one fourth of the period.
•It is possible for a process to have an initiation rate less than the period even
in single-CPU systems.
•If the process execution time is less than the period, it may be possible to
initiate multiple copies of a program at slightly offset times.
The system decoder process demultiplexes the audio and video data and distributes it to the appropr
Missing Deadline
Missing deadline in a multimedia system may cause an audio or video glitch.
The system can be designed to take a variety of actions when a deadline is missed.
⚫ A process goes into the waiting state when it needs data that it has not yet received
or when it has finished all its work for the current period.
⚫ A process goes into the ready state when it receives its required data or when it
enters a new period.
⚫ Finally, a process can go into the executing state only when it has all its data, is ready
to run, and the scheduler selects it as the next process to run.
2)Round Robin-scheduling
⚫ Uses the same hyper period as does cyclostatic.
⚫ It also evaluates the processes in order.
⚫ If a process does not have any useful work to do, the scheduler moves on to the
next process in order to fill the time slot with useful work.
5.4.3) Priorities
⚫ The kernel uses priorities to decide which process runs next.
⚫ It looks at the processes and their priorities, sees which ones actually want to
execute, and selects the highest-priority process that is ready to run.
⚫ This mechanism is both flexible and fast.
⚫ The priority is a non-negative integer value.
• When the system begins execution, P2 is the only ready process, so it is selected for execution.
• At T = 15, P1 becomes ready; it preempts P2 because P1 has a higher priority, so it
executes immediately.
• P3’s data arrive at T = 18, but P3 has the lowest priority.
• P2 is still ready and has higher priority than P3.
• Only after both P1 and P2 finish can P3 execute.
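The selection rule used in this example can be written as a short C function. This is a sketch with hypothetical names (`process_t`, `select_next`), and it assumes that a smaller priority number means a higher priority, which matches P1 outranking P2 and P3 here but is not stated in the notes.

```c
#include <stdbool.h>

typedef struct {
    int  priority;   /* assumption: smaller number = higher priority */
    bool ready;      /* true when the process has its data and wants to run */
} process_t;

/* Return the index of the highest-priority ready process,
   or -1 if no process is ready. */
int select_next(const process_t *procs, int n)
{
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (!procs[i].ready)
            continue;                       /* skip waiting processes */
        if (best < 0 || procs[i].priority < procs[best].priority)
            best = i;
    }
    return best;
}
```

Calling `select_next` each time a process becomes ready or finishes reproduces the trace above: P2 runs first, P1 takes over when it becomes ready, and P3 runs only when neither P1 nor P2 is ready.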
5.4.4) Context Switching
⚫ To understand the basics of a context switch, let’s assume that the set of tasks
is in steady state.
⚫ Everything has been initialized, the OS is running, and we are ready for a
timer interrupt.
⚫ The diagram for this example shows the application tasks, the hardware timer, and all
the functions in the kernel that are involved in the context switch.
⚫ vPreemptiveTick() is called when the timer ticks.
⚫ portSAVE_CONTEXT() swaps out the current task context.
⚫ vTaskSwitchContext() chooses a new task.
⚫ portRESTORE_CONTEXT() swaps in the new context.
Example-Rate-monotonic scheduling
⚫ Consider a set of processes and their characteristics.
⚫ In this case, even though each process alone has an execution time significantly less
than its period, combinations of processes can require more than 100% of the available
CPU cycles.
⚫ During one 12 time-unit interval, we must execute P1 three times, requiring 6 units of
CPU time; P2 twice, costing 6 units; and P3 one time, costing 3 units.
⚫ The total of 6 + 6 + 3 = 15 units of CPU time is more than the 12 time units
available, clearly exceeding the available CPU capacity.
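The arithmetic in this example can be reproduced with a small C function. The periods and execution times used below are inferred from the text rather than stated in it (P1: three runs of 2 units in 12, so period 4; P2: two runs of 3 units, so period 6; P3: one run of 3 units, so period 12), and the names `task_t`, `cpu_demand` and `is_overloaded` are hypothetical.

```c
typedef struct {
    unsigned period;      /* time between successive releases */
    unsigned exec_time;   /* CPU time needed per execution */
} task_t;

/* Total CPU time requested in an interval. The interval is assumed to be
   a multiple of every period, so each task runs interval/period times. */
unsigned cpu_demand(const task_t *tasks, int n, unsigned interval)
{
    unsigned total = 0;
    for (int i = 0; i < n; i++)
        total += (interval / tasks[i].period) * tasks[i].exec_time;
    return total;
}

/* No schedule can be feasible if demand exceeds the interval length. */
int is_overloaded(const task_t *tasks, int n, unsigned interval)
{
    return cpu_demand(tasks, n, interval) > interval;
}
```

With the inferred task set {4, 2}, {6, 3}, {12, 3} and an interval of 12, `cpu_demand` returns 15, so `is_overloaded` reports that no priority assignment can make the set schedulable.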
⚫ If we give higher priority to P2, then we must execute all of P2 and all of P1 in one of
P1’s periods in the worst case.
⚫ The hyper-period is 60.
⚫ There is one time slot left at t = 30, giving a CPU utilization of 59/60.
⚫ EDF can achieve 100% utilization.
RMS vs. EDF
Ex:Priority inversion
⚫ Low-priority process blocks execution of a higher priority process by keeping
hold of its resource.
Consider a system with two processes
⚫ Higher-priority P1 and the lower-priority P2.
⚫ Each uses the microprocessor bus to communicate to peripherals.
⚫ When P2 executes, it requests the bus from the operating system and receives it.
⚫ If P1 becomes ready while P2 is using the bus, the OS will preempt P2 for P1,
leaving P2 with control of the bus.
When P1 requests the bus, it will be denied the bus, since P2 already owns it.
Unless P1 has a way to take the bus from P2, the two processes may deadlock.
⚫ We know that P1 and P2 cannot execute at the same time, since P1 must finish before P2
can begin.
⚫ Although P3 has a higher priority, it will not preempt both P1 and P2 in a single iteration.
⚫ If P3 preempts P1, then P3 will complete before P2 begins.
⚫ If P3 preempts P2, then it will not interfere with P1 in that iteration.
⚫ Because we know that some combinations of processes cannot be ready at the same time,
worst-case CPU requirements are less than would be required if all processes could be
ready simultaneously.
⚫ CPU and the I/O device want to communicate through a shared memory block.
⚫ There must be a flag that tells the CPU when the data from the I/O device is ready.
⚫ The flag has a value of 0 when the data are not ready and 1 when the data are ready.
⚫ If the flag is used only by the CPU, then the flag can be implemented using a
standard memory write operation.
⚫ If the same flag is used for bidirectional signaling between the CPU and the I/O
device, care must be taken.
Consider the following scenario:
1. The CPU reads the flag location and sees that it is 0.
2. The I/O device reads the flag location and sees that it is 0.
3. The CPU sets the flag location to 1 and writes data to the shared location.
4. The I/O device erroneously sets the flag to 1 and overwrites the data left by the CPU.
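The race in this scenario comes from reading the flag and setting it in two separate steps. A common remedy is an atomic test-and-set, which does both in one indivisible operation. The sketch below shows the idea with C11 atomics between software agents; the names `try_claim` and `release_claim` are hypothetical, and sharing a flag with a real I/O device would additionally need an atomic bus operation supported by the hardware.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Shared flag: clear means the shared location is free. */
static atomic_flag shared_flag = ATOMIC_FLAG_INIT;

/* Atomically read the old flag value and set the flag.
   Returns true only for the one caller that saw the flag clear,
   so two agents can never both believe they own the location. */
bool try_claim(void)
{
    return !atomic_flag_test_and_set(&shared_flag);
}

/* Give the shared location back once the data have been consumed. */
void release_claim(void)
{
    atomic_flag_clear(&shared_flag);
}
```

Because the test and the set cannot be separated, step 2 of the scenario would see the flag already set and the second writer would back off instead of overwriting the data.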
5.5.3) Signals
⚫ Signals are a form of interprocess communication commonly used in Unix.
⚫ A signal is analogous to an interrupt, but it is entirely a software creation.
⚫ A signal is generated by a process and transmitted to another process by the OS.
⚫ A UML signal is actually a generalization of the Unix signal.
⚫ A Unix signal carries no parameters other than a condition code.
⚫ A UML signal is an object and can carry parameters as object attributes.
⚫ The sigbehavior() behavior of the class is responsible for throwing the
signal, as indicated by <<send>>.
⚫ The signal object is indicated by the <<signal>> stereotype.
⚫ If a device interrupts during a critical section, that critical section must finish before
the kernel can handle the interrupt.
⚫ The longer the critical section, the greater the potential delay.
⚫ Critical sections are one important source of scheduling jitter because a device
may interrupt at different points in the execution of processes and hit critical
sections at different points.
Interrupt priorities and interrupt latency
⚫ A higher-priority interrupt may delay a lower-priority interrupt.
⚫ A hardware interrupt handler runs as part of the kernel, not as a user thread.
⚫ The priorities for interrupts are determined by hardware.
⚫ Any interrupt handler preempts all user threads because interrupts are part of the
CPU’s fundamental operation.
⚫ We can reduce the effects of hardware preemption by dividing interrupt handling
into two different pieces of code.
⚫ Interrupt service handler (ISH): performs the minimal operations required
to respond to the device.
⚫ Interrupt service routine (ISR): updates user buffers or performs other
more complex operations.
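The ISH/ISR split can be sketched as a small ring buffer: the handler only stores the incoming byte, and the routine later copies bytes to a user buffer. This is an illustrative sketch, not a real driver. The names ish_on_device_byte(), isr_drain() and RING_SIZE are hypothetical, and a single core with one handler and one routine is assumed, so no memory barriers are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u               /* hypothetical size, a power of two */

static volatile uint8_t ring[RING_SIZE];
static volatile unsigned head;     /* advanced only by the handler (ISH) */
static volatile unsigned tail;     /* advanced only by the routine (ISR) */

/* ISH: do the minimum needed to respond to the device, i.e. store the byte.
 * Returns 0 on success, -1 if the ring is full and the byte is dropped. */
int ish_on_device_byte(uint8_t byte)
{
    if (head - tail == RING_SIZE) {
        return -1;
    }
    ring[head % RING_SIZE] = byte;
    head++;
    return 0;
}

/* ISR: the longer work, here copying pending bytes into a user buffer.
 * Returns the number of bytes copied. */
size_t isr_drain(uint8_t *user_buf, size_t cap)
{
    size_t n = 0;

    assert(user_buf != NULL);
    while (tail != head && n < cap) {
        user_buf[n++] = ring[tail % RING_SIZE];
        tail++;
    }
    return n;
}
```

Keeping the handler this short limits how long lower-priority interrupts and user threads are held off.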
Predictive shutdown
⚫ The goal is to predict when the next request will be made and to start
the system just before that time, saving the requestor the start-up time.
⚫ Make guesses about activity patterns based on a probabilistic model
of expected behavior.
This can cause two types of problems:
⚫ The requestor may have to wait for an activity period. In the worst case, the
requestor may miss a deadline because of the delay incurred by system start-up.
⚫ The system may restart itself when no activity is imminent, wasting power.
Processes in POSIX
⚫ A new process is created by making a copy of an existing process.
⚫ The copying process creates two different processes both running the same code.
⚫ The complex task is ensuring that one process runs the code intended for the new
process while the other process continues the work of the old process.
⚫ Scheduling in POSIX
⚫ A process makes a copy of itself by calling the fork() function.
⚫ That function causes the operating system to create a new process (the child process) which
is a nearly exact copy of the process that called fork() (the parent process).
⚫ They both share the same code and the same data values with one exception: the return
value of fork().
⚫ The parent process is returned the process ID number of the child process, while the
child process gets a return value of 0.
⚫ We can therefore test the return value of fork() to determine which process is the
child:
childid = fork();
if (childid == 0) { /* must be the child */
    /* do child process here */
}
⚫ execv() function takes as argument the name of the file that holds the
child’s code and the array of arguments.
⚫ It overlays the process with the new code and starts executing it from
the main() function.
⚫ In the absence of an error, execv() should never return.
⚫ The calls to perror() and exit() that follow execv() take care of the case
where execv() fails and returns.
⚫ The exit() function is a C function that is used to leave a process.
childid = fork();
if (childid == 0) { /* must be the child */
    execv("mychild", childargs);
    perror("execv");
    exit(1);
}
⚫ The wait functions not only return the child process’s status; in
many implementations of POSIX they also make sure that the child’s
resources are actually freed.
⚫ The parent_stuff() function performs the work of the parent function.
childid = fork();
if (childid == 0) { /* must be the child */
    execv("mychild", childargs);
    perror("execv");
    exit(1);
}
else { /* is the parent */
    parent_stuff(); /* execute parent functionality */
    wait(&cstatus);
    exit(0);
}
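The fragments above can be combined into one compilable helper. This is a sketch under stated assumptions: the name run_child() is hypothetical, it uses waitpid() rather than wait() so that it waits for this particular child, and it uses _exit() after a failed execv() so the child does not flush the parent's buffered output a second time.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at `path` with arguments `argv` in a child process.
 * Returns the child's exit code, or -1 if the child could not be created
 * or did not exit normally. A failed execv() shows up as exit code 127. */
int run_child(const char *path, char *const argv[])
{
    int cstatus;
    pid_t childid;

    assert(path != NULL && argv != NULL);

    childid = fork();
    if (childid < 0) {              /* fork failed: no child was created */
        perror("fork");
        return -1;
    }
    if (childid == 0) {             /* must be the child */
        execv(path, argv);          /* returns only if it fails */
        perror("execv");
        _exit(127);
    }

    /* is the parent: wait for the child and collect its status */
    if (waitpid(childid, &cstatus, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(cstatus) ? WEXITSTATUS(cstatus) : -1;
}
```

For example, calling run_child("/bin/sh", args) with args = { "sh", "-c", "exit 3", NULL } would be expected to return 3 on a system that provides /bin/sh.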
POSIX semaphores
⚫ POSIX supports semaphores and also supports a direct shared memory mechanism.
⚫ POSIX supports counting semaphores in the _POSIX_SEMAPHORES option.
⚫ A counting semaphore allows more than one process access to a resource at a time.
⚫ If the semaphore allows up to N resources, then it will not block until N processes have
simultaneously passed the semaphore.
⚫ At that point, a blocked process can resume only after one of the processes has given
up its semaphore.
⚫ When the semaphore value is 0, the process must wait until another process gives up the
semaphore and increments the count.
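The counting behaviour can be sketched with the POSIX calls sem_init(), sem_trywait() and sem_post(). The helper count_admitted() is hypothetical. It uses sem_trywait() instead of sem_wait() so that the example reports a refusal rather than blocking, and it assumes a platform such as Linux where unnamed semaphores are supported.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Create a semaphore guarding `n_resources` resources, try to pass it
 * `attempts` times without releasing, and return how many attempts got
 * through. Returns -1 if the semaphore could not be created. */
int count_admitted(unsigned n_resources, int attempts)
{
    sem_t sem;
    int admitted = 0;

    if (sem_init(&sem, 0, n_resources) != 0) {
        return -1;
    }
    for (int i = 0; i < attempts; i++) {
        if (sem_trywait(&sem) == 0) {
            admitted++;                 /* count was above 0: we passed */
        } else {
            assert(errno == EAGAIN);    /* count is 0: sem_wait() would block */
        }
    }
    for (int i = 0; i < admitted; i++) {
        sem_post(&sem);                 /* give up the semaphore again */
    }
    sem_destroy(&sem);
    return admitted;
}
```

With N = 3 and five attempts, only the first three attempts pass; the remaining two would have to wait until someone calls sem_post().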
POSIX pipes
⚫ The parent process uses the pipe() function to create a pipe to talk to a child.
⚫ Each end of a pipe appears to the programs as a file.
⚫ The pipe() function returns an array of two file descriptors, the first for the read end
and the second for the write end.
⚫ POSIX also supports message queues under the _POSIX_MESSAGE_PASSING facility.
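A minimal sketch of a parent talking to a child through a pipe follows. POSIX defines fd[0] as the read end and fd[1] as the write end of the array filled in by pipe(). The helper name send_to_child() is hypothetical, and having the child report the byte count through its exit status is only a device to keep the example short.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The parent writes `msg` into a pipe; the child reads it and exits with the
 * number of bytes received. Returns that count, or -1 on error. The exit
 * status holds only 8 bits, so the message must be shorter than 256 bytes. */
int send_to_child(const char *msg)
{
    int fd[2];                      /* fd[0] = read end, fd[1] = write end */
    int cstatus;
    pid_t childid;
    ssize_t written;
    size_t len;

    assert(msg != NULL);
    len = strlen(msg);
    assert(len < 256);

    if (pipe(fd) != 0) {
        return -1;
    }
    childid = fork();
    if (childid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (childid == 0) {             /* child: read until end of file */
        char buf[64];
        ssize_t n;
        int total = 0;

        close(fd[1]);               /* child does not write */
        while ((n = read(fd[0], buf, sizeof buf)) > 0) {
            total += (int)n;
        }
        _exit(total);
    }

    close(fd[0]);                   /* parent does not read */
    written = write(fd[1], msg, len);
    close(fd[1]);                   /* closing the write end signals EOF */
    if (waitpid(childid, &cstatus, 0) < 0) {
        return -1;
    }
    if (written != (ssize_t)len || !WIFEXITED(cstatus)) {
        return -1;
    }
    return WEXITSTATUS(cstatus);
}
```

Each side closes the end it does not use; otherwise the child would never see end of file on the read end.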
5.8.2) Windows CE
⚫ Windows CE is designed to run on multiple hardware platforms
and instruction set architectures.
⚫ It supports devices such as smartphones and electronic instruments.
⚫ The OEM Adaptation Layer (OAL) provides services such as a real-time clock, power
management, interrupts, and a debugging interface.
⚫ A Board Support Package (BSP) for a particular hardware platform includes the OAL
and drivers.
Memory Space
⚫ Windows CE supports virtual memory with a flat 32-bit virtual address space.
⚫ A virtual address can be statically mapped into main memory for key kernel-mode code.
⚫ An address can also be dynamically mapped, which is used for all user-mode and
some kernel-mode code.
⚫ Flash as well as magnetic disk can be used as a backing store.
⚫ In the user address space, the top 1 GB is reserved for system elements such as DLLs,
memory mapped files, and shared system heap.
⚫ The bottom 1 GB holds user elements such as code, data, stack, and heap.
User address space in Windows CE
⚫ Threads are defined by executable files while drivers are defined by
dynamically-linked libraries (DLLs).
⚫ A process can run multiple threads.
⚫ Threads in different processes run in different
execution environments.
⚫ Threads are scheduled directly by the operating system.
⚫ Threads may be launched by a process or a device driver.
⚫ A driver may be loaded into the operating system or a process.
⚫ Drivers can create threads to handle interrupts
⚫ Each thread is assigned an integer priority.
⚫ 0 is the highest priority and 255 is the lowest priority.
⚫ Priorities 248 through 255 are used for non-real-time threads.
⚫ The operating system maintains a queue of ready processes at
each priority level.
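The priority rules above (0 is highest, 248 through 255 are non-real-time, one ready queue per level) can be illustrated with a small sketch. This is not the Windows CE scheduler: the names make_ready(), pick_next() and is_real_time() are hypothetical, and each queue is reduced to a counter of ready entries.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PRIORITIES     256   /* priorities 0..255, 0 is the highest */
#define FIRST_NON_REALTIME 248   /* 248..255 are non-real-time */

/* One "queue" per priority level, reduced here to a count of ready entries. */
static int ready_count[NUM_PRIORITIES];

/* Mark one more entry as ready at the given priority. */
void make_ready(int priority)
{
    assert(priority >= 0 && priority < NUM_PRIORITIES);
    ready_count[priority]++;
}

/* Remove and return the priority of the next entry to run: the
 * lowest-numbered (highest-priority) non-empty level, or -1 if none. */
int pick_next(void)
{
    for (int p = 0; p < NUM_PRIORITIES; p++) {
        if (ready_count[p] > 0) {
            ready_count[p]--;
            return p;
        }
    }
    return -1;
}

/* True for priorities below the non-real-time band. */
bool is_real_time(int priority)
{
    return priority >= 0 && priority < FIRST_NON_REALTIME;
}
```

Scanning from level 0 upward means a ready entry at priority 10 always runs before one at priority 100, regardless of the order in which they became ready.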
⚫ Arbitration field: the first field in the packet; it contains the packet’s 11-bit
destination identifier.
⚫ Remote Transmission Request (RTR) bit: set to 0 (dominant) in a data frame, which
carries data for the identifier.
⚫ When RTR = 1 (recessive), the frame is a remote frame that requests data from the node
associated with the identifier.
⚫ Control field: provides an identifier extension bit and a 4-bit length for the data field,
with one reserved bit in between.
⚫ Data field: 0 to 8 bytes (0 to 64 bits), depending on the value given in the control field.
⚫ CRC field: sent after the data field for error detection.
⚫ Acknowledge field: signals whether the frame was correctly received. The sender puts
a recessive bit (1) in the ACK slot; a receiver that got the frame without error overwrites
it with a dominant (0) value.
Arbitration
⚫ It uses a technique known as Carrier Sense Multiple Access with Arbitration on
Message Priority (CSMA/AMP).
⚫ When a node hears a dominant bit in the identifier when it tries to send a recessive bit,
it stops transmitting.
⚫ By the end of the arbitration field, only one transmitter will be left.
⚫ The identifier field acts as a priority identifier, with the all-0 identifier having the
highest priority.
Error handling
⚫ An error frame can be generated by any node that detects an error on the bus.
⚫ Upon detecting an error, a node interrupts the current transmission.
⚫ An error frame consists of an error flag field followed by an error delimiter field of
8 recessive bits.
⚫ The error delimiter field allows the bus to return to the quiescent state so that data
frame transmission can resume.
⚫ An overload frame signals that a node is overloaded and will not be able to handle the
next message. Hence the node can delay the transmission of the next frame.
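The arbitration rule described earlier (a node that sends a recessive 1 but hears a dominant 0 stops transmitting) can be sketched as a bit-by-bit simulation of a wired-AND bus. The function name can_arbitrate() is hypothetical, and the sketch assumes at most 32 nodes with distinct 11-bit identifiers sent most significant bit first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ID_BITS 11   /* length of the identifier in the arbitration field */

/* Simulate arbitration among `n` nodes that start sending their identifiers
 * at the same time. Returns the index of the node left transmitting at the
 * end of the arbitration field, or -1 if there are no nodes. */
int can_arbitrate(const uint16_t ids[], size_t n)
{
    uint32_t active;   /* bit i set: node i is still transmitting */

    assert(n <= 32);
    if (n == 0) {
        return -1;
    }
    active = (n == 32) ? 0xFFFFFFFFu : ((1u << n) - 1u);

    for (int bit = ID_BITS - 1; bit >= 0; bit--) {
        /* Wired-AND bus: the level is 0 if any active node sends a 0. */
        unsigned bus = 1u;
        for (size_t i = 0; i < n; i++) {
            if ((active >> i) & 1u) {
                bus &= (ids[i] >> bit) & 1u;
            }
        }
        /* A node that sent recessive 1 but hears dominant 0 drops out. */
        for (size_t i = 0; i < n; i++) {
            if (((active >> i) & 1u) && (((ids[i] >> bit) & 1u) != bus)) {
                active &= ~(1u << i);
            }
        }
    }
    for (size_t i = 0; i < n; i++) {
        if ((active >> i) & 1u) {
            return (int)i;
        }
    }
    return -1;
}
```

Because a dominant 0 always overrides a recessive 1, the numerically lowest identifier survives, which is why the all-0 identifier has the highest priority.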
⚫ When a master wants to write a slave, it transmits the slave’s address followed by the data.
⚫ When a master wants to read from a slave, it sends a read request with the slave’s address
and the slave transmits the data.
⚫ The address transmission has a 7-bit address and 1 bit for data direction (0 for writing
from the master to the slave and 1 for reading from the slave to the master).
⚫ A bus transaction is initiated by a start signal and completed with a stop signal.
⚫ A start is signaled by leaving the SCL high and sending a 1 to 0 transition on SDL.
⚫ A stop is signaled by setting the SCL high and sending a 0 to 1 transition on SDL.
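The first byte of a transaction can be sketched as follows: the 7-bit slave address is sent first and the direction bit occupies the least significant position. The function name i2c_address_byte() and the example address 0x50 are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build the address byte sent after the start signal: the 7-bit slave
 * address in the upper bits, and the direction bit in bit 0
 * (0 = master writes to slave, 1 = master reads from slave). */
uint8_t i2c_address_byte(uint8_t addr7, bool read)
{
    assert(addr7 < 0x80);   /* must fit in 7 bits */
    return (uint8_t)((addr7 << 1) | (read ? 1u : 0u));
}
```

For a slave at address 0x50, a write transaction starts with the byte 0xA0 and a read transaction with 0xA1.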
5.9.4) ETHERNET
⚫ It is widely used as a local area network for general-purpose computing.
⚫ It is also used as a network for embedded computing.
⚫ It is particularly useful when PCs are used as platforms, making it possible to
use standard components, and when the network does not have to meet real-
time requirements.
⚫ It is a bus with a single signal path.
⚫ It supports both twisted pair and coaxial cable.
⚫ Ethernet nodes are not synchronized; if two nodes decide to transmit at the same
time, the message will be ruined.
5.9.5.1) IP packet structure
⚫ We can specify the system as a task graph. However, different processes may end up
on different processing elements. Here is a task graph:
⚫ We have labeled the data transmissions on each arc. We want to execute the task on the
platform below.
⚫ The platform has two processing elements and a single bus connecting both PEs. Here
are the process speeds:
⚫ The schedule has length 19. The d1 message is sent between the processes internal to
P1 and does not appear on the bus.
⚫ Let’s try a different allocation: P1 on M1 and P2 and P3 on M2. This makes P2 run
more slowly. Here is the new schedule:
⚫ The length of this schedule is 18, or one time unit less than the other schedule. The
increased computation time of P2 is more than made up for by being able to transmit a
shorter message on the bus. If we had not taken communication into account when
analyzing total execution time, we could have made the wrong choice of which processes
to put on the same processing element.
5.11.4) Requirements
5.11.5) Specification
⚫ The File ID class is an abstraction of a file in the flash file system.
⚫ The controller class provides the method that operates the player.
5.12.2) Requirements
5.12.3) Specification
⚫ The engine controller must deal with processes at different rates
⚫ ΔNE and ΔT represent the change in RPM and throttle position, respectively.
⚫ The controller computes two output signals, injector pulse width PW and spark
advance angle S.
⚫ S = k2 × ΔNE − k3 × VS
⚫ The controller then applies corrections to these initial values.
⚫ If intake air temperature (THA) increases during engine warm-up, the controller
reduces the injection duration.
⚫ If the throttle opens, the controller temporarily increases the injection frequency.
⚫ The controller adjusts duration up or down based upon readings from the exhaust oxygen
sensor (OX).
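The initial spark advance value, S = k2 × ΔNE − k3 × VS, is a single multiply-and-subtract, sketched below. The function name spark_advance() is hypothetical, and the constants k2 and k3 are passed in as parameters because their values are not given in these notes.

```c
#include <assert.h>

/* Initial spark advance angle: S = k2 * dNE - k3 * VS.
 * k2 and k3 are calibration constants supplied by the caller;
 * delta_ne is the change in engine speed and vs is the VS input. */
double spark_advance(double k2, double delta_ne, double k3, double vs)
{
    return k2 * delta_ne - k3 * vs;
}
```

With the purely illustrative inputs k2 = 2.0, ΔNE = 3.0, k3 = 0.5 and VS = 4.0, the function returns 4.0. The corrections listed above would then be applied to this initial value.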
5.12.4) System architecture
⚫ The two major processes, pulse-width and advance-angle, compute the control
parameters for the spark plugs and injectors.
⚫ Control parameters rely on changes in some of the input signals.
⚫ Physical sensor classes are used to compute these values.
⚫ Each change must be updated at the variable’s sampling rate.
Accelerator
⚫ A video accelerator is a hardware circuit on a display adapter that speeds up full-motion
video.
⚫ Primary video accelerator functions are color space conversion, which converts YUV to
RGB; hardware scaling, which enlarges the image to full screen; and double buffering,
which moves the frames into the frame buffer faster.
Video compression
⚫ MPEG-2 forms the basis for U.S. HDTV broadcasting.
⚫ This compression uses several component algorithms together in a feedback loop.
⚫ The discrete cosine transform (DCT) is used in both JPEG and MPEG-2.
⚫ The DCT is applied to a block of pixels, and the result is quantized for lossy compression.
⚫ A variable-length coder assigns the number of bits required to represent the block.
⚫ Specification for the system is relatively straightforward because the algorithm is simple.
⚫ The following classes are used to describe the basic data types in the system: motion
vector, macro block, and search area.
5.13.7) Architecture
⚫ The macro block has 16 × 16 = 256 pixels.
⚫ The search area has (8 + 8 + 1 + 8 + 8)^2 = 33^2 = 1,089 pixels.
⚫ The FPGA probably will not have enough memory to hold 1,089 8-bit values.
⚫ The machine has two memories, one for the macro block and another for the search area.
⚫ It has 16 processing elements that perform the difference calculation on a pair of pixels.
⚫ The comparator sums them up and selects the best value to find the motion vector.
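A software sketch of the computation this architecture performs is given below: for every position of the 16 × 16 macro block inside the 33 × 33 search area, sum the absolute pixel differences and keep the position with the smallest sum. The names best_match() and match_t are hypothetical, both images are assumed to be stored row by row as 8-bit values, and the result is reported as the top-left offset of the best match inside the search area rather than as a signed motion vector.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define MB_SIZE 16   /* macro block is 16 x 16 pixels */
#define SA_SIZE 33   /* search area is 33 x 33 pixels */

typedef struct {
    int dx, dy;      /* top-left offset of the best match in the search area */
    unsigned sad;    /* sum of absolute differences at that offset */
} match_t;

/* Full search: try every offset at which the macro block fits inside the
 * search area and return the one with the smallest difference sum.
 * On a tie, the first offset found in scan order is kept. */
match_t best_match(const uint8_t *mb, const uint8_t *area)
{
    match_t best = { 0, 0, UINT_MAX };

    assert(mb != NULL && area != NULL);
    for (int oy = 0; oy + MB_SIZE <= SA_SIZE; oy++) {
        for (int ox = 0; ox + MB_SIZE <= SA_SIZE; ox++) {
            unsigned sad = 0;
            for (int y = 0; y < MB_SIZE; y++) {
                for (int x = 0; x < MB_SIZE; x++) {
                    int a = mb[y * MB_SIZE + x];
                    int b = area[(oy + y) * SA_SIZE + (ox + x)];
                    sad += (unsigned)abs(a - b);   /* one PE's job */
                }
            }
            if (sad < best.sad) {                  /* the comparator's job */
                best.dx = ox;
                best.dy = oy;
                best.sad = sad;
            }
        }
    }
    return best;
}
```

The inner difference calculation is what the 16 processing elements perform in parallel in hardware; the final comparison corresponds to the comparator.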
5.13.8) System testing
⚫ Testing video algorithms requires a large amount of data.
⚫ Because we are designing only a motion estimation accelerator and not a complete
video compressor, it is probably easiest to use images, not video, for test data.
⚫ Use standard video tools to extract a few frames from a digitized video and store them
in JPEG format.
⚫ Open-source JPEG encoders and decoders are available.
⚫ These programs can be modified to read JPEG images and put out pixels in the format
required by your accelerator.