Mod1 1
Deepa Jain
Course: Embedded System Design
10/3/2022 1
Concept Covered
Introduction
• We have been brought up in the age of computing.
• Computers are everywhere (some we see, some we do not see).
What are Embedded Systems?
• Computers are embedded within other systems:
• What are “other systems”? – Hard to define.
• Any computing system other than a desktop, laptop, or server.
• Typical examples:
• Washing machine, refrigerator, camera, vehicles, airplane,
missile, printer.
• Processors are often very simple and inexpensive
(depending on application of course).
• Billions of embedded system units produced yearly,
versus millions of desktop units.
Embedded System Vs General Computing System
Parameter General Purpose Embedded system
Classification of Embedded System
• Based on Generation
• Complexity and Performance requirement
• Based on Deterministic behavior
• Based on Triggering
Hard Real time system
• Flight Control Systems
• Missile Guidance Systems
• Weapons Defense System
• Medical System
• Inkjet printer system
• Railway signaling system
• Air traffic control systems
• Nuclear reactor control systems
• Anti-missile system
• Chemical plant control
• Autopilot System In Plane
• Pacemakers
Soft Real time System
• Personal computer
• Audio and video systems
• Set top boxes
• DVD Players
• Weather Monitoring Systems
• Electronic games
• Multimedia system
• Web browsing
• Online transaction systems
• Telephone switches
• Virtual reality
• Mobile communication
Common Features of Embedded Systems
• They are special-purpose or single-functioned.
• Executes a single program, possibly with inputs from the environment.
• Imagine a microwave oven, a washing machine, an AC machine, etc.
• Tight constraints on cost, energy, form factor, etc.
• Low cost, low power, small size, relatively fast.
• They must react to events in real-time.
• Responds to inputs from the system’s environment.
• Must compute certain results in real-time without delay.
• The delay that can be tolerated depends on the application.
Core of the Embedded System
MICROPROCESSORS and MICROCONTROLLERS :
Microprocessor
• Has CPU
• Capable of performing arithmetic & logical operations
• Dependent unit, i.e. it requires external memory, a timer unit, an interrupt controller, etc.
Microcontroller
• Highly integrated chip that contains its own CPU,
• scratchpad RAM,
• on-chip ROM/Flash memory
• Has dedicated ports
• Independent working
• Cheap, cost-effective and readily available.
DIGITAL SIGNAL PROCESSING:
CISC Vs RISC
CISC RISC
Core of the Embedded System
Memory
Program Storage Memory
RAM
SRAM
DRAM
SRAM Vs DRAM
Sensors and Actuators
Sensors
Actuators
Silicon Technolab Digital Analog Arduino Starter
I/O Subsystem
LED
7-Segment Display
Optocoupler
Stepper
Stepper Motor
Watch Dog Timers
• A watchdog timer is a piece of hardware that can be used to automatically detect software
anomalies and reset the processor if any occur.
• A watchdog timer is based on a counter that counts down from some initial value to zero.
• The embedded software selects the counter’s initial value and periodically restarts it.
• If the counter ever reaches zero before the software restarts it, the software is presumed to be
malfunctioning and the processor’s reset signal is asserted.
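The countdown-and-restart behaviour of a watchdog timer can be sketched in a few lines of Python. This is a toy software model, not driver code for any real watchdog peripheral: the class WatchdogTimer, the method names kick and tick, the helper run, and the initial value of 5 are all assumptions made for illustration.

```python
class WatchdogTimer:
    """Toy software model of a hardware watchdog counter (illustrative only)."""

    def __init__(self, initial_value):
        self.initial_value = initial_value  # chosen by the embedded software
        self.counter = initial_value
        self.reset_asserted = False

    def kick(self):
        """Software restarts the counter to show that it is still healthy."""
        self.counter = self.initial_value

    def tick(self):
        """One hardware clock tick: count down, assert reset on reaching zero."""
        if self.reset_asserted:
            return
        self.counter -= 1
        if self.counter <= 0:
            self.reset_asserted = True


def run(ticks, kick_every, initial_value=5):
    """Simulate `ticks` clock ticks; software kicks every `kick_every` ticks (0 = never)."""
    wdt = WatchdogTimer(initial_value)
    for t in range(1, ticks + 1):
        wdt.tick()
        if kick_every and t % kick_every == 0 and not wdt.reset_asserted:
            wdt.kick()
    return wdt.reset_asserted


healthy = run(ticks=20, kick_every=3)  # restarted before the counter reaches zero
hung = run(ticks=20, kick_every=0)     # software never restarts the counter
print("reset asserted (healthy software):", healthy)
print("reset asserted (hung software):", hung)
```

In this model, software that restarts the counter every 3 ticks never lets a 5-tick counter expire, while software that stops restarting it causes the reset to be asserted.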
Embedded Firmware
Design and Implementation of Embedded Firmware
Assembly to Hex File Conversion
High level language to Hex File
Inline Assembly
Concept Covered
Design Challenges
Common Design Metrics
• Non Recurring Engineering (NRE) Cost: One-time initial cost of designing a system.
• Unit Cost: The cost of manufacturing each copy of the system, without counting
the NRE cost.
• Size: The actual physical space occupied by the system.
• Performance: This is measured in terms of the time taken or throughput.
• Power: The amount of (battery) power consumed by the system.
• Flexibility: The ability to change the functionality of the system.
• Maintainability: How easy or difficult it is to modify the design of the system?
• Time-to-prototype: How much time is required to build a working version of the
system (i.e. a prototype)?
• Time-to-market: How much time is required to develop a system such that it can
be released to the market commercially?
• Safety: Are there any adverse effects on the operating environment?
• Can be many more …
Design Tradeoff
[Figure: design trade-off among performance, size, and NRE cost]
• Often requires expertise in both hardware and software to take a proper
decision.
• Expertise in hardware may indicate the types of co-processor or I/O interfaces to use for
specific applications (e.g. analog ports, digital ports, PWM ports, etc.).
• Expertise in software is required to identify parts of the implementation that need to be
implemented in software and run on the microcontroller.
• Hardware / Software Co-design becomes important.
Time-to-market Design Metric
• This is a very crucial design metric.
• Must be strictly followed to make a
product commercially viable.
• Requires exhaustive market study and
analysis.
• Starting from the point a product
design starts, we can define a
Market Window within which it is
expected to have the highest sales.
• Any delay can result in drastic reductions in
sales.
Loss due to Delayed Market Entry
[Figure: revenues ($) versus time for on-time and delayed market entry; each curve is a triangle (market rise to peak revenue, then market fall) over a product life of 2W, with the delayed entry starting at time D]
• Maximum sale occurs at time W.
• Area (delayed) = ½ * (2W – D) * (W – D)
• Percentage revenue loss = [D(3W – D) / 2W²] * 100
• Examples:
• 2W = 52 weeks, D = 4 weeks: LOSS = 22%
• 2W = 52 weeks, D = 10 weeks: LOSS = 50%
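The percentage revenue loss can be checked numerically. The sketch below assumes the simplified triangle model behind the area formula (revenue rising at unit slope, so the on-time triangle has base 2W and height W, and the delayed one has base 2W – D and height W – D). The function name revenue_loss_percent is a made-up helper, not part of any library.

```python
def revenue_loss_percent(w, d):
    """Percentage revenue loss for peak time w and entry delay d (same time units)."""
    on_time_area = 0.5 * (2 * w) * w            # triangle: base 2W, height W
    delayed_area = 0.5 * (2 * w - d) * (w - d)  # triangle: base 2W - D, height W - D
    return (on_time_area - delayed_area) / on_time_area * 100


# Product life 2W = 52 weeks, so W = 26 weeks.
for delay in (4, 10):
    loss = revenue_loss_percent(26, delay)
    print(f"D = {delay:2d} weeks -> loss = {loss:.1f}%")
```

Taking the difference of the two areas and dividing by the on-time area gives the same value as D(3W – D) / 2W² * 100, which rounds to 22% and 50% for the two example delays.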
NRE and Unit Cost Metrics
• If CNRE denotes the NRE cost and Cunit the unit cost of a product, then the total
cost for manufacturing N units is given by:
Total Cost = CNRE + N * Cunit
• Therefore, per-unit cost is given by: CNRE / N + Cunit
• Example:
• CNRE= Rs. 5,00,000 and Cunit = Rs. 5,000
• Total cost for manufacturing 100 units = 5,00,000 + 5000 * 100 = 10,00,000
• Per unit cost = 5,00,000 / 100 + 5000 = 10,000
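The two cost formulas translate directly into code. A minimal sketch, using the example figures from this slide (Rs. 5,00,000 is written as 500_000); the function names total_cost and per_unit_cost are chosen here for illustration.

```python
def total_cost(c_nre, c_unit, n):
    """Total cost of manufacturing n units: NRE cost plus n times the unit cost."""
    return c_nre + n * c_unit


def per_unit_cost(c_nre, c_unit, n):
    """Effective cost per unit once the NRE cost is spread over n units."""
    return c_nre / n + c_unit


print("Total cost for 100 units:", total_cost(500_000, 5_000, 100))
print("Per-unit cost at 100 units:", per_unit_cost(500_000, 5_000, 100))
```

Raising n shrinks the CNRE / N term, so the per-unit cost approaches Cunit for large production volumes.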
Per-unit cost = (CNRE / N) + Cunit
• We can compare technologies by cost:
• Choice A: CNRE = Rs. 20,000, Cunit = Rs. 8,000
• Choice B: CNRE = Rs. 4,00,000, Cunit = Rs. 3,000
• Choice C: CNRE = Rs. 10,00,000, Cunit = Rs. 8,000
• Of course, time-to-market cost must also be considered.
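One way to compare the choices is to evaluate the total cost at a given volume and to find the break-even volume between two choices. The sketch below uses the figures exactly as listed on this slide; the dictionary layout and the helper names cheapest and break_even are assumptions for illustration, and time-to-market is not modelled.

```python
# (NRE cost, unit cost) in rupees, as listed on the slide.
choices = {
    "A": (20_000, 8_000),
    "B": (400_000, 3_000),
    "C": (1_000_000, 8_000),
}


def total_cost(c_nre, c_unit, n):
    """Total cost of manufacturing n units."""
    return c_nre + n * c_unit


def cheapest(n):
    """Name of the choice with the lowest total cost at volume n."""
    return min(choices, key=lambda name: total_cost(*choices[name], n))


def break_even(a, b):
    """Volume at which choices a and b cost the same (None if unit costs are equal)."""
    (nre_a, unit_a), (nre_b, unit_b) = choices[a], choices[b]
    if unit_a == unit_b:
        return None
    return (nre_b - nre_a) / (unit_a - unit_b)


print("Cheapest at N = 10:", cheapest(10))
print("Cheapest at N = 1000:", cheapest(1000))
print("A/B break-even volume:", break_even("A", "B"))
```

With these numbers, A is cheaper for small volumes and B takes over once the volume passes the A/B break-even point; C has the same unit cost as A with a higher NRE cost, so it is never the cheapest on cost alone.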
Performance Design Metric
Basic Operation of a Computing System
• The central processing unit (CPU) carries
out all computations.
• Fetches instructions from the program
memory and executes them; may require access
to data in data memory.
• The input/output block provides interface
with the outside world.
• Allows users to interact with the computing
system, and also observe the output results.
• About the instruction set architecture (ISA) of the CPU.
a) Complex Instruction Set Computer (CISC)
• Typically used in desktops, laptops and servers (courtesy Intel).
b) Reduced Instruction Set Computer (RISC)
• Typically used in microcontrollers, that are used to build embedded systems.
Classification of CPU Architecture
• Broadly two types of architectures:
a) Von Neumann Architecture
• Both instructions and data are stored in the same memory.
• This model is followed in conventional computing systems.
b) Harvard Architecture
• Instructions and data are stored in separate memories.
• Typically followed in microcontrollers, used for building embedded systems.
• Instructions are stored in a ROM (permanent), while temporary data are stored
in RAM.
Von Neumann Architecture Harvard Architecture
What is a Microprocessor?
Schematic Diagram of Microprocessor
What is a Microcomputer?
• It is a computer system built using a microprocessor.
• Since a microprocessor does not contain memory and I/O, we have to interface these
to build a microcomputer.
• Too complex and expensive for very small and low-cost embedded systems.
Microcontrollers: The Heart of Embedded Systems
Microcontroller Packaging and Appearance
• When a PC executes a program, the program is first loaded from disk/SSD into an
allocated section of memory.
• Usually the program is loaded part by part to conserve memory space.
• There is a complicated operating system that handles all low-level operations (includes low-
level driver codes for interfacing with various devices).
• In a microcontroller there is no disk to read from.
• On-chip ROM stores the program that is to be executed.
• Size of the ROM limits the maximum size of the application.
• There is no operating system, and the program in ROM is the only program that
is running (it must include the low-level routines).
Where are Microcontrollers Used?
Evolution of Microcontrollers
• Microcontroller evolved from a microprocessor-based board-level design to a single
chip in the mid-1970's.
• As the process of miniaturization continued, all of the components needed for a controller were
built into a single chip.
• In the mid-1980’s, microcontrollers got embedded into a larger ASIC (Application
Specific Integrated Circuit).
• Microcontrollers are fabricated as a module inside a larger chip.
Advantages of using microcontrollers
• Fast and effective
• The architecture correlates closely with the problem being solved (control systems).
• Low cost / Low power
• High level of system integration within one component.
• Only a handful of components needed to create a working system.
• Compatibility
• Opcodes and binaries are largely the same across the variants within a family (80x51, ARM, PIC).
History of ARM Series of Microcontrollers
Why do we talk about ARM?
• One of the most widely used processor cores.
• Some application examples:
• ARM7: iPod
• ARM9: BenQ, Sony Ericsson
• ARM11: Apple iPhone, Nokia N93, N100
• Accounted for 90% of 32-bit embedded RISC processors till 2010.
• Mainly used in battery-operated devices:
• Due to low power consumption and reasonably good
performance.
About ARM Processors
Popular ARM Architectures
• ARM7
• 3 pipeline stages (fetch/decode/execute)
• High code density / low power consumption
• Most widely used for low-end systems
• ARM9
• Compatible with ARM7
• 5 stages (fetch/decode/execute/memory/write)
• Separate instruction and data cache
• ARM10
• 6-stages (fetch/issue/decode/execute/memory/write)
ARM Family Comparison
                               ARM7 (1995)   ARM9 (1997)   ARM10 (1999)   ARM11 (2003)
Pipeline depth                 3-stage       5-stage       6-stage        8-stage
Typical clock frequency (MHz)  80            150           260            335
Power (mW/MHz)                 0.06          0.19          0.50           0.40
Throughput (MIPS/MHz)          0.97          1.1           1.3            1.2
Architecture                   Von Neumann   Harvard       Harvard        Harvard
Multiplier                     8 x 32        8 x 32        16 x 32        16 x 32
ARM is based on RISC Architecture
• RISC supports simple but powerful instructions that execute in a single cycle at
high clock frequency.
• Major design features:
• Instructions: reduced set / single cycle / fixed length
• Pipeline: decode in one stage / no need for microcode
• Registers: large number of general-purpose registers (GPRs)
• Load/Store Architecture: data processing instructions work on registers only;
load/store instructions to transfer data from/to memory.
• Nowadays CISC machines also implement RISC concepts.
ARM Features
[Figure: Von Neumann organization (ARM7s and older): a single AHB bus to MEMORY & I/O. Harvard organization (ARM9s and newer): separate instruction and data paths with I-cache and D-cache, joined through a bus interface to one AHB bus to MEMORY & I/O]
Memory-mapped I/O:
• No specific instructions for I/O
• Use Load/Store instr. for I/O
• Peripheral’s registers at some memory addresses
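Memory-mapped I/O can be illustrated with a small simulation in which the same load and store operations reach either RAM or a peripheral register, depending only on the address. Everything named here is hypothetical: the Led peripheral, the Bus class, and the address 0x4000_0000 are made up for the sketch and do not describe any particular ARM device.

```python
LED_DATA_REG = 0x4000_0000  # made-up address of a peripheral register


class Led:
    """Hypothetical peripheral with a single 32-bit data register."""

    def __init__(self):
        self.data = 0

    def read(self):
        return self.data

    def write(self, value):
        self.data = value & 0xFFFFFFFF


class Bus:
    """Single address space: one address selects the peripheral, the rest is RAM."""

    def __init__(self):
        self.ram = {}
        self.led = Led()

    def store(self, addr, value):
        # The same store operation is used for memory and for I/O.
        if addr == LED_DATA_REG:
            self.led.write(value)
        else:
            self.ram[addr] = value & 0xFFFFFFFF

    def load(self, addr):
        # The same load operation is used for memory and for I/O.
        if addr == LED_DATA_REG:
            return self.led.read()
        return self.ram.get(addr, 0)


bus = Bus()
bus.store(0x100, 7)              # ordinary memory write
bus.store(LED_DATA_REG, 0b1010)  # "I/O write" using the same operation
print(bus.load(0x100), bin(bus.load(LED_DATA_REG)))
```

The point of the sketch is that no separate I/O instruction appears anywhere: the address decoding in the bus decides whether a load or store touches memory or a peripheral.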
[Figure: typical architecture of the core, showing the address bus A[31:0], PC, register bank, instruction decoder and control lines, multiplier, shifter, ALU with its A, B and ALU buses, instruction register, Thumb-to-ARM translator, write data register, read data register, and the data bus D[31:0]]
What is Pipelining?
• A mechanism for overlapped execution of several input sets by partitioning some
computation into a set of k sub-computations (or stages).
• Very nominal increase in the cost of implementation.
• Very significant speedup (ideally, k).
A Real-life Example
• Suppose you have built a machine M that can wash (W), dry (D), and
iron (R) clothes, one cloth at a time.
• Total time required is T.
• For N clothes, time T1 = N.T
• As an alternative, we split the machine into three smaller machines MW,
MD and MR, which can perform the specific task only.
• Time required by each of the smaller machines is T/3 (say).
• For N clothes, time T3 = (2 + N).T/3
How does the pipeline work?
[Figure: pipeline timing diagram; stage W processes Cloth-1 to Cloth-5 in successive time slots, stage D follows one slot behind, and stage R two slots behind]
Finishing times:
• Cloth-1 – 3.T/3
• Cloth-2 – 4.T/3
• Cloth-3 – 5.T/3
• …
• Cloth-N – (2 + N).T/3
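The finishing times of the washing pipeline follow a simple pattern that can be generated programmatically. A minimal sketch, assuming an ideal pipeline with equal stage times and no waiting between stages; the function name finishing_times and the choice T = 3 time units are illustrative.

```python
def finishing_times(n, stages=3, stage_time=1.0):
    """Finishing time of items 1..n in an ideal pipeline with equal stage times."""
    # Item i leaves the last stage after (stages - 1 + i) stage times.
    return [(stages - 1 + i) * stage_time for i in range(1, n + 1)]


T = 3.0                                 # time taken by the single machine M per cloth
times = finishing_times(5, stages=3, stage_time=T / 3)
unpipelined = 5 * T                     # T1 = N.T for N = 5 clothes

print("Pipelined finishing times:", times)
print("Last cloth done at:", times[-1], "versus", unpipelined, "without pipelining")
```

The last entry equals (2 + N).T/3, which for N = 5 and T = 3 is 7 time units, compared with N.T = 15 time units for the single machine.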
Extending the Concept to Processor Pipeline
• The same concept can be extended to hardware pipelines.
• Suppose we want to attain k times speedup for some computation.
• Alternative 1: Replicate the hardware k times → cost also goes up k times.
• Alternative 2: Split the computation into k stages → very nominal cost increase.
• Need for buffering:
• In the washing example, we need a tray between machines (W & D, and D & R) to keep the cloth
temporarily before it is accepted by the next machine.
• Similarly in hardware pipeline, we need a latch between successive stages to hold the
intermediate results temporarily.
Model of a Synchronous k-stage Pipeline
[Figure: k stages S1, S2, …, Sk (STAGE 1 to STAGE k), each preceded by a latch L; a common Clock drives all latches]
• The latches are made with master-slave flip-flops, and serve the purpose of isolating
inputs from outputs.
• The pipeline stages are typically combinational circuits.
• When Clock is applied, all latches transfer data to the next stage simultaneously.
Speedup and Efficiency
Some notations:
τ :: clock period of the pipeline
ti :: time delay of the circuitry in stage Si
dL :: delay of a latch
Maximum stage delay τm = max {ti}
Thus, τ = τm + dL
Pipeline frequency f = 1/τ
• If one result is expected to come out of the pipeline every clock cycle, f will represent
the maximum throughput of the pipeline.
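The relations τ = τm + dL and f = 1/τ can be tried out on concrete numbers. The stage delays (8, 10 and 9 ns) and the latch delay (2 ns) below are made-up values chosen only to illustrate the calculation; the function name pipeline_clock is likewise an assumption.

```python
def pipeline_clock(stage_delays_ns, latch_delay_ns):
    """Return (clock period in ns, pipeline frequency in MHz)."""
    tau_m = max(stage_delays_ns)   # the slowest stage sets the pace
    tau = tau_m + latch_delay_ns   # clock period: tau = tau_m + d_L
    f_mhz = 1000.0 / tau           # 1/ns is GHz, so multiply by 1000 for MHz
    return tau, f_mhz


tau, f_mhz = pipeline_clock([8.0, 10.0, 9.0], 2.0)
print(f"tau = {tau} ns, f = {f_mhz:.1f} MHz")
```

Because the clock period is set by the slowest stage, balancing the stage delays matters more than speeding up any one fast stage.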
• The total time to process N data sets is given by
Tk = [(k – 1) + N].τ
• (k – 1).τ: time required to fill the pipeline
• 1 result every τ time after that: total N.τ
• For an equivalent non-pipelined processor (i.e. one stage), the total time is
T1 = N.k.τ (ignoring the latch overheads)
• Speedup of the k-stage pipeline over the non-pipelined processor:
Sk = T1 / Tk = N.k / [k + (N – 1)]
As N → ∞, Sk → k
• Pipeline efficiency:
• How close is the performance to its ideal value?
Ek = Sk / k = N / [k + (N – 1)]
• Pipeline throughput:
• Number of operations completed per unit time.
Hk = N / Tk = N / {[k + (N – 1)].τ}
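Speedup, efficiency and throughput can be computed together from k, N and τ. The sketch below takes Tk = [(k – 1) + N].τ and T1 = N.k.τ as given, with speedup defined as T1 / Tk; the function name pipeline_metrics and the sample values (N = 64, τ = 1) are assumptions for illustration.

```python
def pipeline_metrics(k, n, tau=1.0):
    """Return (speedup, efficiency, throughput) of an ideal k-stage pipeline."""
    t_k = (k - 1 + n) * tau   # pipelined time for n data sets
    t_1 = n * k * tau         # non-pipelined time, ignoring latch overheads
    speedup = t_1 / t_k
    efficiency = speedup / k  # equals n / (k + n - 1)
    throughput = n / t_k      # results per unit time
    return speedup, efficiency, throughput


for k in (4, 8, 12):
    s, e, h = pipeline_metrics(k, n=64)
    print(f"k = {k:2d}: speedup = {s:.2f}, efficiency = {e:.2f}, throughput = {h:.2f}")
```

Increasing n moves the speedup towards k and the efficiency towards 1, while a deeper pipeline needs more tasks before it gets close to its ideal speedup.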
[Figure: speedup versus number of tasks N (N = 1 to 256) for k = 4, k = 8 and k = 12]
ARM Pipelining Examples
[Figure: ARM7TDMI pipeline and ARM9TDMI pipeline, each stage marked as 1 clock cycle]
Pipelining in ARM7
• Simple instructions (like ADD, SUB) can complete at a rate of one instruction per cycle.
[Figure: successive instructions overlapping in the FETCH, DECODE and EXECUTE stages; axes: instruction versus time]
With more complex instructions … stall cycles possible
[Figure: pipeline timing diagram starting with an ADD instruction passing through FETCH, DECODE and EXECUTE]
ARM7 3-stage Pipeline
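• During execution, the program counter (PC) reads 8 bytes ahead of the instruction being executed: in the 3-stage pipeline the PC already points at the instruction being fetched, which is two 4-byte (ARM-state) instructions further on.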
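The 8-byte offset of the PC follows from the three pipeline stages each holding a different instruction. A minimal sketch, assuming 4-byte ARM-state instructions in straight-line code; the function name pipeline_snapshot and the address 0x8000 are made up for illustration.

```python
INSTR_SIZE = 4  # bytes per ARM-state instruction


def pipeline_snapshot(executing_addr):
    """Addresses held by each stage of a 3-stage pipeline in straight-line code."""
    return {
        "execute": executing_addr,
        "decode": executing_addr + INSTR_SIZE,
        "fetch (PC)": executing_addr + 2 * INSTR_SIZE,
    }


snap = pipeline_snapshot(0x8000)
for stage, addr in snap.items():
    print(f"{stage:10s} 0x{addr:04X}")
```

While the instruction at 0x8000 executes, the one at 0x8004 is being decoded and the one at 0x8008 is being fetched, so an instruction that reads the PC sees its own address plus 8.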
THANK YOU