SCHOOL OF ELECTRICAL & ELECTRONICS ENGINEERING
Department of Electronics & Instrumentation
UNIT-I
EMBEDDED SYSTEM DESIGN – SBMA5201
ARM ARCHITECTURE
Acorn Computers Limited developed the first ARM processor (the Acorn RISC Machine) in 1985. Key features of the architecture:
• a load/store architecture
• simple addressing modes
Architecture V1 was implemented only in the ARM1 CPU and was not utilized in a commercial product. Architecture V2 was the basis for the first shipped processors. These two architectures were developed by Acorn Computers before ARM became a company in 1990. ARM then introduced Architecture V3, which included many changes over its predecessors. These changes resulted in an extremely small and power-efficient processor suitable for embedded systems. Architecture V4, co-developed by ARM and Digital Equipment Corporation, resulted in the StrongARM series of processors. These processors are very performance-centric and do not include the on-chip debug extensions.
This architecture was further developed to include the Thumb 16-bit instruction set architecture, enabling a 32-bit processor to utilize a 16-bit system. Today, ARM only licenses cores based on Architecture V4T or above.
The latest architectures, versions 5TE and 5TEJ, add instructions for DSP applications and the Jazelle-Java extensions, respectively.
Currently, the ARM9E and 10E families of processors are the only implementations of these architectures. Details on these architectures and cores will be provided later in the course.
4. Architecture basics
ARM cores use a 32-bit, load-store RISC architecture, meaning the core cannot directly manipulate memory. All data manipulation must be done by loading registers with information located in memory, performing the data operation, and then storing the value back to memory. There are 37 registers in total in the processor; however, that number is split among seven different processor modes. The seven processor modes are used to run user tasks, an operating system, and to efficiently handle exceptions such as interrupts. Some of the registers within each mode are reserved for specific use by the core, while most are available for general use. The registers reserved for specific functions are: r13, commonly used as the stack pointer (SP); r14, the link register (LR); r15, the program counter (PC); the Current Program Status Register (CPSR); and the Saved Program Status Register (SPSR).
The SPSR and the CPSR contain the status and control bits that define the properties under which the processor core is operating. These properties include the operating mode, ALU status flags, interrupt disable/enable flags, and whether the core is operating in 32-bit ARM or 16-bit Thumb state.
There are 37 total registers divided among seven different processor modes. Figure 1.2 shows the bank of registers visible in each mode. User mode, the only non-privileged mode, has the fewest registers visible. It has no SPSR and limited access to the CPSR. FIQ and IRQ are the two interrupt modes of the CPU. Supervisor mode is the default mode of the processor on start-up or reset. Undefined mode traps unknown or illegal instructions when they are passed through the pipeline. Abort mode traps illegal memory accesses that result from fetching instructions or accessing data.
Finally, system mode, which uses the user mode bank of registers, was introduced to provide an
additional privileged mode when dealing with nested interrupts.
Each additional mode offers unique registers that are available for use by exception handling
routines. These additional registers are the minimum number of registers required to preserve the
state of the processor, save the location in code, and switch between modes.
FIQ mode, however, has an additional five banked registers to provide more flexibility and
higher performance when handling critical interrupts.
When the ARM core is in Thumb state, the register bank is split into low and high register domains. The majority of instructions in Thumb state have a 3-bit register specifier. As a result, these instructions can only access the low registers, R0 through R7. The high registers, R8 through R15, have more restricted use; only a few instructions can access them.
TDMI stands for:
• Thumb, a 16-bit instruction set extension to the 32-bit ARM architecture, referred to as a state of the processor.
• "D" and "I" together comprise the on-chip debug facilities offered on all ARM cores. These stand for the Debug signals and the EmbeddedICE logic, respectively.
• The M signifies support for 64-bit results and an enhanced multiplier, resulting in higher performance. This multiplier is now standard on all ARMv4 architectures and above.
With a narrow 16-bit memory system, the processor must take two memory access cycles to fetch a 32-bit instruction or to read or write 32-bit data. To address this issue, ARM introduced a set of reduced 16-bit instructions labeled Thumb, based on the standard ARM 32-bit instruction set.
For Thumb to be used, the processor must go through a change of state from ARM to Thumb in order to begin executing 16-bit code, because the default state of the core is ARM. Therefore, every application must have boot code written in ARM. If the application code is to be compiled entirely for Thumb, then the segment of ARM boot code must change the state of the processor. Once this is done, 16-bit instructions are fetched seamlessly into the pipeline.
It is important to note that the architecture remains the same. The instruction set is actually a reduced set of the ARM instruction set, and only the instructions are 16-bit; everything else in the core still operates as 32-bit. Application code compiled in Thumb is 30% smaller on average than the same code compiled in ARM, and normally 30% faster when using narrow 16-bit memory systems.
Figure 1.3 Register Bank
Figure 1.3 shows the register bank in the center of the diagram, plus the required address bus
and data bus. The multiplier, in-line barrel shifter, and ALU are also shown. In addition, the
diagram illustrates the in-line decompression process of Thumb instructions while in the decode
stage of the pipeline. This process creates a 32-bit ARM equivalent instruction from the 16-bit
Thumb instruction, decodes the instruction, and passes it on to the execute stage.
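As an illustrative sketch of this decompression (the mapping is performed in hardware; these instruction pairings are standard examples, not taken from the figure):

    ADD  r1, #5             ; 16-bit Thumb instruction
    ADDS r1, r1, #5         ; 32-bit ARM equivalent (Thumb data ops always set the flags)
    LSL  r2, r3, #2         ; 16-bit Thumb shift
    MOVS r2, r3, LSL #2     ; 32-bit ARM equivalent using the barrel shifter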
6. ARM design philosophy
Small processor core for lower power consumption (for embedded systems).
6.1 Registers
ARM has 37 registers, all of which are 32 bits long:
• 1 dedicated program counter
• 1 dedicated current program status register
• 5 dedicated saved program status registers
• 30 general-purpose registers
The current processor mode governs which of several banks is accessible. Each mode can access:
• a particular set of r0-r12 registers
• a particular r13 (the stack pointer, sp) and r14 (the link register, lr)
• the program counter, r15 (pc)
• the current program status register, cpsr
Privileged modes (except System) can also access a particular spsr (saved program status register).
The processor has thirty-one general-purpose 32-bit registers and six 32-bit status registers. These registers are not all accessible at the same time; the processor state and operating mode determine which registers are available to the programmer.
The ARM state register set: in ARM state, 16 general registers and one or two status registers are accessible at any time. In privileged modes, mode-specific banked registers become available.
The ARM state register set contains 16 directly-accessible registers, r0-r15. Another
register, the Current Program Status Register (CPSR), contains condition code flags, status bits,
and current mode bits. Registers r0-r13 are general-purpose registers used to hold either data or
address values. Registers r14, r15, and the SPSR have the following special functions.
Link Register: Register r14 is used as the subroutine Link Register (LR). Register r14 receives
the return address when a Branch with Link (BL or BLX) instruction is executed. You can treat
r14 as a general-purpose register at all other times. The corresponding banked registers r14_svc,
r14_irq, r14_fiq, r14_abt, and r14_und are similarly used to hold the return values when
interrupts and exceptions arise, or when BL or BLX instructions are executed within interrupt or
exception routines.
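A minimal sketch of a subroutine call through the link register (label names are illustrative):

    BL   subr          ; call: lr (r14) <- return address, pc <- subr
    ...                ; execution resumes here after the return
subr
    ...                ; leaf subroutine body (lr is not saved to the stack)
    MOV  pc, lr        ; return: restore pc from the link register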
Program Counter: Register r15 holds the PC.
FIQ mode has seven banked registers mapped to r8-r14 (r8_fiq-r14_fiq). As a result, many FIQ handlers do not have to save any registers. The Supervisor, Abort, IRQ, and Undefined modes each have alternative mode-specific registers mapped to r13 and r14, permitting a private stack pointer and link register for each mode.
Figure 1.5 ARM register
Figure 1.6 THUMB register
7.2 ARM state and Thumb state registers relationship
Figures 1.2-1.6 show the relationships between the Thumb state and ARM state registers. Registers r0-r7 are known as the low registers. Registers r8-r15 are known as the high registers.
7.3 The program status registers
7.4 The condition code flags
The N, Z, C, and V bits are the condition code flags. You can set these bits by arithmetic and logical operations. The flags can also be set by MSR and LDM instructions. The ARM7TDMI-S tests these flags to determine whether to execute an instruction.
In ARM state, all instructions can execute conditionally. In Thumb state, only the Branch instruction can be executed conditionally.
The T bit reflects the operating state: when the T bit is set, the processor is executing in Thumb state; when it is clear, the processor is executing in ARM state.
The ARM7TDMI-S is a member of the ARM family of general-purpose 32-bit microprocessors. The ARM family offers high performance for very low power consumption and gate count. The ARM architecture is based on Reduced Instruction Set Computer (RISC) principles. The RISC instruction set, and the related decode mechanism, are much simpler than those of Complex Instruction Set Computer (CISC) designs. This simplicity gives a high instruction throughput.
The ARM7TDMI-S uses a pipeline to increase the speed of the flow of instructions to the processor. This allows several operations to take place simultaneously, and the processing and memory systems to operate continuously. The pipeline has three stages:
• Fetch
• Decode
• Execute
The three-stage pipeline is shown in the figure.
8.1 The Program Counter (PC) points to the instruction being fetched rather than to the instruction being executed. During normal operation, while one instruction is being executed, its successor is being decoded, and a third instruction is being fetched from memory.
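A small sketch of the consequence for software: in ARM state, reading the PC returns the address of the current instruction plus 8 (two instructions ahead):

here    MOV  r0, pc    ; r0 <- here + 8 (the address of the second NOP below)
        NOP
        NOP            ; r0 points to this instruction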
The ARM7 family is widely used in many applications such as palmtop computers, portable instruments, and smart cards.
9.1 The ARM Processor Families (II)
The ARM9 Family
• 32-bit RISC processor with ARM and Thumb instruction sets
• Supports a five-stage pipeline
• Uses Harvard architecture
[Figure: ARM9 family example - ARM920T with ETM interface and MPU]
The ARM9 family is widely used in mobile phones, PDAs, digital cameras, automotive systems, and industrial control systems.
9.2 ARM Processor Families (III)
Widely used in videophones, PDAs, set-top boxes, game consoles, digital video cameras, and automotive and industrial control systems.
• 32-bit RISC processor with ARM, Thumb, and DSP instruction sets
• Uses Harvard architecture
• Supports an eight-stage pipeline, except the ARM1156T2, which uses a nine-stage pipeline
Widely used in automotive and industrial control systems, 3D graphics, and security-critical applications.
10. Characteristics of ARM11 family
10.1 ARM Pipelines
10.2 Operating modes
• Unprivileged (User) mode
• System mode
• Undefined mode
10.3 Exceptions
Exceptions are taken whenever the normal flow of a program must temporarily halt, for
example, to service an interrupt from a peripheral. Before attempting to handle an exception, the
processor preserves the critical parts of the current processor state so that the original program
can resume when the handler routine has finished.
The ARM processor can work in one of many operating modes. So far we have only considered
user mode, which is the "normal" mode of operation.
The processor can also enter "privileged" operating modes which are used to handle exceptions
and SWIs
The Current Program Status Register (CPSR) has five bits (bits [4:0]) that indicate which mode the processor is in.
10.5 How are exceptions generated
By default, the processor is usually in user mode. It enters one of the exception modes when unexpected events occur. There are three different types of exceptions (some are called interrupts). Exceptions can occur as a direct result of executing an instruction, such as:
• a Software Interrupt instruction (SWI)
• an undefined or illegal instruction
• a memory error during the fetching of an instruction
In an exception mode, the processor sees r0-r7 as before, but sees a new set of some of the registers r8-r14; in addition, an extra register called the Saved Processor Status Register (SPSR) stores the value of the CPSR. By swapping to some new registers, it is easier for the programmer to preserve the state of the processor.
For example, during FIQ mode, r8 - r14 can be used freely. On returning back to user
mode, the original values of r8 - r14 will be automatically restored.
10.7 What happens when an exception occurs
ARM completes the current instruction as best it can. It departs from the current instruction sequence to handle the exception by performing the following steps:
1. It changes to the operating mode corresponding to the exception.
2. It saves the current PC in the r14 corresponding to the new mode. For example, if an FIQ occurs, the PC value is stored in r14_fiq.
3. It saves the old value of the CPSR in the Saved Processor Status Register of the new mode.
10.8 Where is the exception handler routine
Exceptions can be viewed as "forced" subroutine calls. When and if an exception occurs is not predictable (unless it is an SWI exception). A unique address is pre-defined for each exception handler (IRQ, FIQ, etc.), and a branch is made to this address. The address to which the processor is forced to branch is called the exception/interrupt vector.
10.9 Exception vector addresses
Each vector (except FIQ) is 4 bytes long (i.e., one instruction), so you put a branch instruction at that address: B exception_handler. FIQ is special in two ways:
1. You can put the actual FIQ handler (also called the Fast Interrupt Service Routine) at 0x0000001C onwards, because the FIQ vector occupies the highest address.
2. FIQ has many more shadow registers, so you don't have to save as many registers on the stack as other exceptions - faster.
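A sketch of such a vector table (handler names are hypothetical; every slot except FIQ holds a single branch):

    B    reset_handler     ; 0x00 Reset
    B    undef_handler     ; 0x04 Undefined instruction
    B    swi_handler       ; 0x08 Software interrupt (SWI)
    B    pabt_handler      ; 0x0C Prefetch abort
    B    dabt_handler      ; 0x10 Data abort
    NOP                    ; 0x14 Reserved
    B    irq_handler       ; 0x18 IRQ
fiq_handler                ; 0x1C FIQ: the handler itself starts here, no branch needed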
10.10 Exception Return
Once the exception has been handled (by the exception handler), the user task is resumed.
The handler program (or Interrupt Service Routine) must restore the user state exactly as it was before the exception occurred:
1. Any registers saved on the stack must be restored.
2. The CPSR must be restored from the SPSR.
3. The PC must be changed back to the instruction address in the user instruction stream.
Steps 1 and 3 are done by the user; step 2 is done by the processor.
Restoring registers from the stack is the same as in the case of subroutines. Restoring the PC value is more complicated; the exact way to do it depends on which exception you are returning from. Remember that the return address was saved in r14 before entering the exception handler.
To return from an SWI or undefined instruction trap, use: MOVS pc, r14
To return from an IRQ, FIQ or prefetch abort, use: SUBS pc, r14, #4
To return from a data abort to retry the data access, use: SUBS pc, r14, #8
If the destination register is the PC, the 'S' modifier does NOT mean "set the flags" but "restore the CPSR".
The differences between these three methods of return are due to the pipeline architecture of the ARM processor: the PC value stored in r14 can be one or two instructions ahead because of the instruction prefetch pipeline.
10.12 Exception Priorities
Since exceptions can arise at the same time, a priority order has to be clearly defined. For the
ARM processor this is:
• Reset (highest priority)
• Data abort (i.e., memory fault on a data read/write)
• Fast Interrupt Request (FIQ)
• Normal Interrupt Request (IRQ)
• Prefetch abort
• Software Interrupt (SWI), undefined instruction (lowest priority)
Consider the case of an FIQ and an IRQ occurring at the same time. The processor will process the FIQ handler first and "remember" that there is an IRQ pending. On return from the FIQ, the processor will immediately go to the IRQ handler.
11. Interrupts
The processor has two interrupt inputs, for normal interrupts (nIRQ) and fast interrupts (nFIQ). Each interrupt pin, when asserted and not masked, causes the processor to take the appropriate type of interrupt exception. The CPSR.F and CPSR.I bits control masking of fast and normal interrupts respectively.
A number of features exist to improve the interrupt latency, that is, the time taken between the assertion of the interrupt input and the execution of the interrupt handler. By default, the processor uses the Low Interrupt Latency (LIL) behaviors introduced in version 6 and later of the ARM architecture. The processor also has a port for connection of a Vectored Interrupt Controller (VIC), and supports Non-Maskable Fast Interrupts (NMFI).
• Interrupt request
• Fast interrupt request
• Non-maskable fast interrupts
• Low interrupt latency
• Interrupt controller
The IRQ exception is a normal interrupt caused by a LOW level on the nIRQ input. An IRQ has
a lower priority than an FIQ, and is masked on entry to an FIQ sequence. You must ensure that
the nIRQ input is held LOW until the processor acknowledges the interrupt request, either from
the VIC interface or the software handler.
Irrespective of whether the exception is taken from ARM state or Thumb state, an IRQ handler returns from the interrupt by executing:
SUBS PC, R14_irq, #4
You can disable IRQ exceptions within a Privileged mode by setting the CPSR.I bit to b1. See
Program status registers. IRQ interrupts are automatically disabled when an IRQ occurs, by
setting the CPSR.I bit. You can use nested interrupts but it is up to you to save any corruptible
registers and to re-enable IRQs by clearing the CPSR.I bit.
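A hedged sketch of the start of such a nested-interrupt handler (it shows only the save-and-re-enable steps; a fully reentrant handler would also switch out of IRQ mode so that a nested IRQ cannot corrupt lr_irq):

irq_handler
    SUB   lr, lr, #4              ; adjust the return address for the pipeline
    STMFD sp!, {r0-r3, r12, lr}   ; save corruptible registers and the return address
    MRS   r0, spsr                ; preserve SPSR_irq before allowing nesting
    STMFD sp!, {r0}
    MRS   r0, cpsr
    BIC   r0, r0, #&80            ; clear the I bit (&80) ...
    MSR   cpsr_c, r0              ; ... re-enabling IRQs, so nesting is possible
    ...                           ; service the interrupt source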
The Fast Interrupt Request (FIQ) reduces the execution time of the exception handler relative to
a normal interrupt. FIQ mode has eight private registers to reduce, or even remove the
requirement for register saving (minimizing the overhead of context switching).
An FIQ is externally generated by taking the nFIQ input signal LOW. You must ensure that the
nFIQ input is held LOW until the processor acknowledges the interrupt request from the
software handler.
Irrespective of whether exception entry is from ARM state or Thumb state, an FIQ handler returns from the interrupt by executing:
SUBS PC, R14_fiq, #4
If Non-Maskable Fast Interrupts (NMFIs) are not enabled, you can mask FIQ exceptions by setting the CPSR.F bit to b1.
FIQ and IRQ interrupts are automatically masked by setting the CPSR.F and CPSR.I bits when
an FIQ occurs. You can use nested interrupts but it is up to you to save any corruptible registers
and to re-enable interrupts.
When NMFI behavior is enabled, FIQ interrupts cannot be masked by software. Enabling NMFI
behavior ensures that when the FIQ mask, that is, the CPSR.F bit, is cleared by the reset handler,
fast interrupts are always taken as quickly as possible, except during handling of a fast interrupt.
This makes the fast interrupt suitable for signaling critical events. NMFI behavior is controlled
by a configuration input signal CFGNMFI, that is asserted HIGH to enable NMFI operation.
There is no software control of NMFI.
Software can detect whether NMFI operation is enabled by reading the NMFI bit of the SCTLR:
NMFI == 0: software can mask FIQs (normal FIQ behavior)
NMFI == 1: software cannot mask FIQs (NMFI behavior)
For more information see c1, System Control Register. When the NMFI bit is set, the CPSR.F bit can be set to b1 only by an FIQ or reset exception entry.
Low Interrupt Latency (LIL) is a set of behaviors that reduce the interrupt latency for the
processor, and is enabled by default. That is, the FI bit [21] in the SCTLR is Read-as-One.
LIL behavior enables accesses to Normal memory, including multiword accesses and external
accesses, to be abandoned part-way through execution so that the processor can react to a
pending interrupt faster than would otherwise be the case. When an instruction is abandoned in
this way, the processor behaves as
if the instruction was not executed at all. If, after handling the interrupt, the interrupt handler returns to the program in the normal way using the instruction SUBS pc, r14, #4, the abandoned
instruction is re-executed. This means that some of the memory accesses generated by the
instruction are performed twice.
Memory that is marked as Strongly-ordered or Device type is typically sensitive to the number of
reads or writes performed. Because of this, instructions that access Strongly-ordered or Device
memory are never abandoned when they have started accessing memory. These instructions
always complete either all or none of their memory accesses. Therefore, to minimize the interrupt
latency, you must avoid the use of multiword load/store instructions to memory locations that are marked as Strongly-ordered or Device.
The processor includes a VIC port for connection of a Vectored Interrupt Controller (VIC). An
interrupt controller is a peripheral that handles multiple interrupt sources. Features usually found
in an interrupt controller are:
• multiple interrupt request inputs, one for each interrupt source, and one or more amalgamated interrupt request outputs to the processor
• a means to determine which interrupt source is requesting service
• a means to determine where the service routine for that interrupt source is loaded
A VIC does all these in hardware to reduce the interrupt latency. It supplies the starting address
of the service routine corresponding to the highest priority asserted interrupt source directly to
the processor. When the processor has accepted this address, it masks the interrupt so that the
processor can re-enable interrupts without clearing the source. The PL192 VIC is an AMBA-compliant SoC peripheral that is developed, tested, and licensed by ARM.
You can use the VIC port to connect a PL192 VIC to the processor. See the ARM PrimeCell Vectored
Interrupt Controller (PL192) Technical Reference Manual for more information about the PL192 VIC.
You can enable the VIC port by setting the VE bit in the SCTLR. When the VIC port is enabled and an IRQ occurs, the processor performs a handshake over the VIC interface to obtain the address of the handling routine for the IRQ.
Vector table
All ARM systems have a vector table. The vector table does not form part of the initialization sequence, but it must be present for any exception to be serviced. It must be placed at a specific address, usually 0x0.
Question Bank:
SCHOOL OF ELECTRICAL & ELECTRONICS ENGINEERING
UNIT-II
EMBEDDED SYSTEM DESIGN – SBMA5201
1. Arm Instruction Set
Data Processing Instructions, Addressing Modes, Branch, Load, Store Instructions, PSR Instructions, Conditional Instructions. Thumb Instruction Set: Register Usage, Other Branch Instructions, Data Processing Instructions, Single Register and Multi-Register Load-Store Instructions, Stack, Software Interrupt Instructions
1.1 ARM has 37 registers, all of which are 32 bits long.
The current processor mode governs which of several banks is accessible. Each mode can access: a particular set of r0-r12 registers; a particular r13 (the stack pointer, sp) and r14 (the link register, lr); the program counter, r15 (pc); and the current program status register, cpsr.
The ARM processor was designed by Advanced RISC Machines (ARM) Limited. ARM processors are mainly used for low-power and low-cost applications, such as:
• mobile phones
• communication modems
Different versions of ARM processors share the same basic machine instruction set.
• The instruction set defines the operations that can change the state.
Memory system: byte ordering.
Instruction set
• Load-store architecture
• 3-address instructions
• Conditional execution of every instruction
• Possible to load/store multiple registers at once
• Possible to combine shift and ALU operations in a single instruction
4. Registers and Memory Access
In the ARM architecture:
• 32-bit addresses are used.
• Two operand lengths are used in moving data between the memory and the processor registers: bytes (8 bits) and words (32 bits).
• When a byte is loaded from memory into a processor register, or stored from a register into the memory, it occupies the low-order byte position of the register.
The basic encoding format for the instructions, such as Load, Store, Move, Arithmetic, and Logic instructions, is shown below.
An instruction specifies a conditional execution code (Condition), the OP code, two or three registers (Rn, Rd, and Rm), and some other information.
Conditional Execution of Instructions
A distinctive and somewhat unusual feature of ARM processors is that all instructions are conditionally executed. The instruction is executed only if the current state of the processor condition code flags satisfies the condition specified in bits b31-b28 of the instruction. Thus, instructions whose condition does not match the processor condition code flags are not executed. One of the conditions is used to indicate that the instruction is always executed.
The ARM instruction set supports six different types of instructions.
Data Processing Instructions
The data processing instructions operate on data held in general-purpose registers. Of the two source operands, one is always a register; the other can be:
• a second register (optionally shifted)
• an immediate value
Four types of shift can be specified. Most data processing instructions can perform a shift followed by a logical or arithmetic operation, as sketched below.
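A short sketch of combining a shift with an ALU operation in one instruction (the barrel shifter transforms the second operand):

    ADD  r0, r1, r2, LSL #2    ; r0 <- r1 + (r2 << 2), e.g. scaling an array index
    MOV  r3, r4, ASR #8        ; r3 <- r4 arithmetically shifted right by 8
    AND  r5, r5, r6, ROR #16   ; mask with r6 rotated right by 16 bits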
Multiply instructions come in two classes: normal multiplies, which produce a 32-bit result (e.g., MUL, MLA), and long multiplies, which produce a 64-bit result. Other data processing classes include move (e.g., MVN), logical (e.g., ORR, BIC), and compare (e.g., CMP, CMN), which updates only the condition flags.
6.3 Addressing modes
Load/store instructions support three addressing modes:
• offset
• pre-indexed
• post-indexed
Offset addresses are formed by adding or subtracting an immediate or register-based offset to or from a base register. In pre-indexed and post-indexed addressing, the base register acts as the index register, and the base register is updated with the base plus offset calculation. The three modes are sketched below.
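A short sketch of the three modes:

    LDR  r0, [r1, #4]     ; offset: r0 <- mem[r1 + 4], r1 unchanged
    LDR  r0, [r1, #4]!    ; pre-indexed: r1 <- r1 + 4, then r0 <- mem[r1]
    LDR  r0, [r1], #4     ; post-indexed: r0 <- mem[r1], then r1 <- r1 + 4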
Branch Instructions
Branch instructions change the flow of sequential execution by modifying the program counter. Branch with Link (BL) is suitable for subroutine calls: it stores the address of the instruction after the BL in the link register, and the program counter is restored from the link register when returning from the subroutine. Branch Exchange (BX) and Branch with Link and Exchange (BLX) switch the processor state between ARM and Thumb.
The second class of instruction is load and store instructions. These instructions come in two main types: single register transfers and multiple register transfers. Load and store single register instructions can transfer a 32-bit word, a 16-bit halfword, or an 8-bit byte between memory and a register. Byte and halfword loads may be automatically zero-extended or sign-extended as they are loaded.
• Transfer data between memory and registers
• Used to move a single data item in and out of a register (signed and unsigned bytes, 16-bit halfwords, and 32-bit words)
• Supports register-indirect, base-plus-offset, and stack addressing modes
• Instructions: LDR, STR, LDRB, STRB, LDRH, STRH, LDRSB
6.6 Multiple Register Transfer Instructions
Any subset of the 16 registers, or all of them, can be loaded from or stored to memory with a single instruction, but long transfers increase interrupt latency. A sketch follows below.
Swap Instructions: swap the contents of memory with the contents of a register. Instructions: SWP, SWPB.
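A sketch of a multiple register transfer saving and restoring a register set at a subroutine boundary:

    STMFD sp!, {r4-r7, lr}    ; push r4-r7 and the return address in one instruction
    ...
    LDMFD sp!, {r4-r7, pc}    ; pop them and return by loading the saved lr into pc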
• It causes an exception trap to the SWI hardware vector (thus causing a change to supervisor mode, plus the associated state saving), thus causing the SWI exception handler to be called.
• The handler can then examine the comment field of the instruction to decide what operation has been requested.
• By making use of the SWI mechanism, an operating system can implement a set of privileged operations which applications running in user mode can request.
The Thumb SWI instruction is encoded as 1 1 0 1 1 1 1 1 followed by an 8-bit immediate:
• The PC is forced to 0x08 (the SWI vector).
• The 8-bit immediate is zero-extended to fill the 24-bit comment field of the equivalent ARM instruction.
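A minimal sketch of requesting an OS service through SWI (the comment-field value 0x11 and the argument convention are hypothetical):

    MOV  r0, #42        ; argument passed to the OS in a register
    SWI  0x11           ; trap to supervisor mode; the handler reads comment field 0x11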
6.7 Program Status Register Instructions
• Used to transfer the contents of the program status registers to or from a general-purpose register.
• MRS copies a program status register to a general-purpose register; MSR moves a general-purpose register to a program status register.
7.1 Condition code flags
The N, Z, C, and V bits are the condition code flags. You can set these bits by arithmetic and logical operations. The flags can also be set by MSR and LDM instructions. The ARM7TDMI-S tests these flags to determine whether to execute an instruction.
In ARM state, all instructions can execute conditionally. In Thumb state, only the Branch instruction can be executed conditionally.
• when the I bit is set, IRQ interrupts are disabled
• when the F bit is set, FIQ interrupts are disabled.
T bit
The T bit reflects the operating state: when the T bit is set, the processor is executing in Thumb state; when it is clear, the processor is executing in ARM state.
Mode bits
The M4, M3, M2, M1, and M0 bits (M[4:0]) are the mode bits. These bits determine the
processor operating mode as listed in Table 2-2. Not all combinations of the mode bits define a
valid processor mode, so take care to use only the bit combinations shown.
Reserved bits
The remaining bits in the PSRs are unused but are reserved. When changing a PSR flag or control bits, make sure that these reserved bits are not altered. Also, make sure that your program does not rely on reserved bits containing specific values, because future processors might have these bits set to one or zero.
Coprocessor Instructions
Used to extend the instruction set, to control on-chip functions (caches and memory management), and for additional computations.
• CDP (data processing), MRC/MCR (register transfer), LDC/STC (memory transfer).
7.4 Program status registers
The processor contains one CPSR and five SPSRs for exception handlers to use.
8. Addressing modes
• offset
• pre-indexed
• post-indexed.
Pre-indexed and post-indexed addressing modes update the base register with the base plus offset calculation. As the PC is a general-purpose register, a 32-bit value can be loaded directly into the PC to perform a jump to any address in the 4GB memory space.
ARM and Thumb instructions can execute conditionally on the condition flags set by a previous instruction, even after any number of intervening instructions that have not updated the flags. The instructions that you can make conditional depend on whether the processor is in ARM state or Thumb state.
To make an instruction conditional, you must add a condition code suffix to the instruction mnemonic. The condition code suffix enables the processor to test a condition based on the flags. If the condition test of a conditional instruction fails, the instruction does not execute.
Conditional execution in ARM state: almost all ARM instructions can be executed conditionally on the value of the ALU status flags in the APSR. You can either add a condition code suffix to the instruction, or you can conditionally skip over the instruction using a conditional branch instruction. Using conditional branch instructions to control the flow of execution can be better when a series of instructions depend on the same condition.
PL : Plus
If the N flag is clear after an arithmetical operation. For the purposes of defining 'plus', zero is positive because it isn't negative.
MI : Minus
If the N flag is set after an arithmetical operation.
CS : Carry Set
Set if the C flag is set after an arithmetical operation OR a shift operation, the result of which cannot be represented in 32 bits. You can think of the C flag as the 33rd bit of the result.
CC : Carry Clear
The reverse of CS.
GE : Greater Than or Equal (signed)
If, after a comparison, the N flag is set AND the V flag is set, or the N flag is clear AND the V flag is clear (i.e., N == V).
GT : Greater Than (signed)
If, after a comparison, the N flag is set AND the V flag is set, or the N flag is clear AND the V flag is clear (i.e., N == V), and the Z flag is clear.
be updated. This does not happen automatically, except for those instructions whose purpose is to set the status. For example:
ADD R0, R0, R1
ADDS R0, R0, R1
ADDEQS R0, R0, R1
The first example performs the addition without touching the status flags. The second example shows us the same addition, only this time (because of the S suffix) it will cause the status registers to be updated. The last example shows us the addition again, updating the status registers. The difference here is that it is a conditional instruction: it will only be executed if the result of a previous operation was EQual (if the Z flag is set).
Here is an example of conditional execution at work. You want to compare register zero with the contents of something stored in register ten. If not equal to R10, then call a software interrupt, increment, and branch back to do it again. Otherwise, clear R10 and return to a calling piece of code (whose address is stored in R14).
An example of conditional execution
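A sketch matching that description (the SWI number is hypothetical):

loop    CMP   r0, r10        ; compare r0 with r10, setting the flags
        SWINE 0x0            ; if not equal: call a software interrupt
        ADDNE r0, r0, #1     ; if not equal: increment
        BNE   loop           ; if not equal: branch back and test again
        MOV   r10, #0        ; otherwise: clear r10
        MOV   pc, r14        ; and return to the caller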
9. The 32-bit PSR
Processors after the ARM 3 provide a 32-bit addressing space by moving the PSR out of R15 and giving R15 a full 32 bits in which to store the address of the current position.
Currently, RISC OS works in 26-bit mode, except for a few special cases which are unlikely to be encountered.
The 32-bit mode is important because 26 bits (as in the old PSR) restricts the maximum amount of addressable memory per application to 28MB. That is why you can't drag the Next slot beyond 28MB irrespective of how much memory you have installed.
The allocation of the bits within the CPSR (and the SPSR registers to which it is saved) is:

Bit:   31 30 29 28 ... 7  6  5  4  3  2  1  0
Flag:  N  Z  C  V      I  F  -  M4 M3 M2 M1 M0

M4 M3 M2 M1 M0   Mode
0  0  0  0  0    User26 mode
0  0  0  0  1    FIQ26 mode
0  0  0  1  0    IRQ26 mode
0  0  0  1  1    SVC26 mode
1  0  0  0  0    User mode
1  0  0  0  1    FIQ mode
1  0  0  1  0    IRQ mode
1  0  0  1  1    SVC mode
1  0  1  1  1    ABT mode
1  1  0  1  1    UND mode
Typically, the processor will be operating in User26, FIQ26, IRQ26 or SVC26 mode. It is possible to enter a 32-bit mode, but extreme care must be taken. RISC OS won't expect it, and will sulk greatly if it finds itself in it! (Except RISC OS 5, which works totally in 32-bit mode - and you cannot enter 26-bit mode, as the processor doesn't have it anymore.)
You cannot use MOVS PC, R14 or LDMFD R13!, {registers, PC}^ in 32-bit code. Neither can you use ORRS PC, R14, #1<<28 to set the V flag. All of this is instead done using MRS and MSR.
You have two PSRs: the CPSR, which is the Current Program Status Register, and the SPSR, which is the Saved Program Status Register. Each privileged mode has its own SPSR, so the total available selection of PSRs is:
CPSR_all - current
SPSR_svc - saved, SVC(32) mode
SPSR_irq - saved, IRQ(32) mode
SPSR_abt - saved, ABT(32) mode
SPSR_und - saved, UND(32) mode
SPSR_fiq - saved, FIQ(32) mode
It appears as if you cannot explicitly specify to save the current PSR in, say, SPSR_fiq. Instead,
you should change to FIQ mode and then save to SPSR. In other words, you can only alter the
SPSR of the mode you are in.
Using the _flg suffix allows you to alter the flag bits without affecting the control bits. In user(32) mode, the control bits of the CPSR are protected; you can only alter the condition flags. In other modes, the entire CPSR is available. You should not specify R15 as a source or destination register. And finally, you must not attempt to access the SPSR in user(32) mode, as it doesn't exist!
Here, for your delectation, is a way to set the V flag on any ARM processor:
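A sketch of one way to do it, writing only the flag field (the & prefix denotes hexadecimal, as in the listing below):

    MSR CPSR_flg, #&10000000   ; set the V flag (bit 28) without touching the control bits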
To change mode:
    MRS R0, CPSR_all        ; copy the PSR
    BIC R0, R0, #&1F        ; clear the mode bits
    ORR R0, R0, #new_mode   ; set the bits for the new mode
    MSR CPSR_all, R0        ; write the PSR back, changing mode
STACK
The ARM architecture offers extensive support for a memory stack by allowing programmers to choose one of four stack formats/orientations:
● Empty or Full:
● Empty: the stack pointer points to the next free space on the stack
● Full: the stack pointer points to the last item on the stack
● Ascending or Descending:
● Ascending: grows from low memory to high memory
● Descending: grows from high memory to low memory
● i386, SPARC and PowerPC all use a "Full, Descending" stack format.
The multiple data transfer instructions provide a mechanism for storing state on the stack (pointed to by R13).
The STM and LDM instructions' modes have aliases for accessing stacks:
FD = Full Descending
o STMFD/LDMFD = STMDB/LDMIA
ED = Empty Descending
o STMED/LDMED = STMDA/LDMIB
FA = Full Ascending
o STMFA/LDMFA = STMIB/LDMDA
EA = Empty Ascending
o STMEA/LDMEA = STMIA/LDMDB
For example, LDMFD r13!, {r4-r7} pops R4, R5, R6 and R7 from the stack.
• ARM architecture versions v4T and above define a 16-bit instruction set called the Thumb instruction set. The functionality of the Thumb instruction set is a subset of the functionality of the 32-bit ARM instruction set. The processor cannot correctly execute instructions of the wrong instruction set for the current state.
• Thumb instruction formats are less regular than ARM instruction formats, as a result of the dense encoding.
• The processor in Thumb mode uses the same eight general-purpose integer registers that are available in ARM mode. Some Thumb instructions also access the PC (ARM register 15), the Link Register (ARM register 14) and the Stack Pointer (ARM register 13).
• When R15 is read, bit [0] is zero and bits [31:1] contain the PC. When R15 is written, bit [0] is ignored and bits [31:1] are written to the PC.
• Thumb does not provide direct access to the CPSR or any SPSR.
• Thumb execution is flagged by the T bit (bit [5]) in the CPSR: T == 0 means 32-bit ARM instructions are executed, T == 1 means 16-bit Thumb instructions are executed.
• Use ARM code in 32-bit on-chip memory for small speed-critical routines.
• Use Thumb code in 16-bit off-chip memory for large non-critical control routines.
Figure 2.5 Thumb Instructions
Data Types
• r13 is used as the stack pointer
ARM-Thumb differences
Thumb exception
Thumb Branching
• Thumb-ARM Decompression
• Zero extending 3-bit register specifiers to give 4-bit specifiers
Properties
• Thumb code uses 40% more instructions than ARM code
• With 16-bit memory, Thumb code is 45% faster than ARM code
• Thumb code uses 30% less external memory power than ARM code
Question Bank:
5. Draw the architectural block diagram of ARM and explain the data flow of each unit.
6. Explain the working of the barrel shifter with an example instruction and diagram.
8. How can ZigBee be interfaced with an ARM processor? Draw and explain an interfacing diagram.
9. Explain the need for a fast interrupt service and a normal interrupt service in the ARM programmer's model, with a proper diagram.
References:
1. Andrew N. Sloss, Dominic Symes, Chris Wright, ARM System Developer's Guide: Designing and Optimizing System Software, Elsevier, 2004.
4. C. M. Krishna, Kang G. Shin, Real-Time Systems, McGraw Hill, 3rd reprint, 2010.
5. Hermann Kopetz, Real-Time Systems: Design Principles for Distributed Embedded Applications, Kluwer Academic Publishers, 1997.
6. William Hohl, ARM Assembly Language: Fundamentals and Techniques, Taylor & Francis, 2009.
SCHOOL OF ELECTRICAL & ELECTRONICS ENGINEERING
UNIT-III
EMBEDDED SYSTEM DESIGN – SBMA5201
REAL TIME OPERATING SYSTEM
Real-time operating systems (RTOS) – real-time kernel – OS tasks – task states – task scheduling – interrupt processing – clocking, communication and synchronization – control blocks – memory requirements and control – kernel services
1. RTOS
A Real-Time Operating System (RTOS) comprises two components: "Real-Time" and "Operating System". An operating system (OS) is a collection of system calls, or functions, which provides an interface between hardware and application programs. It manages the hardware resources of a computer and hosts the applications that run on the computer. An OS typically provides multitasking, synchronization, interrupt and event handling, input/output, inter-task communication, timers and clocks, and memory management. The core of the OS is the kernel, which is typically a small, highly optimized set of libraries.
Real-time systems are those systems in which the correctness of the system depends not only
on the logical result of computation, but also on the time at which the results are produced.
An RTOS is key to many embedded systems and provides a platform on which to build applications. Not all embedded systems are designed with an RTOS. Embedded systems with relatively simple/small hardware/code might not require an RTOS. Embedded systems with moderate-to-large software applications require some form of scheduling, and hence an RTOS.
3. RTOS CLASSIFICATION
An RTOS specifies a known maximum time for each of the operations that it performs. Based upon the degree of tolerance in meeting deadlines, RTOS are classified into the following categories:
• Hard real-time: missing a deadline is a total system failure.
• Firm real-time: infrequent deadline misses are tolerable, but a result has no value after its deadline.
• Soft real-time: the usefulness of a result degrades after its deadline, but the system doesn't fail and system quality remains acceptable.
For a life-saving device such as an automatic parachute-opening device for skydivers, delay can be fatal. The parachute-opening device deploys the parachute at a specific altitude based on various conditions. If it fails to respond in the specified time, the parachute may not get deployed at all, leading to casualty. A similar situation exists during inflation of air bags in cars at the time of an accident. If the airbags don't inflate at the appropriate time, it may be fatal for the driver. Such systems must therefore be hard real-time systems, whereas for TV live broadcast, delay is acceptable; in such cases, soft real-time systems can be used.
Fig 3.2 Jitter
The difference between consecutive interrupt latency measures is calculated. Finally, the greatest encountered difference is selected as the worst jitter of the system.
4. RTOS Features
ii. Thread priority - All tasks are assigned a priority level to facilitate pre-emption. The highest-priority task that is ready to run will be the task that runs.
iv. Priority inheritance - An RTOS should have a large number of priority levels and should prevent priority inversion using priority inheritance.
• Interrupt dispatch latency: the time from the last instruction in the interrupt handler to the next task scheduled to run.
5. RTOS Architecture
5.1 KERNEL
The kernel is the part of an operating system that provides the most basic services to application software running on a processor. The kernel of a real-time operating system (RTOS) provides an "abstraction layer" that hides from application software the hardware details of the processor (or set of processors) upon which the application software will run.
Fig 3.4 Kernel
The kernel decides which programs are kept in the RAM, it determines which programs get access to which hardware resources, it sets up or resets the CPU's operating states for optimal operation at all times, and it organizes the data for long-term non-volatile storage with file systems on such media as disks, tapes, flash memory, etc. The RTOS kernel acts as an abstraction layer between the hardware and the applications.
• Monolithic kernel
Monolithic kernels are part of Unix-like operating systems like Linux, FreeBSD, etc. A monolithic kernel is one single program that contains all of the code necessary to perform every kernel-related task. It runs all basic system services (i.e., process and memory management, interrupt handling and I/O communication, file system, etc.) and provides powerful abstractions of the underlying hardware. The number of context switches and the amount of messaging involved are greatly reduced, which makes it run faster than a microkernel.
In a monolithic kernel, all OS services run along with the main kernel thread, thus also residing in the same memory area. This approach provides rich and powerful hardware access. The main disadvantages of monolithic kernels are the dependencies between system components (a bug in a device driver might crash the entire system) and the fact that large kernels can become very difficult to maintain.
In the monolithic kernel, some advantages hinge on these points:
• As it is one single piece of software, it should be smaller in both source and compiled forms.
• Less code generally means fewer bugs, which can translate to fewer security problems.
The disadvantages:
• Coding in the kernel can be challenging, in part because one cannot use common libraries and because one needs to use a source-level debugger. Rebooting the computer is often required. This is not just a problem of convenience to the developers. When debugging is harder, it becomes more likely that code will be "buggier".
• Bugs in one part of the kernel have strong side effects; since every function in the kernel has all the privileges, a bug in one function can corrupt the data structure of another, totally unrelated part of the kernel, or of any running program.
• Kernels often become very large and difficult to maintain.
• Even if the modules servicing these operations are separate from the whole, the code integration is tight and difficult to do correctly.
• Since the modules run in the same address space, a bug can bring down the entire system.
• Monolithic kernels are not portable; they must be rewritten for each new architecture that the operating system is to be used on.
5.2 Microkernel
Figure 3.6 Microkernel
The very essence of the microkernel architecture illustrates some of its advantages:
• Rapid development time; new software can be tested without having to reboot the kernel.
The disadvantages:
• Messaging bugs can be harder to fix due to the longer trip they have to take, versus the one-off copy in a monolithic kernel.
• Process management in general can be very complicated.
Hybrid kernels are microkernels that have some "non-essential" code in kernel space in order for the code to run more quickly than it would were it to be in user space. Hybrid kernels are a compromise between the monolithic and microkernel designs. This implies running some services (such as the network stack or the file system) in kernel space to reduce the performance overhead of a traditional microkernel, but still running kernel code (such as device drivers) as servers in user space.
• Faster development time for drivers that can operate from within modules; no reboot is required for testing (provided the kernel is not destabilized).
A nanokernel delegates virtually all services, including even the most basic ones like interrupt controllers or the timer, to device drivers, to make the kernel memory requirement even smaller than that of a traditional microkernel.
5.6 Exokernel
An exokernel exposes the hardware with minimal abstraction, which enables application developers to determine how to make the most efficient use of the available hardware for each specific program.
Exokernels in themselves are extremely small. However, they are accompanied by library
operating systems providing application developers with the functionalities of a conventional
operating system. A major advantage of exokernel-based systems is that they can incorporate
multiple library operating systems, each exporting a different API, for example one for high
level UI development and one for real-time control
Six types of common services are shown in the figure below and explained in subsequent sections.
An RTOS provides the services of a regular OS but, in addition, it provides mechanisms to allow real-time scheduling of tasks.
Though real-time operating systems may or may not increase the speed of execution, they can provide much more precise and predictable timing characteristics than a general-purpose OS.
The heart of a real-time OS (and the heart of every OS, for that matter) is the kernel. A kernel
is the central core of an operating system, and it takes care of all the OS jobs:
1. Booting
2. Task Scheduling
6. TASK MANAGEMENT
Task Object
Task states:
– Running
– Ready (possibly: suspended, pended)
– Blocked (possibly: waiting, dormant, delayed)
The scheduler schedules/shuffles tasks between the Running and Ready states. Blocking is self-blocking by a task, and a blocked task is moved back toward the Running state via other tasks or interrupts.
Ready: There may be several tasks in this state. The attributes of the task and the resources
required to run it must be available for it to be placed in the 'ready' state.
Waiting: The execution of tasks placed in this state has been suspended because the task requires some resource which is not available, because the task is waiting for some signal from the plant (e.g., input from the analog-to-digital converter), or because the task is waiting for the elapse of time.
New: The operating system is aware of the existence of this task, but the task has not been
allocated a priority and a context and has not been included into the list of schedulable tasks
Terminated: The operating system is no longer aware of this task, although it may still be resident in the memory of the computer.
During such times the task is in the new state. Once these are over, it enters the
ready state where it waits. At this time it is within the view of the scheduler and is considered
for execution according to the scheduling policy.
A task is made to enter the running state from the ready state by the operating system dispatcher when the scheduler determines it to be the task to run according to its scheduling policy. While the task is running, it may execute a normal or abnormal exit according to the program logic, in which case it enters the terminated state and is then removed from the view of the OS. Software or hardware interrupts may also occur while the task is running. In such a case, depending on the priority of the interrupt, the current task may be transferred to the ready state, where it waits for its next time allocation by the scheduler.
Finally, a task may need to wait at times during its course of execution, either due to requirements of synchronization with other tasks or for completion of some service, such as I/O, that it has requested.
A task uses a TCB (Task Control Block) to remember its context. TCBs are data structures residing in RAM, accessible only by the RTOS. Typical TCB fields, used in the sketch below, are:
• Task_ID
• Task_State
• Task_Priority
• Task_Stack_Pointer
• Task_Prog_Counter
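A sketch of how such a TCB layout might be declared in ARM assembly (field offsets are illustrative assumptions, one word each; current_tcb is a hypothetical pointer):

TCB_ID     EQU  &00
TCB_STATE  EQU  &04
TCB_PRIO   EQU  &08
TCB_SP     EQU  &0C
TCB_PC     EQU  &10

    LDR  r0, =current_tcb    ; r0 -> pointer to the running task's TCB
    LDR  r0, [r0]
    LDR  r1, [r0, #TCB_SP]   ; fetch the saved stack pointer field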
Scheduler
The scheduler keeps a record of the state of each task, selects from among the tasks that are ready to execute, and allocates the CPU to one of them. Various scheduling algorithms are used in an RTOS:
• Polled system with interrupts: in addition to polling, it takes care of critical tasks.
• Round robin: sequences from task to task, each task getting a slice of time.
• Hybrid system: responds to sensitive interrupts, with a round-robin system working in the background.
• Non-pre-emptive scheduling, or cooperative multitasking: the highest-priority task executes for some time, then relinquishes control and re-enters the ready state.
Dispatcher
The dispatcher gives control of the CPU to the task selected by the scheduler by performing a context switch, changing the flow of execution. A sketch of the heart of such a switch follows.
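A hedged sketch in ARM assembly (it ignores mode and CPSR handling, which a real RTOS must also save; current_tcb and next_tcb are hypothetical pointers, and TCB_SP is the offset defined earlier):

switch_context
    STMFD sp!, {r0-r12, lr}    ; save the outgoing task's registers on its own stack
    LDR   r0, =current_tcb
    LDR   r0, [r0]
    STR   sp, [r0, #TCB_SP]    ; record its stack pointer in its TCB
    LDR   r1, =next_tcb
    LDR   r1, [r1]
    LDR   sp, [r1, #TCB_SP]    ; adopt the incoming task's stack
    LDMFD sp!, {r0-r12, pc}    ; restore its registers and resume where it left off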
8. TASK SCHEDULING
Scheduling
It is "unsafe" for two tasks to access the same specific data or hardware resource simultaneously. Ways to resolve this include:
• binary semaphores
• message passing
Memory Allocation
• Speed of allocation
Interrupt Handling
Task Scheduling Rules
Next, consider what happens when Task B goes to the WAITING state under the conditions in the figure. Since task precedence is defined among tasks that can be run, the precedence among tasks becomes as shown in Figure 3.15. Thereafter, when the Task B waiting state is released, Task B goes to the run state after Task C and Task D, and thus assumes the lowest precedence among tasks of the same priority.
Summarizing the above: immediately after a task that goes from READY state to RUNNING state reverts to READY state, it has the highest precedence among tasks of the same priority; but after a task goes from RUNNING state to WAITING state and the wait is then released, its precedence is the lowest among tasks of the same priority.
Note that after a task goes from SUSPENDED state to a run state, it has the
lowest precedence among tasks of the same priority. In a virtual memory system, if a task is
made to wait for paging by putting the task in SUSPENDED state, in such a system the task
precedence changes as a result of a paging wait.
Figure 3.16 Precedence after Task B Goes To RUNNING State
Figure 3.17. Precedence after Task B Goes To WAITING State
Task Synchronisation
Task synchronization is achieved using two types of mechanisms:
Event Objects
Event objects are used when task synchronization is required without resource sharing. They allow one or more tasks to keep waiting for a specified event to occur. An event object can exist in either a triggered or non-triggered state; the triggered state indicates resumption of the task.
Semaphores.
A semaphore has an associated resource count and a wait queue. The resource count indicates the availability of the resource. The wait queue manages the tasks waiting for resources from the semaphore. A semaphore functions like a key that defines whether a task has access to the resource. A task gets access to the resource when it acquires the semaphore (see the sketch after the list below).
· Binary Semaphores
· Counting Semaphores
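A sketch of a binary semaphore "take" built on the SWP instruction described in Unit II (a real RTOS would block the task on the wait queue instead of spinning):

; r1 -> semaphore word: 1 = available, 0 = taken
sem_take
    MOV   r0, #0
    SWP   r0, r0, [r1]    ; atomically read the semaphore and write 0
    CMP   r0, #1
    BNE   sem_take        ; not available: retry (an RTOS would block here)
    MOV   pc, lr          ; acquired

sem_give
    MOV   r0, #1
    STR   r0, [r1]        ; release: mark the semaphore available again
    MOV   pc, lr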
Intertask communication
Tasks waiting on a message queue can be released in 1) first-in, first-out (FIFO), 2) last-in, first-out (LIFO), or 3) priority (PRI) sequence.
Pipes - A pipe is an object that provides a simple communication channel used for unstructured data exchange among tasks. A pipe does not store multiple messages but a stream of bytes. Also, data flow through a pipe cannot be prioritized.
Remote procedure call (RPC) - It permits distributed computing, where a task can invoke the execution of another task on a remote computer.
A common way to protect a critical section is to temporarily mask interrupts, which requires no OS intervention. On single-processor systems, if the application runs in kernel mode and can
mask interrupts, this method is the solution with the lowest overhead to prevent simultaneous
access to a shared resource. While interrupts are masked and the current task does not make a
blocking OS call, then the current task has exclusive use of the CPU since no other task or
interrupt can take control, so the critical section is protected. When the task exits its critical
section, it must unmask interrupts; pending interrupts, if any, will then execute. Temporarily
masking interrupts should only be done when the longest path through the critical section is
shorter than the desired maximum interrupt latency. Typically this method of protection is
used only when the critical section is just a few instructions and contains no loops. This
method is ideal for protecting hardware bit- mapped registers when the bits are controlled by
different tasks.
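A sketch of such a critical section on a classic ARM core, masking IRQ and FIQ around a few instructions (I = bit 7, F = bit 6 of the CPSR):

    MRS  r0, CPSR          ; remember the current interrupt mask state
    ORR  r1, r0, #&C0      ; set the I (&80) and F (&40) bits
    MSR  CPSR_c, r1        ; interrupts are now masked
    ...                    ; short critical section: a few instructions, no loops
    MSR  CPSR_c, r0        ; restore the previous mask; pending interrupts then fire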
Binary semaphores
When the shared resource must be reserved without blocking all other tasks (such as waiting for Flash memory to be written), it is better to use mechanisms also available on general-purpose operating systems, such as semaphores and OS-supervised interprocess messaging. Such mechanisms involve system calls, and usually invoke the OS's dispatcher code on exit, so they typically take hundreds of CPU instructions to execute, while masking interrupts may take as few as one instruction on some processors.
In priority inversion, a high-priority task waits because a low-priority task holds a semaphore, but the lower-priority task is not given CPU time to finish its work. A typical solution is to have the task that owns the semaphore run at, or 'inherit', the priority of the highest waiting task. But this simple approach fails when there are multiple levels of waiting: task A waits for a binary semaphore locked by task B, which waits for a binary semaphore locked by task C. Handling multiple levels of inheritance without introducing instability in cycles is complex and problematic.
In a deadlock, two or more tasks lock semaphores without timeouts and then
wait forever for the other task's semaphore, creating a cyclic dependency. The simplest
deadlock scenario occurs when two tasks alternately lock two semaphores, but in the opposite
order. Deadlock is prevented by careful design or by having floored semaphores, which pass
control of a semaphore to the higher priority task on defined conditions.
Message passing
Memory Management
Two types of memory management are provided in an RTOS: stack and heap. Stack management is used during context switching for TCBs. Memory other than the memory used for program code, program data, and the system stack is called heap memory, and it is used for dynamic allocation of data space for tasks. Management of this memory is called heap management. Memory allocation is more critical in a real-time operating system than in other operating systems.
First, for stability, there cannot be memory leaks (memory that is allocated, then unused, but never freed). The device should work indefinitely, without ever needing a reboot. For this reason, dynamic memory allocation is frowned upon. Whenever possible, allocation of all required memory is specified statically at compile time.
Memory management
Among other things, a multiprogramming operating system kernel must be responsible for
managing all system memory which is currently in use by programs. This ensures that a
program does not interfere with memory already in use by another program. Since programs
time share, each program must have independent access to memory.
Cooperative memory management, used by many early operating systems, assumes that all
programs make voluntary use of the kernel's memory manager, and do not exceed their
allocated memory. This system of memory management is almost never seen any more, since
programs often contain bugs which can cause them to exceed their allocated memory. If a
program fails, it may cause memory used by one or more other programs to be affected or
overwritten. Malicious programs or viruses may purposefully alter another program's memory,
or may affect the operation of the operating system itself. With cooperative memory
management, it takes only one misbehaved program to crash the system.
Memory protection enables the kernel to limit a process' access to the computer's memory.
Various methods of memory protection exist, including memory
segmentation and paging. All methods require some level of hardware support
(such as the 80286 MMU), which doesn't exist in all computers. In both segmentation and
paging, certain protected-mode registers specify to the CPU what memory addresses it should
allow a running program to access. Attempts to access other addresses trigger an interrupt,
which causes the CPU to re-enter supervisor mode, placing the kernel in charge. This is called
a segmentation violation, or Seg-V for short; since it is difficult to assign a
meaningful result to such an operation, and because it is usually a sign of a misbehaving
program, the kernel generally resorts to terminating the offending program and reports the
error. Windows versions 3.1 through ME had some level of memory protection, but programs
could easily circumvent it. A general protection fault would be produced,
indicating a segmentation violation had occurred; however, the system would often crash
anyway.
Virtual memory
Many operating systems can "trick" programs into using memory scattered around the hard
disk and RAM as if it were one continuous chunk of memory, called virtual memory. The use of
virtual memory addressing (such as paging or segmentation) means that the kernel can choose
what memory each program may use at any given time, allowing the operating system to use
the same memory locations for multiple tasks. If a program tries to access memory that isn't in
its current range of accessible memory, but nonetheless has been allocated to it, the kernel is
interrupted in the same way as it would be if the program were to exceed its allocated memory
(see the section on memory management). Under UNIX this kind of interrupt is referred to as a
page fault. When the kernel detects a page fault, it generally adjusts the virtual memory range of
the program that triggered it, granting it access to the memory requested. This gives the
kernel discretionary power over where a particular application's memory is stored, or even
whether or not it has actually been allocated yet. In modern operating systems, memory that
is accessed less frequently can be temporarily stored on disk or other media to make that space
available for use by other programs. This is called swapping, as an area of memory can be used
by multiple programs, and what that memory area contains can be swapped or exchanged on
demand.
"Virtual memory" provides the programmer or the user with the perception that
there is a much larger amount of RAM in the computer than is really there
Timer Management
Interrupt and event handling
An RTOS provides various functions for interrupt and event handling, viz.,
defining interrupt handlers, creation and deletion of ISRs, referencing the state of an ISR,
enabling and disabling of interrupts, etc. It also restricts interrupts from occurring when
modifying a data structure, minimizes interrupt latencies due to the disabling of interrupts
while the RTOS is performing critical operations, and minimizes interrupt response times.
Interrupts are central to operating systems, as they provide an efficient way for
the operating system to interact with and react to its environment. The alternative – having the
operating system "watch" the various sources of input for events (polling) that require action –
can be found in older systems with very small stacks (50 or 60 bytes) but is unusual in modern
systems with large stacks. Interrupt-based programming is directly supported by most modern
CPUs. Interrupts provide a computer with a way of automatically saving local register
contexts, and running specific code in response to events. Even very basic computers support
hardware interrupts, and allow the programmer to specify code which may be run when that
event takes place.
A program may also trigger an interrupt to the operating system. If a program
wishes to access hardware, for example, it may interrupt the operating system's kernel, which
causes control to be passed back to the kernel. The kernel then processes the request. If a
program wishes additional resources (or wishes to shed resources) such as memory, it triggers
an interrupt to get the kernel's attention.
Most interrupt routines copy peripheral data into a buffer and indicate to other
code that data has arrived.
RTKernel-32 uses the timer interrupt to activate tasks waiting for a certain point
in time. Module RTCom provides interrupt support for serial ports. This section discusses how
to implement a handler for any interrupt source.
Interrupt handlers usually have little stack space. Therefore, interrupt handlers
should be very economical in their stack usage (e.g., they should refrain from using
stack-hungry functions such as sprintf).
While an interrupt handler is active, no other interrupts with lower priorities can
be processed. Therefore, it is important to minimize the execution times of interrupt handlers,
because otherwise the interrupt response time for other interrupts might suffer. The handler
should avoid any processing not immediately required and delegate it to a task.
(see section Module RTKernel-32, Interrupt Handling). While the handler is being executed, the
scheduler is disabled; thus, the handler need not consider being disrupted by a task switch (it
can, however, be interrupted by interrupts of higher priority). Since the scheduler is disabled,
interrupt handlers must not force blocking task switches, and they should not use more than
512 bytes of stack (the less, the better).
Interrupts can be asynchronous (hardware interrupts), triggered by a hardware event; here the
interrupt handler runs as a separate task in a different context. They can also be synchronous,
triggered by a software instruction or fault such as a divide by zero or a memory
segmentation fault; here the interrupt handler runs in the context of the interrupting task.
Interrupt latency
Interrupt latency is the time delay between the arrival of an interrupt and the start of the
corresponding ISR. On modern processors, the multiple levels of cache and the instruction
pipeline that need to be reset before the ISR can start can result in longer latency.
Since an interrupt handler blocks the highest priority task from running, and since real time
operating systems are designed to keep thread latency to a minimum, interrupt handlers are
typically kept as short as possible. The interrupt handler defers all interaction with the hardware
if possible; typically all that is necessary is to acknowledge or disable the interrupt (so that it
won't occur again when the interrupt handler returns) and notify a task that work needs to be
done. This can be done by unblocking a driver task through releasing a semaphore, setting a
flag or sending a message. A scheduler often provides the ability to unblock a task from
interrupt handler context.
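A minimal sketch of this pattern, using a POSIX semaphore for illustration (an RTOS would provide its own ISR-safe signalling call; the peripheral read is a placeholder):

#include <semaphore.h>

sem_t data_ready;                /* initialised elsewhere with sem_init() */
volatile unsigned char rx_byte;  /* datum captured from the peripheral */

/* Interrupt handler: do the minimum, then hand off to a task. */
void uart_isr(void)
{
    rx_byte = 0x42;              /* placeholder for reading the data register */
    /* acknowledge/disable the interrupt source here so it does not
       fire again as soon as the handler returns */
    sem_post(&data_ready);       /* unblock the driver task */
}

/* Driver task: performs the real processing at task priority. */
void driver_task(void)
{
    for (;;) {
        sem_wait(&data_ready);   /* block until the ISR signals */
        /* process rx_byte: parse, buffer, wake clients, ... */
    }
}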
10.Device I/O Management
A device driver is operating-system-specific software that enables another program, typically
the operating system, an application package, or a computer program running under the
operating system kernel, to interact transparently with a hardware device; it usually provides
the interrupt handling necessary for asynchronous, time-dependent hardware interfacing.
The key design goal of device drivers is abstraction. Every model of hardware
(even within the same class of device) is different. Manufacturers also release newer models
that provide more reliable or better performance, and these newer models are
often controlled differently. Computers and their operating systems cannot be expected to
know how to control every device, both now and in the future. To solve this problem,
operating systems essentially dictate how every type of device should be controlled. The
function of the device driver is then to translate these operating system mandated function
calls into device specific calls. In theory a new device, which is controlled in a new manner,
should function correctly if a suitable driver is available. This new driver ensures that the
device appears to operate as usual from the operating system's point of view.
Under versions of Windows before Vista and versions of Linux before 2.6, all
driver execution was co-operative, meaning that if a driver entered an infinite loop it would
freeze the system. More recent revisions of these operating systems incorporate kernel
preemption, where the kernel interrupts the driver to give it tasks, and then separates itself
from the process until it receives a response from the device driver, or gives it more tasks to
do.
11.NETWORKING
Networking allows the resources of one computer to be used by another, such as
using networked file systems or even sharing another computer's graphics or
sound hardware. Some network services allow the resources of a computer to be accessed
transparently, such as SSH, which allows networked users direct access to a computer's
command line interface.
12.POPULAR RTOS
There are a number of commercially available RTOSs, each with some distinct
features and targeted at a specific set of applications. The following table lists some of the
widely used commercially available RTOSs.
RTOS              Applications/Features
Windows CE        Used for small-footprint, mobile, and connected devices; supported on ARM, MIPS, SH4, and x86 architectures
VxWorks           The most widely adopted RTOS in the embedded industry; used in the famous NASA rover robots Spirit and Opportunity
Micrium µC/OS-II  Ported to more than a hundred architectures including x86; mainly used in microcontrollers with low resources
VRTX              Supported on ARM, MIPS, PowerPC, and other RISC architectures
Multitasking
o Tasks
o Timers
o Time-triggered tasks
Synchronisation
o Semaphores
o Mutexes
o Flags
Intertask Communication
o Mailboxes
o Signals
Error Detection
Interrupt Services
Memory Allocation
Event Logging
Statistics
Multitasking is a way of letting several different execution units, or tasks, share a single
processor so that all tasks can be said to run in parallel. In reality the RTOS scheduler chooses
which task it should run according to the scheduling policy, but because tasks can be swapped
in and out of the processor at a high rate, the illusion of parallelism occurs.
Timers are a simpler form of tasks that are invoked periodically by the RTOS, as defined by
the application.
Time-triggered tasks are tasks that are activated according to cyclic schedules defined by
the application. Time-triggered tasks have priority over normal tasks. The schedule
defines a set of time-triggered tasks, along with activation points and deadlines for each task.
The RTOS monitors time-triggered tasks for missed deadlines.
The rt-kernel embedded RTOS performs error detection when running in the RTOS kernel.
Unlike many traditional embedded RTOSs, rt-kernel services do not return error codes. Instead,
a common error handling routine is invoked whenever the RTOS kernel detects an error
condition. This simplifies application code, as the tedious and error-prone checking of return
values is eliminated.
Interrupt services are for the most part provided by the architecture layers of each processor.
The RTOS provides a unified mechanism for enabling, disabling, and attaching to interrupts,
thereby making it possible to reuse drivers regardless of the system they were originally
written for.
The RTOS supports dynamic memory allocation. It can also be configured for static memory
allocation.
An optional event logging mechanism is provided. All application interactions with the RTOS
kernel can be logged. A host tool is used to present the collected data. This is normally used
during debugging, when it can be a great help to visualise the behaviour of the real-time
system.
The RTOS can optionally continually monitor the processor load to provide some run-time
statistics. There is also functionality for measuring the processor load over a given block of
code.
The RTOS provides the functionality needed for creating a dynamic real-time
system. Tasks and all kernel objects can be created and destroyed at run-time. This provides a
lot of flexibility for many types of applications. A server could, for instance, create a new task
to handle each new session. When the session finishes, the task and all its resources can be
returned to the system.
The drawback of a completely dynamic system is the memory fragmentation problem. The
memory is said to be fragmented when a request to allocate memory fails because there is no
contiguous memory area large enough to satisfy the request, even though there is enough free
memory in the memory heap. The dynamic memory allocation algorithm used by the rt-kernel
RTOS is not particularly prone to memory fragmentation, but for some applications it can
nevertheless be an issue. For this type of application the RTOS also provides a static memory
heap.
An application using the static memory heap must create all resources (tasks,
semaphores, etc.) when the system starts up. No resources should be dynamically created when
the system is running. Likewise, no resources should be destroyed. The static memory heap
will not reuse any memory that is returned to it. Fragmentation will not be an issue because,
once the system is running, no extra memory will be allocated from the heap. The heap can be
further subdivided into pools suitable for allocation of fixed-size messages. This allows
dynamic allocation of data for signals and mailboxes even in an otherwise completely static
system. A default pool for signals is allocated when the system boots.
Time-triggered Tasks
Figure 2.23 RTOS processing levels
13.Timers
A timer is a function that is called by the RTOS kernel at intervals defined by the application.
The timer function can be called periodically or one time only. The timer can be thought of as a
simplified task that may be invoked periodically. The RTOS will call the timer function from
interrupt context. This means that timers are subject to the same restrictions as interrupt service
routines. In particular, the timer function must not call any function that may block, such as
sem_wait(), mtx_lock(), and others. See Interrupt Services for further details on interrupt
services. Timers offer a simple, low-overhead mechanism for periodic execution compared to
tasks. However, a timer is more restricted than a task. Timers and tasks can work in unison; a
timer can, for instance, perform some initial work and then signal a task to perform the
remainder of the work from task context instead.
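A sketch of that division of labour. The registration call tmr_create() is a hypothetical name, and sem_wait()/sem_post() stand in for the RTOS signalling primitives mentioned above:

#include <semaphore.h>

sem_t sample_tick;   /* released once per timer period */

/* Timer callback: runs in interrupt context, so it must not block --
   no sem_wait(), no mtx_lock(). It only signals the task. */
void sample_timer_cb(void *arg)
{
    (void)arg;
    sem_post(&sample_tick);
}

/* Task: does the work that may block, at task priority. */
void sampling_task(void)
{
    for (;;) {
        sem_wait(&sample_tick);
        /* read sensor, filter, log ... (may block safely here) */
    }
}

/* Registration might look like this (hypothetical API):
   tmr_create(sample_timer_cb, NULL, PERIOD_TICKS, TMR_PERIODIC); */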
Semaphores: Semaphores are used for synchronisation in real-time systems. A
semaphore is essentially a counter with atomic updates. The value of the counter determines
whether the semaphore is available. In order to proceed, a task using the semaphore must first
read, then write the counter; the RTOS guarantees that access to the counter is atomic. A
semaphore can be used to guard access to shared resources. The semaphore is initialised to the
number of resources it protects. This type of semaphore is called a counting semaphore. A task
trying to take the semaphore will be blocked if the value of the counter is less than 1, indicating
that there are no free resources; otherwise it will decrease the counter and proceed. When it has
finished with the resource, it signals the semaphore, and in doing so it increases the value of
the counter and unblocks the first task that may have been blocked on the semaphore. A
semaphore that can only have the values 1 and 0 is a binary semaphore and can be used to
implement mutual exclusion; however, mutexes are optimised for that type of operation and
should be used instead.
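A counting-semaphore sketch in C, using POSIX semaphores for illustration (the RTOS calls would differ in name but not in shape):

#include <semaphore.h>
#include <stdio.h>

#define NUM_BUFFERS 3

sem_t buffer_pool;   /* counting semaphore guarding NUM_BUFFERS resources */

void use_buffer(void)
{
    sem_wait(&buffer_pool);  /* blocks if counter < 1; else decrements */
    /* ... exclusive use of one buffer from the pool ... */
    sem_post(&buffer_pool);  /* increments counter, unblocks a waiter */
}

int main(void)
{
    sem_init(&buffer_pool, 0, NUM_BUFFERS); /* one count per resource */
    use_buffer();
    printf("buffer released\n");
    return 0;
}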
Mutexes: Mutexes are binary semaphores optimised for mutual exclusion. They are typically
used to guard a critical region in an application against simultaneous execution by multiple
tasks.
Mailboxes: Mailboxes are RTOS kernel objects that can hold messages to be delivered
between tasks. Mailboxes have a finite size; the size is configured when the mailbox is
created. A task that tries to post a message to a mailbox that is full will be blocked. A task that
tries to fetch from a mailbox that is empty will also be blocked. A mailbox can hold any type
of message. The message is just a pointer to a data structure. All tasks that access the mailbox
must agree on the representation of the data. The RTOS transfers the value of the pointer
between the posting and fetching tasks; the message itself is not copied. The posting task
must not use the message after posting it to the mailbox. The fetching task should free the
message if it was dynamically allocated.
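The ownership rule is the essential point. A self-contained sketch using a one-slot mailbox built on POSIX primitives (a real RTOS mailbox is a kernel object with a configurable size; mbx_post/mbx_fetch are illustrative names):

#include <pthread.h>

typedef struct {
    void *slot;                 /* only the pointer is stored, never a copy */
    pthread_mutex_t mu;
    pthread_cond_t  cv;
} mbx_t;

void mbx_post(mbx_t *m, void *msg)       /* blocks while the mailbox is full */
{
    pthread_mutex_lock(&m->mu);
    while (m->slot != NULL) pthread_cond_wait(&m->cv, &m->mu);
    m->slot = msg;                        /* ownership passes to the fetcher */
    pthread_cond_broadcast(&m->cv);
    pthread_mutex_unlock(&m->mu);
}

void *mbx_fetch(mbx_t *m)                /* blocks while the mailbox is empty */
{
    pthread_mutex_lock(&m->mu);
    while (m->slot == NULL) pthread_cond_wait(&m->cv, &m->mu);
    void *msg = m->slot;
    m->slot = NULL;
    pthread_cond_broadcast(&m->cv);
    pthread_mutex_unlock(&m->mu);
    return msg;                           /* fetcher frees it if it was malloc'd */
}

After mbx_post() returns, the posting task must treat the message as gone; the fetching task calls free() on it when finished, if it was dynamically allocated.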
Signals: Signals are messages that can be sent directly from task to task. Unlike mailboxes,
there is no need to provide an RTOS kernel object to hold undelivered messages. Signals can
represent any kind of data structure. Each type of signal is associated with a number. The
number is chosen by the application when the signal is created. The number should be unique,
so that no two types of signals share the same number. When a task receives a signal, it can
decide what course of action to take based on the number identifying the signal type. Signals
can be filtered. A task can choose to receive only certain types of signals. Signals that are sent
to the task while the filter is being used will be kept in a queue and can be received later. This
mechanism can, for instance, be used in a subroutine to deal only with the types of signals that
are of interest to the subroutine. Signals that were delivered while in the subroutine can be
received by the main task when execution returns from the subroutine. The type of the signal
must be defined so that the first member of the signal data structure is the signal number. This
is the only information about the signal that is of interest to the RTOS. The number will be
used to match against the filter if one has been applied by the receiving task. The signal
number 0 is used to terminate filter lists and is therefore reserved.
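A sketch of the layout just described (the names and numbers are illustrative; only the position of the signal number matters to the RTOS):

#define SIG_TEMPERATURE 1   /* application-chosen, unique, non-zero */
#define SIG_SHUTDOWN    2

struct temperature_sig {
    int   sig_no;           /* must be the first member: the RTOS matches
                               this against any installed receive filter */
    float degrees_c;        /* application payload follows the number */
};

/* A receive filter: accept only temperature signals; the reserved
   number 0 terminates the list. */
int temp_filter[] = { SIG_TEMPERATURE, 0 };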
Error Detection
Unlike many traditional embedded RTOSs, rt-kernel services do not return error codes.
Errors that are detected by the RTOS kernel are, with few exceptions, fatal errors. There is
very little the application can do to handle the error gracefully. In a production system,
the only possible course of action is often to reset the system and start over. When the
RTOS detects an error, it calls a common error handler. The default error handler will halt
the system in a busy loop. This is normally used during development, when a debugger is
used to load and run code. If an error occurs, the debugger will be in the busy loop, the
error code can be inspected, and the debugger backtrace function can be used to find out
exactly where in the application the RTOS detected the error. Alternatively, the
application can install its own error handler. The error handler can attempt to handle the
error. For instance, if an out-of-memory error was detected, the application could attempt
to free memory if it is known that some memory area can be safely deallocated. If there is
no safe way to handle the error, the application should reset the system. The approach
taken by the rt-kernel RTOS also has the beneficial side effect that the application does
not have to check return values from RTOS services. This is a tedious and error-prone
procedure that can lead to errors going undetected.
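The pattern might look like the following sketch; err_set_handler(), ERR_NO_MEM, and system_reset() are assumed names, since the actual registration call and error codes are specific to the RTOS:

typedef void (*err_handler_t)(int code);

/* Assumed RTOS/board hooks, stubbed so the sketch is self-contained. */
static err_handler_t current_handler;
static void err_set_handler(err_handler_t h) { current_handler = h; }
static void system_reset(void) { for (;;) ; /* watchdog or reset register */ }

#define ERR_NO_MEM 1   /* illustrative error code */

static void my_error_handler(int code)
{
    if (code == ERR_NO_MEM) {
        /* attempt recovery: release a memory area known to be safe to free */
        return;
    }
    system_reset();    /* no safe recovery: restart the system */
}

void app_init(void)
{
    err_set_handler(my_error_handler);  /* replaces the default busy loop */
}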
Interrupt Services
Interrupt Service Routines (ISRs) are subroutines that are called by the
RTOS kernel to handle interrupts. Each Interrupt Request Line (IRQ) can be mapped to
an ISR that handles the interrupt. The rt-kernel embedded RTOS supports nested
interrupts, i.e., interrupts of higher priority can preempt lower-priority interrupts. A
dedicated interrupt stack is used to store the state of nested interrupts. The application
should call int_connect() to install the interrupt service routine. The RTOS will store the
address of the ISR in an internal table. When the interrupt occurs, the RTOS kernel will
first swap out the currently running task, then call the ISR. Lower-level interrupts are
disabled while the ISR is running. To maintain a low interrupt latency for the system, it
is important that all interrupts are handled as quickly as possible. A common design
pattern for complex peripherals is to let the ISR clear the interrupt source, then notify a
task that handles the higher-level processing of the interrupt.
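That pattern, using the int_connect() call named above (its exact signature is an assumption here: IRQ number, handler, argument), with the heavy work deferred to a task:

#include <semaphore.h>

extern int int_connect(int irq, void (*isr)(void *), void *arg); /* assumed signature */

#define DMA_IRQ 5            /* illustrative IRQ number */
sem_t dma_done;              /* signalled by the ISR, awaited by a task */

static void dma_isr(void *arg)
{
    (void)arg;
    /* 1. clear the interrupt source in the peripheral -- quickly */
    /* 2. notify the task that does the higher-level processing */
    sem_post(&dma_done);
}

void dma_init(void)
{
    int_connect(DMA_IRQ, dma_isr, 0);  /* RTOS stores the ISR address
                                          in its internal table */
}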
Memory Allocation
The rt-kernel embedded RTOS supports dynamic memory allocation from a heap. The
standard C memory allocation functions malloc() and free() are supported. The malloc
functions execute with interrupts locked and are therefore thread-safe. The heap is
created when the system boots and will fill the available RAM.
Memory Pools
The RTOS memory pools support allocation of fixed-size messages. They are primarily
intended for allocating signal and mailbox payloads, but can be used to allocate any
object. The sig_create() call allocates signals from the default signal pool, which is
created when the system boots. A memory pool can be used to allocate messages of up to
8 user-definable sizes. The buffer that is returned will be of the smallest available size
that will hold the requested number of bytes.
Event Logging
All RTOS kernel events can be logged. An event is in this context defined as any
interaction with the RTOS kernel. Examples of events are tasks being swapped in and out,
calls to RTOS kernel services, interrupts occurring, etc. User-defined events can also be
logged. Event logging is useful as a complement to traditional debugging tools. The event
log offers insight into how the system behaves over a period of time, which traditional
debuggers cannot do. It should be noted that there is an overhead associated with
collecting the log data. This may cause systems running under tight margins to behave
differently when the events are being logged. Event logging can be completely disabled by
setting the size of the event log buffer to zero. Event logging is always disabled
initially, and must be enabled by calling log_enable(). Note that currently the rt-kernel
RTOS is only shipped with event logging compiled in. This increases the size of
the RTOS libraries. Contact rt-labs if you have no need for event logging and prefer
the space savings instead.
Board Support Packages
The Board Support Package is responsible for configuring the board and initialising the
RTOS. The BSP will normally contain an assembly file that sets up the board so that the
RTOS can run, a timer driver, and a driver for the interrupt controller.
Boot Sequence
The boot sequence is the time from power-on until the RTOS starts
executing the first task.
Boot sequence
The target's reset vector should be mapped so that it starts to execute the function
_start in the assembler file crt0.S. This function is responsible for setting up the embedded
target to a point where it can execute C code. At a minimum, this consists of:
Disabling interrupts
Configuring a stack for the RTOS kernel
Copying the data section to RAM (if running from ROM)
Clearing the BSS
Jumping to the RTOS kernel init function
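A C-level sketch of those duties follows (in practice the first steps live in crt0.S itself; the linker symbols _sidata/_sdata/_edata/_sbss/_ebss and kernel_init() are assumptions that depend on the toolchain and RTOS):

extern unsigned int _sidata[];            /* load address of .data (ROM) */
extern unsigned int _sdata[], _edata[];   /* run address of .data (RAM) */
extern unsigned int _sbss[],  _ebss[];    /* bounds of .bss (RAM)       */
extern void kernel_init(void);            /* RTOS kernel entry point    */

void _start(void)
{
    /* interrupts are masked and the kernel stack pointer is set up in
       assembly before this point is reached */
    unsigned int *src = _sidata, *dst = _sdata;
    while (dst < _edata) *dst++ = *src++;         /* copy .data ROM -> RAM */
    for (dst = _sbss; dst < _ebss; ) *dst++ = 0;  /* clear the BSS */
    kernel_init();                                /* jump to the RTOS kernel */
}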
Question Bank
SCHOOL OF ELECTRICAL & ELECTRONICS ENGINEERING
UNIT-IV
EMBEDDED SYSTEM DESIGN– SBMA5201
Embedded Networks
1.Introduction
In a distributed embedded system, several processing elements (PEs) (either microprocessors
or ASICs) are connected by a network that allows them to communicate. The application is
distributed over the PEs, and some of the work is done at each node in the network. There are
several reasons to build network-based embedded systems. When the processing tasks are
physically distributed, it may be necessary to put some of the computing power near where
the events occur.
Consider, for example, an automobile: the short time delays required for tasks such as engine
control generally mean that at least parts of the task are done physically close to the engine.
Data reduction is another important reason for distributed processing. It may be possible to
perform some initial signal processing on captured data to reduce its volume—for example,
detecting a certain type of event in a sampled data stream. Reducing the data on a separate
processor may significantly reduce the load on the processor that makes use of that data.
Modularity is another motivation for network-based design. For instance, when a large
system is assembled out of existing components, those components may use a network port
as a clean interface that does not interfere with the internal operation of the component in
ways that using the microprocessor bus would. A distributed system can also be easier to
debug: the microprocessors in one part of the network can be used to probe components in
another part of the network. Finally, in some cases, networks are used to build fault tolerance
into systems. Distributed embedded system design is another example
of hardware/software co-design, since we must design the network topology as
well as the software running on the network nodes. Of course, the microprocessor bus is a
simple type of network. However, we use the term network to mean an interconnection
scheme that does not provide shared-memory communication.
Building an embedded system with several PEs talking over a network is definitely
more complicated than using a single large microprocessor to perform the same tasks. So why
would anyone build a distributed embedded system? All the reasons for designing accelerator
systems also apply to distributed embedded systems, and several more reasons are unique to
distributed systems.
In some cases, distributed systems are necessary because the devices that the PEs
communicate with are physically separated. If the deadlines for processing the data are short,
it may be more cost-effective to put the PEs where the data are located rather than build a
higher-speed network to carry the data to a distant, fast PE. An important advantage of a
distributed system with several CPUs is that one part of the system can be used to help
diagnose problems in another part. Whether you are debugging a prototype or diagnosing a
problem in the field, isolating the error to one part of the system can be difficult when
everything is done on a single CPU. If you have several CPUs in the system, you can use one
to generate inputs for another and to watch its output.
3.Network Abstractions
Networks are complex systems. Ideally, they provide high-level services while hiding
many of the details of data transmission from the other components in the system. In order to
help understand (and design) networks, the International Standards Organization has
developed a seven-layer model for networks known as the Open Systems Interconnection (OSI)
model [Sta97A]. Understanding the OSI layers will help us to understand the details of real
networks.
The seven layers of the OSI model, shown in Figure 4.2, are intended to cover a
broad spectrum of networks and their uses. Some networks may not need the services of one
or more layers because the higher layers may be totally missing or an intermediate layer may
not be necessary. However, any data network should fit into the OSI model. The OSI layers,
from lowest to highest level of abstraction, are described below.
■ Physical:
The physical layer defines the basic properties of the interface between systems,
including the physical connections ( plugs and wires), electrical properties, basic functions of
the electrical and physical components, and the basic procedures for exchanging bits.
■ Data link:
The primary purpose of this layer is error detection and control across a single
link. However, if the network requires multiple hops over several data links, the data link
layer does not define the mechanism for data integrity between hops, but only within a
single hop.
■ Network:
This layer defines the basic end-to-end data transmission service. The network layer is
particularly important in multihop networks.
■ Transport:
The transport layer defines connection-oriented services that ensure that data are
delivered in the proper order and without errors across multiple links. This layer may also try
to optimize network resource utilization.
■ Session:
A session provides mechanisms for controlling the interaction of end-user services
across a network, such as data grouping and checkpointing.
■ Presentation:
This layer defines data exchange formats and provides transformation utilities to application
programs.
■ Application:
The application layer provides the application interface between the network and end-
user programs.
Although it may seem that embedded systems would be too simple to require use of the OSI
model, the model is in fact quite useful. Even relatively simple embedded networks provide
physical, data link, and network services. An increasing number of embedded systems
provide Internet service that requires implementing the full range of functions in the OSI
model.
Figure 4.2 OSI model layers.
Distributed embedded systems can be organized in many different ways depending upon the
needs of the application and cost constraints. One good way to understand possible
architectures is to consider the different types of interconnection networks that can be used.
A point-to-point link establishes a connection between exactly two PEs. Point-to-point links
are simple to design precisely because they deal with only two components. We do not have
to worry about other PEs interfering with communication on the link.
Figure 4.3 shows a simple example of a distributed embedded system built from
point-to-point links: a signal processing system. The input signal is sampled
by the input device and passed to the first digital filter, F1, over a point-to-point link. The
results of that filter are sent through a second point-to-point link to filter F2. The results in
turn are sent to the output device over a third point-to-point link. A digital filtering system
requires that its outputs arrive at strict intervals, which means that the filters must process
their inputs in a timely fashion. Using point-to-point connections allows both F1 and F2 to
receive a new sample and send a new output at the same time without worrying about
collisions on the communications network.
Figure 4. 4 Format of a typical message on a bus.
A packet contains an address for the destination and the data to be delivered. It frequently
includes error detection/correction information such as parity. It also may include bits that
serve to signal to other PEs that the bus is in use, such as the header shown in the figure. The
data to be transmitted from one PE to another may not fit exactly into the size of the data
payload on the packet. It is the responsibility of the transmitting PE to divide its data into
packets; the receiving PE must of course reassemble the complete data message from the
packets.
Distributed system buses must be arbitrated to control simultaneous access, just as with
microprocessor buses. Arbitration scheme types are summarized below.
■ Fixed-priority arbitration always gives priority to competing devices in the same way. If a
high-priority and a low-priority device both have long data transmissions ready at the same
time, it is quite possible that the low-priority device will not be able to transmit anything until
the high-priority device has sent all its data packets.
A crossbar network (Figure 4.5) can connect any input to any output, and can carry several
such connections simultaneously. (Multicast connections can also be made from one input
to several outputs.) A crosspoint is a switch that connects an input to an output. To connect
an input to an output, we activate the crosspoint at the intersection between the
corresponding input and output lines in the crossbar. For example, to connect in2 and out3 in
the figure, we would activate crosspoint A as shown. The major drawback of the crossbar
network is expense: the size of the network grows as the square of the number of inputs
(assuming the numbers of inputs and outputs are equal).
Many other networks have been designed that provide varying amounts of parallel
communication at varying hardware costs. Figure 6 shows an example multistage network.
The crossbar of Figure 5 is a direct network in which messages go from source to
destination without going through any memory element. Multistage networks have
intermediate routing nodes to guide the data packets. Most networks are blocking, meaning
that there are some combinations of sources and destinations for which messages cannot be
delivered simultaneously.
A bus is a maximally blocking network since any message on the bus blocks messages from
any other node. A crossbar is non-blocking. In general, networks differ from microprocessor
buses in how they implement communication protocols. Both need handshaking to ensure
that PEs do not interfere with each other. But in most networks, most of the protocol is
performed in software. Microprocessors rely on bus hardware for fast transfers of instructions
and data to and from the CPU. Most embedded network ports on microprocessors implement
the basic communication functions (such as driving
the communications medium) in hardware and implement many other operations in software.
■ The Controller Area Network (CAN) bus was developed for automotive electronics. It
provides megabit rates and can handle large numbers of devices.
Figure 4.7 shows the structure of a typical I2C bus system.
A pull-up resistor keeps the default state of the signal high, and transistors are used in
each bus device to pull down the signal when a 0 is to be transmitted. Open-collector/open-drain
signaling allows several devices to simultaneously write the bus without causing
electrical damage. The open-collector/open-drain circuitry allows a slave device to stretch a
clock signal during a read from a slave. The master is responsible for generating the SCL
clock, but the slave can stretch the low period of the clock (but not the high period) if
necessary. The I2C bus is designed as a multimaster bus: any one of several different
devices may act as the master at various times. As a result, there is no global master to
generate the clock signal on SCL. Instead, a master drives both SCL and SDL when it is
sending data. When the bus is idle, both SCL and SDL remain high. When two devices try to
drive either SCL or SDL to different values, the open-collector/open-drain circuitry prevents
errors, but each master device must listen to the bus while transmitting to be sure that it is not
interfering with another message; if the device receives a different value than it is trying to
transmit, then it knows that it is interfering with another message.
Every I2C device has an address. The addresses of the devices are determined by the system
designer, usually as part of the program for the I2C driver. The addresses must of course be
chosen so that no two devices in the system have the same address. A device address is 7 bits
in the standard I2C definition (the extended I2C allows 10-bit addresses). The address
0000000 is used to signal a general call or bus broadcast, which can be used to signal all
devices simultaneously. The address 11110XX is reserved for the extended 10-bit addressing
scheme; there are several other reserved addresses as well.
A bus transaction is initiated by a start signal and completed with an end signal as follows:
■ A start is signalled by leaving the SCL high and sending a 1 to 0 transition on SDL.
■ A stop is signalled by setting the SCL high and sending a 0 to 1 transition on SDL.
However, starts and stops must be paired. A master can write and then read (or read and then
write) by sending a start after the data transmission, followed by another address transmission
and then more data. The basic state transition graph for the master's actions in a bus
transaction is shown in Figure 4.10. The formats of some typical complete bus transactions
are shown in Figure 4.11. In the first example, the master writes 2 bytes to the addressed
slave. In the second, the master requests a read from a slave. In the third, the master writes
1 byte to the slave, and then sends another start to initiate a read from the slave.
Figure 4.12 shows how a data byte is transmitted on the bus, including start and stop events.
The transmission starts when SDL is pulled low while SCL remains high.
Figure 4.10 State transition graph for an I2C bus master.
After this start condition, the clock line is pulled low to initiate the data transfer. At each bit,
the clock line goes high while the data line assumes its proper value of 0 or 1. An
acknowledgment is sent at the end of every 8-bit transmission, whether it is an address or
data. For acknowledgment, the transmitter does not pull down the SDL, allowing the receiver
to set the SDL to 0 if it properly received the byte. After acknowledgment, the SDL goes
from low to high while the SCL is high, signalling the stop condition.
The bus uses this feature to arbitrate on each message. When sending, devices listen to the
bus as well. If a device is trying to send a logic 1 but hears a logic 0, it immediately stops
transmitting and gives the other sender priority. (The devices should be designed so that they
can stop transmitting in time to allow a valid bit to be sent.) In many cases, arbitration will be
completed during the address portion of a transmission, but arbitration may continue into the
data portion. If two devices are trying to send identical data to the same address, then of
course they never interfere and both succeed in sending their message.
The I2C interface on a microcontroller can be implemented with varying percentages of the
functionality in software and hardware [Phi89]. As illustrated in Figure 13, a typical system
has a 1-bit hardware interface with routines for byte-level functions. The I2C device takes
care of generating the clock and data. The application code calls routines to send an address,
send a data byte, and so on, which then generate the SCL and SDL, acknowledges, and so
forth. One of the microcontroller's timers is typically used to control the length of bits on the
bus. Interrupts may be used to recognize bits. However, when used in master mode, polled
I/O may be acceptable if no other pending tasks can be performed, since masters initiate
their own transfers.
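To make the division of labour concrete, here is a sketch of a bit-banged master write in C. The pin helpers set_scl(), set_sda(), read_sda(), and bit_delay() are hypothetical, and clock stretching and arbitration checks are omitted:

void set_scl(int v);   /* drive (or release, for v=1) the clock line */
void set_sda(int v);   /* drive (or release, for v=1) the data line  */
int  read_sda(void);   /* sample the data line                       */
void bit_delay(void);  /* half-period delay, e.g. from a timer       */

void i2c_start(void)                 /* SDA 1->0 while SCL stays high */
{
    set_sda(1); set_scl(1); bit_delay();
    set_sda(0); bit_delay();
    set_scl(0);                      /* pull clock low to begin transfer */
}

int i2c_write_byte(unsigned char b)  /* returns 1 if the slave ACKed */
{
    for (int i = 7; i >= 0; i--) {   /* MSB first */
        set_sda((b >> i) & 1);
        set_scl(1); bit_delay();     /* data must be stable while SCL high */
        set_scl(0);
    }
    set_sda(1);                      /* release SDA for the ACK bit */
    set_scl(1); bit_delay();
    int ack = (read_sda() == 0);     /* receiver pulls SDA low to ACK */
    set_scl(0);
    return ack;
}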
6.Ethernet
Ethernet is very widely used as a local area network for general-purpose computing.
Because of its ubiquity and the low cost of Ethernet interfaces, it has seen significant use as a
network for embedded computing. Ethernet is particularly useful when PCs are used as
platforms, making it possible to use standard components, and when the network does not
have to meet rigorous real-time requirements. The physical organization of an Ethernet is
very simple, as shown in Figure 4.14.
The network is a bus with a single signal path; the Ethernet standard allows for
several different implementations such as twisted pair and coaxial cable. Unlike the I2C bus,
nodes on the Ethernet are not synchronized—they can send their bits at any time. I2C relies
on the fact that a collision can be detected and quashed within a single bit time thanks to
synchronization. But since Ethernet nodes are not synchronized, if two nodes decide to
transmit at the same time, the message will be ruined. The Ethernet arbitration scheme is
known as Carrier Sense Multiple Access with Collision Detection (CSMA/CD).
The algorithm is outlined in Figure 4.15. A node that has a message waits for the bus
to become silent and then starts transmitting. It simultaneously listens, and if it hears another
transmission that interferes with its transmission, it stops transmitting and waits to retransmit.
The waiting time is random, but weighted by an exponential function of the number of times
the message has been aborted. Figure 4.16 shows the exponential backoff function both before
and after it is modulated by the random wait time. Since a message may be interfered with
several times before it is successfully transmitted, the exponential backoff technique helps to
ensure that the network does not become overloaded at high demand factors. The random
factor in the wait time minimizes the chance that two messages will repeatedly interfere with
each other. The maximum length of an Ethernet is determined by the nodes' ability to detect
collisions. The worst case occurs when two nodes at opposite ends of the bus are transmitting
simultaneously. For the collision to be detected by both nodes, each node's signal must be
able to travel to the opposite end of the bus so that it can be heard by the other node. In
practice, Ethernets can run up to several hundred meters.
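The backoff computation itself is simple. A sketch of the classic truncated binary exponential backoff (the 51.2 µs slot time is that of 10 Mb/s Ethernet; after the n-th collision a node waits a random number of slots in [0, 2^k - 1], with k capped at 10):

#include <stdlib.h>
#include <stdio.h>

#define SLOT_TIME_US 51.2   /* one slot time at 10 Mb/s */

double backoff_us(int collisions)
{
    int k = collisions < 10 ? collisions : 10;  /* truncate the exponent */
    int slots = rand() % (1 << k);              /* random, weighted by an
                                                   exponential of the count */
    return slots * SLOT_TIME_US;
}

int main(void)
{
    for (int n = 1; n <= 5; n++)
        printf("after collision %d: wait %.1f us\n", n, backoff_us(n));
    return 0;
}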
Figure 4.16 Exponential backoff times.
Figure 4.17 shows the basic format of an Ethernet packet. It provides addresses of both the
destination and the source. It also provides for a variable-length data payload. The fact that it
may take several attempts to successfully transmit a message, and that the waiting time
includes a random factor, makes Ethernet performance difficult to analyze. It is possible to
perform data streaming and other real-time activities on Ethernets, particularly when the total
network load is kept to a reasonable level, but care must be taken in designing such systems.
Ethernet was not designed to support real-time operations; the exponential backoff scheme
cannot guarantee the delivery time of any data. Because so much Ethernet hardware and
software is available, many different approaches have been developed to extend Ethernet to
real-time operation; some of these are compatible with the standard while others are not. As
Decotignie points out [Dec05], there are three ways to reduce the variance in Ethernet's packet
delivery time: suppress collisions on the network, reduce the number of collisions, or resolve
collisions deterministically.
Felser [Fel05] describes several real-time Ethernet architectures.
7.Internet
The Internet Protocol (IP) is the fundamental protocol of the Internet. It provides
connectionless, packet-based communication. Industrial automation has long been a good
application area for Internet-based embedded systems. Information appliances that use the
Internet are rapidly becoming another use of IP in embedded computing. The Internet
Protocol is not defined over a particular physical implementation; it is an internetworking
standard. Internet packets are assumed to be carried by some other network, such as an
Ethernet. In general, an Internet packet will travel over several different networks from
source to destination. IP allows data to flow seamlessly through these networks from one end
user to another. The relationship between IP and individual networks is illustrated in Figure 19.
IP works at the network layer. When node A wants to send data to node B, the
application's data pass through several layers of the protocol stack to reach the IP layer. IP
creates packets for routing to the destination, which are then sent to the data link and physical
layers.
A node that transmits data among different types of networks is known as a router.
The router's functionality must go up to the IP layer, but since it is not running applications,
it does not need to go to higher levels of the OSI model. In general, a packet may go through
several routers to get to its destination. At the destination, the IP layer provides data to the
transport layer and ultimately the receiving application. As the data pass through several
layers of the protocol stack, the IP packet data are encapsulated in packet formats appropriate
to each layer. The basic format of an IP packet is shown in Figure 4.20. The header and data
payload are both of variable length. The maximum total length of the header and data payload
is 65,535 bytes.
An Internet address is a number (32 bits in early versions of IP, 128 bits in IPv6). The
IP address is typically written in the form xxx.xx.xx.xx. The names by which users and
applications typically refer to Internet nodes are translated into IP addresses via calls to a
Domain Name Server, one of the higher-level services built on top of IP.
Figure 4.20 IP packet structure.
The fact that IP works at the network layer tells us that it does not
guarantee that a packet is delivered to its destination. Furthermore, packets that do arrive may
come out of order. This is referred to as best-effort routing. Since routes for data may
change quickly, with subsequent packets being routed along very different paths with different
delays, the real-time performance of IP can
be hard to predict. When a small network is contained totally within the embedded system,
performance can be evaluated through simulation or other methods because the possible
inputs are limited. Since the performance of the Internet may depend on worldwide usage
patterns, its real-time performance is inherently harder to predict. The Internet also provides
higher-level services built on top of IP. The Transmission Control Protocol (TCP) is one
such example. It provides a connection-oriented service that ensures that data arrive in the
appropriate order, and it uses an acknowledgment protocol to ensure that packets arrive.
Because many higher-level services are built on top of TCP, the basic protocol is often
referred to as TCP/IP; examples of such higher-level services include the Simple Network
Management Protocol.
8.Internet Applications
The Internet provides a standard way for an embedded system to act in concert with other
devices and with users, such as:
■ One of the earliest Internet-enabled embedded systems was the laser printer. High-end
laser printers often use IP to receive print jobs from host machines.
■ Portable Internet devices can display Web pages, read email, and synchronize calendar
information with remote computers.
■ A home control system allows the homeowner to remotely monitor and control home
cameras, lights, and so on.
The CAN bus [Bos07] was designed for automotive electronics and was first used in
production cars in 1991. CAN is very widely used in cars as well as in other applications.
The CAN bus uses bit-serial transmission. CAN runs at rates of up to 1 Mbit/s over a
twisted-pair connection of 40 m. An optical link can also be used. The bus protocol supports
multiple masters on the bus. Many of the details of the CAN and I2C buses are similar, but
there are also significant differences.
As shown in Figure 4.22, each node in the CAN bus has its own electrical drivers and
receivers that connect the node to the bus in wired-AND fashion. In CAN terminology, a
logical 1 on the bus is called recessive and a logical 0 is dominant. The driving circuits on
the bus cause the bus to be pulled down to 0 if any node on the bus pulls the bus down
(making 0 dominant over 1). When all nodes are transmitting 1s, the bus is said to be in the
recessive state; when a node transmits a 0, the bus is in the dominant state. Data are sent on
the network in packets known as data frames. CAN is a synchronous bus: all transmitters
must send at the same time for bus arbitration to work. Nodes synchronize themselves to the
bus by listening to the bit transitions on the bus. The first bit of a data frame provides the first
synchronization opportunity in a frame. The nodes must also continue to synchronize
themselves against later transitions in each frame.
For example, you cannot use an identifier to specify a device and provide a parameter to say
which data value you want from that device. Instead, each possible data request must have its
own identifier. An error frame can be generated by any node that detects an error on the bus.
Upon detecting an error, a node interrupts the current transmission with an error frame,
which consists of an error flag field followed by an error delimiter field of 8 recessive bits.
The error delimiter field allows the bus to return to the quiescent state so that data frame
transmission can resume. The bus also supports an overload frame, which is a special error
frame sent during the interframe quiescent period. An overload frame signals that a node is
overloaded and will not be able to handle the next message. The node can delay the
transmission of the next frame with up to two overload frames in a row, hopefully giving it
enough time to recover from its overload. The CRC field can be used to check a message's
data field for correctness. If a transmitting node does not receive an acknowledgment for a
data frame, it should retransmit the data frame until the frame is acknowledged. This action
corresponds to the data link layer in the OSI model.
Figure 4.24 shows the basic architecture of a typical CAN controller. The controller
implements the physical and data link layers; since CAN is a bus, it does not need network
layer services to establish end-to-end connections. The protocol control block is responsible
for determining when to send messages, when a message must be resent due to arbitration
losses, and when a message should be received.
The FlexRay network has been designed as the next generation of system buses for cars.
FlexRay provides high data rates (up to 10 Mbit/s) with deterministic communication. It is
also designed to be fault-tolerant.
The Local Interconnect Network (LIN) bus [Bos07] was created to connect components in a
small area, such as a single door. The physical medium is a single wire that provides data
rates of up to 20 kbit/s for up to 16 bus subscribers. All transactions are initiated by the
master and responded to by a frame. The software for the network is often generated from a
LIN description file that describes the network subscribers, the signals to be generated, and
the frames.
Several buses have come into use for passenger entertainment. Bluetooth is becoming the
standard mechanism for cars to interact with consumer electronics devices such as audio
players or phones. The Media Oriented Systems Transport (MOST) bus [Bos07] was
designed for entertainment and multimedia information. The basic MOST bus runs at
24.8 Mbit/s and is known as MOST 25; 50 and 150 Mbit/s versions have also been
developed. MOST can support up to 64 devices. The network is organized as a ring. Data
transmission is divided into channels. A control channel transfers control and system
management data. Synchronous channels are used to transmit multimedia data; MOST 25
provides up to 15 audio channels. An asynchronous channel provides high data rates but
without the quality-of-service guarantees of the synchronous channels.
Figure 4.24 Architecture of a CAN controller.
9.NETWORK-BASED DESIGN
Designing a distributed embedded system around a network involves some of the same
design tasks we faced in accelerated systems. We must schedule computations in time and
allocate them to PEs. Scheduling and allocation of communication are important additional
design tasks required for many distributed networks. Many embedded networks are designed
for low cost and therefore do not provide excessively high communication speed. If we are
not careful, the network can become the bottleneck in system design. In this section we
concentrate on design tasks unique to network-based distributed embedded systems.
10.COMMUNICATION ANALYSIS
We know how to analyze the execution time of programs and systems of processes on single
CPUs, but to analyze the performance of networks we must know how to determine the delay
incurred by transmitting messages. Let us assume for the moment that messages are sent
reliably: we do not have to retransmit a message.
The message delay for a single message with no contention (as would be the case in a point-
to-point connection) can be modeled as
tm = tx + tn + tr
where tx is the transmitter-side overhead, tn is the network transmission time, and tr is the
receiver-side overhead. In I2C, tx and tr are negligible relative to tn, as illustrated by Example.
153
If messages can interfere with each other in the network, analyzing communication delay
becomes difficult. In general, because we must wait for the network to become available and
then transmit the message, we can write the message delay as
tm = td + tx + tn + tr
where td is the network availability delay incurred waiting for the network to become
available. The main problem, therefore, is calculating td. That value depends on the type of
arbitration used in the network.
■ If the network uses fixed-priority arbitration, the network availability delay is unbounded
for all but the highest-priority device. Since the highest-priority device always gets the
network first, unless there is an application-specific limit on how long it will transmit before
relinquishing the network, it can keep blocking the other devices indefinitely.
■ If the network uses fair arbitration, the network availability delay is bounded.
In the case of round-robin arbitration, if there are N devices, then the worst-case
network availability delay is N(tx + tarb), where tarb is the delay incurred for arbitration.
tarb is usually small compared to the transmission time. Even when round-robin arbitration
is used to bound the network availability delay, the waiting time can be very long. If we add
acknowledgment and data corruption into the analysis, figuring network delay is more
difficult. Assuming that errors are random, we cannot predict a worst-case delay, since every
packet may contain an error. We can, however, compute the probability that a packet will be
delayed for more than a given amount of time. However, such analysis is beyond the scope of
this book.
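Putting the terms together, a small sketch that evaluates the worst-case delay under round-robin arbitration using the quantities defined above (the numbers in main() are illustrative only):

#include <stdio.h>

/* tm = td + tx + tn + tr, with td = N(tx + tarb) in the round-robin
   worst case (all N devices transmit ahead of us). Units: seconds. */
double worst_case_delay(double tx, double tn, double tr,
                        double tarb, int n_devices)
{
    double td = n_devices * (tx + tarb);
    return td + tx + tn + tr;
}

int main(void)
{
    printf("tm = %.6f s\n", worst_case_delay(10e-6, 100e-6, 10e-6, 5e-6, 8));
    return 0;
}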
This cannot cause deadlock, since each message has a bounded length, but it can slow down
other traffic. Figure 4.18 shows an example of a multihop communication. The hardware
platform has two separate networks (perhaps so that communications between subsets of the
PEs do not interfere), but there is no direct path from M1 to M5. The message is therefore
routed through M3, which reads it from one network and sends it on to the other one.
Analyzing delays through multihop systems is very difficult. For example, the time that the
message is held at M3 depends on both the computational load of M3 and the other messages
that it must handle.
If there is more than one network, we must allocate communications to the networks. We
may establish multiple networks so that lower-priority communications can be handled
separately without interfering with high-priority communications on the primary network.
Scheduling and allocation of computations and communications are clearly interrelated. If we
change the allocation of computations, we change not only the scheduling of processes on
those PEs but also potentially the schedules of the PEs with which they communicate. For
example, if we move a computation to a slower PE, its results will be available later, which
may mean rescheduling both the process that uses the value and the communication that
sends the value to its destination.
Now that we know how to compute delay for messages, we can develop strategies for
designing the schedule and allocation of processes and communication. Designing the
hardware platform is necessarily closely related to our choices in scheduling and allocating
processes. We want to use only as much hardware as is necessary, but we cannot know how
much hardware to use until we can construct a system schedule. Creating that schedule
requires an allocation of processes to PEs, which in turn requires knowing the available
hardware.
When designing the hardware platform, we have several design choices to make, such as
the number and types of PEs and the way they are interconnected. In making these choices,
we need to construct allocations and schedules for the processes to evaluate the platform. In
turn, allocation and scheduling are driven by system performance analysis.
It helps to start with a basic assessment of the computation and communication needs of the
system. A lower bound on the computational needs of the system can be obtained by
summing up the worst-case execution times of the processes:
total computation time = Σi (Tl / Ti) · TPi
where TPi is the execution time of process Pi and Tl is the least common multiple of all the
periods Ti. This formula computes the total execution time over the schedule unrolled to the
least common multiple of the periods. Similarly, we can compute the communication volume
over the least common multiple of the periods:
total communication volume = Σi (Tl / Ti) · Ci
where Ci is the number of bytes output by process Pi in one period. The above formula
computes the total number of bytes transmitted in the unrolled schedule by counting the
output bytes of all the processes in the system. Of course, these figures do not account for
overheads such as operating system scheduling or communication interference. They simply
provide lower bounds on our needs.
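These two lower bounds are easy to evaluate mechanically. A sketch with an illustrative process set (the periods, worst-case execution times, and output byte counts are made-up numbers):

#include <stdio.h>

static long gcd(long a, long b) { return b ? gcd(b, a % b) : a; }
static long lcm(long a, long b) { return a / gcd(a, b) * b; }

int main(void)
{
    long period[] = { 10, 20, 50 };   /* Ti: process periods (ms)       */
    long wcet[]   = {  2,  5,  8 };   /* TPi: worst-case run times (ms) */
    long bytes[]  = { 16, 64, 32 };   /* Ci: output bytes per execution */
    int  n = 3;

    long Tl = 1;                      /* least common multiple of the Ti */
    for (int i = 0; i < n; i++) Tl = lcm(Tl, period[i]);

    long comp = 0, comm = 0;
    for (int i = 0; i < n; i++) {
        long runs = Tl / period[i];   /* executions in one unrolled schedule */
        comp += runs * wcet[i];
        comm += runs * bytes[i];
    }
    printf("hyperperiod %ld ms: >= %ld ms computation, %ld bytes\n",
           Tl, comp, comm);
    return 0;
}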
Depending on the type of system we are designing, the following two strategies may be
useful in helping us quickly arrive at an efficient system:
For I/O-intensive systems we start with the I/O devices and their associated processing;
for computation-intensive systems we start with the processes. For systems that do a lot
of I/O, we definitely need to support the I/O devices themselves and perhaps do some
processing of the data locally before shipping the data over the network:
Determine which I/O processing has deadlines so short that they cannot be met by
any network within your price range; that processing must be done locally. I/O devices
that do not require local processing may be attached to the network with the simplest
available interface.
Design the rest of the system using the procedure for computation-intensive systems below.
For computation-intensive systems, we want to consider the processes, their deadlines,
and their communications as follows:
Start with the tasks with the shortest deadlines. The shorter the deadline for a task,
the more likely it is to require its own processing element or elements. If a high-priority
task shares a PE with a low-priority task, not only will a more expensive PE be
required, but scheduling overhead will also be paid, and at a nonlinear rate.
After we have designed a basic system that meets our performance goals, we can improve
it to satisfy power consumption or other requirements.
Once you have an initial allocation, use the system schedule as a guide for fine-tuning. By
reallocating processes you may be able to improve one or more attributes, such as hardware
cost, slack time in the schedule, or power consumption. In particular, load balancing is often
a good idea. If some PEs are more heavily loaded than others, it may be possible to move
some of their processes to other PEs. Doing so reduces the chance that the system fails to
meet a deadline due to mistaken estimation of run times.
11.2 ELEVATOR CONTROLLER
Figure 4.19 A bank of elevators.
There is one display per hoistway on each floor. Each display has an up light and a down
light; if the elevator is idle, neither light is on. The displays for a hoistway always show the
same state on all floors.
The car controller must of course sense button presses on the car control panel, but it must
also sense the current position of the elevator. As shown in Figure 8.26, the car controller
reads two sets of indicators on the wall of the elevator hoistway to sense position.
The coarse indicators run the entire length of the hoistway, and a sensor determines when the
elevator passes each one.
Fine indicators are located only around the stopping point for each floor. There are 2S + 1
fine indicators at each floor: one at the exact stopping point and S on each side of it. The
sensor also reads the fine indicators; it puts out separate signals for the coarse and fine
indicators. The elevator system can stop at the proper position by counting coarse and fine
indicators.
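As a rough illustration of this counting scheme, the sketch below tracks position in two
interrupt handlers. The sensor ISRs, the stop_motor() helper, and the value of S are all
assumptions for this example, not part of the original design.

#define S 3                        /* assumed: fine indicators per side */
volatile int coarse_count = 0;     /* coarse indicators passed so far   */
volatile int fine_count = 0;       /* fine indicators since last coarse */
volatile int going_up = 1;         /* set by the motor-control logic    */
extern void stop_motor(void);      /* assumed motor-off helper          */
void Coarse_ISR(void){             /* one pulse per coarse indicator    */
  coarse_count += going_up ? 1 : -1;
  fine_count = 0;                  /* restart fine count at each mark   */
}
void Fine_ISR(void){               /* one pulse per fine indicator      */
  fine_count++;
  if(fine_count == S + 1){         /* (S+1)th pulse is the exact stop   */
    stop_motor();                  /* stop the cab at the floor         */
  }
}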
The elevator's movement is controlled by two motor control inputs: one for up and
one for down. When both are disabled, the elevator does not move. The system should not
enable both up and down on a single hoistway simultaneously. The master controller has
several tasks: it must read inputs from the floor control panels, send signals to the lights on
the floor displays, read floor requests from the car controllers, and take inputs from the car
sensors. Most importantly, it must tell the elevators when to move and when to stop. It must
also schedule the elevators to efficiently answer passenger requests.
In this design, we are much more aware of the surrounding mechanical elements than we
have been in previous examples. The electronics are clearly a small part of the cost and bulk
of the elevator system. But because the elevators are controlled by computers, the proper
operation of the embedded hardware and software is very important.
Figure 4.21 Basic class diagram for the elevator system.
Specification
The basic class diagram for the elevator system is shown in Figure 4.21. This diagram
concentrates on the relationships among the classes and the number of objects of each type
that the system requires.
The physical interface classes are defined in more detail in Figure 4.28. We have used
inheritance to define the sensors, even though these classes represent physical objects. The
only difference among the sensors, as far as the elevator controller is concerned, is whether
they indicate coarse or fine positions; other physical distinctions among the sensors do not
matter.
The Car and Floor classes, which describe the control panels on the floors and in the cars,
are shown in Figure 4.29. These classes define the basic attributes of the car and floor control
panels.
The Controller class is defined in Figure 4.30. This class defines attributes that describe the
state of the system, including where each car is and whether the system has made an
emergency stop. It also defines several behaviors, such as an operate behavior and behaviors
to check the state of parts of the system.
Architecture
Computation and I/O occur at three major locations in this system: the floor control
panels/displays, the elevator cabs, and the system controller. Let's consider the basic
operation of each of these subsystems one at a time and then go back and design the network
that connects them.
The floor control panels and displays are relatively simple since they have no hard real-time
requirements. Each one takes a set of inputs for the up/down indicators and lights the
appropriate lights. Each also watches for button events and sends the results to the system
controller. We can use a simple microcontroller for all these tasks.
The cab controller must read the cab's buttons and send events to the system controller. It
must also read the sensor inputs and send them to the system controller. Reading the sensors
is a hard real-time task; proper operation of the elevator requires that the cab controller not
miss any of the indicators. We have to decide whether to use one or two PEs in the cab. A
conservative design would use separate PEs for the button panel and the sensor. We could
also use a single processor to handle both the buttons and the sensor.
The system controller must take inputs from all these units. Its control of the elevators has
both hard and soft real-time aspects: it must constantly monitor all moving elevators to be
sure they stop properly, as well as choose which elevator to dispatch to a request. Figure 8.31
shows the set of networks we will use in the system. The floor control panels/displays are
connected along a single bus network. Each elevator car has its own point-to-point link with
the system controller.
12.Testing
The simplest way to test the controllers is to build an elevator simulator using an FPGA. We
can easily program an FPGA to simulate several elevators by keeping registers for the current
position of each elevator and using counters to control how often the elevators change state. Using
an FPGA-based elevator simulator provides good motivation for this example because we can
design the FPGA to indicate when an elevator has crashed through the floor or the ceiling of its
shaft. Working with a real-time-oriented elevator simulator helps illustrate the challenges presented
by real-time control. We can use a serial link from a PC to provide button inputs, or we can
wire up panels of buttons and indicators ourselves.
Question Bank
1. Explain in detail the design process in embedded systems.
2. What are the challenges of embedded systems?
3. Discuss the complete design of a typical embedded system.
4. What do you mean by embedded systems? Explain in brief with an example.
5. Discuss the different challenges related to embedded software development.
6. Describe the different issues related to embedded software development.
7. Explain the different models and languages for embedded software.
SCHOOL OF ELECTRICAL & ELECTRONICS ENGINEERING
DEPARTMENT OF ELECTRONICS & INSTRUMENTATION
UNIT-V
EMBEDDED SYSTEM DESIGN– SBMA5201
SYSTEM DESIGN
Switches and LED interfacing - LCD display interfacing - Analog sensors interfacing for
digital data conversion - Access control using analog keypad - Pulse width modulation
technique for motor speed control
1.Switch Interfaces
Input/output devices are critical components of an embedded system. The first input
device we will study is the switch. It allows the human to input binary information into the
computer. Typically we define the asserted state, or logic true, when the switch is pressed.
Contact switches can also be used in machines to detect mechanical contact (e.g., two parts
touching, paper present in the printer, or wheels on the ground). A single pole single
throw (SPST) switch has two connections. The switches are shown as little open circles in
Figure 8.2. In a normally open switch (NO), the resistance between the connections is infinite
(over 100 MΩ on the B3F tactile switch) if the switch is not pressed and zero (under 0.1 Ω on
the B3F tactile switch) if the switch is pressed. To convert the infinite/zero resistance into a
digital signal, we can use a pull-down resistor to ground or a pull-up resistor to +3.3V as
shown in Figure 8.2. Notice that 10 kΩ is 100,000 times larger than the on-resistance of the
switch and 10,000 times smaller than its off-resistance. Another way to choose the pull-down
or pull-up resistor is to consider the input current of the microcontroller input pin. The
current into the microcontroller will be less than 2µA (shown as IIL and IIH in the data
sheet). So, if the current into microcontroller is 2µA, then the voltage drop across the 10 kΩ
resistor will be 0.02 V, which is negligibly small. With a pull-down resistor, the digital signal
will be low if the switch is not pressed and high if the switch is pressed (right Figure 8.2).
This is defined as positive logic because the asserted state is a logic high. Conversely, with a
pull-up resistor, the digital signal will be high if the switch is not pressed and low if the
switch is pressed (middle of Figure 8.2). This is defined as negative logic because the
asserted state is a logic low.
One of the complicating issues with mechanical switches is they can bounce (oscillate on and
off) when touched and when released. The contact bounce varies from switch to switch and
from time to time, but usually bouncing is a transient event lasting less than 5 ms. We can
eliminate the effect of bounce if we design software that waits at least 10 ms between times
we read the switch values.
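A minimal debouncing sketch based on this 10 ms rule, assuming a hypothetical Delay_ms()
helper alongside the Switch_Input() driver of Program 5.1 below:

extern unsigned long Switch_Input(void); /* driver from Program 5.1 below */
extern void Delay_ms(unsigned long ms);  /* assumed delay helper          */
unsigned long Switch_ReadDebounced(void){
  unsigned long value;
  do{
    value = Switch_Input();             /* take a reading                 */
    Delay_ms(10);                       /* wait longer than ~5 ms bounce  */
  } while(value != Switch_Input());     /* accept two matching readings   */
  return value;
}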
To interface a switch we connect it to a pin (e.g., Figure 8.3) and initialize the pin as an input.
The initialization function will enable the clock, set the direction register to input, turn off the
alternative function, and enable the pin. Notice the software is friendly because it just affects
PA5 without affecting the other bits in Port A. The input function reads Port A and returns a
true (0x20) if the switch is pressed and returns a false (0) if the switch is not pressed. Figure
8.4 shows how we could build this circuit with a protoboard and a LaunchPad.
The software in Program 5.1 is called a driver, and it includes an initialization, which is
called once, and a second function that can be called to read the current position of the
switch. Writing software this way is called an abstraction, because it separates what the
switch does (Init, On, Off) from how it works (PortA, bit 5, TM4C123). The first input
function uses the bit-specific address to get just PA5, while the second reads the entire port
and selects bit 5 using a logical AND.
#define PA5 (*((volatile unsigned long *)0x40004080))
void Switch_Init(void){ volatile unsigned long delay;
  SYSCTL_RCGC2_R |= 0x00000001;   // 1) activate clock for Port A
  delay = SYSCTL_RCGC2_R;         //    allow time for clock to start
  GPIO_PORTA_DIR_R &= ~0x20;      // 2) set direction register: PA5 input
  GPIO_PORTA_AFSEL_R &= ~0x20;    // 3) turn off alternate function on PA5
  GPIO_PORTA_DEN_R |= 0x20;       // 4) enable digital I/O on PA5
}
unsigned long Switch_Input(void){ // bit-specific address version
  return PA5;                     // 0x20 if pressed, 0 if not pressed
}
unsigned long Switch_Input2(void){
  return GPIO_PORTA_DATA_R & 0x20; // read whole port, select bit 5
}
Program 5.1. Software interface for a switch on PA5 (C8_Switch).
Maintenance Tip: When interacting with just some of the bits of an I/O register it is
better to modify just the bits of interest, leaving the other bits unchanged. In this way,
the action of one piece of software does not undo the action of another piece.
2.LED Interfaces
Driving LEDs
Light-emitting diodes (LEDs) are often used as indicators in digital systems and in many
cases can simply be driven directly from a logic output, provided there is sufficient current
and voltage drive. The voltage drive is necessary to get the LED to illuminate in the first
place: an LED lights up only when the forward voltage across it exceeds its turn-on voltage,
usually about 2 to 2.2 volts and less than the logic high voltage. The current drive determines
how bright the LED will appear, and it is usual to place a current-limiting resistor in series
with the LED to prevent it from drawing too much current and overheating. For a logic
device with a 5 volt supply, a 300 Ω resistor will limit the current to about 10 mA. The
problem comes if the logic output is only 2.4 or 2.5 volts and not the expected 5 volts. In that
case so little voltage is left across the resistor that almost no current flows, and the LED does
not light up.
The solution is to use a buffer so that there is sufficient current drive, or alternatively to use a
transistor to switch on the LED. There are special LED driver packs available that are
designed to connect directly to an LED without the need for the current-limiting resistor;
the resistor or current-limiting circuit is included inside the device.
A light-emitting diode (LED) emits light when an electric current passes through it.
LEDs have polarity, meaning current must pass from anode to cathode to activate. The
anode is labelled a or +, and the cathode is labelled k or -. The cathode is the short lead, and
there may be a slight flat spot on the body of round LEDs; thus, the anode is the longer lead.
The brightness of an LED depends on the applied electrical power (P = I*V). Since the LED
voltage is approximately constant in the active region (see left side of Figure 8.5), we can
establish the desired brightness by setting the current.
If the LED current is above 8 mA, we cannot connect it directly to the microcontroller
because the high currents may damage the chip. Figure 8.5 shows two possible interface
circuits we could use. In both circuits, if the software makes its output high the LED will be
on, and if the software makes its output low the LED will be off (shown in Figure 5.5 with
italics). When the software writes a logic 1 to the output port, the input to the 7405/PN2222
becomes high, the output from the 7405/PN2222 becomes low, 10 mA travels through the
LED, and the LED is on. When the software writes a logic 0 to the output port, the input to
the 7405/PN2222 becomes low, the output from the 7405/PN2222 floats (neither high nor
low), no current travels through the LED, and the LED is dark. The value of the resistor is
selected to establish the proper LED current. When active, the LED voltage will be about
2 V, and the power delivered to the LED will be controlled by its current. If the desired
brightness requires an operating point of 1.9 V at 10 mA, then the resistor value should be

R = (3.3 - Vd - VOL) / Id = (3.3 - 1.9 - 0.5) / 10 mA = 90 Ω

where Vd, Id is the desired LED operating point, and VOL is the output low voltage of the
LED driver. If we use a standard resistor value of 100 Ω in place of the 90 Ω, then the current
will be (3.3 - 1.9 - 0.5 V)/100 Ω, which is about 9 mA. This slightly lower current is usually
acceptable.
When the LED current is less than 8 mA, we can interface it directly to an output pin without
using a driver. The LED shown in Figure 5.6a has an operating point of 1.7 V and 1 mA. For
the positive logic interface (Figure 5.6b) we calculate the resistor value based on the desired
LED voltage and current:

R = (VOH - Vd) / Id

where VOH is the output high voltage of the microcontroller output pin. Since VOH can vary
from 2.4 to 3.3 V, it makes sense to choose a resistor from a measured value of VOH rather
than the minimum value of 2.4 V. Negative logic means the LED is activated when the
software outputs a zero. For the negative logic interface (Figure 5.6c) we use a similar
equation to determine the resistor value:

R = (3.3 - Vd - VOL) / Id
where VOL is the output low voltage of the microcontroller output pin.
If we use a 1.2 kΩ resistor in place of the 1.3 kΩ, then the current will be
(3.3 - 1.6 - 0.4 V)/1.2 kΩ, which is about 1.08 mA; this slightly higher current is usually
acceptable. If we use a standard resistor value of 1.5 kΩ in place of the 1.3 kΩ, then the
current will be (3.3 - 1.6 - 0.4 V)/1.5 kΩ, which is about 0.87 mA; this slightly lower current
is usually acceptable.
The software in Program 5.2 is called a driver, and it includes an initialization, which is
called once, and two functions that can be called to turn on and off the LED. Writing
software this way is called an abstraction, because it separates what the LED does (Init, On,
Off) from how it works (PortA, TM4C123).
Checkpoint 8.2: What resistor value in Figure 8.6 is needed if the desired LED operating
point is 1.7 V and 2 mA? Use the negative logic interface and a VOL of 0.4 V.
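Applying the negative-logic formula above (a worked answer, not from the original text):
R = (3.3 - 1.7 - 0.4 V)/2 mA = 600 Ω; a standard 620 Ω resistor would give a current just
under 2 mA, which is usually acceptable.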
void LED_Init(void){ volatile unsigned long delay;
  SYSCTL_RCGC2_R |= 0x01;        // 1) activate clock for Port A
  delay = SYSCTL_RCGC2_R;        //    allow time for clock to start
  GPIO_PORTA_DIR_R |= 0x04;      // 2) make PA2 an output
  GPIO_PORTA_AFSEL_R &= ~0x04;   // 3) turn off alternate function on PA2
  GPIO_PORTA_DEN_R |= 0x04;      // 4) enable digital I/O on PA2
}
void LED_On(void){ GPIO_PORTA_DATA_R |= 0x04;}   // PA2 high, LED on
void LED_Off(void){ GPIO_PORTA_DATA_R &= ~0x04;} // PA2 low, LED off
The yellow-purple-brown resistor is 470 Ω. It doesn't matter what color the wires are,
but in this figure the wires are black, red, and green. The two black wires are ground, the
red wire is +3.3 V, and the green wire is the signal Out, which connects PA2 of the
microcontroller to the positive side of the LED.
3.LCD Display Interfacing
A liquid crystal display (LCD) consists of rod-shaped tiny molecules sandwiched between a
flat piece of glass and an opaque substrate. These rod-shaped molecules between the plates
align into two different physical positions based on the electric charge applied to them. When
electric charge is applied they align to block the light entering through them, whereas when
no charge is applied they become transparent.
Light passing through makes the desired images appear. This is the basic concept behind
LCD displays. LCDs are most commonly used because of their advantages over other display
technologies: they are thin and flat and consume a very small amount of power compared to
LED displays and cathode ray tubes (CRTs).
Microprocessor-controlled LCD displays are widely used, having replaced most of their LED
counterparts, because of their low power and flexible display graphics. This experiment will
illustrate how a handshaked parallel port of the microcomputer can be used to output to the
LCD display. The hardware for the display uses an industry-standard HD44780 controller.
The low-level software initializes and outputs to the HD44780 controller.
There are four types of access cycles to the HD44780, depending on RS and R/W:

RS  R/W  Cycle
0   0    Write to Instruction Register
0   1    Read Busy Flag (bit 7)
1   0    Write data from μP to the HD44780
1   1    Read data from the HD44780 to the μP
Two types of synchronization can be used: blind cycle and gadfly. Most operations
require 40 μs to complete, while some require 1.64 ms. The example implementation shown
in LCD12.H and LCD12.C uses OC5 to create the blind-cycle wait. A gadfly interface
provides feedback to detect a faulty interface, but has the problem of creating a software
crash if the LCD never finishes. The best interface utilizes both gadfly and blind cycle, so
that the software can return with an error code if a display operation does not finish on time
(due to a broken wire or damaged display).
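A minimal sketch of that combined approach, assuming hypothetical helpers
LCD_ReadBusyFlag() (an RS=0, R/W=1 read of bit 7) and Delay_us():

extern int LCD_ReadBusyFlag(void);      /* assumed: reads busy flag, bit 7 */
extern void Delay_us(unsigned long us); /* assumed delay helper            */
int LCD_WaitReady(void){
  unsigned long timeout = 2000;         /* 2000 x 1 us, well over 1.64 ms  */
  while(LCD_ReadBusyFlag()){            /* gadfly: poll the busy flag      */
    if(--timeout == 0) return 0;        /* give up: display hung or broken */
    Delay_us(1);                        /* blind-cycle component of wait   */
  }
  return 1;                             /* display is ready                */
}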
Typical calling sequence for LCDOpen:

if(!LCDOpen(LCDscroll|LCDright))
  error();

LCDClose — release of the display port. Sets OpenFlag to false, releases any dynamically
allocated memory, and returns an error code if the display was not previously open. Output
parameter: error code. Typical calling sequence:

if(!LCDClose())
  error();
Procedure
You should look at the +5 V voltage-versus-time signal on a scope when power is first turned
on to determine if the LCD "power on reset" circuit will be properly activated. The LCD data
sheet specifies it needs a rise time of 0.1 ms to 10 ms from 0.2 V to 4.5 V to generate the
power-on reset. Connect the LCD to your microcomputer. Use the scope to verify the
sharpness of the digital inputs/outputs. Adjust the contrast potentiometer for the best-looking
display. Test the device driver software and main program in small pieces.
Checkout
You should be able to demonstrate all the "cool" features of your LCD display system.
Hints
1) Make sure the 14 wires are securely attached to your board.
2) One way to test for the first call to open is to test the direction register. After reset, the
direction registers are usually zero; after a call to open, some direction register bits will be
one.
3) Download the files LCDTEST.C, LCD12.H, and LCD12.C from the class web site. These
C language routines do low-level LCD output to Port H/J. Notice that the code does not
perform any input (either status or data); therefore it leaves DDRJ=0xFF, DDRH=0xFF. If
you wish to include inputs, then you will have to toggle DDRH, so that PORTH is an output
for writes and an input for reads.
4) Although many LCD displays use the same HD44780 controller, the displays come in
various sizes, ranging from 1 row by 16 columns up to 4 rows by 40 columns.
Analog to Digital Conversion, Data Acquisition and Control
We have seen that an embedded system uses its input/output devices to interact with the
external world. In this chapter we will focus on input devices that we use to gather
information about the world. More specifically, we present a technique for the system to
measure analog inputs using an analog-to-digital converter (ADC). We will use periodic
interrupts to sample the ADC at a fixed rate. We will then combine sensors, the ADC,
software, PWM output, and motor interfaces to implement intelligent control on our robot
car.
The most pervasive method for ADC conversion is the successive approximation technique,
as illustrated in Figure 5.10. A 12-bit successive approximation ADC is clocked 12 times. At
each clock another bit is determined, starting with the most significant bit. For each clock,
the successive approximation hardware issues a new "guess" on Vdac by setting the bit under
test to a "1". If Vdac is now higher than the unknown input Vin, then the bit under test is
cleared; if Vdac is less than Vin, then the bit under test remains 1. In this description, bit is
an unsigned integer that specifies the bit under test; for a 12-bit ADC, bit goes 2048, 1024,
512, 256, ..., 1. Dout is the ADC digital output, and Z is the binary signal that is true if Vdac
is greater than Vin. The converter maps 0 to 3.3 V on its input into a digital number from 0
to 4095.
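The successive-approximation loop can be written in a few lines of C. This is only an
illustrative sketch: the comparator readback DacGreaterThanVin() is an assumed helper
standing in for the Vdac-versus-Vin comparison done in hardware.

extern int DacGreaterThanVin(void); /* assumed: returns Z, true if Vdac > Vin */
unsigned int SAR_Convert(void){
  unsigned int Dout = 0;
  unsigned int bit;
  for(bit = 2048; bit > 0; bit >>= 1){ /* 12 clocks, MSB first          */
    Dout |= bit;                       /* guess: set the bit under test */
    if(DacGreaterThanVin()){           /* Vdac higher than Vin?         */
      Dout &= ~bit;                    /* yes: clear the bit under test */
    }                                  /* no: the bit remains 1         */
  }
  return Dout;                         /* result in the range 0 to 4095 */
}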
Normally we don‘t specify accuracy for just the ADC, but rather we give the accuracy of the
entire system (including transducer, analog circuit, ADC and software). An ADC is
monotonic if it has no missing codes as the analog input slowly rises. This means if the
analog signal is a slowly rising voltage, then the digital output will hit all values one at a
time, always going up, never going down. The figure of merit of an ADC involves three
factors: precision (number of bits), speed (how fast can we sample), and power (how much
energy does it take to operate). How fast we can sample involves both the ADC conversion
time (how long it takes to convert), and the bandwidth (what frequency components can be
recognized by the ADC). The ADC cost is a function of the number and quality of internal
components. Two 12-bit ADCs are built into the TM4C123/LM4F120 microcontroller. You
will use ADC0 to collect data and we will use ADC1 and the PD3 pin to implement a
voltmeter and oscilloscope.
Table 5.1 shows the ADC0 register bits required to perform sampling on a single
channel. There are two ADCs; you will use ADC0 and the grader uses ADC1. For more
complex configurations refer to the specific data sheet. Bits 8 and 9 of the
SYSCTL_RCGC0_R register specify the maximum sampling rate; see Table 5.2. The
TM4C123 can sample up to 1 million samples per second. Bits 8 and 9 of
SYSCTL_RCGC0_R specify how fast it could sample; the actual sampling rate is determined
by the rate at which we trigger the ADC. In this chapter we will use software trigger mode,
so the actual sampling rate is determined by the SysTick periodic interrupt rate; the SysTick
ISR will take one ADC sample. On the TM4C123, we will need to set bits in the AMSEL
register to activate the analog interface.
Table 5.1. The TM4C ADC0 registers. Each register is 32 bits wide.

Address      Bit 3  Bit 2  Bit 1  Bit 0   Name
0x4003.8000  ASEN3  ASEN2  ASEN1  ASEN0   ADC0_ACTSS_R
0x4003.80A0  MUX0 (bits 3-0)              ADC0_SSMUX3_R
0x4003.80A4  TS0    IE0    END0   D0      ADC0_SSCTL3_R
0x4003.8028  SS3    SS2    SS1    SS0     ADC0_PSSI_R
0x4003.8004  INR3   INR2   INR1   INR0    ADC0_RIS_R
0x4003.800C  IN3    IN2    IN1    IN0     ADC0_ISC_R
0x4003.80A8  DATA (bits 11-0)             ADC0_SSFIFO3_R
Table 5.2. Maximum sampling rate as set by bits 9-8 of SYSCTL_RCGC0_R.

Value  Description
0x3    1M samples/second
0x2    500K samples/second
0x1    250K samples/second
0x0    125K samples/second
Table 5.3 shows which I/O pins on the TM4C123 can be used for ADC analog input
channels.

Table 5.3. TM4C123 I/O pins that can serve as ADC inputs (other alternate functions shown
for reference).

IO   Ain    Alternate functions
PB4  Ain10  SSI2Clk, M0PWM2, T1CCP0, CAN0Rx
PB5  Ain11  SSI2Fss, M0PWM3, T1CCP1, CAN0Tx
PD0  Ain7   SSI3Clk, SSI1Clk, I2C3SCL, M0PWM6, M1PWM0, WT2CCP0
PD1  Ain6   SSI3Fss, SSI1Fss, I2C3SDA, M0PWM7, M1PWM1, WT2CCP1
PD2  Ain5   SSI3Rx, SSI1Rx, M0Fault0, WT3CCP0, USB0epen
PD3  Ain4   SSI3Tx, SSI1Tx, IDX0, WT3CCP1, USB0pflt
PE0  Ain3   U7Rx
PE1  Ain2   U7Tx
PE2  Ain1
PE3  Ain0
PE4  Ain9   U5Rx, I2C2SCL, M0PWM4, M1PWM2, CAN0Rx
PE5  Ain8   U5Tx, I2C2SDA, M0PWM5, M1PWM3, CAN0Tx
The ADC has four sequencers, but you will use only sequencer 3 in Labs 14 and 15.
We set the ADC0_SSPRI_R register to 0x0123 to make sequencer 3 the highest priority.
Because we are using just one sequencer, we just need to make sure each sequencer has a
unique priority. We set bits 15–12 (EM3) in the ADC0_EMUX_R register to specify how the
ADC will be triggered.
Table 5.4 shows the various ways to trigger an ADC conversion. More advanced ADC
triggering techniques are presented in the book Embedded Systems: Real-Time Interfacing to
ARM® Cortex™-M Microcontrollers. However, in this course we use software start
(EM3=0x0). The software writes an 8 (SS3) to the ADC0_PSSI_R register to initiate a
conversion on sequencer 3. We can enable and disable the sequencers using the
ADC0_ACTSS_R register.
There are twelve ADC channels on the LM4F120/TM4C123. Which channel we
sample is configured by writing to the ADC0_SSMUX3_R register. The mapping between
channel number and port pin is shown in Table 5.3; for example, channel 9 is connected to
pin PE4. The ADC0_SSCTL3_R register specifies the mode of the ADC sample. We set TS0
to measure temperature and clear it to measure the analog voltage on the ADC input pin. We
set IE0 so that the INR3 bit is set when the ADC conversion is complete, and clear it when
no flags are needed. When using sequencer 3, there is only one sample, so END0 will always
be set, signifying this sample is the end of the sequence; in this class, the sequence will be
just one ADC conversion. We would set the D0 bit to activate differential sampling, such as
measuring the analog difference between two ADC pins; in our example, we clear D0 to
sample a single-ended analog input. Because we set the IE0 bit, the INR3 flag in the
ADC0_RIS_R register will be set when the ADC conversion is complete. We clear the INR3
bit by writing an 8 to the ADC0_ISC_R register.
Table 5.4. The ADC EM3, EM2, EM1, and EM0 bits in the ADC_EMUX_R register.

Value  Event
0x0    Software start
0x1    Analog Comparator 0
0x2    Analog Comparator 1
0x3    Analog Comparator 2
0x4    External (GPIO PB4)
0x5    Timer
0x6    PWM0
0x7    PWM1
0x8    PWM2
0x9    PWM3
0xF    Always (continuously sample)
We perform the following steps to configure the ADC for software start on one channel.
Program 14.1 shows the specific details for sampling PE4, which is channel 9. The function
ADC0_InSeq3 will sample PE4 using software start and use busy-wait synchronization to
wait for completion.
Step 1. We enable the port clock for the pin that we will be using for the ADC input.
Step 2. Make that pin an input by writing zero to the DIR register.
Step 3. Enable the alternative function on that pin by writing one to the AFSEL register.
Step 4. Disable the digital function on that pin by writing zero to the DEN register.
Step 5. Enable the analog function on that pin by writing one to the AMSEL register.
Step 6. We enable the ADC clock by setting bit 16 of the SYSCTL_RCGC0_R register.
Step 7. Bits 8 and 9 of the SYSCTL_RCGC0_R register specify the maximum sampling
rate of the ADC. In this example, we will sample slower than 125 kHz, so the maximum
sampling rate is set at 125 kHz. This will require less power and produce a longer sampling
time, creating a more accurate conversion.
Step 8. We will set the priority of each of the four sequencers. In this case, we are using just
one sequencer, so the priorities are irrelevant, except for the fact that no two sequencers
should have the same priority.
Step 9. Before configuring the sequencer, we need to disable it. To disable sequencer 3, we
write a 0 to bit 3 (ASEN3) in the ADC_ACTSS_R register. Disabling the sequencer during
programming prevents erroneous execution if a trigger event were to occur during the
configuration process.
Step 10. We configure the trigger event for the sample sequencer in the ADC_EMUX_R
register. For this example, we write a 0000 to bits 15–12 (EM3) specifying software start
mode for sequencer 3.
Step 11. Configure the corresponding input source in the ADCSSMUXn register. In this
example, we write the channel number, 9 (PE4), to bits 3-0 of the ADC_SSMUX3_R
register.
Step 12. Configure the sample control bits in the corresponding nibble in the ADC0SSCTLn
register. When programming the last nibble, ensure that the END bit is set. Failure to set the
END bit causes unpredictable behavior. Sequencer 3 has only one sample, so we
write a 0110 to the ADC_SSCTL3_R register. Bit 3 is the TS0 bit, which we clear
because we are not measuring temperature. Bit 2 is the IE0 bit, which we set because we
want the RIS bit to be set when the sample is complete. Bit 1 is the END0 bit, which is set
because this is the last (and only) sample in the sequence. Bit 0 is the D0 bit, which we clear
because we do not wish to use differential mode.
Step 13. We enable the sample sequencer logic by writing a 1 to the corresponding ASENn.
To enable sequencer 3, we write a 1 to bit 3 (ASEN3) in the ADC_ACTSS_R register.
void ADC0_InitSWTriggerSeq3_Ch9(void){ volatile unsigned long delay;
  SYSCTL_RCGC2_R |= 0x00000010;   // 1) activate clock for Port E
  delay = SYSCTL_RCGC2_R;         //    allow time for clock to stabilize
  GPIO_PORTE_DIR_R &= ~0x10;      // 2) make PE4 input
  GPIO_PORTE_AFSEL_R |= 0x10;     // 3) enable alternate function on PE4
  GPIO_PORTE_DEN_R &= ~0x10;      // 4) disable digital I/O on PE4
  GPIO_PORTE_AMSEL_R |= 0x10;     // 5) enable analog function on PE4
  SYSCTL_RCGC0_R |= 0x00010000;   // 6) activate ADC0 (bit 16)
  delay = SYSCTL_RCGC0_R;
  SYSCTL_RCGC0_R &= ~0x00000300;  // 7) maximum rate 125K samples/sec
  ADC0_SSPRI_R = 0x0123;          // 8) sequencer 3 is highest priority
  ADC0_ACTSS_R &= ~0x0008;        // 9) disable sample sequencer 3
  ADC0_EMUX_R &= ~0xF000;         // 10) seq3 is software trigger (EM3=0)
  ADC0_SSMUX3_R = (ADC0_SSMUX3_R & ~0x000F)+9; // 11) channel Ain9 (PE4)
  ADC0_SSCTL3_R = 0x0006;         // 12) set IE0 and END0; clear TS0 and D0
  ADC0_ACTSS_R |= 0x0008;         // 13) enable sample sequencer 3
}
Program 5.4 gives a function that performs an ADC conversion. There are four steps required
to perform a software-start conversion. The range is 0 to 3.3 V. If the analog input is 0, the
digital output will be 0, and if the analog input is 3.3 V, the digital output will be 4095:

Digital Sample = (Analog Input (volts) × 4095) / 3.3 (volts)
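For example, an analog input of 1.65 V (half of full scale) gives (1.65 × 4095)/3.3 ≈ 2047.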
Step 1. The ADC is started using the software trigger. The channel to sample was specified
earlier in the initialization.
Step 2. The function waits for the ADC to complete by polling the RIS
register bit 3.
Step 3. The 12-bit digital sample is read out of sequencer 3.
Step 4. The RIS bit is cleared by writing to the ISC register.
Figure 5.10. The four steps of analog to digital conversion: 1) initiate conversion, 2) wait
for the ADC to finish, 3) read the digital result, and 4) clear the completion flag.
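A sketch of these four steps as code; the document's Program 5.4 itself is not reproduced
here, so this simply follows the steps above using the standard TM4C123 register names:

unsigned long ADC0_InSeq3(void){ unsigned long result;
  ADC0_PSSI_R = 0x0008;              /* 1) initiate conversion on SS3      */
  while((ADC0_RIS_R & 0x08) == 0){}; /* 2) wait for conversion to finish   */
  result = ADC0_SSFIFO3_R & 0xFFF;   /* 3) read the 12-bit digital result  */
  ADC0_ISC_R = 0x0008;               /* 4) clear the completion flag INR3  */
  return result;
}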
7.Interfacing to DC Motors
So far with controlling DC motors, the emphasis has been on simple on-off switching. It
is possible with a digital system to provide actual speed control using a technique called
pulse width modulation (PWM).
With a DC motor, there are two techniques for controlling the motor speed. The first is to
vary the DC voltage to the motor: the higher the voltage, the faster it will turn. At low
voltages, the control can be a bit hit and miss, and the power control is inefficient. The
alternative technique, pulse width modulation (PWM), controls motor speed not by reducing
the voltage to the motor but by reducing the time that the motor is switched on.
This is done by generating a square wave at a frequency of several hundred hertz and
changing the mark/space ratio of the waveform. With a large mark and a small space, the
voltage is applied to the motor for almost all of the cycle time, and thus the motor rotates
very quickly. With a small mark and a large space, the opposite is true. The diagram shows
the waveforms for medium, slow, and fast motor control. The only difference between this
method of control and a simple on-off switch is the timing of the pulses from the digital
output that switch the motor on and off. There are several methods that can be used to
generate these waveforms.
Software only
With a software-only system, the waveform timing is done by creating loops that
provide the timing functions. The pseudocode below shows a simple structure for this. The
first action is to switch the motor on and then to start counting through a delay loop. The
length of time to count through the delay loop determines the motor-on period. When the
count is finished, the motor is switched off. The next stage is to count through a second delay
loop to determine the motor-off period.
repeat (forever)
{
  switch on motor
  delay loop1      (motor-on period)
  switch off motor
  delay loop2      (motor-off period)
}
This whole procedure is repeated for as long as the motor needs to be driven. By changing
the value of the two delays, the mark/space ratio of the waveform can be altered. The total
time taken to execute the repeat loop gives the frequency of the waveform. This method is
processor intensive in that the program has to run while the motor is running. On first
evaluation, it may seem that while the motor is running, nothing else can be done. This is not
the case. Instead of simply using delay loops, other work can be inserted in here whose
duration now becomes part of the timing for the PWM waveform. If the work is short, then
the fine control over the mark/space ratio is not lost because the contribution that the work
delay makes compared to the delay loop is small. If the work is long, then the minimum
motor-on time and thus motor speed is determined by this period.
repeat (forever)
{
  switch on motor
  perform task a
  delay loop1
  switch off motor
  delay loop2
}
Figure 5.12 The timing diagrams for the software PWM implementation
The timing diagrams for the software-loop PWM waveforms are shown in the diagrams
above. In general, software-only timing loops are not an efficient method of generating PWM
waveforms for motor control. The addition of a single timer greatly improves the
mechanism.
Using a single timer
By using a single timer, PWM waveforms can be created far more easily, freeing up the
processor to do other things without impacting the timing. There are several methods that
can be used to do this. The key principle is that the timer can be programmed to create a
periodic interrupt.
Method 1 — using the timer to define the on period
With this method, the timer is used to generate the on period. The processor switches the
motor on and then starts the timer counting down. While the timer is doing this, the
processor is free to do whatever work is needed. The timer will eventually time out and
generate a processor interrupt. The processor services the interrupt and switches the motor
off. It then goes into a delay loop, still within the service routine, until the time arrives to
switch the motor on again. The processor switches the motor on, resets the timer and starts it
counting, and continues with its work by returning from the interrupt service routine.
Method 2 — using the timer to define the frequency period
With this method, the timer is used to generate a periodic interrupt whose frequency is set by
the timer period. When the processor services the interrupt, it uses a software loop to time
the on period. The processor switches on the motor and uses the software delay to create the
on period. When the delay loop is completed, it switches off the motor and can continue with
other work until the timer generates the next interrupt.
Method 3 — using the timer to define both the on and off periods
With this method, the timer is used to generate both the on and off periods. The processor
switches the motor on, loads the timer with the on-period value, and then starts the timer
counting down. While the timer is doing this, the processor is free to do whatever work is
needed. The timer will eventually time out and generate a processor interrupt, as before. The
processor services the interrupt and switches the motor off. It then loads the timer with the
value for the off period, starts the timer counting, and continues with its work by returning
from the interrupt service routine. The timer again times out and generates an interrupt. The
processor services this by switching the motor on, loading the timer with the on-period value,
and setting the timer counting before returning from the interrupt.
As a result, the processor is only involved when interrupted by the timer: it switches the
motor on or off, loads the timer with the appropriate delay value, and starts it counting. Of all
three methods, this last method is the most processor efficient. With methods 1 and 2, the
processor is only free to do other work when the mark/space ratio is such that there is time to
do it. With a long motor-off period, the processor performs the timing in software and there
is little time to do anything else. With a short motor-off period, there is more processing time
and far more work can be done. The problem is that the workload that can be achieved is
dependent on the mark/space ratio of the PWM waveform and motor speed. This can be
a major restriction, and this is why the third method is most commonly used.
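A minimal sketch of Method 3 as an interrupt service routine; motor_on(), motor_off(), and
the timer_load reload register are assumed stand-ins for whatever motor switch and
down-counting timer the hardware provides:

extern void motor_on(void), motor_off(void);  /* assumed motor switches   */
extern volatile unsigned long timer_load;     /* assumed timer reload reg */
volatile unsigned long on_ticks = 750, off_ticks = 250; /* mark/space     */
static int motor_is_on = 0;
void Timer_ISR(void){               /* runs once per timer timeout        */
  if(motor_is_on){
    motor_off();                    /* end of mark: switch motor off      */
    timer_load = off_ticks;         /* time the space period next         */
    motor_is_on = 0;
  } else {
    motor_on();                     /* end of space: switch motor on      */
    timer_load = on_ticks;          /* time the mark period next          */
    motor_is_on = 1;
  }
}                                   /* processor free between interrupts  */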
Suppose a system needs to give an analog output for a control circuit in an automation
application; the analog output may drive a power system for a DC motor or a furnace. A
pulse width modulator (PWM) unit in the microcontroller operates as follows: the pulse
width is made proportional to the analog output needed. PWM inputs range from 00000000
to 11111111 for 8-bit DAC operation. The PWM unit outputs to an external integrator, which
then provides the desired analog output. Suppose an integrator circuit (external to the
microcontroller) gives an output of 1.024 V when the pulse width is 50% of the total pulse
time period, and 2.047 V when the width is 100%. When the width is made 25% by halving
the value in the PWM output control register, the integrator output will become 0.512 V.
Now assume that the integrator operates with a dual (plus-minus) supply, and that the
integrator circuit gives an output of 1.023 V when the pulse width is 100% of the total pulse
time period and -1.024 V when the width is 0%. When the width is made 25% by halving the
value in the output control register, the integrator output will be -0.512 V; at 50% the output
will be 0.0 V. From this information, finding the formulae to obtain the converted bits for
PWM register values ranging from 00000000 to 11111111 in both situations is left as an
exercise for the reader.
The ADC in the system microcontroller can be used in many applications, such as data
acquisition systems (DAS), analog control systems, and voice digitizing systems. Suppose a
system needs to read an analog input from a sensor or transducer circuit. If converted to bits
by the ADC unit in the system, then these bits, after processing, can also give an output. This
provides control for automation by combined use of the ADC and DAC features. The
converted bits can be given to the port meant for digital display, or transferred to a memory
address, a serial port, or a parallel port. A processor may process the converted bits and
generate a pulse code modulated (PCM) output; PCM signals are used for digitizing voice
into digital format.
Important points about the ADC are as follows:
1. Either a single or dual analog reference voltage source is required by the ADC. It sets
either the analog input's upper limit only, or both the lower and upper limits. For a single
reference source, the lower limit is set to 0 V (ground potential). When the analog input
equals the lower limit the ADC generates all bits as 0s, and when it equals the upper limit it
generates all bits as 1s. [As an example, suppose in an ADC the upper limit or reference
voltage is set as 2.255 V and the lower limit reference voltage is 0.255 V. The difference in
the limits is 2 V; therefore, the resolution will be (2/256) V. If the 8-bit ADC analog input is
0.255 V, the converted 8 bits will be 00000000. When the input is (0.255 V + 1.000 V) =
1.255 V, the bits will be 10000000. When the analog input is (0.255 V + 0.500 V) = 0.755 V,
the converted bits will be 01000000. From this information, finding a formula to obtain the
converted bits for a given analog input of v volts is left as an exercise for the reader.]
2. An ADC may be of eight, ten, twelve, or sixteen bits, depending upon the resolution
needed for conversion.
3. A start-of-conversion (STC) signal or input initiates the conversion. In a system, an
instruction or a timer signals the STC.
4. There is an end-of-conversion (EOC) signal. In a system, a flag in a register is set to
indicate the end of conversion and generate an interrupt.
A system requires an interfacing circuit and software to display the status or messages
for a line, for multi-line displays, or for flashing displays. An LCD screen may show a
multi-line display of characters or also show a small graph or icon (called a pictogram). A
recent innovation in mobile phone systems turns the screen blue to indicate an incoming call.
Third-generation phones have both image and graphic displays. An LCD needs little power;
it is powered by a supply or battery (a solar panel in a calculator). The LCD absorbs or
transmits light on application of 3 V to 4 V, 50 or 60 Hz voltage pulses with very small
currents. The pulses are applied with the same polarity on the crystal front and back plane for
no light, or with opposite polarity for light. [Here, polarity at an instance means logic '1' or
'0'.] An LSI (large-scale integrated) display controller is often used in the case of matrix
displays.
For indicating the ON status of the system there may be an LED, which glows when it is ON.
A flashing LED may indicate that a specific task is under completion or is running, or it may
indicate a wait status for a message. The LED is a diode that emits yellow, green, or red light
(or infrared in a remote controller) on application of a forward voltage of 1.6 to 2 V. An LED
needs a current of 5 mA to 12 mA (less in flashing display mode) and is much brighter than
an LCD. Therefore, LEDs are used in a system for flashing displays and for displays limited
to a few digits.
10.Keypad or Keyboard
The keypad or keyboard is an important device for getting user inputs. The system
must provide the necessary interfacing and key-debouncing circuit, as well as the software
for the system to receive input from a set of keys, a keyboard, or a keypad. A keypad has up
to a maximum of 32 keys; a keyboard may have 104 or more keys. The keypad or keyboard
may interface serially or in parallel to the processor, directly through a parallel or serial port
or through a controller.
Question Bank