Embedded Systems Notes
Department of
Electrical & Electronics Engineering
EE 6602 Embedded Systems
Lecture Notes
Year / semester: III / VI
Regulation: 2013
Prepared by
S.B.Vinoth
AP/ EEE
EE6602
EMBEDDED SYSTEMS
L T P C
3 0 0 3
OBJECTIVES:
To introduce the building blocks of embedded systems.
To educate in various embedded development strategies.
To introduce bus communication in processors and input/output interfacing.
To impart knowledge of various processor scheduling algorithms.
To introduce the basics of real-time operating systems, with example tutorials discussing one real-time operating system tool.
UNIT I INTRODUCTION TO EMBEDDED SYSTEMS 9
Introduction to Embedded Systems - The build process for embedded systems - Structural units in
embedded processors - Selection of processor & memory devices - DMA - Memory management methods - Timer and counting devices, Watchdog Timer, Real Time Clock - In-circuit emulator - Target hardware
debugging.
UNIT II EMBEDDED NETWORKING 9
Embedded Networking: Introduction - I/O Device Ports & Buses - Serial bus communication protocols - RS232 standard - RS422 - RS485 - CAN Bus - Serial Peripheral Interface (SPI) - Inter Integrated Circuits
(I2C) - Need for device drivers.
UNIT III EMBEDDED FIRMWARE DEVELOPMENT ENVIRONMENT 9
Embedded Product Development Life Cycle - objectives, different phases of EDLC, modeling of EDLC;
issues in hardware-software co-design, Data Flow Graph, state machine model, sequential program
model, concurrent model, object oriented model.
UNIT IV RTOS BASED EMBEDDED SYSTEM DESIGN 9
Introduction to basic concepts of RTOS - task, process & threads, interrupt routines in RTOS,
multiprocessing and multitasking, preemptive and non-preemptive scheduling, task communication -
shared memory, message passing - inter-process communication - synchronization between processes:
semaphores, mailbox, pipes, priority inversion, priority inheritance; comparison of real-time operating
systems: VxWorks, µC/OS-II, RT Linux.
UNIT V EMBEDDED SYSTEM APPLICATION DEVELOPMENT 9
Case Study of Washing Machine - Automotive Application - Smart Card System Application.
TOTAL: 45 PERIODS
OUTCOMES:
Ability to understand and analyze linear and digital electronic circuits.
TEXT BOOKS:
1. Rajkamal, Embedded System - Architecture, Programming, Design, McGraw Hill, 2013.
2. Peckol, Embedded System Design, John Wiley & Sons, 2010.
3. Lyla B. Das, Embedded Systems - An Integrated Approach, Pearson, 2013.
REFERENCES:
1. Shibu K.V, Introduction to Embedded Systems, Tata McGraw Hill, 2009.
2. Elicia White, Making Embedded Systems, O'Reilly Series, SPD, 2011.
3. Tammy Noergaard, Embedded Systems Architecture, Elsevier, 2006.
4. Han-Way Huang, Embedded System Design Using C8051, Cengage Learning, 2009.
5. Rajib Mall, Real-Time Systems: Theory and Practice, Pearson Education, 2007.
UNIT I
INTRODUCTION TO EMBEDDED SYSTEMS
Introduction to Embedded Systems
The build process for embedded systems
Structural units in embedded processor,
Selection of processor & memory devices
DMA
Memory management methods
Timer and Counting devices,
Watchdog Timer,
Real Time Clock,
In-circuit emulator,
Target Hardware Debugging.
[Figure: building blocks of an embedded system - processors, storage, timers & interrupts, bus controllers]
General Computing: applications similar to desktop computing, but in an embedded package - video games, set-top boxes, wearable computers, automatic tellers.
Control Systems: closed-loop feedback control of real-time systems - vehicle engines, chemical processes, nuclear power, flight control.
Signal Processing: computations involving large data streams.
MEMORY:
Memory is an important part of embedded systems. The cost and performance of an
embedded system depend heavily on the kind of memory devices it utilizes. In this section we
discuss memory classification, memory technologies and memory management.
(1) Memory Classification
Memory Devices can be classified based on following characteristics
(a) Accessibility
(b) Persistence of Storage
(c) Storage Cells (Storage Media)
(d) Storage Density & Cost
(e) Power Consumption
a) Accessibility
Memory devices can provide random access, serial access or block access. In a
random access memory, each word in memory can be directly accessed by specifying the
address of that memory word. RAM, SDRAM, and NOR Flash are examples of random
access memories. In a serial access memory, all the previous words (previous to the word
being accessed) need to be accessed before accessing a desired word. I2C PROM and SPI
PROM are examples of serial access memories. In block access memories, the entire memory
is subdivided into small blocks (generally of the order of a kilobyte). Each block
can be randomly accessed, and each word in a given block is serially accessed. Hard disks
and NAND flash employ a similar mechanism. Word access time for a random
access memory is independent of the word location. This is desirable for high-speed
applications making frequent accesses to the memory.
b) Persistence of Storage
Memory devices can provide volatile or non-volatile storage. In non-volatile storage, the memory contents are preserved even after power shutdown, whereas a
volatile memory loses its contents after power shutdown. Non-volatile storage is needed for
storing application code and reusable data, while volatile memory can be used for
temporary storage. RAM and SDRAM are examples of volatile memory. Hard disks, Flash
(NOR & NAND) memories, SD-MMC, and ROM are examples of non-volatile storage.
c) Storage Cells
Memory devices may employ electronic (in terms of transistors or electron states)
storage, magnetic storage or optical storage. RAM and SDRAM are examples of electronic
storage. Hard disks are examples of magnetic storage. CDs (Compact Discs) are examples of
optical storage. Older computers also employed magnetic storage (magnetic storage is still
common in some consumer electronics products).
d) Storage Density & Cost
Storage Density (number of bits which can be stored per unit area) is generally a good
measure of cost. Dense memories (like SDRAM) are much cheaper than their counterparts (like
SRAM).
e) Power Consumption:
Low Power Consumption is highly desirable in Battery Powered Embedded Systems.
Such systems generally employ memory devices which can operate at low (and ultra low)
Voltage levels. Mobile SDRAMs are example of low power memories.
Memory Technologies
RAM:
RAM stands for Random Access Memory. RAMs are the simplest and most common form of
data storage. RAMs are volatile. A RAM presents data, address and control
signals. The number of words which can be stored in a RAM is two raised to the power of
the number of address lines available. This severely restricts the storage
capacity of RAMs (a 32 GB byte-addressable RAM requires 35 address lines), because designing circuit boards
with more signal lines directly adds to complexity and cost.
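As a quick check of this power-of-two arithmetic, the small C sketch below computes the number of address lines needed for a given capacity (the function name is ours, for illustration only):

#include <stdio.h>

/* Smallest n such that 2^n >= words: the number of address lines
 * needed to select any one of 'words' memory words. Each extra
 * line doubles the addressable capacity. */
static unsigned address_lines_needed(unsigned long long words)
{
    unsigned n = 0;
    while ((1ULL << n) < words)
        n++;
    return n;
}

int main(void)
{
    /* 32 GB of byte-addressable memory = 2^35 words */
    printf("32 GB needs %u address lines\n",
           address_lines_needed(32ULL * 1024 * 1024 * 1024)); /* 35 */
    return 0;
}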
DPRAM (Dual Port RAM):
DPRAM are static RAMs with two I/O ports. These two ports access the same memory
locations - hence DPRAMs are generally used to implement shared memory in dual-processor
systems. The operations performed on a single port are identical to those on any RAM. There are some
common problems associated with the usage of DPRAM:
(a) Possibility of data corruption when both ports try to access the same memory location - most DPRAM devices provide interlocked memory accesses to avoid this problem.
(b) Data coherency when a cache scheme is being used by the processor accessing the DPRAM - this
happens because any data modifications (in the DPRAM) by one processor are unknown to the
cache controller of the other processor. In order to avoid such issues, shared memories are not
mapped to cacheable space. In case the processor's cache configuration is not flexible enough (to
define the shared memory space as non-cacheable), the cache needs to be flushed before
performing any reads from this memory space.
Dynamic RAM:
Dynamic RAMs use a different storage technique for data storage. A Static RAM has four
transistors per memory cell, whereas Dynamic RAMs have only one transistor per memory cell.
The DRAMs use capacitive storage. Since the capacitor can lose charge, these memories need to
be refreshed periodically. This makes DRAMs more complex (because of the extra refresh
control circuitry) and more power consuming. However, DRAMs have a very high storage density (compared
to static RAMs) and are much cheaper. DRAMs are generally accessed in terms of rows,
columns and pages, which significantly reduces the number of address lines (another advantage
over SRAM). Generally you need an SDRAM controller (which manages the different SDRAM
commands and address translation) to access an SDRAM. Most modern processors come
with an on-chip SDRAM controller.
OTP- EPROM, UV-EPROM and EEPROM:
EPROMs (Erasable Programmable Read Only Memory) are non-volatile
memories. Contents of a ROM can be randomly accessed, but generally the word RAM is used to
refer only to volatile random access memories. The voltage required for writing into an
EPROM is much higher than its operating voltage. Hence you cannot write into an EPROM in-circuit
(which is what makes it read-only in normal operation). You need special programming stations
(which have a write mechanism) to write into EPROMs.
OTP-EPROMs are One Time Programmable. Contents of these memories cannot be
changed, once written. UV-EPROM are UV erasable EPROMs. Exposure of memory cells, to UV
light erases the existing contents of these memories and these can be re-programmed after that.
EEPROM are Electrically Erasable EPROMs. These can be erased electrically (generally on the
same programming station where you write into them). The number of write cycles (times you
can erase and re-write) for UV-EPROM and EEPROM is fairly limited. Erasable PROMs use
either FLOTOX (Floating gate Tunnel Oxide) or FAMOS (Floating gate Avalanche MOS)
technology.
Flash (NOR):
Flash (or NOR flash, to be more accurate) is quite similar to EEPROM in usage and can
be considered in the class of EEPROM (since it is electrically erasable). However, there are a few
differences. Firstly, flash devices are in-circuit programmable. Secondly, they are much
cheaper compared to conventional EEPROMs. These days (NOR) flash is widely used for
storing boot code.
NAND FLASH:
These memories are denser and cheaper than NOR flash. However, these memories are
block accessible, and cannot be used for code execution. These devices are mostly used for data
storage (since they are cheaper than NOR flash). However, some systems use them for storing boot
code (with external hardware or with built-in NAND boot logic in the
processor).
SD-MMC
SD-MMC cards provide a cheap means of mass storage. These memory cards can provide
storage capacities of the order of gigabytes. These cards are very compact and can be used with
portable systems. Most modern hand-held devices requiring mass storage (e.g. still and video
cameras) use Memory cards for storage.
Hard Disc:
Hard disks are magnetic memory devices. These devices are bulky and require other
bulky hardware (a disk drive) for reading them. These memories are generally used for
mass storage. Hence these memories are not found in smaller, portable systems. However, they
are used in embedded systems which require bulk storage without tight size
constraints.
Memory Management
Cache Memory:
The size and speed (access time) of computer memories are inversely related:
increasing the size reduces speed. In fact, most large memories are made up of smaller
memory blocks (generally 4 KB) in order to improve speed. The cost of a memory is also highly
dependent on its speed. In order to achieve good performance it is desirable that code
and data reside in high-speed memory. However, using high-speed memory for all the
code and data in a reasonably large system may be practically impossible. Even in a smaller
system, using high-speed memory as the only storage device can raise the system cost
steeply.
Most systems employ a hierarchical memory system. They employ a small, fast (and
expensive) memory device to store frequently used code and data, whereas less frequently
used data is stored in a big, low-speed (cheaper) memory device. In a complex system there can be
multiple levels in the memory hierarchy (with varying speed and cost).
A cache controller is hardware (generally built into the processor) which dynamically
moves the code and data currently in use from a higher-level (slower) memory to the lowest-level
(cache) memory. The incoming data or code replaces old code or data (which
is currently not being used) in the cache memory. The data (or code) movement is hidden from the
user. Cache memories are based on the principle of locality in space and time. There are different
types of cache organization and replacement mechanisms.
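To make this concrete, here is a minimal sketch of a direct-mapped cache lookup. The geometry (a 4 KB cache with 32-byte lines) and all names are illustrative assumptions, not taken from any particular processor:

#include <stdint.h>

/* Hypothetical direct-mapped cache: 4 KB total, 32-byte lines
 * -> 128 lines. A lookup splits the address into offset, index, tag. */
#define LINE_SIZE 32u                    /* bytes per line */
#define NUM_LINES 128u                   /* 4096 / 32      */

struct cache_line {
    uint32_t tag;
    int      valid;
    uint8_t  data[LINE_SIZE];
};

static struct cache_line cache[NUM_LINES];

/* Returns 1 on hit, 0 on miss (the miss path would fetch the line
 * from the slower backing memory and replace the old contents). */
static int cache_lookup(uint32_t addr)
{
    uint32_t index = (addr / LINE_SIZE) % NUM_LINES; /* which line  */
    uint32_t tag   = addr / (LINE_SIZE * NUM_LINES); /* which block */

    return cache[index].valid && cache[index].tag == tag;
}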
Virtual Memory:
The virtual memory mechanism allows users to store their data on a hard disk while still using it as
if it were available in RAM. The application accesses the data in a virtual address space
(which is mapped to RAM), whereas the actual data physically resides on the hard disk (and is moved
to RAM for access).
Paging Mechanism:
In virtual mode, memory is divided into pages, usually 4096 bytes long (the page size). These
pages may reside in any available RAM location that can be addressed in virtual mode. The high-order
bits in the memory address register are an index into page-mapping tables at specific starting
locations in memory, and the table entries contain the starting real addresses of the corresponding
pages. The low-order bits in the address register are an offset of 0 up to 4095 (0 to the page size minus 1) into the page ultimately referenced by resolving all the table references of page locations.
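A small C example of this address split, assuming the 4096-byte page size from the text (the example address itself is arbitrary):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE  4096u   /* page size from the text */
#define PAGE_SHIFT 12u     /* log2(4096)              */

int main(void)
{
    uint32_t vaddr  = 0x00403A7C;               /* example virtual address */
    uint32_t page   = vaddr >> PAGE_SHIFT;      /* index into page tables  */
    uint32_t offset = vaddr & (PAGE_SIZE - 1u); /* 0 .. 4095               */

    /* The page-mapping table entry for 'page' holds the starting real
     * (physical) address of the page; the offset is added to it. */
    printf("page %u, offset %u\n", page, offset);
    return 0;
}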
The distinct advantages of Virtual Memory Mechanism are:
(a) User can access (in virtual space) more RAM space than what actually exists in the system.
(b) In a multi-tasking application, each task can have its own independent virtual address space
(called discrete address space).
(c) Applications can treat data as if it is stored in contiguous memory (in the virtual address space),
whereas it may be in non-contiguous locations (in actual memory).
Cache Vs Virtual Memory
Cache memory and virtual memory are quite similar in concept and they provide similar
benefits. However, these schemes differ significantly in terms of implementation:
Cache control is fully implemented in hardware. Virtual memory management is done by software
(the operating system) with some minimal support from hardware. With cache memory in use, the user
still makes accesses to the actual physical memory (and the cache is hidden from the user). It is
the reverse with virtual memory: the user accesses the virtual memory, and the actual physical
memory is hidden from the user.
Introduction to Counter/Timers:
Counter/timer hardware is a crucial
component of most embedded systems. In some
cases, a timer measures elapsed time (counting processor clock ticks). In others, we want to count
or time external events. The names counter and timer can be used interchangeably when talking
about the hardware. The difference in terminology has more to do with how the hardware is used
in a given application.
Consider a simple timer similar to those often included on-chip within a microcontroller. You could
build something similar from a couple of 74HC161 counters or a programmable logic device. The
timer shown consists of a loadable 8-bit count register, an input clock signal, and an output signal.
Software loads the count register with an initial value between 0x00 and 0xFF. Each subsequent
transition of the input clock signal increments that value.
When the 8-bit count overflows, the output signal is asserted. The output signal may
thereby trigger an interrupt at the processor or set a bit that the processor can read. To restart the
timer, software reloads the count register with the same or a different initial value.
If a counter is an up counter, it counts up from the initial value toward 0xFF. A down counter
counts down, toward 0x00.
A typical counter will have some means to start the counter running once it is loaded,
usually by setting a bit in a control register. This is not shown in the figure. A real counter would
generally also provide a way for the processor to read the current value of the count register at any
time, over the data bus.
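A register-level sketch of driving such a timer follows. The register addresses and bit positions are invented for illustration; a real microcontroller's data sheet defines its own:

#include <stdint.h>

/* Hypothetical memory-mapped registers for the simple 8-bit timer
 * described above; addresses and bit assignments are assumptions. */
#define TIMER_COUNT  (*(volatile uint8_t *)0x4000) /* count register   */
#define TIMER_CTRL   (*(volatile uint8_t *)0x4001) /* control register */
#define TIMER_STATUS (*(volatile uint8_t *)0x4002) /* status register  */

#define CTRL_START (1u << 0)   /* assumed 'run' bit     */
#define STATUS_OVF (1u << 0)   /* assumed overflow flag */

void timer_delay(uint8_t initial)
{
    TIMER_COUNT = initial;      /* load initial value (0x00..0xFF)   */
    TIMER_CTRL |= CTRL_START;   /* start counting input clock ticks  */

    /* Busy-wait until the 8-bit count overflows and the overflow
     * flag is set; an interrupt could be used instead of polling. */
    while (!(TIMER_STATUS & STATUS_OVF))
        ;

    TIMER_CTRL &= (uint8_t)~CTRL_START;  /* stop the timer */
}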
Semi-automatic:
A timer with automatic reload capability will have a latch register to hold the count written
by the processor. When the processor writes to the latch, the count register is written as well. When
the timer later overflows, it first generates an output signal. Then, it automatically reloads the
contents of the latch into the count register. Since the latch still holds the value written by the
processor, the counter will begin counting again from the same initial value.
Such a timer will produce a regular output with the same accuracy as the input clock. This
output could be used to generate a periodic interrupt like a real-time operating system (RTOS)
timer tick, provide a baud rate clock to a UART, or drive any device that requires a regular pulse.
A variation of this feature found in some timers uses the value written by the processor as
the endpoint rather than the initial count. In this case, the processor writes into a terminal count
register that is constantly compared with the value in the count register. The count register is
always reset to zero and counts up. When it equals the value in the terminal count register, the
output signal is asserted. Then the count register is reset to zero and the process repeats. The
terminal count remains the same. The overall effect is the same as an overflow counter. A periodic
signal of a pre-determined length will then be produced.
If a timer supports automatic reloading, it will often make this a software-selectable
feature. To distinguish between a count that will not repeat automatically and one that will, the
hardware is said to be in one of two modes: one-shot or periodic. The mode is generally controlled
by a field in the timer's control register.
Watchdog timers:
Many embedded systems are simply not accessible to human operators. If their software ever hangs, such systems are
permanently disabled. In other cases, the speed with which a human operator might reset the
system would be too slow to meet the uptime requirements of the product.
A watchdog timer is a piece of hardware that can be used to automatically detect software
anomalies and reset the processor if any occur. Generally speaking, a watchdog timer is based on
a counter that counts down from some initial value to zero. The embedded software selects the
counter's initial value and periodically restarts it. If the counter ever reaches zero before the
software restarts it, the software is presumed to be malfunctioning and the processor's reset signal
is asserted. The processor (and the embedded software it's running) will be restarted as if a human
operator had cycled the power.
Below figure shows a typical arrangement. As shown, the watchdog timer is a chip external
to the processor. However, it could also be included within the same chip as the CPU. This is done
in many microcontrollers. In either case, the output from the watchdog timer is tied directly to the
processor's reset signal.
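In software, restarting ("kicking") the watchdog is typically a one-line write inside the main loop. The register address and key value below are assumptions for illustration; real parts often require a specific service sequence:

#include <stdint.h>

/* Hypothetical watchdog restart register (invented address). */
#define WDT_RESTART (*(volatile uint8_t *)0x5000)

static void kick_watchdog(void)
{
    WDT_RESTART = 0xA5;   /* assumed key value that reloads the counter */
}

int main(void)
{
    for (;;) {
        /* One pass of useful work; it must complete well within the
         * watchdog timeout, or the processor will be reset.         */
        /* do_work(); */

        kick_watchdog();  /* restart the count before it reaches zero */
    }
}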
Direct Memory Access (DMA):
In burst mode, the DMA controller keeps control of the bus until all the
data buffered by the requesting device has been transferred to memory (or until the output device's
buffer is full, if writing to a peripheral).
In single-cycle mode, the DMA controller gives up the bus after each transfer. This
minimizes the amount of time that the DMA controller keeps the processor off of the memory bus,
but it requires that the bus request/acknowledge sequence be performed for every transfer. This
overhead can result in a drop in overall system throughput if a lot of data needs to be transferred.
In most designs, you would use single cycle mode if your system cannot tolerate more than
a few cycles of added interrupt latency. Likewise, if the peripheral devices can buffer very large
amounts of data, causing the DMA controller to tie up the bus for an excessive amount of time,
single-cycle mode is preferable.
Note that some DMA controllers have larger address registers than length registers. For
instance, a DMA controller with a 32-bit address register and a 16-bit length register can access a
4GB memory space, but can only transfer 64KB per block. If your application requires DMA
transfers of larger amounts of data, software intervention is required after each block.
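A sketch of that software intervention: a hypothetical controller with a 32-bit address register and a 16-bit length register, with the driver splitting large transfers into blocks of at most 64 KB. All register names and addresses are invented for illustration:

#include <stdint.h>
#include <stddef.h>

/* Hypothetical DMA controller registers (assumed, for illustration). */
#define DMA_ADDR (*(volatile uint32_t *)0x6000)  /* 32-bit address */
#define DMA_LEN  (*(volatile uint16_t *)0x6004)  /* 16-bit length  */
#define DMA_CTRL (*(volatile uint8_t  *)0x6006)
#define DMA_GO   (1u << 0)

static void dma_wait_done(void) { /* poll a status bit or sleep on an IRQ */ }

/* Transfer 'total' bytes starting at 'addr', one block at a time.
 * With a 16-bit length register each block is at most 0xFFFF bytes,
 * so software must re-program the controller after every block. */
void dma_transfer(uint32_t addr, size_t total)
{
    while (total > 0) {
        uint16_t block = (total > 0xFFFFu) ? 0xFFFFu : (uint16_t)total;

        DMA_ADDR  = addr;
        DMA_LEN   = block;
        DMA_CTRL |= DMA_GO;
        dma_wait_done();

        addr  += block;
        total -= block;
    }
}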
Get on the bus
The simplest way to use DMA is to select a processor with an internal DMA controller.
This eliminates the need for external bus buffers and ensures that the timing is handled correctly.
Also, an internal DMA controller can transfer data to on-chip memory and peripherals, which is
something that an external DMA controller cannot do. Because the handshake is handled on-chip,
the overhead of entering and exiting DMA mode is often much faster than when an external
controller is used.
If an external DMA controller or processor is used, be sure that the hardware handles the transition
between transfers correctly. To avoid the problem of bus contention, ensure that bus requests are
inhibited if the bus is not free. This prevents the DMA controller from requesting the bus before
the processor has reacquired it after a transfer.
So you see, DMA is not as mysterious as it sometimes seems. DMA transfers can provide real
advantages when the system is properly designed.
In-circuit emulators:
An in-circuit emulator (ICE) is one of the oldest embedded debugging tools, and is still
unmatched in power and capability. It is the only tool that substitutes its own internal processor
for the one in your target system. Using one of a number of hardware tricks, the emulator can
monitor everything that goes on in this on-board CPU, giving you complete visibility into the
target code's operation. In a sense, the emulator is a bridge between your target and your
workstation, giving you an interactive terminal peering deeply into the target and a rich set of
debugging resources.
Until just a few years ago, most emulators physically replaced the target processor. Users
extracted the CPU from its socket, plugging the emulator's cable in instead. Today, we're usually
faced with a soldered-in surface-mounted CPU, making connection strategies more difficult. Some
emulators come with an adapter that clips over the surface-mount processor, tri-stating the device's
core, and replacing it with the emulator's own CPU. In other cases, the emulator vendor provides
adapters that can be soldered in place of the target CPU. As chip sizes and lead pitches shrink, the
range of connection approaches expands.
Beware: connecting the emulator will probably be difficult and frustrating. Physical
features of the target system and CPU placement can get in the way of some adapters, so plan for
ICE insertion at hardware design time, if at all possible. Add at least a few days to your schedule.
Work closely with the vendors to surmount these difficulties.
Target access:
An emulator's most fundamental resource is target access: the ability to examine and
change the contents of registers, memory, and I/O. However, since the ICE replaces the CPU, it
generally does not need working hardware to provide this capability. This makes the ICE, by far,
the best tool for troubleshooting new or defective systems. For example, you can repeatedly access
a single byte of RAM or ROM, creating a known and consistent stimulus to the system that is easy
to track using an oscilloscope.
Breakpoints are another important debugging resource. They give you the ability to stop
your program at precise locations or conditions (like "stop just before executing line 51").
Emulators also use breakpoints to implement single stepping, since the processor's single-step
mode, if any, isn't particularly useful for stepping through C code.
There's an important distinction between the two types of breakpoints used by different
sorts of debuggers. Software breakpoints work by replacing the destination instruction by a
software interrupt, or trap, instruction. Clearly, it's impossible to debug code in ROM with
software breakpoints. Emulators generally also offer some number of hardware breakpoints, which
use the unit's internal magic to compare the break condition against the execution stream.
Hardware breakpoints work in RAM or ROM/flash, or even unused regions of the processor's
address spaces.
Complex breakpoints let us ask deeper questions of the tool. A typical condition might be:
"Break if the program writes 0x1234 to variable buffer, but only if function get_data() was called
first." Some software-only debuggers (like the one included with Visual C++) offer similar power,
but interpret the program at a snail's pace while watching for the trigger condition. Emulators
implement complex breakpoints in hardware and, therefore, impose (in general) no performance
penalty.
ROM and, to some extent, flash add to debugging difficulties. During a typical debug
session we might want to recompile and download code many times an hour. You can't do that
with ROM. An ICE's emulation memory is high-speed RAM, located inside of the emulator itself
that maps logically in place of your system's ROM. With that in place, you can download firmware
changes at will.
Many ICEs have programmable guard conditions for accesses to both the emulation and
target memory. Thus, it's easy to break when, say, the code wanders off and tries to write to
program space, or attempts any sort of access to unused addresses. Nothing prevents you from
mapping emulation memory in place of your entire address space, so you can actually debug much
of the code with no working hardware. Why wait for the designers to finish? They'll likely be late
anyway. Operate the emulator in stand-alone mode (without the target) and start debugging code
long before engineering delivers prototypes.
Real-time trace is one of the most important emulator features, and practically unique to
this class of debugging tool. Trace captures a snapshot of your executing code to a very large
memory array, called the trace buffer, at full speed. It saves thousands to hundreds of thousands
of machine cycles, displaying the addresses, the instructions, and transferred data. The emulator
and its supporting software translate raw machine cycles into assembly code or even C/C++
statements, drawing on your source files and the link map for assistance.
Trace is always accompanied by sophisticated triggering mechanisms. It's easy to start and
stop trace collection based on what the program does. An example might be to capture every
instance of the execution of an infrequent interrupt service routine. You'll see everything the ISR
does, with no impact on the real time performance of the code.
Generally, emulators use no target resources. They don't eat your stack space, memory, or
affect the code's execution speed. This "non-intrusive" aspect is critical for dealing with real-time
systems.
Practical realities
Be aware, though, that emulators face challenges that could change the nature of their
market and the tools themselves. As processors shrink, it gets awfully hard to connect anything to
those whisker-thin package leads. ICE vendors offer all sorts of adapter options, some of which
work better than others.
Skyrocketing CPU speeds also create profound difficulties. At 100MHz, each machine
cycle lasts a mere 10ns; even an 18-inch cable between your target and the ICE starts to act as a
complex electrical circuit rather than a simple wire. One solution is to shrink the emulator, putting
all or most of the unit nearer the target socket. As speeds increase, though, even this option faces
tough electrical problems.
Skim through the ads in Embedded Systems Programming and you'll find a wide range of
emulators for 8- and 16-bit processors, the arena where speeds are more tractable. Few emulators
exist, though, for higher-end processors, due to the immense cost of providing fast emulation
memory and a reliable yet speedy connection.
Oddly, one of the biggest user complaints about emulators is their complexity of use. Too
many developers never use any but the most basic ICE features. Sophisticated triggering and
breakpoint capabilities invariably require rather complex setup steps. Figure on reading the manual
and experimenting a bit. Such up-front time will pay off later in the project. Time spent in learning
tools always gets the job done faster.
EMBEDDED SYSTEM DEBUGGING
Debugging tools
Application Debugging: Simulators and emulators are two powerful debugging tools which
allow developers to debug (and verify) their application code. These tools enable programmer
to perform the functional tests and performance tests on the application code. Simulator is a
software which tries to imitate a given processor or hardware. Simulator is based on the
mathematical model of the processor. Generally all the functional errors in an application can
16
be detected by running it on the simulator. Since simulator is not actual device itself, it may
not be an exact replica of the target hardware. Hence, some errors can pass undetected through
the simulator. Also, the performance of an application cannot be accurately measured using
Simulator (it only provides a rough estimate). Generally most development tools come under
an integrated environment, where Editor, Compiler, Archiver, Linker and Simulator are
integrated together. Emulator (or Hardware Emulator) provides a way to run the application
on actual target (but under the control of a emulation software) hardware. Results are more
accurate with emulation, as the application is actually running on the real hardware target.
Hardware Debugging: Developer of an Embedded System often encounters problems which
are related to the Hardware. Hence it is desirable to gain familiarity with some Hardware
debugging (probing) tools. The DVM, oscilloscope (DSO or CRO) and Logic Analyzer (LA)
are some of the common tools used in the day-to-day debugging process.
Memory Testing Tools: There are a number of commercially available tools which help
programmers test memory-related problems in their code. Apart from memory leaks,
these tools can catch other memory-related errors - e.g. freeing a previously allocated memory
block more than once, writing to uninitialized memory, etc. Here is a list of some freely (no-cost)
available Memory Testing tools:
dmalloc
DUMA
valgrind
memwatch
memCheckDeluxe
Debugging an Embedded System
(a) Memory Faults:
One of the major issues in embedded systems is memory faults. The following types of
memory fault are possible in a system:
(i) Memory Device Failure: Sometimes the memory device may get damaged (common
causes are current transients and static discharge). If damaged, the memory
device needs replacement. Such errors can occur at run time. However, such failures are
very rare.
(ii) Address Line Failure: Improper functioning of address lines can lead to memory
faults. This could happen if one or more address lines are shorted (either with ground or
with each other or with some other signal on the circuit board). Generally these errors occur
during the production of the circuit board, and post-production testing can catch them.
Sometimes the address line drivers might get damaged during run time (again due to
current transients or static discharge). This can lead to address line faults during run time.
(iii) Data Line Failure: Can occur if the data lines (one or more) are shorted (to ground or
with each other or with some other signal). Such errors can be detected and rectified during
post-production testing. Again, electric discharge and current transients can damage
the data line drivers, which can cause memory failures during run time.
(iv) Corruption of a few memory blocks: Sometimes a few address locations in the memory
can be permanently damaged (either stuck low or stuck high). Such errors are more
common with hard disks (less common with RAMs). The test software (power-on self-test) can detect these errors and avoid using the bad memory sectors (rather than replacing
the whole memory).
(v) Other Faults: Sometimes the memory device may be loosely inserted into (or may be
completely missing from) the memory slot. There is also the possibility of faults in control
signals (similar to address and data lines).
There are two types of sections in system memory - program (or code) sections and data
sections. Faults in program sections are more critical, because corruption of even a single
location can cause the program to crash. Corruption of data memory can also lead to program
crashes, but mostly it only causes erratic system behavior (from which the application can
gracefully recover - provided the software design takes care of error handling).
Memory Tests:
Following simple tests can detect memory faults:
(a) Write a known pattern "0xAAAA" (all odd data bits "1" and even bits
"0") into the memory (across the full address range) and read it back. Verify that the same
value (0xAAAA) is read back. If any odd data line is shorted (with an even data line or with
ground), this test will detect it. Now repeat the same test with the data pattern "0x5555". This
test will detect any shorting of the even data lines (short with ground or with an odd data line).
Also, these two tests in combination can detect bad memory sectors.
(b) Write a unique value into each memory word (across the entire memory range).
The easiest way to choose this unique value is to use the address of the given word as the value.
Now read back these values and verify them. If the verification of the read-back values fails
(whereas test (a) passes), then there could be a fault in the address lines.
The tests "a" and "b" can be easily performed as part of power on checks on the
system. However it will be tricky to perform these tests during run time, because
performing these test will mean losing the existing contents in the memory. However
certain systems run such memory tests during run time (once in every few days). In such
scenarios, the tests should be performed on smaller memory sections at a time. Data of
these memory sections can be backed up before performing the test, and this data can be
restored after test completion. Tests can be run one by one on each section (rather than
running the test on entire memory at a time).
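A minimal C sketch of tests (a) and (b), assuming a 16-bit-wide memory region passed in by the caller. Being destructive, it belongs in the power-on checks:

#include <stdint.h>
#include <stddef.h>

/* Destructive power-on memory tests for the region [base, base+words).
 * Test (a): alternating 0xAAAA / 0x5555 patterns catch shorted data
 * lines and bad cells. Test (b): writing each word's own index catches
 * address-line faults. Run these before the memory holds live data. */

static int test_data_pattern(volatile uint16_t *base, size_t words,
                             uint16_t pattern)
{
    size_t i;
    for (i = 0; i < words; i++)
        base[i] = pattern;
    for (i = 0; i < words; i++)
        if (base[i] != pattern)
            return -1;                  /* data-line or cell fault */
    return 0;
}

static int test_address_lines(volatile uint16_t *base, size_t words)
{
    size_t i;
    for (i = 0; i < words; i++)
        base[i] = (uint16_t)i;          /* unique value: own index */
    for (i = 0; i < words; i++)
        if (base[i] != (uint16_t)i)
            return -1;                  /* address-line fault */
    return 0;
}

int memory_test(volatile uint16_t *base, size_t words)
{
    if (test_data_pattern(base, words, 0xAAAA)) return -1;
    if (test_data_pattern(base, words, 0x5555)) return -1;
    return test_address_lines(base, words);
}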
(b) Hardware Vs Software Faults:
In an embedded system, software is closely knit with the hardware. Hence, the line
dividing hardware and software issues is very thin. At times, you may keep debugging
your software, whereas the fault may lie somewhere in the hardware (and vice versa). The
problem becomes more challenging when such faults occur at random and cannot be
reproduced consistently. In order to dissect and debug such tricky issues, a step-wise
approach needs to be followed.
PART - B
1. Explain the build process of an embedded system.
2. Explain the process of selecting embedded processors.
3. Write a note on direct memory access.
4. Write about watchdog timers.
5. Give a brief note on timers and counters.
6. Give a brief note on debugging tools.
UNIT II
EMBEDDED NETWORKING
Introduction to embedded networking
I/O Device Ports & Buses
Serial Bus communication protocols
RS232 standard
RS422
RS485
CAN Bus
Serial Peripheral Interface (SPI)
Inter Integrated Circuits (I2C)
Need for device drivers
RS232:
The output signal level usually swings between +12 V and -12 V. The "dead area" between
+3 V and -3 V is designed to absorb line noise. In the various RS-232-like definitions this dead area
may vary. For instance, the definition for V.10 has a dead area from +0.3 V to -0.3 V. Many receivers
designed for RS-232 are sensitive to differentials of 1 V or less.
This can cause problems when using pin-powered widgets - line drivers, converters,
modems etc. These types of units need enough voltage and current to power themselves up. A typical
UART (the RS-232 I/O chip) allows up to 50 mA per output pin - so if the device needs 70 mA to
run, we would need to use at least 2 pins for power. Some devices are very efficient and only
require one pin (sometimes the Transmit or DTR pin) to be high - in the "SPACE" state - while idle.
An RS-232 port can supply only limited power to another device. The number of output
lines, the type of interface driver IC, and the state of the output lines are important considerations.
The types of driver ICs used in serial ports can be divided into three general categories:
Drivers which require plus (+) and minus (-) voltage power supplies such as the 1488 series
of interface integrated circuits. (Most desktop and tower PCs use this type of driver.)
Low power drivers which require one +5 volt power supply. This type of driver has an
internal charge pump for voltage conversion. (Many industrial microprocessor controls use
this type of driver.)
Low voltage (3.3 v) and low power drivers which meet the EIA-562 Standard. (Used on
notebooks and laptops.)
Data is transmitted and received on pins 2 and 3 respectively. Data Set Ready (DSR) is an
indication from the Data Set (i.e., the modem or DSU/CSU) that it is on. Similarly, DTR indicates
to the Data Set that the DTE is on. Data Carrier Detect (DCD) indicates that a good carrier is being
received from the remote modem.
Pins 4, RTS (Request to Send - from the transmitting computer), and 5, CTS (Clear to Send - from the data set), are used for flow control. In most asynchronous situations, RTS and CTS are
constantly on throughout the communication session. However, where the DTE is connected to a
multipoint line, RTS is used to turn carrier on the modem on and off. On a multipoint line, it's
imperative that only one station is transmitting at a time (because they share the return phone pair).
When a station wants to transmit, it raises RTS. The modem turns on carrier, typically waits a few
milliseconds for carrier to stabilize, and then raises CTS. The DTE transmits when it sees CTS up.
When the station has finished its transmission, it drops RTS and the modem drops CTS and carrier
together.
Clock signals (pins 15, 17, & 24) are only used for synchronous communications. The modem
or DSU extracts the clock from the data stream and provides a steady clock signal to the DTE.
Note that the transmit and receive clock signals do not have to be the same, or even at the same
baud rate.
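For the PC (DTE) side, here is a hedged example of opening and configuring such a port using the POSIX termios API, with RTS/CTS hardware flow control enabled. The device path is an assumption, and CRTSCTS is a common but non-POSIX flag:

#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Open and configure a PC-side serial port at 9600 baud, 8N1 raw
 * mode, with RTS/CTS hardware flow control. Returns fd or -1.
 * Call as: int fd = open_serial("/dev/ttyS0");  (assumed path) */
int open_serial(const char *path)
{
    int fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;

    struct termios tio;
    if (tcgetattr(fd, &tio) < 0) { close(fd); return -1; }

    cfmakeraw(&tio);                /* raw 8-bit data path        */
    cfsetispeed(&tio, B9600);
    cfsetospeed(&tio, B9600);
    tio.c_cflag |= CLOCAL | CREAD;  /* ignore modem status lines  */
    tio.c_cflag |= CRTSCTS;         /* RTS/CTS flow control (non-
                                       POSIX but widely supported) */

    if (tcsetattr(fd, TCSANOW, &tio) < 0) { close(fd); return -1; }
    return fd;
}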
[Table: RS-232 connector pinout - pin numbers (including Protective Ground, pins 20 and 22), signal names, and direction of each signal]
RS422
RS-422, also known as TIA/EIA-422, is a technical standard originated by the Electronic
Industries Alliance that specifies electrical characteristics of a digital signaling circuit. Differential
signaling can transmit data at rates as high as 10 Mbit/s, or may be sent on cables as long as 1500
meters. Some systems directly interconnect using RS-422 signals, or RS-422 converters may be
used to extend the range of RS-232 connections. The standard only defines signal levels; other
properties of a serial interface, such as electrical connectors and pin wiring, are set by other
standards.
RS-422 is the common short form title of American National Standards Institute (ANSI)
standard ANSI/TIA/EIA-422-B Electrical Characteristics of Balanced Voltage Differential
Interface Circuits and its international equivalent ITU-T Recommendation T-REC-V.11, also
known as X.27. These technical standards specify the electrical characteristics of the balanced
voltage digital interface circuit. RS-422 provides for data transmission, using balanced, or
differential, signaling, with unidirectional/non-reversible, terminated or non-terminated
transmission lines, point to point, or multi-drop. In contrast to EIA-485 (which is multi-point
instead of multi-drop), RS-422/V.11 does not allow multiple drivers but only multiple receivers.
RS-422 specifies the electrical characteristics of a single balanced signal. The standard was
written to be referenced by other standards that specify the complete DTE/DCE interface for
applications which require a balanced voltage circuit to transmit data. These other standards would
define protocols, connectors, pin assignments and functions. Standards such as EIA-530 (DB-25 connector) and EIA-449 (DC-37 connector) use RS-422 electrical signals. Some RS-422
devices have 4 screw terminals for pairs of wire, with one pair used for data in each direction.
RS-422 cannot implement a true multi-point communications network such as with EIA-485, since there can be only one driver on each pair of wires. However, one driver can be connected
to up to ten receivers.
RS-422 can interoperate with interfaces designed to MIL-STD-188-114B, but they are not
identical. RS-422 uses a nominal 0 to 5 volt signal while MIL-STD-188-114B uses a signal
symmetric about 0 V. However the tolerance for common mode voltage in both specifications
allows them to interoperate. Care must be taken with the termination network.
RS485
RS-485 allows multiple devices (up to 32) to communicate at half-duplex on a single pair
of wires, plus a ground wire (more on that later), at distances up to 1200 meters (4000 feet). Both
the length of the network and the number of nodes can easily be extended using a variety of
repeater products on the market.
Data is transmitted differentially on two wires twisted together, referred to as a "twisted
pair." The properties of differential signals provide high noise immunity and long distance
capabilities. A 485 network can be configured two ways: "two-wire" or "four-wire." In a "two-wire"
network the transmitter and receiver of each device are connected to a single twisted pair. "Four-wire"
networks have one master port with the transmitter connected to each of the "slave" receivers
on one twisted pair. The "slave" transmitters are all connected to the "master" receiver on a second
twisted pair. In either configuration, devices are addressable, allowing each node to be
communicated with independently. Only one device can drive the line at a time, so drivers must be
put into a high-impedance mode (tri-state) when they are not in use. Some RS-485 hardware
handles this automatically; in other cases, the 485 device software must use a control line to handle
the driver. (If your 485 device is controlled through an RS-232 serial port, this is typically done
with the RTS handshake line.) A consequence of tri-stating the drivers is a delay between the end
of a transmission and when the driver is tri-stated. This turn-around delay is an important part of
a two-wire network, because during that time no other transmissions can occur (not the case in a
four-wire configuration). An ideal delay is the length of one character at the current baud rate (i.e.
about 1 ms at 9600 baud). A sketch of driver control via RTS follows.
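The sketch below shows the RTS-controlled driver enable and turn-around delay just described, using the Linux/POSIX serial ioctls. The wiring of RTS to the converter's driver-enable pin is an assumption about the hardware:

#include <sys/ioctl.h>
#include <termios.h>
#include <unistd.h>

/* Two-wire RS-485 through an RS-232 converter whose line-driver
 * enable is wired to RTS. Assert RTS, send, wait for the UART to
 * drain, hold for the turn-around delay, then tri-state the driver.
 * 'fd' is an already-configured serial port file descriptor. */
ssize_t rs485_send(int fd, const void *buf, size_t len)
{
    int rts = TIOCM_RTS;
    ssize_t n;

    ioctl(fd, TIOCMBIS, &rts);   /* enable the line driver      */
    n = write(fd, buf, len);
    tcdrain(fd);                 /* wait until all bits are out */

    /* Turn-around delay: roughly one character time, e.g. about
     * 1 ms at 9600 baud, before releasing the bus to others.   */
    usleep(1000);
    ioctl(fd, TIOCMBIC, &rts);   /* tri-state: release the bus  */
    return n;
}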
Two-wire or four-wire? Two-wire 485 networks have the advantage of lower wiring costs
and the ability for nodes to talk amongst themselves. On the downside, two-wire mode is limited
to half-duplex and requires attention to turn-around delay. Four-wire networks allow full-duplex
operation, but are limited to master-slave situations (i.e. a "master" node requests information from
individual "slave" nodes). "Slave" nodes cannot communicate with each other. Remember when
ordering your cable, "two-wire" is really two wires + ground, and "four-wire" is really four wires
+ ground.
RS485 software handles addressing, turn-around delay, and possibly the driver tri-state
features of 485. Determine before any purchase whether your software handles these features.
Remember, too much or too little turn-around delay can cause troubleshooting fits, and delay
should be a function of baud rate. If you're writing your own software or using software written
for an RS-232 application, be certain that provisions are made for driver tri-state control. Luckily,
there are usually hardware alternatives for controlling driver tri-stating.
Connecting a multi-drop 485 network. The EIA RS-485 Specification labels the data wires
"A" and "B", but many manufacturers label their wires "+" and "-". In our experience, the "-" wire
should be connected to the "A" line, and the "+" wire to the "B" line. Reversing the polarity will
not damage a 485 device, but it will not communicate. This said, the rest is easy: always connect
A to A and B to B.
Signal ground, don't forget it. While a differential signal does not require a signal ground
to communicate, the ground wire serves an important purpose. Over a distance of hundreds or
thousands of feet there can be very significant differences in the voltage level of "ground." RS-485
networks can typically maintain correct data with a difference of -7 to +12 volts. If the
grounds differ more than that amount, data will be lost and often the port itself will be damaged.
The function of the signal ground wire is to tie the signal ground of each of the nodes to one
common ground. However, if the differences in signal grounds is too great, further attention is
necessary. Optical isolation is the cure for this problem.
CAN Bus
CAN serial bus communication for networking:
The CAN bus line interconnects with a CAN controller between the line and the host at each
node. The controller provides the input and output between the physical layer and the data
link layer at the host node.
The CAN controller has a BIU (bus interface unit, consisting of buffer and driver),
protocol controller, status-cum-control registers, receiver buffer and message objects.
These units connect the host node through the host interface circuit.
CAN protocol:
There is a CAN controller between the CAN line and the host node.
The CAN controller BIU (Bus Interface Unit) consists of a buffer and driver.
Method for arbitration: CSMA/AMP (Carrier Sense Multiple Access with Arbitration on
Message Priority basis).
Each distributed node uses:
A twisted pair connection up to 40 m for bi-directional data.
A line which pulls to logic 1 through a resistor between the line and +4.5 V to +12 V.
Line idle state: logic 1 (recessive state).
A buffer gate between an input pin and the CAN line.
Detection of input presence at the CAN line when it is pulled down to the dominant (active)
state, logic 0 (ground, ~0 V), by a sender on the CAN line.
A current driver between the output pin and the CAN line, pulling the line down to the dominant
(active) state, logic 0 (ground, ~0 V), when sending to the CAN line.
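The CSMA/AMP arbitration above can be illustrated with a toy wired-AND model: each node shifts its identifier out MSB first, a dominant 0 overrides a recessive 1, and a node that reads back a dominant bit where it sent a recessive one drops out. This simulation is illustrative only, not a CAN controller implementation:

#include <stdint.h>
#include <stdio.h>

#define MAX_NODES 8

/* Returns the index of the winning node: the one with the lowest
 * (highest-priority) 11-bit identifier. */
int arbitrate(const uint16_t *ids, int nodes)
{
    int alive[MAX_NODES];
    int i, bit, winner = -1;

    for (i = 0; i < nodes; i++)
        alive[i] = 1;

    for (bit = 10; bit >= 0; bit--) {
        int bus = 1;                            /* recessive unless pulled down */
        for (i = 0; i < nodes; i++)
            if (alive[i] && !((ids[i] >> bit) & 1))
                bus = 0;                        /* any dominant 0 wins the bit  */
        for (i = 0; i < nodes; i++)
            if (alive[i] && ((ids[i] >> bit) & 1) != bus)
                alive[i] = 0;                   /* sent 1, saw 0: lost          */
    }
    for (i = 0; i < nodes; i++)
        if (alive[i])
            winner = i;
    return winner;
}

int main(void)
{
    uint16_t ids[] = { 0x65A, 0x123, 0x3FF };
    printf("winner: node %d\n", arbitrate(ids, 3)); /* node 1 (ID 0x123) */
    return 0;
}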
Serial Peripheral Interface (SPI)
Full-duplex synchronous communication.
SCLK, MOSI and MISO signals carry the serial clock from the master, the output from the
master and the input to the master, respectively.
Device selection as master or slave is done by a signal to the hardware input SS (slave
select when 0) pin.
Programmable clock bits determine the period T of the serial data bits, down to
an interval of 0.5 µs for an 8 MHz crystal on the 68HC11.
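A byte-exchange sketch in the style of the 68HC11 SPI mentioned above. Registers SPCR/SPSR/SPDR do exist on that part, but the absolute addresses and the simple polling loop below are simplified assumptions:

#include <stdint.h>

/* 68HC11-style SPI registers; addresses are assumptions - check the
 * memory map of the actual part. */
#define SPCR (*(volatile uint8_t *)0x1028) /* control: enable, master, clock bits */
#define SPSR (*(volatile uint8_t *)0x1029) /* status: SPIF = transfer complete    */
#define SPDR (*(volatile uint8_t *)0x102A) /* data: write drives MOSI, read MISO  */

#define SPSR_SPIF (1u << 7)

/* SPI is full duplex: every byte shifted out on MOSI clocks a byte
 * in on MISO, so one function handles both directions. */
uint8_t spi_transfer(uint8_t out)
{
    SPDR = out;                  /* start shifting on SCLK     */
    while (!(SPSR & SPSR_SPIF))  /* wait for transfer complete */
        ;
    return SPDR;                 /* byte received from slave   */
}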
Inter Integrated Circuits (I2C):
A common use of I2C is reading a small configuration EEPROM, for example one on a front
panel board. A second example is SDRAM DIMMs, which can feature an I2C EEPROM
containing parameters needed to correctly configure a memory controller for that module.
I2C is a two-wire serial bus, as shown in Figure 1. There's no need for chip select or
arbitration logic, making it cheap and simple to implement in hardware.
The two I2C signals are serial data (SDA) and serial clock (SCL). Together, these signals
make it possible to support serial transmission of 8-bit bytes of data - 7-bit device addresses plus
control bits - over the two-wire serial bus. The device that initiates a transaction on the I2C bus is
termed the master. The master normally controls the clock signal. A device being addressed by
the master is called a slave.
In a bind, an I2C slave can hold off the master in the middle of a transaction using what's
called clock stretching (the slave keeps SCL pulled low until it's ready to continue). Most I2C slave
devices don't use this feature, but every master should support it.
The I2C protocol supports multiple masters, but most system designs include only one.
There may be one or more slaves on the bus. Both masters and slaves can receive and transmit
data bytes.
Each I2C-compatible hardware slave device comes with a predefined device address, the
lower bits of which may be configurable at the board level. The master transmits the device address
of the intended slave at the beginning of every transaction. Each slave is responsible for monitoring
the bus and responding only to its own address. This addressing scheme limits the number of
identical slave devices that can exist on an I2C bus without contention, with the limit set by the
number of user-configurable address bits (typically two bits, allowing up to four identical devices).
Communication
As you can see in Figure 2, the master begins the communication by issuing the start
condition (S). The master continues by sending a unique 7-bit slave device address, with the most
significant bit (MSB) first. The eighth bit after the start, read/not-write (R/W), specifies whether the
slave is now to receive (0) or to transmit (1). This is followed by an ACK bit issued by the receiver,
acknowledging receipt of the previous byte. Then the transmitter (slave or master, as indicated by
the R/W bit) transmits a byte of data, starting with the MSB. At the end of the byte, the receiver (whether
master or slave) issues a new ACK bit. This 9-bit pattern is repeated if more bytes need to be
transmitted.
In a write transaction (slave receiving), when the master is done transmitting all of the data
bytes it wants to send, it monitors the last ACK and then issues the stop condition (P). In a read
transaction (slave transmitting), the master does not acknowledge the final byte it receives. This
tells the slave that its transmission is done. The master then issues the stop condition.
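The transaction just described can be bit-banged on two GPIO pins. The sketch below implements a single-master write; the pin helpers (sda_set, scl_set, sda_read, i2c_delay) are assumed, platform-specific functions that drive the open-drain lines (1 = released high, 0 = pulled low):

#include <stdint.h>

extern void sda_set(int level);
extern void scl_set(int level);
extern int  sda_read(void);
extern void i2c_delay(void);

static void i2c_start(void)
{
    sda_set(1); scl_set(1); i2c_delay();
    sda_set(0); i2c_delay();           /* SDA falls while SCL high */
    scl_set(0);
}

static void i2c_stop(void)
{
    sda_set(0); scl_set(1); i2c_delay();
    sda_set(1); i2c_delay();           /* SDA rises while SCL high */
}

/* Shift one byte out MSB first; returns 0 if the receiver ACKed. */
static int i2c_write_byte(uint8_t b)
{
    int i, nack;
    for (i = 7; i >= 0; i--) {
        sda_set((b >> i) & 1);
        scl_set(1); i2c_delay();
        scl_set(0);
    }
    sda_set(1);                        /* release SDA for the ACK bit */
    scl_set(1); i2c_delay();
    nack = sda_read();                 /* 0 = ACK, 1 = NACK */
    scl_set(0);
    return nack;
}

int i2c_write(uint8_t addr7, const uint8_t *data, int len)
{
    int i, err;
    i2c_start();
    err = i2c_write_byte((uint8_t)((addr7 << 1) | 0)); /* R/W = 0: write */
    for (i = 0; !err && i < len; i++)
        err = i2c_write_byte(data[i]);
    i2c_stop();
    return err ? -1 : 0;
}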
A simple bus
As we've seen, the I2C signaling protocol provides device addressing, a read/write flag,
and a simple acknowledgement mechanism. There are a few more elements to the I2C protocol,
such as general call (broadcast) and 10-bit extended addressing. Beyond that, each device defines
its own command interface or address-indexing scheme.
Standard I2C devices operate up to 100Kbps, while fast-mode devices operate at up to
400Kbps. A 1998 revision of the I2C specification (v. 2.0) added a high-speed mode running at up
to 3.4Mbps. Most of the I2C devices available today support 400Kbps operation. Higher-speed
operation may allow I2C to keep up with the rising demand for bandwidth in multimedia and other
applications.
Most often, the I2C master is the CPU or microcontroller in the system. Some
microcontrollers even feature hardware to implement the I2C protocol. You can also build an
all-software implementation using a pair of general-purpose I/O pins (single-master
implementations only).
Since the I2C master controls transaction timing, the bus protocol doesn't impose any real-time
constraints on the CPU beyond those of the application. (This is in contrast with other serial
buses that are timeslot-based and, therefore, take their service overhead even when no real
communication is taking place.)
Comparison of serial interface standards:

Parameter                                  RS232              RS423          RS422             RS485
Mode of operation                          Single-ended       Single-ended   Differential      Differential
Drivers/receivers on one line              1 driver,          1 driver,      1 driver,         32 drivers,
                                           1 receiver         10 receivers   10 receivers      32 receivers
Maximum cable length                       50 ft              4000 ft        4000 ft           4000 ft
Maximum data rate                          20 kb/s            100 kb/s       10 Mb/s-100 kb/s  10 Mb/s-100 kb/s
Maximum driver output voltage              +/-25 V            +/-6 V         -0.25 V to +6 V   -7 V to +12 V
Driver output signal (loaded min.)         +/-5 V to +/-15 V  +/-3.6 V       +/-2.0 V          +/-1.5 V
Driver output signal (unloaded max.)       +/-25 V            +/-6 V         +/-6 V            +/-6 V
Driver load impedance (ohms)               3k to 7k           >=450          100               54
Max. driver current in high-Z state:
  power on                                 N/A                N/A            N/A               +/-100 uA
  power off                                +/-6 mA @ +/-2 V   +/-100 uA      +/-100 uA         +/-100 uA
Slew rate (max.)                           30 V/us            Adjustable     N/A               N/A
Receiver input voltage range               +/-15 V            +/-12 V        -10 V to +10 V    -7 V to +12 V
Receiver input sensitivity                 +/-3 V             +/-200 mV      +/-200 mV         +/-200 mV
Receiver input resistance (ohms)           3k to 7k           4k min.        4k min.           >=12k
Need for device drivers:
Each device in a system needs a device driver routine comprising a number of device functions.
An ISR relates to a device driver command (device function). The device driver uses a SWI
(software interrupt) to call the related ISR (device-function routine).
The device driver also responds to device hardware interrupts.
Device driver generic commands:
A programmer uses generic device driver commands for using a device. The operating
system provides these generic commands.
Each command relates to an ISR. The device driver command uses a SWI to call the
related ISR (device-function routine).
Generic functions:
Generic functions used for the commands to the device are device create ( ), open ( ),
connect ( ), bind ( ), read ( ), write ( ), ioctl ( ) [for I/O control], delete ( ) and close ( ).
Device driver code:
Different in different operating systems.
The same device may have different driver code when the system uses a different operating
system.
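One common way to realize such generic commands is a table of function pointers that each driver fills in; the structure and names below are illustrative, not from any particular operating system:

#include <stddef.h>

/* Generic device-driver interface: the OS exposes generic commands
 * (open, read, write, ioctl, close) and each driver supplies its own
 * implementations through a table of function pointers. */
struct device_driver {
    const char *name;
    int  (*open) (void);
    int  (*read) (void *buf, size_t len);
    int  (*write)(const void *buf, size_t len);
    int  (*ioctl)(int request, void *arg);
    void (*close)(void);
};

/* A UART driver would fill the table with its own routines... */
extern int  uart_open(void);
extern int  uart_read(void *buf, size_t len);
extern int  uart_write(const void *buf, size_t len);
extern int  uart_ioctl(int request, void *arg);
extern void uart_close(void);

static const struct device_driver uart_driver = {
    "uart0", uart_open, uart_read, uart_write, uart_ioctl, uart_close
};

/* ...and application code uses only the generic calls: */
static int send_string(const struct device_driver *dev, const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') n++;
    return dev->write(s, n);
}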
Part - A
1. List the Types of IO ports in embedded networking?
Serial ports and Parallel ports
Types of serial ports:
Synchronous Serial Input
Synchronous Serial Output
Asynchronous Serial UART input
Asynchronous Serial UART output
Both as input and as output, for example, modem.
Types of parallel ports:
Parallel port one bit Input
Parallel one bit output
Parallel Port multi-bit Input
Parallel Port multi-bit Output
2. What is the meaning of MAC address?
A media access control address (MAC address) is a unique identifier assigned to
network interfaces for communication on the physical network segment. MAC addresses are
used as a network address for most IEEE 802 network technologies, including Ethernet and
WiFi.
3. Define PORT?
A port is a device used to receive bytes from an external peripheral, device or processor,
or to send bytes to an external peripheral, device or processor, using instructions executed
on the processor.
4. List Serial Bus Communication Protocols?
a. RS-232,
b. RS-422,
c. RS 423,
d. RS-485,
e. CAN Bus,
f. SPI Bus,
g. I2C Bus.
5. What is full duplex and half duplex?
Full-duplex communication between two components means that both can
transmit and receive information simultaneously. Telephones are full-duplex
systems, so both parties on the phone can talk and listen at the same time.
In half-duplex systems, the transmission and reception of information must happen
alternately. While one point is transmitting, the other must only receive. Walkie-talkies
are half-duplex systems.
7. What is CAN?
a. CAN is a multi-master broadcast serial bus standard for connecting electronic
control units (ECUs).
b. Controller Area Network (CAN or CAN-bus) is a vehicle bus standard designed to
allow microcontrollers and devices to communicate with each other within a vehicle
without a host computer.
c. CAN is a message-based protocol, designed specifically for automotive
applications but now also used in other areas such as industrial automation and
medical equipment.
d. The Controller Area Network (CAN) bus is a serial asynchronous bus used in
instrumentation applications for industries such as automobiles.
RS485:
Half duplex.
Point to multi-point (master to multiple slaves).
1200 m distance at 115 kbaud.
2 wires (RS422 uses 4 wires).
PART - B
1. Write a short note on embedded networking.
2. Explain the RS232 communication protocol.
3. Compare the serial communication protocols.
4. Explain the CAN bus and its working.
5. Give a brief note on the I2C protocol.
UNIT III
EMBEDDED FIRMWARE DEVELOPMENT
ENVIRONMENT
Objectives of EDLC,
Different phases of EDLC,
Modeling of EDLC;
issues in Hardware-software Co-design,
Data Flow Graph,
State machine model,
Sequential Program Model,
Concurrent Model,
Object oriented Model.
EDLC
Embedded Product Development Life Cycle (let us call it EDLC, though it is not a standard
and universal term) is an 'Analysis - Design - Implementation' based standard problem-solving
approach for embedded product development. In any product development, the first
and foremost step is to figure out what product needs to be developed (analysis); next you need to
figure out a good approach for building it (design); and last but not least you need to develop it
(implementation).
Use of EDLC:
EDLC is essential for understanding the scope and complexity of the work involved in any
embedded product development. EDLC defines the interaction and activities among various
groups of a product development sector including project management, system design and
development (hardware, firmware and enclosure design and development), system testing, release
management and quality assurance. The standards imposed by EDLC on product development
make the product independent of any particular developer (through standard documents), and EDLC also provides
uniformity in development approaches.
Objectives of EDLC:
The ultimate aim of any embedded product in a commercial production setup is to produce
marginal benefit. Marginal benefit is usually expressed in terms of Return on Investment (ROI).
The investment for a product development includes initial investment, manpower investment,
infrastructure investment, etc. A product is said to be profitable only if the turnover from the
selling of the product is more than that of the overall investment expenditure. For this, the product
should be acceptable by the end user and it should meet the requirements of end user in terms of
quality, reliability and functionality. So it is very essential to ensure that the product is meeting all
these criteria, throughout the design, development implementation and support phases. Embedded
Product Development Life Cycle (EDLC) helps out in ensuring all these requirements. EDLC has
three primary objectives, namely
Ensure that high quality products are delivered to end user.
Risk minimization and defect prevention in product development through project management.
Maximize the productivity.
budget allocation might have been done after studying the market trends and requirements of the
product, competition, etc. EDLC must ensure that the development of the product has taken
account of all the qualitative attributes of the embedded system.
Risk Minimization and Defect Prevention through Management:
You may be thinking what the significance of project management is, or why project
management is essential in product development. Nevertheless it is an additional expenditure to
the project! If we look back to the chicken dish example, we can find out that the management
activity from dad is essential in the beginning phase but in the preparation phase it can be handled
by mom itself. There are projects in embedded product development which require 'loose' or
'tight' project management.
If the product development project is a simple one, a senior developer himself can take charge
of the management activity, and there is no need for a skilled project manager to look after this with
dedicated effort throughout the development process; but there should be overall supervision
from a skilled project management team to ensure that the development process is going in the
right direction. Projects which are complex and require timeliness should have a dedicated and
skilled project management part, and hence they are said to be "tightly" bound to project
management. Project management is essential for predictability, co-ordination and risk
minimization. Whenever a product development request comes, an estimate of the duration of
the development and deployment activity should be given to the end user/client. The timeframe
may be expressed in person days, PDs (the effort in terms of a single person working for
that many days), or 'X persons for Y weeks' (e.g. 2 persons for 2 weeks), etc.
This is one aspect of predictability. The management team might have arrived at this
estimate based on past experience in handling similar projects, or on the analysis of work summaries
or data available for a similar project, logged using an activity tool. Resource
(developer) allocation is another aspect of predictability in management: how many resources
should work on the project and for how many days, how many resources are critical with respect
to the work they handle, and how many backups are required for those resources to overcome
a critical situation where a resource is not available (risk minimization).
Resource allocation is critical and has a direct impact on investment. The communication
aspect of project management deals with co-ordination and interaction among the resources and the
client from whom the request for the product development arose. Project management adds an
extra cost to the budget, but it is essential for ensuring that the development process is going in the
right direction and that the schedules of the development activity are being met. Without management,
the development work may go beyond the stipulated time frame (schedule slipping) and may end
up in a product which does not meet the client's requirements; as a result, re-work
must be done to rectify the deviations that occurred, which again puts extra cost on the
development.
Project management makes the proverb "A stitch in time saves nine" meaningful in an
embedded product development. Apart from resource allocation, project management also covers
activities like task allocation, scheduling, monitoring and project tracking. Computer Assisted
Software Engineering (CASE) Tools and Gantt charts help the manager in achieving this.
Microsoft Project is a typical example of a CASE tool for project management.
Increased Productivity:
Productivity is a measure of efficiency as well as Return on Investment (ROI). One aspect
of productivity covers how many resources are utilized to build the product, how much investment
required, how much time is taken for developing the product, etc. For example, the productivity
of a system is said to be doubled if a product developed by a team of 'X' members in a period of
'Y' days can be developed by another team of 'X/2' members in the same period of 'Y' days, or by a team of 'X'
members in a period of 'Y/2' days. This productivity measurement is based on total manpower
efficiency. Productivity in terms of Returns is said to be increased, if the product is capable of
yielding maximum returns with reduced investment. Saving manpower effort will definitely result
in increased productivity. Usage of automated tools, wherever possible, is recommended for this.
The initial investment on tools may be an additional burden in terms of money, but it will definitely
save efforts in the next project also. It is a one-time investment.
"Pay once use many time". Another important factor which can help in increased
productivity is "re-usable effort". Some of the work required for the current product development
may share common features with work you did for other products in projects executed
before. Identify those efforts and design them in such a way
that they can be plugged directly into the new product without any additional effort. (For example,
the current product you are developing requires an RS-232C serial interface, and one of the products
you already developed has the same feature. Adapt this part directly from the existing product in
terms of the hardware and firmware required for the same.) This will definitely increase
productivity by reducing the development effort.
Another advised method for increasing productivity is using resources with specific
skill sets which match the exact requirements of the entire product or a part of it (e.g. a resource
with expertise in Bluetooth technology for developing a Bluetooth interface for the product). This
reduces the learning time needed by a resource which does not have prior expertise in the particular
feature or domain. This is not possible in all product development environments, since some of
the resources with the desired skill sets may be engaged with other work, and releasing them
from the current work may not be possible. Recruiting people with the desired skill sets for the current
product development is another option; this is worthwhile only if you expect more work in the
near future on the same technology or skill sets. Use of Commercial Off-the-Shelf (COTS)
components wherever possible in a product is a very good way of reducing the development effort
and thereby increasing productivity.
A COTS component is a ready-to-use component, and you can use it as a plug-in
module in your product. For example, if the product under development requires 10Base-T
Ethernet connectivity, you can either implement it in your product using a TCP/IP
chip and related components, or use a readily available, fully functional TCP/IP plug-in module.
The second approach will save effort and time. EDLC should take all these aspects into
consideration to provide maximum productivity.
Different phases of EDLC
1. Concept Phase: Here analysis of the market trend is carried out. The phase
involves brain storming of innovative ideas driven by technology trends and
customer inputs.
2. Requirements Gathering Phase: In this stage, the kind of hardware
and software required to satisfy the customer is determined.
3. Design Phase: The product owner and design team will identify the relationship
between input and output. After investigating the overall behavior of the embedded system, a
system specification is written. A detailed hardware and software partitioning is
determined.
4. Development and Implementation Phase: Based on the specifications of
embedded system functionality, power consumption and cost, all the different
hardware add-on components are chosen, and hardware implementation starts in
the first sub-phase of implementation.
5. Integration Phase: The next step in the implementation process is the testing of
the entire embedded system.
6. Verification and Validation Phase: The validation phase is to ensure that the
entire system is implemented as against the design and eventually against the
requirements.
7. Maintenance and Retire Phase: This phase includes changes and additions as
required by the users and also fixing bugs to keep the product up and running at the
customer site.
Modeling of EDLC: The term modeling in the embedded product development life cycle refers to
the interconnection of various phases involved in the development of the embedded product. The
various approaches adopted or models used in modeling EDLC are described below.
Prototyping/Evolutionary Model:
Prototyping/evolutionary model is similar to the iterative model and the product is
developed in multiple cycles. The only difference is that this model produces a more refined
prototype of the product at the end of each cycle instead of functionality/feature addition in each
cycle as performed by the iterative model. There won't be any commercial deployment of the
prototype of the product at each cycle's end. The shortcomings of the proto-model after each cycle
are evaluated and fixed in the next cycle (Fig. 9).
Prototyping/Evolutionary Model: After the initial requirement analysis, the design for the
first prototype is made and the development process is started. On finishing the prototype, it is sent to
the customer for evaluation. The customer evaluates the product for the set of requirements and
gives his/her feedback to the developer in terms of shortcomings and improvements needed. The
developer refines the product according to the customer's exact expectation and repeats the proto
development process. After a finite number of iterations, the final product is delivered to the
customer and launches in the market/operational environment. In this approach, the product
undergoes significant evolution as a result of periodic shuttling of product information between
the customer and developer. The prototyping model follows the approach 'Requirements
definition, proto-type development, proto-type evaluation and requirements refining'. Since the
requirements undergo refinement after each proto model, it is easy to incorporate new
requirements and technology changes at any stage and thereby the product development process
can start with a bare minimum set of requirements.
The evolutionary model relies heavily on user feedback after each implementation and
hence fine-tuning of final requirements is possible. Another major advantage of prototyping model
is that the risk is spread across each proto development cycle and it is well under control. The
major drawbacks of proto-typing model are
Deviations from expected cost and schedule due to requirements refinement
Increased project management
Minimal documentation on each prototype may create problems in backward prototype
traceability
Increased Configuration Management activities
Prototyping model is the most popular product development model adopted in embedded
product industry. This approach can be considered as the best approach for products, whose
requirements are not fully available and are subject to change. This model is not recommended for
projects involving the upgradation of an existing product. There can be slight variations in the
base prototyping model depending on project management.
Spiral Model:
Spiral model combines the elements of linear and prototyping models to give the best
possible risk-minimized EDLC model. The spiral model was developed by Barry Boehm in 1988. The
product development starts with project definition and traverses through all phases of EDLC
over multiple iterations. The activities involved in the spiral model can be associated with the
four quadrants of a spiral and are listed below.
Determine objectives, alternatives, constraints.
Evaluate alternatives. Identify and resolve risks.
Develop and test.
Plan.
Spiral model is best suited for the development of complex embedded products and
situations where requirements are changing from the customer side. Customer evaluation of the prototype
at each stage allows addition of requirements and technology changes. Risk evaluation in each
stage helps in risk planning and mitigation. The proto model developed at each stage is evaluated
by the customer against various parameters like strength, weakness, risk, etc. and the final product
is built based on the final prototype on agreement with the client (see the spiral model figure below).
o An arrow directed towards the circle represents the data input (or set of inputs)
and an arrow originating from the circle represents a data output (or a set of
outputs).
o Data input along an input edge is considered as token.
o An input edge has at least one token.
o The circle represents the node.
o The node is said to be fired by the tokens from all input edges.
o The output is represented by the outgoing tokens, which are produced by the node on
firing.
o There are no control conditions in the steps of a DFG.
o A DFG does not have any conditions within it so that the program has one data
entry point and one data output point.
o There is only one independent path for program flow when program is executed.
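As a minimal illustration (a hypothetical straight-line function, not from the syllabus text), each statement below is a DFG node: it fires once its operand tokens are available, and the execution order is fixed by the data dependencies alone, with one data entry point, one data output point, and no control conditions:

/* Each statement is a DFG node; it "fires" when its input tokens exist. */
int dfg_example(int a, int b, int c, int d)
{
    int s = a + b;   /* node 1: fires on tokens a and b */
    int t = c - d;   /* node 2: fires on tokens c and d */
    return s * t;    /* node 3: fires on tokens s and t */
}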
the gear if the airplane is on the ground), what he'd get would look a bit like the figure. It would
exhibit the bad behaviour mentioned previously.
Keep the following in mind when designing the state transition diagram (or indeed any embedded
algorithm):
Computers are very fast compared to mechanical hardware; you may have to wait.
The mechanical engineer who's describing what he wants probably doesn't know as much about
computers or algorithms as you do. Good thing, too; otherwise you would be unnecessary!
How will your program behave if a mechanical or electrical part breaks? Provide for timeouts,
sanity checks, and so on.
We can now suggest the following state machine to the user, building upon his requirements by
adding a few states and transitions at a time. The result is shown in Figure 6-3. Here, we want to
preclude gear retraction until the airplane is definitely airborne, by waiting a couple of seconds
after the squat switch opens. We also want to respond to a rising edge of the pilot's lever, rather
than a level, so that we rule out the "someone moved the lever while the airplane was parked"
problem. Also, we take into account that the pilot might change his mind. Remember, the landing
gear takes a few seconds to retract or extend, and we have to handle the case where the pilot
reverses the lever during the process. Note, too, that if the airplane touches down again while
we're in the "Waiting for take-off" state, the timer restarts; the airplane has to be airborne for
two seconds before we'll retract the gear.
Landing Gear Implementation
typedef enum {GEAR_DOWN = 0, WTG_FOR_TKOFF, RAISING_GEAR, GEAR_UP,
              LOWERING_GEAR} State_Type;

/* Forward declarations of the state functions. */
void GearDown(void);
void WtgForTakeoff(void);
void RaisingGear(void);
void GearUp(void);
void LoweringGear(void);

/* This table contains a pointer to the function to call in each state. */
void (*state_table[])(void) = {GearDown, WtgForTakeoff, RaisingGear,
                               GearUp, LoweringGear};

State_Type curr_state;

/* timer, gear_lever, prev_gear_lever, squat_switch and the three
   gear-position switches are assumed to be supplied by the hardware layer. */

main()
{
    InitializeLdgGearSM();
    /* The heart of the state machine is this one loop. The function
       corresponding to the current state is called once per iteration. */
    while (1)
    {
        state_table[curr_state]();
        DecrementTimer();
        /* Do other functions, not related to this state machine. */
    }
}

void InitializeLdgGearSM(void)
{
    curr_state = GEAR_DOWN;
    timer = 0.0;
    /* Stop all the hardware, turn off the lights, etc. */
}

void GearDown(void)
{
    /* Raise the gear upon command, but not if the airplane is
       on the ground. */
    if ((gear_lever == UP) && (prev_gear_lever == DOWN) &&
        (squat_switch == UP))
    {
        timer = 2.0;
        curr_state = WTG_FOR_TKOFF;
    }
    prev_gear_lever = gear_lever;   /* Store for edge detection. */
}

void WtgForTakeoff(void)
{
    /* Once we've been airborne for 2 sec., start raising the gear. */
    if (timer <= 0.0)
    {
        curr_state = RAISING_GEAR;
    }
    /* If we touch down again, or if the pilot changes his
       mind, start over. */
    if ((squat_switch == DOWN) || (gear_lever == DOWN))
    {
        timer = 2.0;
        curr_state = GEAR_DOWN;
        /* Don't want to require that he toggle the lever
           again; this was just a bounce. */
        prev_gear_lever = DOWN;
    }
}

void RaisingGear(void)
{
    /* Once all 3 legs are up, go to the GEAR_UP state. */
    if ((nosegear_is_up == MADE) && (leftgear_is_up == MADE) &&
        (rtgear_is_up == MADE))
    {
        curr_state = GEAR_UP;
    }
    /* If the pilot changes his mind, start lowering the gear. */
    if (gear_lever == DOWN)
    {
        curr_state = LOWERING_GEAR;
    }
}

void GearUp(void)
{
    /* If the pilot moves the lever to DOWN, lower the gear. */
    if (gear_lever == DOWN)
    {
        curr_state = LOWERING_GEAR;
    }
}

void LoweringGear(void)
{
    /* If the pilot changes his mind, start raising the gear. */
    if (gear_lever == UP)
    {
        curr_state = RAISING_GEAR;
    }
    /* Once all 3 legs are down, go to the GEAR_DOWN state. */
    if ((nosegear_is_down == MADE) && (leftgear_is_down == MADE) &&
        (rtgear_is_down == MADE))
    {
        curr_state = GEAR_DOWN;
    }
}
Let's discuss a few features of the example code. First, you'll notice that the functionality
of each individual state is implemented by its own C function. You could just as easily implement
it as a switch statement, with a separate case for each state. However, this can lead to a very long
function (imagine 10 or 20 lines of code per state for each of 20 or 30 states). It can also lead you
astray when you change the code late in the testing phase; perhaps you've never forgotten a break
statement at the end of a case, but I sure have. Having one state's code fall into the next state's
code is usually a no-no.
To avoid the switch statement, you can use an array of pointers to the individual state
functions. The index into the array is curr_state, which is declared as an enumerated type to help
our tools enforce correctness.
In coding a state machine, try to preserve its greatest strength, namely, the eloquently
visible match between the user's requirements and the code. It may be necessary to hide hardware
details in another layer of functions, for instance, to keep the state machine's
The advantage of the sequential programming model is that the design of the program is very similar to
the classical logic circuit design method. Reading all inputs and writing all outputs simultaneously
(in time) excludes the hazards known from logic circuits.
Concurrent Model
The concurrent or communicating process model models concurrently executing
tasks/processes. So far we have discussed the sequential execution of software programs.
It is easier to implement certain requirements in the concurrent processing model than in
conventional sequential execution. Sequential execution leads to a single sequential
flow of tasks and thereby to poor processor utilization when the tasks involve
I/O waiting, sleeping for a specified duration, etc. If the task is split into multiple subtasks,
it is possible to use the CPU effectively: when the subtask under execution goes
into a wait or sleep mode, the task execution is switched. However, the concurrent processing
model requires additional overheads for task scheduling, task synchronization and
communication.
As an example of the concurrent processing model, let us examine how we can
implement the seat belt warning system in the concurrent processing model.
[Flowchart: seat belt warning system - ignition key ON, 'Ignition ON?' decision, start alarm, repeated 'Ignition ON?' and 'Time expired?' decisions, stop alarm, end.]
Many systems are easier to describe with the concurrent process model because they are inherently
multitasking.
E.g., a simple example (see the sketch below):
Read two numbers X and Y
Display "Hello world." every X seconds
Display "How are you?" every Y seconds
More effort would be required with a sequential program or state machine model.
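A minimal sketch of this example, assuming a POSIX-threads environment (on a bare-metal RTOS the same structure would be two RTOS tasks instead):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static unsigned x = 2, y = 3;              /* the two periods read at startup */

static void *hello_task(void *arg)         /* task 1: periodic display */
{
    (void)arg;
    while (1) { sleep(x); printf("Hello world.\n"); }
    return NULL;
}

static void *howareyou_task(void *arg)     /* task 2: independent period */
{
    (void)arg;
    while (1) { sleep(y); printf("How are you?\n"); }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    /* In the concurrent model, each periodic activity is its own task. */
    pthread_create(&t1, NULL, hello_task, NULL);
    pthread_create(&t2, NULL, howareyou_task, NULL);
    pthread_join(t1, NULL);                /* never returns in this sketch */
    pthread_join(t2, NULL);
    return 0;
}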
Object oriented Model.
It is an object-based model for modelling system requirements. It decomposes a
software requirement into simple, well-defined pieces called objects. The
object-oriented model brings reusability, maintainability and productivity to system
design. In object-oriented modelling, an object is an entity used for representing
or modelling a particular piece of the system. Each object is characterized by a set
of unique behaviours and states. A class is an abstract description of a set
Part A
1. What is EDLC?
2. Why EDLC?
EDLC is essential for understanding the scope and complexity of the work involved in any
embedded product development. EDLC defines the interaction and activities among various
groups of a product development sector including project management, system design and
development (hardware, firmware and enclosure design and development), system testing, release
management and quality assurance. The standards imposed by EDLC on product development
make the product developer-independent in terms of standard documents, and they also provide
uniformity in development approaches.
3. EDLC approaches:
Linear/Waterfall Model:
Conceptualization → Need Analysis → Design → Development & Testing → Deployment → Support → Upgrades → Retirement
Iterative/Incremental/Fountain EDLC Model:
Cascaded series of linear models
Do some analysis, follow some design, then some implementation in cycles
Prototyping/evolutionary model:
Similar to iterative model, product is developed in multiple cycles
The only difference is that the model produces a more refined prototype of the product at each
cycle, instead of just adding functionality at each cycle as in the iterative model.
Spiral model:
Spiral model is best suited for the development of complex embedded products and
situations where the requirements are changing from customer side.
Risk evaluation in each stage helps in reducing risk.
5. Define Modeling?
Modeling is a broadly used term casually applied to disparate concepts ranging from
behavioral and structural models to more simulation-centric methodologies. The challenge with
broad terms or concepts, of course, is knowing when, where, and how they apply to your
own application.
8. What is DFG?
A data flow graph (DFG) models a program flow in which the data alone determines
all the execution steps; the program flow is determined specifically
and only by the data.
9. What are the important hardware and software trade-offs in embedded systems?
Processing speed and performance
Frequency of change
Memory size and gate count
Reliability
Man hours and cost.
Part B
1. What is a statechart? Explain its role in embedded systems.
2. Explain different computational models in embedded system design.
3. Explain the sequential program model with an example.
4. With a neat representation, explain the linear or waterfall model.
5. What is a computational model? Explain its role in hardware and software co-design.
UNIT IV
RTOS BASED EMBEDDED SYSTEM DESIGN
Introduction to basic concepts of RTOS- Task, process & threads,
Interrupt routines in RTOS,
Multiprocessing and Multitasking,
Preemptive and non-preemptive scheduling,
Task communication - shared memory,
Message passing,
Inter process communication,
Synchronization between processes -
semaphores, mailbox, pipes, priority inversion, priority inheritance,
Comparison of real-time operating systems: VxWorks, µC/OS-II, RT Linux.
HOD
Staff In-charge
[Figure: direct call to an ISR by an interrupting source, and the ISR sending an 'ISR enter' message to the OS]
Multiprocessing in RTOS:
In a single-processor, multitasking system, more than one software task appears to be
executing simultaneously. In reality, a single processor can only execute one instruction at a time,
so the parallel tasks interleave their execution in such a way that they seem to execute in parallel.
Although the mechanics of multitasking may actually decrease processor throughput, the benefits
of multitasking are significant enough to offset this use of resources.
However, programming for a multitasking environment is difficult. Poorly designed or
misapplied multitasking systems incur so much overhead that the system collapses, failing to
answer critical interrupts in a timely manner. Luckily, multitasking control software can usually
take advantage of natural lulls in each task's sequence of operation to allow other tasks to execute.
Although multiprocessing refers to hardware and multitasking to software, the two have
much in common and often go together. Multiprocessing means, literally, having multiple
processors. Software can execute across multiple processors in many different fashions.
For example, an operating system running multiple programs could assign each
currently running program its own processor. The programs would run more or less independently
of each other. However, the programs might be sharing system resources such as a network, a
printer, or mass storage. The operating system would have to handle multiple, asynchronous
requests from the programs to use these facilities, and would have to make sure that the programs'
requests did not consist of conflicting commands for peripheral devices or corrupt each other's
data.
In a different configuration, a single program could distribute multiple, separable
threads of execution across several processors. For example, a data acquisition program might
collect data from many independent sources. Eventually, the program will have to synchronize the
results of the threads, generating a composite report. To do this, the program might have to pause
one thread until another thread finishes some vital operation, supplying a needed intermediate
result for the first thread.
Multitasking in RTOS:
Although Z-World's controllers can run programs of up to 20,000 lines, they are typically used
for relatively simple applications. A full-featured real-time operating system (RTOS), then, is
often an unnecessary burden. So that software engineers can design a system that precisely meets
their needs, whether straightforward or complex, Z-World offers three types of multitasking:
Preemptive
Cooperative
Simplified preemptive
Preemptive Multitasking:
Preemption means that some top-priority agency, usually a timer interrupt or a
supervisory task (kernel), takes control from the task currently running, giving control of the
processor to another, higher-priority task. The interrupted task has no control over when
preemption may take place and no ability to stop the interrupt.
A preemptive multitasking system needs, at a minimum, a kernel to stop and start tasks.
The kernel usually uses a timer interrupt from on-board timing hardware to preempt the currently
active task. A kernel can also take control of the processor in response to an asynchronous interrupt
from the outside world and, after determining the nature of the interrupt, decide to switch tasks.
Since a task can be interrupted at any point, preemptive multitasking is well-suited for
applications that require precise timing or high speeds. Because each task does not know when
preemption may take place, software engineers must be careful when tasks share common
resources such as variables, displays, storage devices, and communications lines. Cooperation and
coordination among preemptable tasks are major programming concerns. The RTK (real-time
kernel), one of two kernels shipped with Z-World's Dynamic C, supports preemptive multitasking.
The RTK supports prioritized preemption; only a task of higher priority than the one currently
executing can interrupt. Software engineers may create as many priority levels as desired when
using the RTK.
The RTK also has a suspend function, with which a high-priority task voluntarily
suspends itself (for a specified length of time or until awakened by other tasks) and lets lower-priority tasks execute.
Cooperative Multitasking
Cooperative multitasking is the simplest, fastest, lowest-overhead multitasking
possible. Cooperative multitasking has low overhead because no RTK or supervisor task is needed.
Under cooperative multitasking, each task voluntarily gives up control so other tasks can execute.
Cooperative multitasking has several advantages over other types of multitasking:
The designer has explicit control of the points at which a task begins and ends logical
subsections of its overall job.
Programmers have complete, explicit control of tasks' interactions.
Tasks communicate more easily.
Programming is simplified.
Errors in code are less likely, and are easier to isolate when they do occur.
Errors usually degrade performance rather than halting execution.
Indeterminate interrupt latency is lower.
Worst-case latency:
Not the same for every task - the highest priority task has the smallest latency,
the lowest priority task the largest; it differs between the tasks in the ready list.
For the i-th task, let the event detection time (from when the event is brought into the
list) be dti, the switching time from one task to another be sti, and the task execution
time be eti, for i = 1, 2, ..., p, where the p-th task has p - 1 higher priority tasks ahead of it. Then
Tworst = {(dt1 + st1 + et1) + (dt2 + st2 + et2) + ... + (dtp-1 + stp-1 + etp-1) + (dtp + stp + etp)} + tISR,
where tISR is the sum of all execution times of the ISRs.
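As a worked example (all figures assumed purely for illustration): for p = 3, with every task having dti = 2 µs, sti = 3 µs and eti = 10 µs, and tISR = 5 µs, the worst-case latency of the lowest priority task is Tworst = 3 × (2 + 3 + 10) + 5 = 50 µs.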
A non-preemptive OS designates the way (programming style) for the scheduling code, so
that engineers can share the same view even if they were not on the same project before. With
the same view of the task concept, engineers can work on different tasks and test and profile
them independently, as far as possible.
But how much do we really gain from this? If engineers are working on the same
project, they can find a way to share the same view without using a non-preemptive OS.
If one engineer is from another project or company, he will gain the benefit if he knew the OS
before. But if he didn't, then again, it seems to make little difference for him whether he learns a new OS
or a new piece of code.
Task communication: shared memory
In the discussion of the fork() system call, we mentioned that a parent and its children have
separate address spaces. While this provides a more secure way of executing parent and
child processes (because they do not interfere with each other), they share nothing and have no
way to communicate with each other. A shared memory is an extra piece of memory that
is attached to some address spaces for its owners to use. As a result, all of these processes
share the same memory segment and have access to it. Consequently, race conditions may occur
if memory accesses are not handled properly. The following figure shows two processes and
their address spaces. The yellow rectangle is a shared memory attached to both address spaces,
and both process 1 and process 2 can access this shared memory as if it were
part of their own address spaces. In some sense, the original address spaces are "extended" by
attaching this shared memory.
Shared memory is a feature supported by UNIX System V, including Linux, SunOS and Solaris.
One process must explicitly ask for an area, using a key, to be shared by other processes. This
process will be called the server. All other processes, the clients that know the shared area can
access it. However, there is no protection to a shared memory and any process that knows it can
access it freely. To protect a shared memory from being accessed at the same time by several
processes, a synchronization protocol must be set up.
A shared memory segment is identified by a unique integer, the shared memory ID. The shared
memory itself is described by a structure of type shmid_ds in the header file sys/shm.h. To use this
header, the files sys/types.h and sys/ipc.h must also be included. Therefore, your program should start with
the following lines:
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
A general scheme of using shared memory is the following:
For a server, it should be started before any client. The server should perform the
following tasks:
1. Ask for a shared memory with a memory key and memorize the returned shared
memory ID. This is performed by system call shmget().
2. Attach this shared memory to the server's address space with system call shmat().
3. Initialize the shared memory, if necessary.
4. Do something and wait for all clients' completion.
5. Detach the shared memory with system call shmdt().
6. Remove the shared memory with system call shmctl().
For the client part, the procedure is almost the same:
1. Ask for a shared memory with the same memory key and memorize the returned
shared memory ID.
2. Attach this shared memory to the client's address space.
3. Use the memory.
4. Detach all shared memory segments, if necessary.
5. Exit.
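A minimal server-side sketch of this scheme (the key and size are hypothetical, and the waiting for clients is omitted):

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <string.h>

#define SHM_KEY  0x1234      /* hypothetical key agreed upon with the clients */
#define SHM_SIZE 1024

int main(void)
{
    /* 1. Ask for a shared memory segment and memorize its ID. */
    int shmid = shmget(SHM_KEY, SHM_SIZE, IPC_CREAT | 0666);
    if (shmid < 0) { perror("shmget"); return 1; }

    /* 2. Attach the segment to this process's address space. */
    char *shm = (char *)shmat(shmid, NULL, 0);
    if (shm == (char *)-1) { perror("shmat"); return 1; }

    /* 3. Initialize the shared memory. */
    strcpy(shm, "hello from server");

    /* 4. Do something and wait for all clients' completion (omitted). */

    /* 5. Detach and 6. remove the shared memory. */
    shmdt(shm);
    shmctl(shmid, IPC_RMID, NULL);
    return 0;
}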
Drawbacks of direct communication (where processes name each other explicitly):
limited modularity - changing the name of a process means changing every sender
and receiver process to match
need to know process names
Indirect communications :
messages sent to and received from mailboxes (or ports)
mailboxes can be viewed as objects into which messages placed by
processes and from which messages can be removed by other processes
each mailbox has a unique ID
two processes can communicate only if they have a shared mailbox
send ( A, message ) : send a message to mailbox A
receive ( A, message ) : receive a message from mailbox A
a communications link is only established between a pair of processes if they have
a shared mailbox
a pair of processes can communicate via several different mailboxes if desired
a link can be either unidirectional or bidirectional
a link may be associated with more than two processes
allows one-to-many, many-to-one, many-to-many communications
one-to-many : any of several processes may receive from the mailbox
e.g. a broadcast of some sort
which of the receivers gets the message?
arbitrary choice of the scheduling system if many waiting?
only allow one process at a time to wait on a receive
many-to-one : many processes sending to one receiving process
e.g. a server providing service to a collection of processes
file server, network server, mail server etc.
receiver can identify the sender from the message header contents
many-to-many :
e.g. multiple senders requesting service and a pool of receiving servers offering
service - a server farm
Mailbox Ownership
Process mailbox ownership :
Only the process may receive messages from the mailbox
Other processes may send to the mailbox.
The mailbox can be created with the process and destroyed when the process
dies (a process sending to a dead process's mailbox will then need to be
signalled), or created and destroyed through separate create_mailbox and
destroy_mailbox calls (possibly declaring variables of type mailbox).
system mailbox ownership :
mailboxes have their own independent existence, not attached to any
process
dynamic connection to a mailbox by processes for send and/or receive
Buffering - the number of messages that can reside in a link temporarily
Zero capacity - queue length 0
sender must wait until receiver ready to take the message
Bounded capacity - finite length queue
messages can be queued as long as queue not full
otherwise sender will have to wait
Unbounded capacity
any number of messages can be queued - in virtual space?
sender never delayed
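A hedged sketch of bounded-capacity buffering using a POSIX message queue (the queue name and sizes are illustrative; POSIX mqueue support on the target is an assumption):

#include <mqueue.h>
#include <fcntl.h>
#include <stdio.h>

int main(void)
{
    /* Bounded capacity: at most 8 messages of up to 32 bytes each. */
    struct mq_attr attr = { .mq_flags = 0, .mq_maxmsg = 8,
                            .mq_msgsize = 32, .mq_curmsgs = 0 };
    mqd_t q = mq_open("/demo_q", O_CREAT | O_RDWR, 0666, &attr);
    if (q == (mqd_t)-1) { perror("mq_open"); return 1; }

    mq_send(q, "hello", 6, 0);              /* blocks once all 8 slots are full */

    char buf[32];                           /* must be >= mq_msgsize */
    mq_receive(q, buf, sizeof buf, NULL);   /* blocks while the queue is empty */
    printf("%s\n", buf);

    mq_close(q);
    mq_unlink("/demo_q");
    return 0;
}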
Copying
need to minimize message copying for efficiency
Copy from sending process into kernel message queue space and then into
receiving process?
probably inevitable in a distributed system
Advantage: communicating processes are kept separate, so malfunctions are
localized to each process.
Direct copy from one process to the other?
From virtual space to virtual space?
Message queues keep indirect pointers to message in process virtual space
Both processes need to be memory resident i.e. not swapped out to disc, at
time of message transfer
shared virtual memory
Message mapped into virtual space of both sender and receiver processes
one physical copy of message in memory ever
no copying involved beyond normal paging mechanisms
used in MACH operating system
Aside: Mach's Copy-on-Write mechanism (also used in Linux fork) :
single copy of shared material mapped into both processes' virtual space
both processes can read the same copy in physical memory
if either process tries to write, an exception to the kernel occurs
kernel makes a copy of the material and remaps virtual space of writing
process onto it
writing process modifies new copy and leaves old copy intact for other
process
Synchronised versus Asynchronous Communications
Synchronised:
send and receive operations blocking
sender is suspended until receiving process does a corresponding read
receiver suspended until a message is sent for it to receive
properties :
processes tightly synchronised - the rendezvous of Ada
effective confirmation of receipt for sender
at most one message can be outstanding for any process pair
no buffer space problems
easy to implement, with low overhead
disadvantages :
sending process might want to continue after its send operation without
waiting for confirmation of receipt
receiving process might want to do something else if no message is waiting
to be received
Asynchronous :
send and receive operations non-blocking
sender continues when no corresponding receive outstanding
receiver continues when no message has been sent
properties :
messages need to be buffered until they are received
amount of buffer space to allocate can be problematic
a process running amok could clog the system with messages if not
careful
often very convenient rather than be forced to wait
particularly for senders
can increase concurrency
some awkward kernel decisions avoided
e.g. whether to swap a waiting process out to disc or not
receivers can poll for messages
i.e. do a test-receive every so often to see if any messages waiting
interrupt and signal programming more difficult
preferable alternative perhaps to have a blocking receive in a separate
thread
Other combinations :
non-blocking send + blocking receive
probably the most useful combination
sending process can send off several successive messages to one or more
processes if need be without being held up
receivers wait until something to do i.e. take some action on message
receipt
e.g. a server process
might wait on a read until a service request arrived,
then transfer the execution of the request to a separate thread
then go back and wait on the read for the next request
blocking send + non-blocking receive
conceivable but probably not a useful combination
in practice, sending and receiving processes will each choose independently
Linux file access normally blocking :
to set a device to non-blocking (already opened with a descriptor fd) :
fcntl ( fd, F_SETFL, fcntl ( fd, F_GETFL) | O_NDELAY )
Missing messages ?
message sent but never received
receiver crashed?
receiver no longer trying to read messages?
waiting receiver never receives a message
sender crashed?
no longer sending messages?
Crashing processes :
kernel knows when processes crash
can notify waiting process
by synthesised message
by signal
terminate process
non-blocking send + blocking receive equivalent to : signal (V) by sender + wait (P) by
receiver
Mutual Exclusion :
initialise :
create_mailbox (mutex)
send (mutex, null-message)
for each process :
while (TRUE) {
receive (mutex, null-message);
critical section
send (mutex, null-message);
}
mutual exclusion just depends on whether the mailbox is empty or not
Message is just a token, possession of which gives the right to enter the critical section
while (TRUE) {
receive (mayconsume, slot);
consume data item in slot
void t2a_main(void) {
if (smx_SemTest(sbr, TMO)) { /* opening of this function reconstructed from the description below */
smx_TaskStart(t3a);
// use resource here
smx_SemSignal(sbr);
}
else
// deal with timeout or error
}
void t3a_main(void) {
if (smx_SemTest(sbr, TMO)) {
// use resource here
smx_SemSignal(sbr);
}
else
// deal with timeout or error
}
In this example, semaphore sbr is created and its count is set to 1, making it a binary semaphore
and as if it had already been signaled -- i.e., the resource is available. Task t2a (priority 2, task a)
runs first and "gets" sbr. t2a then starts task t3a (priority 3, task a), which immediately preempts
and tests sbr. Since sbr's count == 0, t3a is suspended on sbr, and t2a resumes. t2a uses the
resource, then signals sbr, when it is done with the resource. This causes t3a to be resumed and to
preempt t2a. t3a is now able to use the resource. When done, t3a signals sbr so that another task
can use the resource and t3a stops. t2a resumes and stops.
A binary resource semaphore does the same thing as a mutex, but with some
shortcomings relative to a mutex. A multiple resource (counting) semaphore permits controlled
access to more than one resource. The following is an example of using a multiple resource semaphore to
control access to a block pool containing NUM blocks:
TCB_PTR t2a, t3a; // tasks
PCB_PTR blk_pool; // block pool
SCB_PTR sr; // resource semaphore
#define NUM 10
#define SIZE 100
void Init(void) {
u8* p = (u8*)smx_HeapMalloc(NUM*SIZE);
sb_BlockPoolCreate(p, blk_pool, NUM, SIZE, "blk pool");
sr = smx_SemCreate(RSRC, NUM, "sr");
t2a = smx_TaskCreate(t2a_main, Pri2, 500, NO_FLAGS, "t2a");
t3a = smx_TaskCreate(t3a_main, Pri3, 500, NO_FLAGS, "t3a");
smx_TaskStart(t2a);
}
void t2a_main(void) {
u8* bp;
smx_SemTest(sr, INF);
bp = sb_BlockGet(blk_pool, 0);
smx_TaskStart(t3a);
// use bp to access block
sb_BlockRel(blk_pool, bp, 0);
smx_SemSignal(sr);
}
void t3a_main(void) {
u8* bp;
smx_SemTest(sr, INF);
bp = sb_BlockGet(blk_pool, 0);
// use bp to access block
sb_BlockRel(blk_pool, bp, 0);
smx_SemSignal(sr);
}
This example is similar to the previous example, with small differences. The Init() function first
creates a block pool of NUM blocks. It then creates sr, but because NUM is used instead of 1, a
multiple resource semaphore is created, with a starting count of NUM. t2a tests sr and sr's counter
is decremented to 9, which is greater than 0, so t2a is allowed to get a block. As before, t2a starts
t3a, which immediately preempts. t3a tests sr and sr's counter is decremented to 8, so t3a is also
allowed to get a block and use it. Eight more blocks can be obtained from blk_pool and used by
the same or different tasks. However, the eleventh test of sr will suspend the testing task on sr. As
shown, when t3a is done with its block, it releases the block back to blk_pool, then signals sr, thus
allowing the first waiting task at sr to get the released block. Similarly for t2a.
If no task is waiting, count is incremented by each signal. Thus the count always equals the number
of available blocks. The maximum possible count is NUM. Signals after the count == NUM are
ignored in order to protect resources if redundant signals should occur.
Tasks need to share resources to communicate and process data. This aspect of multi-threaded
programming is not specific to real-time or embedded systems.
Any time two tasks share a resource, such as a memory buffer, in a system that employs a priority-based scheduler, one of them will usually have a higher priority. The higher-priority task expects
to be run as soon as it is ready. However, if the lower-priority task is using their shared resource
when the higher-priority task becomes ready to run, the higher-priority task must wait for the
lower-priority task to finish with it. We say that the higher-priority task is pending on the resource.
If the higher-priority task has a critical deadline that it must meet, the worst-case "lockout time"
for all of its shared resources must be calculated and taken into account in the design. If the
cumulative lockout times are too long, the resource-sharing scheme must be redesigned.
Since worst-case delays resulting from the sharing of resources can be calculated at design time,
the only way they can affect the performance of the system is if no one properly accounts for them.
Priority inversions
The real trouble arises at run-time, when a medium-priority task preempts a lower-priority task
using a shared resource on which the higher-priority task is pending. If the higher-priority task is
otherwise ready to run, but a medium-priority task is currently running instead, a priority inversion
is said to occur.
A well-known example of priority inversion occurred on the Mars Pathfinder mission in July 1997. The Pathfinder mission is best known for the little rover
that took high-resolution colour pictures of the Martian surface and relayed them back to Earth.
The problem was not in the landing software, but in the mission software run on the Martian
surface. In the spacecraft, various devices communicated over a MIL-STD-1553 data bus. Activity
on this bus was managed by a pair of high-priority tasks. One of the bus manager tasks
communicated through a pipe with a low-priority meteorological science task.
On Earth, the software mostly ran without incident. On Mars, however, a problem developed that
was serious enough to trigger a series of software resets during the mission. The sequence of events
leading to each reset began when the low-priority science task was pre-empted by a couple of
medium-priority tasks while it held a mutex related to the pipe. While the low-priority task was
pre-empted, the high-priority bus distribution manager tried to send more data to it over the same
pipe. Because the mutex was still held by the science task, the bus distribution manager was made
to wait. Shortly thereafter, the other bus scheduler became active. It noticed that the distribution
manager hadn't completed its work for that bus cycle and forced a system reset.
This problem was not caused by a mistake in the operating system, such as an incorrectly
implemented semaphore, or in the application. Instead, the software exhibited behaviour that is a
known "feature" of semaphores and inter-task communication. In fact, the RTOS used on
Pathfinder featured an optional priority-inversion workaround; the scientists at JPL simply hadn't
been aware of that option. Fortunately, they were able to recreate the problem on Earth, remotely
enable the workaround, and complete the mission successfully.
Workarounds
Research on priority inversion has yielded two solutions. The first is called priority inheritance.
This technique mandates that a lower-priority task inherit the priority of any higher-priority task
pending on a resource they share. This priority change should take place as soon as the high-priority task begins to pend; it should end when the resource is released. This requires help from
the operating system.
The second solution, priority ceilings, associates a priority with each resource; the scheduler then
transfers that priority to any task that accesses the resource. The priority assigned to the resource
is the priority of its highest-priority user, plus one. Once a task finishes with the resource, its
priority returns to normal.
A beneficial feature of the priority ceiling solution is that tasks can share resources simply by
changing their priorities, thus eliminating the need for semaphores:
void TaskA(void)
{
...
SetTaskPriority(RES_X_PRIO);
// Access shared resource X.
SetTaskPriority(TASK_A_PRIO);
...
}
While Task A's priority is elevated (and it is accessing shared resource X), it should not pend on
any other resource. The higher-priority user will only become the highest-priority ready task when
the lower-priority task is finished with their shared resource.
While not all of us are writing software for missions to Mars, we should learn from past mistakes
and implement solutions that don't repeat them. Many commercial RTOSs include support for
either priority inheritance or priority ceilings. Just make sure you enable one.
Priority inheritance:
Fatal embraces, deadlocks, and obscure bugs await the programmer who isn't careful
about priority inversions.
A preemptive real-time operating system (RTOS) forms the backbone of most embedded systems
devices, from digital cameras to life-saving medical equipment. The RTOS can schedule an
application's activities so that they appear to occur simultaneously. By rapidly switching from one
activity to the next, the RTOS is able to quickly respond to real-world events.
To ensure rapid response times, an embedded RTOS can use preemption, in which a higher-priority task can interrupt a low-priority task that's running. When the high-priority task finishes
running, the low-priority task resumes executing from the point at which it was interrupted. The
use of preemption guarantees worst-case performance times, which enable use of the application
in safety-critical situations.
Unfortunately, the need to share resources between tasks operating in a preemptive multitasking
environment can create conflicts. Two of the most common problems are deadlock and priority
inversion, both of which can result in application failure. In 1997, the Mars Pathfinder mission
nearly failed because of an undetected priority inversion. When the rover was collecting
meteorological data on Mars, it began experiencing system resets, losing data. The problem was
traced to priority inversion. A solution to the inversion was developed and uploaded to the rover,
and the mission completed successfully. Such a situation might have been avoided had the
designers of the rover accounted for the possibility of priority inversion.
This article describes in detail the problem of priority inversion and indicates two common
solutions. Also provided are detailed strategies for avoiding priority inversion. Avoiding priority
inversion is preferable to most other solutions, which generally require more code, more memory,
and more overhead when accessing shared resources.
Priority inversion
Priority inversion occurs when a high-priority task is forced to wait for the release of a
shared resource owned by a lower-priority task. The two types of priority inversion, bounded and
unbounded, occur when two tasks attempt to access a single shared resource. A shared resource
can be anything that must be used by two or more tasks in a mutually exclusive fashion. The period
of time that a task has a lock on a shared resource is called the task's critical section or critical
region.
Deadlock
Deadlock, shown in the figure above, is a special case of nested resource locks, in which a circular
chain of tasks waiting for resources prevents all the tasks in the chain from executing. Deadlocked
tasks can have potentially fatal consequences for the application. Suppose Task A is waiting for a
resource held by Task B, while Task B is waiting for a resource held by Task C, which is waiting
for a resource held by Task A. None of the three tasks is able to acquire the resource it needs to
resume execution, so the application is deadlocked.
Priority ceiling protocol
One way to solve priority inversion is to use the priority ceiling protocol, which gives each
shared resource a predefined priority ceiling. When a task acquires a shared resource, the task is
hoisted (has its priority temporarily raised) to the priority ceiling of that resource. The priority
ceiling must be higher than the highest priority of all tasks that can access the resource, thereby
ensuring that a task owning a shared resource won't be preempted by any other task attempting to
access the same resource. When the hoisted task releases the resource, the task is returned to its
original priority level. Any operating system that allows task priorities to change dynamically can
be used to implement the priority ceiling protocol.
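Where the OS exposes this directly, little application code is needed. A hedged sketch using POSIX mutex attributes (assuming the target supports the _POSIX_THREAD_PRIO_PROTECT option; RES_X_CEILING is a hypothetical ceiling value):

#include <pthread.h>

#define RES_X_CEILING 10        /* hypothetical: highest user priority + 1 */

pthread_mutex_t res_x_lock;

void init_ceiling_lock(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    /* Any locker is hoisted to the resource's fixed priority ceiling. */
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_PROTECT);
    pthread_mutexattr_setprioceiling(&attr, RES_X_CEILING);
    pthread_mutex_init(&res_x_lock, &attr);
    pthread_mutexattr_destroy(&attr);
}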
A static analysis of the application is required to determine the priority ceiling for each
shared resource, a process that is often difficult and time consuming. To perform a static analysis,
every task that accesses each shared resource must be known in advance. This might be difficult,
or even impossible, to determine for a complex application.
The priority ceiling protocol provides a good worst-case wait time for a high-priority task
waiting for a shared resource. The worst-case wait time is limited to the longest critical section of
any lower-priority task that accesses the shared resource. The priority ceiling protocol prevents
deadlock by stopping chains of nested locks from developing.
On the downside, the priority ceiling protocol has poor average-case response time because
of the significant overhead associated with implementing the protocol. Every time a shared
resource is acquired, the acquiring task must be hoisted to the resource's priority ceiling.
Conversely, every time a shared resource is released, the hoisted task's priority must be lowered
to its original level. All this extra code takes time.
By hoisting the acquiring task to the priority ceiling of the resource, the priority ceiling
protocol prevents locks from being contended. Because the hoisted task has a priority higher than
that of any other task that can request the resource, no task can contend the lock. A disadvantage
of the priority ceiling protocol is that the priority of a task changes every time it acquires or releases
a shared resource. These priority changes occur even if no other task would compete for the
resource at that time.
Medium-priority tasks are often unnecessarily prevented from running by the priority
ceiling protocol. Suppose a low-priority task acquires a resource that's shared with a high-priority
task. The low-priority task is hoisted to the resource's priority ceiling, above that of the high-priority task. Any tasks with a priority below the resource's priority ceiling that are ready to
execute will be prevented from doing so, even if they don't use the shared resource.
Priority inheritance protocol
An alternative to the priority ceiling protocol is the priority inheritance protocol, a
variation that uses dynamic priority adjustments. When a low-priority task acquires a shared
resource, the task continues running at its original priority level. If a high-priority task requests
ownership of the shared resource, the low-priority task is hoisted above the requesting task. The
low-priority task can then continue executing its critical section until it releases the resource. Once
the resource is released, the task is dropped back to its original low-priority level, permitting the
high-priority task to use the resource it has just acquired.
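A corresponding hedged sketch for priority inheritance, again via POSIX mutex attributes (assumes _POSIX_THREAD_PRIO_INHERIT support on the target):

#include <pthread.h>

pthread_mutex_t res_lock;

void init_inherit_lock(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    /* The owner is hoisted only while a higher-priority task is pending. */
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    pthread_mutex_init(&res_lock, &attr);
    pthread_mutexattr_destroy(&attr);
}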
Because the majority of locks in real-time applications aren't contended, the priority
inheritance protocol has good average-case performance. When a lock isn't contended, priorities
don't change; there is no additional overhead. However, the worst-case performance for the
priority inheritance protocol is worse than the worst-case priority ceiling protocol, since nested
resource locks increase the wait time. The maximum duration of the priority inversion is the sum
of the execution times of all of the nested resource locks. Furthermore, nested resource locks can
lead to deadlock when you use the priority inheritance protocol. That makes it important to design
the application so that deadlock can't occur.
Nested resource locks should obviously be avoided if possible. An inadequate or incomplete
understanding of the interactions between tasks can lead to nested resource locks. A well-thought-out design is the best tool a programmer can use to prevent these.
You can avoid deadlock by allowing each task to own only one shared resource at a time.
When this condition is met, the worst-case wait time matches the priority ceiling protocol's worst-case wait. In order to prevent misuse, some operating systems that implement priority inheritance
don't allow nested locks. It might not be possible, however, to eliminate nested resource locks in
some applications without seriously complicating the application.
But remember that allowing tasks to acquire multiple priority inheritance resources can
lead to deadlock and increase the worst-case wait time.
Priority inheritance is difficult to implement, with many complicated scenarios arising
when two or more tasks attempt to access the same resources. The algorithm for resolving a long
chain of nested resource locks is complex. It's possible to incur a lot of overhead as hoisting one
task results in hoisting another task, and another, until finally some task is hoisted that has the
resources needed to run. After executing its critical section, each hoisted task must then return to
its original priority.
Figure 5 shows the simplest case of the priority inheritance protocol in which a low-priority task
acquires a resource that's then requested by a higher priority task. Figure 6 shows a slightly more
complex case, with a low-priority task owning a resource that's requested by two higher-priority
tasks. Figure 7 demonstrates the potential for complexity when three tasks compete for two
resources.
The sequence of events in Figure 7 is as follows:
1. Task 3 is given control of the processor and begins executing. The task requests Resource A.
2. Task 3 acquires ownership of Resource A and begins executing its critical region.
3. Task 3 is preempted by Task 2, a higher-priority task. Task 2 requests ownership of Resource B.
4. Task 2 is granted ownership of Resource B and begins executing its critical region.
5. Task 2 requests ownership of Resource A, which is owned by Task 3.
6. Task 3 is hoisted to a priority above Task 2 and resumes executing its critical region.
7. Task 3 is preempted by Task 1, a higher-priority task.
8. Task 1 requests Resource B, which is owned by Task 2.
9. Task 2 is hoisted to a priority above Task 1. However, Task 2 still can't execute because it must wait for Resource A, which is owned by Task 3.
10. Task 3 is hoisted to a priority above Task 2 and continues executing its critical region.
11. Task 3 releases Resource A and is lowered back to its original priority.
12. Task 2 acquires ownership of Resource A and resumes executing its critical region.
13. Task 2 releases Resource A and then releases Resource B. The task is lowered back to its original priority.
14. Task 1 acquires ownership of Resource B and begins executing its critical region.
15. Task 1 releases Resource B and continues executing normally.
16. Task 1 finishes executing. Task 2 resumes and continues executing normally.
17. Task 2 finishes executing. Task 3 resumes and continues executing normally.
18. Task 3 finishes executing.
Manage resource ownership
Most RTOSes that support priority inheritance require resource locks to be properly nested,
meaning the resources must be released in the reverse order to that in which they were acquired.
For example, a task that acquired Resource A and then Resource B would be required to release
Resource B before releasing Resource A.
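A minimal sketch of properly nested locking in C, using POSIX mutexes as stand-ins for the RTOS's resource locks:

#include <pthread.h>

pthread_mutex_t resource_a = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t resource_b = PTHREAD_MUTEX_INITIALIZER;

void use_both_resources(void)
{
    pthread_mutex_lock(&resource_a);    /* acquire A first           */
    pthread_mutex_lock(&resource_b);    /* then acquire B            */

    /* ... critical section using both resources ... */

    pthread_mutex_unlock(&resource_b);  /* release in reverse order: */
    pthread_mutex_unlock(&resource_a);  /* B before A                */
}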
Figure 7 provides an example of priority inheritance in which two resources are released in the
opposite order to that in which they were acquired. Task 2 acquired Resource A before Resource
B; Resource B was then released before Resource A. In this example, Task 2 was able to release
the resources in the proper order without adversely affecting the application.
Many operating systems require resources to be released in the proper order because it's difficult
to implement the capability to do otherwise. However, situations occur in which releasing the
resources in the proper order is neither possible nor desirable. Suppose there are two shared
resources: Resource B can't be acquired without first owning Resource A. At some point during
the execution of the critical region with Resource B, Resource A is no longer needed. Ideally,
Resource A would now be released. Unfortunately, many operating systems don't allow that. They
require Resource A to be held until Resource B is released, at which point Resource A can be
released. If a higher-priority task is waiting for Resource A, the task is kept waiting unnecessarily
while the resource's current owner executes.
Although releasing resources in an arbitrary order is clearly advantageous, many implementations
of the priority inheritance protocol only support sequentially nested resource locks.
The example in Figure 9 helps show why it's more difficult to implement priority inheritance while
allowing resources to be released in any order. If a task owns multiple shared resources and has
been hoisted several times, care must be taken when the task releases those resources. The task's
priority must be adjusted to the appropriate level. Failure to do so may result in unbounded priority
inversion.
Avoid inversion
The best strategy for solving priority inversion is to design the system so that inversion can't occur.
Although priority ceilings and priority inheritance both prevent unbounded priority inversion,
neither protocol prevents bounded priority inversion. Priority inversion, whether bounded or not,
is inherently undesirable: you don't want a high-priority task waiting for a low-priority
task that holds a shared resource.
Prior to implementing an application, examine its overall design. If possible, avoid sharing
resources between tasks at all. If no resources are shared, priority inversion is precluded.
If several tasks do use the same resource, consider combining them into a single task. The
sub-tasks can access the resource through a state machine in the combined task without fear of
priority inversion. Unless the competing sub-tasks are fairly simple, however, the state machine
might be too complex to justify.
Another way to prevent priority inversion is to ensure that all tasks that access a common resource
have the same priority. Although one task might still wait while another task uses the resource, no
priority inversion will occur because both tasks have the same priority. Of course, this only works
if the RTOS provides a non-preemptive mechanism for gracefully switching between tasks of
equal priority.
If you can't use any of these techniques to manage shared resources, consider giving a "server
task" sole possession of the resource. The server task can then regulate access to the resource.
When a "client task" needs the resource, it must call upon the server task to perform the required
operations and then wait for the server to respond. The server task must be at a priority greater
than that of the highest-priority client task that will access the resource. This method of controlling
access to a resource is similar to the priority ceiling protocol and requires static analysis to
determine the priority of the server task. The method relies on RTOS message passing and
synchronization services instead of resource locks and dynamic task-priority adjustments.
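A minimal sketch of the server-task pattern, using FreeRTOS queue calls as one plausible message-passing API; the request format and the resource operation are hypothetical:

#include "FreeRTOS.h"
#include "queue.h"

/* Hypothetical request format used by client tasks. */
typedef struct {
    int op;               /* operation to perform on the resource */
    QueueHandle_t reply;  /* client's private reply queue         */
} request_t;

/* Created at startup, e.g. server_queue = xQueueCreate(8, sizeof(request_t)); */
static QueueHandle_t server_queue;

/* Hypothetical operation on the resource the server owns. */
extern int perform_operation(int op);

/* Server task: sole owner of the resource, running at a priority
 * above every client task that uses it. Clients never lock the
 * resource directly; they send a request and wait for the reply. */
void server_task(void *arg)
{
    request_t req;
    (void)arg;
    for (;;) {
        if (xQueueReceive(server_queue, &req, portMAX_DELAY) == pdTRUE) {
            int result = perform_operation(req.op);
            xQueueSend(req.reply, &result, portMAX_DELAY);
        }
    }
}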
Part A
1. What is a task?
In an operating system, a task is a basic unit of work or execution; processes and threads
may also be considered tasks (or subtasks). All of today's widely-used operating systems support
multitasking, which allows multiple tasks to run concurrently, taking turns using the resources of
the computer.
2. What is a thread?
A thread is a single sequence stream within a process. Because threads have some of the
properties of processes, they are sometimes called lightweight processes. Within a process,
threads allow multiple streams of execution.
3. What are the advantages and disadvantages of user-level threads over kernel-level threads?
The procedures that save a thread's state and the scheduler are just local procedures, so
invoking them is much more efficient than making a kernel call. User-level threads have other
advantages as well: they allow each process to have its own customized scheduling algorithm, and
they scale better, since kernel threads invariably require some table space and stack space in the
kernel, which can be a problem if there are a very large number of threads.
Despite their better performance, user-level thread packages have some major problems. First
among these is the problem of how blocking system calls are implemented. Kernel threads do not
require any new non-blocking system calls; their main disadvantage is that the cost of a system
call is substantial, so if thread operations (creation, termination, etc.) are common, much more
overhead will be incurred.
5. What is multiprocessing?
At the operating system level, multiprocessing is sometimes used to refer to the execution
of multiple concurrent processes in a system, with each process running on a separate CPU or
core, as opposed to a single process at any one instant.
6. What is multitasking?
Multitasking, in an operating system, is allowing a user to perform more than one computer
task (such as the operation of an application program) at a time. The operating system is able to
keep track of where you are in these tasks and go from one to the other without losing information.
7. What is shared memory?
Shared memory is an efficient means of passing data between programs. One program creates
a memory region which other processes (if permitted) can access.
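A minimal sketch using the POSIX shared-memory API; the segment name and size are supplied by the caller:

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create (or open) a named shared-memory segment and map it into
 * this process. Other processes that map the same name see the
 * same bytes. */
void *create_shared_region(const char *name, size_t size)
{
    int fd = shm_open(name, O_CREAT | O_RDWR, 0666);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping remains valid after the fd is closed */
    return (p == MAP_FAILED) ? NULL : p;
}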
Part B
1. Explain pre-emptive and non-preemptive scheduling in RTOS-based embedded system design.
2. Explain interrupt routines in RTOS-based embedded system design.
3. Explain the concepts of tasks, processes, and threads in RTOS system design.
4. Write a brief note on multiprocessing and multitasking.
5. Explain message passing and shared memory in inter-process communication.
6. Compare the real-time operating systems VxWorks, µC/OS-II, and RT Linux.
UNIT V
EMBEDDED SYSTEM APPLICATION DEVELOPMENT
Case Study of Washing Machine
Automotive Application
Smart card System Application
Case Study of a Washing Machine
Functional Units:
Valve control unit: controls water inflow and outflow.
Sensing unit: senses the load, water availability and level, detergent availability, door
open/close, water temperature, and motor speed.
Motor control unit: provides clockwise and counter-clockwise rotation, with separate
speeds for washing, rinsing, and drying.
Display unit: LEDs indicate completion of the process, occurrence of a problem while
washing, and set/reset of buttons; a seven-segment display shows numeric values.
Control Flow
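The control-flow diagram itself is not reproduced here. In its place, a minimal sketch in C of the wash cycle as a state machine, assuming a simple fill → wash → rinse → spin sequence; the state names, timings, and sensor/actuator functions are hypothetical:

#include <stdbool.h>

/* Hypothetical wash-cycle states. */
typedef enum { ST_IDLE, ST_FILL, ST_WASH, ST_RINSE, ST_SPIN, ST_DONE } wash_state_t;

/* Hypothetical sensor/actuator interfaces supplied by the hardware layer. */
extern bool door_closed(void);
extern bool water_at_level(void);
extern bool timer_expired(void);
extern void open_inlet_valve(bool on);
extern void run_motor(int rpm, bool clockwise);
extern void start_timer(int seconds);

/* One step of the control loop: read inputs, drive outputs,
 * and return the next state. */
wash_state_t step(wash_state_t s)
{
    switch (s) {
    case ST_IDLE:
        if (door_closed()) { open_inlet_valve(true); s = ST_FILL; }
        break;
    case ST_FILL:
        if (water_at_level()) { open_inlet_valve(false); start_timer(600); s = ST_WASH; }
        break;
    case ST_WASH:
        run_motor(50, true);                 /* slow agitation during wash  */
        if (timer_expired()) { start_timer(300); s = ST_RINSE; }
        break;
    case ST_RINSE:
        run_motor(50, false);                /* reverse agitation for rinse */
        if (timer_expired()) { start_timer(240); s = ST_SPIN; }
        break;
    case ST_SPIN:
        run_motor(1200, true);               /* high speed to dry the load  */
        if (timer_expired()) { run_motor(0, true); s = ST_DONE; }
        break;
    case ST_DONE:
        break;
    }
    return s;
}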
Smart Card System Application
(Figure: ISO 7816 smart-card contact pads: Vcc, RST, CLK, RFU, GND, Vpp, I/O, RFU.)
Typical Configurations:
256 bytes to 4 KB of RAM.
8 KB to 32 KB of ROM.
1 KB to 32 KB of EEPROM.
Crypto-coprocessors are optional.
8-bit to 16-bit CPU; 8051-based designs are common.
Smart Card Readers:
Computer-based readers connect through USB or COM (serial) ports.
Dedicated terminals usually have a small screen, keypad, and printer, and often also
include biometric devices such as a thumb-print scanner.
Communication mechanisms:
Communication between the smart card and the reader is standardized by the ISO 7816 standard.
Commands are initiated by the terminal and interpreted by the card OS; the card state is
updated, and a response is given by the card.
Commands have the following structure (see the sketch below).
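The figure showing the command layout is not reproduced here. A sketch of the ISO 7816-4 command APDU fields as a C struct: the four header bytes (CLA, INS, P1, P2) are standard, while the body fields (Lc, data, Le) are optional and variable-length in the actual protocol.

#include <stdint.h>

#define APDU_MAX_DATA 255  /* short-APDU data limit */

/* Command APDU as defined by ISO 7816-4 (simplified, short form). */
typedef struct {
    uint8_t cla;                  /* class of instruction         */
    uint8_t ins;                  /* instruction code             */
    uint8_t p1;                   /* parameter 1                  */
    uint8_t p2;                   /* parameter 2                  */
    uint8_t lc;                   /* length of command data       */
    uint8_t data[APDU_MAX_DATA];  /* command data                 */
    uint8_t le;                   /* expected response length     */
} apdu_command_t;

/* Response APDU: optional data followed by two status bytes;
 * SW1/SW2 = 0x90 0x00 indicates success. */
typedef struct {
    uint8_t data[APDU_MAX_DATA];
    uint8_t sw1;
    uint8_t sw2;
} apdu_response_t;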
Software part of cards:
The card needs cryptographic software and special features in its operating system.
Protected environment: the OS is stored in the protected part of ROM.
A security-service API allows the smart card and the reader to mutually authenticate
each other and to encrypt the data exchanged between the card and the reader.
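A minimal sketch of the challenge-response idea behind such mutual authentication, assuming a shared secret key and a symmetric encrypt() primitive; all function names here are hypothetical, and real cards use standardized schemes such as ISO 7816-4 secure messaging:

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical primitives assumed to exist on both sides. */
extern void random_bytes(uint8_t *buf, int len);
extern void encrypt(const uint8_t key[16], const uint8_t in[16], uint8_t out[16]);

/* Reader side: the card proves knowledge of the shared key by
 * encrypting the reader's random challenge; the reader checks
 * the result against its own computation. */
bool authenticate_card(const uint8_t shared_key[16],
                       void (*send)(const uint8_t *, int),
                       void (*recv)(uint8_t *, int))
{
    uint8_t challenge[16], expected[16], reply[16];

    random_bytes(challenge, sizeof challenge);
    send(challenge, sizeof challenge);          /* challenge the card      */
    recv(reply, sizeof reply);                  /* card's encrypted reply  */
    encrypt(shared_key, challenge, expected);   /* what the card should send */
    return memcmp(reply, expected, sizeof expected) == 0;
    /* Mutual authentication repeats the same exchange with the
     * roles reversed, so the card also verifies the reader. */
}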
Card Interfacing:
(Figure: reader-card message sequence; the reader issues commands and the card responds OK to each.)
Automotive Applications:
Automotive designs increasingly rely on advanced CPUs and a higher-level language in which
designers can easily reuse modules from project to project. A successful automotive-electronic
design depends on careful processor selection. Modern power-train controllers for the engine and
transmission generally require 32-bit CPUs to process the real-time algorithms. Other areas of
the automobile, such as safety, chassis, and body systems, use both 16-bit and 32-bit processors,
depending on control complexity.
Although some critical timing situations still use assembly language, the software trend in
automotive embedded systems is toward C. The control software for current vehicles is more
complicated and precise. Advanced use of embedded systems and electronics within the vehicle can
help control the amount of pollution generated and increase the ability to provide system
monitoring and diagnostic capabilities without sacrificing the safety and security features that
consumers demand. The electronic content within the vehicle continues to grow as more systems
become intelligent through the addition of microcontroller-based electronics.
A typical vehicle today contains an average of 25-35 microcontrollers, with some luxury vehicles
containing up to 70 microcontrollers per vehicle. Flash-based microcontrollers are continuing to
replace relays, switches, and traditional mechanical functions with higher-reliability components
while eliminating the cost and weight of copper wire. Embedded controllers also drive motors to
operate power seats, windows, and mirrors. Driver-information processors display or announce
navigation and traffic information along with vehicle diagnostics.
Embedded controllers are even keeping track of your driving habits. In addition, enormous
activity occurs in the entertainment and mobile-computing areas. Networks are a recent addition
to embedded controllers, bringing the challenge of squeezing in the hardware and code for in-car
networking. To satisfy new government emissions regulations, vehicle manufacturers and the
Society of Automotive Engineers (SAE) developed J1850, a specialized automotive-network protocol.
Although J1850 is now standard on US automobiles, European manufacturers support the
controller-area network (CAN).
High-bandwidth, real-time control applications like power train, airbags, and braking need the
1 Mbps speed of CAN, and their safety-critical nature justifies the associated cost. The Local
Interconnect Network (LIN) is typically a sub-bus network that is localized within the vehicle
and has a substantially lower implementation cost than a CAN network. It serves low-speed,
low-bandwidth applications like mirror controls, seat controls, fan controls, environmental
controls, and position sensors. Embedded systems in automotive applications share the general
characteristics of common embedded systems, but they have their own primary design goals set by
the automotive industry.
Reliability and cost may be the toughest design goals to achieve because of the rugged
environment of the automobile. The circuitry must survive nearby high-voltage electromagnetic
interference (EMI), temperature extremes from the weather and the heat of the engine, and severe
shock from bad roads and occasional collisions. The electronic control units (ECUs) should be
developed and tested under all kinds of conditions at low cost. Testing time grows with the
complexity of the system, and a reliable controller requires complete software testing to verify
every state and path. A single bug that slips through testing may force a very expensive recall
to update the software. The development of high-capability tools is therefore also active in the
field of automotive embedded systems.
Motivation and Objective of the Research
As the core of the automotive electronic and control system, the combination of ECUs continues
to advance tomorrow's automobiles, providing the driver with a safer, more secure, more
energy-efficient, and more reliable vehicle. The quest to provide fuel-efficient, environmentally
friendly vehicles and the concern for safety and security are becoming everyday concerns for
consumers, not only in the automotive market but in our daily lives. These are the problems this
study focuses on.
Along with the flood of breakthroughs and innovations in the world of automotive technology,
considerable attention has been given to the most crucial element of environment and driving.
Automobile exhaust contributes about 10% of the world's air-pollution problems through carbon
monoxide and nitrogen oxide emissions. The increase in automobile fuel consumption threatens the
world's oil reserves, a considerable part of which is allocated to transportation. As
environmental concerns mount, governmental regulations are being driven toward alleviating these
problems.
Engine controls can meet stricter emission laws and fuel-economy standards. Power-train
computers adjust the engine and transmission for best performance. The electronic content in
engine controls creates a networked, closed-loop system that can manage the emissions and fuel
economy of the vehicle by creating the ideal fuel/air mixture ratio. Yet although vehicles have
flourished along with the flood of breakthroughs and innovations in automotive technology, many
traffic accidents still happen.
According to World Health Organization figures, an estimated 1.17 million deaths occur and over
10 million people are crippled or injured worldwide each year due to road accidents.
Safety/security processors remind you to use seat belts, warn you of hazards, and deploy air
bags during an accident. Automotive security and safety begin even before a journey starts. The
field includes developments in active safety technology, which allows us to actively predict the
occurrence of a traffic accident, and passive safety technology, which allows us to reduce
injury to persons involved in any accident that does happen.
It also includes vehicle self-security: preventing the vehicle from being stolen or robbed. In
the field of vehicle safety and security, a major trend sweeping the automotive industry is the
transition from mechanical connections to fault-tolerant electric/electronic systems using
wires, controllers, sensors, and actuators to control mechanical functions such as steering,
braking, throttle, and suspension. This technology connects the entire vehicle into a unified
system combining different embedded parts.
Embedded System Design for Engine Control System
In modern society, with the rapid development of the automotive industry, environmental
pollution has gradually become a challenging problem to which more and more people pay
attention. Increased environmental awareness and requirements for drivability have raised
interest and investment in research into complicated automotive modeling and control methods.
One line of research aims at high power output while still maintaining good fuel economy; this
can be achieved using a smaller but turbocharged spark-ignited engine with a three-way catalyst
to reduce emissions.
In practice, this is mostly accomplished by improving air-fuel ratio control. The air-fuel ratio
is the ratio of air mass to fuel mass in the cylinder when the valves are closed. The mass of
air flowing into the cylinder can be obtained from the pressure in the intake manifold and the
cylinder air-charge efficiency. The engine control unit (ECU) measures the intake-manifold air
pressure and estimates the cylinder air-charge efficiency, based on which it can decide the mass
of fuel to inject. Thus it is natural to focus on the air path, where the main differences lie.
In addition, it has been argued that the air dynamics have a more significant influence on the
air-fuel ratio than the fuel dynamics (Powell et al., 1998b).
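A minimal sketch of this calculation in C, using the standard speed-density relation and a stoichiometric target of about 14.7:1 for gasoline; the charge-efficiency lookup is a hypothetical placeholder for the ECU's calibrated map:

#define R_AIR       287.0  /* specific gas constant of air, J/(kg*K)  */
#define AFR_STOICH  14.7   /* stoichiometric air-fuel ratio, gasoline */

/* Hypothetical calibrated map: cylinder air-charge efficiency as a
 * function of engine speed and manifold pressure. */
extern double charge_efficiency(double rpm, double p_manifold);

/* Speed-density estimate of the air mass (kg) trapped per cylinder
 * per cycle, from manifold pressure (Pa), manifold temperature (K),
 * and cylinder displacement (m^3). */
double cylinder_air_mass(double p_man, double t_man, double v_cyl, double rpm)
{
    double eta = charge_efficiency(rpm, p_man);
    return eta * p_man * v_cyl / (R_AIR * t_man);  /* ideal-gas law */
}

/* Fuel mass (kg) to inject for a stoichiometric mixture. */
double fuel_mass_to_inject(double air_mass)
{
    return air_mass / AFR_STOICH;
}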
Hence, a key requirement for precise air-fuel ratio control is precise knowledge of the
intake-manifold pressure or mass flow. Transient air and fuel estimation are still difficult
tasks, since there are considerable non-linear dynamics between the actuators and sensors on the
engine. Although the air path has been thoroughly studied for naturally aspirated engines,
additional research on the air system of turbocharged engines continues because of their more
complex intake system, in which couplings between the intake and exhaust sides influence the
intake-manifold pressure and the cylinder air-charge efficiency. Generally, a mathematical model
of the engine intake system is developed to calibrate the air pressure inside it, and this model
is used for real-time predictive control. The model of the engine intake system is very
complicated, and it becomes more complex still when a turbocharger is fitted.
Even when a precise model is available, the calculation method may not be fast enough for
predictive control in real time because of the solution-speed limitations of current hardware
and software. For example, the emissions of hydrocarbons and carbon monoxide are reduced if
injection finishes around intake-valve opening; see e.g. (Bouza and Caserta, 2003). This means
that the fuel is injected before the induction stroke starts. Therefore, in transient
conditions, the ECU has to predict the mass of air in the cylinder before the intake valve
opens. The required prediction time is at least the sum of the computation time, the injection
time, and the actuator delays. Typically, the necessary prediction time is around one engine
revolution, and because a modern engine rotates at high speed, one revolution takes on the order
of a millisecond (at 6,000 rpm, for instance, one revolution takes 10 ms).
Hybrid modeling is a good way to describe an automotive engine system, because the intake system
of a turbocharged engine has natural hybrid characteristics. The states of air pressure and mass
flow can be expressed by continuous state variables, while the throttle-plate angle inputs and
the influence of the turbocharger can be expressed by discontinuous variables. Under these
constraints, the air flow in the intake system can accelerate, decelerate, and even reverse.
Modeling and solving this system as a hybrid system reflects its essence and is closest to
physical reality. However, since the class of hybrid control problems is extremely broad (it
contains continuous control problems as well as discrete-event control problems as special
cases), it is very difficult to devise a general yet effective strategy to solve them. In our
opinion, it is important to address significant application domains, to develop further
understanding of the implications of the hybrid model for simulation algorithms, and to evaluate
whether this formalism can be of substantial help in solving complex real-time control problems.
Furthermore, almost all control algorithms are implemented on microcomputer units that interact
with practical plants. As the computing tasks performed by embedded devices become more
sophisticated, the need for speed and stability in embedded software becomes more apparent. We
face the problem of obtaining the most precise system trajectory at the least cost: faster and
more stable software has to be developed for practical plants within current hardware
limitations. A novel hybrid simulation algorithm is developed and implemented to solve the model
of the intake system of a turbocharged engine for predictive control. The parameters output by
this model are the most important parameters in the engine control system. A hybrid system is a
non-smooth system consisting of sets of differential equations and discrete variables that
evolve according to external control commands and internal evolution rules. In this case, the
system is not suitable for direct numerical methods, since it exhibits chattering (oscillation)
between the intersections of different regions in certain situations.
A first issue of great practical importance in hybrid-system simulation is whether the solver
can detect events precisely. In the proposed approach, the sign of the event function is
monitored, and an event is searched for within the span of an integration step, from the
approximation of the state variable at the current time to the approximation at the next step,
using a fixed step size. When the sign of the event function changes from positive to negative
or from negative to positive, an event has happened and the system trajectory has crossed the
switching surface. A first-in-first-out queue is used to store the calculated approximations of
the state variables. Events are classified as basic events and induced events; induced events
are affiliated with the basic ones and will not trigger a location change if the basic events
are not triggered.
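A minimal sketch in C of this sign-change test during fixed-step integration; f_event() and integrate_step() are hypothetical placeholders for the event function and the one-step integrator:

/* Hypothetical problem-specific functions. */
extern double f_event(const double *state);           /* event function g(x)      */
extern void integrate_step(double *state, double h);  /* one fixed step of size h */

/* Advance the simulation by n fixed steps of size h, counting each
 * step in which the event function changes sign (i.e., the
 * trajectory crosses the switching surface). */
int simulate(double *state, double h, int n)
{
    int events = 0;
    double g_prev = f_event(state);

    for (int i = 0; i < n; i++) {
        integrate_step(state, h);
        double g_now = f_event(state);
        if (g_prev * g_now < 0.0) {
            /* Sign change: an event lies inside this step. A real
             * solver would now locate it (e.g., by bisection) and
             * switch to the appropriate mode or sliding dynamics. */
            events++;
        }
        g_prev = g_now;
    }
    return events;
}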
This processing makes the numerical algorithm relatively straightforward to implement and
reduces the number of checks that have to be made each time an event is triggered, so the
program is more efficient and faster. In the event-location procedure, the transition phenomena
around the switching surface are analyzed, and the integration formula is designed based on the
Filippov structure to calculate the integration on the sliding surface. Based on this event
mechanism and these integration formulas, an additional node is added to the simulation
procedure as an extension of the common hybrid automaton.
A calculation algorithm is presented that transforms the system near the switching surface of
two regions into three nodes, so that undesirable transitions are avoided and the solutions are
smooth and efficient. The model of the intake system of the turbocharged engine is built from an
analysis of thermodynamic and hydrodynamic characteristics and from sampled experimental data.
The model is embedded in the engine control unit to estimate the air mass flowing into the
cylinders: the current parameters are sampled by the sensors, and the next-step values are
calculated by the embedded internal model.
According to these values and compensation parameters such as water temperature, engine rotation
speed, etc., the fuel injection can be decided. The calculation speed must therefore be fast
enough to keep up with the engine's rotation. The model is expressed by a set of differential
equations with condition selection on the right-hand side; it is developed from the
hybrid-system viewpoint and solved using the proposed algorithm. The trajectories are smooth
over the entire range of throttle-angle inputs.
Furthermore, the calculation speed is improved at least eightfold compared with the former
method, and the error is restricted to less than 1%. This solution is verified on a MATLAB and
Visual C++ platform. Finally, the intake-system model is implemented on an FPGA chip so it can
be embedded into the ECU for real-time control.
The ABS control algorithm is first simulated together with the vehicle models to study the
behavior of the overall system and to optimize the algorithm before building prototypes. In the
second part, a hardware-in-the-loop (HIL) platform is constructed, including the computer
cluster, the hardware-vehicle interaction (sampling, time lags, etc.), actual vehicle
components, the ABS controller, and the actuator. All the components, including the engine, ABS
controller, sensors, and vehicle model, are connected together by a controller area network
(CAN) (SAE J2056/1 and J2056/2, 1994).
Rather than testing these components in complete actual system setups, the virtual system
allows new components and prototypes to be tested by communicating with software models on the
main computer over the CAN interface. This technology is flexible enough to allow expansion and
reconfiguration in line with the development of modern automobiles. To meet real-time
requirements, one computer runs the vehicle model exclusively and another handles data and
graphics processing; the two are connected through Ethernet using TCP/IP. During HIL simulation,
a novel simulation algorithm is used to deal with abrupt changes of hydraulic pressure and to
keep the entire system robust and stable. The structure of the entire hardware and software
system is shown in the figure (not reproduced here).
MATLAB, a database, and Excel are integrated into this system through the Visual C++ programming
platform on an assistant computer. Microsoft Access is chosen as the data-storage database, and
ADO is used for data operations. The experimental data is stored in the database through
Ethernet. The necessary data can be retrieved from the database and processed in MATLAB after
being converted to the proper data format. Furthermore, the desired data and figures can be
imported into Excel, which makes data analysis and export convenient for ABS designers.
Conventional ABS control algorithms must account for non-linearity in brake torque due to
temperature variation and the dynamics of brake-fluid viscosity. Although fuzzy logic is
rigorously structured mathematically, one of its advantages is the ability to describe systems
linguistically through rule statements. Fuzzy-logic-based control therefore achieves better
characteristics for the ABS controller than traditional control algorithms. Owing to the nature
of fuzzy logic, influential dynamic factors are accounted for in a rule-based description of
ABS. This type of intelligent algorithm allows the control result to be improved and optimized.
The algorithm was tested on the HIL platform, and the desired results were achieved.
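A minimal sketch in C of the rule-based idea, using wheel slip and its rate of change as inputs and a brake-pressure adjustment as output; the thresholds and rule outcomes are hypothetical illustrations, not the calibrated values of an actual controller:

/* Crisp rule table illustrating the linguistic form of fuzzy ABS
 * rules, e.g. "IF slip is HIGH and slip is rising THEN release
 * pressure strongly". A full fuzzy controller would use smooth
 * membership functions and weighted rule aggregation instead of
 * hard thresholds. */

typedef enum { SLIP_LOW, SLIP_OPTIMAL, SLIP_HIGH } slip_level_t;

static slip_level_t classify_slip(double slip)  /* slip in [0, 1] */
{
    if (slip < 0.10) return SLIP_LOW;       /* wheel rolling freely */
    if (slip < 0.25) return SLIP_OPTIMAL;   /* near peak friction   */
    return SLIP_HIGH;                       /* approaching lock-up  */
}

/* Returns a brake-pressure adjustment in [-1, +1]:
 * negative = release pressure, positive = apply more. */
double abs_rule_step(double slip, double slip_rate)
{
    switch (classify_slip(slip)) {
    case SLIP_LOW:     return +0.5;                             /* brake harder   */
    case SLIP_OPTIMAL: return (slip_rate > 0.0) ? -0.1 : +0.1;  /* hold near peak */
    case SLIP_HIGH:    return (slip_rate > 0.0) ? -0.8 : -0.3;  /* back off       */
    }
    return 0.0;
}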
Air Bag Mechanism:
Air bags are among the most important safety improvements added to cars and light trucks in
recent years, providing extra protection for front-seat occupants in head-on crashes and, more
recently, protection for every passenger.
Event Data Recorder (EDR), or the "Black Box"
The Event Data Recorder (EDR) in an automobile is based on the sensors and microprocessor
computer system that are used to activate the airbag in the vehicle during a crash.
Principle of functioning:
Wheel-speed sensors detect whether a wheel is showing a tendency to lock up.
The braking pressure is corrected at high speed, up to shortly before the lock-up threshold.
The brake-fluid return, together with the closed-loop brake circuits, makes this a safe,
reliable, and cost-effective system.
Advantages:
A gain for driving safety: the vehicle remains steerable even during panic braking.
Shorter stopping distances on practically all road surfaces.
(Figure: ABS layout: a wheel-speed sensor and a brake at each wheel, connected to the ABS
control module and the hydraulic pump.)
ANTI-LOCK BRAKES:
(Figure: ABS components: master cylinder assembly, pressure valve, dump/vent valve, speed
sensor, anti-lock brake module, 12 V supply.)
Other automotive embedded applications include heads-up displays, night vision, back-up
collision sensors, and navigation systems.
Part B
1. Explain the design, analysis, and processing of a fully automatic washing machine.
2. Explain the design, analysis, and processing of a smart card system.
3. Explain the design, analysis, and processing of an air-bag system in automobiles.
4. Explain the design, analysis, and processing of an ABS system in a car.