Embedded Systems - Unit-1 To Unit-5. - K.anil Kumar
E.g. Electronic Toys, Mobile Handsets, Washing Machines, Air Conditioners, Automotive
Control Units, Set Top Box, DVD Player etc…
Software and firmware. Software for embedded systems can vary in complexity. However,
industrial-grade microcontrollers and embedded IoT systems usually run very simple
software that requires little memory.
Real-time operating system. These are not always included in embedded systems,
especially smaller-scale systems. RTOSes define how the system works by supervising the
software and setting rules during program execution.
In terms of hardware, a basic embedded system would consist of the following elements:
Based on Generation
Based on Complexity and Performance
Based on Performance and Functional Requirements
Based on Triggering
1.4.1.1) First Generation: The early embedded systems built around 8-bit microprocessors like
8085 and Z80 and 4-bit microcontrollers.
1.4.1.2) Second Generation: Embedded Systems built around 16-bit microprocessors and 8 or
16-bit microcontrollers, following the first generation embedded systems.
1.4.1.3) Third Generation: Embedded Systems built around high performance 16/32 bit
Microprocessors/controllers, Application Specific Instruction set processors like Digital Signal
Processors (DSPs), and Application Specific Integrated Circuits (ASICs).The instruction set is
complex and powerful.
1.4.1.4) Fourth Generation: Embedded Systems built around System on Chips (SoC’s), Re-
configurable processors and multicore processors. It brings high performance, tight integration
and miniaturization into the embedded device market.
1.4.2.1) Small Scale: The embedded systems built around low performance and low cost 8 or 16 bit microprocessors/microcontrollers. They are suitable for simple applications where performance is not time critical, and they may or may not contain an OS.
1.4.2.2) Medium Scale: Embedded Systems built around medium performance, low cost 16 or 32 bit microprocessors/microcontrollers or DSPs. These are slightly more complex in hardware and firmware, and they may contain a GPOS/RTOS.
1.4.2.3) Large Scale/Complex: Embedded Systems built around high performance 32 or 64 bit RISC processors/controllers, RSoC or multi-core processors and PLDs. They require complex hardware and software. These systems may contain multiple processors/controllers and co-units/hardware accelerators for offloading the processing requirements from the main processor. They contain an RTOS for scheduling, prioritization and management.
A Real-Time Embedded System is strictly time specific, which means these embedded systems provide output within a particular/defined time interval. This type of embedded system provides a quick response in critical situations, giving the highest priority to time-based task performance and generation of output. That is why real-time embedded systems are used in the defense sector, in the medical and health care sector, and in other industrial applications where producing the output at the right time is given more importance.
Further, Real-Time Embedded Systems are divided into two types, i.e.
1.4.3.1.1) Soft Real Time Embedded Systems –
In these types of embedded systems the time/deadline is not followed strictly. If the deadline of the task is missed (i.e., the system did not give the result within the defined time), the result or output is still accepted.
1.4.3.1.2) Hard Real Time Embedded Systems –
In these types of embedded systems the time/deadline of a task is strictly followed. The task must be completed within the defined time interval, otherwise the result/output may not be accepted.
Examples :
a. Traffic control system
b. Military usage in defense sector
c. Medical usage in health sector
Stand Alone Embedded Systems are independent systems which can work by themselves; they do not depend on a host system. They take input in digital or analog form and provide the output.
Examples :
a. MP3 players
b. Microwave ovens
c. Calculator
Networked Embedded Systems are connected to a network, which may be wired or wireless, to provide output to the attached device. They communicate with an embedded web server through the network.
Examples:
a. Home security systems
b. ATM machine
c. Card swipe machine
Mobile embedded systems are small, easy to use and require fewer resources. They are the most preferred embedded systems, and from the portability point of view they are also the best.
Examples:
a. MP3 player
b. Mobile phones
c. Digital Camera
1.4.4.1) Event Triggered: Activities within the system (e.g., task run-times) are dynamic and depend upon the occurrence of different events.
1.4.4.2) Time Triggered: Activities within the system follow a statically computed schedule (i.e.,
they are allocated time slots during which they can take place) and thus by nature are predictable.
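As a minimal illustration of the two triggering styles (my own sketch, not from the text; function names such as task_a() and wait_for_next_slot() are hypothetical stubs), the C fragment below contrasts a statically scheduled cyclic executive with an event-driven loop:

#include <stdio.h>

volatile int event_flag = 0;              /* would be set by an interrupt handler */

static void task_a(void) { /* periodic work for slot 0 */ }
static void task_b(void) { /* periodic work for slot 1 */ }
static void handle_event(void) { puts("event handled"); }
static void wait_for_next_slot(void) { /* block until the next timer tick */ }

/* Time triggered: tasks run in statically allocated slots, so timing is predictable. */
static void time_triggered_cycle(void)
{
    task_a();                             /* slot 0 */
    task_b();                             /* slot 1 */
    wait_for_next_slot();
}

/* Event triggered: work is done only when an event occurs, so run-times are dynamic. */
static void event_triggered_cycle(void)
{
    if (event_flag) {
        event_flag = 0;
        handle_event();
    }
}

int main(void)
{
    event_flag = 1;                       /* simulate one event for demonstration */
    time_triggered_cycle();
    event_triggered_cycle();
    return 0;
}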
Each Embedded System is designed to serve the purpose of any one or a combination of the
following tasks.
Data Collection/Storage/Representation
Data Communication
Data (Signal) Processing
Monitoring
Control
Application Specific User Interface
c. The data communication can happen through a wired interface (like Ethernet, RS-232C/USB/IEEE1394, etc.) or a wireless interface (like Wi-Fi, GSM/GPRS, Bluetooth, ZigBee, etc.)
d. Network hubs, Routers, switches, Modems etc are typical examples for dedicated data
transmission embedded system.
1.6.4) Monitoring:
1.6.5) Control:
d. The actuators connected to the output port are controlled according to the changes in the input variable so as to influence the controlling variable and bring the controlled variable into the specified range.
e. Air conditioner for controlling room temperature is a typical example for embedded
system with Control functionality
f. Air conditioner contains a room temperature sensing element (sensor) which may be a
thermistor and a handheld unit for setting up (feeding) the desired temperature.
g. The air compressor unit acts as the actuator. The compressor is controlled according to the
current room temperature and the desired temperature set by the end user.
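A minimal sketch of this control idea (an illustration of my own, not the actual product firmware): an on/off controller with hysteresis, where read_room_temp() and set_compressor() are hypothetical stand-ins for the real sensor and actuator interfaces.

#include <stdio.h>

static double read_room_temp(void) { return 27.5; }   /* e.g. thermistor read via an ADC */
static void set_compressor(int on) { printf("compressor %s\n", on ? "ON" : "OFF"); }

int main(void)
{
    const double setpoint = 24.0;    /* desired temperature fed in by the user       */
    const double hysteresis = 0.5;   /* dead band to avoid rapid on/off switching    */
    double t = read_room_temp();

    if (t > setpoint + hysteresis)
        set_compressor(1);           /* room too warm: run the actuator (compressor) */
    else if (t < setpoint - hysteresis)
        set_compressor(0);           /* room cool enough: stop the compressor        */
    return 0;
}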
Embedded systems possess certain specific characteristics and these are unique to each
Embedded system.
9. Tightly-constrained
10. Safety-critical
a. Each Embedded System has certain functions to perform and they are developed in such a
manner to do the intended functions only.
b. They cannot be used for any other purpose.
c. Ex – The embedded control unit of a microwave oven cannot be replaced with an AC's embedded control unit, because the control units of the microwave oven and the AC are specifically designed to perform their own specific tasks.
a. Embedded Systems are in constant interaction with the real world through sensors and
user-defined input devices which are connected to the input port of the system.
b. Any changes in the real world are captured by the sensors or input devices in real time and
the control algorithm running inside the unit reacts in a designed manner to bring the
controlled output variables to the desired level.
c. E.S produce changes in output in response to changes in the input, so they are referred to as reactive systems.
d. Real Time system operation means the timing behavior of the system should be deterministic, i.e. the system should respond to requests in a known amount of time.
e. Example – E.S which are mission critical like flight control systems, Antilock Brake
Systems (ABS) etc are Real Time systems.
a. The design of an E.S should take care of the operating conditions of the area where the system is going to be deployed.
b. Ex – If the system needs to be deployed in a high temperature zone, then all the
components used in the system should be of high temperature grade.
c. Also proper shock absorption techniques should be provided to systems which are going to
be commissioned in places subject to high shock.
1.7.4) Distributed:
a. Product aesthetics (size, weight, shape, style, etc) is an important factor in choosing a
product.
b. It is convenient to handle a compact device than a bulky product.
Quality attributes are the non-functional requirements that need to be documented properly in
any system design.
1.8.1) Operational Quality Attributes: The operational quality attributes represent the relevant
quality attributes related to the embedded system when it is in the operational mode or online
mode.
1.8.1.1) Response :-
1.8.1.2) Throughput:-
1.8.1.3) Reliability:-
a. It is a measure of how much we can rely upon the proper functioning of the system.
b. Mean Time Between Failure (MTBF) and Mean Time To Repair (MTTR) are the terms
used in determining system reliability.
c. MTBF gives the frequency of failures in hours/weeks/months.
d. MTTR specifies how long the system is allowed to be out of order following a failure. For an embedded system with critical application needs, it should be of the order of minutes.
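As a commonly used illustration (not stated in the text): if MTBF = 1000 hours and MTTR = 1 hour, the steady-state availability works out to MTBF/(MTBF + MTTR) = 1000/1001, i.e. roughly 99.9%.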
1.8.1.4) Maintainability:-
a. It deals with support and maintenance to the end user or client in case of technical issues and product failure, or on the basis of a routine system checkup.
b. Reliability and maintainability are complementary to each other.
c. A more reliable system means a system with less corrective maintainability requirements
and vice versa.
d. Maintainability can be broadly classified into two categories.
1. Scheduled or Periodic maintenance (Preventive maintenance)
2. Corrective maintenance to unexpected failures
1.8.1.5) Security:-
a. Confidentiality, Integrity and availability are the three major measures of information
security.
b. Confidentiality deals with protection of data and application from unauthorized
disclosure.
c. Integrity deals with the protection of data and application from unauthorized
modification.
d. Availability deals with ensuring that the data and the application are available to authorized users whenever they are required.
1.8.1.6) Safety:-
a. Safety deals with the possible damages that can happen to the operator, the public and the environment due to the breakdown of an Embedded System.
b. The breakdown of an embedded system may occur due to a hardware failure or a
firmware failure.
c. Safety analysis is a must in product engineering to evaluate the anticipated damages and
determine the best course of action to bring down the consequences of damage to an
acceptable level.
The quality attributes that need to be addressed for the product, other than on the basis of operational aspects, are grouped under this category.
a. Testability deals with how easily one can test the design, application and by which means
it can be done.
b. For an E.S testability is applicable to both the embedded hardware and firmware.
c. Embedded hardware testing ensures that the peripherals and total hardware functions in
the desired manner, whereas firmware testing ensures that the firmware is functioning in
the expected way.
d. Debug-ability is the means of identifying and correcting unexpected behaviour in the system.
e. Debug-ability is a two-level process:
1. Hardware level
2. Software level
1. Hardware level: It is used for finding the issues created by hardware problems.
2. Software level: It is employed for finding the errors created by the
flaws in the software.
1.8.2.2) Evolvability:-
1.8.2.3) Portability:-
a. It is the time elapsed between the conceptualization of a product and the time at which the
product is ready for selling.
b. The commercial embedded product market is highly competitive, and time to market the product is a critical factor in the success of a commercial embedded product.
c. There may be multiple players in embedded industry who develop products of the same
category (like mobile phone).
a. Cost is a factor which is closely monitored by both end user and product manufacturer.
b. Cost is highly sensitive factor for commercial products.
c. Any failure to position the cost of a commercial product at a nominal rate may lead to the failure of the product in the market.
d. Proper market study and cost benefit analysis should be carried out before taking a
decision on the per-unit cost of the embedded product.
e. The ultimate aim of the product is to generate marginal profit so the budget and total cost
should be properly balanced to provide a marginal profit.
The applications of embedded systems include smart cards, computer networking, satellites, telecommunications, digital consumer electronics, missiles, etc.
It’s all about inputs and outputs (I/O). How do I get an audio signal from one to the other? The
ongoing evolution of professional audio has produced a number of viable digital interfaces
to complement legacy analog I/O practices. The choices may seem confusing at first, but when you break them down the strengths and weaknesses of each become apparent.
In this overview, I will start with analog since it is familiar to most readers and serves as a reference for the discussion of digital formats. I will focus on professional interfaces only. While similar in many ways to consumer I/O, professional interfaces are more robust against electromagnetic interference (EMI) and allow much longer cables, both requisites for large sound systems.
Fig 1.3: An analog interface is point-to-point, with no strict requirements with regard to cable type or connector type.
The signal is in the form of a time-varying analog voltage that can span a level range of over
100 dB. Signals are classified by the magnitude of this voltage (e.g. mic level, line level,
loudspeaker level). The signal flows in one direction only from output to input. Cable lengths
are limited by cable capacitance, and can approach 300 m (1000 ft) in some applications.
Analog connectors are classified by the number of electrical contacts. The impedance of cables
and connectors is not a consideration, since analog audio signals are low in frequency in
terms of the electromagnetic spectrum.
1.9.2.2) Pros:
1.9.2.3) Cons:
Microcontrollers are used in a wide variety of applications, such as measuring and controlling physical quantities like temperature, pressure, speed and distance.
In these systems the microcontroller generates output in digital form, but the controlling system requires an analog signal, since it does not accept digital data. This makes it necessary to use a DAC, which converts the digital data into an equivalent analog voltage.
In the figure shown, we use the 8-bit DAC 0808. This IC converts digital data into an equivalent analog current, hence we require an I-to-V converter to convert this current into an equivalent voltage.
According to the theory of the DAC, the equivalent analog output is given as:
V0 = Vref [ D7/2 + D6/4 + D5/8 + D4/16 + D3/32 + D2/64 + D1/128 + D0/256 ]
where D7 is the MSB and D0 is the LSB of the 8-bit digital input.
Table 1.1: Analog output voltages for different digital input values (with Vref = 10 V):
DATA OUTPUT VOLTAGE
00H 0V
80H 5V
FFH 10V
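A small illustrative C sketch (my own, not from the text) that evaluates the weighted-sum formula above, assuming Vref = 10 V and an ideal I-to-V stage. Note that FFH ideally gives Vref x 255/256, about 9.96 V, which the table rounds to 10 V.

#include <stdio.h>

/* Ideal DAC0808 transfer: the weighted sum D7/2 + ... + D0/256 equals code/256. */
static double dac_output(unsigned char code, double vref)
{
    return vref * (double)code / 256.0;
}

int main(void)
{
    printf("00H -> %.2f V\n", dac_output(0x00, 10.0));  /* 0.00 V        */
    printf("80H -> %.2f V\n", dac_output(0x80, 10.0));  /* 5.00 V        */
    printf("FFH -> %.2f V\n", dac_output(0xFF, 10.0));  /* approx 9.96 V */
    return 0;
}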
They are a critical part of Analog to Digital Converters and help in accurate conversion of
analog signals to digital signals. We will see a simple sample and hold circuit, its working,
different types of circuit implementations and some of the important performance
parameters. A Sample and Hold Circuit, sometimes represented as S/H Circuit or S & H
Circuit, is usually used with an Analog to Digital Converter to sample the input analog
signal and hold the sampled signal.
In the S/H Circuit, the analog signal is sampled for a short interval of time, usually in the range of 1 µs to 10 µs. After this, the sampled value is held until the arrival of the next input signal to be sampled. The duration for holding the sample will usually be between a few milliseconds and a few seconds.
Fig 1.5: Simple block diagram of a typical Sample and Hold Circuit
If the input analog voltage of an ADC changes by more than ±1/2 LSB, then there is a severe chance that the output digital value is in error. For the ADC to produce accurate results, the
input analog voltage should be held constant for the duration of the conversion.
As the name suggests, a S/H Circuit samples the input analog signal based on a sampling command and holds the sampled value at its output until the next sampling command arrives.
Fig 1.7: Input and output of a typical Sample and Hold Circuit.
Let us understand the operating principle of a S/H Circuit with the help of a simplified circuit diagram. This sample and hold circuit consists of two basic components:
Analog Switch
Holding Capacitor
This circuit tracks the input analog signal until the sample command is changed to hold
command. After the hold command, the capacitor holds the analog voltage during the
analog to digital conversion.
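A minimal discrete-time model of this track-and-hold behaviour (purely illustrative; the sample/hold pattern and the input values are made up):

#include <stdio.h>

int main(void)
{
    double input[8]  = {0.1, 0.4, 0.9, 1.3, 1.0, 0.6, 0.2, 0.0};
    int    sample[8] = {1,   1,   0,   0,   0,   1,   1,   0  };  /* 1 = sample, 0 = hold */
    double held = 0.0;

    for (int i = 0; i < 8; i++) {
        if (sample[i])
            held = input[i];       /* switch closed: the capacitor tracks the input  */
        printf("t=%d  in=%.1f  out=%.1f\n", i, input[i], held);
    }                              /* switch open: the held value stays constant     */
    return 0;
}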
Any FET like JFET or MOSFET can be used as an Analog Switch. In this discussion,
we will concentrate on JFET. The Gate-Source voltage VGS is responsible for switching
the JFET.
When VGS is equal to 0V, the JFET acts as a closed switch as it operates in its Ohmic
region. When VGS is a large negative voltage (i.e. more negative than VGS(OFF)), the JFET
acts as an open switch as it is cut off.
The switch can be either a Shunt Switch or a Series Switch, depending on its position with
respect to input and output.
Fig 1.9: The following image shows a JFET configured as both a Shunt Switch and as a Series
Switch.
Analog signals need to be correctly "prepared" before they can be converted into digital
form for further processing. Signal conditioning is an electronic circuit that manipulates a
signal in a way that prepares it for the next stage of processing. Many data acquisition
applications involve environmental or mechanical measurement from sensors, such as
temperature and vibration. These sensors require signal conditioning before a data
acquisition device can effectively and accurately measure the signal.
For example, thermocouple signals have very small voltage levels that must be amplified
before they can be digitized. Other sensors, such as resistance temperature detectors
(RTDs), accelerometers, and strain gauges require excitation to operate. All of these
preparation technologies are forms of signal conditioning.
Signal conditioning is one of the fundamental building blocks of modern data acquisition
(aka DAS or DAQ system). The basic purpose of a data acquisition system is to make
physical measurements. They comprise the following basic components:
Sensors (see What Is a Sensor guide)
Signal Conditioning (this article)
Analog-to-Digital Converter (ADC) (see What Is an A/D Converter guide),
And some sort of computer with DAQ software for signal logging and analysis.
Data acquisition systems need to connect to a wide variety of sensors and signals in order to
do their job. Signal conditioners take the analog signal from the sensor, manipulate it, and
send it to the ADC (analog-to-digital converter) subsystem to be digitized for further
processing (usually by computer software).
As the name implies, they are in the business of conditioning signals so that they can be
converted into the digital domain by the A/D subsystem, and then displayed, stored, and
analyzed.
After all, you cannot directly connect 500V to one of the inputs of an A/D card - and
thermocouples, RTDs, LVDTs, and other sensors require conditioning to operate and to
provide a normalized voltage output that can be input into the A/D card.
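As a purely illustrative sketch (the amplifier gain, ADC range and the linear 41 µV/°C sensitivity are assumptions of mine, not figures from the text), this is the kind of scaling a DAQ program applies after the conditioned thermocouple signal has been digitized:

#include <stdio.h>

int main(void)
{
    const double vref_mv  = 3300.0;   /* ADC full-scale input in millivolts            */
    const int    adc_bits = 12;
    const double amp_gain = 100.0;    /* gain of the signal-conditioning amplifier     */
    const double uv_per_c = 41.0;     /* assumed thermocouple sensitivity, µV per °C   */

    int raw = 512;                                        /* example ADC reading       */
    double vin_mv    = (raw * vref_mv) / (1 << adc_bits); /* voltage at the ADC pin    */
    double sensor_uv = (vin_mv / amp_gain) * 1000.0;      /* undo the amplifier gain   */
    double temp_c    = sensor_uv / uv_per_c;              /* back to a temperature     */

    printf("raw=%d  ADC=%.1f mV  sensor=%.1f uV  T=%.1f C\n",
           raw, vin_mv, sensor_uv, temp_c);
    return 0;
}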
Developing user interfaces for embedded systems can have some difficult obstacles. We’ve
come up with these 5 important steps to help:
This doesn't necessarily mean the interface should be easy or intuitive. There are plenty of successful
software kits, such as Photoshop or AutoCAD, that are definitely not easy or intuitive for
first-time users. However, they are packed full of necessary functions and features for
professional power-users that, once they are trained and well-practiced, become exactly what
they want and need.
Therefore, it’s incredibly important that we fully understand who the typical users
are, and how their knowledge, experience and needs affect the way they use the product.
Are the users old or young? Are they used to using technology on a day-to-day basis? Are
they looking for a quick solution or an in-depth system? Do they mind having to read lengthy
instructions? These are the questions that should be asked about the typical user base, to
ensure that the interface is appropriate for them.
Software development is commonly designed in layers, with drivers being the bottom level,
and user interface being the top. In traditional software development, the bottom layer of
code for the whole project is written at the start, followed by the second layer and so on.
However, this means that you are only able to see whole elements of the project at the end of development, providing the customer with little to feed back on.
Vertical slicing means that instead of working on the horizontal layers of the software one at a
time, we choose a specific feature and work through all the layers of code for just that
feature. This method means it takes less time to create something that is demonstrable to the
client, therefore giving us the opportunity to get feedback much sooner.
Embedded systems are domain and application specific and are built around a central core. The
core of the embedded system falls into any one of the following categories:
(1) General Purpose and Domain Specific Processors
1.1 Microprocessors
1.2 Microcontrollers
1.3 Digital Signal Processors
(2) Application Specific Integrated Circuits (ASICs)
(3) Programmable Logic Devices (PLDs)
(4) Commercial off-the-shelf Components (COTS)
If you examine any embedded system you will find that it is built around any of the core units
mentioned above.
Almost 80% of the embedded systems are processor/controller based. The processor may be a
microprocessor or a microcontroller or a digital signal processor, depending on the domain and
application. Most of the embedded systems in the industrial control and monitoring applications
make use of the commonly available microprocessors or microcontrollers whereas domains
which require signal processing such as speech coding, speech recognition, etc. make use of
special kind of digital signal processors supplied by manufacturers like, Analog Devices, Texas
Instruments, etc.
2.1.1.1) Microprocessors
2.1.1.3) Microcontrollers:
A highly integrated silicon chip containing a CPU, scratch pad RAM, Special and
General Purpose Register Arrays, On Chip ROM/FLASH memory for program storage,
Timer and Interrupt control units and dedicated I/O ports.
Microcontrollers can be considered as a super set of Microprocessors
Microcontrollers can be general purpose (like the Intel 8051, designed for generic applications and domains) or application specific (like the Automotive AVR from Atmel Corporation, designed specifically for automotive applications).
Since a microcontroller contains all the necessary functional blocks for independent working, they have found a greater place in the embedded domain in place of microprocessors.
Microcontrollers are cheap, cost effective and readily available in the market. Texas Instruments' TMS 1000 is considered the world's first microcontroller.
Powerful special purpose 8/16/32 bit microprocessors designed specifically to meet the
computational demands and power constraints of today's embedded audio, video, and
communications applications
Digital Signal Processors are 2 to 3 times faster than the general purpose microprocessors in
signal processing applications
DSPs implement algorithms in hardware which speeds up the execution whereas general
purpose processors implement the algorithm in firmware and the speed of execution
depends primarily on the clock for the processors
DSP can be viewed as a microchip designed for performing high speed computational operations such as addition, subtraction, multiplication and division. A typical Digital Signal Processor incorporates the following key units:
o Program Memory: Memory for storing the program required by DSP to process
the data.
o Data Memory: Memory for storing the data to be processed by the DSP.
o Computational Engine: Performs the signal processing in accordance with the
stored program memory. Computational Engine incorporates many specialised
arithmetic units and each of them operates simultaneously to increase the execution
speed. It also incorporates multiple hardware shifters for shifting operands and
thereby saves execution time.
o I/O Unit: Acts as an interface between the outside world and DSP. It is responsible
for capturing signals to be processed and delivering the processed signals. Audio
video signal processing, telecommunication, and multimedia applications are typical
examples where DSP is employed. Digital signal processing employs a large amount
of real-time calculations. Sum of products (SOP) calculation, convolution, fast Fourier transform (FFT), discrete Fourier transform (DFT), etc., are some of the operations performed by digital signal processors.
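To make the sum-of-products idea concrete, here is a tiny illustrative C loop (a 4-tap moving-average FIR filter of my own choosing) showing the multiply-accumulate operation that DSPs execute in dedicated hardware:

#include <stdio.h>

#define TAPS 4

int main(void)
{
    double coeff[TAPS]  = {0.25, 0.25, 0.25, 0.25};   /* simple moving-average filter */
    double sample[TAPS] = {1.0, 2.0, 3.0, 4.0};       /* most recent input samples    */
    double acc = 0.0;

    for (int i = 0; i < TAPS; i++)
        acc += coeff[i] * sample[i];                  /* multiply-accumulate (MAC)    */

    printf("filter output = %.2f\n", acc);            /* prints 2.50                  */
    return 0;
}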
customers since it is for a specific application. “The ADE7760 Energy Metre ASIC developed by
Analog Devices for Energy metering applications is a typical example for ASSP”. Since
Application Specific Integrated Circuits (ASICs) are proprietary products, the developers of such
chips may not be interested in revealing the internal details of it and hence it is very difficult to
point out an example of it. Moreover it will create legal disputes if an illustration of such an
ASIC product is given without getting prior permission from the manufacturer of the ASIC.
The two major types of programmable logic devices are Field Programmable Gate Arrays
(FPGAs) and Complex Programmable Logic Devices ( CPLDs). Of the two, FPGAs offer the
highest amount of logic density, the most features, and the highest performance. The largest
FPGA now shipping, part of the Xilinx Virtex™ line of devices, provides eight million “system gates” (the relative density of logic). These advanced devices also offer features such as built-in hardwired processors (such as the IBM PowerPC), substantial amounts of memory, clock
management systems, and support for many of the latest, very fast device-to-device signaling
technologies. FPGAs are used in a wide variety of applications ranging from data processing and
storage, to instrumentation, telecommunications, and digital signal processing. CPLDs, by
contrast, offer much smaller amounts of logic–up to about 10,000 gates. But CPLDs offer very
predictable timing characteristics and are therefore ideal for critical control applications. CPLDs
such as the Xilinx CoolRunner™ series also require extremely low amounts of power and are
very inexpensive, making them ideal for cost-sensitive, battery-operated, portable applications
such as mobile phones and digital handheld assistants.
A Commercial Off-the-Shelf (COTS) product is one which is used „as-is‟. COTS products are
designed in such a way to provide easy integration and interoperability with existing system
components. The COTS component itself may be developed around a general purpose or domain
specific processor or an Application Specific Integrated circuit or a programmable logic device.
Typical examples of COTS hardware unit are remote controlled toy car control units including
the RF circuitry part, high performance, high frequency microwave electronics (2–200 GHz),
high bandwidth analog-to-digital converters, devices and components for operation at very high
temperatures, electro-optic IR imaging arrays, UV/IR detectors, etc. The major advantage of
using COTS is that they are readily available in the market, are cheap and a developer can cut
down his/her development time to a great extent. This in turn reduces the time to market your
embedded systems. The TCP/IP plug-in modules available from various manufacturers like WIZnet, HanRun, Viewtool, etc. are very good examples of COTS products (Fig. 2.1).
Fig 2.1: An example of a COTS product for TCP/IP plug-in from WIZnet
This network plug-in module gives the TCP/IP connectivity to the system you are developing.
There is no need to design this module yourself and write the firmware for the TCP/IP protocol
and data transfer. Everything will be readily supplied by the COTS manufacturer. What you need
to do is identify the COTS for your system and give the plugin option on your board according to
the hardware plugin connections given in the specifications of the COTS. Though multiple
vendors supply COTS for the same application, the major problem faced by the end-user is that
there are no operational and manufacturing standards. This restricts the end-user to stick to a
particular vendor for particular COTS. This greatly affects the product design.
The major drawback of using COTS components in embedded design is that the manufacturer of
the COTS component may withdraw the product or discontinue the production of the COTS at
any time if a rapid change in technology occurs, and this will adversely affect a commercial
manufacturer of the embedded system which makes use of the specific COTS product.
2.2) MEMORY
The program memory or code storage memory of an embedded system stores the program
instructions and it can be classified into different types as per the block diagram representation
given in Fig. 2.2.
The code memory retains its contents even after the power to it is turned off. It is generally
known as non-volatile storage memory. Depending on the fabrication, erasing, and programming
techniques they are classified into the following types.
Masked ROM is a one-time programmable device. Masked ROM makes use of the hardwired
technology for storing data. The device is factory programmed by masking and metallisation
process at the time of production itself, according to the data provided by the end user. The
primary advantage of this is low cost for high volume production. They are the least expensive
type of solid state memory.
The limitation with MROM based firmware storage is the inability to modify the device
firmware against firmware upgrades. Since the MROM is permanent in bit storage, it is not
possible to alter the bit information.
Unlike Masked ROM Memory, One Time Programmable Memory (OTP) or PROM is not pre-
programmed by the manufacturer. The end user is responsible for programming these devices.
This memory has nichrome or polysilicon wires arranged in a matrix. These wires can be
functionally viewed as fuses. It is programmed by a PROM programmer which selectively burns
the fuses according to the bit pattern to be stored. Fuses which are not blown/burned represent logic “1” whereas fuses which are blown/burned represent logic “0”. The default state is logic
“1”. OTP is widely used for commercial production of embedded systems whose proto-typed
versions are proven and the code is finalised. It is a low cost solution for commercial production.
OTPs cannot be reprogrammed.
OTPs are not worthwhile for development purposes. During the development phase, the
code is subject to continuous changes and using an OTP each time to load the code is not
economical. Erasable Programmable Read Only Memory (EPROM) gives the flexibility to re-
program the same chip. Bit information is stored by using an EPROM programmer, which
applies high voltage to charge the floating gate. EPROM contains a quartz crystal window for
erasing the stored information. If the window is exposed to ultraviolet rays for a fixed duration,
the entire memory will be erased. Even though the EPROM chip is flexible in terms of re-
programmability, it needs to be taken out of the circuit board and put in a UV eraser device for
20 to 30 minutes. So it is a tedious and time-consuming process.
As the name indicates, the information contained in the EEPROM memory can be altered by
using electrical signals at the register/ Byte level. They can be erased and reprogrammed in-
circuit. These chips include a chip erase mode and in this mode they can be erased in a few
milliseconds. It provides greater flexibility for system design. The only limitation is their
capacity is limited when compared with the standard ROM (A few kilobytes).
2.2.1.5) FLASH
FLASH is the latest ROM technology and is the most popular ROM technology used in today‟s
embedded designs. FLASH memory is a variation of EEPROM technology. It combines the
reprogrammability of EEPROM and the high capacity of standard ROMs. FLASH memory is
organised as sectors (blocks) or pages. FLASH memory stores information in an array of floating
gate MOSFET transistors. The erasing of memory can be done at sector level or page level
without affecting the other sectors or pages. Each sector/page should be erased before re-
programming. The typical erasable capacity of FLASH is of the order of a few 1000 cycles.
SST39LF010 from Microchip (www.microchip.com) is an example of 1Mbit (Organised as
128K x8) Flash memory with typical endurance of 100,000 cycles.
2.2.1.6) NVRAM
Non-volatile RAM is a random access memory with battery backup. It contains static RAM
based memory and a minute battery for providing supply to the memory in the absence of
external power supply. The memory and battery are packed together in a single package. The life
span of NVRAM is expected to be around 10 years. DS1644 from Maxim/Dallas is an example
of 32KB NVRAM.
Static RAM stores data in the form of voltage. They are made up of flip-flops. Static RAM is the
fastest form of RAM available. In typical implementation, an SRAM cell (bit) is realised using
six transistors (or 6 MOSFETs). Four of the transistors are used for building the latch (flip-flop)
part of the memory cell and two for controlling the access. SRAM is fast in operation due to its
resistive networking and switching capabilities. In its simplest representation an SRAM cell can
be visualised as shown in Fig. 2.4.
This implementation in its simpler form can be visualised as two-cross coupled inverters with
read/write control through transistors. The four transistors in the middle form the cross-coupled
inverters. This can be visualised as shown in Fig. 2.5. From the SRAM implementation diagram,
it is clear that access to the memory cell is controlled by the line Word Line, which controls the
access transistors (MOSFETs) Q5 and Q6. The access transistors control the connection to bit
lines B & B\. In order to write a value to the memory cell, apply the desired value to the bit
control lines (For writing 1, make B = 1 and B\ =0; For writing 0, make B = 0 and B\ =1) and
assert the Word Line (make the Word line high). This operation latches the written bit in the flip-flop. For reading the content of the memory cell, assert both B and B\ bit lines to 1 and set the
Word line to 1.
The major limitations of SRAM are low capacity and high cost. Since a minimum of six
transistors are required to build a single memory cell, imagine how many memory cells we can
fabricate on a silicon wafer.
Dynamic RAM stores data in the form of charge. They are made up of MOS transistor gates. The
advantages of DRAM are its high density and low cost compared to SRAM. The disadvantage is
that since the information is stored as charge it gets leaked off with time and to prevent this they
need to be refreshed periodically.
Special circuits called DRAM controllers are used for the refreshing operation. The refresh
operation is done periodically in milliseconds interval. Figure 2.6 illustrates the typical
implementation of a DRAM cell. The MOSFET acts as the gate for the incoming and outgoing
data whereas the capacitor acts as the bit storage unit. Table given below summarises the relative
merits and demerits of SRAM and DRAM technology.
2.2.2.3) NVRAM
Non-volatile RAM is a random access memory with battery backup. It contains static RAM
based memory and a minute battery for providing supply to the memory in the absence of
external power supply. The memory and battery are packed together in a single package.
NVRAM is used for the non-volatile storage of results of operations or for setting up of flags,
etc. The life span of NVRAM is expected to be around 10 years. DS1744 from Maxim/Dallas is
an example for 32KB NVRAM.
The interface (connection) of memory with the processor/controller can be of various types. It
may be a parallel interface [The parallel data lines (D0-D7) for an 8 bit processor/controller will
be connected to D0-D7 of the memory] or the interface may be a serial interface like I2C
(Pronounced as I Square C. It is a 2 line serial interface) or it may be an SPI (Serial peripheral
interface, 2+n line interface where n stands for the total number of SPI bus devices in the
system). It can also be of a single wire interconnection (like Dallas 1-Wire interface). Serial
interface is commonly used for data storage memory like EEPROM. The memory density of a
serial memory is usually expressed in terms of kilobits, whereas that of a parallel interface
memory is expressed in terms of kilobytes. Atmel Corporations AT24C512 is an example for
serial memory with capacity 512 kilobits and 2-wire interface. Please refer to the section
„Communication Interface‟ for more details on I2C, SPI, and 1-Wire Bus.
Generally the execution of a program or a configuration from a Read Only Memory (ROM) is
very slow (120 to 200 ns) compared to the execution from a random access memory (40 to 70
ns). From the timing parameters it is obvious that RAM access is about three times as fast as
ROM access. Shadowing of memory is a technique adopted to solve the execution speed
problem in processor-based systems. In computer systems and video systems there will be a
configuration holding ROM called Basic Input Output Configuration ROM or simply BIOS. In
personal computer systems BIOS stores the hardware configuration information like the address
assigned for various serial ports and other non-plug 'n' play devices, etc. Usually it is read and the system is configured according to it during system boot up, and this is time consuming. Now the manufacturers include a RAM behind the logical layer of the BIOS, at the same address, as a shadow to the BIOS; the first step during boot up is copying the BIOS to the shadowed RAM, write-protecting the RAM and then disabling reads from the BIOS ROM.
You may be wondering why both RAM and ROM are needed for holding the same data. The answer is: RAM is volatile and it cannot hold the
configuration data which is copied from the BIOS when the power supply is switched off. Only a
ROM can hold it permanently. But for high system performance it should be accessed from a
RAM instead of accessing from a ROM.
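A conceptual C sketch of the copy step described above (the size, contents and the write-protect step are illustrative assumptions, not real BIOS details):

#include <stdio.h>
#include <string.h>

#define BIOS_SIZE 32u

static const char rom_bios[BIOS_SIZE] = "hardware config table";  /* stands in for the slow ROM */
static char shadow_ram[BIOS_SIZE];                                /* fast RAM at the same logical address */

int main(void)
{
    memcpy(shadow_ram, rom_bios, BIOS_SIZE);   /* one-time copy during boot                     */
    /* from here on, reads would be served from shadow_ram; real hardware would also
       write-protect the RAM window and disable further ROM reads (chipset specific) */
    printf("%s\n", shadow_ram);
    return 0;
}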
List of Factors:
Speed: The time to read or write data should be the greatest consideration while selecting the memory. In general, speed will not be a major issue for small embedded applications, but when you go for medium or high range applications, the read/write access time should be faster.
Latency: It is the time between initiating a request for data and receiving it. When executing instructions, the processor requests data stored in the memory, and the requested data must be retrieved quickly. The lower the latency, the faster the operation.
Memory Capacity: If your application needs less than 60 MB, it is advisable to choose a memory size of 64 MB. At the same time, running out of storage is the worst feeling we face with a digital camera, so it is important to choose the capacity as needed for the application.
Size: The size of the memory device should be compatible with the embedded system. For handheld devices, the memory should be compact in nature; if it is a desktop computer system, the size can be medium. Proper size selection is also an essential criterion while selecting a memory.
Power Consumption: Memory needs power to read or write data. For high access speed, the power consumption will be higher, resulting in more power dissipation. More heat will reduce the lifetime of the embedded system. Hence the designers should go with memory devices of optimum power consumption.
Cost: Cost plays a significant role in deciding any product. While planning to design an
embedded system, similar importance should be given to memory as that of processor selection.
Money should be allocated, considering the type of memory to be used in the embedded
system.
2.3.1) SENSORS
A sensor is a transducer device that converts energy from one form to another for any
measurement or control purpose.
2.3.2) ACTUATORS
The I/O subsystem of the embedded system facilitates the interaction of the embedded system
with the external world. As mentioned earlier the interaction happens through the sensors and
actuators connected to the input and output ports respectively of the embedded system. The
sensors may not be directly interfaced to the input ports, instead they may be interfaced through
signal conditioning and translating systems like ADC, optocouplers, etc. This section illustrates
some of the sensors and actuators used in embedded systems and the I/O systems to facilitate the
interaction of embedded systems with external world.
The Inter Integrated Circuit Bus (I2C, pronounced 'I square C') is a synchronous, bi-directional, half duplex (one-directional communication at a given point of time), two wire serial interface bus. The concept of the I2C bus was developed by Philips Semiconductors in the early 1980s.
The I2C bus comprises two bus lines, namely Serial Clock (SCL) and Serial Data (SDA). The SCL line is responsible for generating the synchronisation clock pulses and SDA is responsible for transmitting the serial data across devices.
I2C supports multimasters on the same bus. The following bus interface diagram shown in Fig.
2.7 illustrates the connection of master and slave devices on the I2C bus.
The sequence of operations for communicating with an I2C slave device is listed below (a bit-banged sketch of the start and address phase follows the list):
1. The master device pulls the clock line (SCL) of the bus to 'HIGH'
2. The master device pulls the data line (SDA) 'LOW' while the SCL line is at logic 'HIGH' (this is the 'Start' condition for data transfer)
3. The master device sends the address (7 bit or 10 bit wide) of the 'slave' device to which it wants to communicate, over the SDA line. Clock pulses are generated on the SCL line for synchronising the bit reception by the slave device. The MSB of the data is always transmitted first. The data on the bus is valid during the 'HIGH' period of the clock signal
4. The master device sends the Read or Write bit (bit value = 1 for a Read operation; bit value = 0 for a Write operation) according to the requirement
5. The master device waits for the acknowledgement bit from the slave device whose address was sent on the bus along with the Read/Write operation command. Slave devices connected to the bus compare the received address with the address assigned to them
6. The slave device with the address requested by the master device responds by sending an acknowledge bit (bit value = 1) over the SDA line
7. Upon receiving the acknowledge bit, the master device sends the 8-bit data to the slave device over the SDA line, if the requested operation is 'Write to device'. If the requested operation is 'Read from device', the slave device sends data to the master over the SDA line
8. The master device waits for the acknowledgement bit from the slave device upon byte transfer completion for a write operation, and sends an acknowledge bit to the slave device for a read operation
9. The master device terminates the transfer by pulling the SDA line 'HIGH' while the clock line SCL is at logic 'HIGH' (indicating the 'STOP' condition)
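A bit-banged C sketch of steps 1 to 5 above (an illustration of my own, not from the text): sda_write(), scl_write() and sda_read() are hypothetical GPIO helpers, the slave address 0x50 is arbitrary, and the timing delays needed for a real 100 kbps bus are omitted.

#include <stdint.h>

static void sda_write(int level) { (void)level; /* drive SDA via a GPIO pin  */ }
static void scl_write(int level) { (void)level; /* drive SCL via a GPIO pin  */ }
static int  sda_read(void)       { return 0;    /* sample SDA                */ }

static void i2c_start(void)
{
    scl_write(1);
    sda_write(1);
    sda_write(0);                           /* SDA falls while SCL is HIGH: start condition */
    scl_write(0);
}

static int i2c_write_byte(uint8_t byte)     /* returns the sampled acknowledge bit */
{
    for (int i = 7; i >= 0; i--) {          /* MSB first, data valid while SCL is HIGH */
        sda_write((byte >> i) & 1);
        scl_write(1);
        scl_write(0);
    }
    sda_write(1);                           /* release SDA for the acknowledge bit  */
    scl_write(1);
    int ack = sda_read();                   /* sample the acknowledge driven by the slave */
    scl_write(0);
    return ack;
}

int main(void)
{
    i2c_start();
    (void)i2c_write_byte((0x50 << 1) | 0);  /* 7-bit slave address 0x50 plus the write bit */
    return 0;
}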
The first generation I2C devices were designed to support data rates only up to 100kbps.
The Serial Peripheral Interface Bus (SPI) is a synchronous bi-directional full duplex four-wire
serial interface bus. The concept of SPI was introduced by Motorola. SPI is a single master
multi-slave system. It is possible to have a system where more than one SPI device can be
master, provided the condition only one master device is active at any given point of time, is
satisfied. SPI requires four signal lines for communication.
They are:
i) Master Out Slave In (MOSI): Signal line carrying the data from master to slave device. It is
also known as Slave Input/Slave Data In (SI/SDI)
ii) Master In Slave Out (MISO): Signal line carrying the data from slave to master device. It is
also known as Slave Output (SO/SDO)
iii) Serial Clock (SCLK): Signal line carrying the clock signals
iv) Slave Select (SS): Signal line for slave device select. It is an active low signal
The bus interface diagram shown in Fig. 2.8 illustrates the connection of master and slave
devices on the SPI bus.
The master device is responsible for generating the clock signal. It selects the required slave device by asserting the corresponding slave device's slave select signal 'LOW'. The data out line (MISO) of all the slave devices, when not selected, floats at a high impedance state.
SPI works on the principle of a 'Shift Register'. The master and slave devices contain a special shift register for the data to transmit or receive. The size of the shift register is device dependent; normally it is a multiple of 8. During transmission from the master to slave, the data in the master's shift register is shifted out on the MOSI pin and it enters the shift register of the slave device through the MOSI pin of the slave device.
At the same time the bit shifted out of the slave device's shift register enters the shift register of the master device through the MISO pin. In summary, the shift registers of the 'master' and 'slave' devices form a circular buffer.
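A software-SPI sketch of this circular-shift idea (illustrative only; mosi_write(), miso_read() and sclk_pulse() are hypothetical GPIO helpers, and here the MISO stub simply returns 1, so the received byte is 0xFF):

#include <stdint.h>
#include <stdio.h>

static int  miso_read(void)       { return 1; }            /* sample the MISO line   */
static void mosi_write(int level) { (void)level; }         /* drive the MOSI line    */
static void sclk_pulse(void)      { /* toggle SCLK high then low */ }

static uint8_t spi_transfer(uint8_t out)
{
    uint8_t in = 0;
    for (int i = 7; i >= 0; i--) {                 /* MSB first, one bit per clock pulse */
        mosi_write((out >> i) & 1);                /* master shifts a bit out...          */
        in = (uint8_t)((in << 1) | miso_read());   /* ...and shifts a bit in              */
        sclk_pulse();
    }
    return in;
}

int main(void)
{
    uint8_t received = spi_transfer(0xA5);
    printf("received 0x%02X\n", received);
    return 0;
}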
When compared to I2C, the SPI bus is most suitable for applications requiring transfer of data in 'streams'. The only limitation is that SPI doesn't support an acknowledgement mechanism.
The start and stop of communication is indicated through inserting special bits in the data stream.
While sending a byte of data, a start bit is added first and a stop bit is added at the end of the bit
stream. The least significant bit of the data byte follows the 'start' bit.
The 'start' bit informs the receiver that a data byte is about to arrive. If parity is enabled for communication, the UART of the transmitting device adds a parity bit. The UART of the receiving device calculates the parity of the bits received and compares it with the received parity bit for error checking.
The UART of the receiving device discards the 'Start', 'Stop' and 'Parity' bits from the received bit stream and converts the received serial bit data to a word. Figure 2.9 illustrates the same.
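An illustrative C sketch (my own) that prints the bit sequence of one such frame for the byte 0x41: a start bit, the data bits LSB first, an even-parity bit (assuming parity is enabled) and a stop bit.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t data = 0x41;                     /* the byte to frame ('A')          */
    int parity = 0;

    printf("0");                             /* start bit (line pulled LOW)      */
    for (int i = 0; i < 8; i++) {            /* LSB of the data byte goes first  */
        int bit = (data >> i) & 1;
        parity ^= bit;
        printf("%d", bit);
    }
    printf("%d", parity);                    /* even-parity bit (if enabled)     */
    printf("1\n");                           /* stop bit (line returns HIGH)     */
    return 0;
}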
Every 1-wire device contains a globally unique 64-bit identification number stored within it. This unique identification number can be used for addressing individual devices present on the bus in case there are multiple slave devices connected to the 1-wire bus. The identifier has three parts: an 8-bit family code, a 48-bit serial number and an 8-bit Cyclic Redundancy Check (CRC)
computed from the first 56 bits. The sequence of operation for communicating with a 1-wire
slave device is listed below.
1. The master device sends a „Reset‟ pulse on the 1-wire bus.
2. The slave device(s) present on the bus respond with a „Presence‟ pulse.
3. The master device sends a ROM command (Net Address Command followed by the 64bit
address of the device). This addresses the slave device(s) to which it wants to initiate a
communication.
4. The master device sends a read/write function command to read/write the internal memory or
register of the slave device.
5. The master initiates a Read data/Write data from the device or to the device.
All communication over the 1-wire bus is master initiated. The communication over the 1-wire bus is divided into time slots of 60 microseconds for the regular speed mode of operation (16.3kbps). The 'Reset' pulse occupies 8 time slots. For starting a communication, the master asserts the reset pulse by pulling the 1-wire bus 'LOW' for at least 8 time slots (480µs). If a 'slave' device is present on the bus and is ready for communication, it should respond to the master with a 'Presence' pulse within 60µs of the release of the 'Reset' pulse by the master. The slave device(s) respond with a 'Presence' pulse by pulling the 1-wire bus 'LOW' for a minimum of 1 time slot (60µs). For writing a bit value of 1 on the 1-wire bus, the bus master pulls the bus 'LOW' for 1 to 15µs and then releases the bus for the rest of the time slot. A bit value of '0' is written on the bus by the master pulling the bus 'LOW' for a minimum of 1 time slot (60µs) and a maximum of 2 time slots (120µs).
To read a bit from the slave device, the master pulls the bus 'LOW' for 1 to 15µs. If the slave wants to send a bit value '1' in response to the read request from the master, it simply releases the bus for the rest of the time slot. If the slave wants to send a bit value '0', it pulls the bus 'LOW' for the rest of the time slot.
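A timing sketch of the reset/presence exchange described above (illustrative only; bus_drive_low(), bus_release(), bus_read() and delay_us() are hypothetical helpers for an open-drain GPIO pin):

#include <stdint.h>

static void bus_drive_low(void) { }
static void bus_release(void)   { }
static int  bus_read(void)      { return 0; }     /* 0 = a slave is pulling the bus LOW */
static void delay_us(unsigned us) { (void)us; }

static int onewire_reset(void)           /* returns 1 if a presence pulse is detected */
{
    bus_drive_low();
    delay_us(480);                        /* reset pulse: at least 8 time slots        */
    bus_release();
    delay_us(70);                         /* wait inside the presence window           */
    int present = (bus_read() == 0);      /* slave pulls the bus LOW if present        */
    delay_us(410);                        /* let the rest of the slot sequence elapse  */
    return present;
}

int main(void)
{
    return onewire_reset() ? 0 : 1;
}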
The on-board parallel interface is normally used for communicating with peripheral devices
which are memory mapped to the host of the system. The host processor/controller of the
embedded system contains a parallel bus and the device which supports parallel bus can directly
connect to this bus system. The communication through the parallel bus is controlled by the
control signal interface between the device and the host. The „Control Signals‟ for
communication includes „Read/Write‟ signal and device select signal. The device normally
contains a device select line and the device becomes active only when this line is asserted by the
host processor. The direction of data transfer (Host to Device or Device to Host) can be
controlled through the control signal lines for „Read‟ and „Write‟. Only the host processor has
control over the „Read‟ and „Write‟ control signals. The device is normally memory mapped to
the host processor and a range of address is assigned to it. An address decoder circuit is used for
generating the chip select signal for the device. When the address selected by the processor is
within the range assigned for the device, the decoder circuit activates the chip select line and
thereby the device becomes active. The processor then can read or write from or to the device by
asserting the corresponding control line (RD\ and WR\ respectively). Strict timing characteristics
are followed for parallel communication. As mentioned earlier, parallel communication is host
processor initiated. If a device wants to initiate the communication, it can inform the same to the
processor through interrupts. For this, the interrupt line of the device is connected to the interrupt
line of the processor and the corresponding interrupt is enabled in the host processor. The width
of the parallel interface is determined by the data bus width of the host processor. The bus
interface diagram shown in Fig. 2.11 illustrates the interfacing of devices through parallel
interface.
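A small C sketch of how firmware typically touches such a memory-mapped parallel device (the base address 0x40000000 and the register layout are purely illustrative assumptions, so this compiles but is only meaningful on a target where that address range exists):

#include <stdint.h>

#define DEV_BASE   ((volatile uint8_t *)0x40000000u)
#define DEV_DATA   (DEV_BASE[0])          /* data register                         */
#define DEV_STATUS (DEV_BASE[1])          /* status register (bit 0 = busy, assumed) */

void device_write(uint8_t value)
{
    while (DEV_STATUS & 0x01)             /* wait while the device reports busy    */
        ;
    DEV_DATA = value;                     /* bus hardware asserts CS and WR\ for us */
}

uint8_t device_read(void)
{
    return DEV_DATA;                      /* bus hardware asserts CS and RD\ for us */
}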
Parallel data communication offers the highest speed for data transfer.
RS-232 C (Recommended Standard number 232, revision C from the Electronic Industry
Association) is a legacy, full duplex, wired, asynchronous serial communication interface. The
RS-232 interface was developed by the Electronics Industries Association (EIA) during the early 1960s. RS-232 extends the UART communication signals for external data communication.
In EIA standard, logic „0‟ is known as „Space‟ and logic „1‟ as „Mark‟. The RS-232 interface
defines various handshaking and control signals for communication apart from the „Transmit‟
and „Receive‟ signal lines for data communication. RS-232 supports two different types of
connectors, namely; DB-9: 9-Pin connector and DB-25: 25-Pin connector. Figure 2.12 illustrates
the connector details for DB-9 and DB-25.
Table 2.1: The pin details for the two connectors are explained in the following table
The Data Terminal Ready (DTR) signal is activated by DTE when it is ready to accept data. The
Data Set Ready (DSR) is activated by DCE when it is ready for establishing a communication
link. DTR should be in the activated state before the activation of DSR. The Data Carrier Detect
(DCD) control signal is used by the DCE to indicate the DTE that a good signal is being
received. Ring Indicator (RI) is a modem specific signal line for indicating an incoming call on
the telephone line. The 25 pin DB connector contains two sets of signal lines for transmit,
receive and control lines.
As per the EIA standard, RS-232 C supports baudrates up to 20Kbps (upper limit 19.2 Kbps). The commonly used baudrates by devices are 300bps, 1200bps, 2400bps, 9600bps, 11.52Kbps and
19.2Kbps. 9600 is the popular baudrate setting used for PC communication. The maximum
operating distance supported by RS-232 is 50 feet at the highest supported baudrate.
Universal Serial Bus (USB) is a wired high speed serial bus for data communication. The first
version of USB (USB1.0) was released in 1995 and was created by the USB core group members
consisting of Intel, Microsoft, IBM, Compaq, Digital and Northern Telecom. The USB
communication system follows a star topology with a USB host at the centre and one or more
USB peripheral devices/USB hosts connected to it. A USB 2.0 host can support connections up
to 127, including slave peripheral devices and other USB hosts. Figure 2.13 illustrates the star
topology for USB device connection. USB transmits data in packet format. The USB
communication is a host initiated one. The USB host contains a host controller which is
responsible for controlling the data communication, including establishing connectivity with USB
slave devices, packetizing and formatting the data packet.
The Pin details for the USB 2.0 Type A & B connectors are listed in the table given below.
USB uses differential signals for data transmission. It improves the noise immunity. USB
interface has the ability to supply power to the connecting devices. Two connection lines
(Ground and Power) of the USB interface are dedicated for carrying power. A Standard
Downstream USB 2.0 Port (SDP) can supply power up to 500 mA at 5 V, whereas a Charging
Downstream USB 2.0 Port (CDP) can supply power up to 1500 mA at 5 V.
Presently USB supports different data rates namely; Low Speed (1.5Mbps), Full Speed
(12Mbps), High Speed (480Mbps), SuperSpeed (5Gbps) and SuperSpeed + (or SuperSpeed USB
10 Gbps). Wireless USB combines the speed and security of wired USB technology with the
ease-of-use of wireless technology.
IEEE 1394 is a wired, isochronous high speed serial communication bus. It is also known as
High Performance Serial Bus (HPSB). The research on 1394 was started by Apple Inc. in 1985
and the standard for this was coined by IEEE. Apple Inc‟s (www.apple.com) implementation of
1394 protocol is popularly known as Firewire. i.LINK is the 1394 implementation from Sony
Corporation (www.sony.net) and Lynx is the implementation from Texas Instruments
(www.ti.com). 1394 supports peer-to-peer connection and point- to- multipoint communication
allowing 63 devices to be connected on the bus in a tree topology. 1394 is a wired serial interface
and it can support a cable length of up to 15 feet for interconnection. The 1394 standard has
evolved a lot from the first version IEEE 1394–1995 released in 1995 to the recent version IEEE
1394–2008 released in June 2008. The 1394 standard supports a data rate of 400 to
3200Mbits/second.
Table 2.3 Pin Configuration of IEEE 1394
There are two differential data transfer lines A and B per connector. In a 1394 cable, normally
the differential lines of A are connected to B (TPA+ to TPB+ and TPA–to TPB–) and vice versa.
1394 is a popular communication interface for connecting embedded devices like Digital
Camera, Camcorder, Scanners to desktop computers for data transfer and storage.
Unlike USB interface (Except USB OTG), IEEE 1394 doesn‟t require a host for communicating
between devices. For example, you can directly connect a scanner with a printer for printing. The
data-rate supported by 1394 is far higher than the one supported by USB2.0 interface. The 1394
hardware implementation is much costlier than USB implementation.
Infrared (IrDA) is a serial, half duplex, line of sight based wireless technology for data
communication between devices. It is in use from the olden days of communication and you may
be very familiar with it. The remote control of your TV, VCD player, etc. works on Infrared data
communication principle. Infrared communication technique uses infrared waves of the
electromagnetic spectrum for transmitting the data. IrDA supports point-to-point and point-to-
multipoint communication, provided all devices involved in the communication are within the
line of sight. The typical communication range for IrDA lies in the range 10 cm to 1 m. The
range can be increased by increasing the transmitting power of the IR device. IR supports data
rates ranging from 9600bits/second to 16Mbps. Depending on the speed of data transmission IR
is classified into Serial IR (SIR), Medium IR (MIR), Fast IR (FIR), Very Fast IR (VFIR), Ultra
Fast IR (UFIR) and GigaIR. SIR supports transmission rates ranging from 9600bps to 115.2kbps.
MIR supports data rates of 0.576Mbps and 1.152Mbps. FIR supports data rates up to 4Mbps.
VFIR is designed to support high data rates up to 16Mbps. The UFIR supports data rates up-to
96Mbps, whereas the GigaIR supports data rates 512 Mbps to 1 Gbps. IrDA communication
involves a transmitter unit for transmitting the data over IR and a receiver for receiving the data.
Infrared Light Emitting Diode (LED) is the IR source for transmitter and at the receiving end a
photodiode acts as the receiver. Both transmitter and receiver unit will be present in each device
supporting IrDA communication for bidirectional data transfer. Such IR units are known as 'Transceivers'. Certain devices like a TV remote control require only unidirectional communication and so they contain either the transmitter or the receiver unit (the remote control unit contains the transmitter unit and the TV contains the receiver unit).
Bluetooth is a low cost, low power, short range wireless technology for data and audio
communication. Bluetooth was first proposed by 'Ericsson' in 1994. Bluetooth operates at 2.4 GHz of the Radio Frequency spectrum and uses the Frequency Hopping Spread Spectrum (FHSS) technique for communication. It supports data rates from 1 Mbps up to 24 Mbps and a range of approximately 30 to 100 feet, depending on the Bluetooth version (v1.2 supports data rates up to 1 Mbps, v2.0 + EDR supports data rates up to 3 Mbps, and v3.0 + HS and v4.0 support data rates up to 24 Mbps).
Like IrDA, Bluetooth communication also has two essential parts; a physical link part and a
protocol part. The physical link is responsible for the physical transmission of data between
devices supporting Bluetooth communication and protocol part is responsible for defining the
rules of communication.
Bluetooth communication follows packet based data transfer. Bluetooth supports point-
to-point (device to device) and point-to-multipoint (device to multiple device broadcasting)
wireless communication. The point-to-point communication follows the master-slave relationship. A Bluetooth device can function as either master or slave. When a network is
formed with one Bluetooth device as master and more than one device as slaves, it is called a
Piconet. A Piconet supports a maximum of seven slave devices. Bluetooth is the favourite choice
for short range data communication in handheld embedded devices. Bluetooth technology is very
popular among cell phone users as it is the easiest communication channel for transferring ringtones, music files, pictures, media files, etc. between neighboring Bluetooth enabled phones. The specifications for Bluetooth communication are defined and licensed by the standards body 'Bluetooth Special Interest Group (SIG)'. For more information, please visit the website www.bluetooth.org.
2.4.2.6) Wi-Fi
Wi-Fi or Wireless Fidelity is the popular wireless communication technique for networked
communication of devices. Wi-Fi follows the IEEE 802.11 standard. Wi-Fi is intended for
network communication and it supports Internet Protocol (IP) based communication. It is
essential to have device identities in a multipoint communication to address specific devices for
data communication. In an IP based communication each device is identified by an IP address,
which is unique to each device on the network.
Wi-Fi based communications require an intermediate agent called a Wi-Fi router/Wireless Access Point to manage the communications. The Wi-Fi router is responsible for restricting the access to a network, assigning IP addresses to devices on the network and routing data packets to the intended devices on the network. Wi-Fi enabled devices contain a wireless adaptor for transmitting and
receiving data in the form of radio signals through an antenna. The hardware part of it is known
as Wi-Fi Radio. Wi-Fi operates at 2.4GHz or 5GHz of radio spectrum and they co-exist with
other ISM band devices like Bluetooth. Figure 2.14 illustrates the typical interfacing of devices
in a Wi-Fi network. For communicating with devices over a Wi-Fi network, the device, when its Wi-Fi radio is turned ON, searches for the available Wi-Fi networks in its vicinity and lists out the Service Set Identifier (SSID) of the available networks. If the network is security enabled, a password may be required to connect to a particular SSID. Wi-Fi employs different security mechanisms like Wired Equivalent Privacy (WEP), Wi-Fi Protected Access (WPA), etc. for securing the data communication. Wi-Fi supports data rates ranging from 1 Mbps to 1300 Mbps
(Growing towards higher rates as technology progresses), depending on the standards
(802.11a/b/g/n/ac) and access/modulation method. Depending on the type of antenna and usage
location (indoor/outdoor), Wi-Fi offers a range of 100 to 1000 feet.
2.4.2.7) ZigBee
ZigBee is a low power, low cost, wireless network communication protocol based on the IEEE
802.15.4-2006 standard. ZigBee is targeted for low power, low data rate and secure applications
for Wireless Personal Area Networking (WPAN). The ZigBee specifications support a robust
mesh network containing multiple nodes. This networking strategy makes the network reliable
by permitting messages to travel through a number of different paths to get from one node to
another.
ZigBee operates worldwide at the unlicensed bands of Radio spectrum, mainly at 2.400 to 2.484
GHz, 902 to 928 MHz and 868.0 to 868.6 MHz. ZigBee supports an operating distance of up to 100 metres and a data rate of 20 to 250 Kbps.
In ZigBee terminology, each ZigBee device falls under one of the following ZigBee device categories.
ZigBee Coordinator (ZC)/Network Coordinator: The ZigBee coordinator acts as the root of
the ZigBee network. The ZC is responsible for initiating the ZigBee network and it has the
capability to store information about the network.
ZigBee Router (ZR)/Full Function Device (FFD): Responsible for passing information from one device to another device or to another ZR.
ZigBee End Device (ZED)/Reduced Function Device (RFD): An end device containing ZigBee functionality for data communication. It can talk only with a ZR or ZC and doesn't have the capability to act as a mediator for transferring data from one device to another.
The diagram shown in Fig. 2.15 gives an overview of ZC, ZED and ZR in a ZigBee network.
ZigBee primarily targets application areas like home & industrial automation, energy management, home control/security, medical/patient tracking, etc.
ZigBee PRO offers full wireless mesh, low-power networking capable of supporting more than
64,000 devices on a single network. It provides standardised networking designed to connect the
widest range of devices, in any industry, into a single control network. ZigBee PRO offers an optional new and innovative feature, 'Green Power', to connect energy harvesting or self-powered devices into ZigBee PRO networks.
The 'Green Power' feature of ZigBee PRO is the most eco-friendly way to power battery-less devices such as sensors, switches, dimmers and many other devices, and allows them to securely join ZigBee PRO networks. The specifications for ZigBee are developed and managed by the ZigBee Alliance (www.zigbee.org), a non-profit consortium of leading semiconductor manufacturers, technology providers, OEMs and end-users worldwide.
GPRS, 3G, 4G and LTE
General Packet Radio Service (GPRS), 3G, 4G and LTE are cellular communication techniques for transferring data over a mobile communication network like GSM and CDMA. Data is sent as packets in GPRS communication. The transmitting device splits the
data into several related packets. At the receiving end the data is re-constructed by combining the
received data packets. GPRS supports a theoretical maximum transfer rate of 171.2kbps. In
GPRS communication, the radio channel is concurrently shared between several users instead of
dedicating a radio channel to a cell phone user. The GPRS communication divides the channel
into 8 timeslots and transmits data over the available channel. GPRS supports Internet Protocol
(IP), Point to Point Protocol (PPP) and X.25 protocols for communication. GPRS is mainly used
by mobile enabled embedded devices for data communication.
The device should support the necessary GPRS hardware like GPRS modem and GPRS radio.
To accomplish GPRS based communication, the carrier network also should have support for
GPRS communication. GPRS is an old technology and it is being replaced by new generation cellular data communication techniques like 3G (3rd Generation), High Speed Downlink Packet Access (HSDPA), 4G (4th Generation), LTE (Long Term Evolution), etc., which offer higher bandwidths for communication. 3G offers data rates ranging from 144 Kbps to 2 Mbps or higher, whereas 4G gives a practical data throughput of 2 to 100+ Mbps depending on the network and
underlying technology.
The operating system acts as a bridge between the user applications/tasks and the underlying system resources through a set of system functionalities and services. The OS manages the system resources and makes them available to the user applications/tasks on a need basis. A normal computing system is a collection of different I/O subsystems, working memory and storage memory. The primary functions of an operating system are to make the system convenient to use and to organise and manage the system resources efficiently and correctly.
Figure 3.1 gives an insight into the basic components of an operating system and their interfaces
with rest of the world.
The kernel is the core of the operating system and is responsible for managing the system
resources and the communication among the hardware and other system services. The kernel acts as the abstraction layer between system resources and user applications. The kernel contains a set of system libraries and services. For a general purpose OS, the kernel contains different services for
handling the following.
Process Management: Process management deals with managing the processes/tasks. Process
management includes setting up the memory space for the process, loading the process’s code into
the memory space, allocating system resources, scheduling and managing the execution of the
process, setting up and managing the Process Control Block (PCB), Inter Process Communication
and synchronisation, process termination/deletion, etc.
Primary Memory Management: The term primary memory refers to the volatile memory (RAM)
where processes are loaded and variables and shared data associated with each process are stored.
The Memory Management Unit (MMU) of the kernel is responsible for
Keeping track of which part of the memory area is currently used by which process
Allocating and De-allocating memory space on a need basis (Dynamic memory allocation).
File System Management: File is a collection of related information. A file could be a program
(source code or executable), text files, image files, word documents, audio/video files, etc. Each of
these files differ in the kind of information they hold and the way in which the information is
stored. The file operation is a useful service provided by the OS. The file system management
service of the kernel is responsible for
The various file system management operations are OS dependent. For example, the kernel of
Microsoft®DOS OS supports a specific set of file system management operations and they are not
the same as the file system operations supported by UNIX Kernel.
I/O System (Device) Management: The kernel is responsible for routing the I/O requests coming from different user applications to the appropriate I/O devices of the system. In a well-structured OS, direct accessing of I/O devices is not allowed and access to them is provided through a set of Application Programming Interfaces (APIs) exposed by the kernel. The kernel maintains a list of all the I/O devices of the system. This list may be available in advance, at the time of building the kernel. Some kernels dynamically update the list of available devices as and
when a new device is installed (e.g. Windows NT kernel keeps the list updated when a new plug
‘n’ play USB device is attached to the system). The service ‘Device Manager’ (Name may vary
across different OS kernels) of the kernel is responsible for handling all I/O device related
operations.
The kernel talks to the I/O devices through a set of low-level system calls, which are implemented in services called device drivers. The device drivers are specific to a device or a class of devices.
The Device Manager is responsible for
Loading and unloading of device drivers
Exchanging information and the system specific control signals to and from the device
Secondary Storage Management: The secondary storage management deals with managing the
secondary storage memory devices, if any, connected to the system. Secondary memory is used as
backup medium for programs and data since the main memory is volatile. In most of the systems,
the secondary storage is kept in disks (Hard Disk). The secondary storage management service of
kernel deals with
Disk storage allocation
Disk scheduling (Time interval at which the disk is activated to backup data)
Free Disk space management
Protection Systems: Most of the modern operating systems are designed in such a way to support
multiple users with different levels of access permissions (e.g. Windows 10 with user permissions
like ‘Administrator’, ‘Standard’, ‘Restricted’, etc.). Protection deals with implementing the
security policies to restrict the access to both user and system resources by different applications or
processes or users. In multiuser supported operating systems, one user may not be allowed to view
or modify the whole or portions of another user's data or profile details. In addition, some applications may not be granted permission to make use of some of the system resources. This kind of protection is provided by the protection services running within the kernel.
Interrupt Handler: The kernel provides handler mechanisms for all external/internal interrupts generated by the system.
These are some of the important services offered by the kernel of an operating system. It does not mean that a kernel contains no components/services other than those explained above. Depending on the type of operating system, a kernel may contain fewer or more components/services. In addition to the components/services listed above, many operating systems offer a number of add-on system components/services to the kernel. Network communication, network management, user-interface graphics, timer services (delays, timeouts, etc.), error handlers, database management, etc. are examples of such components/services. The kernel exposes the interfaces to the various kernel applications/services it hosts to the user applications through a set of standard Application Programming Interfaces (APIs). User applications can avail these API calls to access the various kernel applications/services.
The program code corresponding to the kernel applications/services is kept in a contiguous area of primary (working) memory and is protected from unauthorised access by user programs/applications. The memory space at
which the kernel code is located is known as ‘Kernel Space’. Similarly, all user applications are
loaded to a specific area of primary memory and this memory area is referred as ‘User Space’.
User space is the memory area where user applications are loaded and executed. The partitioning
of memory into kernel and user space is purely Operating System dependent. Some OS
implements this kind of partitioning and protection whereas some OS do not segregate the kernel
and user application code storage into two separate areas. In an operating system with virtual
memory support, the user applications are loaded into its corresponding virtual memory space with
demand paging technique; Meaning, the entire code for the user application need not be loaded to
the main (primary) memory at once; instead the user application code is split into different pages
and these pages are loaded into and out of the main memory area on a need basis. The act of
loading the code into and out of the main memory is termed as ‘Swapping’. Swapping happens
between the main (primary) memory and secondary storage memory. Each process runs in its own virtual memory space and is not allowed to access the memory space corresponding to another process, unless explicitly requested by the process. Each process will have certain privilege
levels on accessing the memory of other processes and based on the privilege settings, processes
can request kernel to map another process’s memory to its own or share through some other
mechanism. Most of the operating systems keep the kernel application code in main memory and it
is not swapped out into the secondary memory.
As we know, the kernel forms the heart of an operating system. Different approaches are adopted
for building an Operating System kernel. Based on the kernel design, kernels can be classified into 'Monolithic' and 'Micro'.
Monolithic Kernel In monolithic kernel architecture, all kernel services run in the kernel space.
Here all kernel modules run within the same memory space under a single kernel thread. The tight
internal integration of kernel modules in monolithic kernel architecture allows the effective
utilisation of the low-level features of the underlying system. The major drawback of monolithic
kernel is that any error or failure in any one of the kernel modules leads to the crashing of the
entire kernel application. LINUX, SOLARIS and MS-DOS kernels are examples of monolithic kernels.
The architecture representation of a monolithic kernel is given in Fig. 3.2.
Microkernel: The microkernel design incorporates only the essential set of Operating System services into the kernel. The rest of the Operating System services are implemented in programs known as 'Servers', which run in user space. This provides a highly modular design and an OS-neutral abstraction to the kernel. Memory management, process management, timer systems and interrupt handlers are the essential services which form part of the microkernel. Mach, QNX and Minix 3 kernels are examples of microkernels. The architecture representation of a microkernel is
shown in Fig. 3.3.
Depending on the type of kernel and kernel services, purpose and type of computing systems where the OS is deployed and the responsiveness to applications, Operating Systems are classified into different types.
Real-Time Operating Systems that strictly adhere to the timing constraints for a task are referred to as 'Hard Real-Time' systems. A Hard Real-Time system must meet the deadlines for a task without any slippage. Missing any deadline may produce catastrophic results for Hard Real-Time Systems, including permanent data loss and irrecoverable damage to the system/users. Hard Real-Time
systems emphasise the principle ‘A late answer is a wrong answer’. A system can have several
such tasks and the key to their correct operation lies in scheduling them so that they meet their
time constraints. Air bag control systems and Anti-lock Brake Systems (ABS) of vehicles are typical examples of Hard Real-Time Systems. The air bag control system should come into action and deploy the air bags when the vehicle meets with a severe accident. Ideally speaking, the time for
triggering the air bag deployment task, when an accident is sensed by the Air bag control system,
should be zero and the air bags should be deployed exactly within the time frame, which is
predefined for the air bag deployment task. Any delay in the deployment of the air bags makes the
life of the passengers under threat. When the air bag deployment task is triggered, the currently
executing task must be pre-empted, the air bag deployment task should be brought into execution,
and the necessary I/O systems should be made readily available for the air bag deployment task.
To meet the strict deadline, the time between the air bag deployment event triggering and start of
the air bag deployment task execution should be minimum, ideally zero. As a rule of thumb, Hard Real-Time Systems do not implement the virtual memory model for handling the memory. This eliminates the delay in swapping the code corresponding to the task in and out of the primary memory. In general, the presence of a Human in the Loop (HITL) for tasks introduces unexpected delays in the task execution. Most Hard Real-Time Systems are automatic and do not contain a 'human in the loop'.
Real-Time Operating Systems that do not guarantee meeting deadlines, but offer the best effort to meet the deadline, are referred to as 'Soft Real-Time' systems. Missing deadlines for tasks is acceptable for a Soft Real-Time system if the frequency of deadline misses is within the compliance limit of the Quality of Service (QoS). A Soft Real-Time system emphasises the principle 'A late answer is an acceptable answer, but it could have been done a bit faster'. Soft Real-Time systems most often have a 'human in the loop (HITL)'. An Automatic Teller Machine (ATM) is a typical example of a Soft Real-Time System. If the ATM takes a few seconds more than the ideal operation time, nothing fatal happens. An audio-video playback system is another example of a Soft Real-Time system. No potential damage arises if a sample comes late by a fraction of a second for playback.
3.3) VxWORKS
VxWorks is a popular hard real-time, multitasking operating system from Wind River Systems
(http://www.windriver.com/). It supports a process model close to the POSIX Standard 1003.1,
with some deviations from the standard in certain areas. It supports a variety of target
processors/controllers including Intel x86, ARM, MIPS, PowerPC (PPC) and 68K. Please refer to the Wind River Systems' VxWorks documentation for the complete list of processors supported.
The kernel of VxWorks is popularly known as wind. The latest release of VxWorks introduces the
concept of Symmetric multiprocessing (SMP) and thereby facilitates multicore processor based
Real Time System design. The presence of VxWorks RTOS platform spans across aerospace and
defence applications to robotics and industrial applications, networking and consumer electronics,
and car navigation and telematics systems.
VxWorks follows the task based execution model. Under the VxWorks kernel, it is possible to run even a 'subroutine' as a separate task with its own context and stack. The 'wind' kernel uses a priority based scheduling policy for tasks. The inter-task communication among the tasks is implemented using message queues, sockets, pipes and signals, and task synchronisation is achieved through semaphores.
Under the VxWorks kernel, the tasks may be in one of the following primary states or a specific combination of them at any given point of time.
READY: The task is ‘Ready’ for execution and is waiting for its turn to get the CPU
PEND: The task is in the blocked (pended) state waiting for some resources
DELAY: The task is sleeping
SUSPEND: The task is unavailable for execution. This state is primarily used for halting a task for
debugging.
Suspending a task only prevents it from execution and will not block it from state transitions. It is possible for a suspended task to change its state (for example, if a task is suspended while it is in the 'DELAY' state, sleeping for a specified duration, its state will change to 'READY' + 'SUSPEND' when the sleeping delay is over).
STOP: A state used by the debugger facilities when a break point is hit. The error detection and
reporting mechanism also uses this when an error condition occurs. The ‘STOP’ state is mainly
associated with development activity.
A task may run to completion from its inception (‘READY’) state or it may get pended or delayed
or suspended during its execution. If a task is picked up for execution by the scheduler, and
suppose another task of higher priority becomes ‘READY’ for execution, and if the scheduling
policy is pre-emptive priority based, the currently executing task is pre-empted and the high
priority task is executed. The pre-empted task enters the ‘READY’ state. If a currently executing
task requires a shared resource, which is currently held by another task, the currently executing
task is preempted and it enters the ‘PEND’ state until the resource is released by the task holding
it. If a task has a sleeping requirement, such as polling an external device or sampling data from the external world at a periodic interval, the task sleeps during that interval. The task is said to be in the 'DELAY' state during this time. If a debug request for
debugging a task in execution happens during the execution of a task, it is moved to the state
‘SUSPEND’. The task in this state is unavailable for execution. The ‘SUSPEND’ state only
prevents the execution of a task and it does not prevent state transitions.
When a task is created with the task creation kernel system call taskInit(), the task is created with
the state ‘SUSPEND’. In order to run the newly created task, it should be brought to the ‘READY’
state. It is achieved by activating the task with the system call taskActivate(). The system call
taskSpawn() can be used for creating a task directly in the 'READY' state. The taskSpawn() call broadly follows the syntax sketched below.
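The following is a minimal sketch based on the publicly documented VxWorks API; the priority, options and stack size values shown are only illustrative.

/* Sketch of taskSpawn() usage (parameter types and values are illustrative). */
#include <vxWorks.h>
#include <taskLib.h>

void myTaskEntry(int arg1, int arg2)     /* task entry point */
{
    for (;;)
    {
        /* task body: do the periodic work here */
        taskDelay(100);                  /* sleep for 100 ticks ('DELAY' state) */
    }
}

void createTask(void)
{
    /* taskSpawn(name, priority, options, stackSize, entryPt, arg1..arg10)
       creates the task and places it directly in the 'READY' state.       */
    int tid = taskSpawn("tMyTask",       /* task name                */
                        100,             /* priority (0 = highest)   */
                        0,               /* options                  */
                        4096,            /* stack size in bytes      */
                        (FUNCPTR)myTaskEntry,
                        1, 2, 0, 0, 0, 0, 0, 0, 0, 0);
    (void)tid;
}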
3.4) MICROC/OS-II
MicroC/OS-II (μC/OS-II) is a simple, easy to use real-time kernel written in the 'C' language for embedded application development. MicroC/OS-II stands for Micro-Controller Operating System Version 2. The μC/OS-II features the following:
Multitasking (supports multi-threading)
Priority based pre-emptive task scheduling
MicroC/OS-II is a commercial real-time kernel from Micrium Inc (www.micrium.com). The MicroC/OS-II kernel is available for different families of processors/controllers and supports processors/controllers ranging from 8-bit to 64-bit. The ARM family of processors, Intel 8085/x86/8051, the Freescale 68K series
and Altera Nios II are examples of some of the processors/controllers supported by MicroC/OS-II. Please check the Micrium Inc website for complete details about the different families of processors/controllers for which the MicroC/OS-II kernel port is available. MicroC/OS-III is the latest addition to the MicroC/OS family of RTOSs, with lots of improvements and features like support for an unlimited number of tasks, mutexes, semaphores, event flags, message queues, timers and memory partitions, an unlimited number of priorities, allowing multiple tasks to run at the same priority level, and pre-emptive scheduling with Round-Robin for equal priority tasks, etc. Please refer to the online learning centre for detailed coverage of MicroC/OS-III based design.
MicroC/OS follows the task based execution model. Each task is implemented as an infinite loop (a skeletal task is sketched after the state descriptions below). Under the MicroC/OS kernel, a task may be in one of the following states at any given point of time.
Dormant: The dormant state corresponds to the state where a task is created but no resources are
allocated to it. This means the task is still present in the code memory space, but MicroC/OS needs
to be informed about it.
Ready: Corresponds to a state where the task is incepted into memory and is awaiting the CPU for
its turn for execution.
Running: Corresponds to a state where the task is being executed by CPU.
Pending: Corresponds to a state where a running task is temporarily suspended from execution and does not have immediate access to resources. The waiting state might be invoked by various conditions, such as the task waiting for an event to occur (e.g. waiting for user input such as keyboard input) or waiting to get access to a shared resource.
Interrupted: A task enters the Interrupted (or ISR) state when an interrupt occurs and the CPU executes the ISR.
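A skeletal μC/OS-II task is sketched below, assuming the standard μC/OS-II API (OSTaskCreate() and OSTimeDly()); the task priority and stack size shown are illustrative.

/* Skeletal MicroC/OS-II task: an infinite loop that does its work and then
   delays, moving the task between the Running, Ready and Pending states. */
#include "ucos_ii.h"                  /* uC/OS-II kernel header */

#define APP_TASK_PRIO       5
#define APP_TASK_STK_SIZE 128

static OS_STK AppTaskStk[APP_TASK_STK_SIZE];

static void AppTask(void *p_arg)
{
    (void)p_arg;
    for (;;)                          /* each task is an infinite loop */
    {
        /* do the task's work here */
        OSTimeDly(OS_TICKS_PER_SEC);  /* give up the CPU for one second */
    }
}

void AppCreateTask(void)
{
    /* Create the task; it enters the Ready state and runs when it becomes
       the highest priority ready task (stack top assumes a downward-growing
       stack, which is port dependent). */
    OSTaskCreate(AppTask,
                 (void *)0,
                 &AppTaskStk[APP_TASK_STK_SIZE - 1],
                 APP_TASK_PRIO);
}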
3.5) EMBEDDED LINUX
Embedded Linux is nothing but the use of the Linux kernel and other open-source software development tools, such as open-source libraries, in embedded systems application development. Hence, instead of using a bare-metal embedded systems approach where we have to write every piece of the software ourselves, we make use of the Linux operating system to design embedded applications. This makes the embedded development process quick and effortless, because Linux provides support for all types of software that we need in our embedded applications, such as a TCP/IP stack, serial communication protocols, graphics libraries, etc. We only have to configure Linux and create an image according to our underlying processor architecture, and this process is known as creating a board support package (BSP).
There is no specific Linux kernel image for embedded devices. An extensive range of devices,
workstations, and embedded systems can be built up by the same Linux kernel code by configuring
and porting it to different processor architectures. Mentor Graphics is one of the leading embedded
Linux service providers. The following diagram shows its commercial platforms and services.
The following block diagram shows the architecture of embedded Linux based systems.
The embedded system build process is usually done on a host PC using cross-compilation tools, because the target hardware does not have enough resources to run the tools that are used to generate the binary image for the target embedded hardware. The process of compiling code on one system (the host system) so that the generated binary runs on another system (the target) is known as cross-compilation.
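As an illustration, the tiny program below would be compiled on the host with a cross toolchain and the resulting binary copied to the target. The toolchain prefix shown in the comment (arm-linux-gnueabihf-) is an assumed example; the actual prefix depends on the target architecture and the BSP.

/* hello.c - a trivial program used to illustrate cross-compilation.
 *
 * On the host PC it would be built with a cross toolchain, for example
 * (the toolchain prefix below is an assumed example for a 32-bit ARM target):
 *
 *     arm-linux-gnueabihf-gcc -o hello hello.c
 *
 * The resulting 'hello' binary will not run on the x86 host; it is copied
 * to the ARM target board and executed there.
 */
#include <stdio.h>

int main(void)
{
    printf("Hello from the target board\n");
    return 0;
}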
3.6) RT LINUX
RTLinux provides the ability to run special real-time tasks and interrupt handlers on the same
machine as standard Linux. These tasks and handlers execute when they need to execute no matter
what Linux is doing. The worst case time between the moment a hardware interrupt is detected by
the processor and the moment an interrupt handler starts to execute is under 15 microseconds on
RTLinux running on a generic x86 (circa 2000).
A RTLinux periodic task runs within 35 microseconds of its scheduled time on the same hardware.
These times are hardware limited, and as hardware improves RTLinux will also improve. Standard
Linux has excellent average performance and can even provide millisecond level scheduling
precision for tasks using the POSIX soft real-time capabilities. Standard Linux is not, however,
designed to provide sub-millisecond precision and reliable timing guarantees. RTLinux was based
on a lightweight virtual machine where the Linux "guest" was given a virtualized interrupt
controller and timer, and all other hardware access was direct.
From the point of view of the real-time "host", the Linux kernel is a thread. Interrupts needed for
deterministic processing are processed by the real-time core, while other interrupts are forwarded
to Linux, which runs at a lower priority than real-time threads. Linux drivers handled almost all
I/O. First-In-First-Out pipes (FIFO) or shared memory can be used to share data between the
operating system and RTLinux.
3.6.1) OBJECTIVE:
The key RTLinux design objective is that the system should be transparent, modular, and extensible. Transparency means that there are no unopenable black boxes and the cost of any operation should be determinable. Modularity means that it is possible to omit functionality, and the expense of that functionality, if it is not needed. And extensibility means that programmers should be able to add modules and tailor the system to their requirements. The base RTLinux system supports high speed interrupt handling and no more. It has a simple priority scheduler that can be easily replaced by schedulers more suited to the needs of a specific application. RTLinux was designed to maximise the advantage gained from having Linux and its powerful capabilities available.
3.7) WINDOWS CE
Windows Embedded Compact, formerly Windows Embedded CE, Windows Powered and
Windows CE, is an operating system subfamily developed by Microsoft as part of its Windows
Embedded family of products.
Unlike Windows Embedded Standard, which is based on Windows NT, Windows Embedded
Compact uses a different hybrid kernel.[7] Microsoft licenses it to original equipment
manufacturers (OEMs), who can modify and create their own user interfaces and experiences, with
Windows Embedded Compact providing the technical foundation to do so. The current version of
Windows Embedded Compact supports x86 and ARM processors with board support package
(BSP) directly.[8] The MIPS and SHx architectures had support prior to version 7.0. 7.0 still works
on MIPSII architecture.
Originally, Windows CE was designed for minimalistic and small computers. However, CE had its own kernel, whereas operating systems such as Windows XP Embedded are based on NT. Windows CE was a
modular/componentized operating system that served as the foundation of several classes of
devices such as Handheld PC, Pocket PC, Auto PC, Windows Mobile, Windows Phone 7 and
more.
3.7.1) FEATURES
Windows CE is optimized for devices that have minimal memory; a Windows CE kernel may
run with one megabyte of memory. Devices are often configured without disk storage, and may
be configured as a "closed" system that does not allow for end-user extension (for instance, it can
be burned into ROM). Windows CE conforms to the definition of a real-time operating system,
with deterministic interrupt latency. From Version 3 and onward, the system supports 256
priority levels and uses priority inheritance for dealing with priority inversion. The fundamental unit of execution is the thread. This helps to simplify the interface and improve execution time.
The first version – known during development under the code name "Pegasus" – featured a
Windows-like GUI and a number of Microsoft's popular apps, all trimmed down for smaller
storage, memory, and speed of the palmtops of the day. Since then, Windows CE has evolved
into a component-based, embedded, real-time operating system. It is no longer targeted solely at
hand-held computers.[11] Many platforms have been based on the core Windows CE operating
system, including Microsoft's AutoPC, Pocket PC 2000, Pocket PC 2002, Windows Mobile
2003, Windows Mobile 2003 SE, Windows Mobile 5, Windows Mobile 6, Smartphone
2002, Smartphone 2003, Portable Media Center, Zune, Windows Phone 7 and many industrial
devices and embedded systems. Windows CE even powered select games for the Sega
Dreamcast, was the operating system of the Gizmondo handheld, and can partially run on
modified Xbox game consoles.
A distinctive feature of Windows CE compared to other Microsoft operating systems is that large
parts of it are offered in source code form. First, source code was offered to several vendors, so
they could adjust it to their hardware. Then products like Platform Builder (an integrated
environment for Windows CE OS image creation and integration, or customized operating
system designs based on CE) offered several components in source code form to the general
public. However, a number of core components that do not need adaptation to specific hardware
environments (other than the CPU family) are still distributed in binary only form.
Windows CE 2.11 was the first embedded Windows release to support a console and a Windows
CE version of cmd.exe.[12]
The decision of choosing an RTOS for an embedded design is very crucial. A lot of factors need to be analysed carefully before making a decision on the selection of an RTOS. These factors can be functional or non-functional. The following section gives a brief introduction to the important functional and non-functional requirements that need to be analysed in the selection of an RTOS for an embedded design.
Processor Support: It is not necessary that all RTOSs support all kinds of processor architectures. It is essential to ensure processor support by the RTOS.
Memory Requirements: The OS requires ROM memory for holding the OS files and it is
normally stored in a non-volatile memory like FLASH. OS also requires working memory RAM
for loading the OS services. Since embedded systems are memory constrained, it is essential to
evaluate the minimal ROM and RAM requirements for the OS under consideration.
Real-Time Capabilities: It is not mandatory that the operating system for all embedded systems needs to be real-time, and not all embedded operating systems are 'Real-time' in behaviour. The task/process scheduling policies play an important role in the 'Real-time' behaviour of an OS. Analyse the real-time capabilities of the OS under consideration and the standards met by the operating system for real-time capabilities.
Kernel and Interrupt Latency: The kernel of the OS may disable interrupts while executing
certain services and it may lead to interrupt latency. For an embedded system whose response
requirements are high, this latency should be minimal.
Inter Process Communication and Task Synchronisation: The implementation of Inter Process
Communication and Synchronisation is OS kernel dependent. Certain kernels may provide a bunch
of options whereas others provide very limited options. Certain kernels implement policies for
avoiding priority inversion issues in resource sharing.
Modularisation Support: Most operating systems provide a bunch of features. At times all of them may not be necessary for the functioning of an embedded product. It is very useful if the OS supports modularisation, wherein the developer can choose the essential modules and re-compile the OS image for functioning. Windows CE is an example of a highly modular operating system.
Support for Networking and Communication: The OS kernel may provide stack
implementation and driver support for a bunch of communication interfaces and networking.
Ensure that the OS under consideration provides support for all the interfaces required by the
embedded product.
Development Language Support: Certain operating systems include the run time libraries
required for running applications written in languages like Java and C#. A Java Virtual Machine
(JVM) customised for the Operating System is essential for running java applications. Similarly
the .NET Compact Framework (.NETCF) is required for running Microsoft® .NET applications on
top of the Operating System. The OS may include these components as built-in components; if not, check the availability of the same from a third-party vendor for the OS under consideration.
Custom Developed or Off the Shelf: Depending on the OS requirement, it is possible to go for
the complete development of an operating system suiting the embedded system needs or use an off
the shelf, readily available operating system, which is either a commercial product or an Open
Source product, which is in close match with the system requirements. Sometimes it may be
possible to build the required features by customising an Open source OS. The decision on which
to select is purely dependent on the development cost, licensing fees for the OS, development time
and availability of skilled resources.
Cost: The total cost for developing or buying the OS and maintaining it in terms of commercial
product and custom build needs to be evaluated before taking a decision on the selection of OS.
Development and Debugging Tools Availability: The availability of development and debugging
tools is a critical decision making factor in the selection of an OS for embedded design. Certain
Operating Systems may be superior in performance, but the availability of tools for supporting the
development may be limited. Explore the different tools available for the OS under consideration.
Ease of Use: How easy it is to use a commercial RTOS is another important feature that needs to
be considered in the RTOS selection.
After Sales: For a commercial embedded RTOS, after sales in the form of e-mail, on-call services,
etc. for bug fixes, critical patch updates and support for production issues, etc. should be analysed
thoroughly.
Co-operating Processes: In the co-operating interaction model, one process requires the inputs from other processes to complete its execution.
Co-operating processes exchange information and communicate through the following methods.
Co-operation through Sharing: The co-operating process exchange data through some shared
resources.
Co-operation through Communication: No data is shared between the processes. But they
communicate for synchronization.
Competing Processes: The competing processes do not share anything among themselves but they share the system resources. The competing processes compete for the system resources such as file, display device, etc.
4.2.1) Shared Memory
Processes share some area of the memory to communicate among themselves (Fig 4.1). Information to be communicated by a process is written to the shared memory area. Other processes which require this information can read the same from the shared memory area. It is similar to the real world example where a 'Notice Board' is used by a corporation to publish public information among the employees (the only exception is that only the corporation has the right to modify the information published on the notice board and employees are given 'read'-only access, meaning it is only a one-way channel).
The implementation of shared memory concept is kernel dependent. Different mechanisms are
adopted by different kernels for implementing this. A few among them are:
4.2.1.1) Pipes
'Pipe' is a section of the shared memory used by processes for communicating. Pipes follow the client-server architecture. A process which creates a pipe is known as a pipe server and a process which connects to a pipe is known as a pipe client. A pipe can be considered as a conduit for information flow and has two conceptual ends.
It can be unidirectional, allowing information flow in one direction, or bidirectional, allowing information flow in both directions. A unidirectional pipe allows the process connecting at one end of the pipe to write to the pipe and the process connected at the other end of the pipe to read the data, whereas a bi-directional pipe allows both reading and writing at each end. A unidirectional pipe can be visualised as a conduit with a write end for the sender and a read end for the receiver.
Anonymous Pipes: The anonymous pipes are unnamed, unidirectional pipes used for data transfer
between two processes.
Named Pipes: Named pipe is a named, unidirectional or bi-directional pipe for data exchange
between processes. Like anonymous pipes, the process which creates the named pipe is known as
pipe server. A process which connects to the named pipe is known as pipe client. With named
pipes, any process can act as both client and server allowing point-to-point communication.
Named pipes can be used for communicating between processes running on the same machine or
between processes running on different machines connected to a network.
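A minimal sketch of an anonymous pipe between a parent and a child process is given below, assuming a POSIX environment; Win32 offers the equivalent CreatePipe()/CreateNamedPipe() calls.

/* Anonymous pipe between a parent and a child process (POSIX sketch).
   The parent writes a message into the pipe; the child reads it. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int fd[2];                       /* fd[0]: read end, fd[1]: write end */
    char buf[64];

    if (pipe(fd) == -1) return 1;    /* create the pipe                   */

    if (fork() == 0) {               /* child: reader                     */
        close(fd[1]);                /* close unused write end            */
        ssize_t n = read(fd[0], buf, sizeof(buf) - 1);
        if (n > 0) { buf[n] = '\0'; printf("Child received: %s\n", buf); }
        close(fd[0]);
        return 0;
    }

    /* parent: writer */
    close(fd[0]);                    /* close unused read end             */
    const char *msg = "Hello through the pipe";
    write(fd[1], msg, strlen(msg));
    close(fd[1]);
    wait(NULL);                      /* wait for the child to finish      */
    return 0;
}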
4.2.1.2) Memory Mapped Objects
Memory mapped object is a shared memory technique adopted by certain Real-Time Operating Systems for allocating a shared block of memory which can be accessed by multiple processes simultaneously (of course, certain synchronization techniques should be applied to prevent inconsistent results). In this approach a mapping object is created and physical storage for it is
reserved and committed. A process can map the entire committed physical area or a block of it to
its virtual address space. All read and write operation to this virtual address space by a process is
directed to its committed physical area. Any process which wants to share data with other
processes can map the physical memory area of the mapped object to its virtual memory space and
use it for sharing the data.
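A minimal sketch of this idea, assuming a POSIX environment (shm_open() and mmap()); Win32 provides the equivalent CreateFileMapping()/MapViewOfFile() calls. The object name /demo_shm is an illustrative choice.

/* Shared memory via a memory mapped object (POSIX sketch).
   One process creates and writes the mapping; another process opening the
   same name can map the same physical area into its own address space. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SHM_NAME "/demo_shm"   /* illustrative object name */
#define SHM_SIZE 4096

int main(void)
{
    /* Create (or open) the mapping object and reserve physical storage */
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0666);
    if (fd == -1) return 1;
    if (ftruncate(fd, SHM_SIZE) == -1) return 1;

    /* Map the committed area into this process's virtual address space */
    char *p = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return 1;

    /* Writes here are visible to any other process mapping SHM_NAME */
    strcpy(p, "Data shared through the memory mapped object");
    printf("Wrote: %s\n", p);

    munmap(p, SHM_SIZE);
    close(fd);
    /* shm_unlink(SHM_NAME) would remove the object when no longer needed */
    return 0;
}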
4.2.2) Message Passing
Based on the message passing operation between the processes, message passing is classified into message queue based, mailbox based and signalling based techniques, which are described below.
4.2.2.1) Message Queue
Usually the process which wants to talk to another process posts the message to a First-In-First-Out (FIFO) queue called 'Message queue', which stores the messages temporarily in a system defined memory object, to pass it to the desired process (Fig. 4.6). Messages are sent and received
through send (Name of the process to which the message is to be sent, message) and receive
(Name of the process from which the message is to be received, message) methods. The messages
are exchanged through a message queue. The implementation of the message queue, send and
receive methods are OS kernel dependent. The Windows XP OS kernel maintains a single system
message queue and one process/thread (Process and threads are used interchangeably here, since
thread is the basic unit of process in windows) specific message queue. A thread which wants to
communicate with another thread posts the message to the system message queue.
Fig 4.6: Concept of message queue based indirect messaging for IPC
The kernel picks up the message from the system message queue one at a time and examines the
message for finding the destination thread and then posts the message to the message queue of the
corresponding thread. For posting a message to a thread's message queue, the kernel fills a message structure MSG and copies it to the message queue of the thread. The message structure
MSG contains the handle of the process/thread for which the message is intended, the message
parameters, the time at which the message is posted, etc. A thread can simply post a message to
another thread and can continue its operation or it may wait for a response from the thread to
which the message is posted. The messaging mechanism is classified into synchronous and
asynchronous based on the behavior of the message posting thread. In asynchronous messaging,
the message posting thread just posts the message to the queue and it will not wait for an
acceptance (return) from the thread to which the message is posted, whereas in synchronous
messaging, the thread which posts a message enters a waiting state and waits for the message result from the thread to which the message is posted. The thread which invoked the send message becomes blocked and the scheduler will not pick it up for scheduling.
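A minimal sketch of message queue based messaging is given below, assuming the POSIX message queue API (mq_open(), mq_send(), mq_receive()); the queue name /demo_mq and its attributes are illustrative. The Windows mechanism described above uses different calls (PostMessage()/GetMessage()).

/* Message queue based IPC (POSIX sketch): a message is posted to a named
   FIFO queue and later received (normally by a different process). */
#include <fcntl.h>
#include <mqueue.h>
#include <stdio.h>
#include <string.h>

#define QUEUE_NAME "/demo_mq"       /* illustrative queue name */

int main(void)
{
    struct mq_attr attr = {0};
    attr.mq_maxmsg  = 10;           /* queue depth                 */
    attr.mq_msgsize = 64;           /* maximum size of one message */

    /* Create (or open) the message queue */
    mqd_t mq = mq_open(QUEUE_NAME, O_CREAT | O_RDWR, 0666, &attr);
    if (mq == (mqd_t)-1) return 1;

    /* Sender side: post a message with priority 0 */
    const char *msg = "Hello via message queue";
    mq_send(mq, msg, strlen(msg) + 1, 0);

    /* Receiver side: the buffer must be at least mq_msgsize bytes */
    char buf[64];
    ssize_t n = mq_receive(mq, buf, sizeof(buf), NULL);
    if (n > 0) printf("Received: %s\n", buf);

    mq_close(mq);
    mq_unlink(QUEUE_NAME);          /* remove the queue when done */
    return 0;
}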
4.2.2.2) Mailbox
Mailbox is an alternate form of 'Message queue' and it is used in certain Real-Time Operating Systems for IPC. The mailbox technique for IPC in an RTOS is usually used for one-way messaging. The
task/thread which wants to send a message to other tasks/threads creates a mailbox for posting the
messages. The threads which are interested in receiving the messages posted to the mailbox by the
mailbox creator thread can subscribe to the mailbox. The thread which creates the mailbox is known as the 'mailbox server' and the threads which subscribe to the mailbox are known as 'mailbox clients'. The mailbox server posts messages to the mailbox and notifies the clients which are subscribed to the mailbox. The clients read the message from the mailbox on receiving the notification. The mailbox creation, subscription, message reading and writing are achieved through OS kernel provided API calls. Mailbox and message queues are the same in functionality. The
only difference is in the number of messages supported by them. Both of them are used for
passing data in the form of message(s) from a task to another task(s). Mailbox is used for
exchanging a single message between two tasks or between an Interrupt Service Routine (ISR)
and a task. Mailbox associates a pointer pointing to the mailbox and a wait list to hold the tasks
waiting for a message to appear in the mailbox. The implementation of the mailbox is OS kernel dependent. MicroC/OS-II implements the mailbox as a mechanism for inter-task communication. We will discuss the mailbox based IPC implementation under MicroC/OS-II in a later chapter. Figure 4.7 given below illustrates the mailbox based IPC technique.
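As an illustration, a mailbox under MicroC/OS-II is typically used as sketched below, with the standard μC/OS-II calls OSMboxCreate(), OSMboxPost() and OSMboxPend(); the task structure and timing values are illustrative.

/* MicroC/OS-II mailbox sketch: one task posts a pointer-sized message,
   another task pends on the mailbox until the message arrives. */
#include "ucos_ii.h"

static OS_EVENT *DemoMbox;               /* the mailbox (created empty) */
static INT32U    SensorValue;

void ProducerTask(void *p_arg)
{
    (void)p_arg;
    for (;;)
    {
        SensorValue++;                          /* pretend to acquire data  */
        OSMboxPost(DemoMbox, &SensorValue);     /* post a message (pointer) */
        OSTimeDly(OS_TICKS_PER_SEC);
    }
}

void ConsumerTask(void *p_arg)
{
    INT8U err;
    (void)p_arg;
    for (;;)
    {
        /* Wait (timeout = 0 means wait forever) for a message to appear */
        INT32U *pval = (INT32U *)OSMboxPend(DemoMbox, 0, &err);
        (void)err;                       /* err reports the pend status */
        if (pval != (void *)0)
        {
            /* use *pval here */
        }
    }
}

void DemoInit(void)
{
    DemoMbox = OSMboxCreate((void *)0);   /* create an empty mailbox */
}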
4.2.2.3) Signalling
Signalling is a primitive way of communication between processes/threads. Signals are used for asynchronous notifications where one process/thread fires a signal, indicating the occurrence of a scenario which the other process(es)/thread(s) is waiting for. Signals are not queued and they do not carry any data. The communication mechanism used in the RTX51 Tiny OS is an example of signalling. The os_send_signal kernel call under RTX51 sends a signal from one task to a specified task. Similarly, the os_wait kernel call waits for a specified signal. Refer to the topic 'Round Robin Scheduling' under the section 'Priority based scheduling' for more details on signalling in RTX51 Tiny OS. The VxWorks RTOS kernel also implements 'signals' for inter-process communication. Whenever a specified signal occurs it is handled in a signal handler associated with the signal.
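The same idea on a desktop POSIX system is sketched below, purely as an illustration of the concept; the RTX51 calls mentioned above use a different, task-ID based interface.

/* Signalling sketch (POSIX): a process installs a handler for SIGUSR1 and
   another process (or a shell 'kill -USR1 <pid>') fires the signal.
   No data is carried; the signal only notifies that the event occurred. */
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t event_occurred = 0;

static void on_sigusr1(int signo)
{
    (void)signo;
    event_occurred = 1;              /* just note that the event happened */
}

int main(void)
{
    signal(SIGUSR1, on_sigusr1);     /* register the signal handler */
    printf("Waiting for SIGUSR1, pid = %d\n", (int)getpid());

    while (!event_occurred)
        pause();                     /* sleep until a signal arrives */

    printf("Signal received, event handled\n");
    return 0;
}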
4.3) REMOTE PROCEDURE CALL (RPC) AND SOCKETS
Remote Procedure Call (RPC) is a powerful technique for constructing distributed, client-server
based applications. It is based on extending the conventional local procedure calling so that
the called procedure need not exist in the same address space as the calling procedure. The two
processes may be on the same system, or they may be on different systems with a network
connecting them.
Remote Procedure Call or RPC (Fig. 4.9) is the Inter Process Communication (IPC) mechanism
used by a process to call a procedure of another process running on the same CPU or on a different
CPU which is interconnected in a network. In object oriented language terminology, RPC is also known as Remote Invocation or Remote Method Invocation (RMI). RPC is mainly used for distributed applications like client-server applications. With RPC it is possible to communicate over a heterogeneous network (i.e. a network where the client and server applications are running on different operating systems). The CPU/process containing the procedure which needs to be invoked remotely is known as the server. The CPU/process which initiates an RPC request is known as the client.
In asynchronous RPC calls, the calling process continues its execution while the
remote process performs the execution of the procedure. The result from the remote procedure is
returned to the caller through mechanisms like callback functions. On the security front, RPC employs authentication mechanisms to protect the systems against vulnerabilities. The client applications (processes) should authenticate themselves with the server for getting access. Authentication mechanisms like IDs and cryptographic techniques (like DES, 3DES), etc. are used by the client for authentication. Without authentication, any client can access the remote procedure.
This may lead to potential security risks. Fig 4.10 represents the mechanism of RPC on different
CPU.
4.3.1) Sockets
Sockets are used for RPC communication. Socket is a logical endpoint in a two-way
communication link between two applications running on a network. A port number is associated
with a socket so that the network layer of the communication channel can deliver the data to the
designated application. Sockets are of different types, namely, Internet sockets (INET), UNIX
sockets, etc. The INET socket works on internet communication protocol. TCP/IP, UDP, etc. are
the communication protocols used by INET sockets. INET sockets are classified into:
1. Stream sockets
2. Datagram sockets
Stream sockets are connection oriented and they use TCP to establish a reliable connection. On the other hand, Datagram sockets are connectionless and rely on UDP. UDP is unreliable when compared to TCP. The client-server communication model uses a
socket at the client side and a socket at the server side. A port number is assigned to both of these
sockets. The client and server should be aware of the port number associated with the socket. In
order to start the communication, the client needs to send a connection request to the server at the specified port number. The client should be aware of the name of the server along with its port number. The server always listens to the specified port number on the network.
Upon receiving a connection request from the client, based on the success of authentication, the server grants the connection request and a communication channel is established between the client and the server. The client uses the host name and port number of the server for sending requests, and the server uses the client's name and port number for sending responses.
If the client and server applications (both processes) are running on the same CPU, both
can use the same host name and port number for communication. The physical communication
link between the client and server uses network interfaces like Ethernet or Wi-Fi for data
communication. The underlying implementation of socket is OS kernel dependent. Different types
of OSs provide different socket interfaces. The following sample code illustrates the usage of a socket for creating a client application under Windows OS. Winsock (Windows Socket 2) is the library implementing socket functions for Win32.
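A minimal sketch of such a client is given below. The original listing is not reproduced here; this version assumes the Winsock 2 API (link with ws2_32.lib) and keeps error handling brief.

/* Winsock 2 TCP client sketch: connects to SERVER:PORT, sends a greeting,
   waits for a response and terminates the connection. */
#include <winsock2.h>
#include <stdio.h>
#include <string.h>

#define SERVER "172.168.0.1"   /* server IP address used in the text */
#define PORT   5000            /* server port number used in the text */

int main(void)
{
    WSADATA wsaData;
    SOCKET  sock;
    struct sockaddr_in server;
    char    recvBuf[512];
    int     bytes;

    /* Initialise the Winsock library before any socket call */
    if (WSAStartup(MAKEWORD(2, 2), &wsaData) != 0)
        return 1;

    /* Create an INET stream (TCP) socket */
    sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sock == INVALID_SOCKET) { WSACleanup(); return 1; }

    /* Fill in the server address; htons() converts the port to network byte order */
    memset(&server, 0, sizeof(server));
    server.sin_family      = AF_INET;
    server.sin_addr.s_addr = inet_addr(SERVER);
    server.sin_port        = htons(PORT);

    /* Connect to the server and exchange data */
    if (connect(sock, (struct sockaddr *)&server, sizeof(server)) == 0) {
        send(sock, "Hi from Client", 14, 0);
        bytes = recv(sock, recvBuf, sizeof(recvBuf) - 1, 0);
        if (bytes > 0) {
            recvBuf[bytes] = '\0';
            printf("Received: %s\n", recvBuf);
        }
    }

    closesocket(sock);   /* terminate the connection */
    WSACleanup();
    return 0;
}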
The above application tries to connect to a server machine with IP address 172.168.0.1 and port number 5000. Change the values of SERVER and PORT to connect to a machine with a different IP address and port number. If the connection succeeds, it sends the data "Hi from Client" to the server, waits for a response from the server and finally terminates the connection.
Under Windows, the socket function library Winsock should be initialised before using the socket related functions. The function WSAStartup() performs this initialisation. The socket() function call creates a socket. The socket type, connection type and protocol for communication are the parameters for socket creation. Here the socket type is INET (AF_INET) and the connection type is stream socket (SOCK_STREAM). The protocol selected for communication is TCP/IP (IPPROTO_TCP). After creating the socket, it is connected to a server. For connecting to the server,
the server address and port number should be indicated in the connection request. The sockaddr_in
structure specifies the socket type, IP address and port of the server to be connected to. The
connect () function connects the socket with a specified server. If the server grants the connection
request, the connect() function returns success. The send() function is used for sending data to a
server. It takes the socket name and data to be sent as parameters. Data from server is received
using the function call recv(). It takes the socket name and details of buffer to hold the received
data as parameters. The TCP/IP network stack expects network byte order (Big Endian: Higher
order byte of the data is stored in lower memory address location) for data. The function htons()
converts the byte order of an unsigned short integer to the network order. The closesocket()
function closes the socket connection. On the server side, the server creates a socket using the
function socket() and binds the socket with a port using the bind() function. It listens to the port bound to the socket for any incoming connection request. The function listen() performs this.
Upon receiving a connection request, the server accepts it. The function accept() performs the
accepting operation. Now the connectivity is established. Server can receive and transmit
data using the function calls recv() and send() respectively. The implementation of the server
application is left to the readers as an exercise.
Some kernels provide a special register as part of each task's control block, as shown in Fig 4.11. This register, called an event register, is an object belonging to a task and consists of a group of binary event flags used to track the occurrence of specific events. Depending on a given kernel's implementation of this mechanism, an event register can be 8, 16, or 32 bits wide, maybe even
more. Each bit in the event register is treated like a binary flag (also called an event flag) and can
be either set or cleared.
Through the event register, a task can check for the presence of particular events that can control
its execution. An external source, such as another task or an ISR, can set bits in the event register
to inform the task that a particular event has occurred. Applications define the event associated with an event flag. This definition must be agreed upon between the event sender and receiver using the event registers.
Typically, when the underlying kernel supports the event register mechanism, the kernel creates an
event register control block as part of the task control block when creating a task, as shown in Fig
4.12.
The task specifies the set of events it wishes to receive. This set of events is maintained in the
wanted events register. Similarly, arrived events are kept in the received events register. The task
indicates a timeout to specify how long it wishes to wait for the arrival of certain events. The kernel wakes up the task when this timeout has elapsed if no specified events have arrived at the task.
Using the notification conditions, the task directs the kernel as to when it wishes to be notified
(awakened) upon event arrivals. For example, the task can specify the notification conditions as
“send notification when both event type 1 and event type 3 arrive or when event type 2 arrives.”
This option provides flexibility in defining complex notification patterns.
Two main operations are associated with an event register, the sending and the receiving operations,
as shown in Table 4.1.
The receive operation allows the calling task to receive events from external sources. The task can
specify if it wishes to wait, as well as the length of time to wait for the arrival of desired events
before giving up. The task can wait forever or for a specified interval. Specifying a set of events
when issuing the receive operation allows a task to block-wait for the arrival of multiple events,
although events might not necessarily all arrive simultaneously. The kernel translates this event set
into the notification conditions. The receive operation returns either when the notification
conditions are satisfied or when the timeout has occurred. Any received events that are not indicated
in the receive operation are left pending in the received events register of the event register control
block. The receive operation returns immediately if the desired events are already pending.
The event set is constructed using the bit-wise AND/OR operation. With the AND operation, the
task resumes execution only after every event bit from the set is on. A task can also block-wait for
the arrival of a single event from an event set, which is constructed using the bit-wise
OR operation. In this case, the task resumes execution when any one event bit from the set is on.
The send operation allows an external source, either a task or an ISR, to send events to another task.
The sender can send multiple events to the designated task through a single send operation. Events
that have been sent and are pending on the event bits but have not been chosen for reception by the
task remain pending in the received events register of the event register control block.
Events in the event register are not queued. An event register cannot count the occurrences of the
same event while it is pending; therefore, subsequent occurrences of the same event are lost. For
example, if an ISR sends an event to a task and the event is left pending; and later another
task sends the same event again to the same task while it is still pending, the first occurrence
of the event is lost.
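The behaviour described so far can be sketched as a small C model of an event register. Note that this is only an illustrative model; the type and function names (event_register_t, ev_send(), ev_condition_met()) are assumptions introduced for the example and do not belong to any particular kernel's API.

#include <stdint.h>
#include <stdio.h>

/* Minimal event register model: each bit is one event flag. */
typedef struct {
    uint32_t received;   /* received events register                      */
    uint32_t wanted;     /* wanted events register                        */
    int      wait_all;   /* 1 = AND notification, 0 = OR notification     */
} event_register_t;

/* Sender side (task or ISR): set event bits; repeated occurrences of a
   pending event are not counted, they simply merge into the same bit.   */
void ev_send(event_register_t *er, uint32_t events)
{
    er->received |= events;
}

/* Returns 1 when the notification condition is satisfied. */
int ev_condition_met(const event_register_t *er)
{
    uint32_t pending = er->received & er->wanted;
    return er->wait_all ? (pending == er->wanted) : (pending != 0);
}

int main(void)
{
    event_register_t er = { 0, 0x5 /* wants events 1 and 3 */, 1 /* AND */ };
    ev_send(&er, 0x1);                       /* only event 1 arrives              */
    printf("%d\n", ev_condition_met(&er));   /* 0: still waiting for event 3      */
    ev_send(&er, 0x4);                       /* event 3 arrives                   */
    printf("%d\n", ev_condition_met(&er));   /* 1: AND condition now satisfied    */
    return 0;
}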
Event registers are typically used for unidirectional activity synchronization. It is unidirectional
because the issuer of the receive operation determines when activity synchronization should take
place. Pending events in the event register do not change the execution state of the receiving task.
In the following diagram, at the time task 1 sends the event X to task 2, there is no effect on the
execution state of task 2 if task 2 has not yet attempted to receive the event.
No data is associated with an event when events are sent through the event register. Other
mechanisms must be used when data needs to be conveyed along with an event. This lack of
associated data can sometimes create difficulties because of the noncumulative nature of events in
the event register. Therefore, the event register by itself is an inefficient mechanism if used
beyond simple activity synchronization.
Another difficulty in using an event register is that it does not have a built-in mechanism for
identifying the source of an event if multiple sources are possible. One way to overcome this
problem is for a task to divide the event bits in the event register into subsets.
The task can then associate each subset with a known source. In this way, the task can identify the
source of an event if each relative bit position of each subset is assigned to the same event type.
In Fig 4.13, an event register is divided into 4-bit groups. Each group is assigned to a source,
regardless of whether it is a task or an ISR. Each bit of the group is assigned to an event type.
4.5) SEMAPHORE
Semaphores are integer variables that are used to solve the critical section problem by using two
atomic operations wait and signal that are used for process synchronization.
The definitions of wait and signal are as follows −
Wait
The wait operation decrements the value of its argument S if it is positive. If S is zero or
negative, the calling process waits (busy waits in the definition below) until S becomes positive,
and only then decrements it.
wait(S)
{
    while (S <= 0)
        ;
    S--;
}
OR
P(Semaphore S)
{
    while (S <= 0)
        ;
    S--;
}
Signal
The signal operation increments the value of its argument S.
signal(S)
{
S++;
}
OR
V(Semaphore S)
{
S++;
}
PROCESS P
    // Some code
    P(s);
    // Critical section
    V(s);
    // Remainder section
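As a concrete illustration of the P()/V() pattern above, the following is a minimal sketch using POSIX unnamed semaphores with pthreads; the worker() function and the shared_counter variable are assumed names introduced only for this example.

#include <semaphore.h>
#include <pthread.h>
#include <stdio.h>

sem_t s;
int shared_counter = 0;          /* stands in for any shared resource */

void *worker(void *arg)
{
    sem_wait(&s);                /* P(s): enter the critical section  */
    shared_counter++;            /* critical section                  */
    sem_post(&s);                /* V(s): leave the critical section  */
    return NULL;                 /* remainder section                 */
}

int main(void)
{
    pthread_t t1, t2;
    sem_init(&s, 0, 1);          /* initial value 1 (binary use)      */
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("%d\n", shared_counter);   /* always 2 */
    sem_destroy(&s);
    return 0;
}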
There are two main types of semaphores, i.e. counting semaphores and binary semaphores; these
are discussed in detail in a later section (Section 5.4.2). Semaphores offer the following advantages:
Semaphores allow only one process into the critical section at a time. They follow the
mutual exclusion principle strictly and are much more efficient than some other
methods of synchronization.
There is no resource wastage because of busy waiting in semaphores as processor
time is not wasted unnecessarily to check if a condition is fulfilled to allow a
process to access the critical section.
Semaphores are implemented in the machine independent code of the
microkernel. So they are machine independent.
There are four conditions applied on mutual exclusion. These are the following:
Mutual exclusion must be guaranteed between the different processes when accessing
the shared resource. There cannot be two processes within their respective critical
sections at any time.
No assumptions should be made as to the relative speed of the conflicting processes.
No process that is outside its critical section should interrupt another for access to the
critical section.
When more than one process wants to enter its critical section, it must be granted entry
in a finite time, that is, it will never be kept waiting in a loop that has no end.
The term 'task' refers to something that needs to be done. In our day-to-day life, we are bound to
the execution of a number of tasks. The task can be the one assigned by our managers or the one
assigned by our professors/teachers or the one related to our personal or family needs. In addition,
we will have an order of priority and a schedule/timeline for executing these tasks. In the operating
system context, a task is defined as the program in execution and the related information
maintained by the operating system for the program. Task is also known as 'Job' in the operating
system context. A program or part of it in execution is also called a 'Process'. The terms 'Task',
'Job' and 'Process' refer to the same entity in the operating system context and most often they are
used interchangeably.
4.7.1) PROCESS
A „Process‟ is a program, or part of it, in execution. Process is also known as an instance of a
program in execution. Multiple instances of the same program can execute simultaneously. A
process requires various system resources like CPU for executing the process, memory for storing
the code corresponding to the process and associated variables, I/O devices for information
exchange, etc. A process is sequential in execution.
A process holds a set of registers, process status, a Program Counter (PC) to point to the next executable
instruction of the process, a stack for holding the local variables associated with the process
and the code corresponding to the process. This can be visualized as shown in Fig. 4.14. A process
which inherits all the properties of the CPU can be considered as a virtual processor, awaiting its
turn to have its properties switched into the physical processor. When the process gets its turn, its
registers and the program counter register become mapped to the physical registers of the
CPU. From a memory perspective, the memory occupied by the process is segregated into three
regions, namely, Stack memory, Data memory and Code memory (Fig. 4.15).
The „Stack‟ memory holds all temporary data such as variables local to the process. Data memory
holds all global data for the process. The code memory contains the program code (instructions)
corresponding to the process. On loading a process into the main memory, a specific area of
memory is allocated for the process. The stack memory usually starts (OS Kernel implementation
dependent) at the highest memory address of the memory area allocated for the process. For
example, if the memory area allocated for the process spans addresses 2048 to 2100, the stack
memory starts at address 2100 and grows downwards to accommodate the variables local to the
process.
The state at which a process is being created is referred to as 'Created State'. The Operating
System recognises a process in the 'Created State' but no resources are allocated to the process.
The state where a process is incepted into the memory and is awaiting the processor time for
execution is known as 'Ready State'. At this stage, the process is placed in the 'Ready list' queue
maintained by the OS. The state wherein the source code instructions corresponding to the process
are being executed is called 'Running State'. Running state is the state at which the process
execution happens. 'Blocked State/Wait State' refers to a state where a running process is
temporarily suspended from execution and does not have immediate access to resources. The
blocked state might be invoked by various conditions like: the process enters a wait state for an
event to occur (e.g. waiting for user inputs such as keyboard input) or waits for getting access to a
shared resource (discussed in a later section of this chapter). A state where the process completes
its execution is known as 'Completed State'.
The transition of a process from one state to another is known as 'State transition'. When a process
changes its state from Ready to Running, from Running to Blocked or Terminated, or from
Blocked to Running, the CPU allocation for the process may also change. It should be noted that
the state representation for a process/task mentioned here is a generic representation. The states
associated with a task may be known by a different name, or there may be more or fewer states
than those explained here, under different OS kernels. For example, under the VxWorks kernel,
the tasks may be in either one or a specific combination of the states READY, PEND, DELAY and
SUSPEND.
The PEND state represents a state where the task/process is blocked on waiting for I/O or system
resource.
The DELAY state represents a state in which the task/process is sleeping and the SUSPEND state
represents a state where a task/process is temporarily suspended from execution and not available
for execution. Under MicroC/OS-II kernel, the tasks may be in one of the states, DORMANT,
READY, RUNNING, WAITING or INTERRUPTED. The DORMANT state represents the
„Created‟ state and WAITING state represents the state in which a process waits for shared
resource or I/O access.
Process management deals with the creation of a process, setting up the memory space for the
process, loading the process‟s code into the memory space, allocating system resources, setting up
a Process Control Block (PCB) for the process and process termination/deletion.
4.7.5) THREADS
A thread is the primitive that can execute code. A thread is a single sequential flow of control
within a process. „Thread‟ is also known as lightweight process. A process can have many
threads of execution. Different threads, which are part of a process, share the same address space;
meaning they share the data memory, code memory and heap memory area. Threads maintain
their own thread status (CPU register values), Program Counter (PC) and stack. The memory
model for a process and its associated threads is given in Fig. 4.17.
The multithreaded architecture of a process can be better visualised with the thread-process diagram
shown in Fig. 4.18.
If the process is split into multiple threads, each of which executes a portion of the process, there
will be a main thread and the rest of the threads will be created within the main thread. Use of
multiple threads to execute a process brings the following advantages (a minimal pthread sketch is
given after the list below).
Better memory utilisation. Multiple threads of the same process share the address
space for data memory. This also reduces the complexity of inter thread
communication since variables can be shared across the threads.
Since the process is split into different threads, when one thread enters a wait
state, the CPU can be utilised by the other threads of the process that do not depend
on the event for which the waiting thread is blocked. This speeds up the execution
of the process.
Efficient CPU utilisation. The CPU is engaged all the time.
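The pthread sketch below illustrates the thread-process relation described above: the main thread creates two worker threads, and all of them share the same data memory while each keeps its own stack. The worker() function, the shared_count variable and the use of a mutex for safe access are assumptions for illustration, not taken from the text.

#include <pthread.h>
#include <stdio.h>

int shared_count = 0;                         /* shared data memory of the process */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg)
{
    int id = *(int *)arg;                     /* local variable: lives on this thread's own stack */
    pthread_mutex_lock(&lock);
    shared_count++;                           /* safe access to the shared data */
    pthread_mutex_unlock(&lock);
    printf("Thread %d done\n", id);
    return NULL;
}

int main(void)                                /* main thread of the process */
{
    pthread_t t1, t2;
    int id1 = 1, id2 = 2;
    pthread_create(&t1, NULL, worker, &id1);
    pthread_create(&t2, NULL, worker, &id2);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("shared_count = %d\n", shared_count);
    return 0;
}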
The terms multiprocessing and multitasking are a little confusing and sound alike. In the
operating system context multiprocessing describes the ability to execute multiple processes
simultaneously. Systems which are capable of performing multiprocessing are known as
multiprocessor systems. Multiprocessor systems possess multiple CPUs and can execute multiple
processes simultaneously. The ability of the operating system to have multiple programs in
memory, which are ready for execution, is referred as multiprogramming. In a uniprocessor
system, it is not possible to execute multiple processes simultaneously. However, it is possible for
a uniprocessor system to achieve some degree of pseudo parallelism in the execution of multiple
processes by switching the execution among different processes. The ability of an operating
system to hold multiple processes in memory and switch the processor (CPU) from executing one
process to another process is known as multitasking. Multitasking creates the illusion of multiple
tasks executing in parallel. Multitasking involves the switching of CPU from executing one task to
another. In a multitasking environment, when task/process switching happens, the virtual processor
(task/process) gets its properties converted into that of the physical processor. The switching of the
virtual processor to physical processor is controlled by the scheduler of the OS kernel. Whenever a
CPU switching happens, the current context of execution should be saved to retrieve it at a later
point of time when the CPU executes the process, which is interrupted currently due to execution
switching. The context saving and retrieval is essential for resuming a process exactly from the
point where it was interrupted due to CPU switching. The act of switching CPU among the
processes or changing the current execution context is known as „Context switching‟. The act of
saving the current context which contains the context details (Register details, memory details,
system resource usage details, execution details, etc.) for the currently running process at the time
of CPU switching is known as „ Context saving‟.
The process of retrieving the saved context details for a process, which is going to be
executed due to CPU switching, is known as „Context retrieval‟. Multitasking involves „Context
switching‟ (Fig. 5.1), „Context saving‟ and „Context retrieval‟.
Toss Juggling The skilful object manipulation game is a classic real world example for the
multitasking illusion. The juggler uses a number of objects (balls, rings, etc.) and throws them up
and catches them. At any point of time, he throws only one ball and catches only one per hand.
However, the speed at which he switches the balls between throwing and catching creates, for the
spectators, the illusion that he is throwing and catching multiple balls simultaneously or using
more than two hands.
As we discussed earlier, multitasking involves the switching of execution among multiple tasks.
Depending on how the switching act is implemented, multitasking can be classified into different
types. The following section describes the various types of multitasking existing in the Operating
System‟s context.
Co-operative multitasking is the most primitive form of multitasking in which a task/process gets a
chance to execute only when the currently executing task/process voluntarily relinquishes the
CPU. In this method, any task/process can hold the CPU for as much time as it wants. Since this type
of implementation depends on the tasks being at the mercy of each other to get CPU time for
execution, it is known as co-operative multitasking. If the currently executing task is non-
cooperative, the other tasks may have to wait for a long time to get the CPU.
Preemptive multitasking ensures that every task/process gets a chance to execute. When and how
much time a process gets is dependent on the implementation of the preemptive scheduling. As the
name indicates, in preemptive multitasking, the currently running task/process is preempted to give
a chance to other tasks/process to execute. The preemption of task may be based on time slots or
task/process priority.
In non-preemptive multitasking, the process/task, which is currently given the CPU time, is
allowed to execute until it terminates (enters the „Completed‟ state) or enters the „Blocked/Wait‟
state, waiting for an I/O or system resource. The co-operative and non-preemptive multitasking
differ in their behaviour when they are in the 'Blocked/Wait' state. In co-operative multitasking,
the currently executing process/task need not relinquish the CPU when it enters the „Blocked/Wait‟
state, waiting for an I/O, or a shared resource access or an event to occur whereas in non-
preemptive multitasking the currently executing task relinquishes the CPU when it waits for an I/O
or system resource or an event to occur.
As we already discussed, multitasking involves the execution switching among the different tasks.
There should be some mechanism in place to share the CPU among the different tasks and to
decide which process/task is to be executed at a given point of time. Determining which
task/process is to be executed at a given point of time is known as task/process scheduling. Task
scheduling forms the basis of multitasking. Scheduling policies form the guidelines for
determining which task is to be executed when. The scheduling policies are implemented in an
algorithm and it is run by the kernel as a service. The kernel service/application, which implements
the scheduling algorithm, is known as 'Scheduler'. The process scheduling decision may take
place when a process switches its state to:
1. 'Ready' state from 'Running' state (Scenario 1)
2. 'Blocked/Wait' state from 'Running' state (Scenario 2)
3. 'Ready' state from 'Blocked/Wait' state (Scenario 3)
4. 'Completed' state (Scenario 4)
A process switches to „Ready‟ state from the „Running‟ state when it is preempted. Hence, the type
of scheduling in scenario 1 is pre-emptive. When a high priority process in the „Blocked/Wait‟
state completes its I/O and switches to the „Ready‟ state, the scheduler picks it for execution if the
scheduling policy used is priority based preemptive. This is indicated by scenario 3. In
preemptive/non-preemptive multitasking, the process relinquishes the CPU when it enters the
„Blocked/Wait‟ state or the „Completed‟ state and switching of the CPU happens at this stage.
Scheduling under scenario 2 can be either preemptive or non-preemptive. Scheduling under
scenario 4 can be preemptive, non-preemptive or co-operative. The selection of a scheduling
criterion/algorithm should consider the following factors:
CPU Utilisation: The scheduling algorithm should always make the CPU utilisation high. CPU
utilisation is a direct measure of how much percentage of the CPU is being utilised.
Throughput: This gives an indication of the number of processes executed per unit of time. The
throughput for a good scheduler should always be high.
Turnaround Time: It is the amount of time taken by a process for completing its execution. It
includes the time spent by the process for waiting for the main memory, time spent in the ready
queue, time spent on completing the I/O operations, and the time spent in execution. The
turnaround time should be minimal for a good scheduling algorithm.
Waiting Time: It is the amount of time spent by a process in the „Ready‟ queue waiting to get the
CPU time for execution. The waiting time should be minimal for a good scheduling algorithm.
Response Time: It is the time elapsed between the submission of a process and the first response.
For a good scheduling algorithm, the response time should be as short as possible.
The Operating System maintains various queues in connection with the CPU scheduling, and a
process passes through these queues during the course of its admittance to execution completion.
The various queues maintained by OS in association with CPU scheduling are:
Job Queue: The job queue contains all the processes in the system.
Ready Queue: Contains all the processes, which are ready for execution and waiting for CPU to
get their turn for execution. The Ready queue is empty when there is no process ready for running.
Device Queue: Contains the set of processes, which are waiting for an I/O device.
A process migrates through all these queues during its journey from „Admitted‟ to „Completed‟
stage. The following diagrammatic representation (Fig. 5.2) illustrates the transition of a process
through the various queues.
Based on the scheduling algorithm used, the scheduling can be classified into the following
categories.
As the name indicates, the First-Come-First-Served (FCFS) scheduling algorithm allocates CPU
time to the processes based on the order in which they enter the „Ready‟ queue. The first entered
process is serviced first. It is the same as any real-world application where queue systems are used;
e.g. a ticket reservation system where people need to stand in a queue and the first person
standing in the queue is serviced first. FCFS scheduling is also known as First In First Out (FIFO)
where the process which is put first into the „Ready‟ queue is serviced first.
The Last-Come-First Served (LCFS) scheduling algorithm also allocates CPU time to the
processes based on the order in which they are entered in the „Ready‟ queue. The last entered
process is serviced first. LCFS scheduling is also known as Last In First Out (LIFO) where the
process, which is put last into the „Ready‟ queue, is serviced first.
The Shortest Job First (SJF) scheduling algorithm sorts the 'Ready' queue each time a process
relinquishes the CPU (either the process terminates or enters the „Wait‟ state waiting for I/O or
system resource) to pick the process with shortest (least) estimated completion/run time. In SJF,
the process with the shortest estimated run time is scheduled first, followed by the next shortest
process, and so on.
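As a short illustration (the process set and the estimated run times below are assumed, not from the text): suppose processes P1, P2 and P3 enter the 'Ready' queue at the same time with estimated run times of 10 ms, 5 ms and 7 ms respectively. Non-preemptive SJF executes them in the order P2, P3, P1, giving waiting times of 0 ms, 5 ms and 12 ms and an average waiting time of (0 + 5 + 12)/3 ≈ 5.67 ms, whereas the FCFS order P1, P2, P3 would give an average waiting time of (0 + 10 + 15)/3 ≈ 8.33 ms.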
The Turn Around Time (TAT) and waiting time for processes in non-preemptive scheduling varies
with the type of scheduling algorithm. Priority based non-preemptive scheduling algorithm ensures
that a process with high priority is serviced at the earliest compared to other low priority processes
in the „Ready‟ queue. The priority of a task/process can be indicated through various mechanisms.
The Shortest Job First (SJF) algorithm can be viewed as a priority based scheduling where each
task is prioritised in the order of the time required to complete the task. The lower the time
required for completing a process, the higher is its priority in the SJF algorithm. Another way of
assigning priority is associating a priority with the task/process at the time of its creation.
The priority is a number ranging from 0 to the maximum priority supported by the
OS. The maximum level of priority is OS dependent. For Example, Windows CE supports 256
levels of priority (0 to 255 priority numbers). While creating the process/task, the priority can be
assigned to it. The priority number associated with a task/process is the direct indication of its
priority. The priority variation from high to low is represented by numbers from 0 to the maximum
priority or by numbers from maximum priority to 0. For Windows CE operating system a priority
number 0 indicates the highest priority and 255 indicates the lowest priority. This convention need
not be universal and it depends on the kernel level implementation of the priority structure. The
non-preemptive priority based scheduler sorts the „Ready‟ queue based on priority and picks the
process with the highest level of priority for execution.
The non-preemptive SJF scheduling algorithm sorts the „Ready‟ queue only after completing the
execution of the current process or when the process enters „Wait‟ state, whereas the preemptive
SJF scheduling algorithm sorts the „Ready‟ queue when a new process enters the „Ready‟ queue
and checks whether the execution time of the new process is shorter than the remaining estimated
execution time of the currently executing process. If the execution time of the new process is less,
the currently executing process is preempted and the new process is scheduled for execution. Thus
preemptive SJF scheduling always compares the estimated completion time of a process newly
entered into the 'Ready' queue (which is the same as its remaining time, since it has not yet
executed) with the
remaining time for completion of the currently executing process and schedules the process with
shortest remaining time for execution. Preemptive SJF scheduling is also known as Shortest
Remaining Time (SRT) scheduling.
The term Round Robin is very popular among the sports and games activities. You might have
heard about „Round Robin‟ league or „Knock out‟ league associated with any football or cricket
tournament. In the „Round Robin‟ league each team in a group gets an equal chance to play against
the rest of the teams in the same group whereas in the „Knock out‟ league the losing team in a
match moves out of the tournament. In the process scheduling context also, „Round Robin‟ brings
the same message “Equal chance to all”. In Round Robin scheduling, each process in the „Ready‟
queue is executed for a pre-defined time slot. The execution starts with picking up the first process
in the „Ready‟ queue (see Fig. 5.3). It is executed for a pre-defined time and when the pre-defined
time elapses or the process completes (before the pre-defined time slice), the next process in the
„Ready‟ queue is selected for execution. This is repeated for all the processes in the „Ready‟ queue.
Once each process in the „Ready‟ queue is executed for the pre-defined time period, the scheduler
comes back and picks the first process in the „Ready‟ queue again for execution. The sequence is
repeated. This reveals that the Round Robin scheduling is similar to the FCFS scheduling and the
only difference is that time slice based preemption is added to switch the execution between the
processes in the „Ready‟ queue. The „Ready‟ queue can be considered as a circular queue in which
the scheduler picks up the first process for execution and moves to the next till the end of the
queue and then comes back to the beginning of the queue to pick up the first process.
The time slice is provided by the timer tick feature of the time management unit of the OS kernel
(Refer the Time management section under the subtopic „The Real-Time kernel‟ for more details
on Timer tick). Time slice is kernel dependent and it varies in the order of a few microseconds to
milliseconds. Certain OS kernels may allow the time slice as user configurable. Round Robin
scheduling ensures that every process gets a fixed amount of CPU time for execution. When a
process gets its turn for execution is determined by the FCFS policy (that is, the process
entering the 'Ready' queue first gets its time slice first, and so on). If a process
terminates before the elapse of the time slice, the process releases the CPU voluntarily and the next
process in the queue is scheduled for execution by the scheduler. The implementation of RR
scheduling is kernel dependent.
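As a short illustration (the numbers below are assumed, not from the text): let processes P1, P2 and P3 have estimated run times of 6 ms, 4 ms and 2 ms and let the time slice be 2 ms. The execution sequence is P1, P2, P3, P1, P2, P1; P3 completes at 6 ms, P2 at 10 ms and P1 at 12 ms, giving an average turnaround time of (6 + 10 + 12)/3 ≈ 9.33 ms and an average waiting time of (4 + 6 + 6)/3 ≈ 5.33 ms.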
5.3.1) RACING
Racing or Race condition is the situation in which multiple processes compete (race) with each
other to access and manipulate shared data concurrently. In a Race condition the final value of the
shared data depends on the process which acted on the data last.
5.3.2) DEADLOCK
A race condition produces incorrect results whereas a deadlock condition creates a situation where
none of the processes are able to make any progress in their execution, resulting in a set of
deadlocked processes. A situation very similar to our traffic jam issues in a junction as illustrated
in Fig. 5.4.
In its simplest form „deadlock‟ is the condition in which a process is waiting for a resource held by
another process which is waiting for a resource held by the first process (Fig. 10.25). To elaborate:
Process A holds a resource x and it wants a resource y held by Process B. Process B is currently
holding resource y and it wants the resource x which is currently held by Process A. Both hold
their respective resources and compete with each other to get the resource held by the other
process. The result of the competition is 'deadlock'. None of the competing processes will be able
to access the resources held by other processes since they are locked by the respective processes (If
a mutual exclusion policy is implemented for shared resource access, the resource is locked by the
process which is currently accessing it).
Mutual Exclusion: The criteria that only one process can hold a resource at a time. Meaning
processes should access shared resources with mutual exclusion. Typical example is the accessing
of display hardware in an embedded device.
Hold and Wait: The condition in which a process holds a shared resource by acquiring the lock
controlling the shared access and waiting for additional resources held by other processes.
No Resource Preemption: The criteria that operating system cannot take back a resource from a
process which is currently holding it and the resource can only be released voluntarily by the
process holding it.
Circular Wait: A process is waiting for a resource which is currently held by another process
which in turn is waiting for a resource held by the first process. In general, there exists a set of
waiting processes P0, P1, ..., Pn, where P0 is waiting for a resource held by P1, P1 is waiting for a
resource held by P2, ..., Pn-1 is waiting for a resource held by Pn, and Pn is waiting for a resource
held by P0. This forms a circular wait queue.
„Deadlock‟ is a result of the combined occurrence of these four conditions listed above. These
conditions were first described by E. G. Coffman in 1971 and are popularly known as the Coffman
conditions.
Deadlock Handling: A smart OS may foresee the deadlock condition and will act proactively to
avoid such a situation. Now if a deadlock occurs, how does the OS respond to it? The reaction to a
deadlock condition is not uniform across operating systems. The OS may adopt any of the following techniques to
detect and prevent deadlock conditions.
Ignore Deadlocks: Always assume that the system design is deadlock free. This is acceptable when
the cost of removing a deadlock is large compared to the likelihood of a deadlock happening.
UNIX is an example of an OS following this principle. A life critical system cannot
pretend that it is deadlock free for any reason.
Detect and Recover: This approach suggests the detection of a deadlock situation and recovery
from it. This is similar to the deadlock condition that may arise at a traffic junction. When the
vehicles from different directions compete to cross the junction, a deadlock (traffic jam) condition
results. Once a deadlock (traffic jam) has happened at the junction, the only solution is to back up
the vehicles from one direction and allow the vehicles from opposite direction to cross the
junction. If the traffic is too high, lots of vehicles may have to be backed up to resolve the traffic
jam. This technique is also known as „back up cars‟ technique (Fig. 5.6).
Operating systems keep a resource graph in their memory. The resource graph is updated on each
resource request and release. A deadlock condition can be detected by analysing the resource graph
by graph analyser algorithms. Once a deadlock condition is detected, the system can terminate a
process or preempt the resource to break the deadlocking cycle.
Avoid Deadlocks: Deadlock is avoided by the careful resource allocation techniques by the
Operating System. It is similar to the traffic light mechanism at junctions to avoid the traffic jams.
Prevent Deadlocks: Prevent the deadlock condition by negating one of the four conditions
favouring the deadlock situation.
Ensure that a process does not hold any other resources when it requests a resource. This can be
achieved by implementing the following set of rules/guidelines in allocating resources to
processes.
1. A process must request all its required resources and the resources should be allocated before the
process begins its execution.
2. Grant resource allocation requests from processes only if the process does not hold a resource
currently.
Ensure that resource preemption (resource releasing) is possible at operating system level. This
can be achieved by implementing the following set of rules/guidelines in resources allocation
and releasing.
1. Release all the resources currently held by a process if a request made by the process for a new
resource cannot be fulfilled immediately.
2. Add the resources which are preempted (released) to a resource list describing the resources
which the process requires to complete its execution.
3. Reschedule the process for execution only when the process gets its old resources and the new
resource which is requested by the process.
Imposing these criteria may introduce negative impacts like low resource utilisation and
starvation of processes.
Livelock: The Livelock condition is similar to the deadlock condition except that a process in
livelock condition changes its state with time. While in deadlock a process enters in wait state for a
resource and continues in that state forever without making any progress in the execution, in a
livelock condition a process always does something but is unable to make any progress in the
execution completion. The livelock condition is better explained with the real world example, two
people attempting to cross each other in a narrow corridor. Both the persons move towards each
side of the corridor to allow the opposite person to cross. Since the corridor is narrow, none of
them is able to cross. Here both of the persons perform some action but still they are
unable to achieve their target of crossing each other. The livelock scenario is made clearer
in a later section of this chapter, The Dining Philosophers' Problem.
Starvation: In the multitasking context, starvation is the condition in which a process does not get
the resources required to continue its execution for a long time. As time progresses, the process
starves for resources. Starvation may arise from various conditions, such as a byproduct of
deadlock prevention measures, scheduling policies favouring high priority tasks and tasks with the
shortest execution time, etc.
In the classic Dining Philosophers' problem, a number of philosophers sit around a round table with
a plate of spaghetti in front of each and one fork placed between each pair of adjacent philosophers;
a philosopher needs both the left and the right fork to eat.
Scenario 1: All the philosophers engage in brainstorming together and try to eat together. Each
philosopher picks up the left fork and is unable to proceed since two forks are required for eating
the spaghetti present in the plate. Philosopher 1 thinks that Philosopher 2 sitting to the right of
him/her will put the fork down and waits for it. Philosopher 2 thinks that Philosopher 3 sitting to
the right of him/her will put the fork down and waits for it, and so on. This forms a circular chain
of un-granted requests. If the philosophers continue in this state waiting for the fork from the
philosopher sitting to the right of each, they will not make any progress in eating and this will
result in starvation of the philosophers and deadlock.
Scenario 2: All the philosophers start brainstorming together. One of the philosophers is hungry
and he/she picks up the left fork. When the philosopher is about to pick up the right fork, the
philosopher sitting to his right also becomes hungry and tries to grab the left fork, which is the right
fork of his neighbouring philosopher who is trying to lift it, resulting in a 'Race condition'.
Scenario 3: All the philosophers engage in brainstorming together and try to eat together. Each
philosopher picks up the left fork and is unable to proceed, since two forks are required for eating
the spaghetti present in the plate. Each of them anticipates that the adjacently sitting philosopher
will put his/her fork down and waits for a fixed duration and after this puts the fork down. Each of
them again tries to lift the fork after a fixed duration of time. Since all philosophers are trying to
lift the fork at the same time, none of them will be able to grab two forks. This condition leads to
livelock and starvation of philosophers, where each philosopher tries to do something, but they are
unable to make any progress in achieving the target. Figure 5.8 illustrates these scenarios.
Fig. 5.8: The ‘Real Problems’ in the ‘Dining Philosophers problem’ (a) Starvation and Deadlock
(b) Racing (c) Livelock and Starvation
Solution: We need to find out alternative solutions to avoid the deadlock, livelock, racing and
starvation condition that may arise due to the concurrent access of forks by philosophers. This
situation can be handled in many ways by allocating the forks in different allocation techniques
including Round Robin allocation, FIFO allocation, etc. But the requirement is that the solution
should be optimal, avoiding deadlock and starvation of the philosophers and allowing maximum
number of philosophers to eat at a time. One solution that we could think of is:
Imposing rules on how the philosophers access the forks, like: a philosopher should put down the
fork he/she already has in hand (the left fork) after waiting for a fixed duration for the second fork
(the right fork), and should wait for a fixed time before making the next attempt. This solution
works fine to some extent, but if all the philosophers try to lift the forks at the same time, a
livelock situation results.
Another solution which gives maximum concurrency that can be thought of is each
philosopher acquires a semaphore (mutex) before picking up any fork. When a philosopher feels
hungry he/she checks whether the philosopher sitting to the left and right of him is already using
the fork, by checking the state of the associated semaphore. If the forks are in use by the
neighbouring philosophers, the philosopher waits till the forks are available. A philosopher when
finished eating puts the forks down and informs the philosophers sitting to his/her left and right,
who are hungry (waiting for the forks), by signalling the semaphores associated with the forks. In
the operating system context, the dining philosophers represent the processes and forks represent
the resources. The dining philosophers‟ problem is an analogy of processes competing for shared
resources and the different problems like racing, deadlock, starvation and livelock arising from the
competition.
Producer-Consumer problem is a common data sharing problem where two processes concurrently
access a shared buffer with fixed size. A thread/process which produces data is called „Producer
thread/process‟ and a thread/process which consumes the data produced by a producer
thread/process is known as „Consumer thread/process‟. Imagine a situation where the producer
thread keeps on producing data and puts it into the buffer and the consumer thread keeps on
consuming the data from the buffer and there is no synchronisation between the two. There may be
chances in which the producer produces data at a faster rate than the rate at which it is
consumed by the consumer. This will lead to „buffer overrun‟ where the producer tries to put data
to a full buffer. If the consumer consumes data at a faster rate than the rate at which it is produced
by the producer, it will lead to the situation „buffer under-run‟ in which the consumer tries to read
from an empty buffer. Both of these conditions will lead to inaccurate data and data loss. The
producer-consumer problem can be solved in various ways. One simple solution is 'sleep
and wake-up', which can be implemented using various process synchronisation
techniques like semaphores, mutexes, monitors, etc.
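The sketch below shows one possible 'sleep and wake-up' implementation of the bounded buffer using POSIX counting semaphores and pthreads. The buffer size of 8, the 100 items and the function names are assumptions made only for this illustration.

#include <pthread.h>
#include <semaphore.h>

#define N 8                        /* fixed buffer size */
int buffer[N];
int in = 0, out = 0;

sem_t empty_slots;                 /* counts free slots   (initialised to N) */
sem_t full_slots;                  /* counts filled slots (initialised to 0) */
sem_t mutex;                       /* protects the buffer indices            */

void *producer(void *arg)
{
    for (int item = 0; item < 100; item++) {
        sem_wait(&empty_slots);    /* sleep if the buffer is full            */
        sem_wait(&mutex);
        buffer[in] = item;  in = (in + 1) % N;
        sem_post(&mutex);
        sem_post(&full_slots);     /* wake up a waiting consumer             */
    }
    return NULL;
}

void *consumer(void *arg)
{
    for (int i = 0; i < 100; i++) {
        sem_wait(&full_slots);     /* sleep if the buffer is empty           */
        sem_wait(&mutex);
        int item = buffer[out];  out = (out + 1) % N;
        sem_post(&mutex);
        sem_post(&empty_slots);    /* wake up a waiting producer             */
        (void)item;                /* consume the item                       */
    }
    return NULL;
}

int main(void)
{
    pthread_t p, c;
    sem_init(&empty_slots, 0, N);
    sem_init(&full_slots, 0, 0);
    sem_init(&mutex, 0, 1);
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}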
The Readers-Writers problem is a common issue observed in processes competing for limited
shared resources. The Readers-Writers problem is characterised by multiple processes trying to
read and write shared data concurrently. A typical real-world example for the Readers-Writers
problem is the banking system where one process tries to read the account information like
available balance and the other process tries to update the available balance for that account. This
may result in inconsistent results. If multiple processes try to read a shared data concurrently it
may not create any impacts, whereas when multiple processes try to write and read concurrently it
will definitely create inconsistent results. Proper synchronisation techniques should be applied to
avoid the readers-writers problem.
Priority inversion is the byproduct of the combination of blocking based (lock based) process
synchronization and pre-emptive priority scheduling. „Priority inversion‟ is the condition in which
a high priority task needs to wait for a low priority task to release a resource which is shared
between the high priority task and the low priority task, while a medium priority task which doesn't
require the shared resource continues its execution by preempting the low priority task (Fig. 5.9).
Priority based preemptive scheduling technique ensures that a high priority task is always executed
first, whereas the lock based process synchronisation mechanism (like mutex, semaphore, etc.)
ensures that a process will not access a shared resource, which is currently in use by another
process. The synchronisation technique is only interested in avoiding conflicts that may arise due
to the concurrent access of the shared resources and not at all bothered about the priority of the
process which tries to access the shared resource. In fact, the priority based preemption and lock
based synchronisation are the two contradicting OS primitives. Priority inversion is better
explained with the following scenario:
Let Process A, Process B and Process C be three processes with priorities High, Medium and Low
respectively. Process A and Process C share a variable „X‟ and the access to this variable is
synchronized through a mutual exclusion mechanism like Binary Semaphore S. Imagine a situation
where Process C is ready and is picked up for execution by the scheduler and „Process C‟ tries to
access the shared variable „X‟. „Process C‟ acquires the „Semaphore S‟ to indicate the other
processes that it is accessing the shared variable „X‟. Immediately after „Process C‟ acquires the
„Semaphore S‟, „Process B‟ enters the „Ready‟ state. Since „Process B‟ is of higher priority
compared to „Process C‟, „Process C‟ is preempted and „Process B‟ starts executing. Now imagine
„Process A‟ enters the „Ready‟ state at this stage. Since „Process A‟ is of higher priority than
„Process B‟, „Process B‟ is preempted and „Process A‟ is scheduled for execution. „Process A
involves accessing of shared variable „X‟ which is currently being accessed by „Process C‟. Since
„Process C‟ acquired the semaphore for signalling the access of the shared variable „X‟, „Process
A‟ will not be able to access it. Thus „Process A‟ is put into blocked state (This condition is called
Pending on resource). Now „Process B‟ gets the CPU and it continues its execution until it
relinquishes the CPU voluntarily or enters a wait state or preempted by another high priority task.
The highest priority process „Process A‟ has to wait till „Process C‟ gets a chance to execute and
release the semaphore. This produces unwanted delay in the execution of the high priority task
which is supposed to be executed immediately when it was „Ready‟. Priority inversion may be
sporadic in nature but can lead to potential damages as a result of missing critical deadlines.
Literally speaking, priority inversion „inverts‟ the priority of a high priority task with that of a low
priority task. Proper workaround mechanism should be adopted for handling the priority inversion
problem. The commonly adopted priority inversion workarounds are:
Priority Inheritance: A low-priority task that is currently accessing (by holding the lock) a shared
resource requested by a high-priority task temporarily „inherits‟ the priority of that high-priority
task, from the moment the high-priority task raises the request. Boosting the priority of the low
priority task to that of the task which requested the shared resource held by it eliminates the
preemption of the low priority task by other tasks whose priorities are below that of the requesting
task, and thereby reduces the delay the high priority task experiences in waiting for the requested
resource. The priority of the low priority task which is
temporarily boosted to high is brought to the original value when it releases the shared resource.
Implementation of Priority inheritance workaround in the priority inversion problem discussed for
Process A, Process B and Process C example will change the execution sequence as shown in Fig.
5.10
Priority inheritance is only a workaround; it does not eliminate the waiting of the high priority
task to get the resource from the low priority task. The only thing is that it helps the low priority
task to continue its execution and release the shared resource as soon as possible. The moment, at
which the low priority task releases the shared resource, the high priority task kicks the low
priority task out and grabs the CPU – A true form of selfishness☺. Priority inheritance handles
priority inversion at the cost of run-time overhead at the scheduler. It imposes the overhead of
checking the priorities of all tasks which try to access shared resources and of adjusting their
priorities dynamically.
Priority Ceiling: In „Priority Ceiling‟, a priority is associated with each shared resource. The
priority associated to each resource is the priority of the highest priority task which uses this
shared resource. This priority level is called „ceiling priority‟. Whenever a task accesses a shared
resource, the scheduler elevates the priority of the task to that of the ceiling priority of the
resource. If the task which accesses the shared resource is a low priority task, its priority is
temporarily boosted to the priority of the highest priority task to which the resource is also shared.
This eliminates the pre-emption of the task by other medium priority tasks leading to priority
inversion. The priority of the task is brought back to the original level once the task completes the
accessing of the shared resource. „Priority Ceiling‟ brings the added advantage of sharing
resources without the need for synchronisation techniques like locks. Since the priority of the task
accessing a shared resource is boosted to the highest priority among the tasks which share the
resource, the concurrent access of the shared resource is automatically handled. Another advantage of
„Priority Ceiling‟ technique is that all the overheads are at compile time instead of run-time.
Implementation of „priority ceiling‟ workaround in the priority inversion problem discussed for
Process A, Process B and Process C example will change the execution sequence as shown in Fig.
5.11.
The biggest drawback of „Priority Ceiling‟ is that it may produce hidden priority inversion. With
'Priority Ceiling' technique, the priority of a task is always elevated regardless of whether another
task wants the shared resource. This unnecessary priority elevation always boosts the priority of a low
priority task to that of the highest priority tasks among which the resource is shared and other tasks
with priorities higher than that of the low priority task are not allowed to preempt the low priority
task when it is accessing a shared resource. This always gives the low priority task the luxury of
running at high priority when accessing shared resources☺.
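The two workarounds map to standard mutex options on POSIX systems, as sketched below. Availability of these protocols is implementation dependent, an RTOS may expose them through its own mutex/semaphore options instead, and the ceiling value of 90 is an assumed number used only for illustration.

#include <pthread.h>

pthread_mutex_t shared_x_lock;     /* protects the shared variable 'X' */

void create_locks(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);

    /* Priority Inheritance: the holder temporarily inherits the priority of the
       highest priority task blocked on this mutex. */
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);

    /* Alternatively, Priority Ceiling: the holder always runs at the ceiling
       priority associated with the mutex (90 is an assumed value).
       pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_PROTECT);
       pthread_mutexattr_setprioceiling(&attr, 90);                  */

    pthread_mutex_init(&shared_x_lock, &attr);
    pthread_mutexattr_destroy(&attr);
}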
1. Avoiding conflicts (like racing, deadlock, starvation and priority inversion) in the concurrent
access of shared resources by processes, as discussed in the preceding sections.
2. Ensuring proper sequence of operation across processes. The producer-consumer problem is a
typical example for processes requiring a proper sequence of operation. In the producer-consumer
problem, accessing the shared buffer by different processes is not the issue; the issue is that the
producer process should write to the shared buffer only if the buffer is not full and the consumer
process should not read from the buffer if it is empty. Hence proper synchronisation should be
provided to implement this sequence of operations.
5.4.1) MUTEX
A mutex is similar to a binary semaphore, but a mutex has an owner. Mutexes are binary semaphores
that include a priority inheritance mechanism. Mutexes are the better choice for implementing simple
mutual exclusion (hence 'MUT'ual 'EX'clusion). A mutex allows exclusive access to the resource.
The long form is Mutual Exclusion Semaphore (semaphore value of 0 or 1, but the lock count can
be 0 or greater for recursive locking). A mutex is intended to protect a critical region. The main
difference is that a semaphore can be “waited for” and “signaled” by any task, while only the task
that has taken a mutex is allowed to release it.
5.4.1.1) Mutual Exclusion through Busy Waiting/ Spin Lock
The „Busy waiting‟ technique uses a lock variable for implementing mutual exclusion. Each
process/thread checks this lock variable before entering the critical section. The lock is set to „1‟
by a process/thread if the process/thread is already in its critical section; otherwise the lock is set to
„0‟. The major challenge in implementing the lock variable based synchronisation is the non-
availability of a single atomic instruction which combines the reading, comparing and setting of
the lock variable. Most often the three different operations related to the locks, viz. the operation of
Reading the lock variable, checking its present value and setting it are achieved with multiple low
level instructions. The low level implementation of these operations is dependent on the
underlying processor instruction set and the (cross) compiler in use.
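On processors and compilers that provide an atomic test-and-set primitive, the busy-waiting lock can be sketched as below using the C11 atomic_flag type; the function names are assumptions introduced for this illustration.

#include <stdatomic.h>

atomic_flag lock = ATOMIC_FLAG_INIT;   /* the lock variable, initially 0 (free) */

void enter_critical_section(void)
{
    /* atomic_flag_test_and_set() reads, checks and sets the lock variable as one
       indivisible operation; spin (busy wait) while it was already set. */
    while (atomic_flag_test_and_set(&lock))
        ;   /* busy wait */
}

void leave_critical_section(void)
{
    atomic_flag_clear(&lock);          /* set the lock back to 0 */
}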
The 'Busy waiting' mutual exclusion enforcement mechanism used by processes makes the CPU
always busy checking the lock to see whether the process can proceed. This results in the wastage of
CPU time and leads to high power consumption. This is not affordable in embedded systems
powered on battery, since it affects the battery backup time of the device. An alternative to „busy
waiting‟ is the „Sleep & Wakeup‟ mechanism. When a process is not allowed to access the critical
section, which is currently being locked by another process, the process undergoes „Sleep‟ and
enters the „blocked‟ state. The process which is blocked on waiting for access to the critical section
is awakened by the process which currently owns the critical section. The process which owns the
critical section sends a wakeup message to the process, which is sleeping as a result of waiting for
the access to the critical section, when the process leaves the critical section. The „Sleep &
Wakeup‟ policy for mutual exclusion can be implemented in different ways. Implementation of
this policy is OS kernel dependent. The following section describes the important techniques for
„Sleep & Wakeup‟ policy implementation for mutual exclusion by Windows NT/CE OS kernels.
5.4.2) SEMAPHORES
Literally, a semaphore is a system of sending messages by using flags. Multiple concurrent threads
of execution within an application must be able to synchronize their execution and co-ordinate
mutually exclusive access to shared resources.
To fulfil this requirement, the RTOS kernel provides a semaphore object that one or more threads of
execution can acquire or release for the purpose of synchronization or mutual exclusion.
A semaphore is like a key that allows a task to carry out some operation or to access a resource.
A kernel supports many different types of semaphores.
5.4.2.1) Binary Semaphore
Binary semaphores are used for both mutual exclusion and synchronization purposes. A binary
semaphore is used to control sharing a single resource between tasks. Fig 5.12 represent the
concept of binary semaphore.
5.4.2.2) Counting Semaphore
A counting semaphore maintains a count between zero and a maximum value, limiting the number
of processes/threads that can simultaneously access a shared resource. The count is decremented by
one when a process/thread acquires it and the count is incremented by one when a process/thread releases the
„Semaphore object‟. The state of the „Semaphore object‟ is set to non-signalled when the
semaphore is acquired by the maximum number of processes/threads that the semaphore can
support (i.e. when the count associated with the „Semaphore object‟ becomes zero). Fig 5.13
represent counting semaphore example.
5.4.3) EVENTS
Event object is a synchronisation technique which uses the notification mechanism for
synchronisation. In concurrent execution we may come across situations which demand the
processes to wait for a particular sequence for its operations. A typical example of this is the
producer consumer threads, where the consumer thread should wait for the producer thread to
produce the data and the producer thread should wait for the consumer thread to consume the data
before producing fresh data. If this sequence is not followed it will end up in the producer-consumer
problem. Notification mechanism is used for handling this scenario.
Event objects are used for implementing notification mechanisms. A thread/process can
wait for an event and another thread/process can set this event for processing by the waiting
thread/process. The creation and handling of event objects for notification is OS kernel dependent.
Please refer to the Online Learning Centre for information on the usage of „Events‟ under
Windows Kernel for process/thread synchronisation. The MicroC/OS-II kernel also uses „events‟
for task synchronisation.
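Since the text refers to the usage of 'Events' under the Windows kernel, the sketch below shows the notification mechanism with a Win32 event object. The thread function names and the single data_ready event are assumptions made for this illustration.

#include <windows.h>

HANDLE data_ready;                                    /* the event object */

DWORD WINAPI consumer_thread(LPVOID arg)
{
    WaitForSingleObject(data_ready, INFINITE);        /* block until the event is set */
    /* ... consume the data produced by the other thread ... */
    return 0;
}

DWORD WINAPI producer_thread(LPVOID arg)
{
    /* ... produce the data ... */
    SetEvent(data_ready);                             /* notify the waiting thread */
    return 0;
}

int main(void)
{
    data_ready = CreateEvent(NULL, FALSE, FALSE, NULL);   /* auto-reset, initially non-signalled */
    HANDLE c = CreateThread(NULL, 0, consumer_thread, NULL, 0, NULL);
    HANDLE p = CreateThread(NULL, 0, producer_thread, NULL, 0, NULL);
    WaitForSingleObject(p, INFINITE);
    WaitForSingleObject(c, INFINITE);
    CloseHandle(data_ready);
    return 0;
}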
A device driver is a piece of software that acts as a bridge between the operating system and the
hardware. In an operating system based product architecture, the user applications talk to the
Operating System kernel for all necessary information exchange including communication with the
hardware peripherals. The architecture of the OS kernel will not allow direct device access from
the user application. All the device related access should flow through the OS kernel and the OS
kernel routes it to the concerned hardware peripheral. OS provides interfaces in the form of
Application Programming Interfaces (APIs) for accessing the hardware. The device driver
abstracts the hardware from user applications. The topology of user applications and hardware
interaction in an RTOS based system is depicted in Fig. 5.14.
Device drivers are responsible for initiating and managing the communication with the hardware
peripherals. They are responsible for establishing the connectivity, initializing the hardware
(setting up various registers of the hardware device) and transferring data. An embedded product
may contain different types of hardware components like Wi-Fi module, File systems, Storage
device interface, etc. The initialisation of these devices and the protocols required for
communicating with these devices may be different. All these requirements are implemented in
drivers and a single driver will not be able to satisfy all these. Hence each hardware (more
specifically each class of hardware) requires a unique driver component.
Certain drivers come as part of the OS kernel and certain drivers need to be installed on the fly.
For example, the program storage memory for an embedded product, say NAND Flash memory
requires a NAND Flash driver to read and write data from/to it. This driver should come as part of
the OS kernel image. Certainly the OS will not contain the drivers for all devices and peripherals
under the Sun. It contains only the necessary drivers to communicate with the onboard devices
(Hardware devices which are part of the platform) and for certain set of devices supporting
standard protocols and device class (Say USB Mass storage device or HID devices like
Mouse/keyboard). If an external device, whose driver software is not available with OS kernel
image, is connected to the embedded device (Say a medical device with custom USB class
implementation is connected to the USB port of the embedded product), the OS prompts the user
to install its driver manually.
Device drivers which are part of the OS image are known as 'Built-in drivers' or 'On-board drivers'. These drivers are loaded by the OS at the time of booting the device and are always kept in RAM. Drivers which need to be installed for accessing a device are known as 'Installable drivers'. These drivers are loaded by the OS on a need basis: whenever the device is connected, the OS loads the corresponding driver to memory, and when the device is removed, the driver is unloaded from memory. The operating system maintains a record of the drivers corresponding to each hardware device. The implementation of a driver is OS dependent; there is no universal implementation for a driver. How the driver communicates with the kernel depends on the OS structure and implementation, and different operating systems follow different implementations. It is essential to know the hardware interfacing details of the on-board peripherals, like the memory address assigned to the device, the interrupt used, etc., for writing a driver for that peripheral. These details vary with the hardware design of the product. Some real-time operating systems like 'Windows CE' support a layered architecture for the driver which separates the low level implementation from the OS specific interface.
The low level implementation part is generally known as the Platform Dependent Driver (PDD) layer. The OS specific interface part is known as the Model Device Driver (MDD) or Logical Device Driver (LDD). For a standard driver of a specific operating system, the MDD/LDD always remains the same and only the PDD part needs to be modified according to the target hardware for a particular class of devices.
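A minimal sketch of this layering for an imaginary serial port driver is given below; the register addresses, bit values and function names are assumptions made purely for illustration. The MDD/LDD functions stay the same across boards and simply delegate the hardware-specific work to the PDD functions.

    #include <stdint.h>

    /* --- PDD layer: board-specific register access (addresses and bits are hypothetical) --- */
    #define UART_DATA (*(volatile uint32_t *)0x40001000u)  /* assumed transmit data register */
    #define UART_CTRL (*(volatile uint32_t *)0x40001004u)  /* assumed control register       */

    static int pdd_serial_init(void)
    {
        UART_CTRL = 0x01u;        /* enable the UART; bit layout is illustrative only */
        return 0;
    }

    static int pdd_serial_putc(char c)
    {
        UART_DATA = (uint32_t)c;  /* write one byte to the transmit register */
        return 0;
    }

    /* --- MDD/LDD layer: OS-facing interface, unchanged across target boards --- */
    int mdd_serial_open(void)
    {
        return pdd_serial_init(); /* forward the OS call to the platform dependent code */
    }

    int mdd_serial_write(const char *buf, int len)
    {
        for (int i = 0; i < len; i++)
            if (pdd_serial_putc(buf[i]) != 0)
                return i;         /* report how many bytes were actually written */
        return len;
    }

Porting the driver to a new board would mean rewriting only pdd_serial_init() and pdd_serial_putc(); the mdd_* interface seen by the OS is untouched.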
Most of the time, the hardware developer provides the driver implementation for all onboard devices for a specific OS along with the platform. The drivers are normally shipped in the form of a Board Support Package (BSP). The Board Support Package contains the low level driver implementations for the onboard peripherals, an OEM Adaptation Layer (OAL) for accessing the various chip level functionalities, and a bootloader for loading the operating system. The OAL facilitates communication between the operating system (OS) and the target device and includes code to handle interrupts, timers, power management, bus abstraction, generic I/O control codes
(IOCTLs), etc. The driver files are usually supplied in the form of DLL files. Drivers can run in either user space or kernel space. Drivers which run in user space are known as user mode drivers and drivers which run in kernel space are known as kernel mode drivers. User mode drivers are safer than kernel mode drivers: if an error or exception occurs in a user mode driver, it won't affect the services of the kernel, whereas an exception in a kernel mode driver may lead to a kernel crash. The way a device driver is written and the way interrupts are handled in it are operating system and target hardware specific. However, regardless of the OS type, a device driver implements the functionality described below.
The device (hardware) initialisation part of the driver deals with configuring the different registers of the device (target hardware), for example, configuring an I/O port line of the processor as an input or output line and setting its associated registers when building a General Purpose I/O (GPIO) driver. The interrupt configuration part deals with configuring the interrupts that need to be associated with the hardware. In the case of the GPIO driver, if the intention is to generate an interrupt when the input line is asserted, we need to configure the interrupt associated with the I/O port by modifying its associated registers. The basic interrupt configuration involves the following steps; a code sketch follows the list.
1. Set the interrupt type (Edge Triggered (Rising/Falling) or Level Triggered (Low or High)),
enable the interrupts and set the interrupt priorities.
2. Bind the interrupt with an Interrupt Request (IRQ). The processor identifies an interrupt through its IRQ. These IRQs are generated by the interrupt controller. In order to identify an interrupt, the interrupt needs to be bound to an IRQ.
3. Register an Interrupt Service Routine (ISR) with an Interrupt Request (IRQ). An ISR is the handler for an interrupt. In order to service an interrupt, an ISR should be associated with an IRQ, and registering an ISR with an IRQ takes care of this.
With these steps the interrupt configuration is complete. If an interrupt occurs, it is serviced depending on its priority and the corresponding ISR is invoked. The processing part of an interrupt is handled in the ISR.
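The steps above can be sketched in C roughly as follows; the register names, addresses, IRQ number and the register_isr() helper are all hypothetical stand-ins for whatever the target processor and OS actually provide.

    #include <stdint.h>

    /* Hypothetical memory-mapped GPIO and interrupt-controller registers. */
    #define GPIO_DIR      (*(volatile uint32_t *)0x50000000u)  /* 0 = input, 1 = output per pin  */
    #define GPIO_INT_TYPE (*(volatile uint32_t *)0x50000004u)  /* 0 = level, 1 = edge triggered  */
    #define GPIO_INT_EN   (*(volatile uint32_t *)0x50000008u)  /* per-pin interrupt enable       */
    #define IRQ_GPIO      5                                    /* assumed IRQ number of the port */

    /* Assumed OS service that binds an ISR to an IRQ and sets its priority. */
    extern int register_isr(int irq, void (*isr)(void), int priority);

    static void gpio_isr(void)
    {
        /* handler invoked when the input line is asserted */
    }

    void gpio_driver_init(void)
    {
        GPIO_DIR      &= ~(1u << 0);          /* device initialisation: pin 0 as an input line */
        GPIO_INT_TYPE |=  (1u << 0);          /* step 1: edge-triggered interrupt, then enable */
        GPIO_INT_EN   |=  (1u << 0);
        register_isr(IRQ_GPIO, gpio_isr, 2);  /* steps 2 and 3: bind the IRQ and register ISR  */
    }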
The whole interrupt processing can be done by the ISR itself or by invoking an Interrupt Service Thread (IST). The IST performs interrupt processing on behalf of the ISR. To keep the ISR compact and short, it is always advisable to use an IST for interrupt processing. The intention of an interrupt is to send or receive commands or data to and from the hardware device and make the received data available to user programs for application specific processing. Since interrupt processing happens at the kernel level, user applications may not have direct access to the drivers to pass and receive data. Hence it is the responsibility of the interrupt processing routine or thread to inform the user applications that an interrupt has occurred and data is available for further
processing. The client interfacing part of the device driver takes care of this. The client interfacing implementation makes use of the inter-process communication mechanisms supported by the embedded OS for communicating and synchronising with user applications and drivers. For example, to inform a user application that an interrupt has occurred and the data received from the device is placed in a shared buffer, the client interfacing code can signal (or set) an event. The user application creates the event, registers it and waits for the driver to signal it. The driver can share the received data through shared memory techniques; IOCTLs, shared buffers, etc. can be used for data sharing. The story is incomplete without performing an 'interrupt done' (interrupt processing completed) functionality in the driver. Whenever an interrupt is asserted, while vectoring to its corresponding ISR, all interrupts of equal and lower priority are disabled. They are re-enabled only on executing the interrupt done function (similar to the Return from Interrupt (RETI) instruction of the 8051) by the driver. The interrupt done function can be invoked at the end of the corresponding ISR or IST; a sketch combining these pieces follows.
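Putting these pieces together, a rough sketch is given below; the OS calls ist_wakeup(), event_set() and interrupt_done() are assumed names standing in for whatever the particular RTOS actually provides.

    #include <stdint.h>

    /* Assumed RTOS services; the real names and signatures differ from OS to OS. */
    extern void ist_wakeup(void);         /* schedule the Interrupt Service Thread         */
    extern void event_set(int event_id);  /* signal a named event to waiting applications  */
    extern void interrupt_done(int irq);  /* re-enable equal and lower priority interrupts */

    #define IRQ_GPIO       5
    #define EVT_DATA_READY 1

    static volatile uint8_t shared_buf[64];  /* shared buffer exposed to the user application */

    /* ISR: kept short, defers the real work to the IST. */
    void gpio_isr(void)
    {
        ist_wakeup();
    }

    /* IST: reads the device, fills the shared buffer and informs the application. */
    void gpio_ist(void)
    {
        shared_buf[0] = 0;            /* placeholder for copying received data into the buffer */
        event_set(EVT_DATA_READY);    /* client interfacing: signal that data is available     */
        interrupt_done(IRQ_GPIO);     /* interrupt processing completed                        */
    }

The user application would create and register the EVT_DATA_READY event in advance and block on it; when the driver sets the event, the application wakes up and reads the shared buffer.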