Ce278 - HANDOUT
COMPUTER SCIENCE AND ENGINEERING DEPARTMENT
(YEAR TWO, SEMESTER TWO)
TABLE OF CONTENTS
Course Outline
Course Objectives
Course Presentation
References and Recommended Textbooks
Course Assessment
Attendance
Office Hours
CHAPTER ONE
EMBEDDED SYSTEMS INTRODUCTION
Chapter Objectives:
1.1 Small Computers, Hidden Controls
1.2 Computer Essentials
1.2.1 The Central Processing Unit (CPU)
1.2.2 Input and Output Unit
1.2.3 The Memory Unit
1.3 Embedded System vs General Computing Systems
1.4 Comparing the Microprocessor and the Microcontroller
1.4.2 The Microcontroller Architecture
1.5 What is An Embedded System?
1.5.1 History of Embedded System
1.5.2 Characteristics of an Embedded System
1.5.3 Quality Attributes of Embedded Systems
1.5.4 Classification of embedded systems
1.6 Major Application Areas of Embedded Systems
1.7 Purpose of Embedded Systems
1.7.1 Data Collection/ Storage/ Representation
1.7.2 Data Communication
1.7.3 Digital (Signal) Processing
1.7.4 Monitoring
1.7.5 Control
1.7.6 Application Specific User Interfaces
1.8 Challenges in Embedded System design
CHAPTER TWO
THE TYPICAL EMBEDDED SYSTEM
CHAPTER FOUR
BUS STRUCTURE OF EMBEDDED SYSTEMS
5.1 Bus Structure
4.2.1 Generic Organization of the Bus:
4.2.2 Basic Protocol Concept:
4.2.3 Bus Arbitration:
4.2.4 Direct Memory Access (DMA):
CHAPTER FIVE
DIGITAL SIGNAL PROCESSOR (DSP)
5.1 Introduction to digital signal processing
5.2 Digital Signal Processors
5.2.1 Simple DSP: Texas Instruments’ TMS32010
5.3.3 Architecture of Texas Instruments DSP: TMS32010
5.3.4 Features common to most Digital Signal Processors
Course Outline
The term “computer” usually conjures up in the minds of many people the image of a mainframe, a
minicomputer, a PC, a workstation or a laptop computer. However, computers have always been
embedded into all sorts of everyday items from automobiles and planes to TVs, in-house
entertainment centres and toasters. These are usually called embedded computers or embedded
systems, and actually account for more than 90% of all the world’s manufactured processors. In
general, users of embedded systems see a specialized function (such as a High-Definition TV) and do
not directly think of the computer embedded within the system. Such embedded computers are
gaining importance as an increasing number of systems use embedded processors, RAM, disk drives,
and networks. Embedded systems range in size from simple toasters and mini-robots to large-scale
systems deployed in process control, manufacturing, power generation, defence systems,
telecommunication systems, automotive systems, air traffic control, avionics, and video-on-demand
and video-conferencing systems. Embedded systems also differ from their conventional PC or
workstation cousins in several ways. Embedded systems are typically used over long periods of time,
will not (or cannot) be programmed or maintained by their end users, and often face significantly
different design constraints such as limited memory, low cost, strict performance guarantees, fail-safe
operation, low power, reliability and guaranteed real-time behaviour. These embedded systems often
use simple executives (OS kernels) or real-time operating systems with typically small footprints,
support for real-time scheduling and no hard drives. Many embedded systems also interact with their
physical environment using a variety of sensors and/or actuators. This introductory course on
embedded computing focuses on these issues germane to embedded systems.
A hands-on lab component will provide students with direct experience of both the hardware and
software commonly used in embedded system design. Classroom lectures will also be augmented with
handouts and other materials.
Course Objectives
This course is designed with two complementary goals:
To understand the scientific principles and concepts behind embedded systems; and
To obtain hands-on experience in designing embedded systems.
Understand the basics of embedded system application concepts such as signal processing and
feedback control;
Course Presentation
The course is presented through lectures supported with handouts and tutorials. The tutorial
will be in the form of problem solving and discussions and will constitute an integral part of
each lecture. The student can best understand and appreciate the subject by attending all
lectures and laboratory work, by practicing, reading references and handouts and by
completing all assignments and course work on schedule.
Course Assessment
Factor Weight Location Date Time
Assignment 10 % Assignment
Presentation 10%
A B C D F
Attendance
UMaT rules and regulations state that attendance is MANDATORY for every student. A
total of FIVE (5) attendances shall be taken at random towards the 10%. The only acceptable
excuse for absence is one authorized by the Dean of Students on the prescribed form.
However, a student can also ask my permission to be absent from a particular class for
a tangible reason. A student who misses all five random attendances WILL NOT
be allowed to take the final exams.
Office Hours
I will be available in my office every Wednesday and Friday (15.00-17.00hrs) to answer
students’ questions and provide guidance on any issues related to the course.
Students must feel free to ask questions in class. Students should not hesitate to email
me with any questions whatsoever.
Students must endeavour to attend all lectures and lab work, and do all their assignments
and coursework.
Students must be seated and fully prepared for lectures at least 5 minutes before the
scheduled time.
Under no circumstances should a student be more than 15 minutes late.
NO student shall be admitted into the lecture room more than 15 minutes after the
start of lectures unless pre-approved by me.
All cell phones, iPods, MP3/MP4 players, PDAs, etc. MUST remain switched off
throughout the lecture period.
There shall be no eating or gum chewing in class.
Plagiarism shall NOT be accepted in this course, so be sure to do your referencing
properly.
Thank You.
CHAPTER ONE
EMBEDDED SYSTEMS INTRODUCTION
Chapter Objectives:
Learn what an embedded system is;
Distinguish between an embedded system and a general computing system;
Learn to classify embedded systems based on performance and complexity;
Analyse real-life examples of the bonding of embedded technology with human life; and
Understand the different purposes of embedded systems.
Keywords:
1.1 Small Computers, Hidden Controls
Advances in information technology are changing the way we live in the world today, be it in health,
entertainment, business, transportation, education or many other areas. Typical examples include
social television, 3D gaming, snake robots, holographic television, and paper-thin, flexible computers
and phones. Today, the technologies of the information and communication revolution are at the
cutting edge, and their applications offer momentous opportunities for development. At the heart of
information technology are computers of astonishing power, which find use in every realm of human
activity. Some computers are built to be as powerful as possible, without concern for cost, for
demanding applications in industry and research. Others are intended for the home and office: less
powerful, but also less expensive. Another category of computer is little recognized, partly because it
is little seen. This is the kind of computer that is designed into a product in order to provide its
control. The computer is hidden from view, so the user often does not know that it exists. Such
products are known as embedded microcomputer systems, and they are what this course is about.
These small computers are generally called microcontrollers; one extended family of them will be
studied.
To better understand the expression embedded microcomputer system, consider each word
separately. In this context, "embedded" means hidden inside, so that one cannot see it. "Micro"
means small, and a "computer" contains a processor, memory, and a means of exchanging
information with the outside world. "System" means a number of components interfaced together for a
common purpose. Systems have structure, behavior, and interconnectivity, and operate within a
framework bound by rules and regulations. Take the automobile as a practical example. Its major
systems are the engine, fuel system, exhaust system, cooling system, lubrication system,
electrical system, transmission, and the chassis. The chassis includes the wheels and tires, the brakes,
the suspension system, and the body. Mechanical systems in automobiles have largely been replaced by
electronic systems, and today the automobile industry makes great use of embedded systems. An embedded
system is basically a system of hardware and software designed for control and access of data. As a
system, it includes a controller as its brain. From wiper controls to complex anti-lock brake
controls and air bags, embedded systems have taken over much of the control of modern automobiles.
Current applications of embedded systems in automobiles include airbag controllers, navigation
systems, satellite radio, adaptive cruise control, drive-by-wire, head-up displays, etc.
These days embedded microcomputer systems are everywhere, appearing in the home, office, factory,
cinema or hospital. While many of these examples seem very different from each other, they all draw
on the same principles as far as their characteristics as embedded microcomputer systems are
concerned.
1.2 Computer Essentials
The hardware of a microcomputer system can be divided into four functional sections:
the input unit;
the central processing unit;
the memory unit; and
the output unit.
1.2.1 The Central Processing Unit (CPU)
Any computer system has a central processing unit that recognizes and executes a particular set of
instructions; all programs are built up in one way or another from this instruction set.
Instruction sets divide computers into two classes: (1) Complex Instruction Set Computers (CISC) and
(2) Reduced Instruction Set Computers (RISC). A CISC has many instructions and considerable
sophistication, yet the complexity of the design needed to achieve this tends to lead to slow operation.
Its instructions also vary in length; a single short instruction code could be 1 byte. The RISC, on the
other hand, is kept simple, which leads to faster operation, and each instruction is contained within a
single binary word.
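The difference in instruction encoding can be illustrated with a small sketch. The opcode table and the 4-byte word width below are hypothetical and do not correspond to any real processor; they only show that a variable-length (CISC-style) stream must be decoded opcode by opcode to find instruction boundaries, while a fixed-width (RISC-style) stream can be split at regular intervals.

```python
# Hypothetical opcode table for a toy CISC-style machine:
# opcode byte -> total instruction length in bytes.
TOY_CISC_LENGTHS = {0x01: 1, 0x02: 2, 0x03: 3}


def split_variable(code, lengths):
    """Split a byte stream whose instruction length depends on the opcode."""
    instructions = []
    pos = 0
    while pos < len(code):
        size = lengths[code[pos]]  # the opcode must be read to find the next boundary
        instructions.append(code[pos:pos + size])
        pos += size
    return instructions


def split_fixed(code, width=4):
    """Split a byte stream in which every instruction is one fixed-width word."""
    if len(code) % width != 0:
        raise ValueError("code length must be a multiple of the word width")
    return [code[i:i + width] for i in range(0, len(code), width)]
```

With `split_fixed`, the position of the n-th instruction is known without inspecting the earlier ones, which is one reason fixed-width encodings are simpler to fetch and decode.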
NB: In an embedded system the communication is likely to be primarily with the physical world
around it, through sensors and actuators.
1.3 Embedded System vs General Computing Systems
The general-purpose computing (GPC) system is a combination of generic hardware and a general-purpose
operating system for executing a variety of applications, whilst the embedded system (ES) is a
combination of special-purpose hardware and an embedded OS for executing a specific application;
The GPC contains a general-purpose operating system, whereas the ES may or may not contain
an operating system;
On a GPC, applications and the operating system can be installed and re-programmed by the user.
With an ES, on the other hand, the firmware (the software of the ES) is pre-programmed and
cannot be altered by the user;
With GPCs, performance is the key deciding factor in selecting the system, whilst
application-specific requirements are the key deciding factor for an ES;
The GPC has response requirements that are not time-critical, but for certain categories of ES,
such as mission-critical systems, the response time requirement is highly critical;
A GPC need not be deterministic in its execution behavior, whereas ES execution behavior is
deterministic for certain types, e.g. hard real-time systems.
1.4 Comparing the Microprocessor and the Microcontroller
A microprocessor is a general-purpose digital computer central processing unit; however, it is in
no sense a complete digital computer, unlike the microcontroller, which is popularly known as a
“computer on a chip”. To form a complete computer, it depends on memory (ROM, RAM), a clock timing
circuit, a number of I/O devices and other components. It requires external support chips (a chipset)
in order to interface with memory and other I/O devices. Examples of general-purpose microprocessors
include the Intel Itanium, Advanced Micro Devices Athlon and IBM PowerPC families. Besides the
general-purpose microprocessors, these families include another type, called special-purpose
microprocessors, that are used in embedded control applications. This type of embedded microprocessor
is called a microcontroller. The evolution of microprocessors has stressed higher integration at lower
cost in a single-chip solution that solves problems using the same approach as the microcomputer;
the term microcontroller generally applies to this paradigm. Put simply, the microcontroller houses
all the components of a microcomputer (control unit, memory and I/O) in one chip.
To summarize the comparison between the microprocessor and the microcontroller, the
microprocessor is more concerned with rapid movement of code and data from external address
(memory location) to the chip whereas the microcontroller is concerned with the high speed
movement of bits within the chip. Furthermore, the microcontroller could function as a computer with
no external digital parts; the microprocessor must have many additional parts for it to be fully
operational.
From the user's point of view, the basic microprocessor comprises three sections: the register section
(which contains a number of registers, i.e. temporary storage elements each capable of holding a
unit byte or word), the arithmetic logic unit (ALU), which performs the actual data manipulation
operations, and the timing and control unit (which coordinates the internal operation of the
microprocessor and controls the ALU so that program instructions work as specified).
The microprocessor central processing unit (CPU) houses the Arithmetic and Logic Unit (ALU), a
program counter (PC), a stack pointer (SP), an accumulator, some working registers, clock timing
circuitry, and interrupt circuitry. A more detailed block diagram is shown below.
The main purpose of the microprocessor is to fetch data, perform extensive calculations on that
data, and then store the results in a mass storage device or display them for human use.
Under normal circumstances, the programs that users run on the microprocessor are stored in the
mass storage device and loaded into RAM; only a very limited number of microprocessor programs
are stored in ROM. The ROM-based programs are typically small, static programs that operate
peripherals and other fixed devices connected to the system.
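To make the roles of the program counter and the accumulator concrete, the fetch and execute steps can be sketched as a toy accumulator machine. The four instruction names (LOAD, ADD, STORE, HALT) and the tuple encoding are hypothetical teaching devices, not the instruction set of any real microprocessor.

```python
def run(program, memory):
    """Run a toy accumulator machine until HALT; return the final accumulator."""
    pc = 0   # program counter: index of the next instruction to fetch
    acc = 0  # accumulator: holds the value currently being worked on
    while True:
        op, operand = program[pc]  # fetch the instruction the PC points at
        pc += 1                    # advance the PC to the following instruction
        if op == "LOAD":           # decode and execute
            acc = memory[operand]
        elif op == "ADD":
            acc += memory[operand]
        elif op == "STORE":
            memory[operand] = acc
        elif op == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode: {op}")
```

For example, the program `[("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", 0)]` adds the values at memory locations 0 and 1 and writes the sum to location 2.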
The microprocessor communicates with memory, both to obtain the individual instructions which
make up the program and to access and store data, and it transfers data to and from input and output
ports, using the highway or bus. This highway consists of three separate buses: the data bus (used to
carry the data associated with a memory or I/O transfer, and typically 8 bits wide), the address bus
(used to specify the memory location or I/O port involved in the transfer) and the control bus
(consisting of various control lines generated by the microprocessor and other devices to synchronize
transfers), as shown below.
It is important to note that the data bus of many computers, and of microprocessors in particular, is
bidirectional. This means that the processor can write data onto the bus lines to be read by, e.g., a
memory device, or it can read data from the bus as presented by such a device. Furthermore,
the control bus carries the timing and control signals (e.g. MEMORY READ, MEMORY
WRITE, READ INPUT PORT, READ OUTPUT PORT, HOLD INTERRUPT) generated by the
microprocessor to synchronize information transfer between the microprocessor and memory or an
I/O port.
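The interplay of the three buses can be sketched in a few lines. The class below is a toy model under stated assumptions: the 16-bit address width is assumed for illustration (the text only gives the typical 8-bit data width), and only the MEMORY READ and MEMORY WRITE control signals are modelled.

```python
class ToyBus:
    """Toy model of a system bus: address lines, data lines and control lines."""

    def __init__(self, address_bits=16, data_bits=8):
        self.size = 1 << address_bits          # number of addressable locations
        self.data_mask = (1 << data_bits) - 1  # largest value the data bus can carry
        self.memory = [0] * self.size

    def transfer(self, control, address, data=None):
        """Perform one transfer selected by the control signal."""
        if not 0 <= address < self.size:
            raise ValueError("address does not fit on the address bus")
        if control == "MEMORY WRITE":
            # The processor drives the data bus; memory stores the value.
            self.memory[address] = data & self.data_mask
            return None
        if control == "MEMORY READ":
            # Memory drives the data bus; the processor reads the value.
            return self.memory[address]
        raise ValueError(f"unsupported control signal: {control}")
```

The same data lines are used in both directions, which is what bidirectional means here. Masking with `data_mask` shows that a value wider than the data bus cannot be moved in a single transfer.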
1.4.2 The Microcontroller Architecture
The microcontroller can be likened to a true computer on a chip; it integrates all the features of a
microprocessor (i.e. ALU, PC, SP, and registers). In addition, it incorporates other features that make
it a complete computer: ROM/RAM, I/O ports, counters and clock circuitry. The prime use of the
microcontroller is to control the operation of a machine using a fixed program that is written
into ROM and stays the same throughout the machine’s lifecycle. The microcontroller is designed
to fetch data from its own pins, with an architecture and instruction set optimized
to handle data in bit and byte sizes. The block diagram of the microcontroller is shown in the figure
below:
A generic view of a microcontroller is shown below. Essentially, it contains a simple processor core,
along with all necessary data and program memory. To this it adds all the peripherals that allow it to
do the interfacing it needs to do. These may include digital and analog input and output, or counting
and timing elements. Other more sophisticated functions are also available, which you will encounter
later in the book. Like any electronic circuit the microcontroller needs to be powered, and needs a
clock signal (which in some controllers is generated internally) to drive the internal logic circuits.
The I/O devices are a crucial part of an embedded system, because they provide necessary
functionality. The software together with the I/O ports and associated interface circuits give an
embedded computer system its distinctive characteristics. The microcontrollers often must
communicate with each other. How the system interacts with humans is often called the human-
computer interface (HCI) or man-machine interface (MMI).
1.5 What is An Embedded System?
As emphasized earlier, an embedded system is basically a system of hardware and software designed
for control and access of data. As a system, it includes a controller as its brain. Either a
microcontroller or a microprocessor can serve as the brain of an embedded system.
One could typically restrict the term embedded to refer to systems that do not look and behave like a
typical computer. Most embedded systems do not have a keyboard, a graphics display, or secondary
storage (disk). An embedded system is a microcomputer (a small computer that includes a processor,
memory and I/O devices) as part of a system with mechanical, chemical, or electrical devices attached
to it, programmed for a specific dedicated purpose, and packaged up as a complete system. The
external devices attached to the microcontroller allow the system to interact with its environment. An
interface is defined as the hardware and software that combine to allow the computer to communicate
with the external hardware. We must also learn how to interface a wide range of inputs and outputs
that can exist in either digital or analog form.
As the diagram above shows, the CPU, memory (RAM and ROM) and I/O devices are all
connected via a bus system. All of these are integrated onto the same piece of silicon, so they are
part of the same chip. Apart from the CPU, memory and I/O devices mentioned above, a
microcontroller has other components as well:
Timer module: A timer module allows the micro-controller to perform tasks for certain time
periods. A timer in a typical general-purpose computer is programmed to run for a certain
time period with respect to the system clock, after which it can generate an
interrupt to the processor so that there can be switching from one task to another. The timer
module on a micro-controller used for embedded applications performs a similar
function;
Serial I/O ports: The serial I/O ports allow data flow between the micro-controller and
devices which support serial interfaces. The serial ports are also used for interfacing with the
PC during development and over the life cycle of the embedded system, for example
when you need to connect the embedded appliance to the PC to read data or reprogram
it;
Analog to Digital Converters (ADC): The analog to digital converter allows the micro-
controller to convert external input, which may arrive in analog form, to digital form for
subsequent processing by the micro-controller; and
Digital to Analog Converters (DAC): This converts the digital signal from the micro-
controller to analog form for the external world. Below is a more detailed block diagram of a
micro-controller.
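The conversion performed by the ADC and DAC items above can be sketched numerically. This is a minimal model of an ideal converter, assuming a 5 V reference and 10-bit resolution; both values are illustrative and are not taken from any particular micro-controller.

```python
def adc_convert(voltage, v_ref=5.0, bits=10):
    """Ideal ADC: map a voltage in the range 0..v_ref to an integer code."""
    levels = 1 << bits                    # number of distinct output codes
    code = int(voltage / v_ref * levels)  # quantize by truncation
    return max(0, min(levels - 1, code))  # clamp to the valid code range


def dac_convert(code, v_ref=5.0, bits=10):
    """Ideal DAC: map an integer code back to a voltage."""
    levels = 1 << bits
    return code * v_ref / levels
```

With these assumptions one code step corresponds to 5 V / 1024, which is about 4.9 mV, so any input change smaller than that may not be visible in the digital value.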
1.5.1 History of Embedded System
Embedded systems existed before the IT revolution. In the early days, embedded
systems were built around the old vacuum tube and transistor technologies, and the embedded
algorithm was developed in low-level languages. The revolution in embedded systems today was borne
out of advances in semiconductor technology and nanotechnology. The first embedded system was the
Apollo Guidance Computer (AGC), developed by the MIT Instrumentation Lab for the lunar expedition.
It ran the inertial guidance system of both the Command Module (CM) and the Lunar Excursion Module
(LEM). The CM was designed to encircle the moon, while the LEM and its crew were designed to descend
to the moon's surface and land there safely. MIT’s original design was based on 4K words of fixed
memory (ROM) and 256 words of erasable memory (RAM). The final configuration progressed to
36K words of fixed memory and 2K words of erasable memory. The clock frequency of the first
microchip proto model used in the AGC was 1.024 MHz, derived from a 2.048 MHz crystal
clock. The computing unit of the AGC consisted of approximately 11 instructions and 16-bit word logic.
Around 5000 integrated circuits were used for this design. The user interaction unit of the AGC is
known as the DSKY (display/keyboard). The DSKY looked like a calculator-type keypad with an array of
numerals. It was used for inputting commands to the module numerically.
1.5.2 Characteristics of an Embedded System
For every system, be it embedded or non-embedded, there exists a set of characteristics
describing it, i.e. the functional and non-functional aspects of the system. It is important
that, whenever we design an embedded system, we take note of both its functional and non-
functional features. Unlike general-purpose computers, as established earlier, embedded systems have
unique features which are specific to each embedded system application. Some important
characteristics of embedded systems include:
Application and domain specific: Each embedded system has certain functions to perform
and is developed in such a manner that it performs the intended functions only. This is one
major criterion that distinguishes embedded systems from the general-purpose computer. For
example, one cannot swap the embedded control units of a television and a fridge. It will not
work, because each is designed for a specific task;
Reactive and Real Time: As mentioned earlier, embedded systems constantly interact with
the real world via sensors, actuators and user input devices connected to the ports of the
system. Changes in the environment are captured in real time by the sensors or input devices, and
the control algorithm running within the embedded system reacts in a designed manner,
bringing the controlled output units to the appropriate levels. Such an event could be periodic
or unpredictable. For unpredictable events, the embedded system should be
built so that it captures the events without missing them. Embedded systems
produce changes in output in response to changes in input, and are therefore generally referred
to as reactive systems. Furthermore, real-time embedded systems operate such that their timing
behavior is deterministic (a known amount of time), and deadlines cannot be missed. It is
important to note that not all embedded systems need to operate in real time.
However, embedded systems for mission-critical applications such as flight control, anti-lock
brake systems, etc. must operate in real time;
Operates in harsh environments: The environments within which embedded systems operate
vary, i.e. they may or may not be controlled. Embedded systems can be deployed in dusty or
high-temperature zones, or even in areas subject to high vibration and shock. Systems positioned in such areas
should be able to withstand the harsh operating environment. For example, if the
system is to be used in a high-temperature zone, all of its components should be of
high-temperature grade;
Distributed: Distributed here means that the embedded system may be part of a larger
system. A number of such distributed systems integrate to form a single large
embedded control unit. A typical example is an automatic vending machine. This machine
contains a card reader, a vending unit and the like. Each of the units is independent, but they
work together;
Small size and weight: As technology advances, its products get smaller.
For example, the mobile phones we use today are getting smaller and yet more
sophisticated. Aesthetics (e.g. size, weight, shape, style) is an important factor when
choosing any product. Moreover, compact devices are more convenient to handle than bulky
ones. Most embedded system applications demand small, lightweight products;
Power concerns: Power consumption is of great concern to embedded system developers.
Embedded systems should be designed to minimize heat dissipation.
Products that give out much heat demand cooling systems for efficient operation,
which in turn occupy more space and make the system bulky. Embedded system components
should therefore be selected with low power consumption in mind, such as low-dropout
regulators, and controllers and processors with power-saving modes. Power concerns
become crucial when embedded systems operate on batteries: the higher the power
consumption, the shorter the battery life.
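The relationship between power consumption and battery life in the last item can be sketched with a first-order estimate. This is a simplified model that assumes the system alternates between an active state and a sleep state; it ignores self-discharge, regulator losses and temperature effects, and all parameter names and values are illustrative.

```python
def battery_life_hours(capacity_mah, active_ma, sleep_ma, duty_cycle):
    """Estimate battery life in hours from the average current draw.

    duty_cycle is the fraction of time (0.0 to 1.0) spent in the active state.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0.0 and 1.0")
    # Time-weighted average of the active and sleep currents.
    average_ma = duty_cycle * active_ma + (1.0 - duty_cycle) * sleep_ma
    if average_ma <= 0:
        raise ValueError("average current must be positive")
    return capacity_mah / average_ma
```

Under this model, a processor that spends most of its time in a power-saving mode draws a much lower average current, which is why such modes matter so much for battery-operated designs.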
1.5.3 Quality Attributes of Embedded Systems
The non-functional requirements that need to be documented as part of the system are referred to as
its quality attributes. It speaks well of a system if its quality attributes are concrete and measurable.
The quality attributes that need to be addressed in any embedded system can be broadly classified
into two: (1) Operational Quality Attributes and (2) Non-Operational Quality Attributes.
1.5.4 Classification of embedded systems
Embedded systems can be classified in many ways, based on different criteria.
Some of the criteria used in the classification of embedded systems are:
Based on generation;
Complexity and performance requirements;
Based on deterministic behavior; and
Based on triggering.
The classification based on deterministic system behavior is applicable to ‘Real Time’ systems. The
application/task execution behavior of an embedded system can be either deterministic or non-
deterministic. Real Time embedded systems are classified into Hard
and Soft. We will discuss hard and soft real time systems in a later chapter. Embedded systems
which are “Reactive” in nature (like process control systems in industrial control applications) can be
classified based on the trigger. Reactive systems can be either event triggered or time triggered.
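The difference between the two triggering styles can be sketched with a recorded input signal, one sample per clock tick. The function names and the list-based signal are illustrative only; a real system would use timer interrupts for the time-triggered case and input interrupts for the event-triggered case.

```python
def time_triggered(samples, period):
    """Act at fixed instants: read the input once every `period` ticks."""
    return [(t, samples[t]) for t in range(0, len(samples), period)]


def event_triggered(samples):
    """Act only when the input changes from its previous value."""
    if not samples:
        return []
    actions = []
    previous = samples[0]
    for t in range(1, len(samples)):
        if samples[t] != previous:
            actions.append((t, samples[t]))
            previous = samples[t]
    return actions
```

For the signal `[0, 0, 0, 1, 1, 1, 1, 0]` and a period of 4 ticks, the time-triggered version reads the input at ticks 0 and 4 and so notices the first change one tick late, while the event-triggered version reacts at ticks 3 and 7, exactly when the input changes.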
First Generation: The early embedded systems were built around 8-bit microprocessors like the
8085 and Z80, and 4-bit microcontrollers. They had simple hardware circuits, with firmware
developed in assembly code. Digital telephone keypads, stepper motor control units, etc. are
examples of this.
Second Generation: These embedded systems, which followed the first generation, were built
around 16-bit microprocessors and 8- or 16-bit microcontrollers. The
instruction sets of the second generation processors/controllers were much more complex and
powerful than those of the first generation processors/controllers. Some of the second generation
embedded systems contained embedded operating systems for their operation. Data
acquisition systems, SCADA systems, etc. are examples of second generation embedded
systems.
Third Generation: With advances in processor technology, embedded system developers
started making use of powerful 32-bit processors and 16-bit microcontrollers in their designs. A
new concept of application- and domain-specific processors/controllers, like Digital Signal
Processors (DSPs) and Application Specific Integrated Circuits (ASICs), came into the picture.
The instruction sets of processors became more complex and powerful, and the concept of
instruction pipelining (refer to section 2.1) also evolved. The processor market was flooded
with different types of processors from different vendors. Processors like the Intel Pentium,
Motorola 68K, etc. gained attention for high-performance embedded requirements. Dedicated
embedded real-time and general-purpose operating systems entered the embedded market.
Embedded systems spread to areas like robotics, media, industrial process control,
networking, etc.
Fourth Generation: The advent of systems on chip (SoC), reconfigurable processors and
multicore processors is bringing high performance, tight integration and miniaturization into
the embedded device market. The SoC technique implements a total system on a chip by
integrating different functionalities with a processor on an integrated circuit. This
generation of embedded systems makes use of high-performance, real-time embedded
operating systems for their functioning. Smartphones, mobile internet devices, etc. are all
examples of fourth generation embedded systems.
1.6 Major Application Areas of Embedded Systems
Technological advancement has impacted our lives tremendously, from the computer industry,
where many people earn their livelihood, to entertainment, transportation and the like, and embedded
systems play a vital role in it. Embedded technology has acquired new dimensions, from its first
generation model, the Apollo Guidance Computer, to modern-day navigation systems in cars.
Biomedical device technology is being applied to a wide variety of analytical problems in medicine,
surgery and drug discovery; such devices include portable diagnostic imaging equipment and home
monitors such as cholesterol monitors and blood glucose meters. Recent innovations paving the way for
miniaturization of devices, replacement organs and tissues, earlier use of more accurate diagnostics,
and advances in information technology became available through the silicon chip revolution. The
application areas of embedded systems are countless. Other applications of embedded systems include
home appliances, office automation, security, telecommunication, instrumentation, entertainment,
aerospace, banking and finance, automobiles and personal devices, as well as various embedded
systems projects. A few of the important domains and products are listed below:
Embedded System for Detecting Rash Driving on Highways: A highway speed-checker device
that identifies rash driving and alerts the traffic authorities if it finds any vehicle
violating the set speed limit.
Application of Embedded System for Street Light Control: Detects the movement of
vehicles on highways, switches on the street lights ahead of a vehicle, and switches them off
again once the vehicle has gone past, to conserve energy.
Embedded System for Traffic Signal Control System: A density-based traffic signal
system. At every junction, the signal timing changes automatically according to the traffic
density at that junction. Traffic jams are a major problem in many cities across the world and
a regular nightmare for commuters and travelers.
Application of Embedded System for Vehicle Tracking: The main purpose of this
embedded system is to find the exact location of a vehicle using a GPS modem, in order to
reduce vehicle theft. A GSM modem sends an SMS to a predefined mobile phone, which
stores the data. An LCD is used to display the location in terms of
latitude and longitude values.
Embedded System for Auto Intensity Control: Designed for automatic intensity control of
LED-based street lights powered by solar energy from photovoltaic panels. Awareness of solar
energy is increasing, and many institutions and people are opting for it. In this
project, photovoltaic panels charge batteries by converting the sun's energy into
electrical energy. A solar charge controller circuit is used to control the charging.
Application of Embedded System for Home Automation System
Embedded System for Industrial Temperature Control
Application of Embedded System for War Field Spying Robot
1.7 Purpose of Embedded Systems
Embedded systems are used in diverse facets of our lives, be it consumer electronics, home
automation, telecommunication, the automotive industry, healthcare, banking applications, etc.
Within each of these domains, the functionality may vary with the application context. Each
embedded system is designed to serve any one, or a combination, of the following tasks:
1.7.1 Data Collection/ Storage/ Representation
An embedded system designed for data collection (text, voice, image, video, electrical
signals and any other measurable quantities) acquires data from the external world
for storage, analysis, manipulation and transmission. The data can be either analog
(continuous) or digital (discrete). Embedded systems with analog data capture collect
data directly from the analog signal, whereas those with a digital data collection mechanism convert
the analog signal to the corresponding digital signal using an analog-to-digital converter and then
collect the binary equivalent of the analog data. If the data to be collected is already digital, it
can be collected directly. The collected data can be stored in the system, transmitted to other
systems, processed by the system, or deleted immediately after being given a meaningful
representation. A digital camera is a typical example of an embedded system for data collection,
storage and representation. Captured images may be stored in the camera's memory or presented
to the user on a graphical LCD unit.
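The analog-to-digital step described above can be sketched in C. This is only an illustration of
how an ideal converter maps a voltage to a binary count and back; the 10-bit resolution, the
5000 mV reference and the function names adc_quantize_mv and adc_count_to_mv are assumptions
made for the example, not properties of any particular device.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: an ideal 10-bit converter with a 5000 mV reference. */
#define ADC_BITS       10u
#define ADC_VREF_MV    5000u
#define ADC_MAX_COUNT  ((1u << ADC_BITS) - 1u)   /* 1023 for 10 bits */

/* Map an analog input (in millivolts) to the binary count an ideal ADC would report.
 * Inputs at or above the reference saturate at the maximum count. */
uint16_t adc_quantize_mv(uint32_t input_mv)
{
    if (input_mv >= ADC_VREF_MV) {
        return (uint16_t)ADC_MAX_COUNT;
    }
    return (uint16_t)((input_mv * (ADC_MAX_COUNT + 1u)) / ADC_VREF_MV);
}

/* Convert a stored count back to millivolts (the "representation" step). */
uint32_t adc_count_to_mv(uint16_t count)
{
    assert(count <= ADC_MAX_COUNT);
    return ((uint32_t)count * ADC_VREF_MV) / (ADC_MAX_COUNT + 1u);
}
```

Under these assumptions a 2500 mV input is stored as the count 512, and converting 512 back gives
2500 mV; the small error seen at other values is the quantization step of the converter.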
1.7.2 Data Communication
Embedded data communication systems are deployed in applications ranging from complex satellite
communication systems to simple home network systems. As mentioned earlier, data collected by
an embedded terminal can be transferred to its own memory or to another system located
remotely. The transmission can be achieved over a wired or wireless medium. Data communication
can be analog or digital, but modern industry trends are settling on digital communication.
The data-collecting embedded terminal can incorporate data communication units such as wireless
modules (Bluetooth, ZigBee, Wi-Fi, EDGE, GPRS, etc.) or wireline modules (RS-232C, USB,
TCP/IP, etc.). Network hubs, routers, switches, etc. are typical examples of dedicated data
transmission embedded systems. They act as mediators in data communication and provide
features such as data security and monitoring.
Figure 1.10 A wireless router and industrial switch and router systems
1.7.3 Digital (Signal) Processing
Embedded systems can collect various kinds of data, which may then be processed in various ways.
Embedded systems with signal processing functionality are employed in applications that demand
signal processing, such as speech coding, synthesis, audio/video codecs and transmission
applications. A digital hearing aid is a typical example of an embedded system employing data
processing. It improves the hearing capability of hearing-impaired people.
1.7.4 Monitoring
Embedded systems in this category are used for monitoring. The majority of embedded
systems in the medical domain come with monitoring functions. They determine
the state of some variable using sensors, but they cannot impose control over the variable. A typical
example is the electrocardiogram (ECG) machine: it monitors the heartbeat but cannot control it. Other
examples of embedded systems with monitoring capabilities include digital CROs, multimeters, logic
analyzers, etc., as used in instrumentation and control. They are used for knowing the status of
variables such as current and voltage; however, they cannot control those variables.
1.7.5 Control
Embedded systems with control functionality impose control over some variable according to
changes in an input variable. Such systems contain both sensors and actuators. Sensors are connected
to the input ports to capture changes in the environmental or measured variable. The
actuators connected to the output ports are driven according to the changes in the input variable,
acting on the controlling variable to bring the controlled variable into the specified range. The
air conditioner in a room is a typical example of an embedded system used for control.
An air conditioner contains a room temperature sensing element, which may be a thermistor,
and a handheld unit for setting the desired temperature. The handheld unit may be connected to the
central embedded system by a wireless link or by wire. The air conditioner's compressor unit acts as
the actuator. The compressor is controlled according to the current room temperature and the desired
temperature set by the user.
Figure 1.13 An Airconditioner for Controlling Room Temperature. Embedded with Controlling
Functionality.
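The air conditioner example can be sketched as a simple on/off control decision in C. The function
name compressor_next_state, the use of tenths of a degree Celsius, and the 0.5 degree hysteresis
band are all assumptions introduced for this illustration; the text above only states that the
compressor is controlled from the measured and the desired temperature.

```c
#include <stdbool.h>

/* Hypothetical hysteresis band, in tenths of a degree Celsius (5 = 0.5 degrees).
 * It keeps the compressor from switching rapidly when the room is near the setpoint. */
#define HYSTERESIS_DECI_C 5

/* Decide the next compressor state (the actuator command) from the sensed room
 * temperature and the user's setpoint, both in tenths of a degree Celsius. */
bool compressor_next_state(bool compressor_on, int room_deci_c, int setpoint_deci_c)
{
    if (room_deci_c > setpoint_deci_c + HYSTERESIS_DECI_C) {
        return true;            /* room too warm: run the compressor */
    }
    if (room_deci_c < setpoint_deci_c - HYSTERESIS_DECI_C) {
        return false;           /* room cool enough: stop the compressor */
    }
    return compressor_on;       /* inside the band: keep the current state */
}
```

In a real controller this function would be called periodically, with the room temperature read
from the thermistor and the setpoint received from the handheld unit.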
1.7.6 Application Specific User Interfaces
These are embedded systems with application-specific user interfaces such as buttons, switches,
keypads, lights, bells, display units, etc. The mobile phone is a typical example: its user
interfaces include a keypad, graphic LCD, system speaker, vibration alert and many more. Other
interesting examples include intelligent shoes.
1.8 Challenges in Embedded System design
The obvious challenges are security, real-time behaviour, scalability and high availability, but
other key challenges exist, such as what we call performance-based interoperability. Although these
complex, ubiquitous systems are glued together with layers of protocols, they still have time
constraints and other performance demands that affect how the system will perform and thereby how it
will be accepted by the public. For example, if we have a deadline for sending video across multiple
links, will it really reach the people who want to watch it in a form they can view properly?
Solving this issue requires meeting time constraints while going through layers of
protocols, software mappings and switches from one kind of network to another.
If the result is poor-quality video, people won't accept the product. Also, for embedded
systems to be universal they must be easier to use, and we're starting to see that. For example, with
e-mail-enabled phones, you might want to go through the Internet and download your e-mail; but you
still have to punch in a URL using the keypad on the phone, and the result is three tiny lines of
text on the screen, which is not too exciting. Yet some people like it, and they're using it. Better
interfaces will make this application more prevalent.
Another challenge for smart environments is safety. For example, in a smart institution, you won't
want to see doors opening and closing at the wrong times, or windows slamming on somebody's hand.
Smart environments must be safe environments. In the end, people won't want them if they're not safe,
available, and reliable. In fact, smart environments must be as reliable as, say, the power grid. We
come in every day, we turn on the light and the electricity is there. We need the same kind of
performance from these smart spaces. Interoperability is an even bigger challenge. Given the
diversity of embedded devices, the main challenge is to define a distributed computing model for
networked embedded systems. Networking these devices is just the first step.
The ultimate goal is to make them cooperate to combine or aggregate their functionalities or
resources. Their number and variety are so large that a traditional distributed model simply cannot
be applied without causing an overwhelming programming overhead. To keep programming at such a
scale manageable, the new computing model must tolerate incomplete results, partial synchronization,
and weak consistency. So far, researchers have proposed scalable solutions for simple cooperative
tasks such as routing using content-based addressing. The many devices on which we rely will make
our everyday life extremely difficult if we cannot have them silently cooperate by exchanging data
and tasks.
Finally, fault tolerance and security are traditional challenges for any distributed system, and they
will ultimately determine the acceptance of embedded technologies by society. Being able to provide
an environment that is secure and highly available while still delivering deterministic real-time
characteristics is very important.
Correctness, that is, getting systems software and applications to run correctly, is a further
challenge, especially because they are used in many safety-critical areas. Another big issue will be
scalability, meaning that the industry must face the challenge of designing complex software that
scales well with existing microcontroller solutions for embedded systems.
Sample Questions:
Explain the different classifications of embedded systems. Give an example for each.
What is an embedded system? Explain the different applications of embedded systems.
CHAPTER TWO
THE TYPICAL EMBEDDED SYSTEM
Chapter Objectives:
A typical embedded system comprises a single-chip controller, which acts as the master brain of the
system. The controller can be a Microprocessor, a Microcontroller, a Field Programmable Gate Array
(FPGA) device, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit
(ASIC) or an Application Specific Standard Product (ASSP). Embedded systems are designed to regulate
a physical variable or to manipulate the state of some devices by sending control signals to
actuators or devices connected to the output ports, in response to input signals provided by the end
users or by sensors connected to the input ports. Hence, we can say that an embedded system is a
reactive system. Control is achieved by processing the information coming from the sensors
and user interfaces and driving the actuators that regulate the physical variable.
Some input devices connected to embedded systems are keyboards, push-button switches, etc., and
output devices are LEDs, liquid crystal displays, piezoelectric buzzers, etc. It should be noted that
not all embedded systems have I/O user interfaces; the I/O user interface of an embedded system is
application specific. Furthermore, some embedded systems do not require any manual intervention for
their operation. They automatically respond to changes in the real world by sensing variations in
the input parameters through sensors connected to the input ports of the system.
The sensor information is passed on to the processor after signal conditioning and digitization.
Once the sensor data is received, the processor, the brain of the embedded
system, performs some pre-defined operation with the help of the firmware of the embedded system
and sends an actuating signal to the actuator connected to the output port of the embedded system.
The actuator in turn acts on the controlling variable to bring the controlled variable to the desired
level, so that the embedded system works in the desired manner.
The memory of the embedded system is responsible for holding the control algorithm as
well as other configuration details. The majority of embedded systems use a fixed type of memory for
storing the algorithm and the configuration data. This type of memory is known as Read Only
Memory (ROM), and it is not available for modification by the end user. The most common types of
ROM for storing the control algorithm and configuration of an embedded system are One Time
Programmable (OTP) ROM, Programmable ROM (PROM), Ultraviolet Erasable Programmable ROM (UVEPROM),
Electrically Erasable Programmable ROM (EEPROM) and Flash. Depending on the type of control
application, the memory capacity may vary from a few bytes to megabytes. On the other hand, the
embedded system might require temporary memory for performing arithmetic operations or executing the
control algorithm; this kind of memory is known as Random Access Memory (RAM). The
types of RAM include Static RAM (SRAM), Dynamic RAM (DRAM) and Non-Volatile RAM
(NVRAM).
An embedded system with no control algorithm is like a newborn baby: even with all the needed
peripherals, it is not capable of making any decision, whatever the situation in the real world
around it. For embedded systems it is the responsibility of the designer to impart intelligence to
the system, unlike a baby, whose brain is self-adaptive as he or she grows. In a controller-based
embedded system, the controller has internal memory for keeping the algorithm. Some controllers may
not have internal memory and may require an external (off-chip) memory for holding the control
algorithm.
Embedded systems are domain and application specific and are built around a central core. The core
of the embedded system falls into one of the following categories:
Almost 80% of embedded systems are processor or controller based. The processor may be a
microprocessor, a microcontroller or a digital signal processor, depending on the domain and
application. Most embedded systems in industrial control and monitoring applications make
use of commonly available microprocessors or microcontrollers, whereas domains that require signal
processing, such as speech coding and speech recognition, make use of special digital signal
processors supplied by manufacturers such as Analog Devices, Texas Instruments, etc.
Microprocessors
A microprocessor is a silicon chip representing a Central Processing Unit (CPU), which is capable of
performing arithmetic as well as logical operations according to a predefined set of instructions
specific to the manufacturer. In general the CPU contains the Arithmetic and Logic Unit (ALU),
control unit and working registers. A microprocessor is a dependent unit: it requires a
combination of other hardware such as memory, a timer unit and an interrupt controller for proper
functioning. Intel claims the credit for developing the first microprocessor unit, the Intel 4004,
a 4-bit processor, in November 1971. It featured 1K of data memory, a 12-bit program counter, 4K of
program memory, sixteen 4-bit general purpose registers and 46 instructions. It ran at a clock speed
of 740 kHz and was designed for the calculators of the day. In 1972, 14 more instructions were added
to the 4004 instruction set and the program space was upgraded to 8K. Interrupt capabilities were
also added, and it was renamed the Intel 4040. It was quickly replaced in April 1972 by the Intel
8008, which was similar to the Intel 4040. The only difference was that its program counter was
14 bits wide, and the 8008 served as a terminal controller.
In April 1974 Intel launched the first 8-bit processor, the Intel 8080, with a 16-bit address bus and
program
counter and seven 8-bit registers (A-E, H and L; the BC, DE and HL pairs formed the 16-bit registers
of this processor). The Intel 8080 was the most commonly used processor for industrial control and
other embedded applications in the mid-1970s. Since the processor required other hardware components,
as mentioned earlier, for its proper functioning, systems built around it were bulky and lacked
compactness.
Immediately after the release of the Intel 8080, Motorola also entered the market with its
Motorola 6800 processor, which had a different architecture and instruction set from the 8080.
In 1976 Intel came up with an upgraded version of the 8080, the Intel 8085, with two newly added
instructions, three interrupt pins and serial input/output. The clock generator and bus controller
circuits were built in, and the power supply was simplified to a single +5 V supply.
In July 1976 Zilog entered the microprocessor market with its Z80 processor as a competitor to Intel.
It was designed by a former Intel designer, Federico Faggin, and was an improved version of
the Intel 8080, maintaining the original 8080 architecture, with an 8-bit data bus and a 16-bit
address bus, and capable of executing all the instructions of the 8080. It included new
instructions and introduced the concept of register banking by doubling the register set. The Z80
also included two sets of index registers for flexible design.
Technical advances in the semiconductor industry brought a new dimension to the
microprocessor market, and the 20th century witnessed fast growth in processor technology. 16-, 32-
and 64-bit processors took the place of the conventional 8-bit processors. The initial 2 MHz clock
is now an old story; today processors with clock speeds up to 2.4 GHz are available in the market,
and more and more competitors have entered the processor market offering high-speed,
high-performance and low-cost processors for customer design needs.
Intel, AMD, Freescale, IBM, TI, Cyrix, Hitachi, NEC, LSI, etc. are the key players in the processor
market. Intel still leads the market with cutting-edge technology. Different
instruction set and system architectures are available for the design of a microprocessor; Reduced
Instruction Set Computing (RISC) and Complex Instruction Set Computing (CISC) are the two common
architectures available for processor design.
A general purpose processor (GPP) is a processor designed for general computational tasks. The
processor running inside your laptop or desktop (Pentium 4, AMD Athlon, etc.) is a typical example
of a general purpose processor. Such processors are produced in large volumes and target the general
market. A typical general purpose processor contains an ALU and a control unit. On the other hand,
Application Specific Instruction Set Processors (ASIPs) are processors with an architecture and
instruction set optimized for specific domain or application requirements such as network
processing, automotive, telecom, media, digital signal processing and control applications. The need
for an ASIP arises when a traditional general purpose processor is unable to meet the increasing
application needs. Most embedded systems are built around application-specific instruction set
processors. Some microcontrollers (like the automotive AVR and USB AVR from Atmel), systems on
chips, digital signal processors, etc. are examples of application-specific
instruction set processors. ASIPs incorporate a processor and on-chip peripherals, as demanded by the
application requirements, together with program and data memory.
Microcontrollers
A microcontroller is a highly integrated chip that contains a CPU, scratchpad RAM, special and
general purpose register arrays, on-chip ROM/Flash memory for program storage, timer and interrupt
control units and dedicated I/O ports. Microcontrollers can be considered a superset of
microprocessors. Since a microcontroller contains all the functional blocks necessary for
independent working, microcontrollers have found a greater place in the embedded domain than
microprocessors. Apart from this, they are cheap and readily available in the market.
Texas Instruments' TMS 1000 is considered the world's first microcontroller. TI followed Intel's
4004/4040 4-bit processor design and added some RAM, program storage memory (ROM)
and I/O support on a single chip, thereby eliminating the requirement for multiple hardware chips
for self-functioning. Provision to add custom instructions to the CPU was another innovative
feature of the TMS 1000. The TMS 1000 was released in 1974.
In 1977 Intel entered the microcontroller market with a family of microcontrollers under one
umbrella, the MCS-48™ family. The processors in this family were the 8038HL, 8039HL,
8040AHL, 8048H, 8049H and 8050AH. The Intel 8048 is recognized as Intel's first microcontroller and
was the most prominent member of the MCS-48™ family. It was used in the original IBM PC keyboard.
Eventually Intel came out with its most fruitful design in the 8-bit microcontroller domain, the 8051
family and its derivatives. It is the most popular and powerful 8-bit microcontroller ever built.
Almost 75% of the microcontrollers used in the embedded domain during the 1980s and 1990s were 8051
family-based controllers. 8051 processor cores are used in more than a hundred devices by more than
20 independent manufacturers such as Maxim, Philips, Atmel, etc. under license from Intel. Due to
their low cost, memory-efficient instruction set, mature development tools and Boolean processing
capabilities, 8051 family derivative microcontrollers are much used in high-volume consumer
electronic devices, industry and other gadgets where cost cutting is essential.
Another important family of microcontrollers used in industrial control and embedded applications is
the PIC family from Microchip Technology. It is a high-performance RISC
microcontroller, complementing the CISC (Complex Instruction Set Computer) features of the 8051.
High processing speed microcontroller families such as the ARM11 series are also available in the
market; they provide solutions for applications requiring hardware acceleration and higher
processing capability.
There are various key players in the microcontroller industry (e.g. Freescale, NEC, Philips, Daewoo,
Intel, Maxim, Sharp, Silicon Laboratories, TDK, Winbond, Atmel, etc.); however, Atmel has
special significance. It manufactures a variety of Flash memory based microcontrollers and
also provides in-system programmability for them. The Flash memory technique helps
in fast reprogramming of the chip and thereby reduces product development time. Atmel also
provides another special family of microcontrollers called AVR, an 8-bit RISC Flash microcontroller,
fast enough to execute powerful instructions in a single clock cycle and providing the latitude
needed to optimise power consumption.
The instruction set architecture of a microcontroller can be either RISC or CISC. Microcontrollers
are designed either for general purpose application requirements (general purpose controllers) or
for domain-specific application requirements (application-specific instruction set processors). The
Intel 8051 microcontroller is a typical example of a general purpose microcontroller, whereas the
automotive
AVR microcontroller family from Atmel Corporation is a typical example of an ASIP specifically
designed for the automotive domain.
Program Memory: Memory for storing the program required by the DSP to process the data;
Data Memory: Working memory for storing temporary variables and the data/signal to be
processed;
Computational Engine: Performs the signal processing in accordance with the stored
program. The computational engine incorporates many specialized arithmetic units,
each of which operates simultaneously to increase the execution speed; and
I/O Unit: Acts as an interface between the outside world and the DSP. It is responsible for
capturing the signals to be processed and delivering the processed signals.
Audio/video signal processing, telecommunication and multimedia applications are typical examples
where DSPs are employed. Digital signal processing involves a large amount of real-time calculation.
Sum of products (SOP) calculation, convolution, Fast Fourier Transform (FFT) and Discrete Fourier
Transform (DFT) are some of the operations performed by digital signal processors. The Blackfin
processor from Analog Devices is an example of a DSP that delivers breakthrough signal-processing
performance and power efficiency.
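Two of the operations named above, the sum of products and convolution, can be sketched in plain C
to show the kind of arithmetic a DSP is built to accelerate. The function names sum_of_products and
convolve and the choice of 16-bit samples with a 32-bit accumulator are assumptions made for this
illustration; a real DSP would execute the inner multiply-accumulate step in dedicated hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of products: acc = a[0]*b[0] + a[1]*b[1] + ... + a[n-1]*b[n-1].
 * Each loop iteration is one multiply-accumulate (MAC) step.
 * Assumes the inputs are small enough that the 32-bit accumulator does not overflow. */
int32_t sum_of_products(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * b[i];
    }
    return acc;
}

/* Discrete convolution: y[n] = sum over k of h[k] * x[n - k].
 * The caller must provide room for x_len + h_len - 1 output samples in y. */
void convolve(const int16_t *x, size_t x_len,
              const int16_t *h, size_t h_len, int32_t *y)
{
    for (size_t n = 0; n < x_len + h_len - 1; n++) {
        int32_t acc = 0;
        for (size_t k = 0; k < h_len; k++) {
            if (n >= k && n - k < x_len) {   /* skip terms outside the input signal */
                acc += (int32_t)h[k] * x[n - k];
            }
        }
        y[n] = acc;
    }
}
```

Note that each output sample of the convolution is itself a sum of products, which is why fast
multiply-accumulate hardware matters so much for signal processing.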
Reduced Instruction Set Computer (RISC) vs Complex Instruction Set Computer (CISC) Processors/
Controllers
RISC basically means that a RISC processor or microcontroller has a smaller number of
instructions (typically 30 to 40), whereas a CISC one has a larger number of instructions.
For example, the Atmel AVR microcontroller is a RISC processor whose instruction set
contains 32 instructions, and the 8051 microcontroller is a CISC controller with
up to 255 instructions. It is important to note that it is not the number of instructions alone that
determines whether a processor/controller is CISC or RISC; other factors such as pipelining (refer
to section 2.1) and instruction set type also count. Some important criteria are:
RISC has fewer instructions, whereas CISC has a greater number of
instructions;
RISC instruction sets are orthogonal (each instruction can operate on any register and
use any addressing mode), whilst CISC instructions are non-orthogonal (operations are
instruction specific) (refer to section 2.1);
RISC has fixed-length instructions, whereas CISC has variable-length instructions;
RISC controllers/processors use less silicon and have a lower pin count, whereas CISC uses more
silicon, since additional decoder logic is required to implement complex instruction decoding;
RISC controllers/processors typically implement the Harvard architecture, whereas CISC
typically implements the Von Neumann architecture;
With RISC, programmers need to write more code to execute a task since the instructions are
simple ones, whereas CISC instructions are like macros in the C language: a programmer can
achieve the desired functionality with a single instruction, which in turn provides the effect
of using several simpler instructions in RISC.
Microprocessors or controllers based on the Von Neumann architecture share a single bus for fetching
both instructions and data. Program instructions and data are stored in a common main memory. A Von
Neumann architecture based processor or controller first fetches an instruction and then fetches the
data to support the instruction from code memory. The two separate fetches slow down the
controller's operation. The Von Neumann architecture is also referred to as the Princeton
architecture, since it was developed at Princeton University.
Microprocessors or controllers based on the Harvard architecture have separate data and
instruction buses. This allows data transfer and program fetching to occur simultaneously on the
two buses. With the Harvard architecture, the data memory can be read and written while the program
memory is being accessed. These separate data memory and code memory buses allow one instruction
to execute whilst the next instruction is fetched (pre-fetching). Pre-fetching theoretically allows
much faster execution than the Von Neumann architecture. The figure below shows the two
architectures.
The Harvard architecture has separate buses for instruction and data fetching, whereas the Von
Neumann architecture has a single bus for both.
With the Harvard architecture it is easier to pipeline (refer to section 2.1), so high performance
can be achieved; this is not the case with the Von Neumann architecture, hence its lower
performance.
Harvard architecture implementation is comparatively costly, whereas that of the Von
Neumann architecture is cheaper.
The Harvard architecture has no memory alignment problems, whereas the Von Neumann architecture
allows self-modifying code.
With the Harvard architecture, since data memory and program memory are physically stored in
different locations, there is no chance of accidental corruption of program memory. On the other
hand, with the Von Neumann architecture, since data and program memory are physically stored in the
same chip, there is a chance of accidental corruption of program memory.
The processor/controller operation requires data to be stored in memory. Endianness specifies
the order in which the bytes of data are stored in memory by the processor in a multi-byte
system (a processor whose word size is greater than one byte). Suppose the word length is two bytes;
then the data can be stored in memory in two different ways: (1) the higher-order data byte at the
higher memory address and the lower-order data byte at the location just below it, or (2) the
lower-order data byte at the higher memory address and the higher-order data byte at the location
just below it.
Little-endian means the lower-order byte of the data is stored in memory at the lowest address
and the higher-order byte at the highest address (the little end comes first). For example, a
4-byte long integer Byte3 Byte2 Byte1 Byte0 will be stored with Byte0 at the base address,
followed by Byte1, Byte2 and Byte3 at successively higher addresses.
Big-endian means the higher-order byte of the data is stored in memory at the lowest address and
the lower-order byte at the highest address (the big end comes first). For example, a 4-byte
long integer Byte3 Byte2 Byte1 Byte0 will be stored with Byte3 at the base address,
followed by Byte2, Byte1 and Byte0 at successively higher addresses.
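The two byte orders described above can be made concrete with a short C sketch. The helper names
store_little_endian, store_big_endian and host_is_little_endian are introduced here purely for
illustration; index 0 of the output array stands for the lowest memory address.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value with the lower-order byte at the lowest address (out[0]). */
void store_little_endian(uint32_t value, uint8_t out[4])
{
    for (size_t i = 0; i < 4; i++) {
        out[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Store a 32-bit value with the higher-order byte at the lowest address (out[0]). */
void store_big_endian(uint32_t value, uint8_t out[4])
{
    for (size_t i = 0; i < 4; i++) {
        out[i] = (uint8_t)(value >> (8 * (3 - i)));
    }
}

/* Report how the processor running this code lays out multi-byte data:
 * returns 1 if the lower-order byte of a word sits at the lowest address. */
int host_is_little_endian(void)
{
    uint32_t probe = 1u;
    return *(const uint8_t *)&probe == 1u;
}
```

For the value 0x11223344, store_little_endian produces the bytes 44 33 22 11 from the lowest address
upward, while store_big_endian produces 11 22 33 44.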
The conventional execution of instructions by a processor or controller follows the
fetch-decode-execute sequence. To simplify the explanation of pipelining, let us consider
decode and execute together as one stage. During the decode operation, the memory address bus is
available, and if it is possible to utilize it for an instruction fetch, the processing speed can be
increased. Pipelining refers to the overlapped execution of instructions. The figure below
illustrates the concept of instruction pipelining for a single-stage pipeline.
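The benefit of overlapping the fetch of one instruction with the decode/execute of the previous one
can be estimated with a small C sketch. It assumes, for illustration only, that the fetch stage and
the combined decode/execute stage each take exactly one clock cycle and that there are no branches
or stalls; the function names are hypothetical.

```c
/* Cycles needed when every instruction is fetched and then decoded/executed
 * before the next fetch begins (no overlap): two cycles per instruction. */
unsigned long cycles_sequential(unsigned long instructions)
{
    return 2UL * instructions;
}

/* Cycles needed when the fetch of the next instruction overlaps the
 * decode/execute of the current one: one cycle to fill the pipeline,
 * then one instruction completes per cycle. */
unsigned long cycles_pipelined(unsigned long instructions)
{
    return instructions == 0UL ? 0UL : instructions + 1UL;
}
```

Under these assumptions, five instructions take 10 cycles without overlap and 6 cycles with it, and
the gain approaches a factor of two as the instruction count grows.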
2.2 Memory
Memory is an important part of processor/controller based embedded systems. Some
processors/controllers contain built-in memory, referred to as on-chip memory; memory external to
the chip is referred to as off-chip memory. Some working memory is also required for holding data
temporarily during certain operations. This section focuses on the different types of memory used in
embedded system applications.
The program memory (code or storage memory) of an embedded system stores the program instructions,
and it can be classified into different types as per the block diagram representation given in the
figure below:
Depending on the fabrication, erasing and programming technique they can be classified as follows;
design. The only limitation is that their capacity is limited (a few kilobytes) when compared to
standard ROM.
Flash
It is the most recent ROM technology and the most widely used ROM technology in today's embedded
system designs. It is a variation of the EEPROM technology. Flash memory is organized as sectors
(blocks) and combines the re-programmability of EEPROM with the high capacity of standard ROMs.
Flash stores information in an array of floating-gate MOSFETs. Erasing can be done at sector
level without affecting other sectors. Each sector should be erased before reprogramming.
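The erase-before-reprogram rule can be illustrated with a simplified software model of flash. The
sketch below assumes the common behaviour that erasing sets every bit of a sector to 1 and that
programming can only change bits from 1 to 0. The geometry (4 sectors of 16 bytes) and all
function names are hypothetical; a real device is driven through command sequences given in its
datasheet.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_COUNT 4u   /* hypothetical geometry for this sketch */
#define SECTOR_SIZE  16u

static uint8_t flash[SECTOR_COUNT][SECTOR_SIZE];

/* Erase one sector: all its bits return to 1 (0xFF).
 * Other sectors are not affected. Caller passes a valid index. */
void flash_erase_sector(unsigned sector)
{
    memset(flash[sector], 0xFF, SECTOR_SIZE);
}

/* Program one byte: bits can only go from 1 to 0, so the stored
 * value is the AND of the old content and the new data. */
void flash_program_byte(unsigned sector, unsigned offset, uint8_t data)
{
    flash[sector][offset] &= data;
}

uint8_t flash_read_byte(unsigned sector, unsigned offset)
{
    return flash[sector][offset];
}
```

In this model, programming 0xF0 over a byte that already holds 0x0F yields 0x00 rather than 0xF0,
which is why the sector has to be erased first.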
The RAM is the data memory or working memory of the controller/processor; data can be read from it
or written to it. RAM is volatile (when power is turned off, all contents are lost). It is a direct
access memory (i.e. a desired memory location can be accessed without traversing the entire memory).
This is in contrast with SAM (Sequential Access Memory), where desired locations in memory are
found by traversing the memory sequentially or via the “seek” method. Magnetic tapes, CD-ROMs
etc. are examples of SAM. RAM falls into three categories: Static RAM (SRAM), Dynamic
RAM (DRAM) and Non-Volatile RAM (NVRAM).
SRAM
SRAM is made up of flip-flops and stores data in the form of voltages. It is the fastest form of RAM.
A typical SRAM cell is realized using 6 transistors (MOSFETs). Four of the transistors are used to
build the latch (flip-flop) part of the memory cell. The SRAM cell is shown in the figure below;
The four transistors in the middle form the cross-coupled inverters. The memory cell is controlled by
the word line, which controls the access transistors (MOSFETs) Q5 and Q6 that connect the cell
to the bit lines B and B\. To write a value to the memory cell, apply the desired value to
the bit lines (for writing 1, make B=1 and B\=0; for writing 0, make B=0 and B\=1) and assert
the word line (make it high). This operation latches the bit written into the flip-flop. To read the
content of the memory cell, set both the B and B\ bit lines to 1 and set the word line to 1. The
major limitations of SRAM are low capacity and high cost. Since a minimum of six transistors is
required to build a single memory cell, imagine how few memory cells can be fabricated on a given
area of silicon.
DRAM
DRAM stores data in the form of charge and is made up of MOS transistor gates. Its
advantage is high density and low cost compared to SRAM. The disadvantage is that, since the
information is stored as charge, it leaks away with time; to prevent this, the cells need to be
refreshed periodically, at intervals of milliseconds. The MOSFET acts as the gate for the incoming
and outgoing data, whereas the capacitor acts as the bit storage unit. The figure below shows the
implementation of the DRAM.
Non-Volatile RAM (NVRAM)
It is a random access memory with battery backup. It contains static RAM based memory and a small
battery for supplying power to the memory in the absence of an external power supply. The memory and
the battery are put together in a single package. NVRAM is usually used for storing results of
operations or for setting up flags etc.
Execution of a program or configuration from ROM is generally slow (120 to 200ns)
compared to execution from RAM (40 to 70ns). Shadowing is a memory technique adopted to
solve this execution speed problem in processor-based systems. Instead of accessing information
from the ROM directly, manufacturers include a RAM behind the ROM, at the same address, as a shadow
to the ROM. The content of the ROM is copied to the shadow RAM, the RAM is then write-protected
and read access to the ROM is disabled. A typical example is the memory shadowing of the BIOS
(Basic Input Output System) configuration ROM.
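The idea of shadowing can be sketched in software. In the model below, two arrays stand in for the
ROM and the shadow RAM; in real hardware the switch between them is done by the address decoding
logic, not by a flag in a program. All names, sizes and contents are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ROM_SIZE 8u   /* hypothetical size for this sketch */

/* Stand-in for the slow ROM and its contents. */
static const uint8_t rom[ROM_SIZE] = {
    0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80
};

/* Stand-in for the faster RAM placed at the same addresses. */
static uint8_t shadow_ram[ROM_SIZE];
static bool shadow_enabled = false;

/* Copy the ROM content into the shadow RAM and redirect reads to it.
 * No write function is offered, which models the write protection. */
void shadow_enable(void)
{
    memcpy(shadow_ram, rom, ROM_SIZE);
    shadow_enabled = true;
}

/* Read one byte; served from the shadow RAM once shadowing is on. */
uint8_t read_firmware(unsigned address)
{
    return shadow_enabled ? shadow_ram[address] : rom[address];
}
```

The same address returns the same data before and after shadow_enable is called; only the memory
that answers the read changes.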
Embedded systems require a program memory for holding the control algorithm, or the embedded
OS and the applications designed to run on top of it (for OS-based designs); a data memory for
holding variables and temporary data during execution; and a memory for holding non-volatile data
(e.g. configuration data, look-up tables etc.) which is modifiable. The memory requirement of an
embedded system is application specific; however, a number of factors need to be considered when
selecting the type and size of memory for an embedded system.
Identify your system's requirements and, based on the processor (SoC or microcontroller with
on-chip memory) used for the design, decide whether the on-chip memory is sufficient or an
external memory is needed.
Know the memory parameters required for the embedded system design, e.g. when one is
looking for a memory chip of 750 bytes for an embedded system project, the closest option would
be the 1024-byte memory. We cannot go for 512 bytes as it would be below the needed
memory requirement (750 bytes).
Whilst selecting the memory, keep in mind the address range supported by your processor.
It is also critical to consider the word size of the memory. The word size refers to the number
of memory bits that can be read/written together at a time: 4, 8, 12, 16, 24, 32, 64 etc. Ensure
that the word size supported by the memory chip matches the data bus width of the
processor/controller.
For embedded systems with low power requirements, like portable devices, choose low-power
memory devices.
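Two of the checks above, choosing the chip size and checking the address range, can be sketched as
small helper functions. The sketch assumes that memory chips come in power-of-two sizes and that
each address selects one byte; the function names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Smallest power-of-two capacity (in bytes) that holds 'required'
 * bytes. Returns 0 if 'required' is 0 or no such size fits in size_t. */
size_t smallest_memory_size(size_t required)
{
    size_t capacity = 1;

    if (required == 0) {
        return 0;
    }
    while (capacity < required) {
        if (capacity > ((size_t)-1) / 2) {
            return 0;   /* doubling would overflow */
        }
        capacity *= 2;
    }
    return capacity;
}

/* Returns 1 if a memory of 'capacity' bytes can be addressed with
 * the given number of address lines (2^lines locations), else 0. */
int fits_address_range(size_t capacity, unsigned address_lines)
{
    if (address_lines >= sizeof(size_t) * CHAR_BIT) {
        return 1;
    }
    return capacity <= ((size_t)1 << address_lines);
}
```

For the 750-byte requirement in the example, smallest_memory_size returns 1024, and such a memory
needs at least 10 address lines.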
It was emphasized earlier that embedded systems interact with the real world, and the controlling/
monitoring functions executed by the embedded system are performed in accordance with the changes
happening in the real world. Changes in the system's environment or variables are detected by
sensors attached to the input port of the embedded system. If the embedded system is designed for
a controlling purpose, the system will produce changes in the controlled variable to bring it to a
desired value. This is achieved via an actuator connected to the output port of the embedded system.
2.3.1 Sensors
A sensor is a transducer that converts energy from one form to another for any measurement or control
purpose. Recall the smart running shoe technology: the sensor used to measure the distance between
the cushion and the magnet in the smart running shoe is a magnetic Hall effect sensor.
2.3.2 Actuators
The I/O subsystem of the embedded system facilitates the interaction of the embedded system with
the external world. As emphasized earlier, the interaction happens via the sensors and actuators
connected to the embedded system as input and output devices respectively. This section
illustrates some of the sensors and actuators used in embedded systems and the I/O systems that
facilitate the interaction of embedded systems with the external world.
The LED segments are named A to G, and the decimal point LED segment is named DP. For example, to
display the number 4, the segments F, G, B and C are lit. All 8 LED segments need to be
connected to one port of the processor/controller for displaying alphanumeric digits. The 7-segment
LED display is available in two different configurations: (1) Common Anode: the 8 anodes
are connected together; (2) Common Cathode: the 8 cathodes share a common line. Depending on the
configuration, the LED cathodes or anodes are connected to the port of the processor or controller
in order, from the “A” segment at the least significant port pin to the “DP” segment at the most
significant port pin. The figure below illustrates the two configurations.
Figure 2.11 Common Anode and Cathode configuration of a 7-segment LED display
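With the pin order described above (segment A on bit 0 up to DP on bit 7), the port value for each
digit can be kept in a look-up table. The sketch below assumes the usual electrical behaviour: in a
common cathode display a segment lights when its pin is driven high, and in a common anode display
it lights when its pin is driven low. The function names are hypothetical.

```c
#include <stdint.h>

/* Bit positions: A=0, B=1, C=2, D=3, E=4, F=5, G=6, DP=7.
 * Patterns for a common cathode display (1 = segment lit). */
static const uint8_t digit_patterns[10] = {
    0x3F, /* 0: A B C D E F   */
    0x06, /* 1: B C           */
    0x5B, /* 2: A B D E G     */
    0x4F, /* 3: A B C D G     */
    0x66, /* 4: B C F G       */
    0x6D, /* 5: A C D F G     */
    0x7D, /* 6: A C D E F G   */
    0x07, /* 7: A B C         */
    0x7F, /* 8: A B C D E F G */
    0x6F  /* 9: A B C D F G   */
};

/* Port value for a common cathode display; blank for invalid input. */
uint8_t seven_seg_common_cathode(unsigned digit)
{
    return (digit < 10) ? digit_patterns[digit] : 0x00;
}

/* Port value for a common anode display: the same pattern inverted,
 * because a segment lights when its pin is low. */
uint8_t seven_seg_common_anode(unsigned digit)
{
    return (uint8_t)~seven_seg_common_cathode(digit);
}
```

For the digit 4 (segments F, G, B and C), the common cathode value is 0x66 and the common anode
value is 0x99.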
Stepper Motor
A stepper motor is an electro-mechanical device which generates discrete motion in response to
“dc” electrical signals. It differs from the normal “dc” motor in that a “dc” motor produces
continuous rotation on applying a “dc” voltage, whereas the stepper motor produces discrete
rotations in response to the dc voltage applied to it. Stepper motors are widely used in industrial
embedded applications, consumer electronic products and robotic control systems. Based on the coil
winding arrangement, a two-phase stepper motor is classified as (1) Unipolar or (2) Bipolar.
Two-phase unipolar stepper motors are the popular choice for embedded system applications. The
figure below shows the circuit diagram for interfacing a stepper motor, through a driver, to
the port pins of a microprocessor or controller. The driver circuit is required because the
current requirement of the stepper motor is relatively high, and hence the port pins of the
microprocessor or controller may not be able to drive it directly.
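The discrete steps are produced by energizing the motor coils in a fixed order. The sketch below
assumes a unipolar motor whose four coil ends are driven (through the driver) from port bits 0 to 3,
and a simple full-step sequence in which one coil is energized at a time. The wiring, the sequence
choice and the names are assumptions for illustration; the function only computes the pattern that
would be written to the port.

```c
#include <stdint.h>

/* One coil energized at a time; bit n drives coil n (assumed wiring). */
static const uint8_t step_sequence[4] = { 0x01, 0x02, 0x04, 0x08 };

typedef struct {
    unsigned index;   /* current position in step_sequence */
} stepper_t;

/* Advance one step forward (non-zero) or backward (zero) and return
 * the coil pattern to output on the port pins. */
uint8_t stepper_step(stepper_t *motor, int forward)
{
    motor->index = forward ? (motor->index + 1u) % 4u
                           : (motor->index + 3u) % 4u;
    return step_sequence[motor->index];
}
```

Calling stepper_step repeatedly with the same direction walks through the sequence and wraps
around, which corresponds to continuous stepping in that direction.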
Relay
The relay is an electro-magnetic device. In an embedded system application the relay unit acts as a
dynamic path selector for signals and power. It contains a relay coil, made up of insulated wire
wound on a metal core, and a metal armature with one or more contacts. The relay works on the
electromagnetic principle: when current flows through the coil it produces a magnetic field, the
magnetic field attracts the armature and moves the contact, which in turn changes the power/signal
flow path. The relay is normally controlled using a relay driver circuit connected to the port pin
of the processor or controller. A transistor is used for building the relay driver circuit, as
illustrated below.
Piezo Buzzer
A piezo buzzer is a piezoelectric device for generating audio indications in embedded applications.
It contains a piezoelectric diaphragm which produces audible sound in response to the
voltage applied to it. Piezo buzzers are of two types: (1) self-driven and (2) externally driven.
A piezo buzzer can be directly interfaced to the port pin of the processor or controller. Depending
on the driving current requirements, it can also be interfaced using a transistor-based driver
circuit, as in the case of the relay.
Keyboard
The keyboard is an input device for user interfacing. A matrix keyboard is an optimum solution for
handling a large number of keys, since it reduces the number of interfacing connections. For
example, interfacing 16 keys with the direct interfacing technique requires 16 port pins, whereas
with a matrix keyboard only 8 lines are required. The 16 keys are arranged in a 4 column x 4 row
matrix as shown in the figure below;
In the matrix keyboard, the keys are arranged in matrix fashion. To detect a key press, the
keyboard uses a scanning technique, where each row of the matrix is pulled low in turn and the
columns are read. After reading the status of each column corresponding to a row, the row is pulled
high, the next row is pulled low and the status of the columns is read again. This process is
repeated until all rows have been scanned.
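The scanning procedure can be sketched as follows. Because real port access is processor specific,
the sketch simulates the hardware: write_rows and read_columns are hypothetical stand-ins for port
writes and reads, and the columns are assumed to idle high through pull-ups so that a pressed key
pulls its column low while its row is low.

```c
#include <stdbool.h>
#include <stdint.h>

#define ROWS 4
#define COLS 4
#define ALL_HIGH 0x0Fu

/* --- Simulated hardware (stands in for real port registers) --- */
static bool key_down[ROWS][COLS];      /* true while a key is held */
static uint8_t row_levels = ALL_HIGH;  /* bit r = level on row r   */

/* Simulation helper: press or release the key at (row, col). */
void sim_set_key(int row, int col, bool pressed)
{
    key_down[row][col] = pressed;
}

/* Drive the four row lines; a 0 bit pulls that row low. */
static void write_rows(uint8_t levels)
{
    row_levels = levels & ALL_HIGH;
}

/* Read the four column lines. A pressed key connects its column to
 * its row, so the column reads low only while that row is low. */
static uint8_t read_columns(void)
{
    uint8_t cols = ALL_HIGH;
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            if (key_down[r][c] && !(row_levels & (1u << r))) {
                cols &= (uint8_t)~(1u << c);
            }
        }
    }
    return cols;
}

/* Scan the matrix row by row. Returns row * COLS + col for the first
 * pressed key found, or -1 if no key is pressed. */
int scan_keypad(void)
{
    for (int row = 0; row < ROWS; row++) {
        write_rows((uint8_t)(ALL_HIGH & ~(1u << row))); /* row low  */
        uint8_t cols = read_columns();
        write_rows(ALL_HIGH);                           /* row high */
        for (int col = 0; col < COLS; col++) {
            if (!(cols & (1u << col))) {
                return row * COLS + col;
            }
        }
    }
    return -1;
}
```

On a real board only scan_keypad would remain as written; the two static functions would be
replaced by accesses to the port that the 8 keypad lines are wired to.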
A communication interface is essential for communicating with the various subsystems of the embedded
system and with the external world. For an embedded product, the communication interface can be
viewed from two perspectives: the device/board level communication interface (Onboard
Communication Interface) and the product level communication interface (External Communication
Interface). An embedded product is a combination of different types of components (chips/devices)
arranged on a printed circuit board (PCB). The communication channel which interconnects the various
components within an embedded product is referred to as the device/board level communication
interface (onboard communication interface). Serial interfaces like I2C, SPI, UART, 1-Wire etc. and
the parallel bus interface are examples of the ‘Onboard Communication Interface’.
Some embedded systems are self-contained units and do not require any interaction or data
transfer with other sub-systems or the external world. On the other hand, certain embedded systems
may be part of a large distributed system and require interaction and data transfer between
various devices and sub-modules. The ‘Product level communication interface’ (External
Communication Interface) is responsible for data transfer between the embedded system and other
devices or modules. The external communication interface can be either wired or wireless, and it
can be a serial or a parallel interface. Infrared (IR), Bluetooth (BT), Wireless LAN (Wi-Fi),
Radio Frequency waves (RF), GPRS etc. are examples of wireless communication interfaces.
RS-232C/RS-422/RS-, USB, Ethernet, IEEE 1394 port, Parallel port, CF-II interface, SDIO,
PCMCIA etc. are examples of wired interfaces. It is not mandatory that an embedded system
should contain an external communication interface. Mobile communication equipment is an example
of an embedded system with an external communication interface.
The following section gives an overview of the various ‘Onboard’ and ‘External’ communication
interfaces for an embedded product. We will discuss the various physical interfaces, firmware
requirements, and initialization and communication sequences for these interfaces in a dedicated
book titled ‘Device Interfacing’, which is planned under this series.
Parallel Interface
The on-board parallel interface is normally used for communicating with peripheral devices which are
memory mapped to the host of the system. The host processor/controller of the embedded system
contains a parallel bus, and a device which supports the parallel bus can connect directly to this
bus system. Communication through the parallel bus is controlled by the control signal interface
between the device and the host. The “Control Signals” for communication include the “Read/Write”
signal and the device select signal. The device normally contains a device select line and becomes
active only when this line is asserted by the host processor.
The External Communication Interface refers to the different communication channels/buses used by
the embedded system to communicate with the external world. The following section gives an overview
of the various interfaces for external communication.
USB transmits data in packet format. Each data packet has a standard format. USB
communication is host initiated. The USB host contains a host controller which is responsible
for controlling the data communication, including establishing connectivity with USB slave devices,
and packetizing and formatting the data. There are different standards for implementing the USB
host controller interface, namely the Open Host Controller Interface (OHCI) and the Universal Host
Controller Interface (UHCI).
USB supports four different types of data transfer, namely Control, Bulk, Isochronous and Interrupt.
Control transfer is used to query, configure and issue commands to a USB device. Bulk transfer is
used for sending a block of data to a device and supports error checking and correction.
Transferring data to a printer is an example of bulk transfer. Isochronous data transfer
is used for real-time data communication. In isochronous transfer, data is transmitted as streams in
real-time. Isochronous transfer doesn’t support error checking and re-transmission of data in case
of any transmission loss. All streaming devices, like audio devices and medical equipment for data
collection, make use of isochronous transfer. Interrupt transfer is used for transferring small
amounts of data. The interrupt transfer mechanism makes use of polling to see whether the USB
device has any data to send. The frequency of polling is determined by the USB device and varies
from 1 to 255 milliseconds. Devices like mice and keyboards, which transmit small amounts of
data, use interrupt transfer.
Infrared (IrDA)
Infrared is a serial, half duplex, line-of-sight based technology for data communication between
devices. It has been in use since the early days of communication and you may be familiar with it.
It uses infrared waves of the electromagnetic spectrum for transmitting data. Infrared supports
point-to-point and point-to-multipoint communication, provided all devices involved in the
communication are within the line of sight. The typical communication range for infrared is 10cm to
1m. The range can be increased by increasing the transmitting power of the IR device. IR supports
data rates ranging from 9600 bits/second to 16Mbps.
Bluetooth (BT)
Bluetooth is a low cost, low power, short range wireless technology for data and voice communication.
Bluetooth operates in the 2.4GHz band of the radio frequency spectrum and uses the Frequency
Hopping Spread Spectrum (FHSS) technique for communication. It supports data rates up to
1Mbps and a range of approximately 30 feet for data communication. Bluetooth supports
point-to-point and point-to-multipoint wireless communication. A Bluetooth device can act as either
a master or a slave in a network known as a piconet.
Embedded software is typically hidden in watches, VCRs, cellular phones and toasters; it also guides
missiles, controls satellites and is used in medical instruments. These software applications are
not like the general-purpose software that runs on desktop computers. This application scenario
makes embedded software special and different from standard desktop software. The requirements
and the characteristics of the hardware platform vary because embedded systems differ in their
hardware.
In a way, software for embedded systems interacts with the physical world and therefore:
Takes time – the time consumed by the software matters in relation to the timing of
external events.
It consumes power – Software for embedded systems should be efficient and consume little power.
It does not terminate – Unlike standard programs, which are expected to terminate in a
finite number of steps, embedded system programs typically do not terminate.
Reliability: Human intervention is often not possible for error handling, and the software is
expected to run without human intervention for a long period.
Cost-effectiveness: Software applications for embedded systems are expected to be cost
effective in their development.
Low-power consumption: Software applications for embedded systems must use little power
when running.
Efficient use of memory: Applications for embedded systems should make efficient use
of memory because memory is bounded.
Performance requirement: The most important performance requirement is that of timeliness.
Timeliness: Computation takes time, and even if we had an infinitely fast computer, embedded
software would need to deal with time because the physical processes with which it interacts
evolve over time. Therefore timing conditions and constraints have to be met. The
other important issue is that performance gains from elaborate caching schemes,
speculative instruction execution and branch prediction are avoided in microcontrollers and
DSPs, since they compromise system efficiency and reliability.
Concurrency: An embedded system engages the physical world, where multiple things
happen at the same time, and so concurrency becomes an issue. Embedded systems need to
react to stimuli from a network and a variety of sensors and retain control over actuators,
and therefore the processing should happen concurrently.
Liveliness: Software programs for embedded systems must not terminate or block waiting for
events that will never occur, because if a program gets blocked in such a way the whole system
will basically fail.
Interfaces: When software is built, a kind of composition is done to reuse components.
Components are combined according to their interfaces, based upon static behaviour. In terms of
dynamic processes, the issue to deal with is how to combine different processes to work in a
cooperative way.
Heterogeneity: Since we are getting software from different sources and also dealing with
external events which are of different natures, heterogeneity becomes an important issue. If we
consider a simple mobile phone, the job of receiving the speech data in a compressed form,
decompressing it and playing it back has different characteristics from that of accessing the
phonebook. There are therefore two different computation styles, and these computations have
to be handled together. Also, the events which occur in the external world may be of different
types. They may be regular or irregular, and the software needs to handle both kinds of events
without unnecessary delays.
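One common way to respect the liveliness requirement above is to bound every wait. The sketch below
polls for an event a limited number of times and then gives up, so the caller can take a recovery
action instead of blocking forever. The function names and the poll-count limit are hypothetical;
a real system would often use a hardware timer for the limit. countdown_poll is only an example
event source for trying the sketch out.

```c
#include <stdbool.h>

/* Type of a function that reports whether the awaited event occurred. */
typedef bool (*event_poll_fn)(void *ctx);

/* Poll for an event at most max_polls times.
 * Returns true if the event occurred, false if the limit was reached. */
bool wait_for_event(event_poll_fn poll, void *ctx, unsigned long max_polls)
{
    for (unsigned long i = 0; i < max_polls; i++) {
        if (poll(ctx)) {
            return true;
        }
    }
    return false;   /* caller decides how to recover */
}

/* Example event source: reports the event once the counter that
 * ctx points to has counted down to zero. */
bool countdown_poll(void *ctx)
{
    unsigned *remaining = ctx;

    if (*remaining == 0) {
        return true;
    }
    (*remaining)--;
    return false;
}
```

The important design choice is that wait_for_event always returns: an event that never occurs
produces a false result rather than a stalled system.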
The time between the presentation of a set of inputs to a system and the realization of the required
behaviour, including the availability of the associated output, is called the response time of the
system. When an external signal arrives, its characteristics impose bounds on the response, and
that leads us to the formal definition of real-time systems.
A real-time system is a system that must satisfy explicit bounded response-time constraints or risk
severe consequences, including failure.
It is to be noted that not all embedded systems are real-time systems. Real-time
systems can be further categorized as:
A soft real-time system is one in which performance is degraded but not destroyed by failure to meet
response-time constraints.
A hard real-time system is one in which failure to meet a single deadline may lead to complete and
catastrophic system failure.
A firm real-time system is one in which a few missed deadlines will not lead to a total failure but
missing more than a few may lead to complete and catastrophic failure.
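The three categories can be contrasted in a small sketch that maps a count of missed deadlines to a
system status. It is a deliberate simplification: the definitions above say failure "may" result,
while the code treats it as certain, the number of tolerated misses for a firm system is a
hypothetical parameter (the text only says "a few"), and reporting a firm system within its
tolerance as degraded is also an assumption.

```c
typedef enum { RT_SOFT, RT_FIRM, RT_HARD } rt_class_t;
typedef enum { SYS_OK, SYS_DEGRADED, SYS_FAILED } sys_status_t;

/* Classify the outcome of 'missed' missed deadlines.
 * 'tolerated' is only used for firm real-time systems. */
sys_status_t assess_deadlines(rt_class_t cls, unsigned missed,
                              unsigned tolerated)
{
    if (missed == 0) {
        return SYS_OK;
    }
    switch (cls) {
    case RT_HARD:
        return SYS_FAILED;              /* a single miss is fatal   */
    case RT_FIRM:
        return (missed > tolerated) ? SYS_FAILED : SYS_DEGRADED;
    case RT_SOFT:
    default:
        return SYS_DEGRADED;            /* degraded, not destroyed  */
    }
}
```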
Dynamic Efficiency – This is the number of CPU cycles required for execution of all
instructions in the software when running.
Static Efficiency – The number of bytes required, that is, the RAM and ROM sizes needed.
Power Consumption – The amount of power consumed by the devices running the developed
applications.
These are the three most important parameters considered when measuring the quality of software for
embedded applications.
The correctness of software – That is the ability of software to do exactly what it was
developed to do.
Easy to understand – Software must be less complex and easy to comprehend and understand.
Easy to change – All software must be flexible and maintainable.
Determination of requirements
Design system software architecture
Select operating system, if any
Choose the development platform based on the OS platform chosen
Coding of the application
Optimization of the code according to the requirement so as to meet the quality measures
Verification of the software on the host system
Verification of software on the target system
The choices of programming language for software development are as follows:
Assembly Language: This is processor dependent and efficient, since the developer can exploit
the architectural features of the target processor directly. But it is difficult to write and
maintain large programs, because they become complex as the number of instructions
grows. As the number of instructions increases, the probability of bugs in the code also increases.
High-level language: This ensures portability and easier software development, because the
number of lines of code that need to be handled for a given task can be
much smaller and the compiler performs the translation. Most development is done today using
structured, high-level languages. But some assembly-level programming may
still be necessary. For example, device driver software uses assembly language for the portions of
the program that communicate with the devices they control (drive), because these involve detailed
considerations and extensive bit manipulation.
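The kind of bit manipulation a device driver performs can often be written in C as well. The
sketch below shows the usual idioms for setting, clearing, testing and updating bits in an 8-bit
device register. The helper names are hypothetical, and on real hardware the pointer would refer to
a register address taken from the device datasheet.

```c
#include <stdint.h>

/* Set the bits selected by mask, leaving the others unchanged. */
void reg_set_bits(volatile uint8_t *reg, uint8_t mask)
{
    *reg |= mask;
}

/* Clear the bits selected by mask, leaving the others unchanged. */
void reg_clear_bits(volatile uint8_t *reg, uint8_t mask)
{
    *reg &= (uint8_t)~mask;
}

/* Return 1 if any bit selected by mask is set, otherwise 0. */
int reg_test_bits(const volatile uint8_t *reg, uint8_t mask)
{
    return (*reg & mask) != 0;
}

/* Write 'value' into the multi-bit field described by mask and shift,
 * leaving all bits outside the field unchanged. */
void reg_write_field(volatile uint8_t *reg, uint8_t mask,
                     unsigned shift, uint8_t value)
{
    *reg = (uint8_t)((*reg & ~mask) | ((value << shift) & mask));
}
```

The volatile qualifier tells the compiler that each access must really be performed, which matters
when the location is a hardware register rather than ordinary memory.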
Build Process
Software development for embedded systems goes through what is known as a build process. When the
target platform is known, tools can exploit features of the hardware and OS in the build process.
But assumptions about the target cannot always be made, and in that case the user needs to provide
information about the target. The build environment for this kind of development must therefore
always have provision for users to supply these kinds of parameters. The most important component
of the build process is compiling.
The Compiler:
Compiling is important for object code generation and for code optimization, because the
object code has to be optimized with respect to the target architecture. The generated code needs
to be optimized because high-level language code cannot simply be translated straight into
target machine instructions. Some kind of code transformation is needed
if we want to exploit the register set of the processor optimally. This aspect of compiler design
and implementation is of tremendous relevance and importance for embedded systems. The other
important issue is that a cross-compiler or cross-assembler is used. These compilers or assemblers
run on a host which is different from the target machine. A typical standard C compiler on a
Linux machine runs on, and generates code for, the processor of that same machine. A
cross-compiler may run on a Linux machine but generate code for a different processor, for example
an ARM processor. Compilers generate object files.
Object file:
The contents of an object file can be thought of as a very large, flexible data structure which
contains the instructions and data resulting from the translation process. The object file is not
directly an executable image. Object files generated on the host system come in different standard
forms, which are:
They come in standard forms so that they can be used with the linkers of all the software tools
which target the specific platform for generating the executable image.
Linking is the basic process of merging sections from multiple object files. Unresolved references
to symbols are replaced by references to the actual variables or function calls. These are bound to
the correct relative addresses. A special object file that contains compiled start-up code is also
included.
Start-up code is a small block of assembly language code that prepares the way for execution of the
application code. It is present in, and typical of, all embedded systems. Start-up code is needed
to initialize the system and get the code running.
The following steps prepare the environment for the software application to run on the particular
embedded system platform. On a general-purpose computing platform many of these steps would
normally be done by a loader, but in this case we are loading software directly onto the processor's
memory or the target board's memory for execution. Therefore all start-up code should include all
these features.
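As a rough illustration of the kind of preparation start-up code performs, the sketch below shows
two steps that are commonly part of it, copying initialized data from ROM to RAM and clearing
uninitialized data, before control is handed to the application. Real start-up code is usually
written in assembly and also sets up the stack pointer, which portable C cannot express. The
arrays are hypothetical stand-ins for memory regions whose real addresses would come from the
linker, and application_main stands in for the application entry point.

```c
#include <stdint.h>
#include <string.h>

#define DATA_SIZE 4u   /* hypothetical region sizes */
#define BSS_SIZE  4u

/* Initial values of initialized variables, as stored in ROM. */
const uint8_t rom_data_image[DATA_SIZE] = { 1, 2, 3, 4 };

/* RAM region for initialized variables. */
uint8_t ram_data[DATA_SIZE];

/* RAM region for uninitialized variables; pre-filled with a
 * pattern here to mimic unknown power-up contents. */
uint8_t ram_bss[BSS_SIZE] = { 0xAA, 0xAA, 0xAA, 0xAA };

int main_called = 0;

/* Stand-in for the application entry point. */
static void application_main(void)
{
    main_called = 1;
}

/* Prepare memory, then start the application. */
void startup(void)
{
    memcpy(ram_data, rom_data_image, DATA_SIZE); /* copy initial values */
    memset(ram_bss, 0, BSS_SIZE);                /* clear the rest      */
    application_main();
}
```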
Locator:
A locator is either a separate tool or bundled together with a linker. The designer provides
information about the memory on the target board to the locator, and the locator uses this
information to assign memory addresses to the code and data sections. The GNU linker has this
provision: memory information can be passed to the GNU linker in the form of a linker script. The
memory map of the target board has to be supplied because the host system where the linking is
done does not know this memory map. The linker also does the job of relative address resolution.
The locator therefore supplies the memory map so that the memory image can actually be created.
This is used along with the start-up code for partitioning memory between the stack, the heap and
the program code. After the partitioning, the complete bundle can be loaded onto the target board
for execution.
Memory Allocation:
Memory allocation is one of the basic issues for any kind of software environment.
Global Variables
Heap – for dynamic memory allocation
Stack – for local variables, return addresses and temporary data
The content of these locations changes at run-time. Since the RAM is distributed among these
components, their sizes can be estimated beforehand, thereby helping the
developer determine the size of RAM that is actually required on the target board.
If there are both ROM and EPROM, the partitioning depends upon the requirements of the design. The
configuration data can typically be put in ROM and the software in EPROM, which can then be
modified or updated at any time. Flash memory can also hold the basic software, but in many cases
flash is used for data storage. Flash memory is slow compared to RAM. Therefore RAM is used
to shadow code or parameters to improve performance.
Running a Program:
When running a program, if the development processor is different from the target, the code is
downloaded to the target system or to a simulator.
In developing software for embedded systems, the desirable high-level language features to look for
are as follows:
Versatile Parameter Passing: In call by value, the parameter is copied onto the stack at
considerable execution time cost; therefore large data structures are never passed by value,
but rather a pointer to them is passed by value. Call by reference means that indirect addressing
mode is used for accessing the variable, which increases the execution time of the procedure.
There is therefore a need to decide which mechanism to follow in order to optimize the
time for parameter passing.
Global Variables vs. Parameter Lists: A reference to a global variable is faster because
direct addressing is used. But parameter lists make interfaces clearly defined. If
only global variables are used, modifying them can have side effects, and
that can produce bugs. That is why global variables are to be avoided for the purpose of software
design; however, they provide the fastest mechanism for communication and as such are used when
pressed for time.
Recursion: Recursion is another feature that is to be avoided, because recursion
requires indirect addressing and stack allocations, and its memory requirement at run-time is
unknown. Therefore, in writing software applications for embedded systems, recursion should be
replaced with iterative constructs.
Re-entrant Procedure: A re-entrant procedure can be used by several concurrently running tasks
in a multi-tasking system. Re-entrancy implies that, when writing such procedures for
multi-tasking systems, care must be taken to save the state of the processor. This means all the
registers and variables that are used must be saved on the stack before the procedure starts
execution. Once the procedure completes, the state in which the processor was is
restored. This ensures that if the same procedure is invoked by another concurrent
process, it can be used without disturbing
the data of the invocation by the original thread.
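Two of the points above, passing a pointer instead of a large structure and replacing recursion
with iteration, can be sketched in C as follows. The structure, its size and the function names are
hypothetical and serve only as an illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* A larger data structure that would be costly to copy by value. */
typedef struct {
    int16_t samples[64];
    size_t  count;       /* number of valid entries in samples */
} sample_buffer_t;

/* Only the pointer is copied onto the stack, not the 64 samples.
 * 'const' documents that the function does not modify the buffer. */
int32_t buffer_sum(const sample_buffer_t *buf)
{
    int32_t total = 0;
    for (size_t i = 0; i < buf->count; i++) {
        total += buf->samples[i];
    }
    return total;
}

/* Recursive form: stack usage grows with n. */
uint32_t factorial_recursive(uint32_t n)
{
    return (n <= 1) ? 1 : n * factorial_recursive(n - 1);
}

/* Iterative form: same result with a fixed amount of stack. */
uint32_t factorial_iterative(uint32_t n)
{
    uint32_t result = 1;
    while (n > 1) {
        result *= n--;
    }
    return result;
}
```

All three functions use only their arguments and local variables, so they can also be called from
several tasks at once without one call disturbing the data of another.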
The C programming language is very widely used because its constructs are close in functionality
to assembly language constructs.
Object-oriented languages like C++ can also be used because they allow design components to be
reused. The reusability comes from composition, because one can build an object by
combining objects which have already been designed, and from inheritance.
Java also satisfies the object-oriented criteria and at the same time provides code mobility. The
Java Virtual Machine (JVM) is a kind of software simulation of a target machine architecture,
which allows code to run in a platform-independent fashion.
CHAPTER THREE
EMBEDDED SYSTEM - ARM CORTEX
The ARM processor is basically a 32-bit processor meant particularly for high-end applications,
that is, applications which involve more complex computation. The ARM processor was first
developed at Acorn Computers Limited of Cambridge, England, between 1983 and 1985. This was just
after 1980, when the concept of RISC architecture was introduced at Stanford and Berkeley.
Subsequently ARM Limited was formed in 1990, and this company popularized the ARM processor
core, which it licenses to a number of other manufacturers to make a variety of chips around the
same processor core. Therefore what we shall be looking at is not just a family of processors but
conceptually a CPU architecture which may figure in a number of different chips intended for
embedded applications.
The ARM processor is an embedded system processor used in the majority of today’s microcontrollers
and in a wide range of embedded applications such as battery-powered devices (e.g. health
monitoring and fitness applications, medical meters etc.), automotive, IoT, mobile and home
appliances, home automation, toys and consumer products, PC and mobile accessories and many
more. For example, the Fitbit Flex (https://www.fitbit.com/flex2), a fitness tracking device with
wireless synchronization, has the ARM Cortex M3 as its processor. This processor is part of a
microcontroller from STMicro that targets low-power embedded system applications. Another
example is the TomTom Spark 3 GPS multisport fitness watch, powered by the ARM Cortex M7 (in an
Atmel microcontroller). Many popular microcontroller manufacturers produce microcontrollers based
on the 32-bit ARM Cortex M processors. They include TI, STMicro, Toshiba, NXP, Microchip, and
Broadcom. The ARM Cortex core is favoured by most microcontroller manufacturers as a result of its
minimal cost, minimal power and minimal silicon area, its high performance, and its very powerful
and easy-to-use interrupt controller supporting up to 240 external interrupts. One of the major
competitors to the ARM core is the AVR processor core of 8/16/32 bit (the majority of Arduino
microcontrollers use the AVR processor architecture).
The ARM Cortex Mx, where the “x” could be 0, 3 or 4 (thus ARM Cortex M0/M3/M4), provides two
operational modes known as the Thread mode and the Handler mode. The designer’s application
code executes in “Thread mode”, which is also called “User mode”. All exception handlers and
interrupt handlers are executed in the “Handler mode” of the processor. This means that when the
processor receives an interrupt from any peripheral during code execution, it switches from the
“thread/user mode” to the “handler mode” to run the interrupt service routine (ISR) associated
with that exception or interrupt. It is the core of the processor which provides these two modes,
depending on the code being executed. The processor core begins execution in the “thread/user
mode”, and whenever an interrupt or exception is encountered it switches to the “handler mode” in
order to service the ISR associated with that system exception or interrupt.
Furthermore, the ARM Cortex M offers two access levels: (1) Privileged Access Level (PAL) and (2)
Non-Privileged Access Level (NPAL).
When a designer’s code is running with PAL, the code has full access to all the
processor-specific resources and restricted registers (including the ability to change the
content of such registers).
On the other hand, if one’s code is running with NPAL, there is no access to some of the
restricted registers of the processor core.
By default embedded code runs in PAL. The CONTROL register is used to switch between the access
levels: it controls the stack used and the privilege level for software execution when the
processor is in “Thread mode”. This is done by programming bit [0] of the CONTROL register (“0”
indicates privileged and “1” unprivileged). When the processor is executing code in “thread
mode”, it is possible to move the processor to NPAL. Once you move from PAL to NPAL in thread
mode, it is not possible to come back to PAL unless you change the processor operation mode to
“handler mode”. Code in “handler mode” always executes with PAL.
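The rules above can be summarized in a small Python model. This is a toy sketch, not a
processor simulator: the class and method names are hypothetical, and the model assumes that
a write to the CONTROL register from unprivileged code is simply ignored, which is how the
"no way back from NPAL in thread mode" rule shows up in practice.

```python
class CortexMPrivilegeModel:
    """Toy model of Thread/Handler mode and bit [0] of the CONTROL register."""

    def __init__(self):
        self.mode = "thread"      # execution starts in Thread mode
        self.control_bit0 = 0     # 0 = privileged, 1 = unprivileged

    def is_privileged(self):
        # Handler mode is always privileged; Thread mode follows CONTROL[0].
        return self.mode == "handler" or self.control_bit0 == 0

    def write_control_bit0(self, value):
        # Assumption: a write from unprivileged code has no effect.
        if self.is_privileged():
            self.control_bit0 = value & 1

    def enter_handler(self):
        self.mode = "handler"     # an exception or interrupt was taken

    def exit_handler(self):
        self.mode = "thread"      # return from the exception


cpu = CortexMPrivilegeModel()
cpu.write_control_bit0(1)         # drop to NPAL in Thread mode
cpu.write_control_bit0(0)         # ignored: unprivileged code cannot undo it
stuck = not cpu.is_privileged()
cpu.enter_handler()               # Handler mode always runs with PAL
cpu.write_control_bit0(0)         # now the write takes effect
cpu.exit_handler()
print(stuck, cpu.is_privileged())
```

The only path back to PAL in this model goes through Handler mode, matching the description
above.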
A register (CPU register) is one of a small set of data holding places that are part of the computer
processor core. A register may hold an instruction, a storage address, or any kind of data (such as a
bit sequence or individual characters). Some instructions specify registers as part of the
instruction, whilst other registers have specific hardware functions and may be read-only or
write-only.
The ARM Cortex M processor core has 32-bit registers that include 13 general-purpose registers and
several special-purpose registers.
Registers R13, R14, and R15 have the following special functions:
Link Register (LR): This is register R14. It stores the return information for subroutines,
function calls, and exceptions. At all other times, you can treat R14 as a general-purpose
register. On reset the processor sets the LR value to 0xFFFFFFFF.
Program Counter (PC): The program counter is register R15. It contains the current program
address. On reset the processor loads the PC with the value of the reset vector, which is
stored at address 0x00000004. Bit [0] of the value is loaded into the EPSR T-bit at reset and
must be 1.
Stack Pointer (SP): Register R13 is used as the stack pointer. Because the SP ignores writes
to bits [1:0], it is automatically aligned to a word (four-byte) boundary. Handler mode always
uses SP_main, but you can configure Thread mode to use either SP_main or SP_process.
The ARM Cortex M registers discussed in this section are the special-purpose registers: the
Program Status Register (PSR), the Exception Mask Registers (EMS) and the CONTROL register.
Program Status Register (PSR): The PSR holds the status of the current execution of the
program. It is 32 bits wide and is a collection of three registers:
o Application PSR (APSR): Uses only 5 bits (bits [31:27]) of the 32-bit PSR. The APSR
contains the current state of the condition flags from previous instruction executions.
o Interrupt PSR (IPSR): Uses bits [8:0] of the 32-bit PSR. It contains the exception type
number of the current Interrupt Service Routine (ISR).
o Execution PSR (EPSR): Uses bits in between; the most important is the “T” bit, the
Thumb state bit. The ARM Cortex M(4) processor only supports execution of
instructions in Thumb state. ARM processors in general support interworking, so one
can switch between the ARM and Thumb instruction sets. If the T-bit is set, the
processor treats the next instruction to be executed as belonging to the “Thumb”
instruction set. If the T-bit is cleared, the processor treats the next instruction as
belonging to the “ARM” instruction set. The Cortex M processors do not support the
“ARM” instruction set. Hence the value of the T-bit must always be 1; otherwise a
“Usage fault” exception results. Furthermore, bit [0] of a value written to the program
counter is linked to this T-bit. Hence any address placed in the PC must have bit 0 set
to 1, which is normally taken care of by the compiler.
Like the ARM instruction set, the THUMB instruction set belongs to an advanced RISC
load/store machine. THUMB shares many properties with ARM, as it operates as a
subset of the ARM architecture. The two are fundamentally the same processor: they
run on the same silicon and operate in much the same way; indeed, on cores that
support both, one can switch between the two states in the same program. What makes
THUMB different from ARM is the register set available to most instructions and the
instruction size. The THUMB register set is a subset of the ARM register set
mentioned above. Instead of 16 registers, only eight general-purpose registers, R0-R7,
are available to most instructions. The instructions also differ in size: 16 bits for
THUMB vs. 32 bits for ARM; the registers themselves remain 32 bits wide. In
addition to these GPRs, as in the ARM state, there are the PC, SP, program status and
Link registers. The THUMB registers are directly mapped onto the ARM state
registers, facilitating a convenient transfer of data when switching between the two
states. It should also be noted that the THUMB program counter differs from ARM
to accommodate the smaller instruction size: THUMB instructions are half-word
aligned, so only bit [0] of an instruction address is 0, compared to the word-aligned
instructions of the ARM state. Thumb is in effect a 16-bit instruction encoding for a
32-bit processor, used when compact code matters more than the full 32-bit
instruction set. Its aim is to increase code density. Therefore Thumb is targeted at
memory-constrained embedded systems.
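To make the three views of the PSR concrete, the following Python sketch splits a 32-bit PSR
value into its fields. The function name and the sample value are hypothetical. The bit
positions used for the condition flags (bits 31 down to 27) and for the T-bit (bit 24) follow
the ARMv7-M register layout and are stated here as an assumption; only the IPSR range of
bits [8:0] is taken from the description above.

```python
def decode_psr(psr):
    """Split a 32-bit combined PSR value into APSR, EPSR and IPSR fields."""
    return {
        "N": (psr >> 31) & 1,             # APSR: negative flag
        "Z": (psr >> 30) & 1,             # APSR: zero flag
        "C": (psr >> 29) & 1,             # APSR: carry flag
        "V": (psr >> 28) & 1,             # APSR: overflow flag
        "Q": (psr >> 27) & 1,             # APSR: saturation flag
        "T": (psr >> 24) & 1,             # EPSR: Thumb state bit
        "exception_number": psr & 0x1FF,  # IPSR: bits [8:0]
    }


fields = decode_psr(0x6100000F)           # made-up sample value
print(fields)
```

For the sample value the Z and C flags are set, the T-bit is 1 as the Cortex M requires, and
the exception number is 15.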
o Priority Mask Register (PRIMASK): From the figure above, only one bit is assignable.
When set to “0” it has no effect; when set to “1” it prevents the activation of all
exceptions with configurable priority. Some exceptions still remain active, e.g. the
“Hard fault” and “Reset” exceptions.
o Fault Mask Register (FAULTMASK): It is similar to the PRIMASK. When set to “0” it
has no effect; when set to “1” it prevents the activation of all exceptions except the
Non-Maskable Interrupt (NMI).
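The effect of the two mask registers can be sketched in Python as a comparison of priority
numbers. This is a simplified model under stated assumptions: the fixed priorities of -3, -2
and -1 for Reset, NMI and HardFault, and the idea that PRIMASK raises the execution priority
to 0 while FAULTMASK raises it to -1, follow the ARMv7-M architecture rather than this
handout, and the model ignores the priority of whatever code is currently running. The
function name is hypothetical.

```python
FIXED_PRIORITY = {"Reset": -3, "NMI": -2, "HardFault": -1}  # lower = more urgent


def can_activate(exception, priority=0, primask=0, faultmask=0):
    """Return True if the exception may be activated under the given masks."""
    level = FIXED_PRIORITY.get(exception, priority)
    if faultmask:
        threshold = -1      # only priorities more urgent than HardFault pass
    elif primask:
        threshold = 0       # only the fixed (negative) priorities pass
    else:
        return True         # no masking in effect
    return level < threshold


print(can_activate("SysTick", priority=3, primask=1))   # configurable: blocked
print(can_activate("HardFault", primask=1))             # still active
print(can_activate("HardFault", faultmask=1))           # blocked
print(can_activate("NMI", faultmask=1))                 # still active
```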
This section discusses the memory architecture of the ARM Cortex M. Every ARM Cortex-M
processor has a 32-bit memory addressing capability, which implies that the data and address buses
are 32 bits wide. As a result the processor has 4GB of addressable memory space. The entire 4GB is
a unified space (0-4GB) which includes the code, data and peripheral memory spaces. The processor
uses the Harvard bus architecture, meaning concurrent instruction and data access using multiple
bus interfaces. This allows an instruction and data to be fetched simultaneously, thereby speeding
up the processor. The processor supports both big-endian and little-endian memory systems. The
endianness can be configured; however, by default the processor works in little-endian.
Furthermore, the processor supports unaligned data transfers, and some memory spaces are bit
addressable (bit-banding), which means the processor can address a single bit in those memory
regions. There is also memory protection unit support.
The Code Region: This region spans the address range 0x00000000-0x1FFFFFFF. This space of
512 MB contains the application code. The first few memory locations contain the vector
table, which holds the initial stack pointer value and the addresses of the various exception
handlers.
SRAM Region: This region is located in the next 512 MB of the memory space after the
Code region. It is for connecting on-chip SRAM, which is mainly used for data accesses or
stack memory. The first 1 MB (0x20000000-0x200FFFFF) of this region is bit addressable
(the bit-banding region), which means each single bit within it can be addressed using a
dedicated address in the bit-band alias region. Code can also be executed from the SRAM
region.
The Peripheral Region: This region is for connecting peripherals to the microcontroller. Its
address ranges from 0x40000000 to 0x5FFFFFFF. The microcontroller vendor decides which
peripherals occupy this region. It is 512 MB in size and is used for on-chip MCU peripherals
like ADCs, timers, the real-time clock, USB, Ethernet MAC, DMA etc. The first 1 MB of this
region is bit addressable (bit-banding region). For security reasons the processor does not
allow code to be executed from the peripheral region.
The External RAM Region: Reserved for external RAM connections (on-chip or off-chip
memory); code can be executed from this region. The address ranges from 0x60000000 to
0x9FFFFFFF. The memory region from 0xA0000000 to 0xDFFFFFFF is intended for
external devices (off-chip peripherals) or shared memory and is non-executable.
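The memory map described above can be written down as a small lookup table. The Python
sketch below is illustrative only: the table name, the function name and the sample
addresses are hypothetical, and the SRAM end address of 0x3FFFFFFF is derived from the
statement that the region is the 512 MB following the Code region.

```python
# (start, end, name) for each region described in this section
REGIONS = [
    (0x00000000, 0x1FFFFFFF, "Code"),
    (0x20000000, 0x3FFFFFFF, "SRAM"),
    (0x40000000, 0x5FFFFFFF, "Peripheral"),
    (0x60000000, 0x9FFFFFFF, "External RAM"),
    (0xA0000000, 0xDFFFFFFF, "External device"),
]


def region_of(address):
    """Return the name of the memory region that contains the address."""
    for start, end, name in REGIONS:
        if start <= address <= end:
            return name
    return "not covered in this section"


for addr in (0x00000004, 0x20000100, 0x40021000, 0x60000000):
    print(hex(addr), region_of(addr))
```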
The Bus Protocols: There exist two bus protocols: (1) AHB Lite (main system bus) and (2)
APB (peripheral bus). These buses are commonly used for interconnections between the
processor and the various components of the microcontroller.
o AHB Lite: The ARM Cortex M uses AHB Lite (the Advanced High-performance Bus,
from the AMBA (Advanced Microcontroller Bus Architecture) specification) as its
main system bus.
o APB: It stands for Advanced Peripheral Bus. It is also derived from the AMBA
specification but is optimized for minimum power and reduced interface complexity.
It is much simpler and slower than AHB.
There exist three external 32-bit AHB Lite bus interfaces: (1) I-Code, a dedicated bus used to fetch
instructions from the Code region, based on the AHB Lite protocol; (2) D-Code, a dedicated bus for
data accesses in the Code region, which can access data at non-word-aligned memory addresses;
(3) System bus, which allows instruction and data fetches from memory devices such as SRAM and
from high-speed peripherals such as USB, Ethernet MAC and DMA.
It is important to note that lower-speed peripherals are not connected directly to AHB Lite;
instead, the AHB bus logic is converted to APB bus logic using an AHB-to-APB bridge, as shown in
the figure below.
On the other hand, the figure below shows how an unaligned data transfer takes place. Unaligned
data can be produced by using the “__packed” keyword in a definition.
With this approach there is no unused memory in SRAM. When the processor issues an unaligned
transfer, the processor bus interface unit actually converts it into multiple aligned transfers.
This results in more clock cycles for a single data access and might not be acceptable in situations
where high performance is required.
Unaligned data access can occur as a result of (1) direct manipulation of pointers or (2) accessing
a data structure with the __packed attribute, which results in unaligned data.
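The trade-off between packed and naturally aligned data can be illustrated on a desktop
machine with Python's standard `struct` module. This is only an analogy to the `__packed`
keyword of an embedded C compiler: the record of one 8-bit field followed by one 32-bit
field is a made-up example, and the helper `is_aligned` is hypothetical.

```python
import struct

# A record with one 8-bit field followed by one 32-bit field.
packed_size = struct.calcsize("<BI")   # "<": standard sizes, no padding inserted
native_size = struct.calcsize("@BI")   # "@": the host's native sizes and alignment


def is_aligned(address, size):
    """True if an access of `size` bytes at `address` is naturally aligned."""
    return address % size == 0


print("packed:", packed_size, "bytes; native:", native_size, "bytes")
print(is_aligned(0x20000000, 4))   # 32-bit field on a word boundary
print(is_aligned(0x20000001, 4))   # 32-bit field right after a packed byte
```

The packed layout occupies 5 bytes with no unused memory, but its 32-bit field starts at an
odd address and therefore needs an unaligned access. The native layout is typically 8 bytes
on common platforms because padding is inserted to keep the 32-bit field aligned.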
Bit-Banding
Bit banding is basically the capability to address a single bit of a memory location. However, this
feature is optional and specific to the microcontroller vendor. Before using bit banding it is
necessary to know the bit-band regions of the memory. The bit-band regions are those regions in the
memory map of the processor in which each bit can be uniquely addressed using a dedicated address
(bit-addressable). Only two regions of the processor memory map are bit addressable, (1) the SRAM
region and (2) the Peripheral region, 1MB each, as shown in the figure below.
Each of the two bit-band regions also has a bit-band alias memory region of size 32MB. Any bit in
a bit-band region can be set or cleared by writing to the corresponding word address in the
bit-band alias region.
The formula below can be used to calculate the alias address of any bit of a word-aligned bit-band
address:
ALIAS_ADDRESS = ALIAS_BASE + (32 * (BIT_BAND_MEM_ADR - BIT_BAND_BASE)) + (BIT * 4)
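The alias address formula can be checked with a few lines of Python. The sketch below is a
worked example under stated assumptions: the alias base addresses 0x22000000 (SRAM) and
0x42000000 (Peripheral) are the usual Cortex-M3/M4 values and are not given in this
handout, so the vendor's reference manual should be consulted for a specific device. The
function and constant names are hypothetical.

```python
SRAM_BIT_BAND_BASE = 0x20000000
SRAM_ALIAS_BASE = 0x22000000       # assumed Cortex-M3/M4 value
PERIPH_BIT_BAND_BASE = 0x40000000
PERIPH_ALIAS_BASE = 0x42000000     # assumed Cortex-M3/M4 value


def bit_band_alias(address, bit, band_base, alias_base):
    """Apply ALIAS_BASE + 32 * (address - BIT_BAND_BASE) + BIT * 4."""
    offset = address - band_base
    if not 0 <= offset < 0x100000:
        raise ValueError("address is outside the 1MB bit-band region")
    if not 0 <= bit < 32:
        raise ValueError("bit number must be between 0 and 31")
    return alias_base + 32 * offset + 4 * bit


# Bit 7 of the first SRAM word, and bit 3 of the word at 0x20000200.
print(hex(bit_band_alias(0x20000000, 7, SRAM_BIT_BAND_BASE, SRAM_ALIAS_BASE)))
print(hex(bit_band_alias(0x20000200, 3, SRAM_BIT_BAND_BASE, SRAM_ALIAS_BASE)))
```

Every bit in the 1MB region gets its own 4-byte alias word, which is why the alias region is
32 times larger than the region it maps.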
Stack Memory
The stack is simply a section of memory (typically RAM) that is accessed as a Last In First Out
(LIFO) data storage buffer. The ARM processors have the “PUSH” and “POP” instructions to
store data in the stack memory and retrieve data from the stack memory respectively. All stack
operations are word aligned. The stack is used (1) as temporary storage of original data when a
function being executed needs to use registers for data processing, (2) to pass information to
functions or sub-routines, (3) to store local variables and (4) to hold processor status and register
values in the case of an exception such as an interrupt. The ARM Cortex M (M3/M4) supports 2
stack pointers:
Main Stack Pointer (MSP): It is the default stack pointer used after reset and used for all
exception handlers. After power up, the processor hardware automatically initializes the
MSP by reading the vector table.
Process Stack Pointer (PSP): It is an alternate stack pointer which can only be used in
thread mode. The PSP is not initialized after power up and must be initialized by the software
before being used.
One can switch between the MSP and PSP by setting or clearing the SPSEL bit in the CONTROL
register. The ARM Cortex M uses the full descending (FD) stack memory model as shown in the
figure below.
Furthermore, as per the procedure call standard of the ARM architecture, when a function is called
registers are used for passing parameters. The first input parameter is passed in register R0, the
second in R1, the third in R2 and the fourth in R3. The function return value is passed back in R0,
with R1 also used if the value is 64 bits wide. It is the responsibility of the “callee function” to
save the contents of R4-R11, R13 and R14 if the function is going to change these registers (the
compiler takes care of this when the code is written in a high-level language like C/C++).
Full ascending – The stack grows up, and the SP points to the highest address containing a
valid item.
Empty ascending – The stack grows up, but the SP points to the first empty location above
the stack.
Full descending – The stack grows down, and the SP points to the lowest address containing
valid data.
Empty descending – The stack grows down, but the SP points to the first empty location
below the stack.
If the stack pointer, SP, is used as the base register with the addressing modes of the multiple
byte transfers discussed, the stack can be implemented in any of these modes, and this is a
flexibility that the ARM processor provides.
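The full descending model, which is the one the ARM Cortex M uses, can be sketched in
Python as follows. The class is a hypothetical illustration that stores words in a
dictionary instead of real RAM; the starting address is made up.

```python
class FullDescendingStack:
    """Toy model: the stack grows down and SP points at the last item pushed."""

    def __init__(self, top_address):
        self.sp = top_address      # SP starts at the top; the stack is empty
        self.memory = {}           # word address -> stored value

    def push(self, value):
        self.sp -= 4                                 # grow down first ...
        self.memory[self.sp] = value & 0xFFFFFFFF    # ... then store at SP

    def pop(self):
        value = self.memory.pop(self.sp)   # read the item SP points at ...
        self.sp += 4                       # ... then move SP back up
        return value


stack = FullDescendingStack(0x20001000)
stack.push(0x11)
stack.push(0x22)
print(hex(stack.sp))          # two words below the starting address
print(hex(stack.pop()))       # last in, first out
print(hex(stack.sp))
```

An empty descending stack would instead store at SP first and decrement afterwards, and the
two ascending models are the same with the direction reversed.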
[Figures: full ascending and empty descending stack layouts]
When one resets the processor, the PC is loaded with the value 0x0000_0000.
Then the processor reads the value at memory location 0x0000_0000 into the MSP (Main Stack
Pointer register) [MSP = value at 0x0000_0000]. This means the processor first initializes
the stack pointer.
Afterwards, the processor reads the value at memory location 0x0000_0004 into the PC. This
value is the address of the reset handler.
The PC jumps to the reset handler.
A reset handler is just a C or assembly function written by the embedded system programmer
to carry out any initialization required. From the reset handler you call the main() function
of the application.
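The reset sequence above can be traced with a short Python sketch that reads the first two
words of a memory image. The function name and the two sample values (an initial stack
pointer of 0x20005000 and a reset vector of 0x00000101) are made up for illustration, and
the image is assumed to be little-endian, which is the processor default.

```python
import struct


def simulate_reset(memory_image):
    """Read the first two vector table entries as the processor does on reset."""
    initial_msp, reset_vector = struct.unpack_from("<II", memory_image, 0)
    thumb_bit = reset_vector & 1     # bit [0] goes into the T-bit and must be 1
    pc = reset_vector & ~1           # address the reset handler is fetched from
    return {"MSP": initial_msp, "PC": pc, "T": thumb_bit}


# Word at 0x0000_0000: initial MSP; word at 0x0000_0004: reset handler address.
image = struct.pack("<II", 0x20005000, 0x00000101)
state = simulate_reset(image)
print({name: hex(value) for name, value in state.items()})
```

Note that the reset vector has bit [0] set, as required for Thumb state, while the handler
itself starts at the even address 0x100.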
Before proceeding to discuss the interrupts and exceptions of the ARM Cortex M, it is important to
clearly distinguish between the two. An exception refers to an event that arises asynchronously,
either from the external world or from within the system. In a microcontroller or microprocessor
system, when an exception occurs a signal is generated demanding a change in the normal program
execution flow. When the signal altering the normal execution flow comes from a device
(timers/RTC, NMI, I/Os etc.) connected to the microprocessor/microcontroller and requesting the
immediate attention of the processor core, such an exception is known as an interrupt. That being
said, there are two types of exceptions: (1) System Exceptions (internal to the processor) and (2)
External Exceptions (interrupts). The ARM Cortex M (M3/M4) has 15 system exceptions (refer to the
figure below) numbered 1 to 15 and supports up to 240 interrupts numbered from 16 to 255. It is
important to note that the first 3 exceptions have fixed priorities whilst the others are
programmable.
From a programmer’s point of view, interrupts are a boon. Interrupts are very useful in situations
where you need to read or write some data from or to an externally connected device. Without
interrupts, the normal procedure adopted is polling the device to get its status. You can write your
program to poll the device in two ways. In the first method your program polls the device
continuously until the device is ready to send data to the controller or ready to accept data from
the controller. This technique achieves the desired objective, but at the cost of dedicating the
processor time to that single task. There is also a chance that the program hangs and the whole
system crashes if the external device fails or stops functioning. Another approach to polling is to
schedule the polling operation on a time-slice basis and share the total time with the rest of the
tasks. This leads to much more effective utilization of the processor time; but the biggest drawback
of this approach is that there is a chance of missing some information coming from the device if
the total number of tasks is high. The polling task gets another chance to poll the device only
after each of the other tasks has run at least once.
Here comes the role of interrupts. If the external device supports interrupts, connect the interrupt
pin of the device to the interrupt line of the controller and enable the corresponding interrupt in
firmware. Write the code to handle the interrupt request in a separate function and put the other
tasks in the main program code. The main program then executes normally, and when the external
device asserts an interrupt, the main program is interrupted and the processor switches program
execution to the interrupt service routine. On finishing the execution of the interrupt service
routine, the flow is automatically diverted back to the main program, which resumes its execution
exactly from the point where it was interrupted.
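The risk of losing data with time-sliced polling, compared with interrupt-driven handling,
can be shown with a small tick-based simulation in Python. Everything here is a hypothetical
model: the device is assumed to hold only its most recent data item in a single register,
the event times are made up, and the function names are illustrative.

```python
def time_sliced_polling(events, poll_period, total_ticks):
    """Poll a one-item device register once every `poll_period` ticks."""
    register = None      # the device register holds only the latest item
    received = []
    for tick in range(total_ticks):
        if tick in events:
            register = events[tick]       # new data overwrites unread data
        if tick % poll_period == 0 and register is not None:
            received.append(register)     # the polling task finally runs
            register = None
    return received


def interrupt_driven(events, total_ticks):
    """Run the handler on the same tick on which each event arrives."""
    received = []
    for tick in range(total_ticks):
        if tick in events:
            received.append(events[tick])   # the ISR reads the data at once
    return received


events = {1: "a", 2: "b", 3: "c", 10: "d"}   # tick -> data item from the device
print(time_sliced_polling(events, poll_period=4, total_ticks=16))
print(interrupt_driven(events, total_ticks=16))
```

With a polling period of four ticks, the items "a" and "b" are overwritten before the
polling task gets its turn, whereas the interrupt-driven version receives all four items.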
Uses of Interrupts
I/O data transfer between peripheral devices and processor or controller;
Timing applications;
Handling emergency situations;
Context switching/ multitasking/ real-time application programming;
Event driven programming.
The Nested Vectored Interrupt Controller (NVIC) is a hardware block that is part of the processor.
It receives and manages exceptions from diverse sources and hands them over to the processor core
based on the priority level of each exception. All processor exceptions are connected to the NVIC,
as can be seen in the figure below.
Within the ARM Cortex M processor there are a number of programmable registers for managing
interrupts. The majority of these registers are found within the NVIC block (and the system
control block). These NVIC registers can be programmed in two ways: (1) by directly accessing the
NVIC registers or (2) by using the CMSIS core APIs. It is important to note that the NVIC registers
can only be accessed at the privileged access level, and that after reset all interrupts are
disabled and given a priority level value of 0.
NVIC Registers
The NVIC registers for interrupt control and configuration include (1) Interrupt Enable Register (2)
Pending State Register (3) Active State Register (4) Interrupt Mask and Unmask Register (5)
Interrupt Priority Registers. The NVIC registers are summarized in the figure below.
Each priority level is sub-divided into two fields, namely (1) the pre-empt priority field and (2)
the sub-priority field, as shown in the figure below.
Figure 3.24: Priority Groups and their Pre-empt and sub priority fields.
Pre-empt priority field: When the processor is running an interrupt handler and another
interrupt appears, the pre-empt priority values are compared and the exception with the
higher pre-empt priority (lower number) is allowed to run.
Sub-priority field: This value is used only when two exceptions with the same pre-empt
priority level occur at the same time. In such circumstances, the exception with the higher
sub-priority (lower number) is handled first.
For example, if we look at priority group 0 of a priority level register with 3 implemented bits, as
shown in the figure below, there is a 7-bit pre-empt priority field, i.e. bit 7 to bit 1 (128
programmable interrupt levels), and one sub-priority bit, bit 0. Out of the 7 bits of the pre-empt
priority field only 3 are implemented, so there are 8 programmable pre-empt priority levels. The
sub-priority field consists of bit 0 alone, but bit 0 is not implemented; hence there are no
programmable sub-priority levels. In summary, when the priority grouping is 0, we have 8 levels of
pre-empt priority and no sub-priority levels.
Looking at another case where the priority group is 5, there is a 2-bit pre-empt priority field, i.e.
bit 7 to bit 6 (4 programmable interrupt levels), and a 6-bit sub-priority field, bit 5 to bit 0 (64
programmable interrupt levels). Both bits of the pre-empt priority field are implemented, so there
are 4 programmable pre-empt priority levels. Of the 6 bits of the sub-priority field only bit 5 is
implemented; hence there are 2 programmable sub-priority levels. In summary, when the priority
grouping is 5, for every pre-empt priority level there are 2 levels of sub-priority.
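The two cases above follow one rule, which can be written as a short Python function. This is
a sketch under stated assumptions: the priority field is taken to be 8 bits wide, the
implemented bits are assumed to be the most significant ones, and priority group N is
assumed to place the split so that bits [7 : N+1] form the pre-empt field. The function name
is hypothetical.

```python
def priority_levels(priority_group, implemented_bits):
    """Return (pre-empt levels, sub-priority levels) for an 8-bit priority field."""
    preempt_field_bits = 7 - priority_group          # width of bits [7 : group + 1]
    # Only implemented bits count, and they are the most significant ones.
    preempt_bits = min(preempt_field_bits, implemented_bits)
    sub_bits = implemented_bits - preempt_bits       # implemented bits left over
    return 2 ** preempt_bits, 2 ** sub_bits


print(priority_levels(0, 3))   # priority group 0 with 3 implemented bits
print(priority_levels(5, 3))   # priority group 5 with 3 implemented bits
```

A sub-priority count of 1 means there is effectively no sub-priority, which is the group 0
case; group 5 yields 4 pre-empt levels with 2 sub-priority levels each.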
From the figure above, it can be seen that the initial MSP value is stored at address 0x00000000
(it is read when the processor is reset). On reset, the vector table supplies the initial MSP
value. Exception number 1, at offset address 0x0004, is the reset exception.
CHAPTER FOUR
BUS STRUCTURE OF EMBEDDED SYSTEMS
The objective of this chapter is to look at the bus structure which is used by the CPU of an
embedded system’s processor to communicate with memory and other devices.
At the end of this chapter, students are expected to have an understanding of how the CPU of
the processor communicates with memory and other devices through the bus structure.
The bus structure of an embedded system defines how the Central Processing Unit (CPU) of the
processor communicates with memory and other devices, whether internal or external. For this
communication to take place, the CPU of the processor in the embedded system uses what is called a
Bus.
Bus:
The Bus is the mechanism by which the CPU of a processor communicates with memory and
input/output devices. It is not just a collection of wires; the bus also defines the protocol by
which communication takes place between the processor and other devices. Therefore the bus, in the
context of embedded systems and general-purpose computers, defines the transport mechanism between
the CPU of the processor, memory and input/output devices. Many devices can sit on the bus, so the
bus is typically a shared communication link: a single set of wires is used to connect multiple
subsystems together. The bus is also a fundamental tool for composing large and complex systems.
This is because the bus defines the mechanism for communication, and once a complex subsystem is
put in alongside the CPU of the processor, that subsystem must be able to communicate using the
communication scheme specified by the bus. The bus enables us to build a system from individual
components which are compatible with the bus. Figure 6.1 illustrates the bus structure of a
processor and other devices connected together.
[Figure 6.1: a processor (control and data path) connected to memory and to input and output devices]
The diagram above shows 2 buses, one connecting memory to the processor and another connecting
input/output devices to the processor. Some systems have two distinct buses of this kind; others
have just a single bus on which both memory and input/output devices sit.
Transaction protocol – How the exchange of data takes place along the bus: how the devices
really talk to each other and what kind of language the devices use in talking to each other;
Timing and Signaling Specification – For devices to talk to each other there must be a timing
and signaling specification; these signals are carried by a bunch of wires;
Electrical Specification – This defines the electrical specification for the signals carried
along the wires;
Physical / Mechanical Characteristics – This defines where the bunch of wires that carry the
signals lie and also the nature of the connectors to be used.
Everything from the mechanical level up to the top-level transaction protocol together
constitutes what is called the bus.
Address lines;
Data lines; and
Control lines.
Control Lines – The control lines implement the transaction protocol. The signals which flow along
the control lines are therefore instrumental in implementing the transaction protocol. The signals
that flow along the control lines basically do the following;
Data Lines – The data lines carry information between the source and the destination. This
information may be data or an address, because an address is basically just a form of data.
Complex commands may also be given to devices through the data lines of the bus.
Bus Characteristics:
Bus signals are usually tri-stated: when a particular device is not using the bus, it drives
its interface lines to the tri-state condition, so it is effectively disconnected from the
bus.
In many cases address and data lines may be multiplexed, thereby reducing the actual number
of connections or wires.
Every device on the bus must be able to drive the maximum bus load. The maximum bus load
determines the actual number of devices that can be put on the bus.
The bus may include a clock signal. Not all buses are clocked, but some do include a clock
signal, and then all the timing, particularly for the bus signals which implement the
complete communication protocol, is defined relative to the bus clock. The bus clock may
not be the same as the system clock of the processor.
Non-multiplexed address and data lines: Address and data can be transmitted in one bus
cycle if separate address and data lines are available. The cost of this method of
increasing bandwidth is more bus lines and increased complexity in terms of the physical
layout as well as the space requirement.
Increasing bus width: By increasing the width of the data bus, the transfer of multiple words
requires fewer bus cycles. The cost of this is more bus lines.
Block transfers: A block transfer allows the bus to transfer multiple words in back-to-back
bus cycles. Only one address needs to be sent at the beginning, and the bus is not released
until the last word is transferred. The cost of this type of bandwidth increase is increased
complexity and a decreased response time for requests from the other devices.
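The three techniques above can be compared with a deliberately simple cycle-counting model in
Python. The model is an assumption made for illustration only: it charges one bus cycle per
address phase and one per data phase, lets a multiplexed bus send an address before each data
cycle, and lets a block transfer send a single address up front. Real buses have wait states
and arbitration that are not modelled, and the function name is hypothetical.

```python
def bus_cycles(words, word_bits=32, bus_bits=32, multiplexed=True, block=False):
    """Count bus cycles for a transfer under a one-cycle-per-phase model."""
    # Data phases: a wider data bus moves more bits per cycle (rounded up).
    data_cycles = (words * word_bits + bus_bits - 1) // bus_bits
    if not multiplexed:
        address_cycles = 0            # address travels alongside the data
    elif block:
        address_cycles = 1            # one address for the whole block
    else:
        address_cycles = data_cycles  # one address phase before each data phase
    return address_cycles + data_cycles


print(bus_cycles(8))                      # multiplexed baseline
print(bus_cycles(8, multiplexed=False))   # separate address and data lines
print(bus_cycles(8, bus_bits=64))         # data bus twice as wide
print(bus_cycles(8, block=True))          # block transfer
```

Under these assumptions, transferring eight 32-bit words takes 16 cycles on the multiplexed
baseline, 8 cycles with separate address lines or with a 64-bit data bus, and 9 cycles as a
block transfer.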
It creates a communication bottleneck, because the bandwidth of the bus can limit the
maximum input/output throughput.
The maximum bus speed is largely limited by:
o The length of the bus
o The number of devices on the bus
o The need to support a range of devices with widely varying latencies and data transfer
rates.
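As a rough illustration of the bandwidth options listed above (separate address lines, a wider
data bus, block transfers), the following Python sketch counts bus cycles for moving a number of
words. The cycle model (one cycle per address phase, one cycle per data beat) and the function
name `bus_cycles` are simplifying assumptions made for this example, not the timing of any
particular bus.

```python
import math

def bus_cycles(n_words, word_bits=16, bus_bits=16, multiplexed=True, block=False):
    """Count bus cycles needed to move n_words under a simplified, assumed model."""
    # Number of data beats: a wider bus moves more bits per cycle.
    beats = math.ceil(n_words * word_bits / bus_bits)
    if not multiplexed:
        # Separate address lines: the address travels alongside the data.
        return beats
    if block:
        # Block transfer: one address cycle, then back-to-back data cycles.
        return 1 + beats
    # Multiplexed, one transfer at a time: an address cycle before every data beat.
    return 2 * beats

print(bus_cycles(8))                                # multiplexed, single transfers
print(bus_cycles(8, multiplexed=False))             # separate address and data lines
print(bus_cycles(8, block=True))                    # block transfer
print(bus_cycles(8, bus_bits=32, multiplexed=False))  # wider data bus
```

Under these assumptions the same eight words need 16, 8, 9 and 4 cycles respectively, which is
why each option is described as a way of increasing bandwidth.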
The basic model by which data is transferred on such a bus is parallel communication. This means
multiple data, control and possibly power wires, with each wire carrying one bit at a time. This
gives high data throughput over short distances and is typically used when connecting devices on
the same IC or the same circuit board. These buses must be kept short because long parallel wires
have a high capacitance, which takes more time to charge and discharge and so limits the data
transfer rate. Data misalignment between wires also increases with length, and the many wires make
long parallel buses costly and bulky. In general-purpose computer systems the bus on the
motherboard is typically a parallel bus because it only has to run over a short length. In
embedded systems, however, devices are not always connected by parallel buses, because the many
parallel lines bring problems such as the space they occupy. Consider the PIC microcontroller,
which has built-in peripherals and also has to support some external peripherals. If the PIC
microcontroller supported a fully fledged bus for connecting external devices, it would have to
provide a large number of external pins.
The other option used by embedded systems is serial communication. In this type of communication
there is a single data wire, possibly along with control and power wires, and words are
transmitted one bit at a time. Over long distances this can deliver a higher data throughput than
a parallel bus, because the average capacitance is lower and so more bits can be sent per unit of
time. Serial buses are also cheaper and less bulky. On the other hand, serial communication needs
more complex interfacing logic and a more complex communication protocol. The sender has to
decompose each word into bits and the receiver has to recompose these bits into a word. The
control signals are often sent on the same wire as the data, which increases protocol complexity:
because there is no separate control line, the same line is used to send the control
information followed by the data.
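The decomposition and recomposition described above can be sketched in a few lines of Python.
This is only an illustration: the bit order (least significant bit first) and the simple framing
with one start bit and one stop bit are assumptions chosen for the example, and the function
names are hypothetical.

```python
def serialize(word, width=8):
    """Sender: decompose a word into bits, least significant bit first."""
    return [(word >> i) & 1 for i in range(width)]

def deserialize(bits):
    """Receiver: recompose a word from bits that arrived LSB first."""
    word = 0
    for position, bit in enumerate(bits):
        word |= bit << position
    return word

def frame(word, width=8):
    """Put control information on the same wire: start bit 0, data, stop bit 1."""
    return [0] + serialize(word, width) + [1]

def unframe(bits):
    """Check the start and stop bits, then recover the data word."""
    if bits[0] != 0 or bits[-1] != 1:
        raise ValueError("framing error")
    return deserialize(bits[1:-1])

wire = frame(0xA5)          # the bit sequence that travels over the single wire
print(wire)
print(hex(unframe(wire)))   # the receiver recovers the original word
```

The extra start and stop bits show why sending control information on the data wire adds
protocol overhead: ten bit times are needed to move eight data bits.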
Another way of classifying a bus is as synchronous or asynchronous.
A synchronous bus includes a clock in the control lines. There is a fixed communication protocol
that is relative to the clock, meaning that the timing of the control signals is defined relative
to the clock. The advantage is that it involves very little logic and can run very fast. The
drawback is that every device on the bus must run at the same clock rate. Also, to avoid clock
skew, synchronous buses cannot be long if they are fast; if skew builds up, there will be timing
errors in data transfer. Most processor-to-memory buses are expected to be fast and are therefore
synchronous.
In an asynchronous bus, data transfer is not clocked, so the bus can accommodate a wide range of
devices. The bus can be lengthened without worrying about clock skew. Asynchronous communication
requires a handshaking protocol so that the devices can talk to the processor in a proper
fashion.
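The text above only states that a handshaking protocol is required. As one possible
illustration, the Python sketch below steps through a four-phase request/acknowledge handshake
for each word. The choice of a four-phase scheme and the signal names `req` and `ack` are
assumptions made for this example.

```python
def handshake_transfer(words):
    """Simulate a four-phase request/acknowledge handshake for each word."""
    received = []   # what the slave has latched
    trace = []      # sequence of signal changes, for inspection
    for word in words:
        data = word                 # 1. master places data on the bus
        req = 1                     #    and asserts request
        trace.append(("req", req))
        received.append(data)       # 2. slave sees req, latches the data
        ack = 1                     #    and asserts acknowledge
        trace.append(("ack", ack))
        req = 0                     # 3. master sees ack and releases request
        trace.append(("req", req))
        ack = 0                     # 4. slave sees req low and releases acknowledge
        trace.append(("ack", ack))
    return received, trace

received, trace = handshake_transfer([0x12, 0x34])
print(received)
print(trace[:4])
```

Because each step waits for the other side's signal rather than for a clock edge, a slow device
simply takes longer to acknowledge, which is how an asynchronous bus accommodates devices of
widely different speeds.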
The master is the device that starts a bus transaction by issuing the command and address. The
slave is the device that responds to the address, either by sending data to the master if the
master asks for data, or by receiving data from the master if the master wants to send data. The
devices on the bus are not permanently assigned to the master and slave roles, which means that a
slave in one transaction may become the master in another transaction.
speaking a separate single-purpose processor which can take control of the bus. In very common
low-end systems, the DMA controller is the most important device that can become the bus master.
The microprocessor relinquishes control of the system bus to the DMA controller, but while the
DMA controller uses the system bus the microprocessor can continue to execute its regular program.
The processor state does not have to be stored and restored, as it would be for an ISR call.
The regular program need not wait unless it requires the system bus.
This works well with a Harvard architecture, since the processor can keep fetching and executing
instructions as long as they don’t access data memory. If they do, the processor stalls.
CHAPTER FIVE
DIGITAL SIGNAL PROCESSOR (DSP)
Chapter objectives and expected results
The objective of this chapter is to look at the digital signal processors (DSP) which are electronic
systems for processing digital signals focusing on:
At the end of this chapter, students are expected to have a general overview of the features
and properties of Digital Signal Processors, their architecture, data path and the registers
present in these processors.
Embedded systems are targeted at what is called situated computing. This means an embedded
system is situated in an external environment. It receives input from the external environment
via sensors, and the signals it receives through these sensors are processed in order to take
action via actuators or to carry out some communication or data processing task. Sensors can be
designed for virtually every physical and chemical quantity, such as weight, velocity,
acceleration, electrical current, voltage, temperature, etc. A wide variety of sensors is
available on the market today that can be interfaced to and used in embedded systems. All sensors
provide a set of discrete values which need to be processed. Many of these sensors work in the
analog domain, and therefore the signals they produce need to be converted from analog to digital
form before any kind of processing can be done. Some examples of sensors are given here.
A very common example of a sensor is the CCD sensor, which is used in cameras for sensing images.
A CCD sensor is a light-sensitive silicon solid-state device composed of many cells. These cells
are in effect capacitors which accumulate charge depending on the intensity of the incident
light. At each site the charge, which corresponds to a current, is integrated over a period of
time to get a reasonable Signal to Noise Ratio (SNR), so that the data obtained from the sensor
represents the external world and not the noise. The charges are shifted out using CCD shift
registers, which are not digital shift registers but registers that store charge in charge
buckets. The charges are shifted out sequentially through these buckets. The output of the sensor
is an analog current, which has to be converted to digital form using Analog to Digital
Converters operating at pixel rate. After conversion from analog to digital form, the data is
processed for whatever operation is required.
Biometric Sensors:
Another example of a sensor, which is becoming very common in security applications, is the
biometric sensor. These sensors are used, for example, to take fingerprints for unique access to
an embedded system. A wide range of biometric sensors and detectors is available to consumers for
security and access control management. Biometrics refers to methods for recognizing individual
people based on unique physical and behavioral traits. Physiological biometrics is one class of
biometrics that deals with physical characteristics and attributes that are unique to individuals.
Biometric sensors work by producing electrical currents when they scan a user's physical
characteristic. Many physical characteristics may be scanned by a biometric sensor, including
eyes, fingerprints, or DNA. The sensor contains an analog to digital converter, enabling it to
digitize the image and store the digital information in memory so that it can verify the user the
next time he or she needs to authenticate their identity.
Digital Signal Processing: A technique that converts signals from real world sources (usually in
analog form) into digital data that can then be analyzed. Analysis is performed in digital form
because once a signal has been reduced to numbers, its components can be isolated, analyzed and
rearranged more easily than in analog form. Eventually, when the DSP has finished its work, the
digital data can be turned back into an analog signal, with improved quality. For example, a DSP
can filter noise from a signal, remove interference, amplify some frequencies and suppress others,
encrypt information, or analyze a complex waveform into its spectral components. This processing
must often be handled in real time, which means very quickly. For instance, stereo equipment
handles sound signals of up to 20 kilohertz (20,000 cycles per second), requiring a DSP to perform
hundreds of millions of operations per second. Figure 4.1 illustrates a very simple digital
processing system.
In brief, DSPs are processors or microcomputers whose hardware, software, and instruction sets are
optimized for high-speed numeric processing applications, which is essential for processing
digital data representing analog signals in real time. A Digital Signal Processor is a
special-purpose CPU (Central
Processing Unit) that provides ultra-fast instruction sequences such as shift and add, and multiply and
add, which are commonly used in math-intensive signal processing applications. Digital Signal
Processors (DSPs) take real-world signals like voice, audio, video, temperature, pressure, or position
that have been digitized and then mathematically manipulate them. A DSP is designed for performing
mathematical functions like "add", "Subtract", "multiply" and "divide" very quickly. Signals need to
be processed so that the information that they contain can be displayed, analyzed, or converted to
another type of signal that may be of use. In the real-world, analog products detect signals such as
sound, light, temperature or pressure and manipulate them. Converters such as an Analog-to-Digital
converter then take the real-world signal and turn it into the digital format of 1's and 0's. From here,
the DSP takes over by capturing the digitized information and processing it. It then feeds the digitized
information back for use in the real world. It does this in one of two ways, either digitally or in an
analog format by going through a Digital-to-Analog converter. All of this occurs at very high speeds.
To illustrate this concept, the diagram below in figure 4.2 shows how a DSP is used in an MP3 audio
player. During the recording phase, analog audio is input through a receiver or other source. This
analog signal is then converted to a digital signal by an analog-to-digital converter and passed to the
DSP. The DSP performs the MP3 encoding and saves the file to memory. During the playback phase,
the file is taken from memory, decoded by the DSP and then converted back to an analog signal
through the digital-to-analog converter so it can be output through the speaker system. In a more
complex example, the DSP would perform other functions such as volume control, equalization and
user interface.
A DSP's information can be used by a computer to control such things as security, telephone, home
theater systems, and video compression. Signals may be compressed so that they can be transmitted
quickly and more efficiently from one place to another (e.g. teleconferencing can transmit speech and
video via telephone lines). Signals may also be enhanced or manipulated to improve their quality or
provide information that is not sensed by humans (e.g. echo cancellation for cell phones or computer-
enhanced medical images). Although real-world signals can be processed in their analog form,
processing signals digitally provides the advantages of high speed and accuracy. Because it's
programmable, a DSP can be used in a wide variety of applications. You can create your own
software or use software provided by ADI (Analog Devices, Inc.) and its third parties to design a
DSP solution for an application.
Types of DSPs:
Because different applications have varying ranges of frequencies, different DSPs are required. DSPs
are classified by their dynamic range, the spread of numbers that must be processed in the course of
an application. This number is a function of the processor’s data width (the number of bits it
manipulates) and the type of arithmetic it performs (fixed or floating point). For example, a
32-bit processor has a wider dynamic range than a 24-bit processor, which has a wider range than a
16-bit processor. Floating-point chips have wider ranges than fixed-point devices. Each type of
processor is
suited for a particular range of applications. Sixteen-bit fixed-point DSPs are used for voice-grade
systems such as phones, since they work with a relatively narrow range of sound frequencies. Hi-
fidelity stereo sound has a wider range, calling for a 16-bit ADC (Analog/Digital Converter), and a
24-bit fixed point DSP. Image processing, 3-D graphics and scientific simulations have a much wider
dynamic range and require a 32-bit floating-point processor.
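To make the link between data width and dynamic range concrete, the Python sketch below uses the
common rule that the ratio between the largest and smallest magnitude representable in an N-bit
fixed-point word is about 2^N, i.e. roughly 6.02 dB per bit. The function name
`fixed_point_dynamic_range_db` is hypothetical, and the figures are theoretical limits rather
than measured values for any particular device.

```python
import math

def fixed_point_dynamic_range_db(bits):
    """Approximate dynamic range of an N-bit fixed-point word in decibels."""
    return 20 * math.log10(2 ** bits)

for width in (16, 24, 32):
    print(width, "bits:", round(fixed_point_dynamic_range_db(width), 1), "dB")
```

The 16-bit, 24-bit and 32-bit cases come out at roughly 96 dB, 144 dB and 193 dB, which matches
the ordering given above for fixed-point devices of increasing data width.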
The processing of the various signals is carried out by equipment that is collectively called
DSP hardware. This includes the hardware used for transmission of signals, the various devices
used to enhance or filter the signals, analogue to digital and digital to analogue converters, and
other processing equipment such as computers.
Among the hardware mentioned above, digital signal processors are the ones in which the actual
processing takes place. Usually the digital signal processors today have the following characteristics:
They are equipped to handle real time processing i.e. they can give the optimal performance
even when streaming data is being fed into them.
The memories that are used to store programs are different from the ones used to store data.
They do not provide hardware that supports multitasking.
They can be used as direct memory access devices in supporting or host environments.
They take analogue signals as input, convert them into digital form, process the signals
and then convert them back into analogue form.
They make use of Direct Memory Access technique.
Digital signal processors usually have an architecture optimized for the following features:
A unit that can handle floating-point numbers is present directly in the data flow path.
The accumulators or multipliers that are present are highly parallel in nature.
Special hardware is included in order to carry out looping at a very low cost.
Their architecture is specially designed so that fetching multiple data items at the same time is
possible.
The calculations are usually carried out in fixed-point arithmetic in order to speed
them up.
In most general-purpose computers today, a register wraps around to the other end of its range
if an overflow occurs. In digital signal processors, however, the result is held (saturated) at
the maximum value.
Specialized instructions are present for modulo and bit-reversed addressing.
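The last item in the list above can be illustrated in software. The Python sketch below mimics
what modulo (circular) addressing and bit-reversed addressing compute; on a real DSP these are
done by address-generation hardware at no extra cycle cost. The function names are hypothetical
and chosen only for this example. Bit-reversed addressing is typically used to reorder data for
the Fast Fourier Transform.

```python
def circular_next(index, length):
    """Modulo addressing: advance a buffer pointer and wrap at the end."""
    return (index + 1) % length

def bit_reverse(index, bits):
    """Bit-reversed addressing: reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result

# Walking a 4-entry circular buffer: the pointer wraps from 3 back to 0.
pointer = 0
visited = []
for _ in range(6):
    visited.append(pointer)
    pointer = circular_next(pointer, 4)
print(visited)

# Access order for an 8-point FFT using 3-bit bit-reversed addressing.
print([bit_reverse(i, 3) for i in range(8)])
```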
These Digital Signal Processors must perform these tasks efficiently while minimizing:
The main applications of Digital Signal Processors are audio signal processing, audio compression,
digital image processing, video compression, speech processing, speech recognition, digital
communications, RADAR, SONAR, seismology and biomedicine. Specific examples are speech
compression and transmission in digital mobile phones, room correction of sound in hi-fi and sound
reinforcement applications, weather forecasting, economic forecasting, seismic data processing,
analysis and control of industrial processes, medical imaging such as CAT scans and MRI, MP3
compression, computer graphics, image manipulation, hi-fi loudspeaker crossovers and equalization,
and audio effects for use with electric guitar amplifiers. All of these applications are actually
put into some kind of dedicated system, which is what is called an embedded system.
Digital Filters – These filters are used to remove noise from the sampled digital signals and
also to look at the different frequency bands of these signals.
Transformation –
Texas Instruments TMS320 is a blanket name for a series of digital signal processors (DSPs) from
Texas Instruments. It was introduced on April 8, 1983 through the TMS32010 processor, which was
then the fastest Digital Signal Processor on the market. The processor is available in many
different variants, some with fixed-point arithmetic and some with floating-point arithmetic. The
floating-point Digital Signal Processor TMS320C3x, which exploits delayed branch logic, has as
many as three delay slots. The flexibility of this line of processors has led to it being used not
merely as a co-processor for digital signal processing but also as a main CPU. Newer
implementations support standard IEEE JTAG control for boundary scan and/or in-circuit debugging.
The original TMS32010 and its subsequent variants are examples of a CPU with a Modified Harvard
architecture, which features separate address spaces for instruction and data memory but the
ability to read data values from instruction memory. The TMS32010 featured a fast
multiply-and-accumulate, useful both in Digital Signal Processor applications and in
transformations used in computer graphics. The graphics controller card for the Apollo Computer
DN570 Workstation, released in 1985, was based on the TMS32010 and could transform 20,000 2D
vectors/second.
For this reason people working with DSPs often abbreviate a processor as "C5x" when the actual
name is something like TMS320C5510, since all products obviously have the name "TMS320" and all
processors with "C5" in the name are code compatible and share the same basic features. Sometimes
you will even hear people talking about "C55x" and similar subgroupings, since processors in the
same series and same generation are even more similar.
TMS320C1x, the first generation of 16-bit fixed-point DSPs. All processors in this series are
code-compatible with the TMS32010
o TMS32010, the very first processor in the first series introduced in 1983, using
external memory
o TMS320M10, the same processor but with an internal ROM of 3 KB
o TMS320C10, TMS320C15 etc.
TMS320C3x, floating point
o TMS320VC33
TMS320C4x, floating point
TMS320C8x, multiprocessor chip
o TMS320C80 MVP (multimedia video processor) has a 32 bit floating-point "master
processor" and four 32-bit fixed-point "parallel processors". In many ways the Cell
microprocessor followed this design approach.
C2000 Series:
TMS320 C2000 series consists of 2 families: C240x, an older 16-bit line that is no longer
recommended for new development and the C28xx 32-bit line. The newer C28xx family
consists of a Delfino high-performance floating point line and a low-cost Piccolo line. The
C2000 series is notable for its high performance set of on-chip control peripherals including
PWM, ADC, quadrature encoder modules, and capture modules. The series also contains
support for I2C, SPI, serial (SCI), CAN, watchdog, McBSP, external memory interface and
GPIO. Due to features like PWM waveform synchronization with the ADC unit, the C2000
line is well suited to many real-time control applications. The C2000 family is commonly
used for digital motor control and power conversion. A line of low cost kits for digital power,
renewable energy and digital motor control allows experimentation with the MCU.
C5000 Series:
TMS320C54x 16-bit fixed point DSP, 5 stage pipeline with in-order-execution of Opcode,
parallel load/store on arithmetic operations, multiply accumulate and other DSP
enhancements. It has internal multi-port memory with no cache unit.
o A popular choice for 2G Software defined cell phone radios, particularly GSM, circa
late 1990s when many Nokia and Ericsson cell phones made use of the C54x.
o At the time, desire to improve the user interface of cell phones led to the adoption of
ARM7 as a general purpose processor for user interface and control, off-loading this
function from the DSP. This ultimately led to the creation of a dual core
ARM7+C54x DSP, which later evolved into the OMAP product line.
TMS320C55x generation - fixed point, runs C54x code but adds more internal parallelism
(another ALU, dual MAC, more memory bandwidth) and registers, while supporting much
lower power operation
o Today, most C55x DSPs are sold as discrete chips
o OMAP1 chips combine an ARM9 (ARMv5TEJ) with a C55x series DSP
o OMAP2420 chips combine an ARM11 (ARMv6) with a C55x series DSP
C6000 Series:
DaVinci chips include one or both of an ARM9 and a C64x+ fixed point DSP
o OMAP-L13x chips include an ARM9 (ARMv5TEJ) and a C67x floating point DSP
o OMAP243x chips combine an ARM11 (ARMv6) with a C64x series DSP
o OMAP3 and OMAP4 chips include an ARM Cortex-A8 or A9 (ARMv7) and
frequently a fixed point C64x+ DSP
DaVinci Series:
The DaVinci series started with systems-on-a-chip using an embedded C6000 Series (C64x+)
DSP, ARM9 application processors, and Digital Media peripherals. There are variants
without ARMs, and without DSPs. Their marketing focuses on their video processing
capabilities. Original chips supported NTSC and PAL, while newer ones support HDTV.
OMAP Variants:
OMAP variants also have an ARM processor in the same chip. There are also OMAP
processors with other secondary processors, so these are not necessarily DSPs.
DA Variants:
DM variants:
[Figure: TMS32010 organization. Processor with separate Instruction Memory and Data Memory; data
path consisting of T-Register, Multiplier, P-Register, ALU and Accumulator]
Looking at the internal organization of the TMS32010 data path, there is a dedicated Multiplier
for the multiply operation at each stage of the FIR filter, and it is used over the entire
sequence of data. The data is fetched from memory and put into a separate register called the
T-Register for processing. The output of the multiplier goes into the P-Register, which is used
with the ALU to build up the accumulated result in the Accumulator. This result can be fed back
into the multiplier if required. This organization of the data path is primarily meant to
facilitate multiply and accumulate operations, which are the key operations in any filter
implementation and are critical for FIR filters.
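The flow of data through the T-Register, Multiplier, P-Register, ALU and Accumulator can be
mimicked in a few lines of Python. This is only a behavioural sketch of one FIR output
computation, not TMS32010 code: the variable names follow the register names in the text, the
function name `fir_output` is hypothetical, and register widths and timing are ignored.

```python
def fir_output(samples, coeffs):
    """Compute one FIR output as a sum of products, one multiply-accumulate per tap."""
    acc = 0                  # Accumulator starts at zero
    for x, h in zip(samples, coeffs):
        t = x                # sample is loaded from data memory into the T-Register
        p = t * h            # Multiplier writes the product into the P-Register
        acc = acc + p        # ALU adds the P-Register to the Accumulator
    return acc

# Four-tap moving sum: every coefficient is 1, so the output is the sum of the samples.
print(fir_output([1, 2, 3, 4], [1, 1, 1, 1]))
# Three-tap filter with coefficients 2, 0, -1.
print(fir_output([1, 2, 3], [2, 0, -1]))
```

Each pass through the loop is one multiply-accumulate step, which is why hardware that performs
this step quickly dominates the performance of a filter implementation.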
In the data path of most Digital Signal Processors there is specialized hardware to perform
key arithmetic operations, particularly multiply and accumulate, in one cycle.
There is hardware support for shifters, which facilitate the adjustment
of mantissas of floating-point numbers.
In many of these processors the accumulators in the data path have guard bits. These are
additional bits that increase the accuracy of results.
There is also hardware support for saturation of results to prevent wrap-around on overflow or
underflow.
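The difference between wrap-around and saturation mentioned in the list above can be shown with
a small Python sketch. A 16-bit signed result is assumed purely for illustration, and the
function names `wrap16` and `saturate16` are hypothetical.

```python
def wrap16(value):
    """Two's-complement wrap-around to 16 bits, as in an ordinary ALU."""
    value &= 0xFFFF                       # keep only the low 16 bits
    return value - 0x10000 if value & 0x8000 else value

def saturate16(value):
    """Clamp to the 16-bit signed range, as saturation hardware would."""
    return max(-32768, min(32767, value))

total = 30000 + 10000                     # true sum does not fit in 16 bits
print(wrap16(total))                      # wraps to a large negative number
print(saturate16(total))                  # held at the positive maximum
print(saturate16(-30000 - 10000))         # held at the negative minimum
```

For a signal, the wrapped result (a loud positive sample turning into a large negative one) is
far more damaging than the clipped result, which is why saturation is supported in hardware.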
The taps of an FIR filter imply multiple memory accesses, so DSPs need multiple data ports,
supported by multiple data buses and memories. Many DSPs also implement ad hoc techniques to
reduce memory bandwidth demand, such as:
Instruction repeat buffer – a set of instructions that is repeated is not fetched again from
memory but is held in a buffer and looped from there, and
Disabling of interrupts, at the cost of increased interrupt latency.
Many DSPs have instruction caches which may allow a programmer to “lock in” instructions in the
cache. DSPs typically have no data caches, because the data either arrives in a sequence or is
already stored in a buffer to be applied in sequence, and the same data is not expected to be used
multiple times. These DSPs may have multiple data memories.
Immediate
Displacement and
Register indirect
These addressing modes keep the Multiply and Accumulate (MAC) data path busy, whereas extra
instructions imply clock cycles of overhead in the inner loop that may reduce the efficiency with
which the MAC is used. Complex addressing modes can be used, but the data path is not used to
calculate addresses; it is primarily targeted at data manipulation. Address manipulations can be
done in auxiliary computation units.
For instruction execution control, DSPs have hardware for fast looping over repeated
instructions, giving a zero-overhead loop, because the same operations have to be performed on
different sets of data. DSPs also have fast interrupts for I/O handling, because each input sample
has to be handled. There is also debugging support in the hardware.
In the specialized peripherals for Digital Signal Processors there can be various:
Timers
On-chip A/D, D/A converters
On-chip DMA controller for fast memory transfer of data from one memory bank to another
memory bank