Embedded Systems Embedded Processors-1

This document discusses the architecture of embedded processors. It covers: 1) The key differences between microcontrollers and microprocessors are that microcontrollers are generally used in embedded applications, have simpler memory hierarchies without cache, and have power and size constraints. 2) A typical microcontroller like the Intel 80X96 has a CPU core connected via buses to memory, I/O interfaces, and peripherals like timers and analog-digital converters. 3) The CPU core contains a register file, arithmetic logic unit, program counter, status register, and microcode engine that controls execution.
Copyright
© Attribution Non-Commercial (BY-NC)
Module 2
Embedded Processors and Memory

Lesson 10
Embedded Processors - I

Version 2 EE IIT, Kharagpur
In this lesson the student will learn the following

• Architecture of an Embedded Processor
• The Architectural Overview of Intel MCS 96 family of Microcontrollers

Pre-requisite

Digital Electronics

10.1 Introduction

It is generally difficult to draw a clear-cut boundary between the class of microcontrollers and
general purpose microprocessors. Distinctions can be made or assumed on the following
grounds.
• Microcontrollers are generally associated with the embedded applications
• Microprocessors are associated with the desktop computers
• Microcontrollers will have simpler memory hierarchy i.e. the RAM and ROM may exist
  on the same chip and generally the cache memory will be absent.
• The power consumption and temperature rise of microcontroller is restricted because of
  the constraints on the physical dimensions.
• 8-bit and 16-bit microcontrollers are very popular with a simpler design as compared to
  large bit-length (32-bit, 64-bit) complex general purpose processors.

[Figure: plot with axes Performance and Cost. Regions, from lowest to highest: 4-bit controller;
8- or 16-bit controller; 32-bit embedded controllers/processor (embedded control); 32- or 64-bit
desktop processors]
Fig. 10.1 The Performance vs Cost regions

However, the market for 32-bit embedded processors has recently been growing.
Further, issues such as power consumption, cost, and integrated peripherals differentiate a
desktop CPU from an embedded processor. Other important features include the interrupt
response time, the amount of on-chip RAM or ROM, and the number of parallel ports. The
desktop world values processing power, whereas an embedded microprocessor must do the job
for a particular application at the lowest possible cost.

[Figure: (a) Microprocessor-based system, with blocks Microprocessor, ROM, EEPROM, RAM, Serial I/O,
Parallel I/O, Timer, PWM, A/D, D/A and Input and output ports. (b) Microcontroller-based system,
with blocks CPU core, ROM, EEPROM, RAM, A/D, Serial I/O, Parallel I/O, Timer, PWM and Filter,
and signals Analog in, Analog out and Digital PWM]
Fig. 10.2 Microprocessor versus microcontroller

Fig. 10.1 shows the performance versus cost plot of the available microprocessors. Naturally, the
higher the performance, the higher the cost. The embedded controllers occupy the lower left-hand
corner of the plot.

Fig. 10.2 shows the architectural difference between two systems, one built around a general purpose
microprocessor and the other around a microcontroller. The hardware requirement of the former
system is greater than that of the latter: separate chips or circuits for the serial interface,
parallel interface, memory and AD-DA converters are necessary. On the other hand, the functionality,
flexibility and complexity of information handling are greater in the former.

10.2 The Architecture of a Typical Microcontroller

A typical microcontroller chip from the Intel 80X96 family is discussed in the following
paragraphs.

[Figure: block diagram with the blocks Core, Optional ROM, Interrupt Controller, PTS, Clock and
Power Mgmt., I/O, EPA, PWM, WG, A/D, WDT, FG and SIO]
Fig. 10.3 The Architectural Block diagram of Intel 8XC196 Microcontroller

PTS: Peripheral Transaction Server; I/O: Input/Output Interface; EPA: Event Processor Array;
PWM: Pulse Width Modulated Outputs; WG: Waveform Generator; A/D: Analog to Digital
Converter; FG: Frequency Generator; SIO: Serial Input/Output Port

Fig. 10.3 shows the functional block diagram of the microcontroller. The core of the
microcontroller consists of the central processing unit (CPU) and memory controller. The CPU
contains the register file and the register arithmetic-logic unit (RALU). A 16-bit internal bus
connects the CPU to both the memory controller and the interrupt controller. An extension of this
bus connects the CPU to the internal peripheral modules. An 8-bit internal bus transfers
instruction bytes from the memory controller to the instruction register in the RALU.
[Figure: block diagram of the core. The CPU contains the Register File (Register RAM, CPU SFRs)
and the RALU (Microcode Engine, ALU, Master PC, PSW, Registers). The Memory Controller contains
the Prefetch Queue, Slave PC, Address Register, Data Register and Bus Controller]
Fig. 10.4 The Architectural Block diagram of the core

CPU: Central Processing Unit; RALU: Register Arithmetic Logic Unit; ALU: Arithmetic Logic
Unit; Master PC: Master Program Counter; PSW: Processor Status Word; SFR: Special Function
Registers

CPU Control
The CPU is controlled by the microcode engine, which instructs the RALU to perform
operations using bytes, words, or double-words from either the 256-byte lower register file or
through a window that directly accesses the upper register file. Windowing is a technique that
maps blocks of the upper register file into a window in the lower register file. CPU instructions
move from the 4-byte prefetch queue in the memory controller into the RALU's instruction
register. The microcode engine decodes the instructions and then generates the sequence of
events that cause desired functions to occur.

Register Arithmetic-logic Unit (RALU)
The RALU contains the microcode engine, the 16-bit arithmetic logic unit (ALU), the master
program counter (PC), the processor status word (PSW), and several registers. The registers in
the RALU are the instruction register, a constants register, a bit-select register, a loop counter,
and three temporary registers (the upper-word, lower-word, and second-operand registers). The
PSW contains one bit (PSW.1) that globally enables or disables servicing of all maskable
interrupts, one bit (PSW.2) that enables or disables the peripheral transaction server (PTS), and
six Boolean flags that reflect the state of your program. All registers, except the 3-bit bit-select
register and the 6-bit loop counter, are either 16 or 17 bits (16 bits plus a sign extension). Some
of these registers can reduce the ALU's workload by performing simple operations.

The RALU uses the upper- and lower-word registers together for the 32-bit instructions and as
temporary registers for many instructions. These registers have their own shift logic and are used
for operations that require logical shifts, including normalize, multiply, and divide operations.
The six-bit loop counter counts repetitive shifts. The second-operand register stores the second
operand for two-operand instructions, including the multiplier during multiply operations and the
divisor during divide operations. During subtraction operations, the output of this register is
complemented before it is moved into the ALU. The RALU speeds up calculations by storing
constants (e.g., 0, 1, and 2) in the constants register so that they are readily available when
complementing, incrementing, or decrementing bytes or words. In addition, the constants register
generates single-bit masks, based on the bit-select register, for bit-test instructions.

Code Execution
The RALU performs most calculations for the microcontroller, but it does not use an
accumulator. Instead it operates directly on the lower register file, which essentially provides
256 accumulators. Because data does not flow through a single accumulator, the
microcontroller's code executes faster and more efficiently.

Instruction Format
These microcontrollers combine general-purpose registers with a three-operand instruction
format. This format allows a single instruction to specify two source registers and a separate
destination register. For example, a single multiply instruction can multiply two 16-bit variables
and store the 32-bit result in a third variable.
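The three-operand idea can be illustrated with a small Python model of the lower register file. This is only a sketch under stated assumptions: the register addresses, the helper names, the little-endian byte order and the unsigned arithmetic are all chosen for illustration and are not taken from the MCS 96 instruction set.

```python
import struct

REGISTER_FILE_SIZE = 256  # the lower register file described in the text, in bytes

# Hypothetical variable addresses, placed above the 24 bytes reserved for SFRs.
FACTOR_1, FACTOR_2, RESULT = 0x20, 0x22, 0x24


def read_word(rf, addr):
    """Read a 16-bit word (little-endian byte order is an assumption)."""
    return struct.unpack_from("<H", rf, addr)[0]


def write_word(rf, addr, value):
    """Write a 16-bit word, truncating to 16 bits."""
    struct.pack_into("<H", rf, addr, value & 0xFFFF)


def read_long(rf, addr):
    """Read a 32-bit double word."""
    return struct.unpack_from("<I", rf, addr)[0]


def mul_three_operand(rf, dest, src1, src2):
    """Model of a three-operand multiply: two 16-bit sources, a separate 32-bit destination."""
    product = read_word(rf, src1) * read_word(rf, src2)
    struct.pack_into("<I", rf, dest, product & 0xFFFFFFFF)


rf = bytearray(REGISTER_FILE_SIZE)
write_word(rf, FACTOR_1, 1234)
write_word(rf, FACTOR_2, 50000)
mul_three_operand(rf, RESULT, FACTOR_1, FACTOR_2)
product = read_long(rf, RESULT)  # both source registers are left unchanged
```

Because the destination is separate from the sources, neither operand has to be copied into an accumulator first, which is the point made under Code Execution.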

Register File
The register file is divided into an upper and a lower file. In the lower register file, the lowest 24
bytes are allocated to the CPU's special-function registers (SFRs) and the stack pointer, while
the remainder is available as general-purpose register RAM. The upper register file contains only
general-purpose register RAM. The register RAM can be accessed as bytes, words, or double
words. The RALU accesses the upper and lower register files differently. The lower register file
is always directly accessible with direct addressing. The upper register file is accessible with
direct addressing only when windowing is enabled.

Memory Interface Unit
The RALU communicates with all memory, except the register file and peripheral SFRs, through
the memory controller. The memory controller contains the prefetch queue, the slave program
counter (slave PC), address and data registers, and the bus controller. The bus controller drives
the memory bus, which consists of an internal memory bus and the external address/data bus.
The bus controller receives memory-access requests from either the RALU or the prefetch
queue; queue requests always have priority.

When the bus controller receives a request from the queue, it fetches the code from the address
contained in the slave PC. The slave PC increases execution speed because the next instruction
byte is available immediately and the processor need not wait for the master PC to send the
address to the memory controller. If a jump, interrupt, call, or return changes the address
sequence, the master PC loads the new address into the slave PC, then the CPU flushes the queue
and continues processing.
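The interplay of the prefetch queue and the slave PC can be sketched with a small Python model. It is a simplified illustration, not a description of the actual hardware: the class and method names are hypothetical, and bus timing and request priorities are ignored.

```python
from collections import deque

QUEUE_DEPTH = 4  # the text describes a 4-byte prefetch queue


class MemoryController:
    """Toy model: a prefetch queue that is filled from the address in the slave PC."""

    def __init__(self, code, start=0):
        self.code = code
        self.slave_pc = start
        self.queue = deque()

    def prefetch(self):
        """Fetch ahead from the slave PC; the master PC is not consulted."""
        while len(self.queue) < QUEUE_DEPTH and self.slave_pc < len(self.code):
            self.queue.append(self.code[self.slave_pc])
            self.slave_pc += 1

    def next_byte(self):
        """Hand the next instruction byte to the RALU's instruction register."""
        self.prefetch()
        return self.queue.popleft()

    def branch(self, target):
        """Jump, interrupt, call or return: load the new address and flush the queue."""
        self.slave_pc = target
        self.queue.clear()


code = bytes(range(0x10, 0x20))  # 16 dummy code bytes, 0x10 to 0x1F
mc = MemoryController(code)
first_two = [mc.next_byte(), mc.next_byte()]  # sequential fetches served from the queue
mc.branch(8)                                  # the prefetched bytes are discarded
after_branch = mc.next_byte()                 # fetched from the new address
```

In this model, sequential code is always served from bytes that were fetched ahead of time, and only a change in the address sequence costs a refill of the queue.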
Interrupt Service
The interrupt-handling system has two main components: the programmable interrupt controller
and the peripheral transaction server (PTS). The programmable interrupt controller has a
hardware priority scheme that can be modified by software. Interrupts that go through the
interrupt controller are serviced by interrupt service routines that you provide. The
peripheral transaction server (PTS), a microcoded hardware interrupt processor, provides
efficient interrupt handling.

[Figure: clock circuitry. Labels: XTAL1, XTAL2, FXTAL1, Oscillator, Divide-by-two Circuit, Clock
Generators, CLKOUT, Peripheral Clocks (PH1, PH2), CPU Clocks (PH1, PH2), Disable Clock Input
(Powerdown), Disable Oscillator (Powerdown), Disable Clocks (Powerdown), Disable Clocks (Idle,
Powerdown)]
Fig. 10.5 The clock circuitry

Internal Timing
The clock circuitry (Fig. 10.5) receives an input clock signal on XTAL1 provided by an
external crystal or oscillator and divides the frequency by two. The clock generators accept the
divided input frequency from the divide-by-two circuit and produce two non-overlapping
internal timing signals, Phase 1 (PH1) and Phase 2 (PH2). These signals are active when high.

[Figure: timing waveforms. Labels: XTAL1, TXTAL1, 1 State Time, PH1, PH2, CLKOUT, Phase 1,
Phase 2]
Fig. 10.6 The internal clock phases

The rising edges of PH1 and PH2 generate the internal CLKOUT signal (Fig. 10.6). The
clock circuitry routes separate internal clock signals to the CPU and the peripherals to provide
flexibility in power management. Because of the complex logic in the clock circuitry, the signal
on the CLKOUT pin is a delayed version of the internal CLKOUT signal. This delay varies with
temperature and voltage.

I/O Ports
Individual I/O port pins are multiplexed to serve as standard I/O or to carry special function
signals associated with an on-chip peripheral or an off-chip component. If a particular
special-function signal is not used in an application, the associated pin can be individually
configured to serve as a standard I/O pin. Ports 3 and 4 are exceptions; they are controlled at the
port level. When the bus controller needs to use the address/data bus, it takes control of the ports.
When the address/data bus is idle, you can use the ports for I/O. Port 0 is an input-only port that
is also the analog input for the A/D converter. For more details the reader is requested to see the
data manual at www.intel.com/design/mcs96/manuals/27218103.pdf.

Serial I/O (SIO) Port
The microcontroller has a two-channel serial I/O port that shares pins with ports 1 and 2. Some
versions of this microcontroller may not have any. The serial I/O (SIO) port is an
asynchronous/synchronous port that includes a universal asynchronous receiver and transmitter
(UART). The UART has two synchronous modes (modes 0 and 4) and three asynchronous
modes (modes 1, 2, and 3) for both transmission and reception. The asynchronous modes are full
duplex, meaning that they can transmit and receive data simultaneously. The receiver is buffered,
so the reception of a second byte can begin before the first byte is read. The transmitter is also
buffered, allowing continuous transmissions. The SIO port has two channels (channels 0 and 1)
with identical signals and registers.
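
The effect of the receive buffer can be sketched with a small Python model. This is an illustration only: the class, its method names and the overrun flag are hypothetical and do not describe the actual SIO registers. Only the mode numbers are taken from the text above.

```python
SYNCHRONOUS_MODES = {0, 4}      # mode numbers as listed in the text
ASYNCHRONOUS_MODES = {1, 2, 3}  # the full-duplex modes


class BufferedReceiver:
    """Toy model of a receiver with a shift register and a one-byte receive buffer."""

    def __init__(self):
        self.shift_register = None  # byte currently being received
        self.buffer = None          # last complete byte, waiting to be read
        self.overrun = False        # hypothetical flag: an unread byte was overwritten

    def start_reception(self, incoming_byte):
        """A new byte may start arriving even while the buffer is still full."""
        self.shift_register = incoming_byte

    def finish_reception(self):
        """Move the completed byte into the buffer."""
        if self.buffer is not None:
            self.overrun = True
        self.buffer = self.shift_register
        self.shift_register = None

    def read(self):
        """Software reads the buffered byte, which frees the buffer."""
        value, self.buffer = self.buffer, None
        return value


rx = BufferedReceiver()
rx.start_reception(0x41)
rx.finish_reception()     # first byte is now waiting in the buffer
rx.start_reception(0x42)  # reception of the second byte begins before the first is read
first = rx.read()
rx.finish_reception()
second = rx.read()
```

Without the buffer, the software would have to read each byte before the next start bit arrives. With it, the deadline is relaxed to the end of the following byte.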

Event Processor Array (EPA) and Timer/Counters
The event processor array (EPA) performs high-speed input and output functions associated with
its timer/counters. In the input mode, the EPA monitors an input for signal transitions. When an
event occurs, the EPA records the timer value associated with it. This is called a capture event.
In the output mode, the EPA monitors a timer until its value matches that of a stored time value.
When a match occurs, the EPA triggers an output event, which can set, clear, or toggle an output
pin. This is called a compare event. Both capture and compare events can initiate interrupts, which
can be serviced by either the interrupt controller or the PTS. Timer 1 and timer 2 are both 16-bit
up/down timer/counters that can be clocked internally or externally. Each timer/counter is called
a timer if it is clocked internally and a counter if it is clocked externally.

Pulse-width Modulator (PWM)
The output waveform from each PWM channel is a variable duty-cycle pulse. Several types of
electric motor control applications require a PWM waveform for most efficient operation. When
filtered, the PWM waveform produces a DC level that can change in 256 steps by varying the
duty cycle. The number of steps per PWM period is also programmable (8 bits).

Frequency Generator
Some microcontrollers of this class have a frequency generator. This peripheral produces a
waveform with a fixed duty cycle (50%) and a programmable frequency (ranging from 4 kHz to
1 MHz with a 16 MHz input clock).

Waveform Generator
A waveform generator simplifies the task of generating synchronized, pulse-width modulated
(PWM) outputs. This waveform generator is optimized for motion control applications such as
driving 3-phase AC induction motors, 3-phase DC brushless motors, or 4-phase stepping motors.
The waveform generator can produce three independent pairs of complementary PWM outputs,
which share a common carrier period, dead time, and operating mode. Once it is initialized, the
waveform generator operates without CPU intervention unless you need to change a duty cycle.

Analog-to-digital Converter
The analog-to-digital (A/D) converter converts an analog input voltage to a digital equivalent.
Resolution is either 8 or 10 bits; sample and convert times are programmable. Conversions can
be performed on the analog ground and reference voltage, and the results can be used to calculate
gain and zero-offset errors. The internal zero-offset compensation circuit enables automatic zero
offset adjustment. The A/D also has a threshold-detection mode, which can be used to generate
an interrupt when a programmable threshold voltage is crossed in either direction. The A/D scan
mode of the PTS facilitates automated A/D conversions and result storage.

Watchdog Timer
The watchdog timer is a 16-bit internal timer that resets the microcontroller if the software fails
to operate properly.

Special Operating Modes
In addition to the normal execution mode, the microcontroller operates in several special-purpose
modes. Idle and power-down modes conserve power when the microcontroller is inactive.
On-circuit emulation (ONCE) mode electrically isolates the microcontroller from the system, and
several other modes provide programming options for nonvolatile memory.

Reducing Power Consumption
In idle mode, the CPU stops executing instructions, but the peripheral clocks remain active.
Power consumption drops to about 40% of normal execution mode consumption. Either a
hardware reset or any enabled interrupt source will bring the microcontroller out of idle mode. In
power-down mode, all internal clocks are frozen at logic state zero and the internal oscillator is
shut off. The register file and most peripherals retain their data if VCC is maintained. Power
consumption drops into the µW range.

Testing the Printed Circuit Board
The on-circuit emulation (ONCE) mode electrically isolates the microcontroller from the system.
By invoking the ONCE mode, you can test the printed circuit board while the microcontroller is
soldered onto the board.

Programming the Nonvolatile Memory
The microcontrollers that have internal OTPROM provide several programming options:
• Slave programming allows a master EPROM programmer to program and verify one or
  more slave microcontrollers. Programming vendors and Intel distributors typically use
  this mode to program a large number of microcontrollers with a customer's code and
  data.
• Auto programming allows a microcontroller to program itself with code and data
  located in an external memory device. Customers typically use this low-cost method to
  program a small number of microcontrollers after development and testing are complete.
• Run-time programming allows you to program individual nonvolatile memory locations
  during normal code execution, under complete software control. Customers typically use
  this mode to download a small amount of information to the microcontroller after the rest
  of the array has been programmed. For example, you might use run-time programming to
  download a unique identification number to a security device.
• ROM dump mode allows you to dump the contents of the microcontroller's nonvolatile
  memory to a tester or to a memory device (such as flash memory or RAM).

10.3 Conclusion

This lesson discussed the architecture of a typical high-performance microcontroller.
The next lesson shall discuss the signals of a typical microcontroller from the Intel MCS96
family.

10.4 Questions and Answers

1. What do you mean by the Microcode Engine?

Ans: This is where instructions, broken down into smaller micro-instructions, are
executed.
Microprogramming was one of the key breakthroughs that allowed system architects to
implement complex instructions in hardware. To understand what microprogramming is, it
helps to first consider the alternative: direct execution. With direct execution, the machine
fetches an instruction from memory and feeds it into a hardwired control unit. This control
unit takes the instruction as its input and activates some circuitry that carries out the task. For
instance, if the machine fetches a floating-point ADD and feeds it to the control unit, there's
a circuit somewhere in there that kicks in and directs the execution units to make sure that all
of the shifting, adding, and normalization gets done. Direct execution is actually pretty much
what you'd expect to go on inside a computer if you didn't know about microcoding.
The main advantage of direct execution is that it's fast. There's no extra abstraction or
translation going on; the machine is just decoding and executing the instructions right in
hardware. The problem with it is that it can take up quite a bit of space. Think about it. If
every instruction has to have some circuitry that executes it, then the more instructions you
have, the more space the control unit will take up. This problem is compounded if some of
the instructions are big and complex, and take a lot of work to execute. So directly executing
instructions for a CISC machine just wasn't feasible with the limited transistor resources of
the day.
With microprogramming, it's almost like there's a mini-CPU on the CPU. The control
unit is a microcode engine that executes microcode instructions. The CPU designer uses
these microinstructions to write microprograms, which are stored in a special control
memory. When a normal program instruction is fetched from memory and fed into the
microcode engine, the microcode engine executes the proper microcode subroutine. This
subroutine tells the various functional units what to do and how to do it.
As you can probably guess, in the beginning microcode was a pretty slow way to do
things. The ROM used for control memory was about 10 times faster than magnetic core-based
main memory, so the microcode engine could stay far enough ahead to offer decent
performance. As microcode technology evolved, however, it got faster and faster. (The
microcode engines on current CPUs are about 95% as fast as direct execution.) Since
microcode technology was getting better and better, it made more and more sense to just
move functionality from (slower and more expensive) software to (faster and cheaper)
hardware. So ISA instruction counts grew, and program instruction counts shrank.
As microprograms got bigger and bigger to accommodate the growing instruction sets,
however, some serious problems started to emerge. To keep performance up, microcode had
to be highly optimized with no inefficiencies, and it had to be extremely compact in order to
keep memory costs down. And since microcode programs were so large now, it became
much harder to test and debug the code. As a result, the microcode that shipped with
machines was often buggy and had to be patched numerous times out in the field. It was the
difficulties involved with using microcode for control that spurred Patterson and others
to question whether implementing all of these complex, elaborate instructions in microcode
was really the best use of limited transistor resources.

2. What is the function of the Watch Dog Timer?

Ans: A fail-safe mechanism that intervenes if a system stops functioning. A hardware timer
that is periodically reset by software. If the software crashes or hangs, the watchdog timer
will expire, and the entire system will be reset automatically.
The Watch Dog Unit contains a Watch Dog Timer.
A watchdog timer (WDT) is a device or electronic card that performs a specific operation
after a certain period of time if something goes wrong with an electronic system and the
system does not recover on its own.
A common problem is for a machine or operating system to lock up if two parts or
programs conflict, or, in an operating system, if memory management trouble occurs. In
some cases, the system will eventually recover on its own, but this may take an unknown and
perhaps extended length of time. A watchdog timer can be programmed to perform a warm
boot (restarting the system) after a certain number of seconds during which a program or
computer fails to respond following the most recent mouse click or keyboard action. The
timer can also be used for other purposes, for example, to actuate the refresh (or reload)
button in a Web browser if a Web site does not fully load after a certain length of time
following the entry of a Uniform Resource Locator (URL).
A WDT contains a digital counter that counts down to zero at a constant speed from a
preset number. The counter speed is kept constant by a clock circuit. If the counter reaches
zero before the computer recovers, a signal is sent to designated circuits to perform the
desired action.
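
The count-down behaviour of a watchdog timer can be sketched in Python as follows. This is a generic illustration of the principle, not the register-level behaviour of the MCS 96 watchdog: the class, the method names and the preset value are hypothetical.

```python
class WatchdogTimer:
    """Toy model: a counter that counts down from a preset value at a constant rate."""

    def __init__(self, preset):
        self.preset = preset
        self.counter = preset
        self.reset_count = 0  # number of system resets triggered so far

    def kick(self):
        """Healthy software periodically reloads the counter."""
        self.counter = self.preset

    def tick(self):
        """One clock period; returns True if the watchdog expires and resets the system."""
        self.counter -= 1
        if self.counter == 0:
            self.reset_count += 1
            self.counter = self.preset  # counting starts again after the reset
            return True
        return False


wdt = WatchdogTimer(preset=5)

# Healthy software: the counter is reloaded every third tick, so it never reaches zero.
healthy = []
for t in range(12):
    healthy.append(wdt.tick())
    if t % 3 == 2:
        wdt.kick()

# Hung software: nobody reloads the counter, so the fifth tick triggers a reset.
hung = [wdt.tick() for _ in range(5)]
```

The preset value sets the trade-off: it must be longer than the longest legitimate interval between reloads, yet short enough that a hung system is recovered quickly.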
