Module-1 QA (1)
Q1
Q2
Soln: The following Figure shows a typical embedded device based on an ARM core.
Each box represents a feature or function. The lines connecting the boxes are the buses
carrying data.
We can separate the device into four main hardware components:
1. The ARM processor controls the embedded device. Different versions of the ARM
processor are available to suit the desired operating characteristics. An ARM processor
comprises a core (the execution engine that processes instructions and manipulates data)
plus the surrounding components (such as memory management and caches) that interface it with a bus.
2. Controllers coordinate important functional blocks of the system. Two commonly found
controllers are interrupt and memory controllers.
3. The peripherals provide all the input-output capability external to the chip and are
responsible for the uniqueness of the embedded device.
4. A bus is used to communicate between different parts of the device.
ARM Bus technology
• Embedded systems use different bus technologies from those
designed for x86 PCs.
– The x86 PC commonly uses PCI bus technology, which connects devices such as
video cards and hard disk controllers; this type of bus is external, or off-chip.
– Embedded devices use an on-chip bus, which is internal to the
chip.
• A bus has two architecture levels:
– The first is the physical level, which covers the electrical
characteristics and bus width (16, 32, or 64 bits).
– The second is the protocol: the logical
rules governing the communication between the processor and
a peripheral.
• ARM seldom implements the electrical characteristics of the bus,
but it routinely specifies the bus protocol.
AMBA Bus Protocol
• AMBA: Advanced Microcontroller Bus Architecture
• Introduced in 1996, it has been widely adopted as the on-chip bus architecture for
ARM processors.
• The first AMBA buses introduced were:
» ASB: ARM System Bus, and
» APB: ARM Peripheral Bus
• Later, ARM introduced another bus design:
» AHB: ARM High-performance Bus
• The AMBA specification itself expands these names as Advanced System Bus,
Advanced Peripheral Bus, and Advanced High-performance Bus.
• Using AMBA, peripheral designers can reuse the same design on multiple
projects (with different processor architectures).
• A plug-and-play interface improves availability and time to market for hardware
developers.
• AHB provides higher data throughput than ASB because:
» It uses a centralized multiplexed bus scheme (rather than ASB's
bidirectional bus design).
» This change allows the AHB bus to run at higher clock speeds.
» It supports widths of 64 and 128 bits.
• There are two variations on the AHB bus:
» Multi-layer AHB, which allows multiple active bus masters
» AHB-Lite, which is limited to a single bus master
Memory
– An embedded system needs some form of memory to store and
execute code.
– When choosing memory, compare price, performance, and
power consumption.
– The specific memory characteristics to decide on are hierarchy, width, and type.
– If memory has to run twice as fast to maintain a required bandwidth, it may need more
power.
Memory types
• DRAM: the most commonly used RAM for devices.
– Dynamic: its storage cells need to be refreshed and given a new electronic
charge every few milliseconds, so you need to set up a DRAM controller before using
the memory.
• SRAM: faster than the more traditional DRAM (SRAM does not require a
pause between data accesses) and does not need refreshing.
• SDRAM
– is one of many subcategories of DRAM.
– is clocked, so it synchronizes with the processor bus; data is pipelined internally
and transferred on the bus in a burst.
Peripherals
• Embedded systems that interact with the outside world need some form of
peripheral device.
– Peripherals range from a simple serial communication device to a more
complex 802.11 wireless device.
• All ARM peripherals are memory mapped: the programming interface is a set
of memory-addressed registers.
• Controllers are specialized peripherals that implement higher levels of
functionality within an embedded system.
– Two important types of controllers are:
• Memory controller
• Interrupt controller
» Standard (normal) interrupt controller
» Vectored interrupt controller (VIC), which prioritizes interrupts and
simplifies interrupt dispatch
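To illustrate what "memory mapped" means, the sketch below models a peripheral as registers at fixed addresses, so that software drives the device purely through reads and writes to those addresses. The base address `UART_BASE`, the register offsets, and the `TX_READY` bit are hypothetical values chosen for illustration and do not describe any real ARM device. On real hardware the same accesses would normally be load and store instructions rather than method calls.

```python
# Hypothetical register layout for an illustrative serial device.
UART_BASE = 0x10000000          # assumed base address
UART_DATA = UART_BASE + 0x0     # write a byte here to transmit it
UART_STATUS = UART_BASE + 0x4   # bit 0 set means the transmitter is ready
TX_READY = 0x1


class Bus:
    """Toy model of an address space with one memory-mapped serial device."""

    def __init__(self):
        self.sent = []  # bytes the simulated device has transmitted

    def read32(self, address):
        if address == UART_STATUS:
            return TX_READY  # the simulated transmitter is always ready
        raise ValueError(f"no readable register at {address:#010x}")

    def write32(self, address, value):
        if address == UART_DATA:
            self.sent.append(value & 0xFF)  # the device keeps only the low byte
        else:
            raise ValueError(f"no writable register at {address:#010x}")


def uart_send(bus, text):
    """Send text by polling the status register, then writing the data register."""
    for byte in text.encode("ascii"):
        while not bus.read32(UART_STATUS) & TX_READY:
            pass  # busy-wait until the device reports ready
        bus.write32(UART_DATA, byte)
```

Calling `uart_send(Bus(), "OK")` performs one status read and one data write per character; the driver never needs any interface other than the register addresses.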
Q3
Q. Compare and explain the different types of software used in an embedded system.
Initialization code (or boot code) is the first code executed on the
board and is specific to a particular target or group of targets. It sets
up the minimum parts of the board before handing control over to
the operating system.
– It takes the processor from the reset state to a state where the
operating system can run.
» Configuring the memory controller and caches
» Initializing some devices
» In a simple system, the OS may be replaced by a debug
monitor or a simple scheduler.
• Three phases of tasks before handing over the control to the operating
system are:
– Initial hardware configuration
» Satisfy the requirements of the booted image
• e.g. re-organization of the memory map
– Diagnostics
» Fault identification and isolation
– Booting
» Loading an image and handing control over to the image
» The boot process may be complicated if the system
must boot different operating systems or different
versions of the same operating system.
Operating Systems
• OS organizes the system resources
– peripherals, memory, and processing time
» With an OS controlling these resources, they can be
efficiently used by different applications running
within the OS environment.
• ARM processors support over 50 OSes
– Two main categories: RTOS and platform OS
» RTOS: guarantees response times to events
» Platform OS: requires an MMU and tends to have secondary
storage (for large applications).
– These two categories of OSes are not mutually exclusive.
– ARM has developed a set of processor cores that specifically
target each category.
Applications
• The OS schedules applications
– code dedicated to handling a particular task.
• ARM processors are found in numerous market segments, including
– networking, automotive, mobile and consumer devices,
mass storage, and imaging.
• In contrast, ARM processors are not found in applications that
require leading-edge high performance.
Q4
Q. Draw and Explain data flow diagram (architectural diagram) of ARM.
Soln: A programmer can think of an ARM core as functional units connected by data
buses, as shown in the following Figure.
The arrows represent the flow of data, the lines represent the buses, and the boxes
represent either an operation unit or a storage area.
Data enters the processor core through the Data bus. The data may be an
instruction to execute or a data item.
The Figure shows a Von Neumann implementation of the ARM, in which data items and
instructions share the same bus. (In contrast, Harvard implementations of the
ARM use two different buses.)
The instruction decoder translates instructions before they are executed. Each
instruction executed belongs to a particular instruction set.
There are no data processing instructions that directly manipulate data in
memory. Thus, data processing is carried out in registers.
Data items are placed in the register file—a storage bank made up of 32-bit
registers.
Since the ARM core is a 32-bit processor, most instructions treat the registers as
holding signed or unsigned 32-bit values. The sign extend hardware converts
signed 8-bit and 16-bit numbers to 32-bit values as they are read from memory
and placed in a register.
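The effect of the sign extend hardware can be sketched in a few lines. The function name `sign_extend` is only illustrative; it shows how a signed 8-bit or 16-bit value read from memory becomes the equivalent 32-bit register value.

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a 32-bit register value."""
    low_mask = (1 << bits) - 1
    value &= low_mask                      # keep only the bits that were loaded
    if value & (1 << (bits - 1)):          # top loaded bit set: value is negative
        value |= 0xFFFFFFFF & ~low_mask    # fill the upper bits with ones
    return value
```

A loaded byte of 0x80 (which is -128 as a signed 8-bit number) becomes 0xFFFFFF80, while 0x7F stays 0x0000007F.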
ARM instructions typically have two source registers, Rn and Rm, and a single
result or destination register, Rd. Source operands are read from the register file
using the internal buses A and B, respectively.
The ALU (arithmetic logic unit) or MAC (multiply-accumulate unit) takes the
register values Rn and Rm from the A and B buses and computes a result. Data
processing instructions write the result in Rd directly to the register file.
Load and store instructions use the ALU to generate an address to be held in the address
register and broadcast on the Address bus.
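The data flow described above can be modelled roughly as follows. This is a simplified sketch, not a description of the real hardware: the class `Core` and its method names are invented for illustration, memory is a plain dictionary, and only an add, a load, and a store are modelled. It shows that arithmetic happens only between registers, while memory is touched only by load and store operations that use the ALU to form an address.

```python
MASK32 = 0xFFFFFFFF


class Core:
    """Toy model of the ARM data path: register file, ALU, and memory access."""

    def __init__(self):
        self.r = [0] * 16   # register file r0..r15
        self.memory = {}    # address -> 32-bit word

    def add(self, rd, rn, rm):
        # Rn and Rm are read over the A and B buses; the ALU result goes to Rd.
        self.r[rd] = (self.r[rn] + self.r[rm]) & MASK32

    def ldr(self, rd, rn, offset=0):
        # The ALU generates the address; the loaded word is placed in Rd.
        address = (self.r[rn] + offset) & MASK32
        self.r[rd] = self.memory.get(address, 0)

    def str_(self, rd, rn, offset=0):
        # The ALU generates the address; the value in Rd is written to memory.
        address = (self.r[rn] + offset) & MASK32
        self.memory[address] = self.r[rd]
```

Incrementing a word in memory therefore takes three steps in this model: `ldr` to bring the value into a register, `add` to modify it, and `str_` to write it back.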
Q5. Q. What is the pipelining mechanism? Discuss the different pipeline stages and
characteristics in ARM. OR Explain the pipeline mechanism in the Advanced RISC
Machine processor.
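A pipeline is the mechanism a RISC processor uses to execute instructions: it speeds up
execution by fetching the next instruction while other instructions are being decoded and
executed. The pipeline design for each ARM family differs. The ARM7 core uses a
three-stage pipeline (fetch, decode, execute), whereas the ARM9 core
increases the pipeline length to five stages, as shown in the Figure.
• Example: a sequence of three instructions (ADD, SUB, CMP) is fetched, decoded, and
executed by the processor.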
• Each instruction takes a single cycle to complete after the pipeline is filled.
• The three instructions are placed into the pipeline sequentially.
• In the first cycle the core fetches the ADD instruction from memory.
• In the second cycle the core fetches the SUB instruction and decodes the ADD
instruction.
• In the third cycle, both the SUB and ADD instructions are moved along the
pipeline:
– The ADD instruction is executed.
– The SUB instruction is decoded.
– The CMP instruction is fetched.
• This procedure is called filling the pipeline.
• The pipeline allows the core to execute an instruction every cycle.
• As pipeline length increases, the amount of work done at each stage is reduced,
which allows the processor to attain a higher operating frequency.
• This in turn increases the performance.
• System latency also increases, because it takes more cycles to fill the pipeline
before the core can execute an instruction.
• Increased pipeline length also means there can be data dependencies between certain
stages.
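The filling of a three-stage pipeline described above can be traced with a small simulation. This is an idealized sketch: it assumes every instruction spends exactly one cycle in each stage and ignores stalls and dependencies. The function name `pipeline_trace` is illustrative.

```python
STAGES = ("fetch", "decode", "execute")


def pipeline_trace(instructions):
    """Return, for each cycle, the instruction occupying each stage (or None)."""
    trace = []
    total_cycles = len(instructions) + len(STAGES) - 1
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            index = cycle - depth  # instruction i reaches stage s in cycle i + s
            in_range = 0 <= index < len(instructions)
            row[stage] = instructions[index] if in_range else None
        trace.append(row)
    return trace
```

For `pipeline_trace(["ADD", "SUB", "CMP"])`, the third entry has ADD in execute, SUB in decode, and CMP in fetch, matching the third cycle described above. From that point on, one instruction completes per cycle.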
• The pipeline design for each ARM family differs. For example, the ARM9 core
increases the pipeline length to five stages.
• The ARM9 adds memory and writeback stages, which allows it to process
on average 1.1 Dhrystone MIPS per MHz, increasing throughput by around
13% compared with an ARM7.
• The maximum core frequency attainable using an ARM9 is also higher.
• The ARM10 increases the pipeline length still further by adding a sixth stage:
– on average 1.3 Dhrystone MIPS per MHz
– about 34% more throughput than an ARM7 processor core, but again at a higher
latency cost.
• Even though the ARM9 and ARM10 pipelines are different, they still use the
same pipeline execution characteristics as an ARM7.
• Code written for the ARM7 will therefore execute on an ARM9 or ARM10.
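The throughput percentages quoted above follow from the Dhrystone figures if the ARM7 baseline is taken as 0.97 Dhrystone MIPS per MHz. That baseline is an assumption here, since these notes do not state it; the short check below shows the arithmetic under that assumption.

```python
ARM7_DMIPS_PER_MHZ = 0.97   # assumed baseline, not stated in the notes
ARM9_DMIPS_PER_MHZ = 1.1
ARM10_DMIPS_PER_MHZ = 1.3


def throughput_gain_percent(core, baseline=ARM7_DMIPS_PER_MHZ):
    """Percentage gain in Dhrystone MIPS per MHz relative to the baseline core."""
    return (core / baseline - 1.0) * 100.0
```

With these values, `throughput_gain_percent(ARM9_DMIPS_PER_MHZ)` rounds to 13 and `throughput_gain_percent(ARM10_DMIPS_PER_MHZ)` rounds to 34. These are per-MHz gains; the higher clock frequencies of the longer pipelines add to them.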
The ARM pipeline has not processed an instruction until it passes completely through
the execute stage.
For example, an ARM7 pipeline (with three stages) has executed an instruction only
when the fourth instruction is fetched.
Figure 7.11 shows an instruction sequence on an ARM7 pipeline.
The MSR instruction is used to enable IRQ interrupts: it clears the I bit in the cpsr.
The interrupts are enabled only once the MSR instruction completes the execute stage of
the pipeline.
Therefore, once the ADD instruction that follows enters the execute stage of the pipeline,
IRQ interrupts are enabled.
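The bit manipulation behind enabling IRQs can be sketched as below. On an ARM core this is typically done with an MRS, BIC, MSR sequence; the Python functions here (with illustrative names) only model the effect on the cpsr value, where the I bit is bit 7 and a set bit means IRQs are masked.

```python
I_BIT = 1 << 7  # cpsr bit 7: IRQ mask (1 = IRQs disabled)


def enable_irq(cpsr):
    """Return the cpsr value with the I bit cleared and all other bits unchanged."""
    return cpsr & ~I_BIT & 0xFFFFFFFF


def irq_enabled(cpsr):
    """True when the I bit is clear, meaning IRQs are not masked."""
    return (cpsr & I_BIT) == 0
```

For example, a cpsr of 0xD3 (both interrupt masks set) becomes 0x53 after `enable_irq`, so only the I bit has changed.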
The second form, attached to the Harvard-style cores, has separate caches for data and
instruction, as shown in the following Figure.
Soln: When an exception or interrupt occurs, the processor sets the pc to a specific
memory address. The address is within a special address range called the vector table.
The entries in the vector table are instructions that branch to specific routines designed to
handle a particular exception or interrupt. The memory map address 0x00000000 (or, in some
processors, 0xffff0000) is reserved for the vector table, a set of 32-bit
words. When an exception or interrupt occurs, the processor suspends normal
execution and starts loading instructions from the exception vector table (see the
following Table).
Each vector table entry contains a form of branch instruction pointing to the start of a
specific routine:
• Reset vector is the location of the first instruction executed by the processor
when power is applied. This instruction branches to the initialization code.
• Undefined instruction vector is used when the processor cannot decode an
instruction.
• Software interrupt vector is called when you execute a SWI instruction. The SWI
instruction is frequently used as the mechanism to invoke an operating system routine.
• Prefetch abort vector occurs when the processor attempts to fetch an instruction
from an address without the correct access permissions. The actual abort occurs
in the decode stage.
• Data abort vector is similar to a prefetch abort, but is raised when an instruction
attempts to access data memory without the correct access permissions.
• Interrupt request vector is used by external hardware to interrupt the normal
execution flow of the processor. It can only be raised if IRQs are not masked in
the cpsr.
• Fast interrupt request vector is similar to the interrupt request, but is reserved for
hardware requiring faster response times. It can only be raised if FIQs are not masked in
the cpsr.
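The layout of the vector table can be summarized as a lookup from exception to address. The sketch below uses the standard ARM vector offsets (one 32-bit word per entry, with the word at offset 0x14 reserved); the dictionary keys and the function name `vector_address` are illustrative.

```python
# Offset of each vector from the base of the vector table.
VECTOR_OFFSETS = {
    "reset": 0x00,
    "undefined instruction": 0x04,
    "software interrupt": 0x08,
    "prefetch abort": 0x0C,
    "data abort": 0x10,
    # 0x14 is reserved
    "interrupt request": 0x18,
    "fast interrupt request": 0x1C,
}

LOW_VECTORS = 0x00000000
HIGH_VECTORS = 0xFFFF0000   # alternative base used on some processors


def vector_address(exception, high_vectors=False):
    """Address the pc is set to when the given exception is taken."""
    base = HIGH_VECTORS if high_vectors else LOW_VECTORS
    return base + VECTOR_OFFSETS[exception]
```

Because each entry is only one word, it normally holds a branch to the real handler. The fast interrupt request vector is last in the table, so its handler can start at that address without a branch.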
Q11 Q. Explain the Banking Registers of ARM processor with complete register sets
available.
General-purpose registers hold either data or an address. They are identified with the
letter r prefixed to the register number. For example, register 4 is given the label r4.
The Figure shows the active registers available in user mode, a protected mode
normally used when executing applications. The processor can operate in seven different
modes. All the registers shown are 32 bits in size. There are up to 18 active registers:
16 data registers and 2 processor status registers.
The data registers visible to the programmer are r0 to r15.
The ARM processor has three registers assigned to a particular task or special function:
r13, r14, and r15. They are frequently given different labels to differentiate them from the
other registers.
Register r13 is traditionally used as the stack pointer (sp) and stores the head of
the stack in the current processor mode.
Register r14 is called the link register (lr) and is where the core puts the return
address whenever it calls a subroutine.
Register r15 is the program counter (pc) and contains the address of the next
instruction to be fetched by the processor.
In ARM state the registers r0 to r13 are orthogonal—any instruction that you can apply
to r0 you can equally well apply to any of the other registers. In addition to the 16 data
registers, there are two program status registers: cpsr (current program status register)
and spsr (saved program status register).
• Registers r13 and r14 can also be used as general-purpose registers.
• However, using r13 as a general-purpose register is not recommended when an OS is
present, because the OS often assumes that r13 always points to a valid stack frame.
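Register banking can be sketched as a mapping from (mode, register number) to a physical register. The sketch assumes the classic ARM banking scheme, which these notes do not spell out: r13 and r14 are banked in every mode except that user and system share one bank, fast interrupt mode additionally banks r8 to r12, and each exception mode has its own spsr. The mode names, suffixes, and function names are illustrative.

```python
# Suffix used for each mode's banked registers (user and system share a bank).
BANK_SUFFIX = {
    "user": "", "system": "",
    "fiq": "_fiq", "irq": "_irq", "svc": "_svc",
    "abort": "_abt", "undefined": "_und",
}


def physical_register(mode, n):
    """Name of the physical register that rN refers to in the given mode."""
    suffix = BANK_SUFFIX[mode]
    if n in (13, 14) or (mode == "fiq" and 8 <= n <= 12):
        return f"r{n}{suffix}"
    return f"r{n}"  # unbanked: the same physical register in every mode


def all_physical_registers():
    """Every distinct register in the complete set, including status registers."""
    general = {physical_register(m, n) for m in BANK_SUFFIX for n in range(16)}
    status = {"cpsr"} | {f"spsr{s}" for s in BANK_SUFFIX.values() if s}
    return general | status
```

Under these assumptions the complete set contains 37 registers (31 general-purpose plus the cpsr and five spsr registers), of which at most 18 are visible in any one mode.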
Registers in ARM:
• Orthogonal registers (cf. VAX, PDP-11)
– r0 to r13 are orthogonal: if a given instruction can use r0,
then it can equally use any of the others.
• Special-purpose registers
– r13 (sp), r14 (lr), r15 (pc)
• Program status registers (PSRs)
– There are two: cpsr and spsr (the current and
saved program status registers, respectively).
– Condition codes: N, Z, C, V
– Interrupt masks: I (IRQ), F (FIQ)
– Thumb state bit: T
– Mode (5 bits)
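The fields listed above can be extracted from a cpsr value as sketched below. The bit positions and mode encodings are the standard ARM ones (flags in bits 31 to 28, I, F, and T in bits 7, 6, and 5, mode in bits 4 to 0); the function name `decode_cpsr` and the dictionary layout are illustrative.

```python
# Standard encodings of the 5-bit mode field.
MODES = {
    0b10000: "user", 0b10001: "fiq", 0b10010: "irq", 0b10011: "svc",
    0b10111: "abort", 0b11011: "undefined", 0b11111: "system",
}


def decode_cpsr(cpsr):
    """Split a 32-bit program status register value into its named fields."""
    return {
        "N": bool((cpsr >> 31) & 1),   # negative
        "Z": bool((cpsr >> 30) & 1),   # zero
        "C": bool((cpsr >> 29) & 1),   # carry
        "V": bool((cpsr >> 28) & 1),   # overflow
        "I": bool((cpsr >> 7) & 1),    # IRQ masked when set
        "F": bool((cpsr >> 6) & 1),    # FIQ masked when set
        "T": bool((cpsr >> 5) & 1),    # Thumb state when set
        "mode": MODES.get(cpsr & 0x1F, "unknown"),
    }
```

For instance, `decode_cpsr(0x600000D3)` reports Z and C set, both interrupt masks set, ARM state, and svc mode.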