Module 3 (1)
Module 3:
Programmable Digital Signal Processors:
Introduction, Commercial Digital Signal-Processing Devices, Data Addressing Modes of
TMS320C54xx, Memory Space of TMS320C54xx Processors, Program Control, Detailed Study of
TMS320C54x & 54xx Instructions and Programming, On-Chip Peripherals, Interrupts of
TMS320C54xx Processors, Pipeline Operation of TMS320C54xx Processors. L1, L2, L3
3.1 Introduction:
The TMS320 family consists of two types of single-chip DSPs: 16-bit fixed-point and 32-bit
floating-point. These DSPs possess the operational flexibility of high-speed controllers and the
numerical capability of array processors.
There are several families of commercial DSP devices. Right from the early eighties, when
these devices began to appear in the market, they have been used in numerous applications, such as
communication, control, computers, Instrumentation and consumer electronics. The architectural
features and the processing power of these devices have been constantly upgraded based on the
advances in technology and the application needs. However, in their basic versions, most of them have
a Harvard architecture, a single-cycle hardware multiplier, an address generation unit with dedicated
address registers, special addressing modes, and on-chip peripheral interfaces.
Of the various families of programmable DSP devices that are commercially available, the
three most popular ones are those from Texas Instruments, Motorola, and Analog Devices. Texas
Instruments was one of the first to come out with a commercial programmable DSP with the
introduction of its TMS32010 in 1982.
The central processing unit (CPU) of TMS320C54xx processors consists of a 40-bit
arithmetic logic unit (ALU), two 40-bit accumulators, a barrel shifter, a 17x17 multiplier, a 40-bit
adder, data address generation logic (DAGEN) with its own arithmetic unit, and program address
generation logic (PAGEN). These major functional units are supported by a number of registers
and logic in the architecture.
A powerful instruction set with hardware-supported single-instruction repeat and block
repeat operations, block memory move instructions, instructions that pack two or three
simultaneous reads, and arithmetic instructions with parallel store and load makes these devices very
efficient for running high-speed DSP algorithms.
Several peripherals, such as a clock generator, a hardware timer, a wait state generator,
parallel I/O ports, and serial I/O ports, are also provided on-chip. These peripherals make it
convenient to interface the signal processors to the outside world.
In the following sections, we examine in detail the various architectural features of the
TMS320C54xx family of processors.
The performance of a processor gets enhanced with the provision of multiple buses to
provide simultaneous access to various parts of memory or peripherals. The 54xx architecture is
built around four pairs of 16-bit buses with each pair consisting of an address bus and a data bus.
As shown in Figure 3.1, these are:
• The program bus pair (PAB, PB), which carries the instruction code from the program memory.
• Three data bus pairs (CAB, CB; DAB, DB; and EAB, EB), which interconnect the various units
within the CPU. In addition, the pairs CAB, CB and DAB, DB are used to read from the data
memory, while the pair EAB, EB carries the data to be written to the memory.
The ’54xx can generate up to two data-memory addresses per cycle using the two auxiliary register
arithmetic units (ARAU0 and ARAU1) in the DAGEN block. This enables accessing two operands
simultaneously.
The ‘54xx CPU is common to all the ‘54xx devices. The ’54xx CPU contains a 40-bit
arithmetic logic unit (ALU); two 40-bit accumulators (A and B); a barrel shifter; a 17 x 17-bit
multiplier; a 40-bit adder; a compare, select and store unit (CSSU); an exponent encoder (EXP); a
data address generation unit (DAGEN); and a program address generation unit (PAGEN).
The ALU performs 2’s complement arithmetic operations and bit-level Boolean operations
on 16-, 32-, and 40-bit words. It can also function as two separate 16-bit ALUs and perform two
16-bit operations simultaneously. Figure 3.2 shows the functional diagram of the ALU of the
TMS320C54xx family of devices.
Accumulators A and B store the output from the ALU or the multiplier/adder block and
provide a second input to the ALU. Each accumulator is divided into three parts: guard bits (bits
39-32), high-order word (bits 31-16), and low-order word (bits 15-0), which can be stored and
retrieved individually.
Figure 3.2. Functional diagram of the central processing unit of the TMS320C54xx
processors.
Barrel shifter: provides the capability to scale the data during an operand read or write. No
overhead is required to implement the shift needed for the scaling operations. The ’54xx barrel
shifter can produce a left shift of 0 to 31 bits or a right shift of 0 to 16 bits on the input data. The
shift count is specified in the shift count field of the instruction, in the shift count field (ASM) of
status register ST1, or in the temporary register T. Figure 3.3 shows the
functional diagram of the barrel shifter of TMS320C54xx processors.
The barrel shifter and the exponent encoder normalize the values in an accumulator in a
single cycle. The LSBs of the output are filled with 0s, and the MSBs can be either zero filled or
sign extended, depending on the state of the sign-extension mode bit in the status register ST1.
An additional shift capability enables the processor to perform
numerical scaling, bit extraction, extended arithmetic, and overflow prevention operations.
Multiplier/adder unit: The kernel of the DSP device architecture is the multiplier/adder unit. The
multiplier/adder unit of TMS320C54xx devices performs a 17 x 17-bit 2’s complement multiplication
with a 40-bit addition effectively in a single instruction cycle. In addition to the multiplier and
adder, the unit consists of control logic for integer and fractional computations and a 16-bit
temporary storage register, T. Figure 3.4 shows the functional diagram of the multiplier/adder unit
of TMS320C54xx processors.
The compare, select, and store unit (CSSU) is a hardware unit specifically incorporated to
accelerate the add/compare/select operation. This operation is essential to implement the Viterbi
algorithm used in many signal-processing applications.
The exponent encoder unit supports the EXP instruction, which stores in the T register the
number of leading redundant bits of the accumulator content. This information is useful while
shifting the accumulator content for the purpose of scaling.
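The count produced by the exponent encoder can be illustrated with a small Python model. This is only a sketch: the function name `exp_count` is hypothetical, and it assumes the convention used in the TI instruction set reference, in which the value placed in T is the number of redundant sign bits of the 40-bit accumulator minus 8 (so that the count is relative to bit 31).

```python
def exp_count(acc40: int) -> int:
    """Model of the value EXP leaves in T for a 40-bit accumulator image.

    Assumption: T = (number of redundant sign bits) - 8, and EXP of zero gives 0.
    """
    acc40 &= (1 << 40) - 1              # keep only the 40 accumulator bits
    if acc40 == 0:
        return 0
    sign = (acc40 >> 39) & 1            # bit 39 is the sign bit
    leading = 0
    for bit in range(39, -1, -1):       # count leading bits equal to the sign bit
        if (acc40 >> bit) & 1 != sign:
            break
        leading += 1
    redundant = leading - 1             # the sign bit itself is not redundant
    return redundant - 8                # discount the 8 guard bits


print(exp_count(0xFFFFFFFFCB))          # small negative number: large left shift possible
print(exp_count(0x0785432105))          # value that spills into the guard bits: negative count
```

A negative count indicates that the value has overflowed into the guard bits and must be shifted right to fit in 32 bits.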
The amount and the types of memory of a processor have direct relevance to the efficiency
and performance obtainable in implementations with the processors. The ’54xx memory is
organized into three individually selectable spaces: program, data, and I/O spaces. All ’54xx
devices contain both RAM and ROM. RAM can be either dual-access type (DARAM) or single-access
type (SARAM). The on-chip RAM for these processors is
organized in pages having 128 word locations on each page.
The ’54xx processors have a number of CPU registers to support operand addressing and
computations. The CPU registers and peripheral registers are all located on page 0 of the data
memory. Figures 3.5(a) and (b) show the internal CPU registers and peripheral registers with their
addresses. The processor mode status (PMST) register is used to configure the processor. It
is a memory-mapped register located at address 1Dh on page 0 of the RAM. A part of the on-chip ROM
may contain a boot loader and look-up tables for functions such as sine, cosine, µ-law, and A-law.
ST0: Contains the status of flags (OVA, OVB, C, TC) produced by arithmetic operations and bit
manipulations.
ST1: Contains the status of various conditions and modes. Bits of the ST0 and ST1 registers can be
set or cleared with the SSBX and RSBX instructions.
PMST: Contains memory-setup status and control information.
BRAF (15) CPL (14) XF (13) HM (12) INTM (11) 0 (10) OVM (9) SXM (8) C16 (7) FRCT (6) CMPT (5) ASM (4-0)
OVM: Overflow mode
OVM=0, the overflowed result is left in the destination accumulator.
OVM=1, on overflow the destination accumulator is set to either the most positive value
(00 7FFF FFFFh) or the most negative value (FF 8000 0000h).
SXM: Sign extension mode
SXM=0, sign extension is suppressed.
SXM=1, the data is sign extended.
ASM: Accumulator shift mode. This 5-bit field specifies a shift value in the range -16 to 15.
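The effect of the OVM and SXM bits can be illustrated with a short Python model. It is a sketch only: the helper names (`load16`, `to_acc40`, `add_with_ovm`) are hypothetical, and it assumes the usual C54x convention that OVM = 1 saturates an overflowed result to the 32-bit limits while OVM = 0 lets the result extend into the guard bits.

```python
MAX32 = 0x7FFFFFFF        # most positive 32-bit value (00 7FFF FFFFh in the accumulator)
MIN32 = -0x80000000       # most negative 32-bit value (FF 8000 0000h in the accumulator)


def load16(word: int, sxm: int) -> int:
    """Interpret a 16-bit memory word; sign-extend it only when SXM = 1."""
    word &= 0xFFFF
    if sxm and (word & 0x8000):
        return word - 0x10000
    return word


def to_acc40(value: int) -> int:
    """Return the 40-bit two's-complement image of a Python integer."""
    return value & 0xFFFFFFFFFF


def add_with_ovm(a: int, b: int, ovm: int) -> int:
    """Add two signed values and return the 40-bit accumulator image."""
    total = a + b
    if ovm and total > MAX32:
        return to_acc40(MAX32)      # saturate to the most positive value
    if ovm and total < MIN32:
        return to_acc40(MIN32)      # saturate to the most negative value
    return to_acc40(total)          # OVM = 0: result spills into the guard bits


print(hex(add_with_ovm(0x7FFFFFFF, 1, ovm=1)))
print(hex(add_with_ovm(0x7FFFFFFF, 1, ovm=0)))
print(load16(0x8000, sxm=1), load16(0x8000, sxm=0))
```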
IPTR (15-7) MP/MC (6) OVLY (5) AVIS (4) DROM (3) CLKOFF (2) SMUL (1) SST (0)
• OVLY: RAM overlay. It enables on-chip dual-access data RAM blocks to be mapped into the program space.
• OVLY=0, the on-chip RAM is addressable in data space but not in program space.
• OVLY=1, the on-chip RAM is mapped into program space and data space.
• AVIS=0, the external address lines do not change with the internal program address.
• Control and data lines are not affected and the address bus is driven with the last address on
the bus.
• Also, it allows the interrupt vector to be decoded in conjunction with IACK when the
DROM: It enables the on-chip DARAM (4-7) to be mapped into data space.
DROM=0, the on-chip DARAM (4-7) is not mapped into the data space
DROM=1, the on-chip DARAM (4-7) is mapped into the data space
CLKOFF: CLKOUT off. CLKOFF=1, the output of CLKOUT is disabled and
remains at a high level.
SMUL: Saturation on multiplication.
SMUL=1, saturation of a multiplication result occurs before performing the accumulation in a
MAC or MAS instruction.
The SMUL bit applies only when OVM=1 and FRCT=1.
SST: Saturation on store.
SST=1, saturation of the data from the accumulator is enabled before storing in memory.
• The saturation is performed after the shift operation.
3.4 Data Addressing Modes of TMS320C54X Processors:
Data addressing modes provide various ways to access operands to execute instructions and place
results in the memory or the registers. The 54xx devices offer seven basic addressing modes:
1. Immediate addressing.
2. Absolute addressing.
3. Accumulator addressing.
4. Direct addressing.
5. Indirect addressing.
6. Memory-mapped register addressing.
7. Stack addressing.
The instruction contains the specific value of the operand. The operand can be short (3, 5, 8, or 9 bits
in length) or long (16 bits in length). The instruction syntax for short operands occupies one memory
location,
4. *(lk) addressing.
Example:
MVKD 1000h, *AR5 ; 1000h → *AR5 (dmad addressing)
MVPD 1000h, *AR7 ; 1000h → *AR7 (pmad addressing)
PORTR 05h, *AR3 ; 05h → *AR3 (PA addressing)
LD *(1000h), A ; *(1000h) → A (*(lk) addressing)
The accumulator content is used as the address to transfer data between program and data memory.
Ex: READA *AR2
The block diagram of the direct addressing mode for TMS320C54xx processors is shown in Figure 3.7.
The 16-bit data-memory address is formed from a base address and the 7-bit offset contained in the
instruction. A page of 128 locations can be accessed without changing DP or SP. The compiler mode
bit (CPL) in the ST1 register selects the base: CPL = 0 selects DP, and CPL = 1 selects SP.
When DP is used, the 9-bit DP value forms the upper bits of the address and the 7-bit offset forms
the lower bits. When SP is used instead of DP, the effective address is computed by
adding the 7-bit offset to SP.
Figure 3.7 Block diagram of the direct addressing mode for TMS320C54xx Processors.
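The address formation in direct addressing can be sketched in Python as follows. The function name `direct_address` and its parameters are hypothetical; the sketch assumes a 9-bit DP that is concatenated with the 7-bit offset when CPL = 0, and a 16-bit SP to which the offset is added when CPL = 1.

```python
def direct_address(offset7: int, cpl: int, dp: int = 0, sp: int = 0) -> int:
    """Return the 16-bit data-memory address used by direct addressing."""
    offset7 &= 0x7F                           # only 7 bits come from the instruction
    if cpl == 0:
        return ((dp & 0x1FF) << 7) | offset7  # DP selects one of 512 pages of 128 words
    return (sp + offset7) & 0xFFFF            # SP-relative access


print(hex(direct_address(0x13, cpl=0, dp=2)))        # page 2, offset 13h
print(hex(direct_address(0x13, cpl=1, sp=0x3FF0)))   # SP + 13h
```

With CPL = 0, changing only the offset stays within the same 128-word page, which is why a new page requires reloading DP.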
The block diagram of the indirect addressing mode for TMS320C54xx processors is shown in
Figure 3.8.
• Data space is accessed by the address present in an auxiliary register.
• The 54xx has eight 16-bit auxiliary registers (AR0 – AR7) and two auxiliary register arithmetic
units (ARAU0 & ARAU1).
• The auxiliary registers can be used to access memory locations in fixed step sizes. The AR0
register is used for indexed and bit-reversed addressing modes.
Figure 3.8 Block diagram of the indirect addressing mode for TMS320C54xx Processors.
11 *(lk) addr lk
Table 3.2 Indirect addressing options with a single data-memory operand.
• A circular buffer is a sliding window that contains the most recent data. A circular buffer of size R
must start on an N-bit boundary, where N is the smallest integer such that 2^N > R.
• The circular buffer size register (BK) specifies the size of the circular buffer.
• Effective base address (EFB): obtained by zeroing the N LSBs of a user-selected AR (ARx).
• End of buffer address (EOB): obtained by replacing the N LSBs of ARx with the N LSBs of BK.
Figure 3.9 Block diagram of the circular addressing mode for TMS320C54xx Processors.
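The EFB and EOB rules above can be checked with a few lines of Python. The function name `circular_bounds` is hypothetical, and the sketch assumes that N is the smallest integer with 2^N > BK, which for a positive integer is its bit length.

```python
def circular_bounds(arx: int, bk: int):
    """Return (N, EFB, EOB) for a circular buffer of size bk addressed through arx."""
    n = bk.bit_length()                 # smallest N such that 2**N > bk (bk >= 1)
    mask = (1 << n) - 1
    efb = arx & ~mask & 0xFFFF          # zero the N LSBs of ARx
    eob = efb | (bk & mask)             # replace the N LSBs of ARx with those of BK
    return n, efb, eob


n, efb, eob = circular_bounds(0x1020, 0x40)
print(n, hex(efb), hex(eob))
```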
Bit-Reversed Addressing:
Problem: Assuming the contents of AR3 to be 200h, what will be its contents after each of
the following TMS320C54XX addressing modes is used? Assume that the contents of AR0
are 20h.
A. *AR3+0B
B. *AR3-0B
Dual-Operand Addressing:
• If, in an instruction with a parallel store, the source operand and the destination operand
point to the same location, the source is read before writing to the destination.
• Only 2 bits are available in the instruction code for selecting each auxiliary register
in this mode.
• Thus, just four of the auxiliary registers, AR2-AR5, can be used. The ARAUs,
together with these registers, provide the capability to access two operands in a single
cycle.
• Figure 3.11 shows how an address is generated using dual data-memory operand
addressing.
Name     Function
Opcode   This field contains the operation code for the instruction.
Xmod     Defines the type of indirect addressing mode used for accessing the Xmem operand.
Xar      Xmem AR selection field; defines the AR that contains the address of Xmem.
Ymod     Defines the type of indirect addressing mode used for accessing the Ymem operand.
Yar      Ymem AR selection field; defines the AR that contains the address of Ymem.
Table 3.3 Function of the different fields in dual data-memory operand addressing.
Figure 3.11 Block diagram of the indirect addressing options with a dual data-memory
operand.
• Used to modify the memory-mapped registers without affecting the current data-
• Used to automatically store the program counter during interrupts and subroutines.
• Values of stack & SP before and after operation is shown in figure 3.13.
• PSHD X2
Example Problem P3.1: Assuming the current content of AR3 to be 200h, what will be its contents
after each of the following TMS320C54xx addressing modes is used? Assume that the contents of
AR0 are 20h.
a. *AR3+0
b. *AR3-0
c. *AR3+
d. *AR3
e. *AR3
f. *+AR3(40h)
g. *+AR3(-40h)
Solution:
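The pointer arithmetic behind these addressing modes can be tabulated with a short Python sketch. The function name `ar3_after` is hypothetical. The table lists both `*AR3-` and `*AR3` for completeness, and it assumes the standard C54x meaning of each modifier: `+0` and `-0` add or subtract AR0 after the access, `+` and `-` step by one, and `*+AR3(lk)` adds the offset lk before the access and keeps the new value.

```python
def ar3_after(mode: str, ar3: int = 0x200, ar0: int = 0x20) -> int:
    """Return the content of AR3 after one access with the given addressing mode."""
    updates = {
        "*AR3+0": ar3 + ar0,        # post-increment by AR0
        "*AR3-0": ar3 - ar0,        # post-decrement by AR0
        "*AR3+": ar3 + 1,           # post-increment by 1
        "*AR3-": ar3 - 1,           # post-decrement by 1
        "*AR3": ar3,                # no modification
        "*+AR3(40h)": ar3 + 0x40,   # pre-modify by +40h
        "*+AR3(-40h)": ar3 - 0x40,  # pre-modify by -40h
    }
    return updates[mode] & 0xFFFF   # auxiliary registers are 16 bits wide


for mode in ("*AR3+0", "*AR3-0", "*AR3+", "*AR3-", "*AR3", "*+AR3(40h)", "*+AR3(-40h)"):
    print(f"{mode:12s} AR3 = {ar3_after(mode):04X}h")
```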
Problem P3.2: Assume that the register AR3 with contents 1020h is selected as the pointer for the
circular buffer. Let BK = 40h to specify the circular buffer size as 40h. Determine the start and the end
addresses for the buffer. What will be the contents of register AR3 after the execution of the instruction
LD *AR3+0%, A, if the contents of register AR0 are 0025h?
Solution:
AR3 = 1020h means that it currently points to location 1020h. Since BK = 40h, N = 7 (2^7 > 40h).
Zeroing the lower 7 bits of AR3 gives the start address of the buffer as 1000h. Replacing the same
bits with BK gives the end address as 1040h. The instruction adds AR0 to the index with wraparound:
20h + 25h = 45h, which is not less than BK, so the new index is 45h - 40h = 05h and AR3 becomes 1005h.
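The wraparound step for an access such as `*AR3+0%` can be modeled in Python as below. The function name `circular_update` is hypothetical, and the sketch assumes the usual rule that the step size does not exceed BK, so a single correction by BK is sufficient.

```python
def circular_update(arx: int, bk: int, step: int) -> int:
    """Return ARx after a circular post-modification by step (assumes |step| <= bk)."""
    n = bk.bit_length()                 # smallest N such that 2**N > bk
    mask = (1 << n) - 1
    base = arx & ~mask & 0xFFFF         # effective base address of the buffer
    index = (arx & mask) + step         # tentative new index inside the buffer
    if index >= bk:
        index -= bk                     # ran past the end: wrap to the start
    elif index < 0:
        index += bk                     # ran before the start: wrap to the end
    return base | index


print(hex(circular_update(0x1020, 0x40, 0x25)))   # AR3 after LD *AR3+0%, A with AR0 = 25h
```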
Problem P3.3: Assuming the current contents of AR3 to be 200h, what will be its contents after each
of the following TMS320C54xx addressing modes is used? Assume that the contents of AR0 are 20h.
a. *AR3+0B
b. *AR3-0B
Solution:
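Bit-reversed addressing adds or subtracts AR0 with the carry propagating from the most significant bit toward the least significant bit. One way to check the arithmetic, sketched below in Python with the hypothetical helpers `reverse16` and `bit_reversed_update`, is to bit-reverse both operands, perform an ordinary 16-bit addition or subtraction, and bit-reverse the result.

```python
def reverse16(x: int) -> int:
    """Reverse the bit order of a 16-bit word."""
    return int(f"{x & 0xFFFF:016b}"[::-1], 2)


def bit_reversed_update(arx: int, ar0: int, subtract: bool = False) -> int:
    """Return ARx after *ARx+0B (or *ARx-0B when subtract is True)."""
    a, b = reverse16(arx), reverse16(ar0)
    result = (a - b) if subtract else (a + b)
    return reverse16(result & 0xFFFF)   # undo the reversal to get reverse-carry arithmetic


print(hex(bit_reversed_update(0x200, 0x20)))                  # *AR3+0B
print(hex(bit_reversed_update(0x200, 0x20, subtract=True)))   # *AR3-0B
```

Under these assumptions the two cases give AR3 = 220h and AR3 = 23Fh, respectively.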
• Data memory: stores the data required to run programs and holds external memory-mapped
registers.
• Program memory: stores program instructions and tables used in the execution
of programs.
• Figure 3.14 shows the memory map for the TMS320C5416 processor.
❑ It contains the program counter (PC), the program counter related hardware, the hardware stack,
repeat counters, and status registers.
❑ End of a block repeat loop: the PC is loaded with the contents of the block repeat
program address start register.
❑ Logical operations.
❑ Program-control operations
Store Instructions
Mnemonic   Description                                           Syntax
ST         Store T, TRN, or immediate value into memory          ST T, Smem
                                                                 ST TRN, Smem
                                                                 ST #lk, Smem
STH        Store accumulator high into memory                    STH src, Smem
                                                                 STH src, ASM, Smem
STL        Store accumulator low into memory                     STL src, Smem
                                                                 STL src [, SHIFT], Smem
STLM       Store accumulator low to MMR                          STLM src, MMR
STM        Store immediate value to MMR                          STM #lk, MMR
ST||ADD    Store accumulator with parallel add                   ST src, Ymem || ADD Xmem, dst
ST||SUB    Store accumulator with parallel subtract              ST src, Ymem || SUB Xmem, dst
ST||MAC    Store accumulator with parallel multiply accumulate   ST src, Ymem || MAC Xmem, dst
ST||MPY    Store accumulator with parallel multiply              ST src, Ymem || MPY Xmem, dst
ST||MAC[R]: Store Accumulator With Parallel Multiply Accumulate With/Without
Rounding
ST||MAS[R]: Store Accumulator With Parallel Multiply Subtract With/Without
Rounding
STRCD: Store T Conditionally
3.7.2. Arithmetic Instructions
ADD instructions (status bits: affected by SXM and OVM; affect C and OVdst, or OVsrc if dst is not specified):
Adds a 16-bit value to the contents of the selected accumulator or to a 16-bit Xmem operand in dual
data-memory operand addressing mode.
The 16-bit value can be:
i). Contents of a single data-memory operand
ii). Contents of a dual data-memory operand
iii). 16-bit long immediate operand
iv). Shifted value in the source accumulator.
NOTE: If a destination is specified, the result is stored in the destination; otherwise it is stored
in the source accumulator itself.
Syntax: Expression
ADD Smem, src src = src + Smem
ADD Smem, TS, src src = src + Smem << TS
ADD Smem [, SHIFT ], src [ , dst ] dst = src + Smem << SHIFT
ADD #lk [, SHFT ], src [ , dst ] dst = src + #lk << SHFT
Operands:
Smem: single data-memory operand
Xmem: Dual data-memory operands
src,dst: A (accumulator A)
B (accumulator B)
-32768<=lk<=32767
-16<=SHIFT<=15
0<=SHFT<=15
ADD: Add to accumulator
ADDM: Add long immediate value to 16-bit single data-memory operand
ADDC: Add to accumulator with carry
ADDS: Add to accumulator with sign extension suppressed
SUB: Subtract from accumulator
SUBB: Subtract from accumulator with borrow
SUBS: Subtract from accumulator with sign extension suppressed
SUBC: Subtract conditionally (used for division)
NOTE: Subtraction instructions have the same format as the ADD instructions, and operands are
obtained in the same way as in addition.
MVDK: Move Data From Data Memory to Data Memory With Destination Addressing
MVDM: Move Data From Data Memory to Memory-Mapped Register
MVDP: Move Data from Data Memory to Program Memory
MVKD: Move Data From Data Memory to Data Memory With Source Addressing
MVMD: Move Data From Memory-Mapped Register to Data Memory
MVMM: Move Data From Memory-Mapped Register to Memory-Mapped Register
MVPD: Move Data From Program Memory to Data Memory
PORTR: Read Data from Port
PORTW: Write Data to Port
READA: Read Program Memory addressed by Accumulator A and Store in Data
Memory
WRITA: Write Data to Program Memory Addressed by Accumulator A
Branch Instructions
B[D]: Branch Unconditionally
BACC[D]: Branch to Location Specified by Accumulator
BANZ[D]: Branch on Auxiliary Register Not Zero
BC[D]: Branch Conditionally
FB[D]: Far Branch Unconditionally
FBACC[D]: Far Branch to Location Specified by Accumulator
NOP: No Operation
RESET: Software Reset
NOTE:
MPY Xmem, Ymem, dst; where Xmem and Ymem are dual data-memory operands and dst is
accumulator A or B.
The instruction multiplies a data-memory value by another data-memory value and stores the result in
accumulator A or B.
The register T is loaded with the Xmem value in the read-memory phase.
In the indirect addressing mode, the instruction can also modify the contents of the auxiliary
registers used for indirect addressing.
2) MPY #01234h, A
3) MPY *AR2-, *AR4+0, B
Solution: Instruction 1) multiplies the current content of the T register by the content of the data
memory location 13 in the current data page. The result is placed in the accumulator B.
Instruction 2) multiplies the current content of the T register by the constant 1234h and places the
result in the accumulator A.
Instruction 3) multiplies the content of the memory pointed to by AR2 by the content of the memory
pointed to by AR4. The result is placed in the accumulator B. During this instruction execution, register
T is loaded with the content of the data memory location pointed to by
AR2. AR2 is then decremented by 1 and AR4 is updated by adding to it the content of AR0.
Instruction 1) multiplies the content of the data memory location pointed to by AR5 by the constant 1234h
and adds the product to the content of the accumulator A. During the execution, the register T is loaded
with the content of the data memory location pointed to by AR5. AR5 is then incremented by one.
Instruction 2) multiplies the content of the data memory pointed to by AR3 by the content of the data
memory pointed to by AR4. The content of the accumulator B is added to the product and the result
is placed in the accumulator A. The register T is loaded with the content of the data memory location
pointed to by AR3. AR3 is decremented by 1 and AR4 is incremented by 1.
The MAC instruction is used for computing the sum of a series of product terms.
MAS Xmem, Ymem, src, dst; where Xmem and Ymem are dual data-memory operands and src and
dst are accumulators A and B.
The instruction multiplies a data-memory value by another data-memory value and subtracts the
product from the content of the source, which may be either of the two accumulators A and B. The
result is stored in the other accumulator. The register T is loaded with the Xmem value in the
read-memory phase.
In the indirect mode, in addition to the multiply operation, the instruction can modify the content of
the auxiliary registers used for indirect addressing.
This instruction multiplies the content of the data memory pointed to by AR3 by the content of the data
memory pointed to by AR4.
The product is subtracted from the content of the accumulator B and the result is placed in the
accumulator A.
During this instruction, register T is loaded with the content of the data memory location
pointed to by AR3.
AR3 is then decremented by one and AR4 is incremented by one. The MAS instruction is used for
computing the butterflies in FFT implementations.
The MACD instruction carries out all the functions of the MAC instruction and, in addition, copies the
content of the current data memory address to the next higher data memory address.
However, the two operands of the multiplier are required to be a single data-memory value and
a program-memory value.
The copy feature is equivalent to implementing the z^-1 delay encountered in digital signal processing
algorithms.
For this reason, the MACD instruction is often used for implementing FIR filters.
The format and all other features of the MACD instruction are the same as those of the MAC
instruction.
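The way a multiply-accumulate with data move implements an FIR delay line can be sketched in Python. This is a behavioral model only, not C54x code: the function name `fir_macd` and the list layout are assumptions, with `delay[k]` holding x(n-k) and one spare slot at the end to receive the oldest sample when it is shifted out.

```python
def fir_macd(delay: list, coeffs: list, new_sample: int) -> int:
    """Compute one FIR output and age the delay line, MACD style.

    delay must have len(coeffs) + 1 entries; delay[0] receives the newest sample.
    """
    delay[0] = new_sample
    acc = 0
    for k in range(len(coeffs) - 1, -1, -1):   # start at the oldest sample
        acc += delay[k] * coeffs[k]            # multiply and accumulate
        delay[k + 1] = delay[k]                # copy to the next higher address (z^-1)
    return acc


h = [1, 2, 3]
line = [0] * (len(h) + 1)
print([fir_macd(line, h, x) for x in (1, 0, 0, 0)])   # impulse response of the filter
```

Processing the samples from oldest to newest is what allows each value to be copied upward without overwriting a sample that has not been used yet.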
Repeat instruction
The format of the instruction is
Solution:
The first instruction loads the repeat counter register RC with 2.
This number is the repeat count for the next MAC instruction.
The MAC instruction executes 3 times (RC + 1).
It multiplies the contents of the data locations pointed to by the registers AR1 and
AR2 and accumulates the products in A.
After each multiply, pointer AR1 is incremented and pointer AR2 is decremented.
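The behavior described in this solution can be mimicked with a small Python model. The function name `rpt_mac`, the dictionary used as data memory, and the addresses in the usage lines are all hypothetical; the only property taken from the text is that a repeat count of RC causes RC + 1 executions, with AR1 incremented and AR2 decremented after each one.

```python
def rpt_mac(memory: dict, ar1: int, ar2: int, rc: int, acc: int = 0):
    """Model a repeated MAC: returns (accumulator, AR1, AR2) after RC + 1 executions."""
    for _ in range(rc + 1):                  # the repeated instruction runs RC + 1 times
        acc += memory[ar1] * memory[ar2]     # multiply and accumulate
        ar1 += 1                             # post-increment the first pointer
        ar2 -= 1                             # post-decrement the second pointer
    return acc, ar1, ar2


# Hypothetical data: three values at 100h-102h and three at 200h-202h.
mem = {0x100: 1, 0x101: 2, 0x102: 3, 0x200: 6, 0x201: 5, 0x202: 4}
print(rpt_mac(mem, ar1=0x100, ar2=0x202, rc=2))
```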
RPTB pmad, where pmad is the program memory address denoting the end of the block of
instructions to be repeated.
• This instruction is similar to the RPT instruction, except that it repeats a block of code
a given number of times without any penalty for looping.
• One less than the number of times the block of instructions is to be repeated is initially
loaded into the memory-mapped block repeat counter register, BRC.
➢ Code
➢ Variables
❑ Mnemonics
❑ Directives
❑ named variable
❑ .bss x, 5
Example 1: Write a program to find the sum of a series of signed numbers stored at successive
locations in the data memory and place the result in the accumulator.
Solution:
This program computes the signed sum of data memory locations from address 410h to
41Fh. The result is placed in A.
A = dmad(410h) + dmad(411h) + ... + dmad(41Fh)
        .mmregs
        .global _c_int00
        .text
_c_int00:
        STM #10h, AR1       ; Initialize counter AR1 = 10h
        STM #410h, AR2      ; Initialize pointer AR2 = 410h
        LD  #0h, A          ; Initialize sum A = 0
        .end
Example 2: Program to compute multiply and accumulate using direct addressing mode:
y(n) = h0 x(n) + h1 x(n-1) + h2 x(n-2)
_c_int00:
Example 3: Program to compute multiply and accumulate using indirect addressing mode
        .global _c_int00
_c_int00:
        SSBX SXM                ; Select sign extension mode
        STM #310h, AR2          ; Initialize pointer AR2 for x(n) stored at 310h
        STM #h, AR3             ; Initialize pointer AR3 for coefficients
        MPY *AR2+, *AR3+, A     ; A = x(n) * h(0)
        MPY *AR2+, *AR3+, B     ; B = x(n-1) * h(1)
        ADD A, B                ; B = x(n) * h(0) + x(n-1) * h(1)
        NOP                     ; No operation
        .end
.global _c_int00
.data
.bss x, 3
.bss y, 2
.text
_c_int00:
SSBX SXM ; Select sign extension mode
STM #x, AR2 ; Initialize AR2 to point to x(n)
NOP ; No operation
.end
• Hardware timer
• Clock generator
• Serial port
• TINT &TOUT
o The timer register (TIM) is a 16-bit memory-mapped register that decrements at every
pulse from the prescaler block (PSC).
o The timer period register (PRD) is a 16-bit memory-mapped register whose contents
are loaded onto the TIM whenever the TIM decrements to zero or the device is reset
(SRESET).
o The timer can also be independently reset using the TRB signal. The timer control
register (TCR) is a 16-bit memory-mapped register that contains status and control bits.
o Table 3.10. shows the functions of the various bits in the TCR. The prescaler block is
also an on-chip counter.
o Whenever the prescaler bits count down to 0, a clock pulse is given to the TIM register
that decrements the TIM register by 1.
o The TDDR bits contain the divide-down ratio, which is loaded onto the prescaler block
after each time the prescaler bits count down to 0.
o That is to say that the 4-bit value of TDDR determines the divide-by ratio of the timer
clock with respect to the system clock.
o In other words, the TIM decrements either at the rate of the system clock or at a rate
slower than that as decided by the value of the TDDR bits.
o TOUT and TINT are the output signals generated as the TIM register decrements to 0.
TOUT can trigger the start-of-conversion signal of an ADC interfaced to the DSP.
o The sampling frequency of the ADC determines how frequently it receives the TOUT
signal.
o TINT is used to generate interrupts, which are required to service a peripheral such as
a DRAM controller periodically.
o The timer can also be stopped, restarted, reset, or disabled by specific status bits.
Bit    Name   Function
11     Soft   Used in conjunction with the Free bit to determine the state of the timer.
              Soft=0, the timer stops immediately.
              Soft=1, the timer stops when the counter decrements to 0.
10     Free   Used in conjunction with the Soft bit.
              Free=0, the Soft bit selects the timer mode.
              Free=1, the timer runs free.
9-6    PSC    Timer prescaler counter; specifies the count for the on-chip timer.
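The divide-down behavior of the timer can be summarized numerically. The sketch below assumes the relation given in the TI peripheral documentation, in which the prescaler divides the system clock by (TDDR + 1) and the main counter divides further by (PRD + 1); the function name `timer_interrupt_rate` and the clock value in the usage line are illustrative only.

```python
def timer_interrupt_rate(clkout_hz: float, tddr: int, prd: int) -> float:
    """Return the TINT/TOUT rate in Hz for a given system clock, TDDR, and PRD."""
    if not 0 <= tddr <= 15:
        raise ValueError("TDDR is a 4-bit field")
    if not 0 <= prd <= 0xFFFF:
        raise ValueError("PRD is a 16-bit register")
    return clkout_hz / ((tddr + 1) * (prd + 1))


# Illustrative numbers: 100 MHz clock, TDDR = 9, PRD = 9999.
print(timer_interrupt_rate(100e6, tddr=9, prd=9999))
```

Setting TDDR = 0 makes TIM decrement at the system clock rate, and larger TDDR values slow it down, matching the description above.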
➢ HRDY
➢ HCNTL0 &HCNTL1
➢ HBIL
➢ HR/W
The clock generator on TMS320C54xx devices has two options: an external clock and an
internal clock. In the case of the external clock option, a clock source is directly connected
to the device. The internal clock option, on the other hand, uses an internal clock
generator and a phase-locked loop (PLL) circuit. The PLL, in turn, can be hardware
configured or software programmed. Not all devices of the TMS320C54xx family have all
these clock options; they vary from device to device.
➢ Synchronous ports.
➢ Buffered ports.
The synchronous serial ports are high-speed, full-duplex ports that provide direct
communication with serial devices, such as codecs and analog-to-digital (A/D) converters. A
buffered serial port (BSP) is a synchronous serial port that is provided with an auto-buffering unit
and is clocked at the full clock rate. The auto-buffering unit reduces the overhead of servicing
interrupts. A time-division multiplexed
(TDM) serial port is a synchronous serial port that is provided to allow time-division multiplexing
of the data. The functioning of each of these on-chip peripherals is controlled by memory-mapped
registers assigned to the respective peripheral.
Many times, when the CPU is in the midst of executing a program, a peripheral device may
require service from the CPU. In such a situation, the main program may be interrupted by a signal
generated by the peripheral device. This results in the processor suspending the main program in
order to execute another program, called an interrupt service routine, to service the peripheral device.
On completion of the interrupt service routine, the processor returns to the main program to
continue from where it left off.
Almost all the devices of TMS320C54xx family have 32 interrupts. However, the types and
the number under each type vary from device to device. Some of these interrupts are reserved for
use by the CPU.
The CPU of ’54xx devices has a six-level-deep instruction pipeline. The six stages of the
pipeline are independent of each other. This allows overlapping execution of
instructions. During any given cycle, up to six different instructions can be active, each at a different
stage of processing. The six levels of the pipeline structure are program prefetch, program fetch,
decode, access, read, and execute.
1 During program prefetch, the program address bus, PAB, is loaded with the address of
the next instruction to be fetched.
2 In the fetch phase, an instruction word is fetched from the program bus, PB, and loaded
into the instruction register, IR. These two phases form the instruction fetch sequence.
3 During the decode stage, the contents of the instruction register, IR are decoded to
determine the type of memory access operation and the control signals required for the
data-address generation unit and the CPU.
4 The access phase outputs the read operand’s address on the data address bus, DAB. If a second
operand is required, the other data address bus, CAB, is also loaded with an appropriate
address. Auxiliary registers in indirect addressing mode and the stack pointer (SP) are
also updated.
5 In the read phase the data operand(s), if any, are read from the data buses, DB and CB.
This phase completes the two-phase read process and starts the two- phase write
processes. The data address of the write operand, if any, is loaded into the data write
address bus, EAB.
6 The execute phase writes the data using the data write bus, EB, and completes the
operand write sequence. The instruction is executed in this phase.
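The overlap of the six stages can be visualized with a short Python sketch. It models an idealized pipeline only: every instruction is assumed to be a single word that spends one cycle in each stage with no stalls or conflicts, which real multi-word or multi-cycle instructions do not satisfy. The names `STAGES`, `pipeline_table`, and the labels I1 to I3 are hypothetical.

```python
STAGES = ["prefetch", "fetch", "decode", "access", "read", "execute"]


def pipeline_table(instructions: list) -> dict:
    """Return {cycle: {stage: instruction}} for an ideal, stall-free pipeline."""
    table = {}
    for i, name in enumerate(instructions):      # instruction i enters in cycle i + 1
        for s, stage in enumerate(STAGES):       # and advances one stage per cycle
            table.setdefault(i + s + 1, {})[stage] = name
    return table


table = pipeline_table(["I1", "I2", "I3"])
for cycle in sorted(table):
    row = "  ".join(f"{stage}:{table[cycle].get(stage, '-')}" for stage in STAGES)
    print(f"cycle {cycle}: {row}")
```

Under these assumptions, three instructions complete in 8 cycles instead of the 18 that would be needed without overlap.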
Example 1: Show the pipeline operation of the following sequence of instructions if the initial
value of AR3 is 80 and the values stored in memory locations 80, 81, and 82 are 1, 2, and 3.
LD *AR3+, A
ADD #1000h, A
STL A, *AR3+