TMS320C3x

User’s Guide

2558539-9721 revision J
October 1994
1994 Digital Signal Processing Products


Printed in U.S.A., October 1994 SPRU031D
2558539-9761 revision J
IMPORTANT NOTICE

Texas Instruments (TI) reserves the right to make changes to its products or to discontinue any
semiconductor product or service without notice, and advises its customers to obtain the latest
version of relevant information to verify, before placing orders, that the information being relied
on is current.

TI warrants performance of its semiconductor products and related software to the specifications
applicable at the time of sale in accordance with TI’s standard warranty. Testing and other quality
control techniques are utilized to the extent TI deems necessary to support this warranty.
Specific testing of all parameters of each device is not necessarily performed, except those
mandated by government requirements.

Certain applications using semiconductor products may involve potential risks of death,
personal injury, or severe property or environmental damage (“Critical Applications”).

TI SEMICONDUCTOR PRODUCTS ARE NOT DESIGNED, INTENDED, AUTHORIZED, OR
WARRANTED TO BE SUITABLE FOR USE IN LIFE-SUPPORT APPLICATIONS, DEVICES
OR SYSTEMS OR OTHER CRITICAL APPLICATIONS.

Inclusion of TI products in such applications is understood to be fully at the risk of the customer.
Use of TI products in such applications requires the written approval of an appropriate TI officer.
Questions concerning potential risk applications should be directed to TI through a local SC
sales office.

In order to minimize risks associated with the customer’s applications, adequate design and
operating safeguards should be provided by the customer to minimize inherent or procedural
hazards.

TI assumes no liability for applications assistance, customer product design, software
performance, or infringement of patents or services described herein. Nor does TI warrant or
represent that any license, either express or implied, is granted under any patent right, copyright,
mask work right, or other intellectual property right of TI covering or relating to any combination,
machine, or process in which such semiconductor products or services might be or are used.

Copyright © 1994, Texas Instruments Incorporated


Preface

Read This First

About This Manual


This user’s guide serves as a reference book for the TMS320C3x generation
of digital signal processors, which includes the TMS320C30, TMS320C30-27,
TMS320C30-40, TMS320C31, TMS320C31-27, TMS320C31-40,
TMS320C31-50, TMS320LC31, and TMS320C31PQA. Throughout the book,
all references to ’C3x refer collectively to ’C30 and ’C31, and the TMS320C30
and TMS320C31 refer to all speed variations unless an exception is noted.
This document provides information to assist managers and
hardware/software engineers in application development.

How to Use This Book


This revision of the TMS320C3x User’s Guide incorporates the following
changes:
- Updated reference list of publications
- Improved description of repeat modes and interrupts in Chapter 6
- Description of power management modes in Chapter 6
- Improved description of serial ports and DMA coprocessor in Chapter 8
- Description of power management instructions in Chapter 10
- Description of low-power-mode interrupt interface in Chapter 12
- More detailed information on MPSD emulator interface, signal timings,
and connections between emulator and target system
- Current timing specification in Chapter 13
- TMS320C30PPM pinout, mechanical drawing, and timings in Chapter 13
- Development support description and device/tool part numbers in
Appendix B
- Data sheet for current military versions of the ’C3x in Appendix E

Notational Conventions
This document uses the following conventions:

- Program listings, program examples, interactive displays, filenames, and
symbol names are shown in a special font. Examples use a bold version
of the special font for emphasis. Here is a sample program listing:
0011 0005 0001 .field 1, 2
0012 0005 0003 .field 3, 4
0013 0005 0006 .field 6, 3
0014 0006 .even
- In syntax descriptions, the instruction, command, or directive is in a bold
face font and parameters are in italics. Portions of a syntax that are in
bold face should be entered as shown; portions of a syntax that are in
italics describe the type of information that should be entered. Here is an
example of a directive syntax:
.asect "section name", address
.asect is the directive. This directive has two parameters, indicated by
section name and address. When you use .asect, the first parameter must
be an actual section name, enclosed in double quotes; the second
parameter must be an address.

- Square brackets ( [ and ] ) identify an optional parameter. If you use an
optional parameter, you specify the information within the brackets; you
don’t enter the brackets themselves. Here’s an example of an instruction
that has an optional parameter:
LALK 16-bit constant [, shift]
The LALK instruction has two parameters. The first parameter, 16-bit
constant, is required. The second parameter, shift, is optional. As this
syntax shows, if you use the optional second parameter, you must
precede it with a comma.
Square brackets are also used as part of the pathname specification for
VMS pathnames; in this case, the brackets are actually part of the
pathname (they are not optional).

- Braces ( { and } ) indicate a list. The symbol | (read as or) separates items
within the list. Here’s an example of a list:
{ * | *+ | *– }
This provides three choices: *, *+, or *–.
Unless the list is enclosed in square brackets, you must choose one item
from the list.

- Some directives can have a varying number of parameters. For example,
the .byte directive can have up to 100 parameters. The syntax for this
directive is
.byte value1 [, ... , valuen ]
This syntax shows that .byte must have at least one value parameter, but
you have the option of supplying additional value parameters separated
by commas.
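The bracket and brace conventions above can be made concrete with a small sketch. The Python function below is an illustration only and is not part of any TI tool: the name `expand` is hypothetical, and the sketch assumes that bracket and brace groups are not nested. Given a syntax description, it lists every concrete form the notation allows, treating `[ ... ]` as "omit or include" and `{ a | b | c }` as "choose exactly one". The minus sign in the list example is typed as an ASCII hyphen.

```python
import itertools
import re

# Matches one optional group "[...]" or one choice list "{...}".
# Assumption for this sketch: groups are never nested.
GROUP = re.compile(r"\[([^\[\]{}]*)\]|\{([^\[\]{}]*)\}")


def expand(syntax):
    """Return every concrete form permitted by a syntax description."""
    pieces = []  # each entry is the list of alternatives for one segment
    pos = 0
    for match in GROUP.finditer(syntax):
        pieces.append([syntax[pos:match.start()]])  # literal text before the group
        if match.group(1) is not None:
            # Square brackets: the content may be omitted or entered.
            pieces.append(["", match.group(1)])
        else:
            # Braces: exactly one item from the |-separated list.
            pieces.append([alt.strip() for alt in match.group(2).split("|")])
        pos = match.end()
    pieces.append([syntax[pos:]])  # literal text after the last group

    forms = []
    for combo in itertools.product(*pieces):
        text = " ".join("".join(combo).split())  # collapse runs of whitespace
        forms.append(text.replace(" ,", ","))    # no space before a comma
    return forms


if __name__ == "__main__":
    for form in expand("LALK 16-bit constant [, shift]"):
        print(form)
    for form in expand("{ * | *+ | *- }"):
        print(form)
```

For the LALK syntax the sketch yields two forms, one without and one with the `, shift` parameter; for the braces example it yields the three choices. The brackets and braces themselves never appear in the output, which matches the rule that you do not enter them.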

Information About Cautions

This book may contain cautions and warnings.


- A caution describes a situation that could potentially cause your system
to behave unexpectedly.

This is what a caution looks like.

Cautions are provided for your information. Please read each caution carefully.

Related Documentation From Texas Instruments


The following books describe the TMS320 floating-point devices and related
support tools. To obtain a copy of any of these TI documents, call the Texas
Instruments Literature Response Center at (800) 477–8924. When ordering,
please identify the book by its title and literature number.
TMS320 Floating-Point DSP Assembly Language Tools User’s Guide
(literature number SPRU035) describes the assembly language tools
(assembler, linker, and other tools used to develop assembly language
code), assembler directives, macros, common object file format, and
symbolic debugging directives for the ’C3x and ’C4x generations of
devices.
TMS320 Floating-Point DSP Optimizing C Compiler User’s Guide
(literature number SPRU034) describes the TMS320 floating-point C
compiler. This C compiler accepts ANSI standard C source code and
produces TMS320 assembly language source code for the ’C3x and
’C4x generations of devices.

TMS320C3x C Source Debugger (literature number SPRU053) describes
the ’C3x debugger for the emulator, evaluation module, and simulator.
This book discusses various aspects of the debugger interface, including
window management, command entry, code execution, data
management, and breakpoints. It also includes a tutorial that introduces
basic debugger functionality.
TMS320 Family Development Support Reference Guide (literature number
SPRU011) describes the TMS320 family of digital signal processors and
the various products that support it. This includes code-generation tools
(compilers, assemblers, linkers, etc.) and system integration and debug
tools (simulators, emulators, evaluation modules, etc.). This book also
lists related documentation, outlines seminars and the university
program, and provides factory repair and exchange information.
TMS320 Third-Party Support Reference Guide (literature number
SPRU052) alphabetically lists over 100 third parties who supply various
products that serve the family of TMS320 digital signal processors,
including software and hardware development tools, speech
recognition, image processing, noise cancellation, modems, etc.

References

The publications in the following reference list contain useful information
regarding functions, operations, and applications of digital signal processing
(DSP). These books also provide other references to many useful technical
papers. The reference list is organized into categories of general DSP, speech,
image processing, and digital control theory and is alphabetized by author.

- General Digital Signal Processing:


Antoniou, Andreas, Digital Filters: Analysis and Design. New York, NY:
McGraw-Hill Company, Inc., 1979.
Bateman, A., and Yates, W., Digital Signal Processing Design. Salt Lake
City, Utah: W. H. Freeman and Company, 1990.
Brigham, E. Oran, The Fast Fourier Transform. Englewood Cliffs, NJ:
Prentice-Hall, Inc., 1974.
Burrus, C.S., and Parks, T.W., DFT/FFT and Convolution Algorithms. New
York, NY: John Wiley and Sons, Inc., 1984.
Chassaing, R., and Horning, D., Digital Signal Processing with the
TMS320C25. New York, NY: John Wiley and Sons, Inc., 1990.
Digital Signal Processing Applications with the TMS320 Family, Vol. I.
Texas Instruments, 1986; Prentice-Hall, Inc., 1987.

Digital Signal Processing Applications with the TMS320 Family, Vol. II.
Texas Instruments, 1990; Prentice-Hall, Inc., 1990.
Digital Signal Processing Applications with the TMS320 Family, Vol. III.
Texas Instruments, 1990; Prentice-Hall, Inc., 1990.
Gold, Bernard, and Rader, C.M., Digital Processing of Signals. New York,
NY: McGraw-Hill Company, Inc., 1969.
Hamming, R.W., Digital Filters. Englewood Cliffs, NJ: Prentice-Hall, Inc.,
1977.
Hutchins, B., and Parks, T., A Digital Signal Processing Laboratory Using
the TMS320C25. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1990.
IEEE ASSP DSP Committee (Editor), Programs for Digital Signal
Processing. New York, NY: IEEE Press, 1979.
Jackson, Leland B., Digital Filters and Signal Processing. Hingham, MA:
Kluwer Academic Publishers, 1986.
Jones, D.L., and Parks, T.W., A Digital Signal Processing Laboratory
Using the TMS32010. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1987.
Lim, Jae, and Oppenheim, Alan V. (Editors), Advanced Topics in Signal
Processing. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1988.
Morris, L. Robert, Digital Signal Processing Software. Ottawa, Canada:
Carleton University, 1983.
Oppenheim, Alan V. (Editor), Applications of Digital Signal Processing.
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1978.
Oppenheim, Alan V., and Schafer, R.W., Digital Signal Processing.
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1975.
Oppenheim, Alan V., and Schafer, R.W., Discrete-Time Signal
Processing. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1989.
Oppenheim, Alan V., and Willsky, A.S., with Young, I.T., Signals and
Systems. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1983.
Parks, T.W., and Burrus, C.S., Digital Filter Design. New York, NY: John
Wiley and Sons, Inc., 1987.
Rabiner, Lawrence R., and Gold, Bernard, Theory and Application of
Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1975.
Treichler, J.R., Johnson, Jr., C.R., and Larimore, M.G., Theory and Design
of Adaptive Filters. New York, NY: John Wiley and Sons, Inc., 1987.
- Speech:
Gray, A.H., and Markel, J.D., Linear Prediction of Speech. New York, NY:
Springer-Verlag, 1976.
Jayant, N.S., and Noll, Peter, Digital Coding of Waveforms. Englewood
Cliffs, NJ: Prentice-Hall, Inc., 1984.

Papamichalis, Panos, Practical Approaches to Speech Coding.
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1987.
Parsons, Thomas, Voice and Speech Processing. New York, NY:
McGraw-Hill Company, Inc., 1987.
Rabiner, Lawrence R., and Schafer, R.W., Digital Processing of Speech
Signals. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1978.
O’Shaughnessy, Douglas, Speech Communication. Reading, MA:
Addison-Wesley, 1987.
- Image Processing:
Andrews, H.C., and Hunt, B.R., Digital Image Restoration. Englewood
Cliffs, NJ: Prentice-Hall, Inc., 1977.
Gonzalez, Rafael C., and Wintz, Paul, Digital Image Processing. Reading,
MA: Addison-Wesley Publishing Company, Inc., 1977.
Pratt, William K., Digital Image Processing. New York, NY: John Wiley and
Sons, 1978.
- Multirate DSP:
Crochiere, R.E., and Rabiner, L.R., Multirate Digital Signal Processing.
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1983.
Vaidyanathan, P.P., Multirate Systems and Filter Banks. Englewood Cliffs,
NJ: Prentice-Hall, Inc.
- Digital Control Theory:
Dote, Y., Servo Motor and Motion Control Using Digital Signal Processors.
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1990.
Jacquot, R., Modern Digital Control Systems. New York, NY: Marcel
Dekker, Inc., 1981.
Katz, P., Digital Control Using Microprocessors. Englewood Cliffs, NJ:
Prentice-Hall, Inc., 1981.
Kuo, B.C., Digital Control Systems. New York, NY: Holt, Rinehart and
Winston, Inc., 1980.
Moroney, P., Issues in the Implementation of Digital Feedback
Compensators. Cambridge, MA: The MIT Press, 1983.
Phillips, C., and Nagle, H., Digital Control System Analysis and Design.
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1984.
- Adaptive Signal Processing:
Haykin, S., Adaptive Filter Theory. Englewood Cliffs, NJ: Prentice-Hall,
Inc., 1991.
Widrow, B., and Stearns, S.D., Adaptive Signal Processing. Englewood
Cliffs, NJ: Prentice-Hall, Inc., 1985.

- Array Signal Processing:


Haykin, S., Justice, J.H., Owsley, N.L., Yen, J.L., and Kak, A.C., Array
Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1985.
Hudson, J.E., Adaptive Array Principles. New York, NY: John Wiley and
Sons, 1981.
Monzingo, R.A., and Miller, J.W., Introduction to Adaptive Arrays. New
York, NY: John Wiley and Sons, 1980.

If You Need Assistance. . .


If you want to. . .                     Do this. . .

Order Texas Instruments                 Call the TI Literature Response Center:
documentation                           (800) 477–8924

Ask questions about product             Call the DSP hotline: (713) 274–2320
operation or report suspected           FAX: (713) 274–2324
problems                                Electronic Mail: [email protected].
                                        European fax line: +33–1–3070–1032

Report mistakes in this document        Fill out and return the reader response card at
or any other TI documentation           the end of this book, or send your comments to:
                                        Texas Instruments Incorporated
                                        Technical Publications Manager, MS 702
                                        P.O. Box 1443
                                        Houston, Texas 77251–1443

Trademarks
ABEL is a registered trademark of Data I/O Corporation.
CodeView, MS, MS-DOS, MS-Windows, and Presentation Manager are trademarks of
Microsoft Corp.
DEC, Digital DX, Ultrix, VAX, and VMS are trademarks of Digital Equipment Corp.
HPGL is a registered trademark of Hewlett-Packard Co.
Macintosh and MPW are trademarks of Apple Computer Corp.
Micro Channel, OS/2, PC-DOS, and PGA are trademarks of IBM Corp.
SPARC, Sun 3, Sun 4, Sun Workstation, SunView, and SunWindows are trademarks
of Sun Microsystems, Inc.
UNIX is a registered trademark of UNIX Systems Laboratories, Inc.


Contents

1 Introduction
A general description of the TMS320C30 and TMS320C31, their key features, and typical
applications.
1.1 General Description
1.2 TMS320C30 Key Features
1.3 TMS320C31 Key Features
1.4 Typical Applications

2 TMS320C3x Architecture
Functional block diagram, TMS320C3x design description, hardware components, device
operation, and instruction set summary.
2.1 Architectural Overview
2.2 Central Processing Unit (CPU)
2.2.1 Multiplier
2.2.2 Arithmetic Logic Unit (ALU)
2.2.3 Auxiliary Register Arithmetic Units (ARAUs)
2.2.4 CPU Register File
2.3 Memory Organization
2.3.1 RAM, ROM, and Cache
2.3.2 Memory Maps
2.3.3 Memory Addressing Modes
2.4 Instruction Set Summary
2.5 Internal Bus Operation
2.6 Parallel Instruction Set Summary
2.7 External Bus Operation
2.7.1 External Interrupts
2.7.2 Interlocked-Instruction Signaling
2.8 Peripherals
2.8.1 Timers
2.8.2 Serial Ports
2.9 Direct Memory Access (DMA)
2.10 TMS320C30 and TMS320C31 Differences
2.10.1 Data/Program Bus Differences
2.10.2 Serial-Port Differences
2.10.3 Reserved Memory Locations
2.10.4 Effects on the IF and IE Interrupt Registers
2.10.5 User Program/Data ROM
2.10.6 Development Considerations
2.11 System Integration

3 CPU Registers, Memory, and Cache
Description of the registers in the CPU register file. Includes memory maps and explains
instruction cache architecture, algorithm, and control bits.
3.1 CPU Register File
3.1.1 Extended-Precision Registers (R7–R0)
3.1.2 Auxiliary Registers (AR7–AR0)
3.1.3 Data-Page Pointer (DP)
3.1.4 Index Registers (IR0, IR1)
3.1.5 Block Size Register (BK)
3.1.6 System Stack Pointer (SP)
3.1.7 Status Register (ST)
3.1.8 CPU/DMA Interrupt Enable Register (IE)
3.1.9 CPU Interrupt Flag Register (IF)
3.1.10 I/O Flags Register (IOF)
3.1.11 Repeat-Count (RC) and Block-Repeat Registers (RS, RE)
3.1.12 Program Counter (PC)
3.1.13 Reserved Bits and Compatibility
3.2 Memory
3.2.1 TMS320C3x Memory Maps
3.2.2 TMS320C31 Memory Maps
3.2.3 Reset/Interrupt/Trap Vector Map
3.2.4 Peripheral Bus Map
3.3 Instruction Cache
3.3.1 Cache Architecture
3.3.2 Cache Algorithm
3.3.3 Cache Control Bits
3.4 Using the TMS320C31 Boot Loader
3.4.1 Boot-Loader Operations
3.4.2 Invoking the Boot Loader
3.4.3 Mode Selection
3.4.4 External Memory Loading
3.4.5 Examples of External Memory Loads
3.4.6 Serial-Port Loading
3.4.7 Interrupt and Trap-Vector Mapping
3.4.8 Precautions

4 Data Formats and Floating-Point Operation
Description of signed and unsigned integer and floating-point formats. Discussion of
floating-point multiplication, addition, subtraction, normalization, rounding, and conversions.
4.1 Integer Formats
4.1.1 Short-Integer Format
4.1.2 Single-Precision Integer Format
4.2 Unsigned-Integer Formats
4.2.1 Short Unsigned-Integer Format
4.2.2 Single-Precision Unsigned-Integer Format
4.3 Floating-Point Formats
4.3.1 Short Floating-Point Format
4.3.2 Single-Precision Floating-Point Format
4.3.3 Extended-Precision Floating-Point Format
4.3.4 Conversion Between Floating-Point Formats
4.4 Floating-Point Multiplication
4.5 Floating-Point Addition and Subtraction
4.6 Normalization Using the NORM Instruction
4.7 Rounding: The RND Instruction
4.8 Floating-Point-to-Integer Conversion
4.9 Integer-to-Floating-Point Conversion

5 Addressing
Operation, encoding, and implementation of addressing modes. Format descriptions. System
stack management.
5.1 Types of Addressing
5.1.1 Register Addressing
5.1.2 Direct Addressing
5.1.3 Indirect Addressing
5.1.4 Short-Immediate Addressing
5.1.5 Long-Immediate Addressing
5.1.6 PC-Relative Addressing
5.2 Groups of Addressing Modes
5.2.1 General Addressing Modes
5.2.2 Three-Operand Addressing Modes
5.2.3 Parallel Addressing Modes
5.2.4 Conditional-Branch Addressing Modes
5.3 Circular Addressing
5.4 Bit-Reversed Addressing
5.5 System and User Stack Management
5.5.1 System Stack Pointer
5.5.2 Stacks
5.5.3 Queues

6 Program Flow Control
Software control of program flow with repeat modes and branching. Interlocked operations.
Reset and interrupts.
6.1 Repeat Modes
6.1.1 Repeat-Mode Control Bits
6.1.2 Repeat-Mode Operation
6.1.3 RPTB Instruction
6.1.4 RPTS Instruction
6.1.5 Repeat-Mode Restrictions
6.1.6 RC Register Value After Repeat Mode Completes
6.1.7 Nested Block Repeats
6.2 Delayed Branches
6.3 Calls, Traps, and Returns
6.4 Interlocked Operations
6.5 Reset Operation
6.6 Interrupts
6.6.1 Interrupt Vector Table
6.6.2 Interrupt Prioritization
6.6.3 Interrupt Control Bits
6.6.4 Interrupt Processing
6.6.5 CPU Interrupt Latency
6.6.6 CPU/DMA Interaction
6.6.7 TMS320C3x Interrupt Considerations
6.6.8 TMS320C30 Interrupt Considerations
6.6.9 Prioritization and Control
6.7 TMS320LC31 Power Management Modes
6.7.1 IDLE2
6.7.2 LOPOWER

7 External Bus Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
Description of primary and expansion interfaces. External interface timing diagrams.
Programmable wait-states and bank switching.
7.1 External Interface Control Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
7.1.1 Primary-Bus Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3
7.1.2 Expansion-Bus Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
7.2 External Interface Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
7.2.1 Primary-Bus Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
7.2.2 Expansion-Bus I/O Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
7.3 Programmable Wait States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-28
7.4 Programmable Bank Switching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-30
8 Peripherals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1
Description of the DMA controller, timers, and serial ports.
8.1 Timers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
8.1.1 Timer Global-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3
8.1.2 Timer Period and Counter Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-8
8.1.3 Timer Pulse Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-8
8.1.4 Timer Operation Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-10
8.1.5 Timer Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-11
8.1.6 Timer Initialization/Reconfiguration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-12
8.2 Serial Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-13
8.2.1 Serial-Port Global-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-15
8.2.2 FSX/DX/CLKX Port-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-18
8.2.3 FSR/DR/CLKR Port-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-20
8.2.4 Receive/Transmit Timer-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-21
8.2.5 Receive/Transmit Timer-Counter Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-22
8.2.6 Receive/Transmit Timer-Period Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-23
8.2.7 Data-Transmit Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-23
8.2.8 Data-Receive Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-24
8.2.9 Serial-Port Operation Configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-24
8.2.10 Serial-Port Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-26
8.2.11 Serial-Port Interrupt Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-29
8.2.12 Serial-Port Functional Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-30
8.2.13 Serial-Port Initialization/Reconfiguration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-36
8.2.14 TMS320C3x Serial-Port Interface Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-36
8.3 DMA Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-43
8.3.1 DMA Global-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
8.3.2 Destination- and Source-Address Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
8.3.3 Transfer-Counter Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
8.3.4 CPU/DMA Interrupt-Enable Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
8.3.5 DMA Memory Transfer Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-49
8.3.6 Synchronization of DMA Channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-54
8.3.7 DMA Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-56
8.3.8 DMA Initialization/Reconfiguration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-57
8.3.9 Hints for DMA Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-57
8.3.10 DMA Programming Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-58

9 Pipeline Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1
Discussion of the pipeline of operations on the TMS320C3x.
9.1 Pipeline Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-2
9.2 Pipeline Conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4
9.2.1 Branch Conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4
9.2.2 Register Conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-7
9.2.3 Memory Conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-10
9.3 Resolving Register Conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-18
9.4 Resolving Memory Conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-21
9.5 Clocking of Memory Accesses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-23
9.5.1 Program Fetches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-23
9.5.2 Data Loads and Stores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-24

10 Assembly Language Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1
Functional listing of instructions. Condition codes defined. Alphabetized individual instruction
descriptions with examples.
10.1 Instruction Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2
10.1.1 Load-and-Store Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2
10.1.2 Two-Operand Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3
10.1.3 Three-Operand Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4
10.1.4 Program-Control Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-5
10.1.5 Low-Power Control Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-5
10.1.6 Interlocked-Operations Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-6
10.1.7 Parallel-Operations Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-7
10.1.8 Illegal Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-9
10.2 Condition Codes and Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-10
10.3 Individual Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-14
10.3.1 Symbols and Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-14
10.3.2 Optional Assembler Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-16
10.3.3 Individual Instruction Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-18
11 Software Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-1
Software application examples for the use of various TMS320C3x instruction set features.
11.1 Processor Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
11.2 Program Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-6
11.2.1 Subroutines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-6
11.2.2 Software Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-8
11.2.3 Interrupt Service Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9
11.2.4 Delayed Branches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17
11.2.5 Repeat Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-18
11.2.6 Computed GOTOs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-22
11.3 Logical and Arithmetic Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-23
11.3.1 Bit Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-23
11.3.2 Block Moves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-25
11.3.3 Bit-Reversed Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-25
11.3.4 Integer and Floating-Point Division . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-26
11.3.5 Square Root . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-34
11.3.6 Extended-Precision Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-38
11.3.7 IEEE/TMS320C3x Floating-Point Format Conversion . . . . . . . . . . . . . . . . . . 11-42
11.4 Application-Oriented Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-53
11.4.1 Companding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-53
11.4.2 FIR, IIR, and Adaptive Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-58
11.4.3 Matrix-Vector Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-70
11.4.4 Fast Fourier Transforms (FFT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-73
11.4.5 Lattice Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-125
11.5 Programming Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-131
11.5.1 C-Callable Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-131
11.5.2 Hints for Assembly Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-131
11.5.3 Low-Power-Mode Wakeup Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-133
12 Hardware Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-1
Hardware design techniques and application examples for interfacing to memories,
peripherals, or other microcomputers/microprocessors.
12.1 System Configuration Options Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2
12.1.1 Categories of Interfaces on the TMS320C3x . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2
12.1.2 Typical System Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3
12.2 Primary Bus Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-4
12.2.1 Zero-Wait-State Interface to Static RAMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-4
12.2.2 Ready Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9
12.2.3 Bank Switching Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-13
12.3 Expansion Bus Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-19
12.3.1 A/D Converter Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-19
12.3.2 D/A Converter Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-23
12.4 System Control Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-27
12.4.1 Clock Oscillator Circuitry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-27
12.4.2 Reset Signal Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-29
12.5 Serial-Port Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-32
12.6 Low-Power-Mode Interrupt Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-36
12.7 XDS Target Design Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-39
12.7.1 Designing Your MPSD Emulator Connector (12-Pin Header) . . . . . . . . . . . . 12-39
12.7.2 MPSD Emulator Cable Signal Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-40
12.7.3 Connections Between the Emulator and the Target System . . . . . . . . . . . . . 12-41
12.7.4 Mechanical Dimensions for the 12-Pin Emulator Connector . . . . . . . . . . . . . 12-43
12.7.5 Diagnostic Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-45

13 TMS320C3x Signal Descriptions and Electrical Characteristics . . . . . . . . . . . . . . . . . . . . 13-1
Pin locations, pin descriptions, dimensions, electrical characteristics, signal timing diagrams,
and characteristics.
13.1 Pinout and Pin Assignments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2
13.1.1 TMS320C30 Pinouts and Pin Assignments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2
13.1.2 TMS320C30 PPM Pinouts and Pin Assignments . . . . . . . . . . . . . . . . . . . . . . . 13-8
13.1.3 TMS320C31 Pinouts and Pin Assignments . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-12
13.2 Signal Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-16
13.2.1 TMS320C30 Signal Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-16
13.2.2 TMS320C31 Signal Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-22
13.3 Electrical Specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-25
13.4 Signal Transition Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-29
13.4.1 TTL-Level Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-29
13.4.2 TTL-Level Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-29
13.5 Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-30
13.5.1 X2/CLKIN, H1, and H3 Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-30
13.5.2 Memory Read/Write Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-32
13.5.3 XF0 and XF1 Timing When Executing LDFI or LDII . . . . . . . . . . . . . . . . . . . . 13-38
13.5.4 XF0 Timing When Executing STFI and STII . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-40
13.5.5 XF0 and XF1 Timing When Executing SIGI . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-41
13.5.6 Loading When the XF Pin Is Configured as an Output . . . . . . . . . . . . . . . . . . 13-42
13.5.7 Changing the XF Pin From an Output to an Input . . . . . . . . . . . . . . . . . . . . . . 13-43
13.5.8 Changing the XF Pin From an Input to an Output . . . . . . . . . . . . . . . . . . . . . . 13-44
13.5.9 Reset Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-45
13.5.10 SHZ Pin Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-51
13.5.11 Interrupt Response Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-52
13.5.12 Interrupt Acknowledge Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-54
13.5.13 Data Rate Timing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-55
13.5.14 HOLD Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-61
13.5.15 General-Purpose I/O Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-63
13.5.16 Timer Pin Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-66

A Instruction Opcodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
List of the opcode fields for the TMS320C3x instructions.

B Development Support/Part Ordering Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-1
Lists of the hardware and software available to support the TMS320C3x devices.
B.1 TMS320C3x Development Support Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
B.1.1 TMS320 Third Parties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-4
B.1.2 TMS320 Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-5
B.1.3 DSP Hotline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-5
B.1.4 Bulletin Board Service (BBS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-5
B.1.5 Technical Training Organization (TTO) TMS320 Workshop . . . . . . . . . . . . . . . . B-6
B.2 TMS320C3x Part Ordering Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-7
B.2.1 Device and Development Support Tool Prefix Designators . . . . . . . . . . . . . . . . B-8
B.2.2 Device Suffixes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-9

C Quality and Reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1
Discussion of Texas Instruments quality and reliability criteria for evaluating performance.
C.1 Reliability Stress Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-2
C.2 TMS320C31 PQFP Reflow Soldering Precautions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-7

D Calculation of TMS320C30 Power Dissipation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-1
Discussion of information used to determine the power dissipation and the thermal
management requirements for the TMS320C30.
D.1 Fundamental Power Dissipation Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-2
D.1.1 Components of Power Supply Current Requirements . . . . . . . . . . . . . . . . . . . . D-2
D.1.2 Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-2
D.1.3 Determining Algorithm Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-4
D.1.4 Test Setup Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-4
D.2 Current Requirement for Internal Circuitry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-5
D.2.1 Quiescent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-5
D.2.2 Internal Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-5
D.2.3 Internal Bus Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-6
D.3 Current Requirement for Output Driver Circuitry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-9
D.3.1 Primary Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-10
D.3.2 Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-13
D.3.3 Data Dependency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-14
D.3.4 Capacitive Load Dependence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-16
D.4 Calculation of Total Supply Current . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-18
D.4.1 Combining Supply Current Due to All Components . . . . . . . . . . . . . . . . . . . . . . D-18
D.4.2 Supply Voltage, Operating Frequency, and Temperature Dependencies . . . D-19
D.4.3 Design Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-21
D.4.4 Peak Versus Average Current . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-22
D.4.5 Thermal Management Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-23
D.5 Supply Current Calculations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-26
D.5.1 Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-26
D.5.2 Data Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-26
D.5.3 Average Current . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-27
D.5.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-27
D.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-28
D.7 Photo of IDD for FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-29
D.8 FFT Assembly Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-30

E SMJ320C3x Digital Signal Processor Data Sheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-1
Data sheet for the military version of the digital signal processor, the SMJ320C30.

F Analog Interface Peripherals and Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-1
Devices that interface to the TMS320 DSPs.
F.1 Multimedia Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-2
F.1.1 System Design Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-2
F.1.2 Multimedia-Related Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-4
F.2 Telecommunications Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-5
F.3 Dedicated Speech Synthesis Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-11
F.4 Servo Control/Disk Drive Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-14
F.5 Modem Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-17
F.6 Advanced Digital Electronics Applications for Consumers . . . . . . . . . . . . . . . . . . . . . . . F-20

G Boot Loader Source Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G-1
Source code for the TMS320C3x boot loader.

Figures
1–1 TMS320 Device Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
1–2 TMS320C3x Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
2–1 TMS320C3x Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
2–2 Central Processing Unit (CPU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
2–3 Memory Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-12
2–4 TMS320C30 Memory Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-14
2–5 TMS320C31 Memory Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-15
2–6 Peripheral Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-27
2–7 DMA Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-29
3–1 Extended-Precision Register Floating-Point Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-3
3–2 Extended-Precision Register Integer Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-3
3–3 Status Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-5
3–4 CPU/DMA Interrupt Enable Register (IE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
3–5 CPU Interrupt-Flag Register (IF) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9
3–6 I/O-Flag Register (IOF) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
3–7 TMS320C30 Memory Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-15
3–8 TMS320C31 Memory Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-16
3–9 Reset, Interrupt, and Trap-Vector Locations
for the TMS320C30/TMS320C31 Microprocessor Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-18
3–10 Interrupt and Trap Branch Instructions for the TMS320C31 Microcomputer Mode . . . . . 3-19
3–11 Peripheral Bus Memory Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-20
3–12 Instruction Cache Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-22
3–13 Address Partitioning for Cache Control Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-22
3–14 Boot-Loader-Mode Selection Flowchart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-27
3–15 Boot-Loader Memory-Load Flowchart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-28
3–16 Boot-Loader Serial-Port Load-Mode Flowchart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-29
4–1 Short-Integer Format and Sign Extension of Short Integers . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
4–2 Single-Precision Integer Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
4–3 Short Unsigned-Integer Format and Zero Fill . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
4–4 Single-Precision Unsigned-Integer Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
4–5 Generic Floating-Point Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
4–6 Short Floating-Point Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
4–7 Single-Precision Floating-Point Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
4–8 Extended-Precision Floating-Point Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
4–9 Converting From Short Floating-Point Format
to Single-Precision Floating-Point Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8
4–10 Converting From Short Floating-Point Format
to Extended-Precision Floating-Point Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8
4–11 Converting From Single-Precision Floating-Point Format
to Extended-Precision Floating-Point Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-9
4–12 Converting From Extended-Precision Floating-Point Format
to Single-Precision Floating-Point Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-9
4–13 Flowchart for Floating-Point Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-11
4–14 Flowchart for Floating-Point Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-15
4–15 Flowchart for NORM Instruction Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-19
4–16 Flowchart for Floating-Point Rounding by the RND Instruction . . . . . . . . . . . . . . . . . . . . . 4-21
4–17 Flowchart for Floating-Point-to-Integer Conversion by FIX Instructions . . . . . . . . . . . . . . 4-23
4–18 Flowchart for Integer-to-Floating-Point Conversion by FLOAT Instructions . . . . . . . . . . . 4-24
5–1 Direct Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4
5–2 Instruction Encoding Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7
5–3 Encoding for 24-Bit PC-Relative Addressing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-18
5–4 Encoding for General Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-20
5–5 Encoding for Three-Operand Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-21
5–6 Encoding for Parallel Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-21
5–7 Encoding for Conditional-Branch Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-23
5–8 Flowchart for Circular Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-25
5–9 Circular Buffer Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-26
5–10 Data Structure for FIR Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-28
5–11 System Stack Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-31
5–12 Implementations of High-to-Low Memory Stacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-32
5–13 Implementations of Low-to-High Memory Stacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-33
6–1 CALL Response Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-11
6–2 Multiple TMS320C3xs Sharing Global Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-15
6–3 Zero-Logic Interconnect of TMS320C3xs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-16
6–4 Interrupt Logic Functional Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-23
6–5 Interrupt Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-28
6–6 IDLE2 Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-37
6–7 Interrupt Response Timing After IDLE2 Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-37
6–8 LOPOWER Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-38
6–9 MAXSPEED Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-38
7–1 Memory-Mapped External Interface Control Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
7–2 Primary-Bus Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3
7–3 Expansion-Bus Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
7–4 Read-Read-Write for (M)STRB = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7
7–5 Write-Write-Read for (M)STRB = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-8
7–6 Use of Wait States for Read for (M)STRB = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
7–7 Use of Wait States for Write for (M)STRB = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
7–8 Read and Write for IOSTRB = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
7–9 Read With One Wait State for IOSTRB = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-12
7–10 Write With One Wait State for IOSTRB = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13
7–11 Memory Read and I/O Write for Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14
7–12 Memory Read and I/O Read for Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-15
7–13 Memory Write and I/O Write for Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-16
7–14 Memory Write and I/O Read for Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-17
7–15 I/O Write and Memory Write for Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-18
7–16 I/O Write and Memory Read for Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-19
7–17 I/O Read and Memory Write for Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-20
7–18 I/O Read and Memory Read for Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21
7–19 I/O Write and I/O Read for Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-22
7–20 I/O Write and I/O Write for Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23
7–21 I/O Read and I/O Read for Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-24
7–22 Inactive Bus States for IOSTRB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-25
7–23 Inactive Bus States for STRB and MSTRB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-26
7–24 HOLD and HOLDA Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-27
7–25 BNKCMP Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-30
7–26 Bank-Switching Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-31
8–1 Timer Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
8–2 Memory-Mapped Timer Locations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3
8–3 Timer Global-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-4
8–4 Timer Modes as Defined by CLKSRC and FUNC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-6
8–5 Timer Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7
8–6 Timer Output Generation Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-9
8–7 Timer I/O Port Configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-10
8–8 Serial-Port Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-14
8–9 Memory-Mapped Locations for the Serial Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-15
8–10 Serial-Port Global-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-18
8–11 FSX/DX/CLKX Port-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-19
8–12 FSR/DR/CLKR Port-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-20
8–13 Receive/Transmit Timer-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-22
8–14 Receive/Transmit Timer-Counter Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-22
8–15 Receive/Transmit Timer-Period Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-23
8–16 Transmit Buffer Shift Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-23
8–17 Receive Buffer Shift Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-24
8–18 Serial-Port Clocking in I/O Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-25
8–19 Serial-Port Clocking in Serial-Port Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-26
8–20 Data Word Format in Handshake Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-28
8–21 Single Zero Sent as an Acknowledge Bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-28
8–22 Direct Connection Using Handshake Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-29
8–23 Fixed Burst Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-31
8–24 Fixed Continuous Mode With Frame Sync . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-31
8–25 Fixed Continuous Mode Without Frame Sync . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-33
8–26 Exiting Fixed Continuous Mode Without Frame Sync, FSX Internal . . . . . . . . . . . . . . . . . 8-34
8–27 Variable Burst Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-35
8–28 Variable Continuous Mode With Frame Sync . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-35
8–29 Variable Continuous Mode Without Frame Sync . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-36
8–30 TMS320C3x Zero-Glue-Logic Interface to TLC3204x Example . . . . . . . . . . . . . . . . . . . . . 8-40
8–31 DMA Global-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
8–32 CPU/DMA Interrupt-Enable Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-49
8–33 No DMA Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-54
8–34 DMA Source Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-55
8–35 DMA Destination Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-55
8–36 DMA Source and Destination Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-56
9–1 TMS320C3x Pipeline Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-3
9–2 Two-Operand Instruction Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-24
9–3 Three-Operand Instruction Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-25
9–4 Multiply or CPU Operation With a Parallel Store . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-28
9–5 Two Parallel Stores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-29
9–6 Parallel Multiplies and Adds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-29
10–1 Status Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-11
11–1 Data Memory Organization for an FIR Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-58
11–2 Data Memory Organization for a Single Biquad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-60
11–3 Data Memory Organization for N Biquads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-63
11–4 Data Memory Organization for Matrix-Vector Multiplication . . . . . . . . . . . . . . . . . . . . . . . 11-71
11–5 Structure of the Inverse Lattice Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-126
11–6 Data Memory Organization for Lattice Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-126
11–7 Structure of the (Forward) Lattice Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-128
12–1 External Interfaces on the TMS320C3x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2
12–2 Possible System Configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3
12–3 TMS320C3x Interface to Cypress Semiconductor CY7C186 CMOS SRAM . . . . . . . . . . 12-6
12–4 Read Operations Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7
12–5 Write Operations Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8
12–6 Circuit for Generation of Zero, One, or Two Wait States for Multiple Devices . . . . . . . . 12-12
12–7 Bank Switching for Cypress Semiconductor’s CY7C185 . . . . . . . . . . . . . . . . . . . . . . . . . . 12-15
12–8 Bank Memory Control Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-16
12–9 Timing for Read Operations Using Bank Switching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-18
12–10 Interface to AD1678 A/D Converter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-20
12–11 Read Operations Timing Between the TMS320C30 and AD1678 . . . . . . . . . . . . . . . . . . 12-22
12–12 Interface Between the TMS320C30 and the AD565A . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-24
12–13 Write Operation to the D/A Converter Timing Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-25
12–14 Crystal Oscillator Circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-27
12–15 Magnitude of the Impedance of the Oscillator LC Network . . . . . . . . . . . . . . . . . . . . . . . . 12-28
12–16 Reset Circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-29
12–17 Voltage on the TMS320C30 Reset Pin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-30
12–18 AIC to TMS320C30 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-33
12–19 Synchronous Timing of TLC32044 to TMS320C3x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-35
12–20 Asynchronous Timing of TLC32044 to TMS320C30 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-35
12–21 Interrupt Generation Circuit for Use With IDLE2 Operation . . . . . . . . . . . . . . . . . . . . . . . . 12-36
12–22 12-Pin Header Signals and Header Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-39
12–23 Emulator Cable Pod Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-40
12–24 Emulator Cable Pod Timings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-41
12–25 Signals Between the Emulator and the ’C3x With No Signals Buffered . . . . . . . . . . . . . 12-42
12–26 Signals Between the Emulator and the ’C3x
With Transmission Signals Buffered . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-42
12–27 All Signals Buffered . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-43
12–28 Pod/Connector Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-44
12–29 12-Pin Connector Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-45
12–30 TBC Emulation Connections for ’C3x Scan Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-46
13–1 TMS320C30 Pinout (Top View) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-3
13–2 TMS320C30 Pinout (Bottom View) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-4
13–3 TMS320C30 181-Pin PGA Dimensions—GEL Package . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-5
13–4 TMS320C30 PPM Pinout (Top View) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-8
13–5 TMS320C30 PPM 208-Pin Plastic Quad Flat Pack—PQL Package . . . . . . . . . . . . . . . . . 13-9
13–6 TMS320C31 Pinout (Top View) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-12
13–7 TMS320C31 132-Pin Plastic Quad Flat Pack—PQL Package . . . . . . . . . . . . . . . . . . . . . 13-13
13–8 Test Load Circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-28
13–9 TTL-Level Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-29
13–10 TTL-Level Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-29
13–11 Timing for X2/CLKIN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-31
13–12 Timing for H1/H3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-31
13–13 Timing for Memory ((M)STRB = 0) Read . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-34
13–14 Timing for Memory ((M)STRB = 0) Write . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-35
13–15 Timing for Memory (IOSTRB = 0) Read . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-36
13–16 Timing for Memory (IOSTRB = 0) Write . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-37
13–17 Timing for XF0 and XF1 When Executing LDFI or LDII . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-39
13–18 Timing for XF0 When Executing an STFI or STII . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-40
13–19 Timing for XF0 and XF1 When Executing SIGI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-41
13–20 Timing for Loading XF Register When Configured as an Output Pin . . . . . . . . . . . . . . . . 13-42
13–21 Timing for Change of XF From Output to Input Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-43
13–22 Timing for Change of XF From Input to Output Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-44
13–23 Timing for RESET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-48
13–24 CLKIN to H1/H3 as a Function of Temperature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-49
13–25 CLKIN to H1/H3 as a Function of Temperature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-49
13–26 CLKIN to H1/H3 as a Function of Temperature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-50
13–27 Timing for SHZ Pin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-51
13–28 Timing for INT3–INT0 Response . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-53
13–29 Timing for IACK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-54
13–30 Timing for Fixed Data Rate Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-55
13–31 Timing for Variable Data Rate Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-56
13–32 Timing for HOLD/HOLDA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-61
13–33 Timing for Peripheral Pin General-Purpose I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-63
13–34 Timing for Change of Peripheral Pin From General-Purpose Output to Input Mode . . . 13-64
13–35 Timing for Change of Peripheral Pin From General-Purpose Input to Output Mode . . . 13-65
13–36 Timing for Timer Pin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-67
B–1 TMS320 Device Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-10
D–1 Current Measurement Test Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-4
D–2 Internal Bus Current Versus Transfer Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-7
D–3 Internal Bus Current Versus Data Complexity Derating Curve . . . . . . . . . . . . . . . . . . . . . . . D-7
D–4 Primary Bus Current Versus Transfer Rate and Wait States . . . . . . . . . . . . . . . . . . . . . . . . D-11
D–5 Primary Bus Current Versus Transfer Rate at Zero Wait States . . . . . . . . . . . . . . . . . . . . . D-12
D–6 Expansion Bus Current Versus Transfer Rate and Wait States . . . . . . . . . . . . . . . . . . . . . D-13
D–7 Expansion Bus Current Versus Transfer Rate at Zero Wait States . . . . . . . . . . . . . . . . . . D-14
D–8 Primary Bus Current Versus Data Complexity Derating Curve . . . . . . . . . . . . . . . . . . . . . . D-15
D–9 Expansion Bus Current Versus Data Complexity Derating Curve . . . . . . . . . . . . . . . . . . . D-16
D–10 Current Versus Output Load Capacitance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-17
D–11 Current Versus Frequency and Supply Voltage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-20
D–12 Current Versus Operating Temperature Change . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-20
D–13 Load Currents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-23
F–1 System Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-2
F–2 Multimedia Speech Encoding and Modem Communication . . . . . . . . . . . . . . . . . . . . . . . . . F-3
F–3 TMS320C25 to TLC32047 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-3
F–4 Typical DSP/Combo Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-6
F–5 DSP/Combo Interface Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-7
F–6 General Telecom Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-9
F–7 Generic Telecom Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-10
F–8 Generic Servo Control Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-14
F–9 Disk Drive Control System Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-15
F–10 TMS320C14–TLC32071 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-16
F–11 High-Speed V.32 Bis and Multistandard Modem With the TLC320AC01 AIC . . . . . . . . . F-18
F–12 Applications Performance Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-20
F–13 Video Signal Processing Basic System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-21
F–14 Typical Digital Audio Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-21

Tables
1–1 Typical Applications of the TMS320 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
2–1 CPU Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
2–2 Instruction Set Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-17
2–3 Parallel Instruction Set Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-24
2–4 Feature Set Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-30
2–5 TMS320C31 Reserved Memory Locations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-31
3–1 CPU Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
3–2 Status Register Bits Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
3–3 IE Register Bits Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
3–4 IF Register Bits Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9
3–5 IOF Register Bits Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-11
3–6 Combined Effect of the CE and CF Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-25
3–7 Loader Mode Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-30
3–8 External Memory Loader Header . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-30
3–9 TMS320C31 Interrupt and Trap Memory Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-34
5–1 CPU Register Address/Assembler Syntax and Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
5–2 Indirect Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
5–3 Index Steps and Bit-Reversed Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-30
6–1 Repeat-Mode Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
6–2 Interlocked Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12
6–3 Pin Operation at Reset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-19
6–4 Reset, Interrupt, and Trap-Vector Locations
for the TMS320C30/TMS320C31 Microprocessor Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-24
6–5 Reset, Interrupt, and Trap-Vector Locations
for the TMS320C31 Microcomputer Boot Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-25
6–6 Reset and Interrupt Vector Priorities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-26
6–7 Interrupt Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-29
6–8 Reset and Interrupt Vector Locations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-35
7–1 Primary-Bus Control Register Bits Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-4
7–2 Expansion-Bus Control Register Bits Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
7–3 Wait-State Generation When SWW = 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-29
7–4 Wait-State Generation When SWW = 0 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-29
7–5 Wait-State Generation When SWW = 1 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-29
7–6 Wait-State Generation When SWW = 1 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-29
7–7 BNKCMP and Bank Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-30
8–1 Timer Global-Control Register Bits Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-4
8–2 Result of a Write of Specified Values of GO and HLD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-8
8–3 Serial-Port Global-Control Register Bits Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-15
8–4 FSX/DX/CLKX Port-Control Register Bits Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-19
8–5 FSR/DR/CLKR Port-Control Register Bits Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-20
8–6 Receive/Transmit Timer-Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-21
8–7 Memory-Mapped Locations for a DMA Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-44
8–8 DMA Global-Control Register Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-45
8–9 START Bits and Operation of the DMA (Bits 0–1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-46
8–10 STAT Bits and Status of the DMA (Bits 2–3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-46
8–11 SYNC Bits and Synchronization of the DMA (Bits 8–9) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-46
8–12 CPU/DMA Interrupt-Enable Register Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-48
8–13 DMA Timing When Destination Is On-Chip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-50
8–14 DMA Timing When Destination Is a Primary Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-51
8–15 DMA Timing When Destination Is an Expansion Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-52
8–16 Maximum DMA Transfer Rates When Cr = Cw = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-53
8–17 Maximum DMA Transfer Rates When Cr = 1, Cw = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-53
8–18 Maximum DMA Transfer Rates When Cr = 1, Cw = 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-53
9–1 One Program Fetch and One Data Access for Maximum Performance . . . . . . . . . . . . . . 9-21
9–2 One Program Fetch and Two Data Accesses for Maximum Performance . . . . . . . . . . . . 9-22
10–1 Load-and-Store Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2
10–2 Two-Operand Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3
10–3 Three-Operand Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4
10–4 Program Control Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-5
10–5 Low-Power Control Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-5
10–6 Interlocked Operations Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-6
10–7 Parallel Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-7
10–8 Output Value Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-10
10–9 Condition Codes and Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-13
10–10 Instruction Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-15
10–11 CPU Register Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-18
11–1 TMS320C3x FFT Timing Benchmarks (Cycles) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-125
11–2 TMS320C3x FFT Timing Benchmarks (Milliseconds) . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-125
12–1 Bank Switching Interface Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-18
12–2 Key Timing Parameter for D/A Converter Write Operation . . . . . . . . . . . . . . . . . . . . . . . . 12-26
12–3 12-Pin Header Signal Descriptions and Pin Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-39
12–4 Emulator Cable Pod Timing Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-41
13–1 TMS320C30–PGA Pin Assignments (Alphabetical) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6
13–2 TMS320C30–PGA Pin Assignments (Numerical) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-7
13–3 TMS320C30–PPM Pin Assignments (Alphabetical) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-10
13–4 TMS320C30–PPM Pin Assignments (Numerical) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-11
13–5 TMS320C31 Pin Assignments (Alphabetical) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-14
13–6 TMS320C31 Pin Assignments (Numerical) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-15
13–7 TMS320C30 Signal Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-17
13–8 TMS320C31 Signal Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-22
13–9 Absolute Maximum Ratings Over Specified Temperature Range . . . . . . . . . . . . . . . . . . 13-25


13–10 Recommended Operating Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-26


13–11 Electrical Characteristics Over Specified Free-Air Temperature Range . . . . . . . . . . . . . 13-27
13–12 Timing Parameters for X2/CLKIN, H1, and H3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-30
13–13 Timing Parameters for a Memory ((M)STRB = 0) Read/Write . . . . . . . . . . . . . . . . . . . . 13-33
13–14 Timing Parameters for a Memory (IOSTRB = 0) Read . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-35
13–15 Timing Parameters for a Memory (IOSTRB = 0) Write . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-37
13–16 Timing Parameters for XF0 and XF1 When Executing LDFI or LDII . . . . . . . . . . . . . . . . 13-39
13–17 Timing Parameters for XF0 When Executing STFI or STII . . . . . . . . . . . . . . . . . . . . . . . . 13-40
13–18 Timing Parameters for XF0 and XF1 When Executing SIGI . . . . . . . . . . . . . . . . . . . . . . . 13-41
13–19 Timing Parameters for Loading the XF Register When Configured as an Output Pin . 13-42
13–20 Timing Parameters of XF Changing From Output to Input Mode . . . . . . . . . . . . . . . . . . . 13-43
13–21 Timing Parameters of XF Changing From Input to Output Mode . . . . . . . . . . . . . . . . . . . 13-44
13–22 Timing Parameters for RESET for the TMS320C30 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-46
13–23 Timing Parameters for RESET for the TMS320C31 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-47
13–24 Timing Parameters for the SHZ Pin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-51
13–25 Timing Parameters for INT3–INT0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-52
13–26 Timing Parameters for IACK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-54
13–27 Serial-Port Timing Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-57
13–28 Timing Parameters for HOLD/HOLDA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-62
13–29 Timing Parameters for Peripheral Pin General-Purpose I/O . . . . . . . . . . . . . . . . . . . . . . . 13-63
13–30 Timing Parameters for Peripheral Pin
Changing From General-Purpose Output to Input Mode . . . . . . . . . . . . . . . . . . . . . . . . . 13-64
13–31 Timing Parameters for Peripheral Pin
Changing From General-Purpose Input to Output Mode . . . . . . . . . . . . . . . . . . . . . . . . . . 13-64
13–32 Timing Parameters for Timer Pin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-66
13–33 Timing Parameters for Timer Pin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-67
A–1 TMS320C3x Instruction Opcodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
B–1 TMS320C3x Digital Signal Processor Part Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-7
B–2 TMS320C3x Support Tool Part Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-8
C–1 Microprocessor and Microcontroller Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3
C–2 Definitions of Microprocessor Testing Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-4
C–3 TMS320C3x Transistors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-6
D–1 Current Equation Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-22
F–1 Data Converter ICs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-4
F–2 Switched-Capacitor Filter ICs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-4
F–3 Telecom Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-8
F–4 Switched-Capacitor Filter ICs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-9
F–5 TI Voice Synthesizers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-11
F–6 Speech Memories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-12
F–7 Switched-Capacitor Filter ICs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-12
F–8 Speech Synthesis Development Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-13
F–9 Control-Related Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-16
F–10 Modem AFE Data Converters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-17
F–11 Audio/Video Analog/Digital Interface Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-23


Examples
3–1 Byte-Wide Configured Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-31
3–2 16-Bit-Wide Configured Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-32
3–3 32-Bit-Wide Configured Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-32
4–1 Floating-Point Multiply (Both Mantissas = –2.0) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-12
4–2 Floating-Point Multiply (Both Mantissas = 1.5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-12
4–3 Floating-Point Multiply (Both Mantissas = 1.0) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-13
4–4 Floating-Point Multiply Between Positive and Negative Numbers . . . . . . . . . . . . . . . . . . . 4-13
4–5 Floating-Point Multiply by 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-13
4–6 Floating-Point Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-16
4–7 Floating-Point Subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-16
4–8 Floating-Point Addition With a 32-Bit Shift . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-17
4–9 Floating-Point Addition/Subtraction With Floating-Point 0 . . . . . . . . . . . . . . . . . . . . . . . . . . 4-17
4–10 NORM Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-18
5–1 Direct Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4
5–2 Auxiliary Register Indirect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
5–3 Indirect With Predisplacement Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8
5–4 Indirect With Predisplacement Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8
5–5 Indirect With Predisplacement Add and Modify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
5–6 Indirect With Predisplacement Subtract and Modify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
5–7 Indirect With Postdisplacement Add and Modify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10
5–8 Indirect With Postdisplacement Subtract and Modify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10
5–9 Indirect With Postdisplacement Add and Circular Modify . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11
5–10 Indirect With Postdisplacement Subtract and Circular Modify . . . . . . . . . . . . . . . . . . . . . . 5-11
5–11 Indirect With Preindex Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12
5–12 Indirect With Preindex Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12
5–13 Indirect With Preindex Add and Modify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-13
5–14 Indirect With Preindex Subtract and Modify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-13
5–15 Indirect With Postindex Add and Modify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-14
5–16 Indirect With Postindex Subtract and Modify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-14
5–17 Indirect With Postindex Add and Circular Modify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-15
5–18 Indirect With Postindex Subtract and Circular Modify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-15
5–19 Indirect With Postindex Add and Bit-Reversed Modify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-16
5–20 Short-Immediate Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-17
5–21 Long-Immediate Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-17
5–22 PC-Relative Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-18
5–23 Circular Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-27


5–24 FIR Filter Code Using Circular Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-28


5–25 Bit-Reversed Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-29
6–1 Repeat-Mode Control Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4
6–2 RPTB Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4
6–3 Incorrectly Placed Standard Branch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-6
6–4 Incorrectly Placed Delayed Branch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-6
6–5 Pipeline Conflict in an RPTB Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-7
6–6 Incorrectly Placed Delayed Branches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-9
6–7 Busy-Waiting Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-14
6–8 Multiprocessor Counter Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-14
6–9 Implementation of V(S) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-16
6–10 Implementation of P(S) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-16
6–11 Code to Synchronize Two TMS320C3xs at the Software Level . . . . . . . . . . . . . . . . . . . . . 6-17
8–1 Serial-Port Register Setup #1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-38
8–2 Serial-Port Register Setup #2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-38
8–3 CPU Transfer With Serial-Port Transmit Polling Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-39
8–4 TMS320C3x Zero-Glue-Logic Interface to Burr Brown A/D and D/A . . . . . . . . . . . . . . . . . 8-41
8–5 Array Initialization With DMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-58
8–6 DMA Transfer With Serial-Port Receive Interrupt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-59
8–7 DMA Transfer With Serial-Port Transmit Interrupt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-61
9–1 Standard Branch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-5
9–2 Delayed Branch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-6
9–3 Write to an AR Followed by an AR for Address Generation . . . . . . . . . . . . . . . . . . . . . . . . . 9-8
9–4 A Read of ARs Followed by ARs for Address Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-9
9–5 Program Wait Until CPU Data Access Completes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-11
9–6 Program Wait Due to Multicycle Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-12
9–7 Multicycle Program Memory Fetches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-12
9–8 Single Store Followed by Two Reads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-13
9–9 Parallel Store Followed by Single Read . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-14
9–10 Interlocked Load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-15
9–11 Busy External Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-16
9–12 Multicycle Data Reads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-17
9–13 Conditional Calls and Traps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-17
9–14 Address Generation Update of an AR Followed by an AR for Address Generation . . . . 9-18
9–15 Write to an AR Followed by an AR for Address Generation Without a Pipeline Conflict 9-19
9–16 Write to DP Followed by a Direct Memory Read Without a Pipeline Conflict . . . . . . . . . . 9-20
9–17 Dummy src2 Read . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-26
9–18 Operand Swapping Alternative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-27
11–1 TMS320C3x Processor Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3
11–2 Subroutine Call (Dot Product) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7
11–3 Use of Interrupts for Software Polling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9
11–4 Context Save for the TMS320C3x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-12
11–5 Context Restore for the TMS320C3x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-14
11–6 Interrupt Service Routine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-16


11–7 Delayed Branch Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17


11–8 Loop Using Block Repeat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-19
11–9 Use of Block Repeat to Find a Maximum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-20
11–10 Loop Using Single Repeat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-21
11–11 Computed GOTO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-22
11–12 Use of TSTB for Software-Controlled Interrupt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-23
11–13 Copy a Bit From One Location to Another . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-24
11–14 Block Move Under Program Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-25
11–15 Bit-Reversed Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-26
11–16 Integer Division . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-29
11–17 Inverse of a Floating-Point Number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-32
11–18 Square Root of a Floating-Point Number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-35
11–19 64-Bit Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-39
11–20 64-Bit Subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-39
11–21 32-Bit-by-32-Bit Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-40
11–22 IEEE-to-TMS320C3x Conversion (Fast Version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-44
11–23 IEEE-to-TMS320C3x Conversion (Complete Version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-46
11–24 TMS320C3x-to-IEEE Conversion (Fast Version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-49
11–25 TMS320C3x-to-IEEE Conversion (Complete Version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-51
11–26 µ-Law Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-54
11–27 µ-Law Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-55
11–28 A-Law Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-56
11–29 A-Law Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-57
11–30 FIR Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-59
11–31 IIR Filter (One Biquad) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-61
11–32 IIR Filters (N > 1 Biquads) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-64
11–33 Adaptive FIR Filter (LMS Algorithm) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-68
11–34 Matrix Times a Vector Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-72
11–35 Complex, Radix-2, DIF FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-75
11–36 Table With Twiddle Factors for a 64-Point FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-78
11–37 Complex, Radix-4, DIF FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-81
11–38 Real, Radix-2 FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-88
11–39 Real Inverse, Radix-2 FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-108
11–40 Inverse Lattice Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-127
11–41 Lattice Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-129
11–42 Setup of IDLE2 Power-Down-Mode Wakeup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-133
12–1 State Machine and Equations for the Interrupt Generation 16R4 PLD . . . . . . . . . . . . . . 12-37

Chapter 1

Introduction

The TMS320C3x generation of digital signal processors (DSPs) consists of
high-performance CMOS 32-bit floating-point devices in the TMS320 family of
single-chip digital signal processors. Since 1982, when the TMS32010 was
introduced, the TMS320 family, with its powerful instruction sets, high-speed
number-crunching capabilities, and innovative architectures, has established
itself as the industry standard. It is ideal for DSP applications.

The 40-ns cycle time of the TMS320C31-50 allows it to execute operations at
a performance rate of up to 50 million floating-point operations per second
(MFLOPS) and 25 million instructions per second (MIPS). This performance
was previously available only on a supercomputer. The generation’s
performance is further enhanced through its large on-chip memories, concurrent
direct memory access (DMA) controller, and two external interface ports.
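
Peak rates can be estimated directly from the single-cycle execution time. The
following is a minimal sketch, not a definitive model. It assumes one
instruction per cycle and, as a further assumption, at most two floating-point
operations per cycle (one multiply and one ALU operation issued in parallel).
The function names peak_mips and peak_mflops are illustrative and do not come
from this guide.

```python
def peak_mips(cycle_ns: float) -> float:
    """Peak instruction rate in MIPS, assuming one instruction per cycle."""
    # 1 s = 1e9 ns, so instructions per second = 1e9 / cycle_ns.
    # Dividing by 1e6 to express the result in millions leaves 1000 / cycle_ns.
    return 1000.0 / cycle_ns


def peak_mflops(cycle_ns: float, flops_per_cycle: int = 2) -> float:
    """Peak floating-point rate in MFLOPS.

    flops_per_cycle=2 is an assumption: one multiply plus one ALU
    operation completing in parallel in the same cycle.
    """
    return flops_per_cycle * peak_mips(cycle_ns)


if __name__ == "__main__":
    # Cycle times (in ns) taken from the device variants listed in this chapter.
    for cycle_ns in (74.0, 60.0, 50.0, 40.0):
        print(f"{cycle_ns:4.0f} ns: {peak_mips(cycle_ns):5.1f} MIPS, "
              f"{peak_mflops(cycle_ns):5.1f} MFLOPS")
```

These are upper bounds only; sustained throughput depends on the instruction
mix, memory wait states, and pipeline conflicts.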

This chapter presents the following major topics:

Topic Page

1.1 General Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2


1.2 TMS320C30 Key Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
1.3 TMS320C31 Key Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-8
1.4 Typical Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10


1.1 General Description


The TMS320 family consists of five generations: TMS320C1x, TMS320C2x,
TMS320C3x, TMS320C4x, and TMS320C5x (see Figure 1–1). The expansion
includes enhancements of earlier generations and more powerful new
generations of DSPs.

The TMS320’s internal busing and special DSP instruction set have the speed
and flexibility to execute at up to 50 MFLOPS. The TMS320 family optimizes
speed by implementing functions in hardware that other processors implement
through software or microcode. This hardware-intensive approach provides
power previously unavailable on a single chip.

The emphasis on total system cost has resulted in a less expensive processor
that can be designed into systems currently using costly bit-slice processors.
Also, cost/performance selection is provided by the different processors in the
TMS320C3x generation:

- TMS320C30: 60-ns, single-cycle execution time

- TMS320C30-27: Lower cost; 74-ns, single-cycle execution time

- TMS320C30-40: Higher speed; 50-ns, single-cycle execution time

- TMS320C30-50: Highest speed; 40-ns, single-cycle execution time

- TMS320C31: Low cost; 60-ns, single-cycle execution time

- TMS320C31-27: Lower cost; 74-ns, single-cycle execution time

- TMS320C31-40: Low cost; 50-ns, single-cycle execution time

- TMS320C31PQA: Low cost; extended temperature; 60-ns, single-cycle
  execution time

- TMS320C31-50: Highest speed; 40-ns, single-cycle execution time

- TMS320LC31: Low power; 60-ns, single-cycle execution time;
  3.3-volt operation

All of these processors are described in this user’s guide. Essentially, their
functionality is the same. However, electrical and timing characteristics vary
(as described in Chapter 13); part numbering information is found in Section
B.2 on page B-7. Throughout this book, TMS320C3x is used to refer to the
TMS320C30 and TMS320C31 and all speed variations. TMS320C30 and
TMS320C31 are used to refer to all speed variants of those processors where
appropriate. Special references, such as TMS320C30-40, are used to note
specific exceptions.


Figure 1–1. TMS320 Device Evolution

[Chart: TMS320 devices plotted by GENERATION (horizontal axis) against
PERFORMANCE in MIPS/MFLOPS (vertical axis), grouped into fixed-point and
floating-point generations.]

Fixed-Point Generations

TMS320C1x: TMS320C10, TMS320C10-14/-25, TMS320C14, TMS320E14/P14,
TMS320C15/LC15, TMS320E15/P15, TMS320C15-25, TMS320E15-25, TMS320C16,
TMS320C17/LC17, TMS320E17/P17

TMS320C2x: TMS320C25, TMS320E25, TMS320C25-33, TMS320C25-50, TMS320C26

TMS320C5x: TMS320C50, TMS320C51, TMS320C52, TMS320C53

Floating-Point Generations

TMS320C3x: TMS320C30, TMS320C30-27, TMS320C30-40, TMS320C31,
TMS320C31-27, TMS320C31-40, TMS320C31PQA, TMS320C31-50, TMS320LC31

TMS320C4x: TMS320C40, TMS320C40-40


The TMS320C30 and TMS320C31 can perform parallel multiply and arithmetic
logic unit (ALU) operations on integer or floating-point data in a single cycle.
Each processor also has a general-purpose register file, a program cache,
dedicated auxiliary register arithmetic units (ARAUs), internal dual-access
memories, one DMA channel supporting concurrent I/O, and a short
machine-cycle time. Together, these features provide high performance and
ease of use.
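
A multiply-accumulate loop, such as a dot product, is a typical workload that
benefits from a multiply and an ALU operation completing in the same cycle.
The sketch below only illustrates the arithmetic and an idealized timing
estimate; it is not TMS320C3x code. The function names are hypothetical, and
the one-element-per-cycle figure is an assumption that ignores loop setup,
pipeline conflicts, and memory wait states.

```python
def dot_product(a, b):
    """Sum of element-wise products: one multiply and one add per element."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    acc = 0.0
    for x, y in zip(a, b):
        acc += x * y  # the multiply and the accumulate are the two operations
    return acc


def ideal_loop_time_ns(n_elements: int, cycle_ns: float) -> float:
    """Idealized loop time, assuming one multiply-accumulate per cycle."""
    return n_elements * cycle_ns


if __name__ == "__main__":
    print(dot_product([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
    # Hypothetical 64-element dot product at a 60-ns cycle time.
    print(ideal_loop_time_ns(64, 60.0), "ns")
```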

General-purpose applications are greatly enhanced by the large address
space, multiprocessor interface, internally and externally generated wait
states, two external interface ports (one on the TMS320C31), two timers, two
serial ports (one on the TMS320C31), and multiple interrupt structure. The
TMS320C3x supports a wide variety of system applications from host
processor to dedicated coprocessor.

High-level languages are more easily implemented through a register-based
architecture, large address space, powerful addressing modes, flexible
instruction set, and well-supported floating-point arithmetic.


Figure 1–2 is a functional block diagram that shows the interrelationships
between the various TMS320C3x key components.

Figure 1–2. TMS320C3x Block Diagram

[Block diagram. The elements recoverable from the figure are listed below.]

Memory blocks: Program Cache (64 x 32), RAM Block 0 (1K x 32),
RAM Block 1 (1K x 32), ROM Block 0 (4K x 32), connected by the data buses

Primary bus signals: RDY, HOLD, HOLDA, STRB, R/W, D31–0, A23–0

Expansion bus signals: XRDY, IOSTRB, XR/W, XD31–0, XA12–0, MSTRB

CPU: Integer/Floating-Point Multiplier, Integer/Floating-Point ALU,
8 Extended-Precision Registers, Address Generator 0, Address Generator 1,
8 Auxiliary Registers, 12 Control Registers

DMA: Address Generators, Control Registers

Controller signals: RESET, INT3–0, IACK, XF1–0, MCBL/MP, X1, X2/CLKIN,
VDD, VSS, SHZ

Peripheral bus: Serial Port 0, Serial Port 1, Timer 0, Timer 1

Legend: marked blocks are available on the TMS320C30, TMS320C30-27, and
TMS320C30-40.


1.2 TMS320C30 Key Features


Some key features of the TMS320C30 are listed below.

- Performance
  - TMS320C30 (33 MHz)
    - 60-ns, single-cycle instruction execution time
    - 33.3 MFLOPS
    - 16.7 MIPS
  - TMS320C30-27
    - 74-ns, single-cycle instruction execution time
    - 27 MFLOPS
    - 13.5 MIPS
  - TMS320C30-40
    - 50-ns, single-cycle instruction execution time
    - 40 MFLOPS
    - 20 MIPS

- One 4K x 32-bit, single-cycle, dual-access, on-chip, read-only memory
  (ROM) block

- Two 1K x 32-bit, single-cycle, dual-access, on-chip, random access
  memory (RAM) blocks

- 64 x 32-bit instruction cache

- 32-bit instruction and data words

- 24-bit addresses

- 40-/32-bit floating-point/integer multiplier and ALU

- 32-bit barrel shifter

- Eight extended-precision registers (accumulators)

- Two address generators with eight auxiliary registers and two auxiliary
register arithmetic units

- On-chip DMA controller for concurrent I/O and CPU operation

- Integer, floating-point, and logical operations

- Two- and three-operand instructions

- Parallel ALU and multiplier instructions in a single cycle


- Block repeat capability

- Zero-overhead loops with single-cycle branches

- Conditional calls and returns

- Interlocked instructions for multiprocessing support

- Two 32-bit data buses (24- and 13-bit address)

- Two serial ports to support 8/16/24/32-bit transfers

- Two 32-bit timers

- Two general-purpose external flags; four external interrupts

- 181-pin grid array (PGA) package; 1-µm CMOS
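
The performance figures in the list above appear to follow directly from the instruction cycle time. The sketch below reproduces them under two assumptions that are not stated explicitly in this section: one instruction completes per cycle, and the MFLOPS figure counts one multiply plus one ALU operation per cycle. The function name `throughput` is purely illustrative.

```python
def throughput(cycle_time_ns):
    """Return (MIPS, MFLOPS) for a given single-cycle instruction time in ns."""
    mips = 1000.0 / cycle_time_ns   # assumption: one instruction per cycle
    mflops = 2.0 * mips             # assumption: parallel multiply + ALU per cycle
    return mips, mflops


for name, ns in [("TMS320C30", 60), ("TMS320C30-27", 74), ("TMS320C30-40", 50)]:
    mips, mflops = throughput(ns)
    print(f"{name}: {mips:.1f} MIPS, {mflops:.1f} MFLOPS")
```

Under these assumptions, the 60-ns, 74-ns, and 50-ns cycle times give the MIPS and MFLOPS values listed for the three devices, to the rounding used in the list.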


1.3 TMS320C31 Key Features


The TMS320C31 is a low-cost 32-bit DSP that offers the advantages of a
floating-point processor and ease of use. The TMS320C31 devices are
object-code compatible with the TMS320C30. Aside from lacking a ROM block and
having a single serial port, the TMS320C31 is functionally equivalent to the
TMS320C30 but differs in its electrical and timing characteristics.
Chapter 13 describes these differences in detail.
- The TMS320C31 (33 MHz) features are identical to those of the
TMS320C30 device, except that the TMS320C31 uses a subset of the
TMS320C30’s standard peripheral and memory interfaces. This maintains
the 33-MFLOPS performance of the TMS320C30’s core CPU while
providing the cost advantages associated with 132-pin plastic quad flat
pack (PQFP) packaging.
- The TMS320C31-27 is the slower speed version of the TMS320C31. The
TMS320C31-27 delivers 27 MFLOPS and runs at 27 MHz. The reduced
speed allows you to realize an immediate system cost reduction by using
slower off-chip memories and a lower-cost processor.
- The TMS320C31-40 is a high-speed version of the TMS320C31. The
40-MHz TMS320C31-40 runs with 50-ns cycle time and offers up to 40
MFLOPS in performance.
- The TMS320C31-50 is the highest-speed version of the TMS320C31. The
50-MHz TMS320C31-50 runs with 40-ns cycle time and offers up to 50
MFLOPS in performance.
- The TMS320C31PQA (33 MHz) offers extended-temperature capabilities
to TMS320C31 performance. The TMS320C31PQA will operate at case
temperatures ranging from –40°C to +85°C, making it a lower-cost
floating-point solution for industrial and extended-temperature commercial
applications.
- The TMS320LC31 is the low-power version of the TMS320C31. The
TMS320LC31 runs with 60-ns cycle time and offers up to 33 MFLOPS in
performance at 3.3-volt operation.
Some key features of the TMS320C31, including those which differentiate it
from the TMS320C30, are summarized as follows:
- Performance
  - TMS320C31 (PQL/PQA)
    - 60-ns, single-cycle instruction execution time
    - 33.3 MFLOPS
    - 16.7 MIPS (million instructions per second)
  - TMS320C31-27
    - 74-ns, single-cycle instruction execution time
    - 27 MFLOPS
    - 13.5 MIPS
  - TMS320C31-40
    - 50-ns, single-cycle instruction execution time
    - 40 MFLOPS
    - 20 MIPS
  - TMS320C31-50
    - 40-ns, single-cycle instruction execution time
    - 50 MFLOPS
    - 25 MIPS
  - TMS320LC31
    - 60-ns, single-cycle instruction execution time
    - 33.3 MFLOPS
    - 16.7 MIPS
    - Low-power, 3.3-volt operation
    - Two power-down modes: 2-MHz operation and idle

- Flexible boot program loader

- One serial port to support 8-/16-/24-/32-bit transfers

- 132-pin PQFP package; 0.8-µm CMOS


1.4 Typical Applications


The TMS320 family’s versatility, real-time performance, and multiple functions
offer flexible design approaches in a variety of applications, which are shown
in Table 1–1.

Table 1–1. Typical Applications of the TMS320 Family

General-Purpose DSP        Graphics/Imaging                  Instrumentation
-------------------        ----------------                  ---------------
Digital Filtering          3-D Transformations Rendering     Spectrum Analysis
Convolution                Robot Vision                      Function Generation
Correlation                Image Transmission/Compression    Pattern Matching
Hilbert Transforms         Pattern Recognition               Seismic Processing
Fast Fourier Transforms    Image Enhancement                 Transient Analysis
Adaptive Filtering         Homomorphic Processing            Digital Filtering
Windowing                  Workstations                      Phase-Locked Loops
Waveform Generation        Animation/Digital Map

Voice/Speech               Control                           Military
------------               -------                           --------
Voice Mail                 Disk Control                      Secure Communications
Speech Vocoding            Servo Control                     Radar Processing
Speech Recognition         Robot Control                     Sonar Processing
Speaker Verification       Laser Printer Control             Image Processing
Speech Enhancement         Engine Control                    Navigation
Speech Synthesis           Motor Control                     Missile Guidance
Text-to-Speech             Kalman Filtering                  Radio Frequency Modems
Neural Networks                                              Sensor Fusion

Telecommunications                                           Automotive
------------------                                           ----------
Echo Cancellation          FAX                               Engine Control
ADPCM Transcoders          Cellular Telephones               Vibration Analysis
Digital PBXs               Speaker Phones                    Antiskid Brakes
Line Repeaters             Digital Speech                    Adaptive Ride Control
Channel Multiplexing         Interpolation (DSI)             Global Positioning
1200- to 19200-bps Modems  X.25 Packet Switching             Navigation
Adaptive Equalizers        Video Conferencing                Voice Commands
DTMF Encoding/Decoding     Spread Spectrum                   Digital Radio
Data Encryption              Communications                  Cellular Telephones

Consumer                   Industrial                        Medical
--------                   ----------                        -------
Radar Detectors            Robotics                          Hearing Aids
Power Tools                Numeric Control                   Patient Monitoring
Digital Audio/TV           Security Access                   Ultrasound Equipment
Music Synthesizer          Power Line Monitors               Diagnostic Tools
Toys and Games             Visual Inspection                 Prosthetics
Solid-State Answering      Lathe Control                     Fetal Monitors
  Machines                 CAM                               MR Imaging

Chapter 2

TMS320C3x Architecture

This chapter gives an architectural overview of the TMS320C3x processor.

Major areas of discussion are listed below.

Topic

2.1 Architectural Overview
2.2 Central Processing Unit (CPU)
2.3 Memory Organization
2.4 Instruction Set Summary
2.5 Internal Bus Operation
2.6 Parallel Instruction Set Summary
2.7 External Bus Operation
2.8 Peripherals
2.9 Direct Memory Access (DMA)
2.10 TMS320C30 and TMS320C31 Differences
2.11 System Integration


2.1 Architectural Overview


The TMS320C3x architecture responds to system demands that are based on
sophisticated arithmetic algorithms and that emphasize both hardware and
software solutions. High performance is achieved through the precision and
wide dynamic range of the floating-point units, large on-chip memory, a high
degree of parallelism, and the direct memory access (DMA) controller.

Figure 2–1 is a block diagram of the TMS320C3x architecture.


Figure 2–1. TMS320C3x Block Diagram

[Block diagram. Memory: cache (64 × 32), RAM block 0 (1K × 32), RAM block 1
(1K × 32), and ROM block (4K × 32). Internal buses: PDATA, PADDR, DDATA,
DADDR1, DADDR2, DMADATA, and DMAADDR, multiplexed onto the primary port
(D31–D0, A23–A0, STRB, R/W, RDY, HOLD, HOLDA) and the expansion port
(XD31–XD0, XA12–XA0, MSTRB, IOSTRB, XR/W, XRDY). CPU: multiplier, ALU, 32-bit
barrel shifter, extended-precision registers (R7–R0), ARAU0 and ARAU1 (with
Disp, IR0, IR1, and BK inputs), auxiliary registers (AR0–AR7), and 12 other
registers, connected by the CPU1, CPU2, REG1, and REG2 buses. DMA controller:
global control, source address, destination address, and transfer counter
registers. Peripheral bus: serial port 0 and serial port 1 (port control,
R/X timer, data transmit, and data receive registers), timer 0 and timer 1
(global control, timer period, and timer counter registers), and primary and
expansion port control. Shaded blocks are available on the TMS320C30.]


2.2 Central Processing Unit (CPU)


The TMS320C3x has a register-based central processing unit (CPU)
architecture. The CPU consists of the following components:

- Floating-point/integer multiplier

- Arithmetic logic unit (ALU) for performing floating-point, integer, and
  logical operations

- 32-bit barrel shifter

- Internal buses (CPU1/CPU2 and REG1/REG2)

- Auxiliary register arithmetic units (ARAUs)

- CPU register file

Figure 2–2 shows the various CPU components that are discussed in the
succeeding subsections.


Figure 2–2. Central Processing Unit (CPU)

[Block diagram. The DADDR1, DADDR2, and DDATA buses connect through a
multiplexer to the CPU1 and CPU2 buses; the REG1 and REG2 buses connect to
the register file. The multiplier takes 32-bit inputs and produces 40-bit
results; the ALU, with its 32-bit barrel shifter, takes 40-bit inputs. The
register file holds the extended-precision registers (R0–R7), the auxiliary
registers (AR0–AR7), and 12 other registers. ARAU0 and ARAU1 take *Disp,
IR0, IR1, and BK as inputs and produce 24-bit addresses.]

* Disp = an 8-bit integer displacement carried in a program control instruction


2.2.1 Multiplier
The multiplier performs single-cycle multiplications on 24-bit integer and 32-bit
floating-point values. The TMS320C3x implementation of floating-point
arithmetic allows for floating-point operations at fixed-point speeds via a
50-ns instruction cycle and a high degree of parallelism. To gain even higher
throughput, you can use parallel instructions to perform a multiply and ALU
operation in a single cycle.

When the multiplier performs floating-point multiplication, the inputs are 32-bit
floating-point numbers, and the result is a 40-bit floating-point number. When
the multiplier performs integer multiplication, the input data is 24 bits and yields
a 32-bit result. Refer to Chapter 4 for detailed information on data formats and
floating-point operation.
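
As a rough illustration of the integer case described above, the following sketch models a multiply that takes 24-bit inputs and returns a 32-bit result. It assumes that only the 24 least significant bits of each operand are used as a signed value and that the product is simply truncated to its low 32 bits; the actual overflow behavior of the device is described elsewhere in this guide and may differ. The names `sign_extend` and `mpyi_sketch` are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def mpyi_sketch(a, b):
    """Model a 24-bit x 24-bit integer multiply with a 32-bit result."""
    a24 = sign_extend(a, 24)          # only the 24 LSBs of each input are used
    b24 = sign_extend(b, 24)
    product = a24 * b24               # may need up to 48 bits
    return sign_extend(product, 32)   # assumption: keep only the low 32 bits


print(mpyi_sketch(3, -4))
print(mpyi_sketch(0x1000003, 2))      # bit 24 of the first input is ignored
```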

2.2.2 Arithmetic Logic Unit (ALU)


The ALU performs single-cycle operations on 32-bit integer, 32-bit logical, and
40-bit floating-point data, including single-cycle integer and floating-point
conversions. Results of the ALU are always maintained in 32-bit integer or 40-bit
floating-point formats. The barrel shifter is used to shift up to 32 bits left or right
in a single cycle. Refer to Chapter 4 for detailed information on data formats
and floating-point operation.

Internal buses, CPU1/CPU2 and REG1/REG2, carry two operands from
memory and two operands from the register file, thus allowing parallel
multiplies and adds/subtracts on four integer or floating-point operands in a
single cycle.

2.2.3 Auxiliary Register Arithmetic Units (ARAUs)


Two auxiliary register arithmetic units (ARAU0 and ARAU1) can generate two
addresses in a single cycle. The ARAUs operate in parallel with the multiplier
and ALU. They support addressing with displacements, index registers (IR0
and IR1), and circular and bit-reversed addressing. Refer to Chapter 5 for a
description of addressing modes.
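
The two special addressing styles mentioned above can be pictured with a generic sketch. This is not the ARAU's exact algorithm (Chapter 5 defines that); it only shows the idea of wrapping an offset within a block of a given size and of reversing the bits of an index, as is commonly done to reorder FFT data. The function names are illustrative.

```python
def circular_step(offset, step, block_size):
    """Advance an offset by step, wrapping within a block of block_size words."""
    return (offset + step) % block_size


def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# Offsets stay inside an 8-word block when stepping forward or backward.
print(circular_step(6, 3, 8), circular_step(1, -3, 8))

# Bit-reversed order for an 8-point data set.
print([bit_reverse(i, 3) for i in range(8)])
```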


2.2.4 CPU Register File


The TMS320C3x provides 28 registers in a multiport register file that is tightly
coupled to the CPU. All of these registers can be operated upon by the
multiplier and ALU and can be used as general-purpose registers. However, the
registers also have some special functions. For example, the eight
extended-precision registers are especially suited for maintaining
extended-precision floating-point results. The eight auxiliary registers
support a variety of indirect addressing modes and can be used as
general-purpose 32-bit integer and logical registers. The remaining registers
provide such system functions as addressing, stack management, processor
status, interrupts, and block repeat. Refer to Chapter 6 for detailed
information and examples of stack management and register usage.

The register names and assigned functions are listed in Table 2–1. Following
the table, the function of each register or group of registers is briefly described.
Refer to Chapter 3 for detailed information on each of the CPU registers.


Table 2–1. CPU Registers


Register Name    Assigned Function

R0               Extended-precision register 0
R1               Extended-precision register 1
R2               Extended-precision register 2
R3               Extended-precision register 3
R4               Extended-precision register 4
R5               Extended-precision register 5
R6               Extended-precision register 6
R7               Extended-precision register 7

AR0              Auxiliary register 0
AR1              Auxiliary register 1
AR2              Auxiliary register 2
AR3              Auxiliary register 3
AR4              Auxiliary register 4
AR5              Auxiliary register 5
AR6              Auxiliary register 6
AR7              Auxiliary register 7

DP               Data-page pointer
IR0              Index register 0
IR1              Index register 1
BK               Block size
SP               System stack pointer

ST               Status register
IE               CPU/DMA interrupt enable
IF               CPU interrupt flags
IOF              I/O flags

RS               Repeat start address
RE               Repeat end address
RC               Repeat counter

The extended-precision registers (R7–R0) are capable of storing and
supporting operations on 32-bit integer and 40-bit floating-point numbers. Any
instruction that assumes the operands are floating-point numbers uses bits
39–0. If the operands are either signed or unsigned integers, only bits 31–0
are used; bits 39–32 remain unchanged. This is true for all shift operations.
Refer to Chapter 4 for extended-precision register formats for floating-point
and integer numbers.
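
The difference between integer and floating-point use of an extended-precision register can be sketched as a pair of bit-mask operations. The model below treats a register as a plain 40-bit value; the function names are illustrative and the sketch says nothing about how the hardware encodes floating-point numbers.

```python
MASK40 = (1 << 40) - 1   # bits 39-0
MASK32 = (1 << 32) - 1   # bits 31-0


def write_integer(reg40, value):
    """Integer result: replace bits 31-0 and leave bits 39-32 unchanged."""
    upper = reg40 & MASK40 & ~MASK32
    return upper | (value & MASK32)


def write_float(reg40, value40):
    """Floating-point result: all of bits 39-0 are replaced."""
    return value40 & MASK40


r0 = 0xAB12345678
print(hex(write_integer(r0, 0xDEADBEEF)))   # upper byte 0xAB is preserved
print(hex(write_float(r0, 0x0100000000)))
```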

The 32-bit auxiliary registers (AR7–AR0) can be accessed by the CPU and
modified by the two ARAUs. The primary function of the auxiliary registers is
the generation of 24-bit addresses. They can also be used as loop counters
or as 32-bit general-purpose registers that can be modified by the multiplier
and ALU. Refer to Chapter 5 for detailed information and examples of the use
of auxiliary registers in addressing.


The data page pointer (DP) is a 32-bit register. The eight LSBs of the data
page pointer are used by the direct addressing mode as a pointer to the page
of data being addressed. Data pages are 64K words long, with a total of 256
pages.
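
Because 256 pages of 64K words cover exactly the 24-bit address range, a direct address can be pictured as the page number concatenated with an offset inside the page. The sketch below assumes that the instruction supplies a 16-bit offset within the page; the function name is illustrative.

```python
def direct_address(dp, offset16):
    """Form a 24-bit address from the 8 LSBs of DP and a 16-bit page offset."""
    page = dp & 0xFF                 # only the eight LSBs of DP are used
    return (page << 16) | (offset16 & 0xFFFF)


# Page 80h with offset 9800h selects address 809800h.
print(hex(direct_address(0x80, 0x9800)))
```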

The 32-bit index registers (IR0, IR1) contain the value used by the ARAU to
compute an indexed address. Refer to Chapter 5 for examples of the use of
index registers in addressing.

The ARAU uses the 32-bit block size register (BK) in circular addressing to
specify the data block size.

The system stack pointer (SP) is a 32-bit register that contains the address
of the top of the system stack. The SP always points to the last element pushed
onto the stack. A push performs a preincrement of the system stack pointer;
a pop performs a postdecrement. The SP is manipulated by interrupts, traps,
calls, returns, and the PUSH and POP instructions. Refer to Section 5.5 for
information about system stack management.
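
The push and pop behavior described above (preincrement on push, postdecrement on pop, with SP pointing at the last element pushed) can be modeled in a few lines. The class below is a sketch only: memory is a dictionary, and the starting address is an arbitrary example value.

```python
class SystemStack:
    """Minimal model of a stack that grows toward higher addresses."""

    def __init__(self, sp):
        self.sp = sp          # address of the last element pushed
        self.memory = {}      # sparse model of memory, address -> word

    def push(self, value):
        self.sp += 1                  # preincrement
        self.memory[self.sp] = value  # then store

    def pop(self):
        value = self.memory[self.sp]  # read the top of stack
        self.sp -= 1                  # postdecrement
        return value


stack = SystemStack(0x809800)         # example starting address (assumption)
stack.push(1)
stack.push(2)
print(hex(stack.sp), stack.pop(), stack.pop(), hex(stack.sp))
```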

The status register (ST) contains global information relating to the state of the
CPU. Operations usually set the condition flags of the status register
according to whether the result is 0, negative, etc. This includes register load and
store operations as well as arithmetic and logical functions. When the status
register is loaded, however, a bit-for-bit replacement is performed with the
contents of the source operand, regardless of the state of any bits in the source
operand. Therefore, following a load, the contents of the status register are
identical to the contents of the source operand. This allows the status register
to be easily saved and restored. See Table 3–2 for a list and definitions of the
status register bits.

The CPU/DMA interrupt enable register (IE) is a 32-bit register. The CPU
interrupt enable bits are in locations 10–0. The DMA interrupt enable bits are
in locations 26–16. A 1 in a CPU/DMA interrupt enable register bit enables the
corresponding interrupt. A 0 disables the corresponding interrupt. Refer to
subsection 3.1.8 for bit definitions.
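
The field layout described above can be expressed as two masks. The sketch below assumes that DMA enable bit n sits 16 positions above CPU enable bit n, which matches the two ranges given here but is not confirmed bit by bit in this section; subsection 3.1.8 has the actual definitions. All names are illustrative.

```python
CPU_ENABLE_MASK = 0x000007FF   # bits 10-0
DMA_ENABLE_MASK = 0x07FF0000   # bits 26-16


def cpu_interrupt_enabled(ie, n):
    """True if CPU interrupt enable bit n (0-10) is set in IE."""
    return bool(ie & CPU_ENABLE_MASK & (1 << n))


def dma_interrupt_enabled(ie, n):
    """True if DMA interrupt enable bit n (0-10) is set in IE.

    Assumes DMA bit n is located at bit position 16 + n.
    """
    return bool(ie & DMA_ENABLE_MASK & (1 << (16 + n)))


ie = (1 << 3) | (1 << (16 + 5))
print(cpu_interrupt_enabled(ie, 3), dma_interrupt_enabled(ie, 5))
```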

The CPU interrupt flag register (IF) is also a 32-bit register (see subsection
3.1.9). A 1 in a CPU interrupt flag register bit indicates that the corresponding
interrupt is set. A 0 indicates that the corresponding interrupt is not set.

The I/O flags register (IOF) controls the function of the dedicated external
pins, XF0 and XF1. These pins may be configured for input or output and may
also be read from and written to. See subsection 3.1.10 for detailed
information.


The repeat counter (RC) is a 32-bit register used to specify the number of
times a block of code is to be repeated when performing a block repeat. When
the processor is operating in the repeat mode, the 32-bit repeat start address
register (RS) contains the starting address of the block of program memory
to be repeated, and the 32-bit repeat end address register (RE) contains the
ending address of the block to be repeated.

The program counter (PC) is a 32-bit register containing the address of the
next instruction to be fetched. Although the PC is not part of the CPU register
file, it is a register that can be modified by instructions that modify the program
flow.


2.3 Memory Organization


The total memory space of the TMS320C3x is 16M (million) 32-bit words.
Program, data, and I/O space are contained within this 16M-word address space,
thus allowing tables, coefficients, program code, or data to be stored in either
RAM or ROM. In this way, memory usage is maximized and memory space
can be allocated as desired.

2.3.1 RAM, ROM, and Cache


Figure 2–3 shows how the memory is organized on the TMS320C3x. RAM
blocks 0 and 1 are each 1K x 32 bits. The ROM block, available only on the
TMS320C30, is 4K x 32 bits. Each RAM and ROM block is capable of
supporting two CPU accesses in a single cycle. The separate program buses, data
buses, and DMA buses allow for parallel program fetches, data reads and
writes, and DMA operations. For example, the CPU can access two data
values in one RAM block and perform an external program fetch in parallel with
the DMA loading another RAM block, all within a single cycle.


Figure 2–3. Memory Organization

[Block diagram. Cache (64 x 32), RAM block 0 (1K x 32), RAM block 1
(1K x 32), and ROM block (4K x 32) connect to the PDATA, PADDR, DDATA,
DADDR1, DADDR2, DMADATA, and DMAADDR buses. Multiplexers connect these buses
to the primary port (RDY, HOLD, HOLDA, STRB, R/W, D31–D0, A23–A0), the
expansion port (XRDY, MSTRB, IOSTRB, XR/W, XD31–XD0, XA12–XA0), and the
peripheral bus. The program counter/instruction register, the CPU, and the
DMA controller attach to the same internal buses. Shaded blocks are available
on the TMS320C30.]

A 64 x 32-bit instruction cache is provided to store often-repeated sections of
code, thus greatly reducing the number of off-chip accesses necessary. This
allows for code to be stored off-chip in slower, lower-cost memories. The
external buses are also freed for use by the DMA, external memory fetches, or
other devices in the system.

Refer to Chapter 3 for detailed information about the memory and instruction
cache.


2.3.2 Memory Maps


The memory map depends on whether the processor is running in
microprocessor mode (MC/MP or MCBL/MP = 0) or microcomputer mode (MC/MP or
MCBL/MP = 1). The memory maps for these modes are similar (see
Figure 2–4 and Figure 2–5). Locations 800000h–801FFFh are mapped to the
expansion bus. When this region, available only on the TMS320C30, is
accessed, MSTRB is active. Locations 802000h–803FFFh are reserved.
Locations 804000h–805FFFh are mapped to the expansion bus. When this region,
available only on the TMS320C30, is accessed, IOSTRB is active. Locations
806000h–807FFFh are reserved. All of the memory-mapped peripheral bus
registers are in locations 808000h–8097FFh. In both modes, RAM block 0 is
located at addresses 809800h–809BFFh, and RAM block 1 is located at
addresses 809C00h–809FFFh. Locations 80A000h–0FFFFFFh are accessed
over the external memory port (STRB active).

In microprocessor mode, the 4K on-chip ROM (TMS320C30) or boot loader
(TMS320C31) is not mapped into the TMS320C3x memory map. Locations
0h–0BFh consist of interrupt vector, trap vector, and reserved locations, all of
which are accessed over the external memory port (STRB active). Locations
0C0h–7FFFFFh are also accessed over the external memory port.

In microcomputer mode, the 4K on-chip ROM (TMS320C30) or boot loader
(TMS320C31) is mapped into locations 0h–0FFFh. There are 192 locations
(0h–0BFh) within this block for interrupt vectors, trap vectors, and a reserved
space (TMS320C30). Locations 1000h–7FFFFFh are accessed over the
external memory port (STRB active).

Section 3.2 on page 3-13 describes the memory maps in greater detail and
provides the peripheral bus map and vector locations for reset, interrupts, and
traps.
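
The address ranges given in this subsection for the TMS320C30 can be collected into a small lookup routine. The sketch below follows the ranges stated in the text above; the function name, the returned labels, and the boolean mode argument are illustrative choices, not part of any TI tool.

```python
def decode_c30(address, microcomputer_mode):
    """Return the region that a 24-bit TMS320C30 address falls in."""
    a = address & 0xFFFFFF
    if a <= 0x7FFFFF:
        # The 4K on-chip ROM is mapped only in microcomputer mode.
        if microcomputer_mode and a <= 0x0FFF:
            return "internal ROM"
        return "external (STRB)"
    if a <= 0x801FFF:
        return "expansion bus (MSTRB)"
    if a <= 0x803FFF:
        return "reserved"
    if a <= 0x805FFF:
        return "expansion bus (IOSTRB)"
    if a <= 0x807FFF:
        return "reserved"
    if a <= 0x8097FF:
        return "peripheral bus registers"
    if a <= 0x809BFF:
        return "RAM block 0"
    if a <= 0x809FFF:
        return "RAM block 1"
    return "external (STRB)"


for addr in (0x000010, 0x804000, 0x809C00, 0x80A000):
    print(hex(addr), decode_c30(addr, microcomputer_mode=True))
```

A routine like this can be handy in a simulator or linker-script check to flag accesses that land in a reserved range.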

Be careful! Access to a reserved area produces unpredictable results.


Figure 2–4. TMS320C30 Memory Maps

(a) Microprocessor Mode

Address Range        Contents
0h–03Fh              Reset, interrupt, trap vectors, and reserved locations
                     (192) (external STRB active)
040h–7FFFFFh         External STRB active
800000h–801FFFh      Expansion bus MSTRB active (8K words)
802000h–803FFFh      Reserved (8K words)
804000h–805FFFh      Expansion bus IOSTRB active (8K words)
806000h–807FFFh      Reserved (8K words)
808000h–8097FFh      Peripheral bus memory-mapped registers (6K words internal)
809800h–809BFFh      RAM block 0 (1K word internal)
809C00h–809FFFh      RAM block 1 (1K word internal)
80A000h–0FFFFFFh     External STRB active

(b) Microcomputer Mode

Address Range        Contents
0h–0BFh              Reset, interrupt, trap vectors, and reserved locations
                     (192)
0C0h–0FFFh           ROM (internal)
1000h–7FFFFFh        External STRB active
800000h–801FFFh      Expansion bus MSTRB active (8K words)
802000h–803FFFh      Reserved (8K words)
804000h–805FFFh      Expansion bus IOSTRB active (8K words)
806000h–807FFFh      Reserved (8K words)
808000h–8097FFh      Peripheral bus memory-mapped registers (6K words internal)
809800h–809BFFh      RAM block 0 (1K word internal)
809C00h–809FFFh      RAM block 1 (1K word internal)
80A000h–0FFFFFFh     External STRB active


Figure 2–5. TMS320C31 Memory Maps

(a) Microprocessor Mode

Address Range        Contents
0h–03Fh              Reset, interrupt, trap vectors, and reserved locations
                     (192) (external STRB active)
040h–7FFFFFh         External STRB active
800000h–807FFFh      Reserved (32K words)
808000h–8097FFh      Peripheral bus memory-mapped registers (6K words internal)
809800h–809BFFh      RAM block 0 (1K word internal)
809C00h–809FFFh      RAM block 1 (1K word internal)
80A000h–FFFFFFh      External STRB active

(b) Microcomputer/Boot Loader Mode

Address Range        Contents
0h–FFFh              Reserved for boot loader operations (see Section 3.4)
1000h–7FFFFFh        External STRB active (Boot 1 at 1000h, Boot 2 at 400000h)
800000h–807FFFh      Reserved (32K words)
808000h–8097FFh      Peripheral bus memory-mapped registers (6K words internal)
809800h–809BFFh      RAM block 0 (1K word internal)
809C00h–809FC0h      RAM block 1 (1K word – 63 internal)
809FC1h–809FFFh      User program interrupt and trap branches
                     (63 words internal)
80A000h–FFFFFFh      External STRB active (Boot 3 at FFF000h)


2.3.3 Memory Addressing Modes


The TMS320C3x supports a base set of general-purpose instructions as well
as arithmetic-intensive instructions that are particularly suited for digital signal
processing and other numeric-intensive applications. Refer to Chapter 5 for
detailed information on addressing.

Five groups of addressing modes are provided on the TMS320C3x. Six types
of addressing can be used within the groups, as shown in the following list:

- General addressing modes:
  - Register. The operand is a CPU register.
  - Short immediate. The operand is a 16-bit immediate value.
  - Direct. The operand is the contents of a 24-bit address.
  - Indirect. An auxiliary register indicates the address of the operand.

- Three-operand addressing modes:
  - Register. Same as for general addressing mode.
  - Indirect. Same as for general addressing mode.

- Parallel addressing modes:
  - Register. The operand is an extended-precision register.
  - Indirect. Same as for general addressing mode.

- Long-immediate addressing mode:
  - Long immediate. The operand is a 24-bit immediate value.

- Conditional branch addressing modes:
  - Register. Same as for general addressing mode.
  - PC-relative. A signed 16-bit displacement is added to the PC.
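
As an illustration of the PC-relative mode in the list above, the sketch below sign-extends a 16-bit displacement and adds it to the PC. Masking the result to 24 bits is an assumption based on the 24-bit address width, and the sketch ignores any fixed offset that a particular branch instruction may add; the function names are hypothetical.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def pc_relative_target(pc, disp16):
    """Add a signed 16-bit displacement to the PC (24-bit address assumed)."""
    return (pc + sign_extend16(disp16)) & 0xFFFFFF


print(hex(pc_relative_target(0x1000, 0x0010)))   # forward by 16 words
print(hex(pc_relative_target(0x1000, 0xFFFE)))   # backward by 2 words
```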


2.4 Instruction Set Summary

Table 2–2 lists the TMS320C3x instruction set in alphabetical order. Each
table entry shows the instruction mnemonic, description, and operation. Refer
to Chapter 10 for a functional listing of the instructions and individual
instruction descriptions.
Table 2–2. Instruction Set Summary

Mnemonic Description Operation


ABSF Absolute value of a floating-point number |src| → Rn
ABSI Absolute value of an integer |src| → Dreg
ADDC Add integers with carry src + Dreg + C → Dreg
ADDC3 Add integers with carry (3 operand) src1 + src2 + C → Dreg
ADDF Add floating-point values src + Rn → Rn
ADDF3 Add floating-point values (3 operand) src1 + src2 → Rn
ADDI Add integers src + Dreg → Dreg
ADDI3 Add integers (3 operand) src1 + src2 → Dreg
AND Bitwise logical AND Dreg AND src → Dreg
AND3 Bitwise logical AND (3 operand) src1 AND src2 → Dreg
ANDN Bitwise logical AND with complement Dreg AND (NOT src) → Dreg
ANDN3 Bitwise logical ANDN (3 operand) src1 AND (NOT src2) → Dreg
ASH Arithmetic shift If count ≥ 0:
(Shifted Dreg left by count) → Dreg
Else:
(Shifted Dreg right by |count|) → Dreg
ASH3 Arithmetic shift (3 operand) If count ≥ 0:
(Shifted src left by count) → Dreg
Else:
(Shifted src right by |count|) → Dreg
Bcond Branch conditionally (standard) If cond = true:
If Csrc is a register, Csrc → PC
If Csrc is a value, Csrc + PC → PC
Else, PC + 1 → PC
BcondD Branch conditionally (delayed) If cond = true:
If Csrc is a register, Csrc → PC
If Csrc is a value, Csrc + PC + 3 → PC
Else, PC + 1 → PC
BR Branch unconditionally (standard) Value → PC
BRD Branch unconditionally (delayed) Value → PC
CALL Call subroutine PC + 1 → TOS
Value → PC
Legend: C carry bit Csrc conditional-branch addressing modes
cond condition code count shift value (general addressing modes)
Dreg register address (any register) PC program counter
Rn register address (R7–R0) src general addressing modes
src1 three-operand addressing modes src2 three-operand addressing modes


CALLcond Call subroutine conditionally If cond = true:
PC + 1 → TOS
If Csrc is a register, Csrc → PC
If Csrc is a value, Csrc + PC → PC
Else, PC + 1 → PC
CMPF Compare floating-point values Set flags on Rn – src
CMPF3 Compare floating-point values (3 operand) Set flags on src1 – src2
CMPI Compare integers Set flags on Dreg – src
CMPI3 Compare integers (3 operand) Set flags on src1 – src2
DBcond Decrement and branch conditionally (standard) ARn – 1 → ARn
If cond = true and ARn ≥ 0:
If Csrc is a register, Csrc → PC
If Csrc is a value, Csrc + PC + 1 → PC
Else, PC + 1 → PC
DBcondD Decrement and branch conditionally (delayed) ARn – 1 → ARn
If cond = true and ARn ≥ 0:
If Csrc is a register, Csrc → PC
If Csrc is a value, Csrc + PC + 3 → PC
Else, PC + 1 → PC
FIX Convert floating-point value to integer Fix (src) → Dreg
FLOAT Convert integer to floating-point value Float(src) → Rn
IACK Interrupt acknowledge Dummy read of src
IACK toggled low, then high
IDLE Idle until interrupt PC + 1 → PC
Idle until next interrupt
LDE Load floating-point exponent src(exponent) → Rn(exponent)
LDF Load floating-point value src → Rn
LDFcond Load floating-point value conditionally If cond = true, src → Rn
Else, Rn is not changed
LDFI Load floating-point value, interlocked Signal interlocked operation
src → Rn
LDI Load integer src → Dreg
LDIcond Load integer conditionally If cond = true, src → Dreg
Else, Dreg is not changed
Legend: ARn auxiliary register n (AR7–AR0) Rn register address (R7–R0)
Csrc conditional-branch addressing modes src general addressing modes
cond condition code src1 three-operand addressing modes
Dreg register address (any register) src2 three-operand addressing modes
PC program counter TOS top of stack

LDII Load integer, interlocked Signal interlocked operation
src → Dreg
LDM Load floating-point mantissa src (mantissa) → Rn (mantissa)
LSH Logical shift If count ≥ 0:
(Dreg left-shifted by count) → Dreg
Else:
(Dreg right-shifted by |count|) → Dreg
LSH3 Logical shift (3-operand) If count ≥ 0:
(src left-shifted by count) → Dreg
Else:
(src right-shifted by |count|) → Dreg
MPYF Multiply floating-point values src × Rn → Rn
MPYF3 Multiply floating-point value (3 operand) src1 × src2 → Rn
MPYI Multiply integers src × Dreg → Dreg
MPYI3 Multiply integers (3 operand) src1 × src2 → Dreg
NEGB Negate integer with borrow 0 – src – C → Dreg
NEGF Negate floating-point value 0 – src → Rn
NEGI Negate integer 0 – src → Dreg
NOP No operation Modify ARn if specified
NORM Normalize floating-point value Normalize (src) → Rn
NOT Bitwise logical complement NOT src → Dreg
OR Bitwise logical OR Dreg OR src → Dreg
OR3 Bitwise logical OR (3 operand) src1 OR src2 → Dreg
POP Pop integer from stack *SP-- → Dreg
POPF Pop floating-point value from stack *SP-- → Rn
PUSH Push integer on stack Sreg → *++SP
PUSHF Push floating-point value on stack Rn → *++SP
Legend: ARn auxiliary register n (AR7–AR0) SP stack pointer
C carry bit Sreg register address (any register)
Dreg register address (any register) src general addressing modes
PC program counter src1 3-operand addressing modes
Rn register address (R7–R0) src2 3-operand addressing modes


RETIcond Return from interrupt conditionally If cond = true or missing:
*SP-- → PC
1 → ST (GIE)
Else, continue
RETScond Return from subroutine conditionally If cond = true or missing:
*SP-- → PC
Else, continue
RND Round floating-point value Round (src) → Rn
ROL Rotate left Dreg rotated left 1 bit → Dreg
ROLC Rotate left through carry Dreg rotated left 1 bit through carry → Dreg
ROR Rotate right Dreg rotated right 1 bit → Dreg
RORC Rotate right through carry Dreg rotated right 1 bit through carry → Dreg
RPTB Repeat block of instructions src → RE
1 → ST (RM)
Next PC → RS
RPTS Repeat single instruction src → RC
1 → ST (RM)
Next PC → RS
Next PC → RE
SIGI Signal, interlocked Signal interlocked operation
Wait for interlock acknowledge
Clear interlock
STF Store floating-point value Rn → Daddr
STFI Store floating-point value, interlocked Rn → Daddr
Signal end of interlocked operation
STI Store integer Sreg → Daddr
STII Store integer, interlocked Sreg → Daddr
Signal end of interlocked operation
SUBB Subtract integers with borrow Dreg – src – C → Dreg
Legend: C carry bit RM repeat mode bit
cond condition code RS repeat start register
Daddr destination memory address Rn register address (R7–R0)
Dreg register address (any register) SP stack pointer
GIE global interrupt enable bit ST status register
PC program counter Sreg register address (any register)
RC repeat counter register src general addressing modes
RE repeat end address register


Table 2–2. Instruction Set Summary (Concluded)


Mnemonic Description Operation
SUBB3 Subtract integers with borrow (3 operand) src1 – src2 – C → Dreg
SUBC Subtract integers conditionally If Dreg – src ≥ 0:
[(Dreg – src) << 1] OR 1 → Dreg
Else, Dreg << 1 → Dreg
SUBF Subtract floating-point values Rn – src → Rn

SUBF3 Subtract floating-point values (3 operand) src1 – src2 → Rn

SUBI Subtract integers Dreg – src → Dreg

SUBI3 Subtract integers (3 operand) src1 – src2 → Dreg

SUBRB Subtract reverse integer with borrow src – Dreg – C → Dreg

SUBRF Subtract reverse floating-point value src – Rn → Rn

SUBRI Subtract reverse integer src – Dreg → Dreg

SWI Software interrupt Perform emulator interrupt sequence

TRAPcond Trap conditionally If cond = true or missing:
Next PC → *++SP
Trap vector N → PC
0 → ST (GIE)
Else, continue

TSTB Test bit fields Dreg AND src

TSTB3 Test bit fields (3 operand) src1 AND src2

XOR Bitwise exclusive OR Dreg XOR src → Dreg

XOR3 Bitwise exclusive OR (3 operand) src1 XOR src2 → Dreg


Legend: C carry bit Rn register address (R7–R0)
cond condition code SP stack pointer
Dreg register address (any register) src general addressing modes
GIE global interrupt enable bit src1 3-operand addressing modes
N any trap vector 0–27 src2 3-operand addressing modes
PC program counter ST status register
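
The shift and conditional-subtract entries in Table 2–2 can be modeled in C. The following is a minimal sketch, not TI code. The function names lsh and subc_step are hypothetical, both operands are treated as unsigned 32-bit values (an assumption for SUBC), and shift counts with a magnitude of 32 or more are assumed to produce 0.

```c
#include <stdint.h>

/* LSH: logical shift. A non-negative count shifts left; a negative
   count shifts right by |count|. Counts with a magnitude of 32 or more
   are assumed here to clear the result, because C leaves such shifts
   undefined. */
uint32_t lsh(uint32_t dreg, int count)
{
    if (count >= 32 || count <= -32)
        return 0;
    return (count >= 0) ? dreg << count : dreg >> -count;
}

/* SUBC: one conditional-subtract step as listed in the table.
   Operands are treated as unsigned (an assumption of this sketch),
   so "Dreg - src >= 0" becomes "dreg >= src". */
uint32_t subc_step(uint32_t dreg, uint32_t src)
{
    if (dreg >= src)
        return ((dreg - src) << 1) | 1u;
    return dreg << 1;
}
```

For example, subc_step(10, 3) subtracts, shifts, and sets the LSB, whereas subc_step(2, 3) only shifts.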


2.5 Internal Bus Operation


Much of the TMS320C3x’s high performance is due to internal busing and par-
allelism. The separate program buses (PADDR and PDATA), data buses
(DADDR1, DADDR2, and DDATA), and DMA buses (DMAADDR and
DMADATA) allow for parallel program fetches, data accesses, and DMA ac-
cesses. These buses connect all of the physical spaces (on-chip memory,
off-chip memory, and on-chip peripherals) supported by the TMS320C30.
Figure 2–3 shows these internal buses and their connection to on-chip and off-
chip memory blocks.

The PC is connected to the 24-bit program address bus (PADDR). The instruc-
tion register (IR) is connected to the 32-bit program data bus (PDATA). These
buses can fetch a single instruction word every machine cycle.

The 24-bit data address buses (DADDR1 and DADDR2) and the 32-bit data
bus (DDATA) support two data memory accesses every machine cycle.
The DDATA bus carries data to the CPU over the CPU1 and CPU2 buses. The
CPU1 and CPU2 buses can carry two data memory operands to the multiplier,
ALU, and register file every machine cycle. Also internal to the CPU are regis-
ter buses REG1 and REG2, which can carry two data values from the register
file to the multiplier and ALU every machine cycle. Figure 2–2 shows the buses
internal to the CPU section of the processor.

The DMA controller is supported with a 24-bit address bus (DMAADDR) and
a 32-bit data bus (DMADATA). These buses allow the DMA to perform memory
accesses in parallel with the memory accesses occurring from the data and
program buses.


2.6 Parallel Instruction Set Summary


Table 2–3 lists the ’C3x parallel instruction set, grouped by instruction type
and in alphabetical order within each group. Each table entry shows the
instruction mnemonic, description, and operation. Refer to Section 10.3 for a
functional listing of the instructions and individual instruction descriptions.


Table 2–3. Parallel Instruction Set Summary


Mnemonic Description Operation

Parallel Arithmetic With Store Instructions

ABSF Absolute value of a floating point |src2| → dst1
|| STF || src3 → dst2
ABSI Absolute value of an integer |src2| → dst1
|| STI || src3 → dst2
ADDF3 Add floating point src1 + src2 → dst1
|| STF || src3 → dst2
ADDI3 Add integer src1 + src2 → dst1
|| STI || src3 → dst2
AND3 Bitwise logical AND src1 AND src2 → dst1
|| STI || src3 → dst2
ASH3 Arithmetic shift If count ≥ 0:
|| STI src2 << count → dst1
|| src3 → dst2
Else:
src2 >> |count| → dst1
|| src3 → dst2
FIX Convert floating point to integer Fix(src2) → dst1
|| STI || src3 → dst2
FLOAT Convert integer to floating point Float(src2) → dst1
|| STF || src3 → dst2
LDF Load floating point src2 → dst1
|| STF || src3 → dst2
LDI Load integer src2 → dst1
|| STI || src3 → dst2
LSH3 Logical shift If count ≥ 0:
|| STI src2 << count → dst1
|| src3 → dst2
Else:
src2 >> |count| → dst1
|| src3 → dst2
MPYF3 Multiply floating point src1 × src2 → dst1
|| STF || src3 → dst2
MPYI3 Multiply integer src1 × src2 → dst1
|| STI || src3 → dst2
Legend: count register addr (R7–R0) src1 register addr (R7–R0)
dst1 register addr (R7–R0) src2 indirect addr (disp = 0, 1, IR0, IR1)
dst2 indirect addr (disp = 0, 1, IR0, IR1) src3 register addr (R7–R0)


Table 2–3. Parallel Instruction Set Summary (Continued)

Mnemonic Description Operation


Parallel Arithmetic With Store Instructions (Concluded)
NEGF Negate floating point 0 – src2 → dst1
|| STF || src3 → dst2
NEGI Negate integer 0 – src2 → dst1
|| STI || src3 → dst2
NOT Complement ~src1 → dst1
|| STI || src3 → dst2
OR3 Bitwise logical OR src1 OR src2 → dst1
|| STI || src3 → dst2
STF Store floating point src1 → dst1
|| STF || src3 → dst2
STI Store integer src1 → dst1
|| STI || src3 → dst2
SUBF3 Subtract floating point src1 – src2 → dst1
|| STF || src3 → dst2
SUBI3 Subtract integer src1 – src2 → dst1
|| STI || src3 → dst2
XOR3 Bitwise exclusive OR src1 XOR src2 → dst1
|| STI || src3 → dst2
Parallel Load Instructions
LDF Load floating point src2 → dst1
|| LDF || src4 → dst2
LDI Load integer src2 → dst1
|| LDI || src4 → dst2
Parallel Multiply And Add/Subtract Instructions
MPYF3 Multiply and add floating point op1 × op2 → op3
|| ADDF3 || op4 + op5 → op6
MPYF3 Multiply and subtract floating point op1 × op2 → op3
|| SUBF3 || op4 – op5 → op6
MPYI3 Multiply and add integer op1 × op2 → op3
|| ADDI3 || op4 + op5 → op6
MPYI3 Multiply and subtract integer op1 × op2 → op3
|| SUBI3 || op4 – op5 → op6
Legend: dst1 register addr (R7–R0)
dst2 indirect addr (disp = 0, 1, IR0, IR1)
op1, op2, op4, and op5 Any two of these operands must be specified using register addr; the remaining two must be specified using indirect.
op3 register addr (R0 or R1)
op6 register addr (R2 or R3)
src1 register addr (R7–R0)
src2 indirect addr (disp = 0, 1, IR0, IR1)
src3 register addr (R7–R0)
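
The parallel multiply-and-add entry can be illustrated with a dot product. The sketch below is a C model, not device code. It assumes that both halves of a parallel instruction read their source operands before either destination is written, so the add half sees the product from the previous pass. The function name dot_product and the variable names r0 and r2 are hypothetical stand-ins for op3 and op6.

```c
#include <stddef.h>

/* Models a loop of   MPYF3 *x++, *y++, R0  ||  ADDF3 R0, R2, R2.
   Each pass multiplies a new pair of operands while adding the
   product left in R0 by the previous pass. Assumption: both halves
   read their sources before either result is written. */
float dot_product(const float *x, const float *y, size_t n)
{
    float r0 = 0.0f;   /* op3: product register (R0 or R1) */
    float r2 = 0.0f;   /* op6: accumulator register (R2 or R3) */

    for (size_t i = 0; i < n; i++) {
        float product = x[i] * y[i];   /* MPYF3 half */
        float sum = r0 + r2;           /* ADDF3 half uses the old R0 */
        r0 = product;
        r2 = sum;
    }
    return r0 + r2;    /* one extra add folds in the last product */
}
```

In this model the multiply operands are the two indirect operands and the add operands are the two register operands, which matches the two-register, two-indirect rule in the legend.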


2.7 External Bus Operation


The TMS320C30 provides two external interfaces: the primary bus and the ex-
pansion bus. The TMS320C31 provides one external interface: the primary
bus. Both primary and expansion buses consist of a 32-bit data bus and a set
of control signals. The primary bus has a 24-bit address bus, whereas the ex-
pansion bus has a 13-bit address bus. Both buses can be used to address ex-
ternal program/data memory or I/O space. The buses also have an external
RDY signal for wait-state generation. You can insert additional wait states un-
der software control. Refer to Chapter 7 for detailed information on external
bus operation.

2.7.1 External Interrupts


The TMS320C3x supports four external interrupts (INT3–INT0), a number of
internal interrupts, and a nonmaskable external RESET signal. These can be
used to interrupt either the DMA or the CPU. When the CPU responds to the
interrupt, the IACK pin can be used to signal an external interrupt acknowl-
edge. Section 6.5 (beginning on page 6-18) covers RESET and interrupt pro-
cessing.

2.7.2 Interlocked-Instruction Signaling


Two external I/O flags, XF0 and XF1, can be configured as input or output pins
under software control. These pins are also used by the interlocked operations
of the TMS320C3x. The interlocked-operations instruction group supports
multiprocessor communication (see Section 6.4 on page 6-12 for examples of
the use of interlocked instructions).


2.8 Peripherals
All TMS320C3x peripherals are controlled through memory-mapped registers
on a dedicated peripheral bus. This peripheral bus is composed of a 32-bit data
bus and a 24-bit address bus. This peripheral bus permits straightforward
communication to the peripherals. The TMS320C3x peripherals include two
timers and two serial ports (only one serial port is available on the
TMS320C31). Figure 2–6 shows the peripherals with associated buses and
signals. Refer to Chapter 8 for detailed information on the peripherals.

Figure 2–6. Peripheral Modules


[Block diagram showing the peripheral data bus, the peripheral address bus, and the four peripheral modules attached to them.]

Serial Port 0: port control register, R/X timer register, data transmit register, data receive register. Pins: FSX0, DX0, CLKX0, FSR0, DR0, CLKR0.

Serial Port 1 (shaded in the figure; available on TMS320C30): port control register, R/X timer register, data transmit register, data receive register. Pins: FSX1, DX1, CLKX1, FSR1, DR1, CLKR1.

Timer 0: global control register, timer period register, timer counter register. Pin: TCLK0.

Timer 1: global control register, timer period register, timer counter register. Pin: TCLK1.


2.8.1 Timers
The two timer modules are general-purpose 32-bit timer/event counters with
two signaling modes and internal or external clocking. Each timer has an I/O
pin that can be used as an input clock to the timer or as an output signal driven
by the timer. The pin can also be configured as a general-purpose I/O pin.

2.8.2 Serial Ports


The two bidirectional serial ports are totally independent. They are identical,
and each port is controlled by its own complementary set of control registers.
Each serial port can be configured to transfer 8, 16, 24, or 32 bits of data per
word. The clock for each serial port can originate either internally or
externally. An internally generated divide-down clock is provided. The serial
port pins are configurable as general-purpose I/O pins. The serial ports can
also be configured as timers. A special handshake mode allows TMS320C3x devices
to communicate over their serial ports with guaranteed synchronization.


2.9 Direct Memory Access (DMA)


The on-chip DMA controller can read from or write to any location in the
memory map without interfering with the operation of the CPU. Therefore, the
TMS320C3x can interface to slow external memories and peripherals without
reducing throughput to the CPU. The DMA controller contains its own address
generators, source and destination registers, and transfer counter. Dedicated
DMA address and data buses minimize conflicts between the CPU and the
DMA controller. A DMA operation consists of a block or single-word transfer
to or from memory. Refer to Section 8.3 on page 8-43 for detailed information
on the DMA controller. Figure 2–7 shows the DMA controller with associated
buses.

Figure 2–7. DMA Controller

[Block diagram: the DMA controller is shown with the DMADATA bus, the DMAADDR bus, the peripheral data bus, and the peripheral address bus. The controller contains a global control register, a source address register, a destination address register, and a transfer counter register.]


2.10 TMS320C30 and TMS320C31 Differences


This section addresses the major memory access differences between the
TMS320C31 and the TMS320C30 devices. Take these differences into account
when you design a system or move code between the two devices.

Table 2–4 shows these differences, which are detailed in the following subsec-
tions.

Table 2–4. Feature Set Comparison


Feature    TMS320C31    TMS320C30

Data/program bus    One bus: primary bus (32-bit data, 24-bit address)    Two buses: primary bus (32-bit data, 24-bit address) and expansion bus (32-bit data, 13-bit address)

Serial I/O ports    1 serial port (SP0)    2 serial ports (SP0, SP1)

User program/data ROM    Not available    4K words/16K bytes

Program boot loader    User selectable    Not available

2.10.1 Data/Program Bus Differences


The TMS320C31 uses only the primary bus and reserves the memory space
that was previously used for expansion bus operations.

Be careful! Program access to a reserved area produces unpredictable results.

2.10.2 Serial-Port Differences


Serial port 1 references in Section 8.2 are not applicable to the TMS320C31.
The memory locations identified for the associated control registers and buff-
ers are reserved.

2.10.3 Reserved Memory Locations


Table 2–5 identifies TMS320C31 reserved memory locations in addition to
those shown in Figure 3–8 on page 3-16.


Table 2–5. TMS320C31 Reserved Memory Locations


Address TMS320C31 TMS320C30
0x000000–0x000FFF Reserved† Microcomputer program/data ROM mode†

0x800000–0x801FFF Reserved Expansion bus MSTRB space

0x804000–0x805FFF Reserved Expansion bus IOSTRB space

0x808050 Reserved SP1 global-control register

0x808052–0x808056 Reserved SP1 local-control registers

0x808058 Reserved SP1 data-transmit buffer

0x80805C Reserved SP1 data-receive buffer

0x808060 Reserved Expansion bus control register


† Applies to the MCBL and MC modes only.
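
A development tool or test harness can check addresses against Table 2–5 before an access is made. The following C sketch encodes only the ranges listed in the table; it does not cover the additional reserved locations shown in Figure 3–8. The function name c31_is_reserved and the rom_region_applies parameter are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if addr falls in a range that Table 2-5 lists as
   reserved on the TMS320C31. rom_region_applies selects whether the
   0x000000-0x000FFF row is counted; the table footnote limits that
   row to the MCBL and MC modes. */
bool c31_is_reserved(uint32_t addr, bool rom_region_applies)
{
    addr &= 0xFFFFFFu;                       /* 24-bit address space */

    if (rom_region_applies && addr <= 0x000FFFu) return true;
    if (addr >= 0x800000u && addr <= 0x801FFFu) return true; /* MSTRB  */
    if (addr >= 0x804000u && addr <= 0x805FFFu) return true; /* IOSTRB */
    if (addr == 0x808050u) return true;      /* SP1 global control  */
    if (addr >= 0x808052u && addr <= 0x808056u) return true; /* SP1 local */
    if (addr == 0x808058u) return true;      /* SP1 data transmit   */
    if (addr == 0x80805Cu) return true;      /* SP1 data receive    */
    if (addr == 0x808060u) return true;      /* expansion bus ctrl  */
    return false;
}
```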

2.10.4 Effects on the IF and IE Interrupt Registers


The bits associated with serial port 1 in the IE (interrupt enable) register and
the IF (interrupt flag) register for the TMS320C30 are not applicable to the
TMS320C31. Write only logic 0 data to IE register bits 6, 7, 22, and 23 and to
IF register bits 6 and 7. Writing logic 1s to these bits produces unpredictable
results.
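
One way to follow this rule in code shared between the two devices is to mask the serial port 1 bits before a value is written to IE or IF. The sketch below assumes the bit positions given above; the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Serial port 1 bits that must be written as 0 on the TMS320C31. */
#define C31_IE_SP1_BITS ((1u << 6) | (1u << 7) | (1u << 22) | (1u << 23))
#define C31_IF_SP1_BITS ((1u << 6) | (1u << 7))

/* Clears the serial port 1 bits in a value destined for IE. */
uint32_t c31_safe_ie(uint32_t ie)
{
    return ie & ~C31_IE_SP1_BITS;
}

/* Clears the serial port 1 bits in a value destined for IF. */
uint32_t c31_safe_if(uint32_t iflags)
{
    return iflags & ~C31_IF_SP1_BITS;
}
```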

2.10.5 User Program/Data ROM


The user program/data ROM that is available for the TMS320C30 device does
not exist for the TMS320C31. Rather, the memory locations that were allo-
cated to support user program/data ROM operations have been reserved on
the TMS320C31 to support microcomputer/boot loader accessing. See
Chapter 3 for more information on using the microcomputer/boot loader func-
tion.

2.10.6 Development Considerations


If you are developing application code using a TMS320C3x simulator, XDS,
or ASM/LNK, TI recommends that you modify the .cfm and .cmd files by re-
moving these memory spaces from the tool’s configured memory. This
ensures that your developed application performs as expected when the
TMS320C31 device is used.


2.11 System Integration


In summary, the TMS320C3x is a powerful DSP system that integrates an in-
novative, high-performance CPU, two external interface ports, large memo-
ries, and efficient buses to support its speed. A single chip contains this sys-
tem, along with peripherals such as a DMA controller, two serial ports, and two
timers. The TMS320C3x system is truly an affordable single-chip solution.

Chapter 3

CPU Registers, Memory, and Cache

The central processing unit (CPU) register file contains 28 registers that can
be operated on by the multiplier and arithmetic logic unit (ALU). Included in the
register file are the auxiliary registers, extended-precision registers, and index
registers. The registers in the CPU register file support addressing, float-
ing-point/integer operations, stack management, processor status, block re-
peats, and interrupts.

The TMS320C3x provides a total memory space of 16M (million) 32-bit words
containing program, data, and I/O space. Two RAM blocks of 1K x 32 bits each
and a ROM block of 4K x 32 bits (available only on the TMS320C30) permit
two CPU accesses in a single cycle. The memory maps for the microcomputer
and microprocessor modes are similar, except that the on-chip ROM is not
used in the microprocessor mode.

A 64 x 32-bit instruction cache stores often-repeated sections of code. This
greatly reduces the number of off-chip accesses and allows code to be stored
off-chip in slower, lower-cost memories. Three bits in the CPU status register
control the clear, enable, or freeze of the cache.

This chapter describes in detail each of the CPU registers, the memory maps,
and the instruction cache. Major topics are as follows:

Topic

3.1 CPU Register File
3.2 Memory
3.3 Instruction Cache
3.4 Using the TMS320C31 Boot Loader


3.1 CPU Register File


The TMS320C3x provides 28 registers in a multiport register file that is tightly
coupled to the CPU. The program counter (PC) is not included in the 28 regis-
ters. All of these registers can be operated on by the multiplier and the ALU
and can be used as general-purpose 32-bit registers. However, the registers
also have some special functions for which they are particularly appropriate.
For example, the eight extended-precision registers are especially suited for
maintaining extended-precision floating-point results. The eight auxiliary reg-
isters support a variety of indirect addressing modes and can be used as gen-
eral-purpose 32-bit integer and logical registers. The remaining registers pro-
vide system functions, such as addressing, stack management, processor
status, interrupts, and block repeat. Refer to Chapter 5 for detailed information
and examples of the use of CPU registers in addressing.

Table 3–1 lists the register names and assigned functions.

Table 3–1. CPU Registers

Register Name Assigned Function


R0 Extended-precision register 0
R1 Extended-precision register 1
R2 Extended-precision register 2
R3 Extended-precision register 3
R4 Extended-precision register 4
R5 Extended-precision register 5
R6 Extended-precision register 6
R7 Extended-precision register 7
AR0 Auxiliary register 0
AR1 Auxiliary register 1
AR2 Auxiliary register 2
AR3 Auxiliary register 3
AR4 Auxiliary register 4
AR5 Auxiliary register 5
AR6 Auxiliary register 6
AR7 Auxiliary register 7
DP Data-page pointer
IR0 Index register 0
IR1 Index register 1
BK Block-size register
SP System stack pointer
ST Status register
IE CPU/DMA interrupt enable
IF CPU interrupt flags
IOF I/O flags
RS Repeat start address
RE Repeat end address
RC Repeat counter


3.1.1 Extended-Precision Registers (R7–R0)


The eight extended-precision registers (R7–R0) are capable of storing and
supporting operations on 32-bit integer and 40-bit floating-point numbers.
These registers consist of two separate and distinct regions:
- bits 39–32: dedicated to storage of the exponent (e) of the floating-point
number.
- bits 31–0: store the mantissa of the floating-point number:
  - bit 31: sign bit (s)
  - bits 30–0: the fraction (f)
Any instruction that assumes the operands are floating-point numbers uses
bits 39–0. Figure 3–1 illustrates the storage of 40-bit floating-point numbers
in the extended-precision registers.

Figure 3–1. Extended-Precision Register Floating-Point Format


39 32 31 30 0

e s fraction (f)

mantissa
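
The field boundaries in Figure 3–1 can be expressed as shifts and masks. The following C sketch holds a 40-bit register image in the low bits of a 64-bit integer, which is an assumption made for illustration. It extracts raw bit patterns only and does not convert them to a numeric value. The type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint8_t  exponent;  /* bits 39-32, raw bit pattern */
    uint8_t  sign;      /* bit 31 */
    uint32_t fraction;  /* bits 30-0 */
} ext_fields;

/* Splits a 40-bit extended-precision register image into its fields. */
ext_fields ext_unpack(uint64_t reg40)
{
    ext_fields f;
    f.exponent = (uint8_t)((reg40 >> 32) & 0xFFu);
    f.sign     = (uint8_t)((reg40 >> 31) & 1u);
    f.fraction = (uint32_t)(reg40 & 0x7FFFFFFFu);
    return f;
}

/* Integer write: only bits 31-0 change; bits 39-32 are preserved,
   as described for integer operations on these registers. */
uint64_t ext_write_int(uint64_t reg40, uint32_t value)
{
    return (reg40 & 0xFF00000000ull) | value;
}
```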

For integer operations, bits 31–0 of the extended-precision registers contain
the integer (signed or unsigned). Any instruction that assumes the operands
are either signed or unsigned integers uses only bits 31–0. Bits 39–32 remain
unchanged. This is true for all shift operations. The storage of 32-bit integers
in the extended-precision registers is shown in Figure 3–2.

Figure 3–2. Extended-Precision Register Integer Format

39 32 31 0

unchanged signed or unsigned integer

3.1.2 Auxiliary Registers (AR7–AR0)


The eight 32-bit auxiliary registers (AR7–AR0) can be accessed by the CPU
and modified by the two Auxiliary Register Arithmetic Units (ARAUs). The pri-
mary function of the auxiliary registers is the generation of 24-bit addresses.
However, they can also be used as loop counters in indirect addressing or as
32-bit general-purpose registers that can be modified by the multiplier and
ALU. Refer to Chapter 5 for detailed information and examples of the use of
auxiliary registers in addressing.


3.1.3 Data-Page Pointer (DP)


The data-page pointer (DP) is a 32-bit register that is loaded using the LDP
instruction. The eight LSBs of the data-page pointer are used by the direct ad-
dressing mode as a pointer to the page of data being addressed. Data pages
are 64K words long, with a total of 256 pages. Bits 31–8 are reserved; you
should always keep these set to 0 (cleared).
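
The page arithmetic works out as 256 pages x 64K words = 16M words, which is the full 24-bit address space. The C sketch below shows how a direct address could be formed from the DP and an offset within the page. It assumes that the 16-bit offset is supplied by the instruction (see Chapter 5); the function name direct_address is hypothetical.

```c
#include <stdint.h>

/* Forms a 24-bit direct address. The eight LSBs of DP select one of
   256 pages; the 16-bit offset selects a word within the 64K-word
   page. Bits 31-8 of DP are ignored here. */
uint32_t direct_address(uint32_t dp, uint16_t offset)
{
    return ((dp & 0xFFu) << 16) | offset;
}
```

With DP = 80h and an offset of 8050h, the sketch yields 808050h.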

3.1.4 Index Registers (IR0, IR1)


The 32-bit index registers (IR0 and IR1) are used by the ARAU for indexing
the address. Refer to Chapter 5 for detailed information and examples of the
use of index registers in addressing.

3.1.5 Block Size Register (BK)


The 32-bit block size register (BK) is used by the ARAU in circular addressing
to specify the data block size (see Section 5.3 on page 5-24).

3.1.6 System Stack Pointer (SP)


The system stack pointer (SP) is a 32-bit register that contains the address of
the top of the system stack. The SP always points to the last element pushed
onto the stack. The SP is manipulated by interrupts, traps, calls, returns, and
the PUSH, PUSHF, POP, and POPF instructions. Pushes and pops of the
stack perform preincrement and postdecrement, respectively, on all 32 bits of
the stack pointer. However, only the 24 LSBs are used as an address. Refer
to Section 5.5 on page 5-31 for information about system stack management.
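
The push and pop behavior described above can be sketched in C as follows. This is a model for illustration only: the stack_model type, the word-addressed memory array, and the function names are hypothetical. It shows that a push increments the SP before the store, a pop decrements it after the load, and only the 24 LSBs of the SP form the address.

```c
#include <stdint.h>

#define ADDR_MASK 0xFFFFFFu   /* only the 24 LSBs address memory */

typedef struct {
    uint32_t  sp;       /* all 32 bits are incremented and decremented */
    uint32_t *memory;   /* word-addressed memory model (hypothetical)  */
} stack_model;

/* Push: preincrement SP, then store at the new top of stack. */
void push(stack_model *s, uint32_t value)
{
    s->sp += 1;
    s->memory[s->sp & ADDR_MASK] = value;
}

/* Pop: load from the top of stack, then postdecrement SP. */
uint32_t pop(stack_model *s)
{
    uint32_t value = s->memory[s->sp & ADDR_MASK];
    s->sp -= 1;
    return value;
}
```

In this model the stack grows toward higher addresses, and the SP always points to the last element pushed.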

3.1.7 Status Register (ST)


The status register (ST) contains global information relating to the state of the
CPU. Operations usually set the condition flags of the status register accord-
ing to whether the result is 0, negative, etc. This includes register load and
store operations as well as arithmetic and logical functions. When the status
register is loaded, however, the contents of the source operand replace the
current contents bit-for-bit, regardless of the state of any bits in the source op-
erand. Therefore, following a load, the contents of the status register are iden-
tically equal to the contents of the source operand. This allows the status regis-
ter to be saved easily and restored. At system reset, 0 is written to this register.


Figure 3–3 shows the format of the status register. Table 3–2 defines the sta-
tus register bits, their names, and their functions.

Figure 3–3. Status Register


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
xx xx GIE CC CE CF xx RM OVM LUF LV UF N Z V C
R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W

Notes: 1) xx = reserved bit, read as 0


2) R = read, W = write


Table 3–2. Status Register Bits Summary

Bit Name Reset Value Function


0† C 0 Carry flag

1† V 0 Overflow flag

2† Z 0 Zero flag

3† N 0 Negative flag

4† UF 0 Floating-point underflow flag

5† LV 0 Latched overflow flag

6† LUF 0 Latched floating-point underflow flag

7 OVM 0 Overflow mode flag. This flag affects only the integer operations. If OVM
= 0, the overflow mode is turned off; integer results that overflow are
treated in no special way. If OVM = 1,
a) integer results overflowing in the positive direction are set to the
most positive 32-bit twos-complement number (7FFFFFFFh), and
b) integer results overflowing in the negative direction are set to the
most negative 32-bit twos-complement number (80000000h).
Note that the function of V and LV is independent of the setting of OVM.

8 RM 0 Repeat mode flag. If RM = 1, the PC is being modified in either the
repeat-block or repeat-single mode.

9 Reserved 0 Read as 0

10 CF 0 Cache freeze. When CF = 1, the cache is frozen. If the cache is enabled
(CE = 1), fetches from the cache are allowed, but no modification of the
state of the cache is performed. This function can be used to save
frequently used code resident in the cache. At reset, 0 is written to this bit.
Cache clearing (CC = 1) is allowed when CF = 0.

11 CE 0 Cache enable. CE = 1 enables the cache, allowing the cache to be used
according to the least recently used (LRU) cache algorithm. CE = 0
disables the cache; no update or modification of the cache can be
performed. No fetches are made from the cache. This function is useful for
system debugging. At system reset, 0 is written to this bit. Cache
clearing (CC = 1) is allowed when CE = 0.

12 CC 0 Cache clear. CC = 1 invalidates all entries in the cache. This bit is always
cleared after it is written to and thus always read as 0. At reset, 0 is writ-
ten to this bit.

13 GIE 0 Global interrupt enable. If GIE = 1, the CPU responds to an enabled in-
terrupt. If GIE = 0, the CPU does not respond to an enabled interrupt.

15–14 Reserved 0 Read as 0

31–16 Reserved 0–0 Value undefined

† The seven condition flags (ST bits 6–0) are defined in Section 10.2.
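
The bit positions in Table 3–2 translate directly into masks, and the OVM description can be modeled as a saturating add. The C sketch below is illustrative only. The macro names and the function add_with_ovm are hypothetical, and the sketch assumes that a result "treated in no special way" (OVM = 0) simply wraps to 32 bits. It does not model the V or LV flags.

```c
#include <stdint.h>

/* Status register bit masks, taken from Table 3-2. */
#define ST_C    (1u << 0)
#define ST_V    (1u << 1)
#define ST_Z    (1u << 2)
#define ST_N    (1u << 3)
#define ST_UF   (1u << 4)
#define ST_LV   (1u << 5)
#define ST_LUF  (1u << 6)
#define ST_OVM  (1u << 7)
#define ST_RM   (1u << 8)
#define ST_CF   (1u << 10)
#define ST_CE   (1u << 11)
#define ST_CC   (1u << 12)
#define ST_GIE  (1u << 13)

/* Integer add under the overflow mode. Returns the 32-bit result as
   a bit pattern. With OVM = 1 the result saturates at 7FFFFFFFh or
   80000000h; with OVM = 0 it is assumed to wrap. */
uint32_t add_with_ovm(int32_t a, int32_t b, uint32_t st)
{
    int64_t wide = (int64_t)a + (int64_t)b;

    if (st & ST_OVM) {
        if (wide > INT32_MAX) return 0x7FFFFFFFu;
        if (wide < INT32_MIN) return 0x80000000u;
    }
    return (uint32_t)wide;
}
```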


3.1.8 CPU/DMA Interrupt Enable Register (IE)


The CPU/DMA interrupt enable register (IE) is a 32-bit register (see
Figure 3–4). The CPU interrupt enable bits are in locations 10–0. The direct
memory access (DMA) interrupt enable bits are in locations 26–16. A 1 in a
CPU/DMA IE register bit enables the corresponding interrupt. A 0 disables the
corresponding interrupt. At reset, 0 is written to this register. Table 3–3 defines
the register bits, the bit names, and the bit functions.

Figure 3–4. CPU/DMA Interrupt Enable Register (IE)


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16

xx xx xx xx xx EDINT ETINT1 ETINT0 ERINT1 EXINT1 ERINT0 EXINT0 EINT3 EINT2 EINT1 EINT0
(DMA) (DMA) (DMA) (DMA) (DMA) (DMA) (DMA) (DMA) (DMA) (DMA) (DMA)
R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
xx xx xx xx xx EDINT ETINT1 ETINT0 ERINT1 EXINT1 ERINT0 EXINT0 EINT3 EINT2 EINT1 EINT0
(CPU) (CPU) (CPU) (CPU) (CPU) (CPU) (CPU) (CPU) (CPU) (CPU) (CPU)
R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W

Notes: 1) xx = reserved bit, read as 0


2) R = read, W = write


Table 3–3. IE Register Bits Summary


Bit Name Reset Value Function
0 EINT0 0 Enable external interrupt 0 (CPU)

1 EINT1 0 Enable external interrupt 1 (CPU)

2 EINT2 0 Enable external interrupt 2 (CPU)

3 EINT3 0 Enable external interrupt 3 (CPU)

4 EXINT0 0 Enable serial-port 0 transmit interrupt (CPU)

5 ERINT0 0 Enable serial-port 0 receive interrupt (CPU)

6 EXINT1 0 Enable serial-port 1 transmit interrupt (CPU)

7 ERINT1 0 Enable serial-port 1 receive interrupt (CPU)

8 ETINT0 0 Enable timer 0 interrupt (CPU)

9 ETINT1 0 Enable timer 1 interrupt (CPU)

10 EDINT 0 Enable DMA controller interrupt (CPU)

15–11 Reserved 0 Value undefined

16 EINT0 0 Enable external interrupt 0 (DMA)

17 EINT1 0 Enable external interrupt 1 (DMA)

18 EINT2 0 Enable external interrupt 2 (DMA)

19 EINT3 0 Enable external interrupt 3 (DMA)

20 EXINT0 0 Enable serial-port 0 transmit interrupt (DMA)

21 ERINT0 0 Enable serial-port 0 receive interrupt (DMA)

22 EXINT1 0 Enable serial-port 1 transmit interrupt (DMA)

23 ERINT1 0 Enable serial-port 1 receive interrupt (DMA)

24 ETINT0 0 Enable timer 0 interrupt (DMA)

25 ETINT1 0 Enable timer 1 interrupt (DMA)

26 EDINT 0 Enable DMA controller interrupt (DMA)

31–27 Reserved 0–0 Value undefined
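
Table 3–3 shows that each DMA enable bit sits 16 positions above the CPU enable bit for the same interrupt source. The C sketch below captures that relationship. The enum irq_source and the function names are hypothetical; the source numbering follows the CPU bit positions in the table.

```c
#include <stdint.h>

/* Interrupt sources, numbered by their CPU enable bit in IE. */
enum irq_source {
    IRQ_INT0 = 0, IRQ_INT1, IRQ_INT2, IRQ_INT3,
    IRQ_XINT0, IRQ_RINT0, IRQ_XINT1, IRQ_RINT1,
    IRQ_TINT0, IRQ_TINT1, IRQ_DINT
};

/* IE mask that enables the source as a CPU interrupt (bits 10-0). */
uint32_t ie_cpu_bit(enum irq_source s)
{
    return 1u << s;
}

/* IE mask that enables the source as a DMA interrupt (bits 26-16). */
uint32_t ie_dma_bit(enum irq_source s)
{
    return 1u << (s + 16);
}
```

For example, ie_cpu_bit(IRQ_TINT0) | ie_dma_bit(IRQ_INT2) builds a value that enables timer 0 for the CPU and external interrupt 2 for the DMA.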


3.1.9 CPU Interrupt Flag Register (IF)


Figure 3–5 shows the 32-bit CPU interrupt flag register (IF). A 1 in a CPU IF
register bit indicates that the corresponding interrupt is set. The IF bits are set
to 1 when an interrupt occurs. They may also be set to 1 through software to
cause an interrupt. A 0 indicates that the corresponding interrupt is not set. If
a 0 is written to an IF register bit, the corresponding interrupt is cleared. At re-
set, 0 is written to this register. Table 3–4 lists the bit fields, bit-field names, and
bit-field functions of the CPU IF register.

Figure 3–5. CPU Interrupt-Flag Register (IF)


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
xx xx xx xx xx DINT TINT1 TINT0 RINT1 XINT1 RINT0 XINT0 INT3 INT2 INT1 INT0
R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W

Notes: 1) xx = reserved bit, read as 0


2) R = read, W = write

Table 3–4. IF Register Bits Summary


Bit Name Reset Value Function
0 INT0 0 External interrupt 0 flag

1 INT1 0 External interrupt 1 flag

2 INT2 0 External interrupt 2 flag

3 INT3 0 External interrupt 3 flag

4 XINT0 0 Serial-port 0 transmit interrupt flag

5 RINT0 0 Serial-port 0 receive interrupt flag

6 XINT1† 0 Serial-port 1 transmit interrupt flag

7 RINT1† 0 Serial-port 1 receive interrupt flag

8 TINT0 0 Timer 0 interrupt flag

9 TINT1 0 Timer 1 interrupt flag

10 DINT 0 DMA channel interrupt flag

31–11 Reserved 0–0 Value undefined

† Reserved on TMS320C31


3.1.10 I/O Flags Register (IOF)


The I/O flags register (IOF) is shown in Figure 3–6 and controls the function
of the dedicated external pins, XF0 and XF1. These pins can be configured for
input or output. The pins can also be read from and written to. At reset, 0 is
written to this register. Table 3–5 shows the bit fields, bit-field names, and bit-
field functions.

Figure 3–6. I/O-Flag Register (IOF)


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
xx xx xx xx xx xx xx xx INXF1 OUTXF1 I/OXF1 xx INXF0 OUTXF0 I/OXF0 xx
R R/W R/W R R/W R/W

Notes: 1) xx = reserved bit, read as 0


2) R = read, W = write


Table 3–5. IOF Register Bits Summary


Bit Name Reset Value Function
0 Reserved 0 Read as 0

1 I/OXF0 0 If I/OXF0 = 0, XF0 is configured as a general-purpose input pin.
If I/OXF0 = 1, XF0 is configured as a general-purpose output pin.

2 OUTXF0 0 Data output on XF0

3 INXF0 0 Data input on XF0. A write has no effect.

4 Reserved 0 Read as 0

5 I/OXF1 0 If I/OXF1 = 0, XF1 is configured as a general-purpose input pin.
If I/OXF1 = 1, XF1 is configured as a general-purpose output pin.

6 OUTXF1 0 Data output on XF1

7 INXF1 0 Data input on XF1. A write has no effect.

31–8 Reserved 0–0 Read as 0
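
The bit assignments in Table 3–5 can be used to build IOF values. The C sketch below computes the value to write in order to drive XF0 as an output and tests the XF0 input bit. It only manipulates integers; how the value reaches the IOF register is outside the sketch. The macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* IOF bit masks, taken from Table 3-5. */
#define IOF_IOXF0   (1u << 1)
#define IOF_OUTXF0  (1u << 2)
#define IOF_INXF0   (1u << 3)
#define IOF_IOXF1   (1u << 5)
#define IOF_OUTXF1  (1u << 6)
#define IOF_INXF1   (1u << 7)

/* Bits marked R/W in Figure 3-6; all others are written as 0 here. */
#define IOF_WRITABLE (IOF_IOXF0 | IOF_OUTXF0 | IOF_IOXF1 | IOF_OUTXF1)

/* Returns an IOF value that configures XF0 as an output driving
   `level`, leaving the XF1 configuration bits as they were. */
uint32_t iof_drive_xf0(uint32_t iof, bool level)
{
    iof |= IOF_IOXF0;
    if (level)
        iof |= IOF_OUTXF0;
    else
        iof &= ~IOF_OUTXF0;
    return iof & IOF_WRITABLE;
}

/* Reads the XF0 input bit from an IOF value. */
bool iof_xf0_input(uint32_t iof)
{
    return (iof & IOF_INXF0) != 0;
}
```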

3.1.11 Repeat-Count (RC) and Block-Repeat Registers (RS, RE)

The 32-bit repeat start address register (RS) contains the starting address of
the block of program memory to be repeated when the CPU is operating in the
repeat mode.

The 32-bit repeat end address register (RE) contains the ending address of
the block of program memory to be repeated when the CPU is operating in the
repeat mode.

Note: RE < RS
If RE < RS, the block of program memory will not be repeated, and the code
will not loop backwards. However, the ST(RM) bit remains set to 1.

The repeat-count register (RC) is a 32-bit register used to specify the number
of times a block of code is to be repeated when a block repeat is performed.
If RC contains the number n, the loop is executed n + 1 times.
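
The off-by-one relationship between RC and the number of passes is easy to
get wrong. The following Python sketch (hypothetical helper names, for
illustration only) converts between the two, assuming only the n + 1 rule
stated above and the 32-bit width of RC.

```python
def block_repeat_passes(rc):
    """Number of times the block executes when RC holds `rc`."""
    return rc + 1


def rc_for_passes(passes):
    """RC value to load so that the block executes `passes` times."""
    if not 1 <= passes <= 1 << 32:
        raise ValueError("passes must be between 1 and 2**32")
    return passes - 1
```

To execute a block 10 times, for example, RC is loaded with 9.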

3.1.12 Program Counter (PC)

The PC is a 32-bit register containing the address of the next instruction to
be fetched. While the program counter register is not part of the CPU register
file, it can be modified by instructions that modify the program flow.


3.1.13 Reserved Bits and Compatibility


To retain compatibility with future members of the TMS320C3x family of
microprocessors, reserved bits that are read as 0 must be written as 0. A
reserved bit that has an undefined value must not have its current value
modified. In other cases, you should maintain the reserved bits as specified.
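
One way to honor both rules is a read-modify-write that touches only the
writable fields. The following Python sketch models that idea on plain
integers. The function name and its mask parameters are assumptions made for
illustration; the masks for a given register must be taken from that
register's bit summary table.

```python
def merge_register_write(current, new_fields, writable_mask, zero_reserved_mask=0):
    """Compute the value to write back to a 32-bit register.

    - Bits in writable_mask take their value from new_fields.
    - Bits in zero_reserved_mask (reserved, read as 0) are written as 0.
    - All other bits (for example, reserved bits with an undefined value)
      keep the value that was read in `current`.
    """
    result = (current & ~writable_mask) | (new_fields & writable_mask)
    result &= ~zero_reserved_mask
    return result & 0xFFFFFFFF
```

With writable_mask = 66h and zero_reserved_mask = 11h, a value read as
FFFF0000h and a request of 06h produces FFFF0006h: the upper bits are written
back unchanged.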


3.2 Memory
The TMS320C3x’s total memory space of 16M (million) 32-bit words contains
program, data, and I/O space, allowing tables, coefficients, program code, or
data to be stored in either RAM or ROM. In this way, you can maximize memory
usage and allocate memory space as desired.

RAM blocks 0 and 1 are each 1K x 32 bits. The ROM block is 4K x 32 bits. Each
on-chip RAM and ROM block is capable of supporting two CPU accesses in
a single cycle. The separate program buses, data buses, and DMA buses al-
low for parallel program fetches, data reads/writes, and DMA operations.
Chapter 9 covers this in detail.

3.2.1 TMS320C3x Memory Maps


The memory map depends on whether the processor is running in micropro-
cessor mode (MC/MP or MCBL/MP = 0) or microcomputer mode (MC/MP or
MCBL/MP = 1). The memory maps for these modes are similar (see
Figure 3–7). Locations 800000h through 801FFFh are mapped to the expan-
sion bus. When this region, available only on the TMS320C30, is accessed,
MSTRB is active. Locations 802000h through 803FFFh are reserved. Loca-
tions 804000h through 805FFFh are mapped to the expansion bus. When this
region, available only on the TMS320C30, is accessed, IOSTRB is active. Lo-
cations 806000h through 807FFFh are reserved. All of the memory-mapped
peripheral registers are in locations 808000h through 8097FFh. In both
modes, RAM block 0 is located at addresses 809800h through 809BFFh, and
RAM block 1 is located at addresses 809C00h through 809FFFh. Memory lo-
cations 80A000h through 0FFFFFFh are accessed over the primary external
memory port (STRB active).

In microprocessor mode, the 4K on-chip ROM (TMS320C30) or boot loader
(TMS320C31) is not mapped into the TMS320C3x memory map. As shown in
Figure 3–7, locations 0h through 03Fh consist of interrupt vector, trap
vector, and reserved locations, all of which are accessed over the primary
external memory port (STRB active). Interrupt and trap vector locations are
shown in Figure 3–9. Locations 040h–7FFFFFh and 80A000h–FFFFFFh are also
accessed over the primary external memory port.


In microcomputer mode, the 4K on-chip ROM (TMS320C30) or boot loader
(TMS320C31) is mapped into locations 0h through 0FFFh. There are 192
locations (0h through BFh) within this block for interrupt vectors, trap
vectors, and a reserved space. Locations 1000h–7FFFFFh are accessed over the
primary external memory port (STRB active).
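
The address ranges described above for the TMS320C30 can be collected into a
lookup table. The following Python sketch is a host-side illustration only;
the region labels and the function name c30_region are hypothetical, and the
ranges are copied from the preceding paragraphs.

```python
# Ranges common to both TMS320C30 modes (inclusive start and end addresses).
C30_COMMON = [
    (0x800000, 0x801FFF, "expansion bus (MSTRB)"),
    (0x802000, 0x803FFF, "reserved"),
    (0x804000, 0x805FFF, "expansion bus (IOSTRB)"),
    (0x806000, 0x807FFF, "reserved"),
    (0x808000, 0x8097FF, "peripheral registers"),
    (0x809800, 0x809BFF, "RAM block 0"),
    (0x809C00, 0x809FFF, "RAM block 1"),
    (0x80A000, 0xFFFFFF, "external (STRB)"),
]


def c30_region(addr, microcomputer):
    """Name the TMS320C30 memory region that contains a 24-bit address."""
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("address is outside the 16M-word memory space")
    if microcomputer:
        low = [(0x000000, 0x000FFF, "on-chip ROM"),
               (0x001000, 0x7FFFFF, "external (STRB)")]
    else:
        low = [(0x000000, 0x7FFFFF, "external (STRB)")]
    for start, end, name in low + C30_COMMON:
        if start <= addr <= end:
            return name
```

Only the lowest 4K words differ between the two modes: c30_region(0, False)
names the external port, and c30_region(0, True) names the on-chip ROM.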

Reserved Spaces
Do not read from or write to reserved portions of the TMS320C3x
memory space or reserved peripheral bus addresses. Doing so
might cause the TMS320C3x to halt operation and require a system
reset to restart.


Figure 3–7. TMS320C30 Memory Maps

(a) Microprocessor Mode

Addresses            Contents
0h–03Fh              Reset, interrupt, trap vector, and reserved
                     locations (64); external, STRB active
040h–7FFFFFh         External, STRB active
800000h–801FFFh      Expansion bus, MSTRB active (8K words)
802000h–803FFFh      Reserved (8K words)
804000h–805FFFh      Expansion bus, IOSTRB active (8K words)
806000h–807FFFh      Reserved (8K words)
808000h–8097FFh      Peripheral bus memory-mapped registers
                     (6K words internal)
809800h–809BFFh      RAM block 0 (1K word internal)
809C00h–809FFFh      RAM block 1 (1K word internal)
80A000h–0FFFFFFh     External, STRB active

(b) Microcomputer Mode

Addresses            Contents
0h–0BFh              Reset, interrupt, trap vector, and reserved
                     locations (192); ROM (internal)
0C0h–0FFFh           ROM (internal)
1000h–7FFFFFh        External, STRB active
800000h–801FFFh      Expansion bus, MSTRB active (8K words)
802000h–803FFFh      Reserved (8K words)
804000h–805FFFh      Expansion bus, IOSTRB active (8K words)
806000h–807FFFh      Reserved (8K words)
808000h–8097FFh      Peripheral bus memory-mapped registers
                     (6K words internal)
809800h–809BFFh      RAM block 0 (1K word internal)
809C00h–809FFFh      RAM block 1 (1K word internal)
80A000h–0FFFFFFh     External, STRB active


Figure 3–8. TMS320C31 Memory Maps

(a) Microprocessor Mode

Addresses            Contents
0h–03Fh              Reset, interrupt, trap vector, and reserved
                     locations (64); external, STRB active
040h–7FFFFFh         External, STRB active
800000h–807FFFh      Reserved (32K words)
808000h–8097FFh      Peripheral bus memory-mapped registers
                     (6K words internal)
809800h–809BFFh      RAM block 0 (1K word internal)
809C00h–809FFFh      RAM block 1 (1K word internal)
80A000h–FFFFFFh      External, STRB active

(b) Microcomputer/Boot Loader Mode

Addresses            Contents
0h–FFFh              Reserved for boot loader operations
                     (see Section 3.4)
1000h–7FFFFFh        External, STRB active
                     (Boot 1 starts at 1000h; Boot 2 starts at 400000h)
800000h–807FFFh      Reserved (32K words)
808000h–8097FFh      Peripheral bus memory-mapped registers
                     (6K words internal)
809800h–809BFFh      RAM block 0 (1K word internal)
809C00h–809FC0h      RAM block 1 (1K words minus 63, internal)
809FC1h–809FFFh      User program interrupt and trap branches
                     (63 words internal)
80A000h–FFFFFFh      External, STRB active
                     (Boot 3 starts at FFF000h)

Boot 1–3 locations are used by the boot-loader function. See Section 3.4 for
a complete description. All reserved memory locations are described in
Table 2–5 on page 2-31.


3.2.2 TMS320C31 Memory Maps


Setting the TMS320C31 MCBL/MP pin determines the mode in which the
TMS320C31 can function:
- Microprocessor mode (MCBL/MP = 0), or
- Microcomputer/boot loader mode (MCBL/MP = 1)

The major difference between these two modes is their memory maps (see
Figure 3–8). The program boot load feature is enabled when the MCBL/MP pin
is driven high during reset.

Figure 3–8 shows the memory locations (internal and external) used by the
boot loader to load the source program.

3.2.3 Reset/Interrupt/Trap Vector Map


The addresses for the reset, interrupt, and trap vectors are 00h–3Fh, as shown
in Figure 3–9. The reset vector contains the address of the reset routine.

Microprocessor and Microcomputer Modes


In the microprocessor mode of the TMS320C30 and TMS320C31 and the
microcomputer mode of the TMS320C30, the interrupt and trap vectors stored
in locations 0h–3Fh are the addresses of the starts of the respective interrupt
and trap routines. For example, at reset, the content of memory location 00h
(reset vector) is loaded into the PC, and execution begins from that address.
See Figure 3–9.

Microcomputer/Boot Loader Mode


In the microcomputer/boot loader mode of the TMS320C31, the interrupt and
trap vectors stored in locations 809FC1h–809FFFh are branch instructions to
the start of the respective interrupt and trap routines. See Figure 3–10.


Figure 3–9. Reset, Interrupt, and Trap-Vector Locations for the TMS320C30/TMS320C31
Microprocessor Mode

00h RESET
01h INT0
02h INT1
03h INT2
04h INT3
05h XINT0
06h RINT0
07h XINT1†
08h RINT1†
09h TINT0
0Ah TINT1
0Bh DINT
0Ch–1Fh RESERVED
20h TRAP 0
•
•
•
3Bh TRAP 27
3Ch TRAP 28 (Reserved)
3Dh TRAP 29 (Reserved)
3Eh TRAP 30 (Reserved)
3Fh TRAP 31 (Reserved)

† Reserved on TMS320C31

Note: Traps 28–31


Traps 28–31 are reserved; do not use them.


Figure 3–10. Interrupt and Trap Branch Instructions for the TMS320C31 Microcomputer
Mode

809FC1h INT0
809FC2h INT1
809FC3h INT2
809FC4h INT3
809FC5h XINT0
809FC6h RINT0
809FC7h XINT1 (reserved on TMS320C31; see Table 3–9)
809FC8h RINT1 (reserved on TMS320C31; see Table 3–9)
809FC9h TINT0
809FCAh TINT1
809FCBh DINT
809FCCh–809FDFh RESERVED
809FE0h TRAP0
809FE1h TRAP1
•
•
•
809FFBh TRAP27
809FFCh TRAP28 (Reserved)
809FFDh TRAP29 (Reserved)
809FFEh TRAP30 (Reserved)
809FFFh TRAP31 (Reserved)

Note: Traps 28–31


Traps 28–31 are reserved; do not use them.
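
Comparing Figure 3–9 with Figure 3–10 shows that each branch location in the
TMS320C31 microcomputer mode sits at 809FC0h plus the corresponding vector
location in microprocessor mode (for example, INT0 at 01h and 809FC1h,
TRAP 0 at 20h and 809FE0h). The following Python sketch captures that
relationship. The constant and function names are hypothetical, and the base
address 809FC0h is an inference from the two figures rather than a value
stated in the text.

```python
# Vector locations in microprocessor mode, from Figure 3-9.
INTERRUPT_OFFSETS = {
    "INT0": 0x01, "INT1": 0x02, "INT2": 0x03, "INT3": 0x04,
    "XINT0": 0x05, "RINT0": 0x06, "XINT1": 0x07, "RINT1": 0x08,
    "TINT0": 0x09, "TINT1": 0x0A, "DINT": 0x0B,
}

# Inferred from Figures 3-9 and 3-10: 809FC1h - 01h.
MCBL_BRANCH_BASE = 0x809FC0


def trap_offset(n):
    """Vector location of TRAP n in microprocessor mode (traps 0-27 only)."""
    if not 0 <= n <= 27:
        raise ValueError("traps 28-31 are reserved")
    return 0x20 + n


def mcbl_branch_address(offset):
    """Branch location used in TMS320C31 microcomputer/boot loader mode."""
    if not 0x01 <= offset <= 0x3F:
        raise ValueError("no branch location for this vector offset")
    return MCBL_BRANCH_BASE + offset
```

For example, mcbl_branch_address(trap_offset(27)) gives 809FFBh, which
matches the TRAP27 entry in Figure 3–10.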


3.2.4 Peripheral Bus Map


The memory-mapped peripheral registers are located starting at address
808000h. The peripheral bus memory map is shown in Figure 3–11. Each pe-
ripheral occupies a 16-word region of the memory map. Locations 808010h
through 80801Fh and locations 808070h through 8097FFh are reserved.

Figure 3–11. Peripheral Bus Memory Map


Addresses            Contents
808000h–80800Fh      DMA controller registers (16)
808010h–80801Fh      Reserved (16)
808020h–80802Fh      Timer 0 registers (16)
808030h–80803Fh      Timer 1 registers (16)
808040h–80804Fh      Serial-port 0 registers (16)
808050h–80805Fh      Serial-port 1 registers† (16)
808060h–80806Fh      Primary and expansion port registers (16)
808070h–8097FFh      Reserved

† Reserved on TMS320C31
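
Because each peripheral occupies a 16-word region, a peripheral-bus address
splits into a region number and a word offset. The following Python sketch
shows the decoding on a host machine. The table and function names are
hypothetical; the region assignments come from Figure 3–11.

```python
PERIPHERAL_BASE = 0x808000
PERIPHERAL_END = 0x8097FF

# 16-word region number -> peripheral, from Figure 3-11.
PERIPHERALS = {
    0x0: "DMA controller",
    0x2: "timer 0",
    0x3: "timer 1",
    0x4: "serial port 0",
    0x5: "serial port 1",   # reserved on the TMS320C31
    0x6: "primary and expansion port",
}


def decode_peripheral(addr):
    """Return (peripheral name, word offset within its 16-word region)."""
    if not PERIPHERAL_BASE <= addr <= PERIPHERAL_END:
        raise ValueError("not a peripheral-bus address")
    region, offset = divmod(addr - PERIPHERAL_BASE, 16)
    return PERIPHERALS.get(region, "reserved"), offset
```

For example, address 80804Fh decodes to the last word (offset 15) of the
serial-port 0 region.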


3.3 Instruction Cache


A 64 × 32-bit instruction cache facilitates maximum system performance by
storing sections of code that can be fetched when the device repeatedly ac-
cesses time-critical code. This reduces the number of off-chip accesses nec-
essary and allows code to be stored off-chip in slower, lower-cost memories.
The cache also frees external buses from program fetches so that they can be
used by the DMA or other system elements.

The cache can operate automatically, with no user intervention. Subsection
3.3.2 describes a form of the least recently used (LRU) cache update
algorithm.

3.3.1 Cache Architecture


The instruction cache (see Figure 3–12) contains 64 32-bit words of RAM; it
is divided into two 32-word segments. Associated with each segment is a
19-bit segment start address (SSA) register. For each word in the cache, there
is a corresponding single-bit present (P) flag.


Figure 3–12. Instruction Cache Architecture


Segment 0:  SSA register 0 (19 bits); P flags 0–31; segment words 0–31
            (32 bits each)
Segment 1:  SSA register 1 (19 bits); P flags 0–31; segment words 0–31
            (32 bits each)
LRU stack:  most recently used segment number (top);
            least recently used segment number (bottom)

When the CPU requests an instruction word from external memory, the cache
algorithm checks to determine whether the word is already contained in the
instruction cache. Figure 3–13 shows the partitioning of an instruction
address as used by the cache control algorithm. The algorithm uses the 19 most
significant bits (MSBs) of the instruction address to select the segment; the
five least significant bits (LSBs) define the address of the instruction word
within the pertinent segment. The algorithm compares the 19 MSBs of the
instruction address with the two SSA registers. If there is a match, the
algorithm checks the relevant P flag. The P flag indicates whether a word
within a particular segment is already present in cache memory.

Figure 3–13. Address Partitioning for Cache Control Algorithm


Bits 23–5:  segment start address (SSA)
Bits 4–0:   instruction word address within segment
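
The partitioning in Figure 3–13 amounts to a shift and a mask. The following
Python sketch (the function name is hypothetical) splits a 24-bit instruction
address into the 19-bit value compared against the SSA registers and the
5-bit word index within the segment.

```python
def split_instruction_address(addr):
    """Return (19-bit segment start address, 5-bit word index)."""
    addr &= 0xFFFFFF          # instruction addresses are 24 bits wide
    return addr >> 5, addr & 0x1F
```

For example, address 001234h splits into segment start address 91h and word
index 14h.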

If there is no match, one of the segments must be replaced by the new data.
The segment replaced in this circumstance is determined by the LRU algo-
rithm. The LRU stack (see Figure 3–12) is maintained for this purpose.


The LRU stack determines which of the two segments qualifies as the least
recently used after each access to the cache; therefore, the stack contains ei-
ther 0,1 or 1,0. Each time a segment is accessed, its segment number is re-
moved from the LRU stack and pushed onto the top of the LRU stack. There-
fore, the number at the top of the stack is the most recently used segment num-
ber, and the number at the bottom of the stack is the least recently used seg-
ment number.

At system reset, the LRU stack is initialized with 0 at the top and 1 at the bot-
tom. All P flags in the instruction cache are cleared.

When a replacement is necessary, the least recently used segment is selected
for replacement. Also, the 32 P flags for the segment to be replaced are set
to 0, and the segment’s SSA register is replaced with the 19 MSBs of the
instruction address.

3.3.2 Cache Algorithm

When the TMS320C3x requests an instruction word from external memory,
one of two possible actions occurs: a cache hit or a cache miss.

- Cache Hit. The cache contains the requested instruction, and the
  following actions occur:

  1) The instruction word is read from the cache.

  2) The number of the segment containing the word is removed from the
     LRU stack and pushed to the top of the LRU stack, thus moving the
     other segment number to the bottom of the stack.

- Cache Miss. The cache does not contain the instruction. Following are
  the types of cache miss:

  - Word miss. The segment address register matches the instruction
    address, but the relevant P flag is not set. The following actions
    occur in parallel:

    - The instruction word is read from memory and copied into the
      cache.

    - The number of the segment containing the word is removed from
      the LRU stack and pushed to the top of the LRU stack, thus
      moving the other segment number to the bottom of the stack.

    - The relevant P flag is set.

  - Segment miss. Neither of the segment addresses matches the
    instruction address. The following actions occur in parallel:

    - The least recently used segment is selected for replacement. The
      P flags for all 32 words are cleared.

    - The SSA register for the selected segment is loaded with the 19
      MSBs of the address of the requested instruction word.

    - The instruction word is fetched and copied into the cache. It goes
      into the appropriate word of the least recently used segment. The
      P flag for that word is set to 1.

    - The number of the segment containing the instruction word is
      removed from the LRU stack and pushed to the top of the LRU
      stack, thus moving the other segment number to the bottom of the
      stack.

Only instructions may be fetched from the program cache. All reads and writes
of data in memory bypass the cache. Program fetches from internal memory
do not modify the cache and do not generate cache hits or misses. The
program cache is a single-access memory block. Dummy program fetches (i.e.,
following a branch) are treated by the cache as valid program fetches and can
generate cache misses and cache updates.

Take care when using self-modifying code. If an instruction resides in cache
and the corresponding location in primary memory is modified, the copy of the
instruction in cache is not modified.

You can use the cache more efficiently by aligning program code on 32-word
address boundaries. Do this with the ALIGN directive when coding assembly
language.
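
The hit, word-miss, and segment-miss cases above can be traced with a small
behavioral model. The following Python sketch is a simplified illustration,
not a description of the hardware: the class name is hypothetical, it tracks
only the SSA registers, P flags, and LRU stack (no instruction data), it
assumes the cache is enabled and not frozen, and it represents the SSA
registers at reset as None, which is a modeling choice.

```python
class InstructionCache:
    """Behavioral model of the two-segment cache update algorithm."""

    def __init__(self):
        self.ssa = [None, None]                        # None: modeling choice
        self.present = [[False] * 32 for _ in range(2)]
        self.lru = [0, 1]                              # [top (MRU), bottom (LRU)]

    def _touch(self, segment):
        # Remove the segment number and push it onto the top of the stack.
        self.lru.remove(segment)
        self.lru.insert(0, segment)

    def fetch(self, addr):
        """Record a fetch from external memory; return the outcome as text."""
        tag, word = (addr & 0xFFFFFF) >> 5, addr & 0x1F
        if tag in self.ssa:
            segment = self.ssa.index(tag)
            if self.present[segment][word]:
                outcome = "hit"
            else:
                self.present[segment][word] = True     # word is now cached
                outcome = "word miss"
        else:
            segment = self.lru[-1]                     # least recently used
            self.ssa[segment] = tag
            self.present[segment] = [False] * 32       # clear all 32 P flags
            self.present[segment][word] = True
            outcome = "segment miss"
        self._touch(segment)
        return outcome
```

Fetching 1000h twice in this model gives a segment miss followed by a hit;
fetching 1001h afterward gives a word miss, because it lies in the same
32-word segment but its P flag is still clear.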

3.3.3 Cache Control Bits


Three cache control bits are located in the CPU status register:
- Cache Clear Bit (CC). Writing a 1 to the cache clear bit (CC) invalidates
all entries in the cache. All P flags in the cache are cleared. The CC bit is
always cleared after the cache is cleared. It is therefore always read as a
0. At reset, the cache is cleared and 0 is written to this bit.
- Cache Enable Bit (CE). Writing a 1 to this bit enables the cache. When
enabled, the cache is used according to the previously described cache
algorithm. Writing a 0 to the cache enable bit disables the cache; no up-
dates or modification of the cache can be performed. Specifically, no SSA
register updates are performed, no P flags are modified (unless CC = 1),
and the LRU stack is not modified. Writing a 1 to CC when the cache is
disabled clears the cache, and, thus, the P flags. No fetches are made
from the cache when the cache is disabled. At reset, 0 is written to this bit.


- Cache Freeze Bit (CF). When CF = 1, the cache is frozen. If, in addition,
the cache is enabled, fetches from the cache are allowed, but no modifica-
tion of the state of the cache is performed. Specifically, no SSA register
updates are performed, no P flags are modified (unless CC = 1), and the
LRU stack is not modified. You can use this function to keep frequently
used code resident in the cache. Writing a 1 to CC when the cache is fro-
zen clears the cache, and, thus, the P flags. At reset, 0 is written to this bit.

Table 3–6 defines the effect of the CE and CF bits used in combination.

Table 3–6. Combined Effect of the CE and CF Bits


CE CF Effect
0 0 Cache not enabled

0 1 Cache not enabled

1 0 Cache enabled and not frozen

1 1 Cache enabled and frozen
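
Table 3–6 and the bit descriptions above reduce to two questions: can the CPU
fetch from the cache, and can the cache state be updated by fetches? The
following Python sketch (the function name is hypothetical) answers both for
a given CE and CF setting. It does not model the CC bit, which clears the
cache regardless of CE and CF.

```python
def cache_permissions(ce, cf):
    """Return (fetch_from_cache_allowed, cache_update_allowed)."""
    enabled = bool(ce)
    frozen = bool(cf)
    # Disabled: no fetches and no updates. Frozen: fetches only.
    return enabled, enabled and not frozen
```

Only CE = 1 with CF = 0 allows both fetches and updates; CE = 1 with CF = 1
allows fetches from a cache whose contents no longer change.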


3.4 Using the TMS320C31 Boot Loader


This section describes how to use the TMS320C31 microcomputer/boot loader
(MCBL/MP) function. This feature is unique to the TMS320C31 and is not
available on the TMS320C30 devices. The source code for the boot loader is
supplied in Appendix G.

3.4.1 Boot-Loader Operations


The boot loader lets you load and execute programs that are received from a
host processor, inexpensive EPROMs, or other standard memory devices.
The programs to be loaded either reside in one of three memory mapped areas
identified as Boot 1, Boot 2, and Boot 3 (see the shaded areas of Figure 3–8),
or they are received by means of the serial port.

User-definable byte, half-word, and word-data formats, as well as 32-bit fixed
burst loads from the TMS320C31 serial port, are supported. See Section 8.2
on page 8-13 for a detailed description of the serial-port operation.

3.4.2 Invoking the Boot Loader


The boot-loader function is selected by resetting the processor while driving
the MCBL/MP pin high. Use interrupt pins INT3 – INT0 to set the mode of the
boot load operation. Figure 3–14 shows the flow of this operation, which de-
pends on the mode selected (external memory or serial boot). Figure 3–15
shows memory load operations; Figure 3–16 shows serial port load opera-
tions.


Figure 3–14. Boot-Loader-Mode Selection Flowchart

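Begin: reset with MCBL/MP = 1. The register bits are then tested in the
following order:

1) Is register bit INT3 set? Yes: serial port load. No: go to step 2.
2) Is register bit INT0 set? Yes: memory load from 1000h. No: go to step 3.
3) Is register bit INT1 set? Yes: memory load from 400000h. No: go to step 4.
4) Is register bit INT2 set? Yes: memory load from FFF000h.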


Figure 3–15. Boot-Loader Memory-Load Flowchart

Memory load:

1) Branch to address Boot 1, Boot 2, or Boot 3.
2) Determine mode: 8, 16, or 32?
3) Set memory configuration control word.
4) Load block size.
5) Block size = 0? Yes: go to step 9. No: continue.
6) Load destination address.
7) Block size = 0? Yes: load block size and return to step 5.
   No: continue.
8) Transfer data from source to destination; block size – 1; return to
   step 7.
9) Branch to destination address of first block loaded; begin program
   execution.


Figure 3–16. Boot-Loader Serial-Port Load-Mode Flowchart

Serial port load:

1) Set up serial port for 32-bit fixed burst mode.
2) Wait for serial port input; load block size.
3) Block size = 0? Yes: go to step 7. No: continue.
4) Wait for serial port input; load destination address.
5) Block size = 0? Yes: wait for serial port input, load block size, and
   return to step 3. No: continue.
6) Wait for serial port input; transfer data from serial port to
   destination address; block size – 1; return to step 5.
7) Branch to destination address of first block loaded; begin program
   execution.

3.4.3 Mode Selection

After reset, the loader mode is determined by polling the status of the
INT3–INT0 bits of the IF register. The bits are polled in the order described in
the flowchart in Figure 3–14 on page 3-27. Table 3–7 lists the mode options
and the interrupt that you can use to set the particular mode. The interrupt
can be driven any time after the RESET pin has been deasserted. The boot mode
is guaranteed only when exactly one of the interrupt flag bits (INT0, INT1,
INT2, or INT3) is set.


Table 3–7. Loader Mode Selection


Active Interrupt Loader Mode Memory Addresses
INT0 External memory Boot 1 address 0x001000

INT1 External memory Boot 2 address 0x400000

INT2 External memory Boot 3 address 0xFFF000

INT3 32-bit serial Serial port 0
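
The polling order of Figure 3–14 combined with Table 3–7 can be expressed as
an ordered lookup. The following Python sketch is illustrative only: the
names are hypothetical, and the interrupt flags are passed in as a set of
names rather than read from the IF register.

```python
# (flag name, loader mode, boot address) in the polling order of Figure 3-14.
BOOT_POLL_ORDER = [
    ("INT3", "serial port load", None),
    ("INT0", "memory load", 0x001000),   # Boot 1
    ("INT1", "memory load", 0x400000),   # Boot 2
    ("INT2", "memory load", 0xFFF000),   # Boot 3
]


def select_boot_mode(flags_set):
    """Return (mode, address) for the first flag found set, else None.

    The manual guarantees the boot mode only when exactly one flag is set;
    this model simply reports the first match in polling order.
    """
    for name, mode, address in BOOT_POLL_ORDER:
        if name in flags_set:
            return mode, address
    return None
```

For example, select_boot_mode({"INT1"}) returns a memory load from 400000h.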

3.4.4 External Memory Loading


Table 3–8 shows and describes the information that you must specify to define
boot memory organization (8, 16, or 32 bits), the code block size, the load des-
tination address, and memory access timing control for the boot memory. You
must specify this information before a source program can be externally
loaded.

This information must be specified in the first four locations of the Boot 1,
Boot 2, or Boot 3 area. The header is followed by the data or program code,
whose length in words equals the block size.

Table 3–8. External Memory Loader Header


Location  Description                      Valid Data Entries
0         Boot memory type (8, 16, or 32)  0x8, 0x10, or 0x20 specified as a
                                           32-bit number
1         Boot memory configuration        See Chapter 7 for valid bus-control
          (defined # of wait states,       register entries.
          etc.)
2         Program block size (blk)         Any value $0 < blk < 2^{24}$
3         Destination address              Any valid TMS320C31 24-bit address
4         Program code starts here         Any 32-bit data value or valid
                                           TMS320C3x instruction

The loader fetches 32 bits of data for each specified location, regardless of
what memory configuration width is specified. In 8-bit-wide and 16-bit-wide
memory, each 32 bits of information must be stored beginning with the byte or
half-word of least significance.

3.4.5 Examples of External Memory Loads


Example 3–1, Example 3–2, and Example 3–3 show memory images for
byte-wide, 16-bit-wide, and 32-bit-wide configured memory.


These examples assume the following:

- An INT0 signal was detected after reset was deasserted (signifying an ex-
ternal memory load from Boot 1).

- The loader header resides at memory location 0x1000 and defines the
  following:

  - Boot memory type EPROMs that require two wait states and SWW = 11,
  - A loader destination address at the beginning of the TMS320C31’s
    internal RAM Block 1, and
  - A single block of memory that is 0x1FF in length.

Example 3–1. Byte-Wide Configured Memory


Address Value Comments
0x1000 0x08 Memory width = 8 bits

0x1001 0x00

0x1002 0x00

0x1003 0x00

0x1004 0x58 Memory type = SWW = 11, WCNT = 2

0x1005 0x10

0x1006 0x00

0x1007 0x00

0x1008 0xFF Program code size = 0x1FF

0x1009 0x01

0x100A 0x00

0x100B 0x00

0x100C 0x00 Program load starting address = 0x809C00

0x100D 0x9C

0x100E 0x80

0x100F 0x00


Example 3–2. 16-Bit-Wide Configured Memory

Address  Value    Comments
0x1000   0x0010   Memory width = 16
0x1001   0x0000
0x1002   0x1058   Memory type = SWW = 11, WCNT = 2
0x1003   0x0000
0x1004   0x01FF   Program code size = 0x1FF
0x1005   0x0000
0x1006   0x9C00   Program load starting address = 0x809C00
0x1007   0x0080

Example 3–3. 32-Bit-Wide Configured Memory

Address Value Comments


0x1000 0x00000020 Memory width = 32

0x1001 0x00001058 Memory type = SWW = 11, WCNT = 2

0x1002 0x000001FF Program code size = 0x1FF

0x1003 0x00809C00 Program load starting address = 0x809C00
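
The three memory images above follow one rule: each 32-bit header value is
split into memory-width units, least significant unit first. The following
Python sketch reproduces the headers of Examples 3–1 through 3–3 on a host
machine. The function names are hypothetical and are not part of any TI
utility.

```python
def split_word(value, width):
    """Split a 32-bit value into width-bit units, least significant first."""
    if width not in (8, 16, 32):
        raise ValueError("boot memory width must be 8, 16, or 32")
    mask = (1 << width) - 1
    return [(value >> shift) & mask for shift in range(0, 32, width)]


def boot_header(width, config, block_size, dest):
    """Memory image of the four-entry loader header for the given width."""
    units = []
    for value in (width, config, block_size, dest):
        units.extend(split_word(value, width))
    return units
```

Calling boot_header(8, 0x1058, 0x1FF, 0x809C00) returns the 16 byte values
listed in Example 3–1, starting with 0x08, 0x00, 0x00, 0x00, 0x58, 0x10.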

After reading the header, the loader transfers blk 32-bit words to memory,
beginning at the specified destination address. Code blocks require the same
byte and half-word ordering conventions as the header. The loader can also
load multiple code blocks at different address destinations.

After loading all code blocks, the boot loader branches to the destination ad-
dress of the first block loaded and begins program execution. Consequently,
the first code block loaded should be a start-up routine to access the other
loaded programs.

Each code block has the following header:

1st location    BLK size
2nd location    Destination address

End the loader function and begin execution of the first code block by append-
ing the value of 0x00000000 to the last block.
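
The multiple-block convention can be checked with a host-side model of the
block loop. The following Python sketch is an illustration under stated
assumptions: the function name is hypothetical, the input is the sequence of
32-bit words that follows the memory-type and memory-configuration entries,
and target memory is modeled as a dictionary.

```python
def load_blocks(words):
    """Model the loader's block loop.

    `words` is: size, destination, data..., [size, destination, data...], 0.
    Returns (entry_address, memory), where entry_address is the destination
    of the first block and memory maps each address to the loaded word.
    """
    stream = iter(words)
    memory = {}
    entry = None
    while True:
        size = next(stream)
        if size == 0:                 # terminating block size ends the load
            break
        dest = next(stream)
        if entry is None:
            entry = dest              # execution starts at the first block
        for offset in range(size):
            memory[dest + offset] = next(stream)
    return entry, memory
```

A stream holding a two-word block for 809C00h, a one-word block for 809800h,
and a final 0 loads three words and reports 809C00h as the entry address.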


It is assumed that at least one block of code will be loaded when the
loader is invoked. Initial loader invocation with a block size of
0x00000000 produces unpredictable results.

3.4.6 Serial-Port Loading


Boot loads, by way of the TMS320C31 serial port, are selected by driving the
INT3 pin active (low) following reset. The loader automatically configures the
serial port for 32-bit fixed-burst-mode reads. It is interrupt-driven by the frame
synchronization receive (FSR) signal. You cannot change this mode for boot
loads. Your hardware must externally generate the serial-port clock and FSR.

As in parallel loading, a header must precede the actual program to be loaded.


However, you need only apply the block size and destination address because
the loader and your hardware have predefined serial-port speed and data for-
mat (i.e., skip data words 0 and 1 from Table 3–8).

The transferred data-bit order must begin with the MSB and end with the LSB.
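
A host that feeds the serial port must therefore send, for each block, the
block size, the destination address, and the data words, each as 32 bits with
the MSB first. The following Python sketch builds that word sequence and the
bit order of one word. The function names are hypothetical, and the trailing
0 word follows the block-size test shown in Figure 3–16.

```python
def serial_words(blocks):
    """Word sequence for a serial boot: blocks is a list of (dest, data)."""
    words = []
    for dest, data in blocks:
        words.append(len(data))       # block size
        words.append(dest)            # destination address
        words.extend(data)
    words.append(0)                   # block size of 0 ends the load
    return words


def serial_bits(word):
    """Bits of one 32-bit word in transmission order (MSB first)."""
    return [(word >> bit) & 1 for bit in range(31, -1, -1)]
```

For one block of two words destined for 809C00h, serial_words produces
2, 809C00h, the two data words, and 0.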

3.4.7 Interrupt and Trap-Vector Mapping


Unlike the microprocessor mode, the microcomputer/boot-loader (MCBL)
mode uses a dual-vectoring scheme to service interrupt and trap requests.
Dual vectoring was implemented to ensure code compatibility with future ver-
sions of TMS320C3x devices.

In a dual-vectoring scheme, branch instructions to an address, rather than
direct-interrupt vectoring, are used. The normal interrupt and trap vectors
are defined to vector to the last 63 locations in the on-chip RAM, starting at
address 809FC1h. When the loader is invoked, the last 63 locations in RAM
Block 1 of the TMS320C31 are assumed to contain branch instructions to the
interrupt source routines.

Take care to ensure that these locations are not inadvertently
overwritten by loaded program or data values.


Table 3–9 shows the MCBL/MP mode interrupt and trap instruction memory
maps.

Table 3–9. TMS320C31 Interrupt and Trap Memory Maps


Address Description
809FC1 INT0

809FC2 INT1

809FC3 INT2

809FC4 INT3

809FC5 XINT0

809FC6 RINT0

809FC7 Reserved

809FC8 Reserved

809FC9 TINT0

809FCA TINT1

809FCB DINT0

809FCC–809FDF Reserved

809FE0 TRAP0

809FE1 TRAP1

• •

• •

• •

809FFB TRAP27

809FFC–809FFF Reserved


3.4.8 Precautions
The boot loader builds a one-word-deep stack, starting at location 809801h.

Avoid loading code at location 809801h.

The interrupt flags are not reset by the boot-loader function. If pending inter-
rupts are to be avoided when interrupts are enabled, clear the IF register be-
fore enabling interrupts.

The MCBL/MP pin should remain high during the entire boot-loader execution,
but it can be changed subsequently at any time. The TMS320C31 does not
need to be reset after the MCBL/MP pin is changed. During the change, the
TMS320C31 should not access addresses 0h–FFFh.
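
A boot image can be screened on the host for two of these hazards before it
is programmed: a block that covers the loader stack location 809801h, and a
block that reaches the interrupt and trap branch locations
809FC1h–809FFFh. The following Python sketch is a hypothetical checker, not
a TI utility; a branch-table warning only means the overlap should be
intentional.

```python
BOOT_STACK_ADDRESS = 0x809801
BRANCH_TABLE_START = 0x809FC1
BRANCH_TABLE_END = 0x809FFF      # inclusive


def load_warnings(dest, size):
    """List the sensitive locations touched by a block of `size` words."""
    warnings = []
    last = dest + size - 1
    if size > 0 and dest <= BOOT_STACK_ADDRESS <= last:
        warnings.append("boot-loader stack (809801h)")
    if size > 0 and dest <= BRANCH_TABLE_END and last >= BRANCH_TABLE_START:
        warnings.append("interrupt and trap branches (809FC1h-809FFFh)")
    return warnings
```

The block used in Examples 3–1 through 3–3 (0x1FF words at 809C00h) ends at
809DFEh and produces no warnings.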

Chapter 4

Data Formats and Floating-Point Operation

In the TMS320C3x architecture, data is organized into three fundamental
types: integer, unsigned-integer, and floating-point. The terms integer and
signed-integer are considered to be equivalent. The TMS320C3x supports
short and single-precision formats for signed and unsigned integers. It also
supports short, single-precision, and extended-precision formats for
floating-point data.

Floating-point operations make computations fast, trouble-free, accurate, and
precise. Specifically, the TMS320C3x implementation of floating-point
arithmetic facilitates floating-point operations at integer speeds while
preventing problems with overflow, operand alignment, and other burdensome
tasks common in integer operations.

This chapter discusses in detail the data formats and floating-point operations
supported in the TMS320C3x. Major topics in this chapter are as follows:

4.1 Integer Formats
4.2 Unsigned-Integer Formats
4.3 Floating-Point Formats
4.4 Floating-Point Multiplication
4.5 Floating-Point Addition and Subtraction
4.6 Normalization Using the NORM Instruction
4.7 Rounding: The RND Instruction
4.8 Floating-Point-to-Integer Conversion
4.9 Integer-to-Floating-Point Conversion


4.1 Integer Formats


The TMS320C3x supports two integer formats: a 16-bit short integer format
and a 32-bit single-precision integer format. When extended-precision regis-
ters are used as integer operands, only bits 31– 0 are used; bits 39 – 32 remain
unchanged and unused.

4.1.1 Short-Integer Format

The short integer format is a 16-bit two’s complement integer format for
immediate integer operands. For those instructions that assume integer
operands, this format is sign-extended to 32 bits (see Figure 4–1). The range
of an integer si, represented in the short integer format, is
$-2^{15} \le si \le 2^{15} - 1$. In Figure 4–1, s = sign bit.

Figure 4–1. Short Integer Format and Sign Extension of Short Integers

(a) Short Integer Format

Bits 15–0: short integer (bit 15 is the sign bit s)

(b) Sign Extension of a Short Integer

Bits 31–16: copies of the sign bit s
Bits 15–0:  short integer
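
Sign extension copies bit 15 into bits 31–16. The following Python sketch
(the function names are hypothetical) performs the extension on a 16-bit
immediate and then interprets the 32-bit result as a two’s complement value,
which is a convenient way to confirm the range given above.

```python
def sign_extend_16(value):
    """Sign-extend a 16-bit short integer to a 32-bit word."""
    value &= 0xFFFF
    return (value | 0xFFFF0000) if value & 0x8000 else value


def as_signed_32(word):
    """Interpret a 32-bit word as a two's complement integer."""
    word &= 0xFFFFFFFF
    return word - (1 << 32) if word & 0x80000000 else word
```

For example, the short integer 8000h extends to FFFF8000h, which is −32768,
the most negative value in the short-integer range.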

4.1.2 Single-Precision Integer Format

In the single-precision integer format, the integer is represented in two’s
complement notation. The range of an integer sp, represented in the
single-precision integer format, is $-2^{31} \le sp \le 2^{31} - 1$.
Figure 4–2 shows the single-precision integer format.

Figure 4–2. Single-Precision Integer Format

31 0


4.2 Unsigned-Integer Formats


The TMS320C3x supports two unsigned-integer formats: a 16-bit short format
and a 32-bit single-precision format. In extended-precision registers, the un-
signed-integer operands use only bits 31–0; bits 39–32 remain unchanged.

4.2.1 Short Unsigned-Integer Format


Figure 4–3 shows the 16-bit, short, unsigned-integer format for immediate
unsigned-integer operands. For those instructions that assume
unsigned-integer operands, this format is zero-filled to 32 bits. In
Figure 4–3, x = most significant bit (MSB) (1 or 0).

Figure 4–3. Short Unsigned-Integer Format and Zero Fill

(a) Short Unsigned-Integer Format

Bits 15–0: short unsigned integer (bit 15 is the MSB x)

(b) Zero Fill of a Short Unsigned Integer

Bits 31–16: 0
Bits 15–0:  short unsigned integer

4.2.2 Single-Precision Unsigned-Integer Format


In the single-precision unsigned-integer format, the number is represented as
a 32-bit value, as shown in Figure 4–4.

Figure 4–4. Single-Precision Unsigned-Integer Format

31 0


4.3 Floating-Point Formats


All TMS320C3x floating-point formats consist of three fields: an exponent field
(e), a single-bit sign field (s), and a fraction field (f). These are stored as shown
in Figure 4–5. The exponent field is a two’s complement number. The sign field
and fraction field may be considered one unit and referred to as the mantissa
field (man). The two’s complement fraction is combined with the sign bit and
the implied most significant bit to create the mantissa. The mantissa represents
a normalized two’s complement number. A normalized representation
implies a most significant nonsign bit, thus providing additional precision. The
value of a floating-point number x as a function of the fields e, s, and f is given as

x = 01.f × 2^e   if s = 0 (the leading 0 is the sign bit and the
                 1 is the implied most significant nonsign bit)
    10.f × 2^e   if s = 1 (the leading 1 is the sign bit and the
                 0 is the implied most significant nonsign bit)
    0            if e = most negative two’s complement
                 value of the specified exponent field width

Figure 4–5. Generic Floating-Point Format

e s f

man (mantissa)

Note: e = exponent field


s = single-bit sign field
f = fraction field

Three floating-point formats are supported on the TMS320C3x. The first is a
short floating-point format for immediate floating-point operands, consisting of
a 4-bit exponent, a sign bit, and an 11-bit fraction. The second is a single-precision
format consisting of an 8-bit exponent, a sign bit, and a 23-bit fraction. The
third is an extended-precision format consisting of an 8-bit exponent, a sign
bit, and a 31-bit fraction.
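
As an illustration of the generic e, s, f layout, the following Python sketch decodes a word in any of the three formats into an ordinary floating-point number. It is a minimal model under the field definitions above, not TI code; the function name and its parameters (exp_bits, frac_bits) are hypothetical.

```python
def decode_c3x_float(word: int, exp_bits: int, frac_bits: int) -> float:
    """Decode a word laid out as [e | s | f] into a Python float."""
    f = word & ((1 << frac_bits) - 1)
    s = (word >> frac_bits) & 1
    e = (word >> (frac_bits + 1)) & ((1 << exp_bits) - 1)
    if e >= 1 << (exp_bits - 1):          # exponent is two's complement
        e -= 1 << exp_bits
    if e == -(1 << (exp_bits - 1)):       # most negative exponent encodes 0
        return 0.0
    # 01.f when s = 0, 10.f when s = 1 (the sign bit carries weight -2)
    mantissa = (-2 if s else 1) + f / (1 << frac_bits)
    return mantissa * 2.0 ** e
```

Calling decode_c3x_float(word, 4, 11) handles the short format, (word, 8, 23) the single-precision format, and (word, 8, 31) the extended-precision format.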

4.3.1 Short Floating-Point Format


In the short floating-point format, floating-point numbers are represented by
a two’s complement 4-bit exponent field (e) and a two’s complement 12-bit
mantissa field (man) with an implied most significant nonsign bit. See
Figure 4–6.

Figure 4–6. Short Floating-Point Format

15 12 11 10 0

e s f

mantissa

Operations are performed with an implied binary point between bits 11 and 10.
When the implied most significant nonsign bit is made explicit, it is located to
the immediate left of the binary point. The floating-point two’s complement
number x in the short floating-point format is given by the following:

x = 01.f × 2^e   if s = 0
    10.f × 2^e   if s = 1
    0            if e = –8

You must use the following reserved values to represent 0 in the short
floating-point format:

e = –8
s = 0
f = 0

The following examples illustrate the range and precision of the short
floating-point format:

Most Positive:   x = (2 – 2^–11) × 2^7 = 2.5594 × 10^2
Least Positive:  x = 1 × 2^–7 = 7.8125 × 10^–3
Least Negative:  x = (–1 – 2^–11) × 2^–7 = –7.8163 × 10^–3
Most Negative:   x = –2 × 2^7 = –2.5600 × 10^2

4.3.2 Single-Precision Floating-Point Format

In the single-precision format, the floating-point number is represented by an
8-bit exponent field (e) and a two’s complement 24-bit mantissa field (man)
with an implied most significant nonsign bit. See Figure 4–7.

Figure 4–7. Single-Precision Floating-Point Format

31 24 23 22 0

e s f

mantissa

Operations are performed with an implied binary point between bits 23 and 22.
When the implied most significant nonsign bit is made explicit, it is located to
the immediate left of the binary point. The floating-point number x is given by
the following:

x = 01.f × 2^e   if s = 0
    10.f × 2^e   if s = 1
    0            if e = –128

You must use the following reserved values to represent 0 in the single-precision
floating-point format:

e = –128
s = 0
f = 0

The following examples illustrate the range and precision of the single-precision
floating-point format:

Most Positive:   x = (2 – 2^–23) × 2^127 = 3.4028234 × 10^38
Least Positive:  x = 1 × 2^–127 = 5.8774717 × 10^–39
Least Negative:  x = (–1 – 2^–23) × 2^–127 = –5.8774724 × 10^–39
Most Negative:   x = –2 × 2^127 = –3.4028236 × 10^38
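
To make the single-precision encoding concrete, the sketch below packs a Python float into the 32-bit e, s, f layout described above. It is an illustrative model only: the function name is hypothetical, extra fraction bits are simply truncated, and values whose exponent falls outside –127 to 127 are rejected rather than saturated.

```python
import math


def encode_c3x_single(x: float) -> int:
    """Return the 32-bit [e | s | f] word for x, truncating extra fraction bits."""
    if x == 0.0:
        return 0x80000000                  # reserved zero: e = -128, s = 0, f = 0
    m, ex = math.frexp(x)                  # x = m * 2**ex with 0.5 <= |m| < 1
    if m == -0.5:                          # exact negative power of two: mantissa -2
        man, e = -2.0, ex - 2
    else:                                  # mantissa in [1, 2) or (-2, -1)
        man, e = 2.0 * m, ex - 1
    if not -127 <= e <= 127:
        raise ValueError("exponent outside -127..127")
    s = 1 if man < 0 else 0
    f = int((man + 2.0 if s else man - 1.0) * (1 << 23))   # 23-bit fraction
    return ((e & 0xFF) << 24) | (s << 23) | f
```

Under this model 1.0 encodes as 00000000h and –1.0 as FF800000h, because –1.0 is stored as the mantissa –2 (10.0) with an exponent of –1.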

4.3.3 Extended-Precision Floating-Point Format

In the extended-precision format, the floating-point number is represented by
an 8-bit exponent field (e) and a 32-bit mantissa field (man) with an implied
most significant nonsign bit. See Figure 4–8.

Figure 4–8. Extended-Precision Floating-Point Format

39 32 31 30 0

e s f

mantissa

Operations are performed with an implied binary point between bits 31 and 30.
When the implied most significant nonsign bit is made explicit, it is located to
the immediate left of the binary point. The floating-point number x is given by
the following:

x = 01.f × 2^e   if s = 0
    10.f × 2^e   if s = 1
    0            if e = –128

You must use the following reserved values to represent 0 in the
extended-precision floating-point format:

e = –128
s = 0
f = 0

The following examples illustrate the range and precision of the
extended-precision floating-point format:

Most Positive:   x = (2 – 2^–31) × 2^127 = 3.4028236684 × 10^38
Least Positive:  x = 1 × 2^–127 = 5.8774717541 × 10^–39
Least Negative:  x = (–1 – 2^–31) × 2^–127 = –5.8774717569 × 10^–39
Most Negative:   x = –2 × 2^127 = –3.4028236691 × 10^38

4.3.4 Conversion Between Floating-Point Formats


Floating-point operations assume several different formats for inputs and out-
puts. These formats often require conversion from one floating-point format to
another (e.g., short floating-point format to extended-precision floating-point
format). Format conversions occur automatically in hardware, with no over-
head, as a part of the floating-point operations. Examples of the four conver-
sions are shown in Figure 4–9, Figure 4–10, Figure 4–11, and Figure 4–12.
When a floating-point format 0 is converted to a greater-precision format, it is
always converted to a valid representation of 0 in that format. In Figure 4–9,
Figure 4–10, Figure 4–11, and Figure 4–12, s = sign bit of the exponent.

Figure 4–9. Converting From Short Floating-Point Format to Single-Precision
Floating-Point Format

15 12 11 10 0

s x x x y y y

(a) Short Floating-Point Format

31 27 24 23 22 12 11 0

s s s s x x x x y y y 0 0

(b) Single-Precision Floating-Point Format

In this format, the exponent field is sign-extended, and the fraction field is filled
with 0s.

Figure 4–10. Converting From Short Floating-Point Format to Extended-Precision
Floating-Point Format

15 12 11 10 0

s x x x y y y

(a) Short Floating-Point Format

39 35 32 31 30 20 19 0

s s s s x x x x y y y 0 0

(b) Extended-Precision Floating-Point Format

The exponent field in this format is sign-extended, and the fraction field is filled
with 0s.

Figure 4–11. Converting From Single-Precision Floating-Point Format to
Extended-Precision Floating-Point Format

31 24 23 22 0

x x y y y

(a) Single-Precision Floating-Point Format

39 32 31 30 8 7 0

x x y y y 0 0

(b) Extended-Precision Floating-Point Format

The fraction field is filled with 0s.

Figure 4–12. Converting From Extended-Precision Floating-Point Format to
Single-Precision Floating-Point Format

39 32 31 30 8 7 0

x x y y y z z

(a) Extended-Precision Floating-Point Format

31 24 23 22 0

x x y y y

(b) Single-Precision Floating-Point Format

The fraction field is truncated.
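
The field movements in Figure 4–9, Figure 4–11, and Figure 4–12 can be modeled with a few shifts and masks. The Python sketch below is illustrative only; the function names are hypothetical, and mapping any short word with e = –8 to the canonical single-precision zero is an assumption based on the statement that a 0 always converts to a valid 0.

```python
def short_to_single(w16: int) -> int:
    """16-bit short format -> 32-bit single-precision format."""
    e4 = (w16 >> 12) & 0xF
    sf = w16 & 0x0FFF                      # sign bit plus 11-bit fraction
    if e4 == 0x8:                          # e = -8 is the short-format zero
        return 0x80000000                  # e = -128, s = 0, f = 0
    e8 = e4 | 0xF0 if e4 & 0x8 else e4     # sign-extend the exponent to 8 bits
    return (e8 << 24) | (sf << 12)         # low 12 fraction bits are filled with 0s


def single_to_extended(w32: int) -> int:
    """32-bit single-precision format -> 40-bit extended-precision format."""
    e8 = (w32 >> 24) & 0xFF
    sf = w32 & 0x00FFFFFF                  # sign bit plus 23-bit fraction
    return (e8 << 32) | (sf << 8)          # low 8 fraction bits are filled with 0s


def extended_to_single(w40: int) -> int:
    """40-bit extended-precision format -> 32-bit single-precision format."""
    e8 = (w40 >> 32) & 0xFF
    sf = (w40 >> 8) & 0x00FFFFFF           # the 8 low fraction bits are truncated
    return (e8 << 24) | sf
```

The conversion from short to extended-precision format (Figure 4–10) follows the same pattern as short_to_single, with the fraction zero-filled to 31 bits.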

4.4 Floating-Point Multiplication


A floating-point number α can be written in floating-point format as in the
following formula:

α = α(man) × 2^α(exp)

where:
α(man) is the mantissa and α(exp) is the exponent.

The product of α and b is c, defined as:

c = α × b = α(man) × b(man) × 2^(α(exp) + b(exp))

where:
c(man) = α(man) × b(man), and
c(exp) = α(exp) + b(exp)

During floating-point multiplication, source operands are always assumed to
be in the single-precision floating-point format. If the source of the operands
is in short floating-point format, it is extended to the single-precision
floating-point format. If the source of the operands is in extended-precision
floating-point format, it is truncated to single-precision format. These conversions
occur automatically in hardware with no overhead. All results of floating-point
multiplications are in the extended-precision format. These multiplications
occur in a single cycle.

A flowchart for floating-point multiplication is shown in Figure 4–13. In step 1,
the 24-bit source operand mantissas are multiplied, producing a 50-bit result
c(man). (Note that input and output data are always represented as normalized
numbers.) In step 2, the exponents are added, yielding c(exp). Steps 3
through 6 check for special cases. Step 3 checks whether c(man) in
extended-precision format is equal to 0. If c(man) is 0, step 7 sets c(exp) to –128,
thus yielding the representation for 0.

Steps 4 and 5 normalize the result. If a right shift of 1 is necessary, then in step
8, c(man) is right-shifted 1 bit and 1 is added to c(exp). If a right shift of 2 is
necessary, then in step 9, c(man) is right-shifted 2 bits and 2 is added to c(exp).
Step 6 occurs when the result is already normalized.

In step 10, c(man) is set in the extended-precision floating-point format. Steps
11 through 16 check for special cases of c(exp). If c(exp) has overflowed (step
11) in the positive direction, then step 14 sets c(exp) to the most positive
extended-precision format value. If c(exp) has overflowed in the negative direction,
then step 14 sets c(exp) to the most negative extended-precision format value.
If c(exp) has underflowed (step 12), then step 15 sets c to 0; that is, c(man)
= 0 and c(exp) = –128.

Figure 4–13. Flowchart for Floating-Point Multiplication


Inputs: α(man), b(man), α(exp), b(exp)

(1)  Multiply mantissas: c(man) = α(man) × b(man) (50-bit result)
(2)  Add exponents: c(exp) = α(exp) + b(exp)

Test for special cases of c(man):
(3)  c(man) = 0                  ->  (7) c(exp) = –128
(4)  Right-shift 1 to normalize  ->  (8) c(man) >> 1 and c(exp) = c(exp) + 1
(5)  Right-shift 2 to normalize  ->  (9) c(man) >> 2 and c(exp) = c(exp) + 2
(6)  No shift to normalize

(10) Dispose of extra bits: put c(man) in extended-precision floating-point format

Test for special cases of c(exp):
(11) c(exp) overflow   ->  (14) If c(man) > 0, set c(exp) to most positive value;
                                if c(man) < 0, set c(exp) to most negative value
(12) c(exp) underflow  ->  (15) c(exp) = –128, c(man) = 0
(13) c(exp) in range

(16) Set c to final result: c = α × b
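
The steps of Figure 4–13 can be traced with integer arithmetic. The Python sketch below is a simplified model under stated assumptions, not a description of the hardware: mantissas are signed integers with the implied bit made explicit, scaled by 2^23 on input and 2^31 on output; the zero test of steps 3 and 7 is folded into a check on the operands; and on exponent overflow the whole result is saturated. The name fmul and the constants are hypothetical.

```python
FRAC_IN = 23    # fraction bits in a single-precision source mantissa
FRAC_OUT = 31   # fraction bits in an extended-precision result mantissa


def fmul(a_man: int, a_exp: int, b_man: int, b_exp: int):
    """Multiply two normalized operands and return (c_man, c_exp)."""
    if a_exp == -128 or b_exp == -128:           # either operand is 0 (steps 3, 7)
        return 0, -128
    c_man = a_man * b_man                        # step 1: 46 fraction bits
    c_exp = a_exp + b_exp                        # step 2
    one = 1 << (2 * FRAC_IN)                     # 1.0 at the product's scale
    if c_man == 4 * one:                         # step 5: only -2.0 x -2.0
        shift = 2
    elif c_man >= 2 * one or c_man < -2 * one:   # step 4
        shift = 1
    else:                                        # step 6
        shift = 0
    c_exp += shift                               # steps 8 and 9
    c_man >>= shift + 2 * FRAC_IN - FRAC_OUT     # step 10: drop the extra bits
    if c_exp > 127:                              # steps 11, 14 (assumed saturation)
        return ((1 << 32) - 1, 127) if c_man > 0 else (-(1 << 32), 127)
    if c_exp < -127:                             # steps 12, 15: result becomes 0
        return 0, -128
    return c_man, c_exp                          # step 16
```

For instance, fmul(-(1 << 24), 0, -(1 << 24), 0) models –2.0 × –2.0 and returns a mantissa of 1.0 with an exponent of 2, and fmul(3 << 22, 0, 3 << 22, 0) models 1.5 × 1.5 and returns a mantissa of 1.125 with an exponent of 1.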

Example 4–1, Example 4–2, Example 4–3, Example 4–4, and Example 4–5
illustrate how floating-point multiplication is performed on the TMS320C3x.
For these examples, the implied most significant nonsign bit is made explicit.

Example 4–1. Floating-Point Multiply (Both Mantissas = –2.0)

Let:
α = –2.0 × 2^α(exp) = 10.00000000000000000000000 × 2^α(exp)
b = –2.0 × 2^b(exp) = 10.00000000000000000000000 × 2^b(exp)

where:

α and b are both represented in binary form according to the normalized
single-precision floating-point format.

Then:

       10.00000000000000000000000 × 2^α(exp)
  ×    10.00000000000000000000000 × 2^b(exp)
  =  0100.0000000000000000000000000000000000000000000000 × 2^(α(exp) + b(exp))

To place this number in the proper normalized format, it is necessary to shift
the mantissa two places to the right and add 2 to the exponent. This yields:

       10.00000000000000000000000 × 2^α(exp)
  ×    10.00000000000000000000000 × 2^b(exp)
  =    01.0000000000000000000000000000000000000000000000 × 2^(α(exp) + b(exp) + 2)

In floating-point multiplication, the exponent of the result may overflow. This
can occur when the exponents are initially added or when the exponent is
modified during normalization.

Example 4–2. Floating-Point Multiply (Both Mantissas = 1.5)

Let:
α = 1.5 × 2^α(exp) = 01.10000000000000000000000 × 2^α(exp)
b = 1.5 × 2^b(exp) = 01.10000000000000000000000 × 2^b(exp)

where α and b are both represented in binary form according to the
single-precision floating-point format. Then:

       01.10000000000000000000000 × 2^α(exp)
  ×    01.10000000000000000000000 × 2^b(exp)
  =  0010.0100000000000000000000000000000000000000000000 × 2^(α(exp) + b(exp))

To place this number in the proper normalized format, it is necessary to shift
the mantissa one place to the right and add 1 to the exponent. This yields:

       01.10000000000000000000000 × 2^α(exp)
  ×    01.10000000000000000000000 × 2^b(exp)
  =    01.00100000000000000000000000000000000000000000000 × 2^(α(exp) + b(exp) + 1)

Example 4–3. Floating-Point Multiply (Both Mantissas = 1.0)

Let:
α = 1.0 × 2^α(exp) = 01.00000000000000000000000 × 2^α(exp)
b = 1.0 × 2^b(exp) = 01.00000000000000000000000 × 2^b(exp)

where α and b are both represented in binary form according to the
single-precision floating-point format. Then:

       01.00000000000000000000000 × 2^α(exp)
  ×    01.00000000000000000000000 × 2^b(exp)
  =  0001.0000000000000000000000000000000000000000000000 × 2^(α(exp) + b(exp))

This number is in the proper normalized format. Therefore, no shift of the
mantissa or modification of the exponent is necessary.

These examples have shown cases where the product of two normalized numbers
can be normalized with a shift of 0, 1, or 2. For all normalized inputs with
the floating-point format used by the TMS320C3x, a normalized result can be
produced by a shift of 0, 1, or 2.

Example 4–4. Floating-Point Multiply Between Positive and Negative Numbers

Let:
α = 1.0 × 2^α(exp) = 01.00000000000000000000000 × 2^α(exp)
b = –2.0 × 2^b(exp) = 10.00000000000000000000000 × 2^b(exp)

Then:

       01.00000000000000000000000 × 2^α(exp)
  ×    10.00000000000000000000000 × 2^b(exp)
  =  1110.0000000000000000000000000000000000000000000000 × 2^(α(exp) + b(exp))

The result is c = –2.0 × 2^(α(exp) + b(exp))

Example 4–5. Floating-Point Multiply by 0

All multiplications by a floating-point 0 yield a result of 0 (f = 0, s = 0, and exp
= –128).

4.5 Floating-Point Addition and Subtraction


In floating-point addition and subtraction, two floating-point numbers α and b
can be defined as:

α = α(man) × 2^α(exp)
b = b(man) × 2^b(exp)

The sum (or difference) of α and b can be defined as:

c = α ± b
  = (α(man) ± (b(man) × 2^–(α(exp) – b(exp)))) × 2^α(exp),
    if α(exp) ≥ b(exp)
  = ((α(man) × 2^–(b(exp) – α(exp))) ± b(man)) × 2^b(exp),
    if α(exp) < b(exp)

The flowchart for floating-point addition is shown in Figure 4–14. Since this
flowchart assumes signed data, it is also appropriate for floating-point
subtraction. In this figure, it is assumed that α(exp) ≤ b(exp). In step 1, the source
exponents are compared, and c(exp) is set equal to the larger of the two source
exponents. In step 2, d is set to the difference of the two exponents. In step 3,
the mantissa with the smaller exponent, in this case α(man), is right-shifted
d bits to align the mantissas. After the mantissas have been aligned, they are
added (step 4).

Steps 5 through 7 check for special cases of c(man). If c(man) is 0 (step 5),
then c(exp) is set to its most negative value (step 8) to yield the correct
representation of 0. If c(man) has overflowed (step 6), then c(man) is right-shifted
one bit, and 1 is added to c(exp) (step 9). Otherwise, step 10 normalizes c by
left-shifting c(man) by the number of leading nonsignificant sign bits (counted
in step 7) and subtracting that number from c(exp). Steps 11 through 13 check
for special cases of c(exp). If c(exp) has overflowed (step 11) in the positive
direction, then step 14 sets c(exp) to the most positive extended-precision format
value. If c(exp) has overflowed (step 11) in the negative direction, then step 14
sets c(exp) to the most negative extended-precision format value. If c(exp) has
underflowed (step 12), then step 15 sets c to 0; that is, c(man) = 0 and
c(exp) = –128.

Figure 4–14. Flowchart for Floating-Point Addition


Inputs: α(man), b(man), α(exp), b(exp)

(1)  Compare exponents: if α(exp) <= b(exp), c(exp) = b(exp); else c(exp) = α(exp)
     (Assume for simplicity that α(exp) <= b(exp))
(2)  Subtract exponents: d = b(exp) – α(exp)
(3)  Align mantissas: α(man) = α(man) >> d
     Discard LSBs to keep α(man) in extended-precision floating-point format
(4)  Add mantissas: c(man) = α(man) + b(man)

Test for special cases of c(man):
(5)  c(man) = 0           ->  (8) c(exp) = –128
(6)  Overflow of c(man)   ->  (9) c(man) = c(man) >> 1, c(exp) = c(exp) + 1
                                  Discard LSBs to keep in extended-precision
                                  floating-point format
(7)  k = # of leading nonsignificant sign bits
                          ->  (10) c(man) << k, c(exp) = c(exp) – k

Test for special cases of c(exp):
(11) c(exp) overflow   ->  (14) If c(man) > 0, set c to most positive value;
                                if c(man) < 0, set c to most negative value
(12) c(exp) underflow  ->  (15) Set c to 0: c(exp) = –128, c(man) = 0
(13) c(exp) in range

(16) Set c to final result: c = α + b
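
The addition flowchart can be traced the same way. The following Python sketch is a simplified model and not the hardware algorithm bit for bit: mantissas are signed integers scaled by 2^31 with the implied bit made explicit, alignment uses an arithmetic right shift, and the whole result is saturated on exponent overflow. The names fadd and ONE are hypothetical. Subtraction can be modeled by passing the negated second operand.

```python
ONE = 1 << 31   # 1.0 for a mantissa with 31 fraction bits


def fadd(a_man: int, a_exp: int, b_man: int, b_exp: int):
    """Add two extended-precision operands and return (c_man, c_exp)."""
    if a_exp > b_exp:                            # step 1: let b hold the larger exponent
        a_man, a_exp, b_man, b_exp = b_man, b_exp, a_man, a_exp
    c_exp = b_exp
    d = b_exp - a_exp                            # step 2
    c_man = (a_man >> d) + b_man                 # steps 3 and 4: align, then add
    if c_man == 0:                               # steps 5 and 8
        return 0, -128
    if c_man >= 2 * ONE or c_man < -2 * ONE:     # steps 6 and 9: mantissa overflow
        c_man >>= 1
        c_exp += 1
    else:                                        # steps 7 and 10: shift out sign bits
        while not (ONE <= c_man < 2 * ONE or -2 * ONE <= c_man < -ONE):
            c_man <<= 1
            c_exp -= 1
    if c_exp > 127:                              # steps 11 and 14
        return (2 * ONE - 1, 127) if c_man > 0 else (-2 * ONE, 127)
    if c_exp < -127:                             # steps 12 and 15
        return 0, -128
    return c_man, c_exp                          # step 16
```

Adding the operand pair ((1 << 32) - 1, 127) and (-(1 << 32), 127) in this model requires 32 left shifts and returns the mantissa –2.0 with an exponent of 95.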

Example 4–6, Example 4–7, Example 4–8, and Example 4–9 describe the
floating-point addition and subtraction operations. It is assumed that the data
is in the extended-precision floating-point format.

Example 4–6. Floating-Point Addition

In the case of two normalized numbers to be summed, let
α = 1.5 = 01.1000000000000000000000000000000 × 2^0
b = 0.5 = 01.0000000000000000000000000000000 × 2^–1

It is necessary to shift b to the right by 1 so that α and b have the same
exponent. This yields:
b = 0.5 = 00.1000000000000000000000000000000 × 2^0

Then:

      01.10000000000000000000000000000000 × 2^0
  +   00.10000000000000000000000000000000 × 2^0
  =  010.00000000000000000000000000000000 × 2^0

As in the case of multiplication, it is necessary to shift the binary point one place
to the left and add 1 to the exponent. This yields:

      01.1000000000000000000000000000000 × 2^0
  +   00.1000000000000000000000000000000 × 2^0
  =   01.0000000000000000000000000000000 × 2^1

Example 4–7. Floating-Point Subtraction

A subtraction is performed in this example. Let
α = 01.0000000000000000000000000000001 × 2^0
b = 01.0000000000000000000000000000000 × 2^0

The operation to be performed is α – b. The mantissas are already aligned
because the two numbers have the same exponent. The result is a large
cancellation of the upper bits, as shown below.

      01.0000000000000000000000000000001 × 2^0
  –   01.0000000000000000000000000000000 × 2^0
  =   00.0000000000000000000000000000001 × 2^0

The result must be normalized. In this case, a left-shift of 31 is required. The
exponent of the result is modified accordingly. The result is:

      01.0000000000000000000000000000001 × 2^0
  –   01.0000000000000000000000000000000 × 2^0
  =   01.0000000000000000000000000000000 × 2^–31

Example 4–8. Floating-Point Addition With a 32-Bit Shift

This example illustrates a situation where a full 32-bit shift is necessary to
normalize the result. Let

α = 01.1111111111111111111111111111111 × 2^127
b = 10.0000000000000000000000000000000 × 2^127

The operation to be performed is α + b.

      01.1111111111111111111111111111111 × 2^127
  +   10.0000000000000000000000000000000 × 2^127
  =   11.1111111111111111111111111111111 × 2^127

Normalizing the result requires a left-shift of 32 and a subtraction of 32 from
the exponent. The result is:

      01.1111111111111111111111111111111 × 2^127
  +   10.0000000000000000000000000000000 × 2^127
  =   10.0000000000000000000000000000000 × 2^95

Example 4–9. Floating-Point Addition/Subtraction With Floating-Point 0

When floating-point addition and subtraction are performed with a
floating-point 0, the following identities are satisfied:

α ± 0 = α (α ≠ 0)

0 ± 0 = 0

0 – α = –α (α ≠ 0)

4.6 Normalization Using the NORM Instruction


The NORM instruction normalizes an extended-precision floating-point num-
ber that is assumed to be unnormalized. See Example 4–10. Since the num-
ber is assumed to be unnormalized, no implied most significant nonsign bit is
assumed. The NORM instruction:
1) Locates the most significant nonsign bit of the floating-point number,
2) Left-shifts to normalize the number, and
3) Adjusts the exponent.

Example 4–10. NORM Instruction


Assume that an extended-precision register contains the value
man = 00000000000000000001000000000001, exp = 0

When the normalization is performed on a number assumed to be unnormalized,
the binary point is assumed to be:
man = 0.0000000000000000001000000000001, exp = 0

This number is then sign-extended one bit so that the mantissa contains 33
bits.
man = 00.0000000000000000001000000000001, exp = 0

The intermediate result after the most significant nonsign bit is located and the
shift performed is:
man = 01.0000000000010000000000000000000, exp = –19

The final 32-bit value output after removing the redundant bit is:
man = 00000000000010000000000000000000, exp = –19

The NORM instruction is useful for counting the number of leading 0s or lead-
ing 1s in a 32-bit field. If the exponent is initially 0, the absolute value of the final
value of the exponent is the number of leading 1s or 0s. This instruction is also
useful for manipulating unnormalized floating-point numbers.

Given the extended-precision floating-point value α to be normalized, the
normalization, norm( ), is performed as shown in Figure 4–15.

Figure 4–15. Flowchart for NORM Instruction Operation


Input: α

Test for special cases of c(man):
(1)  α(man) = 0                       ->  (3) c(exp) = –128
(2)  Leading nonsignificant sign bits;
     k = # of leading nonsignificant sign bits
                                      ->  (4) Sign-extend α(man) 1 bit
                                              c(man) = α(man) << k
                                              c(exp) = α(exp) – k

(5)  Remove most significant nonsign bit

Test for special cases of c(exp):
(6)  c(exp) underflow  ->  (8) c(exp) = –128; no change to c(man)
(7)  c(exp) in range

(9)  Set c to final result: c = norm(α)
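
A minimal Python sketch of the NORM operation, following Figure 4–15 and reproducing Example 4–10, is given below. It is an illustrative model only: the function name and the constant ONE are hypothetical, the 32-bit mantissa field is passed as an unsigned integer, and on exponent underflow the shifted mantissa is kept as is, which is one reading of "no change to c(man)".

```python
ONE = 1 << 31   # 1.0 once the 32-bit field is sign-extended by one bit


def norm(man_field: int, exp: int):
    """Normalize a 32-bit mantissa field (sign + 31 fraction bits, no implied bit)."""
    v = man_field & 0xFFFFFFFF
    if v & 0x80000000:                     # interpret the field as two's complement
        v -= 1 << 32
    if v == 0:                             # steps 1 and 3
        return 0, -128
    k = 0                                  # step 2: count nonsignificant sign bits
    while not (ONE <= v < 2 * ONE or -2 * ONE <= v < -ONE):
        v <<= 1                            # step 4: shift toward 01.f or 10.f
        k += 1
    exp -= k
    if exp < -127:                         # steps 6 and 8: exponent underflow
        exp = -128
    sign = 1 if v < 0 else 0
    return (sign << 31) | (v & 0x7FFFFFFF), exp   # step 5: drop the implied bit
```

With an initial exponent of 0, the magnitude of the returned exponent is the number of leading 0s or 1s in the field: norm(0x00001001, 0) returns (0x00080000, -19), matching Example 4–10.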

4.7 Rounding: The RND Instruction


The RND instruction rounds a number from the extended-precision
floating-point format to the single-precision floating-point format. Rounding is
similar to floating-point addition. Given the number α to be rounded, the following
operation is performed first.

c = α(man) × 2^α(exp) + (1 × 2^(α(exp) – 24))

Next, a conversion from extended-precision floating-point to single-precision
floating-point format is performed. Given the extended-precision floating-point
value, the rounding, rnd( ), is performed as shown in Figure 4–16.

Figure 4–16. Flowchart for Floating-Point Rounding by the RND Instruction


Inputs: α and 1 × 2^(α(exp) – 24)

Add α(man) and 1/2 of LSB:
    c(man) = α(man) + 2^–24

Test for special cases of c(man):
    c(man) = 0          ->  c(exp) = –128
    Overflow of c(man)  ->  c(man) = c(man) >> 1, c(exp) = α(exp) + 1
    No special case

Test for special cases of c(exp):
    c(exp) overflow     ->  If c(man) > 0, set c to most positive single-precision value;
                            if c(man) < 0, set c to most negative single-precision value
    c(exp) in range

Set 8 LSBs of c(man) to 0

c = rnd(α)
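
The rounding steps can be sketched in Python as follows. This is a hedged model, not the instruction's exact definition: the mantissa is a signed integer scaled by 2^31 with the implied bit explicit, a zero input is assumed to stay zero, and a negative mantissa that would need renormalizing after the add is not handled because the flowchart does not describe that case. The names rnd and ONE are hypothetical.

```python
ONE = 1 << 31   # 1.0 for a mantissa with 31 fraction bits


def rnd(a_man: int, a_exp: int):
    """Round an extended-precision value to single precision; return (c_man, c_exp)."""
    if a_exp == -128:                      # assumed: a zero input stays zero
        return 0, -128
    c_man = a_man + (1 << 7)               # add 2**-24, half a single-precision LSB
    c_exp = a_exp
    if c_man >= 2 * ONE:                   # mantissa overflow: shift right, bump exponent
        c_man >>= 1
        c_exp += 1
    if c_exp > 127:                        # exponent overflow: saturate
        return (2 * ONE - (1 << 8), 127) if c_man > 0 else (-2 * ONE, 127)
    return c_man & ~0xFF, c_exp            # set the 8 LSBs of the mantissa to 0
```

A mantissa whose discarded low byte is 80h or more rounds up, and one whose low byte is 7Fh or less rounds down.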

4.8 Floating-Point-to-Integer Conversion


Floating-point-to-integer conversion, using the FIX instructions, allows
extended-precision floating-point numbers to be converted to single-precision
integers in a single cycle. The floating-point-to-integer conversion of the value x
is referred to here as fix(x). The conversion does not overflow if α, the number
to be converted, is in the range

–2^31 ≤ α ≤ 2^31 – 1

First, you must be certain that

α(exp) ≤ 30

If these bounds are not met, an overflow occurs. If an overflow occurs in the
positive direction, the output is the most positive integer. If an overflow occurs
in the negative direction, the output is the most negative integer. If α(exp) is
within the valid range, then α(man), with implied bit included, is sign-extended
and right-shifted (rs) by the amount

rs = 31 – α(exp)

This right-shift (rs) shifts out those bits corresponding to the fractional part of
the mantissa. For example:

If 0 ≤ x < 1, then fix(x) = 0.
If –1 ≤ x < 0, then fix(x) = –1.

The flowchart for the floating-point-to-integer conversion is shown in
Figure 4–17.

Figure 4–17. Flowchart for Floating-Point-to-Integer Conversion by FIX Instructions


Input: α

Test for special cases of α(exp):
    α(exp) > 30      ->  Overflow: if α(man) > 0, c = most positive integer;
                                   if α(man) < 0, c = most negative integer
    α(exp) in range  ->  rs = 31 – α(exp)
                         Shift: c = α(man) >> rs

Set c to final result: c = fix(α)
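
The conversion reduces to a single arithmetic right shift, which the Python sketch below illustrates. It is a model under stated assumptions: the mantissa is a signed integer scaled by 2^31 with the implied bit included, and the function name fix is borrowed from the notation above rather than from any library.

```python
def fix(a_man: int, a_exp: int) -> int:
    """Convert an extended-precision value to a 32-bit signed integer."""
    if a_exp > 30:                         # overflow: saturate by sign of the mantissa
        return 0x7FFFFFFF if a_man > 0 else -0x80000000
    rs = 31 - a_exp                        # number of fractional bits to shift out
    return a_man >> rs                     # arithmetic shift keeps the sign
```

Because the shift is arithmetic, fractional bits are discarded toward negative infinity: 1.5 converts to 1, –1.5 converts to –2, and –0.5 converts to –1, consistent with the two cases listed above.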

4.9 Integer-to-Floating-Point Conversion


Integer-to-floating-point conversion, using the FLOAT instruction, allows
single-precision integers to be converted to extended-precision floating-point
numbers. The flowchart for this conversion is shown in Figure 4–18.

Figure 4–18. Flowchart for Integer-to-Floating-Point Conversion by FLOAT Instructions


Input: α

c(man) = α
c(exp) = 30

Test for special cases of c(man):
    c(man) = 0                        ->  c(exp) = –128
    Leading nonsignificant sign bits;
    k = # leading nonsignificant sign bits
                                      ->  c(man) = c(man) << k
                                          c(exp) = 30 – k

Remove most significant nonsign bit

Set c to final result: c = float(α)
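
The flowchart can be modeled in Python as shown below. This is an illustrative sketch with hypothetical names (int_to_c3x, ONE): the 32-bit integer is read as a mantissa with 30 fraction bits and an exponent of 30, then shifted left until the sign bit and the bit after it differ. The returned mantissa keeps the implied bit explicit and is scaled by 2^30.

```python
ONE = 1 << 30   # 1.0 when a 32-bit integer is read with 30 fraction bits


def int_to_c3x(n: int):
    """Convert a 32-bit signed integer to (c_man, c_exp); value = c_man / 2**30 * 2**c_exp."""
    if n == 0:
        return 0, -128                     # reserved representation of 0
    c_man, c_exp = n, 30
    while not (ONE <= c_man < 2 * ONE or -2 * ONE <= c_man < -ONE):
        c_man <<= 1                        # shift out one nonsignificant sign bit
        c_exp -= 1
    return c_man, c_exp
```

For example, the integer 6 becomes the mantissa 1.5 with an exponent of 2, and –1 becomes the mantissa –2.0 with an exponent of –1.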

Chapter 5

Addressing

The TMS320C3x supports five groups of powerful addressing modes. Six
types of addressing may be used within the groups, which allow access of data
from memory, registers, and the instruction word. This chapter details the
operation, encoding, and implementation of the addressing modes. It also
discusses the management of system stacks, queues, and dequeues in memory.

These are the major topics in this chapter:

5.1 Types of Addressing
5.2 Groups of Addressing Modes
5.3 Circular Addressing
5.4 Bit-Reversed Addressing
5.5 System and User Stack Management

5.1 Types of Addressing


Six types of addressing allow access of data from memory, registers, and the
instruction word:
- Register
- Direct
- Indirect
- Short-immediate
- Long-immediate
- PC-relative

Some types of addressing are appropriate for some instructions but not others.
For this reason, the types of addressing are used in the five groups of address-
ing modes as follows:

- General addressing modes (G):
    - Register
    - Direct
    - Indirect
    - Short-immediate

- Three-operand addressing modes (T):
    - Register
    - Indirect

- Parallel addressing modes (P):
    - Register
    - Indirect

- Long-immediate addressing mode (L):
    - Long-immediate

- Conditional-branch addressing modes (B):
    - Register
    - PC-relative

The six types of addressing are discussed first, followed by the five groups of
addressing modes.

5.1.1 Register Addressing


In register addressing, a CPU register contains the operand, as shown in this
example:

ABSF R1 ; R1 = |R1|

The syntax for the CPU registers, the assembler syntax, and the assigned
function for those registers are listed in Table 5–1.

Table 5–1. CPU Register Address/Assembler Syntax and Function

CPU Register   Assembler   Assigned
Address        Syntax      Function

00h            R0          Extended-precision register
01h            R1          Extended-precision register
02h            R2          Extended-precision register
03h            R3          Extended-precision register
04h            R4          Extended-precision register
05h            R5          Extended-precision register
06h            R6          Extended-precision register
07h            R7          Extended-precision register

08h            AR0         Auxiliary register
09h            AR1         Auxiliary register
0Ah            AR2         Auxiliary register
0Bh            AR3         Auxiliary register
0Ch            AR4         Auxiliary register
0Dh            AR5         Auxiliary register
0Eh            AR6         Auxiliary register
0Fh            AR7         Auxiliary register

10h            DP          Data-page pointer
11h            IR0         Index register 0
12h            IR1         Index register 1
13h            BK          Block-size register
14h            SP          Active stack pointer

15h            ST          Status register
16h            IE          CPU/DMA interrupt enable
17h            IF          CPU interrupt flags
18h            IOF         I/O flags

19h            RS          Repeat start address
1Ah            RE          Repeat end address
1Bh            RC          Repeat counter

5.1.2 Direct Addressing


In direct addressing, the data address is formed by the concatenation of the
eight least significant bits of the data page pointer (DP) with the 16 least signifi-
cant bits of the instruction word (expr). This results in 256 pages (64K words per
page), giving the programmer a large address space without requiring a change
of the page pointer. The syntax and operation for direct addressing are:

Syntax: @expr

Operation: address = DP concatenated with expr

Figure 5–1 shows the formation of the data address. Example 5–1 is an
instruction example with data before and after instruction execution.

Figure 5–1. Direct Addressing

Instruction word:         bits 15–0 = expr
DP (data-page pointer):   bits 31–8 = x (not used), bits 7–0 = page
Address:                  bits 31–24 = 0, bits 23–0 = page concatenated with expr
Operand:                  bits 31–0 of the word at that address

Example 5–1. Direct Addressing

ADDI @0BCDEh,R7

                    Before Instruction    After Instruction

DP                  8Ah                   8Ah
R7                  0h                    12345678h
Data at 8ABCDEh     12345678h             12345678h
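
The address formation in Example 5–1 can be checked with a short Python sketch. The function name is hypothetical, and memory is modeled as a plain dictionary purely for illustration.

```python
def direct_address(dp: int, expr: int) -> int:
    """Concatenate the 8 LSBs of DP with the 16-bit expr to form a 24-bit address."""
    return ((dp & 0xFF) << 16) | (expr & 0xFFFF)


# Values from Example 5-1: DP = 8Ah, expr = 0BCDEh
memory = {0x8ABCDE: 0x12345678}
operand = memory[direct_address(0x8A, 0xBCDE)]
```

Only the eight least significant bits of DP take part, so a DP value of 1238Ah would select the same page.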

5.1.3 Indirect Addressing


Indirect addressing is used to specify the address of an operand in memory
through the contents of an auxiliary register, optional displacements, and
index registers. Only the 24 least significant bits of the auxiliary registers and
index registers are used in indirect addressing. The address arithmetic is
performed by the auxiliary register arithmetic units (ARAUs) on these lower 24 bits
and is unsigned. The upper eight bits are unmodified.

The flexibility of indirect addressing is possible because the ARAUs on the
TMS320C3x modify auxiliary registers in parallel with operations within the
main CPU. Indirect addressing is specified by a five-bit field in the instruction
word, referred to as the mod field. A displacement is either an explicit unsigned
eight-bit integer contained in the instruction word or an implicit displacement
of one. Two index registers, IR0 and IR1, can also be used in indirect addressing.
In some cases, an optional addressing scheme using circular or bit-reversed
addressing can be used. The mechanism for generating addresses in
circular addressing is discussed in Section 5.3; bit-reversed addressing is
discussed in Section 5.4.

Note: Auxiliary Register

The auxiliary register (ARn) to be used is encoded in the instruction word
according to its binary representation n (for example, AR3 is encoded as 11
in binary), not its register machine address (shown in Table 5–1).

Example 5–2. Auxiliary Register Indirect

An auxiliary register (ARn) contains the address of the operand to be fetched.

Operation:           operand address = ARn
Assembler Syntax:    *ARn
Modification Field:  11000

ARn:      bits 31–24 = x (not used), bits 23–0 = address
Operand:  bits 31–0 of the word at that address

Table 5–2 lists the various kinds of indirect addressing, along with the value
of the modification (mod) field, assembler syntax, operation, and function for
each. The succeeding 17 examples show the operation for each kind of indi-
rect addressing. Figure 5–2 shows the format in the instruction encoding.

Table 5–2. Indirect Addressing

Mod Field  Syntax           Operation               Description

Indirect Addressing with Displacement
00000      *+ARn(disp)      addr = ARn + disp       With predisplacement add
00001      *-ARn(disp)      addr = ARn – disp       With predisplacement subtract
00010      *++ARn(disp)     addr = ARn + disp       With predisplacement add and modify
                            ARn = ARn + disp
00011      *--ARn(disp)     addr = ARn – disp       With predisplacement subtract and modify
                            ARn = ARn – disp
00100      *ARn++(disp)     addr = ARn              With postdisplacement add and modify
                            ARn = ARn + disp
00101      *ARn--(disp)     addr = ARn              With postdisplacement subtract and modify
                            ARn = ARn – disp
00110      *ARn++(disp)%    addr = ARn              With postdisplacement add and circular modify
                            ARn = circ(ARn + disp)
00111      *ARn--(disp)%    addr = ARn              With postdisplacement subtract and circular modify
                            ARn = circ(ARn – disp)

Indirect Addressing with Index Register IR0
01000      *+ARn(IR0)       addr = ARn + IR0        With preindex (IR0) add
01001      *-ARn(IR0)       addr = ARn – IR0        With preindex (IR0) subtract
01010      *++ARn(IR0)      addr = ARn + IR0        With preindex (IR0) add and modify
                            ARn = ARn + IR0
01011      *--ARn(IR0)      addr = ARn – IR0        With preindex (IR0) subtract and modify
                            ARn = ARn – IR0
01100      *ARn++(IR0)      addr = ARn              With postindex (IR0) add and modify
                            ARn = ARn + IR0
01101      *ARn--(IR0)      addr = ARn              With postindex (IR0) subtract and modify
                            ARn = ARn – IR0
01110      *ARn++(IR0)%     addr = ARn              With postindex (IR0) add and circular modify
                            ARn = circ(ARn + IR0)
01111      *ARn--(IR0)%     addr = ARn              With postindex (IR0) subtract and circular modify
                            ARn = circ(ARn – IR0)

Indirect Addressing with Index Register IR1
10000      *+ARn(IR1)       addr = ARn + IR1        With preindex (IR1) add
10001      *-ARn(IR1)       addr = ARn – IR1        With preindex (IR1) subtract
10010      *++ARn(IR1)      addr = ARn + IR1        With preindex (IR1) add and modify
                            ARn = ARn + IR1
10011      *--ARn(IR1)      addr = ARn – IR1        With preindex (IR1) subtract and modify
                            ARn = ARn – IR1
10100      *ARn++(IR1)      addr = ARn              With postindex (IR1) add and modify
                            ARn = ARn + IR1
10101      *ARn--(IR1)      addr = ARn              With postindex (IR1) subtract and modify
                            ARn = ARn – IR1
10110      *ARn++(IR1)%     addr = ARn              With postindex (IR1) add and circular modify
                            ARn = circ(ARn + IR1)
10111      *ARn--(IR1)%     addr = ARn              With postindex (IR1) subtract and circular modify
                            ARn = circ(ARn – IR1)

Indirect Addressing (Special Cases)
11000      *ARn             addr = ARn              Indirect
11001      *ARn++(IR0)B     addr = ARn              With postindex (IR0) add and bit-reversed modify
                            ARn = B(ARn + IR0)

Legend:  addr     memory address
         ARn      auxiliary register AR0–AR7
         disp     displacement
         circ( )  address in circular addressing
         B        where bit-reversed addressing is performed
         ++       add and modify
         --       subtract and modify
         %        where circular addressing is performed

Example 5–3 through Example 5–19 illustrate each kind of indirect addressing
listed in Table 5–2.

Figure 5–2. Instruction Encoding Format

Most Significant Bit                                Least Significant Bit
| MOD (5 bits) | ARn (3 bits) | disp† (0, 5, or 8 bits) |

† The disp field may not exist in some instructions.
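
The regular layout of the mod field in Table 5–2 can be modeled in a few lines. The following Python sketch is illustrative only: the function name `indirect_address`, its arguments, and the split of the mod field into a two-bit modifier source and a three-bit operation code are assumptions read off the table, register widths are ignored, and only the linear (noncircular, non-bit-reversed) modes are handled.

```python
def indirect_address(mod, arn, disp=1, ir0=0, ir1=0):
    """Return (operand_address, new_ARn) for the linear modes of Table 5-2.

    mod is the 5-bit modification field; arn, ir0 and ir1 are register
    contents; disp is the displacement (implied value of 1 if omitted).
    """
    if mod == 0b11000:                 # *ARn: plain indirect, no modification
        return arn, arn
    source = mod >> 3                  # 00 = disp, 01 = IR0, 10 = IR1
    op = mod & 0b111                   # operation within the group
    if source > 2 or op > 0b101:
        raise ValueError("circular or bit-reversed mode not modeled here")
    step = (disp, ir0, ir1)[source]
    if op & 1:                         # odd operation codes subtract
        step = -step
    if op < 0b100:                     # predisplacement / preindex
        addr = arn + step
        new_arn = addr if op & 0b010 else arn
    else:                              # postdisplacement / postindex
        addr = arn
        new_arn = arn + step
    return addr, new_arn
```

For instance, under these assumptions `indirect_address(0b00011, 100, disp=4)` models `*--ARn(4)`: the operand is fetched from address 96 and ARn is left at 96.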


Example 5–3. Indirect With Predisplacement Add

The address of the operand to be fetched is the sum of an auxiliary register
(ARn) and the displacement (disp). The displacement is either an eight-bit
unsigned integer contained in the instruction word or an implied value of 1.
Operation:          operand address = ARn + disp
Assembler Syntax:   *+ARn(disp)
Modification Field: 00000

   ARn      bits 31–24: x x      bits 23–0: address
   disp     bits 31–8:  0...0    bits 7–0:  integer (+)
   operand  bits 31–0

Example 5–4. Indirect With Predisplacement Subtract

The address of the operand to be fetched is the contents of an auxiliary register
(ARn) minus the displacement (disp). The displacement is either an eight-bit
unsigned integer contained in the instruction word or an implied value of 1.
Operation:          operand address = ARn – disp
Assembler Syntax:   *-ARn(disp)
Modification Field: 00001

   ARn      bits 31–24: x x      bits 23–0: address
   disp     bits 31–8:  0...0    bits 7–0:  integer (–)
   operand  bits 31–0


Example 5–5. Indirect With Predisplacement Add and Modify

The address of the operand to be fetched is the sum of an auxiliary register
(ARn) and the displacement (disp). The displacement is either an eight-bit
unsigned integer contained in the instruction word or an implied value of 1.
After the data is fetched, the auxiliary register is updated with the address
generated.
Operation:          operand address = ARn + disp
                    ARn = ARn + disp
Assembler Syntax:   *++ARn(disp)
Modification Field: 00010

   ARn      bits 31–24: x x      bits 23–0: address
   disp     bits 31–8:  0...0    bits 7–0:  integer (+)
   operand  bits 31–0

Example 5–6. Indirect With Predisplacement Subtract and Modify

The address of the operand to be fetched is the contents of an auxiliary register
(ARn) minus the displacement (disp). The displacement is either an eight-bit
unsigned integer contained in the instruction word or an implied value of 1.
After the data is fetched, the auxiliary register is updated with the address
generated.
Operation:          operand address = ARn – disp
                    ARn = ARn – disp
Assembler Syntax:   *--ARn(disp)
Modification Field: 00011

   ARn      bits 31–24: x x      bits 23–0: address
   disp     bits 31–8:  0...0    bits 7–0:  integer (–)
   operand  bits 31–0


Example 5–7. Indirect With Postdisplacement Add and Modify

The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the displacement (disp) is added to the
auxiliary register. The displacement is either an eight-bit unsigned integer
contained in the instruction word or an implied value of 1.
Operation:          operand address = ARn
                    ARn = ARn + disp
Assembler Syntax:   *ARn++(disp)
Modification Field: 00100

   ARn      bits 31–24: x x      bits 23–0: address
   disp     bits 31–8:  0...0    bits 7–0:  integer (+)
   operand  bits 31–0

Example 5–8. Indirect With Postdisplacement Subtract and Modify

The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the displacement (disp) is subtracted from
the auxiliary register. The displacement is either an eight-bit unsigned integer
contained in the instruction word or an implied value of 1.
Operation:          operand address = ARn
                    ARn = ARn – disp
Assembler Syntax:   *ARn--(disp)
Modification Field: 00101

   ARn      bits 31–24: x x      bits 23–0: address
   disp     bits 31–8:  0...0    bits 7–0:  integer (–)
   operand  bits 31–0


Example 5–9. Indirect With Postdisplacement Add and Circular Modify

The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the displacement (disp) is added to the
contents of the auxiliary register using circular addressing. This result is used
to update the auxiliary register. The displacement is either an eight-bit
unsigned integer contained in the instruction word or an implied value of 1.
Operation:          operand address = ARn
                    ARn = circ(ARn + disp)
Assembler Syntax:   *ARn++(disp)%
Modification Field: 00110

   ARn      bits 31–24: x x      bits 23–0: address
   disp     bits 31–8:  0...0    bits 7–0:  integer (+) (%)
   operand  bits 31–0

Example 5–10. Indirect With Postdisplacement Subtract and Circular Modify

The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the displacement (disp) is subtracted from
the contents of the auxiliary register using circular addressing. This result is
used to update the auxiliary register. The displacement is either an eight-bit
unsigned integer contained in the instruction word or an implied value of 1.
Operation:          operand address = ARn
                    ARn = circ(ARn – disp)
Assembler Syntax:   *ARn--(disp)%
Modification Field: 00111

   ARn      bits 31–24: x x      bits 23–0: address
   disp     bits 31–8:  0...0    bits 7–0:  integer (–) (%)
   operand  bits 31–0


Example 5–11. Indirect With Preindex Add

The address of the operand to be fetched is the sum of an auxiliary register
(ARn) and an index register (IR0 or IR1).
Operation:          operand address = ARn + IRm
Assembler Syntax:   *+ARn(IRm)
Modification Field: 01000 if m = 0
                    10000 if m = 1

   ARn      bits 31–24: x x      bits 23–0: address
   IRm      bits 31–24: x x      bits 23–0: index (+)
   operand  bits 31–0

Example 5–12. Indirect With Preindex Subtract

The address of the operand to be fetched is the difference of an auxiliary
register (ARn) and an index register (IR0 or IR1).
Operation:          operand address = ARn – IRm
Assembler Syntax:   *-ARn(IRm)
Modification Field: 01001 if m = 0
                    10001 if m = 1

   ARn      bits 31–24: x x      bits 23–0: address
   IRm      bits 31–24: x x      bits 23–0: index (–)
   operand  bits 31–0


Example 5–13. Indirect With Preindex Add and Modify

The address of the operand to be fetched is the sum of an auxiliary register
(ARn) and an index register (IR0 or IR1). After the data is fetched, the auxiliary
register is updated with the address generated.
Operation:          operand address = ARn + IRm
                    ARn = ARn + IRm
Assembler Syntax:   *++ARn(IRm)
Modification Field: 01010 if m = 0
                    10010 if m = 1

   ARn      bits 31–24: x x      bits 23–0: address
   IRm      bits 31–24: x x      bits 23–0: index (+)
   operand  bits 31–0

Example 5–14. Indirect With Preindex Subtract and Modify

The address of the operand to be fetched is the difference between an auxiliary
register (ARn) and an index register (IR0 or IR1). The resulting address
becomes the new contents of the auxiliary register.
Operation:          operand address = ARn – IRm
                    ARn = ARn – IRm
Assembler Syntax:   *--ARn(IRm)
Modification Field: 01011 if m = 0
                    10011 if m = 1

   ARn      bits 31–24: x x      bits 23–0: address
   IRm      bits 31–24: x x      bits 23–0: index (–)
   operand  bits 31–0


Example 5–15. Indirect With Postindex Add and Modify

The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the index register (IR0 or IR1) is added
to the auxiliary register.
Operation:          operand address = ARn
                    ARn = ARn + IRm
Assembler Syntax:   *ARn++(IRm)
Modification Field: 01100 if m = 0
                    10100 if m = 1

   ARn      bits 31–24: x x      bits 23–0: address
   IRm      bits 31–24: x x      bits 23–0: index (+)
   operand  bits 31–0

Example 5–16. Indirect With Postindex Subtract and Modify

The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the index register (IR0 or IR1) is
subtracted from the auxiliary register.
Operation:          operand address = ARn
                    ARn = ARn – IRm
Assembler Syntax:   *ARn--(IRm)
Modification Field: 01101 if m = 0
                    10101 if m = 1

   ARn      bits 31–24: x x      bits 23–0: address
   IRm      bits 31–24: x x      bits 23–0: index (–)
   operand  bits 31–0


Example 5–17. Indirect With Postindex Add and Circular Modify

The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the index register (IR0 or IR1) is added
to the auxiliary register. This value is evaluated using circular addressing and
replaces the contents of the auxiliary register.
Operation:          operand address = ARn
                    ARn = circ(ARn + IRm)
Assembler Syntax:   *ARn++(IRm)%
Modification Field: 01110 if m = 0
                    10110 if m = 1

   ARn      bits 31–24: x x      bits 23–0: address
   IRm      bits 31–24: x x      bits 23–0: index (+) (%)
   operand  bits 31–0

Example 5–18. Indirect With Postindex Subtract and Circular Modify

The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the index register (IR0 or IR1) is
subtracted from the auxiliary register. This result is evaluated using circular
addressing and replaces the contents of the auxiliary register.
Operation:          operand address = ARn
                    ARn = circ(ARn – IRm)
Assembler Syntax:   *ARn--(IRm)%
Modification Field: 01111 if m = 0
                    10111 if m = 1

   ARn      bits 31–24: x x      bits 23–0: address
   IRm      bits 31–24: x x      bits 23–0: index (–) (%)
   operand  bits 31–0


Example 5–19. Indirect With Postindex Add and Bit-Reversed Modify

The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the index register (IR0) is added to the
auxiliary register. This addition is performed with a reverse-carry propagation
and can be used to yield a bit-reversed (B) address. This value replaces the
contents of the auxiliary register.
Operation:          operand address = ARn
                    ARn = B(ARn + IR0)
Assembler Syntax:   *ARn++(IR0)B
Modification Field: 11001

   ARn      bits 31–24: x x      bits 23–0: address
   IR0      bits 31–24: x x      bits 23–0: index (+) (B)
   operand  bits 31–0

5.1.4 Short-Immediate Addressing


In short-immediate addressing, the operand is a 16-bit immediate value con-
tained in the 16 least significant bits of the instruction word (expr). Depending
on the data types assumed for the instruction, the short-immediate operand
can be a two’s complement integer, an unsigned integer, or a floating-point
number. This is the syntax for this mode:

Syntax: expr


Example 5–20 illustrates before- and after-instruction data.

Example 5–20. Short-Immediate Addressing


SUBI 1,R0

Before Instruction: After Instruction:

R0 = 0h R0 = 0FFFFFFFFh
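
The result in Example 5–20 follows from 32-bit two's-complement wraparound. The sketch below models only the integer interpretation of a short-immediate operand; the helper names `sign_extend16` and `subi_short` are hypothetical, and the model assumes that the integer result occupies the 32 LSBs of the destination register.

```python
def sign_extend16(value):
    """Interpret the 16 LSBs of value as a two's-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def subi_short(imm16, reg):
    """Model SUBI imm,Rn with a signed 16-bit short-immediate operand."""
    return (reg - sign_extend16(imm16)) & 0xFFFFFFFF  # wrap to 32 bits
```

With R0 = 0h, `subi_short(1, 0)` gives 0FFFFFFFFh, which agrees with the after-instruction value in the example.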

5.1.5 Long-Immediate Addressing


In long-immediate addressing, the operand is a 24-bit immediate value con-
tained in the 24 least significant bits of the instruction word (expr). This is the
syntax for this mode:

Syntax: expr

Example 5–21 illustrates before- and after-instruction data.

Example 5–21. Long-Immediate Addressing


BR 8000h

Before Instruction: After Instruction:

PC = 0h PC = 8000h

5.1.6 PC-Relative Addressing


Program counter (PC)-relative addressing is used for branching. It adds the
contents of the 16 or 24 least significant bits of the instruction word to the PC
register. The assembler takes the src (a label or address) specified by the user
and generates a displacement. If the branch is a standard branch, this dis-
placement is equal to [label – (instruction address +1)]. If the branch is a
delayed branch, this displacement is equal to [label – (instruction ad-
dress + 3)].

The displacement is stored as a 16-bit or 24-bit signed integer in the least sig-
nificant bits of the instruction word. The displacement is added to the PC during
the pipeline decode phase. Notice that because the PC is incremented by 1
in the fetch phase, the displacement is added to this incremented PC value.

Syntax: expr (src)

Example 5–22 illustrates before- and after-instruction data.


Example 5–22. PC-Relative Addressing


BU NEWPC ; pc=1001h, NEWPC label = 1005h, displacement = 3

Before Instruction After Instruction


decode phase: execution phase:

PC = 1002h PC = 1005h
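
The displacement rule stated above can be checked against Example 5–22 with a short calculation. This Python sketch is only an illustration of the two formulas in the text; the function names `branch_displacement` and `branch_target` are hypothetical, and field-width limits (16 or 24 bits) are not enforced.

```python
def branch_displacement(target, instruction_address, delayed=False):
    """Displacement the assembler generates for a PC-relative branch."""
    offset = 3 if delayed else 1      # delayed branches use address + 3
    return target - (instruction_address + offset)


def branch_target(instruction_address, displacement, delayed=False):
    """Inverse calculation: the address that the branch transfers control to."""
    offset = 3 if delayed else 1
    return instruction_address + offset + displacement
```

For the standard branch in Example 5–22 (instruction at 1001h, label at 1005h), `branch_displacement(0x1005, 0x1001)` yields 3.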

The 24-bit addressing mode encodes the program control instructions (for
example, BR, BRD, CALL, RPTB, and RPTBD). Depending on the instruction,
the new PC value is derived by adding a 24-bit signed value in the instruction
word to the present PC value. Bit 24 determines the type of branch (D = 0
for a standard branch or D = 1 for a delayed branch). Some of the instructions
are encoded in Figure 5–3.

Figure 5–3. Encoding for 24-Bit PC-Relative Addressing Mode


(a) BR, BRD: unconditional branches (standard and delayed)

31 25 24 23 0

0 1 1 0 0 0 0 0 displacement

(b) CALL: unconditional subroutine call

31 24 23 0

0 1 1 0 0 0 1 0 displacement

(c) RPTB: repeat block

31 25 24 23 0

0 1 1 0 0 1 0 0 displacement


5.2 Groups of Addressing Modes


Six types of addressing (covered in Section 5.1, beginning on page 5-2) form
these four groups of addressing modes:
- General addressing modes (G)
- Three-operand addressing modes (T)
- Parallel addressing modes (P)
- Conditional-branch addressing modes (B)

5.2.1 General Addressing Modes


Instructions that use the general addressing modes are general-purpose in-
structions, such as ADDI, MPYF, and LSH. Such instructions usually have this
form:

dst operation src → dst

where the destination operand is signified by dst and the source operand by
src; operation defines an operation to be performed on the operands using the
general addressing modes. Bits 31–29 are 0, indicating general addressing
mode instructions. Bits 22 and 21 specify the general addressing mode (G)
field, which defines how bits 15–0 are to be interpreted for addressing the src
operand.

Options for bits 22 and 21 (G field) are as follows:

00   register (all CPU registers unless specified otherwise)
01   direct
10   indirect
11   immediate

If the src and dst fields contain register specifications, the value in these fields
contains the CPU register addresses as defined by Table 5–1 on page 5-3.
For the general addressing modes, the following values of ARn are valid:

ARn, 0 ≤ n ≤ 7

Figure 5–4 shows the encoding for the general addressing modes. The nota-
tion mod indicates the modification field that goes with the ARn field. Refer to
Table 5–2 on page 5-6 for further information.


Figure 5–4. Encoding for General Addressing Modes

Bits 31–29   28–23       22–21 (G)   20–16 (destination)   15–0 (source operands)
000          operation   00          dst                   bits 15–5: 0...0; bits 4–0: src
000          operation   01          dst                   direct
000          operation   10          dst                   bits 15–11: modn; bits 10–8: ARn; bits 7–0: disp
000          operation   11          dst                   immediate
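
The field boundaries in Figure 5–4 translate directly into shifts and masks. The following Python sketch is a hedged illustration, not a complete disassembler: the function name `decode_general` and the dictionary keys are hypothetical, and the operation code is returned as a raw number because the opcode assignments are not given in this section.

```python
G_MODES = ("register", "direct", "indirect", "immediate")


def decode_general(word):
    """Split a 32-bit general-addressing instruction word into fields."""
    if (word >> 29) & 0b111 != 0b000:
        raise ValueError("bits 31-29 are not 000")
    fields = {
        "operation": (word >> 23) & 0x3F,    # bits 28-23
        "g": G_MODES[(word >> 21) & 0b11],   # bits 22-21
        "dst": (word >> 16) & 0x1F,          # bits 20-16
    }
    src = word & 0xFFFF                      # bits 15-0
    if fields["g"] == "register":
        fields["src"] = src & 0x1F           # bits 4-0
    elif fields["g"] == "indirect":
        fields["mod"] = (src >> 11) & 0x1F   # bits 15-11
        fields["arn"] = (src >> 8) & 0b111   # bits 10-8
        fields["disp"] = src & 0xFF          # bits 7-0
    else:                                    # direct or immediate
        fields["src"] = src
    return fields
```

The mod value returned for the indirect case is the modification field of Table 5–2.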

5.2.2 Three-Operand Addressing Modes


Instructions that use the three-operand addressing modes, such as
ADDI3, LSH3, CMPF3, or XOR3, usually have this form:

SRC1 operation SRC2 → dst

where the destination operand is signified by dst and the source operands by
SRC1 and SRC2; operation defines an operation to be performed. Note that
the 3 can be omitted from three-operand instructions.

Bits 31–29 are set to the value of 001, indicating three-operand addressing
mode instructions. Bits 22 and 21 specify the three-operand addressing mode
(T) field, which defines how bits 15–0 are to be interpreted for addressing the
SRC operands. Bits 15–8 define the SRC1 address; bits 7–0 define the SRC2
address. Options for bits 22 and 21 (T) are as follows:

T SRC1 SRC2

0 0 register register
0 1 indirect register
1 0 register indirect
1 1 indirect indirect

Figure 5–5 shows the encoding for three-operand addressing. If the SRC1
and SRC2 fields use the same auxiliary register, both addresses are correctly
generated. However, only the value created by the SRC1 field is saved in the
auxiliary register specified. The assembler issues a warning if you specify this
condition.

The following values of ARn and ARm are valid:

ARn, 0 ≤ n ≤ 7
ARm, 0 ≤ m ≤ 7


The notation modm or modn indicates that the modification field goes with the
ARm or ARn field, respectively. Refer to Table 5–2 on page 5-6 for further
information.

In indirect addressing of the three-operand addressing mode, displacements
(if used) are allowed to be 0 or 1, and the index registers (IR0 and IR1) can be
used. The displacement of 1 is implied and is not explicitly coded in the
instruction word.

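Figure 5–5. Encoding for Three-Operand Addressing Modes

Bits 31–29   28–23       22–21 (T)   20–16   15–8 (SRC1)                          7–0 (SRC2)
001          operation   00          dst     bits 15–13: 000; bits 12–8: src1     bits 7–5: 000; bits 4–0: src2
001          operation   01          dst     bits 15–11: modn; bits 10–8: ARn     bits 7–5: 000; bits 4–0: src2
001          operation   10          dst     bits 15–13: 000; bits 12–8: src1     bits 7–3: modn; bits 2–0: ARn
001          operation   11          dst     bits 15–11: modn; bits 10–8: ARn     bits 7–3: modm; bits 2–0: ARm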
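
As a rough illustration of how the T field selects the interpretation of the two source bytes, the Python sketch below decodes a three-operand instruction word according to Figure 5–5. The function name `decode_three_operand` and the dictionary layout are hypothetical; the operation code is left as a raw number.

```python
def decode_three_operand(word):
    """Split a 32-bit three-operand instruction word into fields."""
    if (word >> 29) & 0b111 != 0b001:
        raise ValueError("bits 31-29 are not 001")
    t = (word >> 21) & 0b11                  # bits 22-21

    def operand(byte, indirect):
        if indirect:                         # mod in upper 5 bits, AR in lower 3
            return {"mod": byte >> 3, "ar": byte & 0b111}
        return {"reg": byte & 0x1F}          # register address in 5 LSBs

    return {
        "operation": (word >> 23) & 0x3F,    # bits 28-23
        "dst": (word >> 16) & 0x1F,          # bits 20-16
        "src1": operand((word >> 8) & 0xFF, t & 0b01),  # T = 01 or 11
        "src2": operand(word & 0xFF, t & 0b10),         # T = 10 or 11
    }
```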

5.2.3 Parallel Addressing Modes


Instructions that use parallel addressing, indicated by || (two vertical bars), al-
low the most parallelism possible. The destination operands are indicated as
d1 and d2, signifying dst1 and dst2, respectively (see Figure 5–6). The source
operands, signified by src1 and src2, use the extended-precision registers.
Operation refers to the parallel operation to be performed.

Figure 5–6. Encoding for Parallel Addressing Modes

Bits    31–30   29–26       25–24   23   22   21–19   18–16   15–11   10–8   7–3    2–0
Field   10      operation   P       d1   d2   src1    src2    modn    ARn    modm   ARm

Bits 15–8 (modn, ARn) form src3; bits 7–0 (modm, ARm) form src4.


The parallel addressing mode (P) field specifies how the operands are to be
used, that is, whether they are source or destination. The specific relationship
between the P field and the operands is detailed in the description of the indi-
vidual parallel instructions (see Chapter 10). However, the operands are al-
ways encoded in the same way. Bits 31 and 30 are set to the value of 10, indi-
cating parallel addressing mode instructions. Bits 25 and 24 specify the paral-
lel addressing mode (P) field, which defines how bits 21–0 are to be interpreted
for addressing the src operands. Bits 21–19 define the src1 address, bits
18–16 define the src2 address, bits 15–8 the src3 address, and bits 7–0 the
src4 address. The notations modn and modm indicate which modification field
goes with which ARn or ARm (auxiliary register) field, respectively. Following
is a list of the parallel addressing operands:
- src1 0 ≤ src1 ≤ 7 (extended-precision registers R0 – R7)
- src2 0 ≤ src2 ≤ 7 (extended-precision registers R0–R7)
- d1 If 0, dst1 is R0. If 1, dst1 is R1.
- d2 If 0, dst2 is R2. If 1, dst2 is R3.
- P 0≤ P≤3
- src3 indirect (disp = 0, 1, IR0, IR1)
- src4 indirect (disp = 0, 1, IR0, IR1)

As in the three-operand addressing mode, indirect addressing in the parallel
addressing mode allows for displacements of 0 or 1 and the use of the index
registers (IR0 and IR1). The displacement of 1 is implied and is not explicitly
coded in the instruction word.

In the encoding shown for this mode in Figure 5–6 on page 5-21, if the src3
and src4 fields use the same auxiliary register, both addresses are correctly
generated, but only the value created by the src3 field is saved in the auxiliary
register specified. The assembler issues a warning if you specify this condi-
tion.
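
The parallel-mode fields can be extracted in the same way. This Python sketch is illustrative only: the function name `decode_parallel` and the key names are hypothetical, and the positions of d1 (bit 23) and d2 (bit 22) are assumptions read off Figure 5–6. How the P field assigns source and destination roles is left to the individual instruction descriptions.

```python
def decode_parallel(word):
    """Split a 32-bit parallel-addressing instruction word into fields."""
    if (word >> 30) & 0b11 != 0b10:
        raise ValueError("bits 31-30 are not 10")

    def indirect(byte):                      # mod in upper 5 bits, AR in lower 3
        return {"mod": byte >> 3, "ar": byte & 0b111}

    return {
        "operation": (word >> 26) & 0xF,             # bits 29-26
        "p": (word >> 24) & 0b11,                    # bits 25-24
        "dst1": "R1" if (word >> 23) & 1 else "R0",  # d1
        "dst2": "R3" if (word >> 22) & 1 else "R2",  # d2
        "src1": (word >> 19) & 0b111,                # R0-R7
        "src2": (word >> 16) & 0b111,                # R0-R7
        "src3": indirect((word >> 8) & 0xFF),        # bits 15-8
        "src4": indirect(word & 0xFF),               # bits 7-0
    }
```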


5.2.4 Conditional-Branch Addressing Modes


Instructions using the conditional-branch addressing modes (Bcond, BcondD,
CALLcond, DBcond, and DBcondD) can perform a variety of conditional
operations. For Bcond and DBcond, bits 31–27 are set to the value of 01101,
indicating conditional-branch addressing mode instructions, and bit 26 selects
the instruction: as Figure 5–7 shows, 0 selects Bcond and 1 selects DBcond
(CALLcond is encoded with 011100 in bits 31–26). Bit 25 determines the
conditional-branch addressing mode (B). If B = 0, register addressing is used;
if B = 1, PC-relative addressing is used. Bit 21 sets the type of branch: D = 0
for a standard branch or D = 1 for a delayed branch. The condition field (cond)
specifies the condition checked to determine what action to take, that is,
whether to branch (see Chapter 10 for a list of condition codes). Figure 5–7
shows the encoding for conditional-branch addressing.

Figure 5–7. Encoding for Conditional-Branch Addressing Modes

DBcond (D):
Bits 31–26   25   24–22   21   20–16   15–0
011011       B    ARn     D    cond    bits 15–5: 0...0; bits 4–0: src reg
011011       B    ARn     D    cond    immediate (PC relative)

Bcond (D):
Bits 31–26   25   24–22   21   20–16   15–0
011010       B    000     D    cond    bits 15–5: 0...0; bits 4–0: src reg
011010       B    000     D    cond    immediate (PC relative)

CALLcond:
Bits 31–26   25   24–22   21   20–16   15–0
011100       B    000     0    cond    bits 15–5: 0...0; bits 4–0: src reg
011100       B    000     0    cond    immediate (PC relative)
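
The three layouts in Figure 5–7 differ only in bits 31–26 and in whether bits 24–22 hold an auxiliary register number. The Python sketch below decodes them under that reading; the function name `decode_cond_branch` and the key names are hypothetical, the condition code is returned as a raw number, and treating the 16-bit immediate as a signed displacement follows the PC-relative description in Section 5.1.6.

```python
KINDS = {0b011010: "Bcond", 0b011011: "DBcond", 0b011100: "CALLcond"}


def decode_cond_branch(word):
    """Split a 32-bit conditional-branch instruction word into fields."""
    top = (word >> 26) & 0x3F                 # bits 31-26
    if top not in KINDS:
        raise ValueError("not a conditional-branch encoding")
    pc_relative = bool((word >> 25) & 1)      # B field
    fields = {
        "kind": KINDS[top],
        "pc_relative": pc_relative,
        "delayed": bool((word >> 21) & 1),    # D field (always 0 for CALLcond)
        "cond": (word >> 16) & 0x1F,          # bits 20-16
    }
    if fields["kind"] == "DBcond":
        fields["arn"] = (word >> 22) & 0b111  # bits 24-22
    if pc_relative:
        disp = word & 0xFFFF                  # signed 16-bit displacement
        fields["disp"] = disp - 0x10000 if disp & 0x8000 else disp
    else:
        fields["src_reg"] = word & 0x1F       # bits 4-0
    return fields
```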


5.3 Circular Addressing


Many algorithms, such as convolution and correlation, require the implemen-
tation of a circular buffer in memory. In convolution and correlation, the circular
buffer is used to implement a sliding window that contains the most recent data
to be processed. As new data is brought in, the new data overwrites the oldest
data. Key to the implementation of a circular buffer is the implementation of a
circular addressing mode. This section describes the circular addressing
mode of the TMS320C3x.

The block size register (BK) specifies the size of the circular buffer. By labeling
the most significant 1 of the BK register as bit N, with N ≤ 15, you can find the
address immediately following the bottom of the circular buffer by concatenating
bits 31 through N + 1 of a user-selected register (ARn) with bits N through
0 of the BK register. The address of the top of the buffer is referred to as the
effective base (EB) and can be found by concatenating bits 31 through N + 1
of ARn, with bits N through 0 of EB being 0.

Figure 5–8 illustrates the relationships between the block size register (BK),
the auxiliary registers (ARn), the bottom of the circular buffer, the top of the cir-
cular buffer, and the index into the circular buffer.

A circular buffer of size R must start on a K-bit boundary (that is, the K LSBs
of the starting address of the circular buffer must be 0), where K is an integer
that satisfies 2^K > R. Since the value R must be loaded into the BK register,
K ≥ N + 1. For example, a 31-word circular buffer must start at an address
whose five LSBs are 0 (that is, XXXXXXXXXXXXXXXXXXXXXXXXXXX00000 in binary),
and the value 31 must be loaded into the BK register.
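
The alignment rule can be checked numerically. In the Python sketch below, the function names `alignment_bits` and `effective_base` are hypothetical; the code simply takes the smallest K that satisfies 2^K > R and clears that many LSBs of ARn to obtain the effective base.

```python
def alignment_bits(block_size):
    """Smallest K with 2**K > block_size (block_size >= 1)."""
    return block_size.bit_length()


def effective_base(arn, block_size):
    """Top of the circular buffer: ARn with its K LSBs cleared."""
    k = alignment_bits(block_size)
    return arn & ~((1 << k) - 1)
```

For the 31-word buffer in the text, `alignment_bits(31)` is 5, so the buffer must start at an address whose five LSBs are 0.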


Figure 5–8. Flowchart for Circular Addressing

(Most significant 1 of BK at location N, where N ≤ 15)

BK                     bits 31–(N+1): 0...0    bit N: 1    bits (N–1)–0: N LSBs of BK
ARn                    bits 31–(N+1): H...H    bits N–0: L...L
EB (top of buffer)     bits 31–(N+1): H...H    bits N–0: 0...0
Bottom of buffer + 1   bits 31–(N+1): H...H    bit N: 1    bits (N–1)–0: N LSBs of BK
Index                  bits 31–(N+1): H...H    bits N–0: L...L

The index passes through the circular addressing algorithm logic, which produces:

New index              bits 31–(N+1): 0...0    bits N–0: L′...L′
New ARn                bits 31–(N+1): H...H    bits N–0: L′...L′

Legend:  ARn  auxiliary register n       BK   block size register
         EB   effective base             H    high-order bits
         L    low-order bits             L′   new low-order bits
         LSB  least significant bit      N    bit value


In circular addressing, index refers to the N LSBs of the auxiliary register se-
lected, and step is the quantity being added to or subtracted from the auxiliary
register. Follow these two rules when you use circular addressing:

- The step used must be less than or equal to the block size. The step size
is treated as an unsigned integer.

- The first time the circular queue is addressed, the auxiliary register must
be pointing to an element in the circular queue.

The algorithm for circular addressing is as follows:

If 0 ≤ index + step < BK:
    index = index + step.

Else if index + step ≥ BK:
    index = index + step – BK.

Else if index + step < 0:
    index = index + step + BK.

Figure 5–9 shows how the circular buffer is implemented and illustrates the re-
lationship of the quantities generated and the elements in the circular buffer.

Figure 5–9. Circular Buffer Implementation

Address                                                                  Data

Effective base (EB)       bits 31–(N+1): H...H (MSBs of ARn)
                          bits N–0: 0...0                            →   Element 0 (top of circular buffer)
                                                                         Element 1
                                                                         ...
Auxiliary register (ARn)  bits 31–(N+1): H...H (MSBs of ARn)
                          bits N–0: L...L (LSBs of ARn)              →   Element (N LSBs of ARn)
                                                                         ...
                                                                         Last element
                          bits 31–(N+1): H...H (MSBs of ARn)
                          bits N–0: LSBs of BK                       →   Last element + 1


Example 5–23 shows circular addressing operation. Assuming that all ARs
are four bits, let AR0 = 0000, and BK = 0110 (block size of 6). Example 5–23
shows a sequence of modifications and the resulting value of AR0.
Example 5–23 also shows how the pointer steps through the circular queue
with a variety of step sizes (both incrementing and decrementing).

Example 5–23. Circular Addressing

*AR0++(5)%    ; AR0 = 0 (0th value)
*AR0++(2)%    ; AR0 = 5 (1st value)
*AR0--(3)%    ; AR0 = 1 (2nd value)
*AR0++(6)%    ; AR0 = 4 (3rd value)
*AR0--%       ; AR0 = 4 (4th value)
*AR0          ; AR0 = 3 (5th value)

Value          Data                          Address

0th       →    Element 0                     0
2nd       →    Element 1                     1
               Element 2                     2
5th       →    Element 3                     3
4th, 3rd  →    Element 4                     4
1st       →    Element 5 (last element)      5
               Last element + 1              6
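
The sequence in Example 5–23 can be reproduced with a direct transcription of the circular addressing algorithm. The Python sketch below is a simplified model: the function name `circular_update` is hypothetical, the step is passed as a signed number (negative for the `--` forms), and the index is taken to be bits N through 0 of the auxiliary register.

```python
def circular_update(arn, step, bk):
    """Return the new ARn after a circular modify by a signed step."""
    k = bk.bit_length()              # index occupies bits N..0 (K = N + 1 bits)
    mask = (1 << k) - 1
    index = (arn & mask) + step
    if index >= bk:                  # ran past the bottom of the buffer
        index -= bk
    elif index < 0:                  # ran past the top of the buffer
        index += bk
    return (arn & ~mask) | index     # high-order bits of ARn are unchanged


def trace(arn, steps, bk):
    """List the address used by each access, then by one final *ARn."""
    used = []
    for step in steps:
        used.append(arn)             # post-modify: address is used first
        arn = circular_update(arn, step, bk)
    used.append(arn)
    return used
```

With AR0 = 0, BK = 6, and the steps +5, +2, –3, +6, –1, `trace(0, [5, 2, -3, 6, -1], 6)` returns the addresses 0, 5, 1, 4, 4, 3 listed in the example.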


Circular addressing is especially useful for the implementation of FIR filters.
Figure 5–10 shows one possible data structure for FIR filters. Note that the
initial value of AR0 points to h(N – 1), and the initial value of AR1 points to x(0).
Circular addressing is used in the TMS320C3x code for the FIR filter shown
in Example 5–24.

Figure 5–10. Data Structure for FIR Filters

           Impulse Response      Input Samples

AR0 →      h(N – 1)              x(N – 1)
           h(N – 2)              x(N – 2)
           .                     .
           .                     .
           .                     .
           h(2)                  x(2)
           h(1)                  x(1)
           h(0)                  x(0)            ← AR1

Example 5–24. FIR Filter Code Using Circular Addressing


* Initialization
*
        LDI     N,BK                ; Load block size.
        LDI     H,AR0               ; Load pointer to impulse response.
        LDI     X,AR1               ; Load pointer to bottom of input
*                                   ; sample buffer.
*
TOP     LDF     IN,R3               ; Read input sample.
        STF     R3,*AR1++%          ; Store with other samples,
                                    ; and point to top of buffer.
        LDF     0,R0                ; Initialize R0.
        LDF     0,R2                ; Initialize R2.
*
* Filter
*
        RPTS    N-1                 ; Repeat next instruction.
        MPYF3   *AR0++%,*AR1++%,R0
||      ADDF3   R0,R2,R2            ; Multiply and accumulate.
        ADDF    R0,R2               ; Last product accumulated.
*
        STF     R2,Y                ; Save result.
        B       TOP                 ; Repeat.
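
The pointer movement in Example 5–24 can be followed in a high-level model. The Python sketch below is not a translation of the assembly; it is a simplified illustration in which the function name `make_fir`, the list-based sample buffer, and the use of the `%` operator for the circular post-increment are all assumptions. It stores each new sample over the oldest one and then accumulates h(N – 1)·x(N – 1) through h(0)·x(0).

```python
def make_fir(h):
    """Return a function that filters one sample at a time."""
    n = len(h)
    coeffs = list(reversed(h))      # h(N-1) first, h(0) last, as in Figure 5-10
    samples = [0.0] * n             # circular buffer of the N most recent inputs
    store_at = n - 1                # models AR1, initially at the bottom slot

    def fir(x_new):
        nonlocal store_at
        samples[store_at] = x_new           # store the new sample
        pos = (store_at + 1) % n            # pointer wraps to the oldest sample
        acc = 0.0
        for c in coeffs:                    # N multiply-accumulate steps
            acc += c * samples[pos]
            pos = (pos + 1) % n             # circular post-increment
        store_at = pos                      # back at the oldest sample's slot
        return acc

    return fir
```

Feeding a unit impulse into `make_fir([1.0, 2.0, 3.0])` returns the coefficients in order, which is a quick check that the buffer rotation is right.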


5.4 Bit-Reversed Addressing


Bit-reversed addressing on the TMS320C3x improves execution speed and
program memory use for FFT algorithms that use a variety of radices. The base
address of bit-reversed addressing must be located on a boundary of the size
of the table. For example, if IR0 = 2^(n–1), the n LSBs of the base address must
be 0; that is, the base address of the data in memory must be on a 2^n boundary.
One auxiliary register points to the physical location of a data value. IR0
specifies one-half the size of the FFT; that is, the value contained in IR0 must
be equal to 2^(n–1), where n is an integer and the FFT size is 2^n. When you add
IR0 to the auxiliary register by using bit-reversed addressing, addresses are
generated in a bit-reversed fashion.

To illustrate this kind of addressing, assume eight-bit auxiliary registers. Let
AR2 contain the value 0110 0000 (96). This is the base address of the data in
memory. Let IR0 contain the value 0000 1000 (8). Example 5–25 shows a
sequence of modifications of AR2 and the resulting values of AR2.

Example 5–25. Bit-Reversed Addressing


*AR2++(IR0)B ; AR2 = 0110 0000 (0th value)
*AR2++(IR0)B ; AR2 = 0110 1000 (1st value)
*AR2++(IR0)B ; AR2 = 0110 0100 (2nd value)
*AR2++(IR0)B ; AR2 = 0110 1100 (3rd value)
*AR2++(IR0)B ; AR2 = 0110 0010 (4th value)
*AR2++(IR0)B ; AR2 = 0110 1010 (5th value)
*AR2++(IR0)B ; AR2 = 0110 0110 (6th value)
*AR2 ; AR2 = 0110 1110 (7th value)

Table 5–3 shows the relationship of the index steps and the four LSBs of AR2.
You can find the four LSBs by reversing the bit pattern of the steps.


Table 5–3. Index Steps and Bit-Reversed Addressing


Step Bit Pattern Bit-Reversed Pattern Bit-Reversed Step
0 0000 0000 0
1 0001 1000 8
2 0010 0100 4
3 0011 1100 12

4 0100 0010 2
5 0101 1010 10
6 0110 0110 6
7 0111 1110 14

8 1000 0001 1
9 1001 1001 9
10 1010 0101 5
11 1011 1101 13

12 1100 0011 3
13 1101 1011 11
14 1110 0111 7
15 1111 1111 15
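
Both Example 5–25 and Table 5–3 can be reproduced with a small model of reverse-carry addition. The Python sketch below is one way to express it, not a description of the hardware: the function names `reverse_bits` and `reverse_carry_add` are hypothetical, and the eight-bit register width is the simplification used in Example 5–25. Reversing both operands, adding normally, and reversing the sum makes the carry propagate toward the LSB.

```python
def reverse_bits(value, width):
    """Reverse the order of the width LSBs of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(arn, ir0, width=8):
    """Add ir0 to arn with the carry propagating from MSB toward LSB."""
    total = reverse_bits(arn, width) + reverse_bits(ir0, width)
    return reverse_bits(total & ((1 << width) - 1), width)


def bit_reversed_sequence(arn, ir0, count, width=8):
    """Addresses used by count successive *ARn++(IR0)B accesses."""
    used = []
    for _ in range(count):
        used.append(arn)
        arn = reverse_carry_add(arn, ir0, width)
    return used
```

Starting from AR2 = 0110 0000 with IR0 = 0000 1000, eight accesses give 60h, 68h, 64h, 6Ch, 62h, 6Ah, 66h, 6Eh, matching Example 5–25.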


5.5 System and User Stack Management


The TMS320C3x provides a dedicated system stack pointer (SP) for building
stacks in memory. The auxiliary registers can also be used to build a variety
of more general linear lists. This section discusses the implementation of the
following types of linear lists:
- Stack
The stack is a linear list for which all insertions and deletions are made at
one end of the list.
- Queue
The queue is a linear list for which all insertions are made at one end of the
list and all deletions are made at the other end.
- Dequeue
The dequeue is a double-ended queue linear list for which insertions and
deletions are made at either end of the list.

5.5.1 System Stack Pointer


The system stack pointer (SP) is a 32-bit register that contains the address of
the top of the system stack. The system stack fills from low-memory address
to high-memory address (see Figure 5–11). The SP always points to the last
element pushed onto the stack. A push performs a preincrement, and a pop
performs a postdecrement of the system stack pointer.
The program counter is pushed onto the system stack on subroutine calls,
traps, and interrupts. It is popped from the system stack on returns. The sys-
tem stack can be pushed and popped using the PUSH, POP, PUSHF, and
POPF instructions.

Figure 5–11. System Stack Configuration


Low Memory

Bottom of Stack

.
.
.

SP → Top of Stack

(Free)

High Memory


5.5.2 Stacks

Stacks can be built from low to high memory or high to low memory. Two cases
for each type of stack are shown. Stacks can be built using the preincrement/
decrement and postincrement/decrement modes of modifying the auxiliary
registers (AR). Stack growth from high-to-low memory can be implemented in
two ways:

CASE 1: Stores to memory using *--ARn to push data onto the stack and
reads from memory using *ARn++ to pop data off the stack.

CASE 2: Stores to memory using *ARn-- to push data onto the stack and
reads from memory using *++ARn to pop data off the stack.

Figure 5–12 illustrates these two cases. The only difference is that in case 1,
the AR always points to the top of the stack, and in case 2, the AR always points
to the next free location on the stack.

Figure 5–12. Implementations of High-to-Low Memory Stacks

Case 1 Case 2
Low Memory Low Memory
(Free) ARn → (Free)
ARn → Top of Stack Top of Stack

Bottom of Stack Bottom of Stack


High Memory High Memory

Stack growth from low-to-high memory can be implemented in two ways:

CASE 3: Stores to memory using *++ARn to push data onto the stack and
reads from memory using *ARn-- to pop data off the stack.

CASE 4: Stores to memory using *ARn++ to push data onto the stack and
reads from memory using *--ARn to pop data off the stack.

Figure 5–13 shows these two cases. In case 3, the AR always points to the top
of the stack. In case 4, the AR always points to the next free location on the
stack.


Figure 5–13. Implementations of Low-to-High Memory Stacks

Case 3 Case 4
Low Memory Low Memory

Bottom of Stack Bottom of Stack


. .
. .
. .
ARn → Top of Stack Top of Stack
(Free) ARn → (Free)
High Memory High Memory
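
The four stack cases can be compared side by side with a short Python model. It is a sketch only: the mode names ('pre+', 'post-', and so on), the helper `make_stack`, and the start address 100 are illustrative assumptions, not part of the TMS320C3x tools.

```python
def make_stack(push_mode, pop_mode):
    """Return (push, pop, state) for one auxiliary-register stack.

    Modes name the indirect addressing form: 'pre+', 'pre-', 'post+', 'post-'.
    """
    state = {"ar": 100, "mem": {}}   # 100 is an arbitrary start address

    def access(mode, value=None):
        step = 1 if mode.endswith("+") else -1
        if mode.startswith("pre"):
            state["ar"] += step          # modify ARn, then use it
        addr = state["ar"]
        if value is None:
            result = state["mem"][addr]  # read (pop)
        else:
            state["mem"][addr] = value   # write (push)
            result = None
        if mode.startswith("post"):
            state["ar"] += step          # use ARn, then modify it
        return result

    def push(value):
        access(push_mode, value)

    def pop():
        return access(pop_mode)

    return push, pop, state


CASES = {
    1: ("pre-", "post+"),   # push *--ARn, pop *ARn++  (high to low)
    2: ("post-", "pre+"),   # push *ARn--, pop *++ARn  (high to low)
    3: ("pre+", "post-"),   # push *++ARn, pop *ARn--  (low to high)
    4: ("post+", "pre-"),   # push *ARn++, pop *--ARn  (low to high)
}

results = {}
for case, (push_mode, pop_mode) in CASES.items():
    push, pop, state = make_stack(push_mode, pop_mode)
    push("a")
    push("b")
    ar_after_pushes = state["ar"]
    results[case] = (pop(), pop(), ar_after_pushes, state["ar"])
```

In every case the data comes back in last-in, first-out order and ARn returns to its starting value; the cases differ only in the direction of growth and in whether ARn rests on the top element (cases 1 and 3) or on the next free location (cases 2 and 4).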

5.5.3 Queues
A queue is like a FIFO. The implementation of queues is based on the manipu-
lation of auxiliary registers. Two auxiliary registers are used: one to mark the
front of the queue from which data is popped (or dequeued) and the other to
mark the rear of the queue where data is pushed. With proper management
of the auxiliary registers, the queue can also be circular. (A queue is circular
when the rear pointer is allowed to point to the beginning of the queue memory
after it has pointed to the end of the queue memory.)
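
A circular queue managed with two pointers can be sketched as follows. This Python model stands in for the two auxiliary registers; the class name and the element counter used to tell a full queue from an empty one are assumptions made for the illustration.

```python
class CircularQueue:
    """FIFO over a fixed memory region, with two pointers acting as ARs."""

    def __init__(self, size):
        self.mem = [None] * size
        self.front = 0    # next element to dequeue
        self.rear = 0     # next free slot for enqueue
        self.count = 0    # bookkeeping to tell full from empty

    def enqueue(self, value):
        if self.count == len(self.mem):
            raise OverflowError("queue full")
        self.mem[self.rear] = value
        self.rear = (self.rear + 1) % len(self.mem)   # wrap to the start
        self.count += 1

    def dequeue(self):
        if self.count == 0:
            raise IndexError("queue empty")
        value = self.mem[self.front]
        self.front = (self.front + 1) % len(self.mem)
        self.count -= 1
        return value


q = CircularQueue(3)
for item in (1, 2, 3):
    q.enqueue(item)
first = q.dequeue()
q.enqueue(4)          # the rear pointer has wrapped around to slot 0
drained = [q.dequeue() for _ in range(3)]
```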

Chapter 6

Program Flow Control

The TMS320C3x provides a complete set of constructs that facilitate software
and hardware control of the program flow. Software control includes repeats,
branches, calls, traps, and returns. Hardware control includes interlocked
operations, reset, and interrupts. Because programming includes a variety of
constructs, you can select the one suited for your particular application.

Several interlocked-operation instructions provide flexible multiprocessor
support and, through the use of external signals, a powerful means of
synchronization. They also guarantee the integrity of the communication and
result in a high-speed operation.

The TMS320C3x supports a nonmaskable external reset signal and a number
of internal and external interrupts. These functions can be programmed for a
particular application.

This chapter discusses the following major topics:

Topic

6.1 Repeat Modes
6.2 Delayed Branches
6.3 Calls, Traps, and Returns
6.4 Interlocked Operations
6.5 Reset Operation
6.6 Interrupts
6.7 TMS320LC31 Power Management Modes


6.1 Repeat Modes


The repeat modes of the TMS320C3x can implement zero-overhead looping.
For many algorithms, most execution time is spent in an inner kernel of code.
Using the repeat modes allows these time-critical sections of code to be ex-
ecuted in the shortest possible time.

The TMS320C3x provides two instructions to support zero-overhead looping:

- RPTB (repeat a block of code). RPTB repeats execution of a block of code
a specified number of times.

- RPTS (repeat a single instruction). RPTS fetches a single instruction once
and then repeats its execution a number of times. Since the instruction is
fetched only once, bus traffic is minimized.

RPTB and RPTS are four-cycle instructions. These four cycles of overhead
occur during the initial execution of the loop. All subsequent executions of the
loop have no overhead (zero cycles).

Three registers (RS, RE, and RC) control how the program counter (PC) is
updated in a repeat mode. Table 6–1 describes these registers.

Table 6–1. Repeat-Mode Registers

Register   Function

RS         Repeat Start Address Register. Holds the address of the first
           instruction of the block of code to be repeated.

RE         Repeat End Address Register. Holds the address of the last
           instruction of the block of code to be repeated.

RC         Repeat Count Register. Contains one less than the number of times
           the block remains to be repeated. For example, to execute a block
           N times, load N–1 into RC.

For correct operation of the repeat modes, you must correctly initialize all of
the above-mentioned registers.


6.1.1 Repeat-Mode Control Bits

Two bits are important to the operation of RPTB and RPTS:

- RM bit. The repeat-mode flag (RM) bit in the status register specifies
whether the processor is running in the repeat mode.
    - RM = 0 indicates standard instruction fetching mode.
    - RM = 1 indicates repeat-mode instruction fetches.

- S bit. The S bit is internal to the processor and cannot be programmed,
but this bit is necessary to fully describe the operation of RPTB and RPTS.
    - S = 0 indicates standard instruction fetches.
    - S = 1 and RM = 1 indicates repeat-single instruction fetches.

6.1.2 Repeat-Mode Operation

Information in the repeat-mode registers and associated control bits controls
the modification of the PC during repeat-mode fetches. The repeat modes
compare the contents of the RE register (repeat end address register) with the
PC after the execution of each instruction. If they match and the repeat counter
(RC) is nonnegative, the RC is decremented, the PC is loaded with the repeat
start address, and the processing continues. The fetches and appropriate
status bits are modified as necessary. Note that the RC is never modified when
the RM flag is 0.

The repeat counter should be loaded with a value one less than the number
of times to execute the block; for example, an RC value of 4 would execute the
block five times. The detailed algorithm for the update of the PC is shown in
Example 6–1.

Note: Maximum Number of Repeats


The maximum number of repeats occurs when RC = 8000 0000h. This re-
sults in 8000 0001h repetitions. The minimum number of repeats occurs
when RC = 0. This results in one repetition.
RE should be greater than or equal to RS (RE ≥ RS). Otherwise, the code
will not repeat even though the RM bit remains set to 1.
By writing a 0 into the repeat counter or writing 0 into the RM bit of the status
register, you can stop the repeating of the loop before completion.


Example 6–1. Repeat-Mode Control Algorithm


if RM == 1                              ; If in repeat mode (RPTB or RPTS)
   if S == 1                            ; If RPTS
      if first time through             ; If this is the first fetch
         fetch instruction from memory  ; Fetch instruction from memory
      else                              ; If not the first fetch
         fetch instruction from IR      ; Fetch instruction from IR
      RC – 1 → RC                       ; Decrement RC
      if RC < 0                         ; If RC is negative
                                        ; Repeat single mode completed
         0 → ST(RM)                     ; Turn off repeat-mode bit
         0 → S                          ; Clear S
         PC + 1 → PC                    ; Increment PC
   else if S == 0                       ; If RPTB
      fetch instruction from memory     ; Fetch instruction from memory
      if PC == RE                       ; If this is the end of the block
         RC – 1 → RC                    ; Decrement RC
         if RC ≥ 0                      ; If RC is not negative
            RS → PC                     ; Set PC to start of block
         else if RC < 0                 ; If RC is negative
            0 → ST(RM)                  ; Turn off repeat-mode bit
            0 → S                       ; Clear S
            PC + 1 → PC                 ; Increment PC
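
The RPTB branch of this algorithm can be traced with a short Python model. It is a sketch under stated assumptions: the function name `run_rptb` is hypothetical, instructions inside the block are assumed to fetch sequentially (PC + 1) when the PC does not equal RE, and pipeline effects are ignored, so the model does not reproduce the final RC values discussed in Section 6.1.6.

```python
def run_rptb(rs, re, rc, max_steps=1000):
    """Return the list of addresses fetched for a block repeat over rs..re."""
    pc, rm = rs, 1
    trace = []
    while rm == 1 and len(trace) < max_steps:
        trace.append(pc)          # fetch instruction from memory
        if pc == re:              # end of the block reached
            rc -= 1               # decrement RC
            if rc >= 0:
                pc = rs           # loop back to the start of the block
            else:
                rm = 0            # repeat mode finished
                pc += 1           # fall through past the block
        else:
            pc += 1               # assumed: ordinary sequential fetch
    return trace


three_passes = run_rptb(rs=10, re=12, rc=2)
single_pass = run_rptb(rs=5, re=5, rc=0)
```

With RC = 2 the three-instruction block is fetched three times, which matches the rule that the block executes RC + 1 times.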

6.1.3 RPTB Instruction


The RPTB instruction repeats a block of code a specified number of times.

The number of times to repeat the block is the RC (repeat count) register value
plus one. Because the execution of RPTB does not load the RC, you must load
this register yourself. The RC register must be loaded before the RPTB instruc-
tion is executed. A typical setup of the block repeat operation is shown in
Example 6–2.

Example 6–2. RPTB Operation


LDI 15,RC ; Load repeat counter with 15
RPTB ENDLOOP ; Execute the block of code
STLOOP ; from STLOOP to ENDLOOP 16 times
.
.
.
ENDLOOP


Using the repeat-block mode of modifying the PC facilitates analysis of what
would happen in the case of branches within the block. Assume that the next
value of the PC will be either PC + 1 or the contents of the RS register. It is thus
apparent that this method of block repeat allows much branching within the
repeated block. Execution can go anywhere within the user’s code via
interrupts, subroutine calls, etc. For proper modification of the loop counter,
the last instruction of the loop must be fetched. You can stop the repeating of
the loop prior to completion by writing a 0 to the repeat counter or writing a 0
to the RM bit of the status register.

6.1.4 RPTS Instruction


An RPTS src instruction repeats the instruction following the RPTS src + 1
times. Repeats of a single instruction initiated by RPTS are not interruptible,
because the RPTS fetches the instruction word only once and then keeps it
in the instruction register for reuse. An interrupt would cause the instruction
word to be lost. Refetching the instruction word from the instruction register
reduces memory accesses and, in effect, acts as a one-word program cache.
If you need a single instruction that is repeatable and interruptible, you can use
the RPTB instruction.

When RPTS src is executed, the following sequence of operations occurs:


1) PC + 1 → RS
2) PC + 1 → RE
3) 1 → RM status register bit
4) 1 → S bit
5) src → RC (repeat count register)

The RPTS instruction loads all registers and mode bits necessary for the oper-
ation of the single-instruction repeat mode. Step 1 loads the start address of
the block into RS. Step 2 loads the end address into the RE (end address of
the block). Since this is a repeat of a single instruction, the start address and
the end address are the same. Step 3 sets the status register to indicate the
repeat mode of operation. Step 4 indicates that this is the repeat single-instruc-
tion mode of operation. Step 5 loads src into RC.
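
The five steps can be written out as a small Python model. The function names are hypothetical, and the execution count simply applies the RC test from Example 6–1; this is a sketch, not a cycle-accurate description.

```python
def rpts_setup(pc, src):
    """Register and flag values after RPTS src executes at address pc."""
    return {
        "RS": pc + 1,   # step 1: start address is the instruction after RPTS
        "RE": pc + 1,   # step 2: end address is that same instruction
        "RM": 1,        # step 3: repeat mode on
        "S": 1,         # step 4: repeat-single mode
        "RC": src,      # step 5: repeat count
    }


def rpts_executions(src):
    """Count executions of the repeated instruction for a given src."""
    rc, executions = src, 0
    while True:
        executions += 1   # the repeated instruction executes
        rc -= 1           # RC - 1 -> RC
        if rc < 0:        # repeat-single mode completed
            return executions


state = rpts_setup(pc=0x100, src=7)
count = rpts_executions(7)
```

For src = 7 the instruction executes eight times, that is, src + 1.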


6.1.5 Repeat-Mode Restrictions


Since the block repeat modes modify the program counter, other instructions
cannot modify the program counter at the same time. There are two restric-
tions:

- The last instruction in the block (or the only instruction in a block of
size 1) cannot be a Bcond, BR, DBcond, CALL, CALLcond, TRAPcond,
RETIcond, RETScond, IDLE, RPTB, or RPTS. Example 6–3 shows an in-
correctly placed standard branch.

- None of the last four instructions from the bottom of the block (or the only
instruction in a block of size 1) can be a BcondD, BRD, or DBcondD.
Example 6–4 shows an incorrectly placed delayed branch.

Note: Rule Violation


If either of these rules is violated, the PC will be undefined.
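
The two restrictions can be expressed as a simple checker. The Python sketch below assumes a block is given as a list of instruction class names exactly as this section spells them (for example "BR" or "BcondD"); the function name `check_repeat_block` is hypothetical and no real assembler interface is implied.

```python
STANDARD_FLOW = {"Bcond", "BR", "DBcond", "CALL", "CALLcond", "TRAPcond",
                 "RETIcond", "RETScond", "IDLE", "RPTB", "RPTS"}
DELAYED_BRANCHES = {"BcondD", "BRD", "DBcondD"}


def check_repeat_block(block):
    """Return a list of rule violations for a repeat block."""
    problems = []
    # Rule 1: the last instruction may not change the program flow itself.
    if block and block[-1] in STANDARD_FLOW:
        problems.append(f"rule 1: {block[-1]} is the last instruction")
    # Rule 2: no delayed branch among the last four instructions.
    for offset, name in enumerate(block[-4:][::-1]):
        if name in DELAYED_BRANCHES:
            problems.append(f"rule 2: {name} is {offset} from the end")
    return problems


rule1 = check_repeat_block(["LDI", "ADDF", "BR"])
rule2 = check_repeat_block(["BRD", "ADDF", "MPYF", "SUBF"])
legal = check_repeat_block(["BRD", "NOP", "ADDF", "MPYF", "SUBF"])
```

The second call corresponds to the layout of Example 6–4, where the delayed branch sits three instructions above the end of the block.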

Example 6–3. Incorrectly Placed Standard Branch


LDI 15,RC ; Load repeat counter with 15
RPTB ENDLOOP ; Execute the block of code
STLOOP ; from STLOOP to ENDLOOP 16 times
.
.
.
ENDLOOP BR OOPS ; This branch violates rule 1

Example 6–4. Incorrectly Placed Delayed Branch


LDI 15,RC ; Load repeat counter with 15
RPTB ENDLOOP ; Execute block of code
STLOOP ; from STLOOP to ENDLOOP 16 times
.
.
.
BRD OOPS ; This branch violates rule 2
ADDF
MPYF
ENDLOOP SUBF

6.1.6 RC Register Value After Repeat Mode Completes


For the RPTB instruction, the RC register normally decrements to 0000 0000h
unless the block size is 1; in that case, it decrements to FFFF FFFFh. However,
if the RPTB instruction using a block size of 1 has a pipeline conflict in the
instruction being executed, the RC register decrements to 0000 0000h.
Example 6–5 illustrates a pipeline conflict. Refer to Chapter 9 for pipeline in-
formation.


RPTS normally decrements the RC register to FFFF FFFFh. However, if the
RPTS has a pipeline conflict on the last cycle, the RC register decrements to
0000 0000h.

Note: Number of Repetitions


In any case, the number of repetitions is always RC + 1.

Example 6–5. Pipeline Conflict in an RPTB Instruction


EDC .word 40000000h ; The program is located in 4000000Fh
LDP EDC
LDI @EDC,AR0
LDI 15,RC ; Load repeat counter with 15
RPTB ENDLOOP ; Execute block of code
ENDLOOP LDI *AR0,R0 ; The *AR0 read conflicts with
; the instruction fetching
; Then RC decrements to 0
; If cache is enabled, RC decrements
; to FFFF FFFFh

6.1.7 Nested Block Repeats


Block repeats (RPTB) can be nested. Since the registers RS, RE, RC, and ST
control the repeat-mode status, these registers must be saved and restored
in order to nest block repeats. For example, if you write an interrupt service
routine that requires the use of RPTB, it is possible that the interrupt
associated with the routine may occur during repeated execution of a block. The
interrupt service routine can check the RM bit to determine whether the block
repeat mode is active. If the RM bit is set, the interrupt routine should save ST,
RS, RE, and RC, in that order. The interrupt routine can then perform a block
repeat. Before returning to the interrupted routine, the interrupt routine should
restore RC, RE, RS, and ST, in that order. If the RM bit is not set, you don’t need
to save and restore these registers.

The order in which the registers are saved and restored is important to
guarantee correct operation. The ST register should be restored last, after the
RC, RE, and RS registers. ST should be restored after restoring RC, because the
RM bit cannot be set to 1 if the RC register is 0 or –1. For this reason, if you
execute a POP ST instruction (with the RM bit of the popped value equal to 1)
while RC = 0, the POP instruction restores all the ST register bits except the
RM bit, which stays at 0 (repeat mode disabled). Also, RS and RE should be
correctly set before you activate the repeat mode.

The RPTS instruction can be used in a block repeat loop if the proper registers
are saved.
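
Why the restore order matters can be shown with a small Python model of the rule that the RM bit cannot be set while RC is 0 or –1. The dictionary layout, the function names, and the register values are illustrative assumptions only.

```python
def pop_st(cpu, saved_st):
    """Model POP ST: RM is not set if RC is 0 or -1 at that moment."""
    cpu["ST"] = dict(saved_st)
    if saved_st["RM"] == 1 and cpu["RC"] in (0, -1):
        cpu["ST"]["RM"] = 0          # RM stays cleared


def restore(cpu, saved, order):
    """Restore the saved registers in the given order."""
    for name in order:
        if name == "ST":
            pop_st(cpu, saved["ST"])
        else:
            cpu[name] = saved[name]


# Context of the interrupted block repeat (hypothetical values).
saved = {"ST": {"RM": 1}, "RS": 0x100, "RE": 0x110, "RC": 5}


def after_isr():
    """State left by the interrupt routine's own, completed block repeat."""
    return {"ST": {"RM": 0}, "RS": 0x200, "RE": 0x204, "RC": 0}


good = after_isr()
restore(good, saved, ["RC", "RE", "RS", "ST"])   # recommended order

bad = after_isr()
restore(bad, saved, ["ST", "RC", "RE", "RS"])    # ST restored too early
```

With the recommended order the RM bit comes back as 1 and the interrupted block repeat resumes. When ST is restored first, RC is still 0, so RM stays at 0 even though RC, RE, and RS are later restored correctly.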


6.2 Delayed Branches


The TMS320C3x offers three main types of branching: standard, delayed, and
conditional delayed.

Standard branches empty the pipeline before performing the branch; this
guarantees correct management of the program counter and results in a
TMS320C3x branch taking four cycles. Included in this class are repeats,
calls, returns, and traps.

Delayed branches on the TMS320C3x do not empty the pipeline, but rather
guarantee that the next three instructions will execute before the program
counter is modified by the branch. The result is a branch that requires only a
single cycle, thus making the speed of the delayed branch very close to that
of the optimal block repeat modes of the TMS320C3x. However, unlike block
repeat modes, delayed branches may be used in situations other than looping.
Every delayed branch has a standard branch counterpart that is used when
a delayed branch cannot be used. The delayed branches of the TMS320C3x
are BcondD, BRD, and DBcondD.

Conditional delayed branches use the conditions that exist at the end of the
instruction immediately preceding the delayed branch. They do not depend on
the instructions following the delayed branch. The condition flags are set by
a previous instruction only when the destination register is one of the exten-
ded-precision registers (R0–R7) or when one of the compare instructions
(CMPF, CMPF3, CMPI, CMPI3, TSTB, or TSTB3) is executed. Delayed
branches guarantee that the next three instructions will execute, regardless
of other pipeline conflicts.

When a delayed branch is fetched, it remains pending until the three subse-
quent instructions are executed. None of the three instructions that follow a
delayed branch can be any of the following (see Example 6–6):

    Bcond       DBcondD
    BcondD      IDLE
    BR          RETIcond
    BRD         RETScond
    CALL        RPTB
    CALLcond    RPTS
    DBcond      TRAPcond

Delayed branches disable interrupts until the three instructions following the
delayed branch are completed. This is independent of whether the branch is
taken.
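
The difference between a standard and a delayed branch can be illustrated with a toy interpreter in Python. The program encoding, the function name `run`, and the instruction labels are hypothetical; the model only captures the ordering rule (three further instructions execute before the PC changes), not timing or interrupts.

```python
def run(program, max_steps=50):
    """Execute a toy program of ('op', name), ('b', target), ('bd', target)."""
    pc, executed = 0, []
    pending = None                      # (countdown, target) of a delayed branch
    while pc < len(program) and len(executed) < max_steps:
        kind, arg = program[pc]
        executed.append(arg if kind == "op" else f"{kind}->{arg}")
        next_pc = pc + 1
        if kind == "b":
            next_pc = arg               # standard branch: redirect at once
        elif kind == "bd":
            pending = (3, arg)          # delayed: three more instructions run
        elif pending is not None:
            countdown, target = pending
            countdown -= 1
            if countdown == 0:
                next_pc, pending = target, None
            else:
                pending = (countdown, target)
        pc = next_pc
    return executed


delayed = run([
    ("bd", 6),           # 0: delayed branch to address 6
    ("op", "i1"),        # 1: still executes
    ("op", "i2"),        # 2: still executes
    ("op", "i3"),        # 3: still executes, then the PC changes
    ("op", "skipped"),   # 4
    ("op", "skipped"),   # 5
    ("op", "target"),    # 6
])
standard = run([("b", 2), ("op", "skipped"), ("op", "target")])
```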


Note: Incorrect Use of Delayed Branches


If delayed branches are used incorrectly, the PC will be undefined.

Example 6–6. Incorrectly Placed Delayed Branches


B1: BD L1
NOP
NOP
B2: B L2 ; This branch is incorrectly placed.
NOP
NOP
NOP
.
.
.


6.3 Calls, Traps, and Returns


Calls and traps provide a means of executing a subroutine or function while
providing a return to the calling routine.

The CALL, CALLcond, and TRAPcond instructions store the value of the PC
on the stack before changing the PC’s contents. The stack thus provides a re-
turn using either the RETScond or RETIcond instruction.

- The CALL instruction places the next PC value on the stack and places
the src (source) operand into the PC. The src is a 24-bit immediate value.
Figure 6–1 shows CALL response timing.

- The CALLcond instruction is similar to the CALL instruction (above)
except for the following:
    - It executes only if a specific condition is true (the 20 conditions,
      including unconditional, are listed in Table 10–9).
    - The src is either a PC-relative displacement or is in
      register-addressing mode.
The condition flags are set by a previous instruction only when the destina-
tion register is one of the extended-precision registers (R0–R7) or when
one of the compare instructions (CMPF, CMPF3, CMPI, CMPI3, TSTB, or
TSTB3) is executed.

- The TRAPcond instruction also executes only if a specific condition is true
(same conditions as for the CALLcond instruction). When executing, the
following actions occur:
    1) Interrupts are disabled with 0 written to bit GIE of the ST.
    2) The next PC value is stored on the stack.
    3) A vector is retrieved from one of the addresses 20h to 3Fh and is
       loaded into the PC.
The particular address is identified by a trap number in the instruction.
Using the RETIcond to return re-enables interrupts.

- RETScond returns execution from any of the above three instructions by
popping the top of the stack to the PC. To execute, the specified condition
must be true. Conditions are the same as for the CALLcond instruction.

- RETIcond returns from traps or calls like the RETScond (above) with the
addition that RETIcond also sets the GIE bit of the status register, which
enables all interrupts whose enabling bit is set to 1. Conditions are the
same as for the CALLcond instruction.
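
The effect of these instructions on the PC, the stack, and the GIE bit can be sketched in Python. The class `Cpu` is a hypothetical model: conditions are ignored (every instruction is treated as unconditional), and mapping trap number N to vector address 20h + N is an assumption made for the illustration.

```python
class Cpu:
    """Minimal model of call, trap, and return behaviour."""

    def __init__(self):
        self.pc = 0
        self.gie = 1          # global interrupt enable bit of ST
        self.stack = []       # system stack (top is the end of the list)
        self.vectors = {}     # trap vector table: address -> handler PC

    def call(self, target):
        self.stack.append(self.pc + 1)        # next PC value goes on the stack
        self.pc = target

    def trap(self, number):
        self.gie = 0                           # 1) disable interrupts
        self.stack.append(self.pc + 1)         # 2) store the next PC value
        self.pc = self.vectors[0x20 + number]  # 3) load the vector into the PC

    def rets(self):
        self.pc = self.stack.pop()             # pop the return address

    def reti(self):
        self.pc = self.stack.pop()
        self.gie = 1                           # also re-enable interrupts


cpu = Cpu()
cpu.vectors[0x20 + 3] = 0x500   # hypothetical handler for trap 3
cpu.pc = 0x10
cpu.call(0x200)                 # subroutine at 200h
cpu.trap(3)                     # trap taken inside the subroutine
gie_in_trap = cpu.gie
cpu.reti()                      # back in the subroutine, interrupts enabled
pc_after_reti = cpu.pc
cpu.rets()                      # back in the original caller
```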


Calls and traps accomplish the same functional task (that is, a subfunction is
called and executed, and control is then returned to the calling function). Traps
offer several advantages. Among them are the following:

- Interrupts are automatically disabled when a trap is executed. This allows
critical code to execute without risk of being interrupted. Thus, traps are
generally terminated with a RETIcond instruction to re-enable interrupts.

- You can use traps to indirectly call functions. This is particularly beneficial
when a kernel of code contains the basic subfunctions to be used by appli-
cations. In this case, the functions in the kernel can be modified and relo-
cated without the need to recompile each application.

Figure 6–1. CALL Response Timing

[Timing diagram. Pipeline stages shown in sequence: Fetch CALL, Decode CALL,
Read CALL, Execute CALL (store PC on stack), Fetch first subroutine
instruction. Traces shown: H3, H1, ADDR, and Data.]


6.4 Interlocked Operations

Among the most common multiprocessing configurations is the sharing of
global memory by multiple processors. In order for multiple processors to
access this global memory and share data in a coherent manner, some sort of
arbitration or handshaking is necessary. This requirement for arbitration is the
purpose of the TMS320C3x interlocked operations.

The TMS320C3x provides a flexible means of multiprocessor support with five
instructions, referred to as interlocked operations. Through the use of external
signals, these instructions provide powerful synchronization mechanisms.
They also guarantee the integrity of the communication and result in a
high-speed operation. The interlocked-operation instruction group is listed in
Table 6–2.

Table 6–2. Interlocked Operations

Mnemonic   Description                                  Operation

LDFI       Load floating-point value into a register,   Signal interlocked
           interlocked                                  src → dst

LDII       Load integer into a register, interlocked    Signal interlocked
                                                        src → dst

SIGI       Signal, interlocked                          Signal interlocked
                                                        Clear interlock

STFI       Store floating-point value to memory,        src → dst
           interlocked                                  Clear interlock

STII       Store integer to memory, interlocked         src → dst
                                                        Clear interlock

The interlocked operations use the two external flag pins, XF0 and XF1. XF0
must be configured as an output pin; XF1 is an input pin. When configured in
this manner, XF0 signals an interlock operation request, and XF1 acts as an
acknowledge signal for the requested interlocked operation. In this mode, XF0
and XF1 are treated as active-low signals.

The external timing for the interlocked loads and stores is the same as for stan-
dard loads and stores. The interlocked loads and stores may be extended like
standard accesses by using the appropriate ready signal (RDYint or XRDYint).
(RDYint and XRDYint are a combination of external ready input and software
wait states. Refer to Chapter 7, External Bus Operation, for more information
on ready generation.)


The LDFI and LDII instructions perform the following actions:


1) Simultaneously set XF0 to 0 and begin a read cycle. The timing of XF0 is
similar to that of the address bus during a read cycle.
2) Execute an LDF or LDI instruction and extend the read cycle until XF1 is
set to 0 and a ready (RDYint or XRDYint) is signaled.
3) Leave XF0 set to 0 and end the read cycle.

The read/write operation is identical to any other read/write cycle except for
the special use of XF0 and XF1. The src operand for LDFI and LDII is always
a direct or indirect memory address. XF0 is set to 0 only if the src is located
off-chip; that is, STRB, MSTRB, or IOSTRB is active, or the src is one of the
on-chip peripherals. If on-chip memory is accessed, then XF0 is not asserted,
and the operation is as an LDF or LDI from internal memory.

The STFI and STII instructions perform the following operations:


1) Simultaneously set XF0 to 1 and begin a write cycle. The timing of XF0 is
similar to that of the address bus during a write cycle.
2) Execute an STF or STI instruction and extend the write cycle until a ready
(RDYint or XRDYint) is signaled.

As in the case for LDFI and LDII, the dst of STFI and STII affects XF0. If dst
is located off-chip (STRB, MSTRB, or IOSTRB is active) or the dst is one of
the on-chip peripherals, XF0 is set to 1. If on-chip memory is accessed, then
XF0 is not asserted and the operations are as an STF or STI to internal
memory.

The SIGI instruction functions as follows:


1) Sets XF0 to 0.
2) Idles until XF1 is set to 0.
3) Sets XF0 to 1 and ends the operation.

While the LDFI, LDII, and SIGI instructions are waiting for XF1 to be set to 0,
you can interrupt them. LDFI and LDII require a ready signal (RDYint or
XRDYint) in order to be interrupted. Because interrupts are taken on bus cycle
boundaries (see Section 6.6), an interrupt may be taken any time after a valid
ready. This allows you to implement protection mechanisms against deadlock
conditions by interrupting an interlocked load that has taken too long. Upon re-
turn from the interrupt, the next instruction is executed. The STFI and STII
instructions are not interruptible. Since the STFI and STII instructions com-
plete when ready is signaled, the delay until an interrupt can occur is the same
as for any other instruction.


Interlocked operations can be used to implement a busy-waiting loop, to
manipulate a multiprocessor counter, to implement a simple semaphore
mechanism, or to perform synchronization between two TMS320C3xs. The
following examples illustrate the usefulness of the interlocked-operation
instructions.

Example 6–7 shows the implementation of a busy-waiting loop. If location
LOCK is the interlock for a critical section of code, and a nonzero value means
the lock is busy, the algorithm for a busy-waiting loop can be used as shown.

Example 6–7. Busy-Waiting Loop

LDI 1,R0 ; Put 1 into R0
L1: LDII @LOCK,R1 ; Interlocked operation begun
; Contents of LOCK → R1
STII R0,@LOCK ; Put R0 (= 1) into LOCK, XF0 = 1
; Interlocked operation ended
BNZ L1 ; Keep trying until LOCK = 0
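
The logic of the busy-waiting loop can be followed in a Python model of two processors sharing one lock word. This is a sketch under assumptions: the class `SharedMemory` stands in for the global memory and its arbitration logic, a refused interlock is represented by returning None, and releasing the lock with a plain store of 0 is an assumption that the example itself does not show.

```python
class SharedMemory:
    """Global memory whose arbiter grants one interlocked owner at a time."""

    def __init__(self):
        self.words = {"LOCK": 0}
        self.owner = None              # processor holding the interlock

    def ldii(self, cpu, name):
        """Interlocked load; None means another processor holds the interlock."""
        if self.owner not in (None, cpu):
            return None                # no acknowledge yet: keep waiting
        self.owner = cpu               # interlock requested and granted
        return self.words[name]

    def stii(self, cpu, name, value):
        """Interlocked store: write the value and clear the interlock."""
        assert self.owner == cpu
        self.words[name] = value
        self.owner = None


def try_acquire(mem, cpu):
    """One pass of the busy-waiting loop; True when the lock was obtained."""
    old = mem.ldii(cpu, "LOCK")
    if old is None:
        return False                   # still waiting for the interlock
    mem.stii(cpu, "LOCK", 1)           # always write 1, as in the example
    return old == 0                    # the lock was free before the write


mem = SharedMemory()
first = try_acquire(mem, "cpu1")       # lock was free
second = try_acquire(mem, "cpu2")      # lock is busy, cpu2 must loop
mem.words["LOCK"] = 0                  # assumed release by cpu1
third = try_acquire(mem, "cpu2")       # cpu2 now gets the lock

# Between an interlocked load and store, other processors are held off.
mem.ldii("cpu2", "LOCK")
blocked = mem.ldii("cpu1", "LOCK")
mem.stii("cpu2", "LOCK", 1)
```

Because the load and the store form one interlocked sequence, no other processor can read LOCK between the test and the set, which is what makes the loop safe.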

Example 6–8 shows how a location COUNT may contain a count of the num-
ber of times a particular operation needs to be performed. This operation may
be performed by any processor in the system. If the count is 0, the processor
waits until it is nonzero before beginning processing. The example also shows
the algorithm for modifying COUNT correctly.

Example 6–8. Multiprocessor Counter Manipulation

CT: OR 4,IOF ; XF0 = 1
; Interlocked operation ended
LDII @COUNT,R1 ; Interlocked operation begun
; Contents of COUNT → R1
BZ CT ; If COUNT = 0, keep trying
SUBI 1,R1 ; Decrement R1 (= COUNT)
STII R1,@COUNT ; Update COUNT, XF0 = 1
; Interlocked operation ended

Figure 6–2 illustrates multiple TMS320C3xs sharing global memory and using
the interlocked instructions as in Example 6–9, Example 6–10, and
Example 6–11.


Figure 6–2. Multiple TMS320C3xs Sharing Global Memory

[Block diagram. Labels: Global Memory (holding Lock, Count, or S) with ADDR,
CTRL, and DATA connections; Arbitration Logic; TMS320C3x #1 and
TMS320C3x #2, each with XF0, XF1, (X)A, (X)D, CTRL, and Local Memory.]

It might sometimes be necessary for several processors to access some
shared data or other common resources. The portion of code that must access
the shared data is called a critical section.
To ease the programming of critical sections, semaphores may be used.
Semaphores are variables that can take only non-negative integer values.
Two primitive, indivisible operations are defined on semaphores (with S being
a semaphore):
V(S): S + 1 → S
P(S): P: if (S == 0), go to P
else S – 1 → S

Indivisibility of V(S) and P(S) means that when these processes access and
modify the semaphore S, they are the only processes accessing and modify-
ing S.
To enter a critical section, a P operation is performed on a common sema-
phore, say S (S is initialized to 1). The first processor performing P(S) will be
able to enter its critical section. All other processors are blocked because S
has become 0. After leaving its critical section, the processor performs a V(S),
thus allowing another processor to execute P(S) successfully.


The TMS320C3x code for V(S) is shown in Example 6–9; code for P(S) is
shown in Example 6–10. Compare the code in Example 6–10 to the code in
Example 6–8.

Example 6–9. Implementation of V(S)


V: LDII @S,R0 ; Interlocked read of S begins (XF0 = 0)
; Contents of S → R0
ADDI 1,R0 ; Increment R0 (= S)
STII R0,@S ; Update S, end interlock (XF0 = 1)

Example 6–10. Implementation of P(S)


P: OR 4,IOF ; End interlock (XF0 = 1)
NOP ; Avoid potential pipeline conflicts when
; executing out of cache, on-chip memory
; or zero wait-state memory
LDII @S,R0 ; Interlocked read of S begins
; Contents of S → R0
BZ P ; If S = 0, go to P and try again
SUBI 1,R0 ; Decrement R0 (= S)
STII R0,@S ; Update S, end interlock (XF0 = 1)
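
The two semaphore routines can be mirrored step by step in Python. The sketch assumes a hypothetical class `Interlocked` that models one shared word plus its interlock; a real system would stall a second processor rather than raise an error, so the error here only makes conflicting accesses visible.

```python
class Interlocked:
    """A shared word guarded by an interlock, as LDII and STII would see it."""

    def __init__(self, value):
        self.value = value
        self.owner = None

    def ldii(self, cpu):
        if self.owner not in (None, cpu):
            raise RuntimeError("interlock held by another processor")
        self.owner = cpu               # interlock begins
        return self.value

    def stii(self, cpu, value):
        assert self.owner == cpu
        self.value = value
        self.owner = None              # interlock ends

    def end_interlock(self, cpu):
        """Counterpart of OR 4,IOF (XF0 = 1) at the top of Example 6-10."""
        if self.owner == cpu:
            self.owner = None


def v(sem, cpu):
    sem.stii(cpu, sem.ldii(cpu) + 1)   # S + 1 -> S


def p_once(sem, cpu):
    """One pass of P(S); True on success, False if S was 0."""
    sem.end_interlock(cpu)
    s = sem.ldii(cpu)
    if s == 0:
        return False                   # caller branches back to P
    sem.stii(cpu, s - 1)               # S - 1 -> S
    return True


sem = Interlocked(1)                   # S is initialized to 1
entered_1 = p_once(sem, "cpu1")        # cpu1 enters its critical section
entered_2 = p_once(sem, "cpu2")        # S is 0, so cpu2 must retry
sem.end_interlock("cpu2")              # start of cpu2's next pass through P
v(sem, "cpu1")                         # cpu1 leaves its critical section
retried_2 = p_once(sem, "cpu2")        # cpu2 now succeeds
```

The explicit `end_interlock` call shows why Example 6–10 begins with OR 4,IOF: when the BZ branch is taken, no STII has run, so the interlock from the previous LDII must be ended before another processor can update S.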

The SIGI operation can synchronize, at an instruction level, multiple
TMS320C3xs. Consider two processors connected as shown in Figure 6–3.
The code for the two processors is shown in Example 6–11.

Figure 6–3. Zero-Logic Interconnect of TMS320C3xs


TMS320C3x #1 TMS320C3x #2
XF0 XF1
XF1 XF0

Processor #1 runs until it executes the SIGI. It then waits until processor #2
executes a SIGI. At this point, the two processors have synchronized and con-
tinue execution.


Example 6–11. Code to Synchronize Two TMS320C3xs at the Software Level

Time                      Code for TMS320C3x #1    Code for TMS320C3x #2

                          SIGI
                          (WAIT)
Synchronization Occurs                             SIGI


6.5 Reset Operation


The TMS320C3x supports a nonmaskable external reset signal (RESET),
which is used to perform system reset. This section discusses the reset opera-
tion.

At powerup, the state of the TMS320C3x processor is undefined. You can use
the RESET signal to place the processor in a known state. This signal must
be asserted low for ten or more H1 clock cycles to guarantee a system reset.
H1 is an output clock signal generated by the TMS320C3x (see Chapter 13
for more information).

Reset affects the other pins on the device in either a synchronous or asynchro-
nous manner. The synchronous reset is gated by the TMS320C3x’s internal
clocks. The asynchronous reset directly affects the pins and is faster than the
synchronous reset. Table 6–3 shows the state of the TMS320C3x’s pins after
RESET = 0. Each pin is described according to whether the pin is reset syn-
chronously or asynchronously.


Table 6–3. Pin Operation at Reset


Signal # Pins Operation at Reset
Primary Interface (61 Pins)

D31 – D0 32 Synchronous reset; placed in high-impedance state

A23 – A0 24 Synchronous reset; placed in high-impedance state

R/W 1 Synchronous reset; deasserted by going to a high level

STRB 1 Synchronous reset; deasserted by going to a high level

RDY 1 Reset has no effect.

HOLD 1 Reset has no effect.

HOLDA 1 Reset has no effect.

Expansion Interface (49 Pins)†

XD31 – XD0 32 Synchronous reset; placed in high-impedance state

XA12 – XA0 13 Synchronous reset; placed in high-impedance state

XR/W 1 Synchronous reset; placed in high-impedance state

MSTRB 1 Synchronous reset; deasserted by going to a high level

IOSTRB 1 Synchronous reset; deasserted by going to a high level

XRDY 1 Reset has no effect.

Control Signals (9 Pins)

RESET 1 Reset input pin

INT3 – INT0 4 Reset has no effect.

IACK 1 Synchronous reset; deasserted by going to a high level

MC/MP or MCBL/MP 1 Reset has no effect.

XF1–XF0 2 Asynchronous reset; placed in high-impedance state


† Present only on TMS320C30



Serial Port 0 Signals (6 Pins)

CLKX0 1 Asynchronous reset; placed in high-impedance state

DX0 1 Asynchronous reset; placed in high-impedance state

FSX0 1 Asynchronous reset; placed in high-impedance state

CLKR0 1 Asynchronous reset; placed in high-impedance state

DR0 1 Asynchronous reset; placed in high-impedance state

FSR0 1 Asynchronous reset; placed in high-impedance state

Serial Port 1 Signals (6 Pins) †

CLKX1 1 Asynchronous reset; placed in high-impedance state

DX1 1 Asynchronous reset; placed in high-impedance state

FSX1 1 Asynchronous reset; placed in high-impedance state

CLKR1 1 Asynchronous reset; placed in high-impedance state

DR1 1 Asynchronous reset; placed in high-impedance state

FSR1 1 Asynchronous reset; placed in high-impedance state

Timer 0 Signal (1 Pin)

TCLK0 1 Asynchronous reset; placed in high-impedance state

Timer 1 Signal (1 Pin)

TCLK1 1 Asynchronous reset; placed in high-impedance state

Supply and Oscillator Signals (29 Pins)

VDD (3 – 0) 4 Reset has no effect.

IODVDD (1,0) 2 Reset has no effect.

ADVDD (1,0) 2 Reset has no effect.

PDVDD 1 Reset has no effect.

DDVDD (1,0) 2 Reset has no effect.

MDVDD 1 Reset has no effect.

VSS (3 – 0) 4 Reset has no effect.


DVSS (3 – 0) 4 Reset has no effect.
CVSS (1,0) 2 Reset has no effect.
IVSS 1 Reset has no effect.
VBBP 1 Reset has no effect.
SUBS 1 Reset has no effect.
X1 1 Reset has no effect.
X2/CLKIN 1 Reset has no effect.
H1 1 Synchronous reset. Will go to its initial state when RESET makes a 1 to 0
transition. See Chapter 13.
H3 1 Synchronous reset. Will go to its initial state when RESET makes a 1 to 0
transition. See Chapter 13.

Emulation, Test, and Reserved (18 Pins)


EMU0 1 Undefined
EMU1 1 Undefined
EMU2 1 Undefined
EMU3 1 Undefined
EMU4/SHZ 1 Undefined
EMU5† 1 Undefined
EMU6† 1 Undefined
RSV0† 1 Undefined
RSV1† 1 Undefined
RSV2† 1 Undefined
RSV3† 1 Undefined
RSV4† 1 Undefined
RSV5† 1 Undefined
RSV6† 1 Undefined
RSV7† 1 Undefined
RSV8† 1 Undefined
RSV9† 1 Undefined
RSV10† 1 Undefined
† Present only on TMS320C30


At system reset, the following additional operations are performed:

- The peripherals are reset. This is a synchronous operation. The peripheral
reset is described in Chapter 8.

- The external bus control registers are reset. The reset values of the control
registers are described in Chapter 7.

- The following CPU registers are loaded with 0:

    - ST (CPU status register)
    - IE (CPU/DMA interrupt enable flags)
    - IF (CPU interrupt flags)
    - IOF (I/O flags)

- The reset vector is read from memory location 0h and loaded into the PC.
This vector contains the start address of the system reset routine.

- Execution begins. Refer to Example 11–1 on page 11-3 for an illustration
of a processor initialization routine.

Multiple TMS320C3xs driven by the same system clock may be reset and syn-
chronized. When the 1 to 0 transition of RESET occurs, the processor is placed
on a well-defined internal phase, and all of the TMS320C3xs will come up on
the same internal phase.

Unless otherwise specified, all registers are undefined after reset.
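
The register and program-counter effects of reset listed above can be summarized in a small behavioral model. This is an illustrative Python sketch rather than TI code; the dictionary layout, the function name apply_reset, and the sample addresses are assumptions made for the example.

```python
# Behavioral sketch of the CPU-visible effects of reset described above.
# The state layout and all names are illustrative assumptions, not a TI API.

RESET_VECTOR_ADDRESS = 0x0  # the reset vector is read from memory location 0h


def apply_reset(cpu, memory):
    """Return a new CPU state reflecting the documented reset effects."""
    state = dict(cpu)
    # ST, IE, IF, and IOF are loaded with 0.
    for reg in ("ST", "IE", "IF", "IOF"):
        state[reg] = 0
    # The PC is loaded with the reset vector, which holds the start address
    # of the system reset routine.
    state["PC"] = memory[RESET_VECTOR_ADDRESS]
    # All other registers are undefined after reset; this model simply
    # leaves them untouched.
    return state


if __name__ == "__main__":
    memory = {0x0: 0x000040}  # hypothetical reset routine address
    before = {"ST": 0x2000, "IE": 0x1, "IF": 0x3, "IOF": 0x6, "PC": 0x1234}
    after = apply_reset(before, memory)
    print({name: hex(value) for name, value in after.items()})
```

Because the other registers are undefined after reset, an initialization routine should not rely on any value that this model happens to preserve.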


6.6 Interrupts
The TMS320C3x supports multiple internal and external interrupts, which can
be used for a variety of applications. This section discusses the operation of
these interrupts.

A functional diagram of the logic used to implement the external interrupt
inputs is shown in Figure 6–4; the logic for internal interrupts is similar.
Additional information regarding internal interrupts can be found in Chapter 8.

Figure 6–4. Interrupt Logic Functional Diagram

[Diagram not reproduced. Labels shown: INTn input; three D flip-flops
clocked by H1, H3, and H1; interrupt flag (n) with Set input; internal
interrupt set signal; internal interrupt clear/acknowledge signal; RESET;
EINTn(CPU) and GIE(CPU); EINTn(DMA) and GIE(DMA); output to the internal
interrupt control processor section.]

External interrupts are synchronized internally, as illustrated by the three
flip-flops clocked by H1 and H3. Once synchronized, the interrupt input will set the
corresponding interrupt flag register (IF) bit if the interrupt is active.

External interrupts are latched internally on the falling edge of H1 (see Chapter
13 for timing information). An external interrupt must be held low for at least
one H1/H3 cycle to be recognized by the TMS320C3x. Interrupts should be
held low for only one or two H1 falling edges. If the interrupt is held low for three
or more H1 falling edges, multiple interrupts may be recognized.
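
As a rough aid for checking an external interrupt pulse against the rule above, the following Python sketch counts how many H1 falling edges occur while INTn is low. It is only an approximation under stated assumptions: falling edges are placed at integer multiples of the H1 period, times are integer nanoseconds, and setup and hold requirements (see Chapter 13) are ignored. The function names are hypothetical.

```python
# Sketch: count the H1 falling edges that occur while an external interrupt
# pulse is low. Assumption: H1 falling edges occur at t = 0, T, 2T, ...
# where T is the H1 period; all times are integers in nanoseconds.


def h1_falling_edges_seen(pulse_start_ns, pulse_width_ns, h1_period_ns):
    """Number of H1 falling edges in [pulse_start, pulse_start + width)."""
    end = pulse_start_ns + pulse_width_ns
    first = -(-pulse_start_ns // h1_period_ns)  # ceil(start / T)
    last = -(-end // h1_period_ns) - 1          # last edge strictly before end
    return max(0, last - first + 1)


def classify_pulse(edges):
    """Map an edge count to the behavior described in the text."""
    if edges == 0:
        return "may be missed"
    if edges <= 2:
        return "recognized once"
    return "may be recognized more than once"


if __name__ == "__main__":
    period = 60  # hypothetical H1 period in ns
    for start, width in ((10, 30), (10, 60), (10, 120), (0, 180)):
        edges = h1_falling_edges_seen(start, width, period)
        print(start, width, edges, classify_pulse(edges))
```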

6.6.1 Interrupt Vector Table


Table 6–4 and Table 6–5 contain the interrupt vectors. In the microprocessor
mode of the TMS320C30 and the TMS320C31 (Table 6–4) and the microcom-
puter mode of the TMS320C31 (Table 6–5), the interrupt vectors contain the
addresses of interrupt service routines that should start executing when an in-
terrupt occurs. On the other hand, in the microcomputer/boot loader mode of
the TMS320C31, the interrupt vector contains a branch instruction to the start
of the interrupt service routine.


Table 6–4. Reset, Interrupt, and Trap-Vector Locations for the TMS320C30/TMS320C31
Microprocessor Mode

Address Routine
00h RESET

01h INT0

02h INT1

03h INT2

04h INT3

05h XINT0

06h RINT0

07h XINT1†

08h RINT1†

09h TINT0

0Ah TINT1

0Bh DINT

0Ch–1Fh Reserved

20h TRAP 0

3Bh TRAP 27

3Ch TRAP 28 (Reserved)

3Dh TRAP 29 (Reserved)

3Eh TRAP 30 (Reserved)

3Fh TRAP 31 (Reserved)


† Reserved on TMS320C31


Table 6–5. Reset, Interrupt, and Trap-Vector Locations for the TMS320C31 Microcomputer
Boot Mode

Address Description
809FC1 INT0

809FC2 INT1

809FC3 INT2

809FC4 INT3

809FC5 XINT0

809FC6 RINT0

809FC7 Reserved

809FC8 Reserved

809FC9 TINT0

809FCA TINT1

809FCB DINT0

809FCC–809FDF Reserved

809FE0 TRAP0

809FE1 TRAP1

• •

• •

• •

809FFB TRAP27

809FFC–809FFF Reserved
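
The two vector tables share the same layout relative to their base address, which the following Python sketch makes explicit. It only restates the addresses listed in Table 6–4 and Table 6–5; the function names and the c31_boot_mode parameter are assumptions made for the example.

```python
# Sketch: compute vector locations from Tables 6-4 and 6-5.
# Function names and parameters are illustrative assumptions.

MICROPROCESSOR_BASE = 0x000000  # Table 6-4: RESET at 00h
C31_BOOT_BASE = 0x809FC0        # Table 6-5: INT0 at 809FC1h

# Offsets from the table base, common to both tables.
VECTOR_OFFSETS = {
    "RESET": 0x00, "INT0": 0x01, "INT1": 0x02, "INT2": 0x03, "INT3": 0x04,
    "XINT0": 0x05, "RINT0": 0x06, "XINT1": 0x07, "RINT1": 0x08,
    "TINT0": 0x09, "TINT1": 0x0A, "DINT": 0x0B,
}
TRAP_OFFSET = 0x20              # TRAP 0 at 20h or 809FE0h


def vector_location(name, c31_boot_mode=False):
    """Location of the reset or interrupt vector called `name`."""
    if c31_boot_mode and name in ("RESET", "XINT1", "RINT1"):
        # Table 6-5 lists no RESET entry, and 809FC7/809FC8 are reserved.
        raise ValueError(f"{name} has no entry in Table 6-5")
    base = C31_BOOT_BASE if c31_boot_mode else MICROPROCESSOR_BASE
    return base + VECTOR_OFFSETS[name]


def trap_location(n, c31_boot_mode=False):
    """Location of the TRAP n vector (TRAP 28 to TRAP 31 are reserved)."""
    if not 0 <= n <= 27:
        raise ValueError("only TRAP 0 to TRAP 27 are available")
    base = C31_BOOT_BASE if c31_boot_mode else MICROPROCESSOR_BASE
    return base + TRAP_OFFSET + n


if __name__ == "__main__":
    print(hex(vector_location("TINT0")))
    print(hex(vector_location("TINT0", c31_boot_mode=True)))
    print(hex(trap_location(27, c31_boot_mode=True)))
```

Keep in mind that the contents differ even though the layout matches: per the text above, a vector holds either the address of the service routine or a branch instruction to it, depending on the mode.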

6.6.2 Interrupt Prioritization

When two interrupts occur in the same clock cycle or when two previously
received interrupts are waiting to be serviced, one interrupt will be serviced
before the other. The CPU controls all prioritization of interrupts and
services first the interrupt with the lowest priority number; that is,
priority 0 (RESET) is the highest priority. Table 6–6 shows the vector
locations and priorities assigned to the reset and interrupt vectors.


Table 6–6. Reset and Interrupt Vector Priorities


Reset or Vector
Interrupt Location Priority Function
RESET 0h 0 External reset signal input on the RESET pin

INT0 1h 1 External interrupt on the INT0 pin

INT1 2h 2 External interrupt on the INT1 pin

INT2 3h 3 External interrupt on the INT2 pin

INT3 4h 4 External interrupt on the INT3 pin

XINT0 5h 5 Internal interrupt generated when serial-port 0 transmit buffer is empty

RINT0 6h 6 Internal interrupt generated when serial-port 0 receive buffer is full

XINT1† 7h 7 Internal interrupt generated when serial-port 1 transmit buffer is empty

RINT1† 8h 8 Internal interrupt generated when serial-port 1 receive buffer is full

TINT0 9h 9 Internal interrupt generated by timer 0

TINT1 0Ah 10 Internal interrupt generated by timer 1

DINT 0Bh 11 Internal interrupt generated by DMA controller 0


† Reserved on TMS320C31
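
The selection rule implied by Table 6–6 can be sketched in a few lines of Python. The priority numbers are copied from the table; the function name next_to_service and the representation of pending interrupts as a collection of names are assumptions made for illustration.

```python
# Sketch: pick the next source to service from a set of pending sources,
# using the priority numbers of Table 6-6 (lower number = serviced first).
# The data representation and function name are illustrative assumptions.

PRIORITY = {
    "RESET": 0, "INT0": 1, "INT1": 2, "INT2": 3, "INT3": 4,
    "XINT0": 5, "RINT0": 6, "XINT1": 7, "RINT1": 8,
    "TINT0": 9, "TINT1": 10, "DINT": 11,
}


def next_to_service(pending):
    """Return the pending source with the lowest priority number, or None."""
    return min(pending, key=PRIORITY.__getitem__, default=None)


if __name__ == "__main__":
    print(next_to_service({"TINT0", "INT2", "DINT"}))
    print(next_to_service(set()))
```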

6.6.3 Interrupt Control Bits

Four CPU registers contain bits used to control interrupt operation:

- Status Register (ST)

The CPU global interrupt enable bit (GIE) located in the CPU status regis-
ter (ST) controls all maskable CPU interrupts. When this bit is set to 1, the
CPU responds to an enabled interrupt. When this bit is cleared to 0, all
CPU interrupts are disabled. Refer to subsection 3.1.7 on page 3-4 for
more information.

- CPU/DMA Interrupt Enable Register (IE)


This register individually enables/disables CPU and DMA (external, serial
port, and timer) interrupts. Refer to subsection 3.1.8 on page 3-7 for more
information.

- CPU Interrupt Flag Register (IF)


This register contains interrupt flag bits that indicate the corresponding in-
terrupt is set. Refer to subsection 3.1.9 on page 3-9 for more information.


- DMA Global Control Register

Interrupts to the DMA are controlled by the synchronization bits of the
DMA global control register. DMA interrupts are independent of the ST
(GIE) bit.

Interrupt Flag Register Behavior

When an external interrupt occurs, the corresponding bit of the IF register is
set to 1. When the CPU or DMA controller processes this interrupt, the
corresponding interrupt flag bit is cleared by the internal interrupt
acknowledge signal. However, if INTn is still low when the interrupt
acknowledge signal occurs, the interrupt flag bit is cleared for only one cycle
and then set again. Accordingly, depending on when the IF register is read,
this bit may be 0 even though INTn is 0. When the TMS320C3x is reset, 0 is
written to the interrupt flag register, thereby clearing all pending interrupts.

The interrupt flag register bits may be read and written under software control.
Writing a 1 to an IF register bit sets the associated interrupt flag to 1. Similarly,
writing a 0 resets the corresponding interrupt flag to 0. In this way, all interrupts
may be triggered and/or cleared through software. Since the interrupt flags
may be read, the interrupt pins may be polled in software when an interrupt-dri-
ven interface is not required.

Internal interrupts operate in a similar manner. In the IF register, the bit corre-
sponding to an internal interrupt may be read and written through software.
Writing a 1 sets the interrupt latch; writing a 0 clears it. All internal interrupts
are one H1/H3 cycle in length.

The CPU global interrupt enable bit (GIE), located in the CPU status register
(ST), controls all CPU interrupts. All DMA interrupts are controlled by the DMA
global interrupt enable bit, which is not dependent on ST(GIE) and is local to
the DMA. The DMA global interrupt enable bit is dependent, in part, on the
state of the DMA SYNC bits. It is not directly accessible through software (see
Chapter 8). The AND of the interrupt flag bit and the interrupt enables is then
connected to the interrupt processor.
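
The gating described above, in which a flag is ANDed with its enables separately for the CPU and the DMA, can be written out as a minimal Python sketch. The signal names follow Figure 6–4; treating each signal as a boolean and the function name interrupt_requests are assumptions made for illustration.

```python
# Sketch: gating of one interrupt flag (n) by its CPU and DMA enables.
# Each argument is a single signal treated as a boolean; the function
# name and interface are illustrative assumptions.


def interrupt_requests(flag, eint_cpu, gie_cpu, eint_dma, gie_dma):
    """Return (cpu_request, dma_request) for one interrupt source."""
    cpu_request = bool(flag and eint_cpu and gie_cpu)
    dma_request = bool(flag and eint_dma and gie_dma)
    return cpu_request, dma_request


if __name__ == "__main__":
    # ST(GIE) = 0 blocks the CPU request but not the DMA request, because
    # the DMA global interrupt enable does not depend on ST(GIE).
    print(interrupt_requests(flag=1, eint_cpu=1, gie_cpu=0,
                             eint_dma=1, gie_dma=1))
```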

6.6.4 Interrupt Processing

The ’C3x allows the CPU and DMA coprocessor to respond to and process in-
terrupts in parallel. Figure 6–5 on page 6-28 shows interrupt processing flow;
for exact sequence, refer to Table 6–7 on page 6-29.


Figure 6–5. Interrupt Processing

Is an enabled interrupt set?
    No:  keep checking.
    Yes: if enabled, the interrupt is a CPU interrupt (CPU path);
         if enabled, the interrupt is a DMA interrupt (DMA path).

CPU path:
    1. Disable interrupts (GIE ← 0)
    2. Clear interrupt flag
    3. PC → *(++SP)
    4. Complete all fetched instructions
    5. PC ← interrupt vector
    6. CPU starts executing ISR routine

DMA path:
    1. Clear interrupt flag
    2. DMA proceeds according to SYNC bits
    3. DMA continues

Note: CPU and DMA Interrupts


CPU and DMA interrupts are acknowledged (responded to by the CPU) on
instruction fetch boundaries only. If instruction fetches are halted because
of pipeline conflicts or execution of RPTS loops, CPU and DMA interrupts are
not acknowledged until instruction fetching continues.


Table 6–7. Interrupt Latency

Cycle  Description                                          Fetch     Decode     Read       Execute
1      Recognize interrupt in single-cycle fetched          prog a+1  prog a     prog a–1   prog a–2
       (prog a + 1) instruction.
2      Temporarily disable interrupt until GIE is cleared.  —         interrupt  prog a     prog a–1
3      Read the interrupt vector table.                     —         —          interrupt  prog a
4      Clear interrupt flag; clear GIE bit; store return    —         —          —          interrupt
       address to stack.
5      Pipeline begins to fill with ISR instruction.        isr1      —          —          —
6      Pipeline continues to fill with ISR instruction.     isr2      isr1       —          —
7      Pipeline continues to fill with ISR instruction.     isr3      isr2       isr1       —
8      Execute first instruction of interrupt service       isr4      isr3       isr2       isr1
       routine.

In the CPU interrupt processing cycle (left side of Figure 6–5), the correspond-
ing interrupt flag in the IF register is cleared, and interrupts are globally dis-
abled (GIE = 0). The CPU completes all fetched instructions. The current PC
is pushed to the top of the stack. The interrupt vector is fetched and loaded into
the PC, and the CPU starts executing the first instruction in the interrupt ser-
vice routine (ISR).

If you wish to make the interrupt service routine interruptible, you can set the
GIE bit to 1 after entering the ISR.

The DMA interrupt processing cycle (right side of Figure 6–5) is similar to that
of the CPU. After the pertinent interrupt flag is cleared, the DMA coprocessor
proceeds according to the status of the SYNC bits in the DMA coprocessor
global control register.

The interrupt acknowledge (IACK) instruction can be used to signal externally
that an interrupt has been serviced. If external memory is specified in the
operand, IACK drives the IACK pin and performs a dummy read. The read is
performed from the address specified by the IACK instruction operand. IACK is
typically placed in the early portion of an interrupt service routine. However,
it may be better suited at the end of the interrupt service routine or be totally
unnecessary.

Note the following:

- Interrupts are disabled during an RPTS and during a delayed branch (until
the three instructions following a delayed branch are completed). Inter-
rupts are held until after the branch.


- When an interrupt occurs, instructions currently in the decode and read
phases continue regular execution. This is not the case for an instruction
in the fetch phase:

    - If the interrupt occurs in the first cycle of the fetch of an instruction,
      the fetched instruction is discarded (not executed), and the address of
      that instruction is pushed to the top of the system stack.

    - If the interrupt occurs after the first cycle of the fetch (in the case of
      a multicycle fetch due to wait states), that instruction is executed, and
      the address of the next instruction to be fetched is pushed to the top of
      the system stack.
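
The two fetch-phase cases above determine which return address ends up on the stack. The following Python sketch captures that decision; the function name, its parameters, and the convention that fetch_cycle counts from 1 are assumptions made for illustration.

```python
# Sketch: which address is pushed when an interrupt arrives while an
# instruction is in the fetch phase. Names and interface are assumptions.


def interrupt_return_address(fetch_address, next_fetch_address, fetch_cycle):
    """Return (pushed_address, instruction_executed).

    fetch_cycle is 1 if the interrupt occurs in the first cycle of the
    fetch, and greater than 1 for a later cycle of a multicycle fetch
    (wait states).
    """
    if fetch_cycle < 1:
        raise ValueError("fetch_cycle counts from 1")
    if fetch_cycle == 1:
        # The fetched instruction is discarded; it runs after the ISR.
        return fetch_address, False
    # The instruction completes; execution resumes at the next fetch.
    return next_fetch_address, True


if __name__ == "__main__":
    print(interrupt_return_address(0x100, 0x101, fetch_cycle=1))
    print(interrupt_return_address(0x100, 0x101, fetch_cycle=2))
```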

6.6.5 CPU Interrupt Latency

CPU interrupt latency, defined as the time from the acknowledgement of the
interrupt to the execution of the first interrupt service routine (ISR) instruction,
is at least eight cycles. This is explained in Table 6–7 on page 6-29, where the
interrupt is treated as an instruction. It is assumed that all of the instructions are
single-cycle instructions.
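
To see where the eight cycles come from, the pipeline occupancy of Table 6–7 can be regenerated by shifting a four-stage pipeline one stage per cycle. The Python sketch below is a simplified illustration under the table's own assumption of single-cycle instructions; the function name and the use of "-" for an empty stage are choices made for the example.

```python
# Sketch: reproduce the pipeline occupancy of Table 6-7 by shifting a
# four-stage pipeline (fetch, decode, read, execute) once per cycle.
# Assumes single-cycle instructions, as the table does.

EMPTY = "-"


def interrupt_pipeline_trace():
    """Return rows of (cycle, fetch, decode, read, execute)."""
    # Cycle 1: the interrupt is recognized while prog a+1 is being fetched.
    stages = ["prog a+1", "prog a", "prog a-1", "prog a-2"]
    rows = [(1, *stages)]
    # What enters the fetch stage in cycles 2 to 8. Fetching of ISR
    # instructions starts only after the interrupt has executed (cycle 4).
    fetched = [EMPTY, EMPTY, EMPTY, "isr1", "isr2", "isr3", "isr4"]
    for cycle, new_fetch in enumerate(fetched, start=2):
        fetch, decode, read, _ = stages
        if cycle == 2:
            # The instruction in fetch is discarded and the interrupt,
            # treated as an instruction, enters decode in its place.
            fetch = "interrupt"
        stages = [new_fetch, fetch, decode, read]
        rows.append((cycle, *stages))
    return rows


if __name__ == "__main__":
    for row in interrupt_pipeline_trace():
        print(*row, sep="\t")
```

In this trace the first ISR instruction reaches the execute stage in cycle 8, which matches the minimum latency stated above; wait states or multicycle instructions would lengthen it.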

6.6.6 CPU/DMA Interaction

If the DMA is not using interrupts for synchronization of transfers, it will not be
affected by the processing of the CPU interrupts. Detected interrupts are
responded to by the CPU and DMA on instruction fetch boundaries only. If
instruction fetches are halted due to pipeline conflicts or when executing
instructions in an RPTS loop, interrupts will not be responded to until
instruction fetching continues. It is therefore possible to interrupt the CPU and DMA
simultaneously with the same or different interrupts and, in effect, synchronize
their activities. For example, it may be necessary to cause a high-priority DMA
transfer that avoids bus conflicts with the CPU (that is, that makes the DMA
higher priority than the CPU). This may be accomplished by using an interrupt
that causes the CPU to trap to an interrupt routine that contains an IDLE
instruction. Then if the same interrupt is used to synchronize DMA transfers,
the DMA transfer counter can be used to generate an interrupt and thus return
control to the CPU following the DMA transfer.

Since the DMA and CPU share the same set of interrupt flags, the DMA may
clear an interrupt flag before the CPU can respond to it. For example, if the
CPU interrupts are disabled, the DMA can respond to interrupts and thus clear
the associated interrupt flags.


6.6.7 TMS320C3x Interrupt Considerations


Give careful consideration to TMS320C3x interrupts, especially if you make
modifications to the status register when the global interrupt enable (GIE) bit
is set. This can result in the GIE bit being erroneously set or reset as described
in the following paragraphs.

The GIE bit is set to 0 by an interrupt. This can cause a processing error if any
code following within two cycles of the interrupt recognition attempts to read
or modify the status register. For example, if the status register is being pushed
onto the stack, it will be stored incorrectly if an interrupt was acknowledged two
cycles before the store instruction.

When an interrupt signal is recognized, the TMS320C3x continues executing
the instructions already in the read and decode phases in the pipeline. However,
because the interrupt is acknowledged, the GIE bit is reset to 0, and the
store instruction already in the pipeline will store the wrong status register
value.

For example, if the program is like this:

        ...
        NOP
        LDI   @V_ADDR, AR1    ; <-- interrupt recognized here
        MPYI  *AR1, R0
        PUSH  ST
        ...
        POP   ST
        ...

the PUSH ST instruction will save the ST contents in memory, which includes
GIE = 0. Since the device is expected to have GIE = 1, the POP ST instruction
will put the wrong status register value into the ST.

A similar situation may occur if the GIE bit = 1 and an instruction executes that
is intended to modify the other status bits and leave the GIE bit set. In the
above example, this erroneous setting would occur if the interrupt were recog-
nized two cycles before the POP ST instruction. In that case, the interrupt
would clear the GIE bit, but the execution of the POP instruction would set the
GIE bit. Since the interrupt has been recognized, the interrupt service routine
will be entered with interrupts enabled, rather than disabled as expected.
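
The PUSH ST/POP ST problem described above can be traced step by step in a small Python model. This is a simplified sketch, not a cycle-accurate simulation: it assumes the GIE bit is the one cleared by the mask 0DFFFh used later in this section (bit 13), and that the ISR returns with RETI, which re-enables interrupts. The function name is hypothetical.

```python
# Sketch of the PUSH ST / POP ST hazard. Assumptions: GIE is the ST bit
# cleared by the mask 0DFFFh (bit 13), and the ISR returns with RETI,
# which sets GIE again. Not a cycle-accurate model.

GIE = 0x2000


def st_after_hazard(st_before):
    """Follow ST through the sequence described in the text."""
    stack = []
    st = st_before
    # 1. The interrupt is recognized: hardware clears GIE.
    st &= ~GIE
    # 2. PUSH ST, already in the pipeline, stores ST with GIE = 0.
    stack.append(st)
    # 3. The ISR runs and returns; interrupts are enabled again.
    st |= GIE
    # 4. Later, POP ST restores the saved value, including GIE = 0.
    st = stack.pop()
    return st


if __name__ == "__main__":
    result = st_after_hazard(GIE | 0x0001)  # GIE = 1 plus one condition flag
    print(hex(result), "GIE =", int(bool(result & GIE)))
```

The other status bits survive the round trip; only GIE comes back with the wrong value, which is why interrupts stay disabled after the POP.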

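One solution is to use traps. For example, you can use TRAP 0 to reset GIE
and use TRAP 1 to set GIE. This is accomplished by making the routines for
TRAP 0 and TRAP 1 consist of the single instructions RETS and RETI,
respectively: the trap itself resets GIE, RETS returns without changing it,
and RETI sets it to 1 on return.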


Another alternative incorporates the following code fragment, which protects
a modification or save of the status register by first disabling interrupts
through the interrupt enable register:

    PUSH  IE            ; Save IE register (added to avoid pipeline problems)
    LDI   0, IE         ; Clear IE register (added to avoid pipeline problems)
    NOP                 ; Two NOPs or useful instructions
    NOP                 ;
    AND   0DFFFh, ST    ; Set GIE = 0 (instruction that reads or writes ST)
    POP   IE            ; Restore IE register (added to avoid pipeline problems)

6.6.8 TMS320C30 Interrupt Considerations


The TMS320C30 has two unique exceptions to the interrupt operation.

- The status register global interrupt enable (GIE) bit may be erroneously
reset to 0 (disabled setting) if all of the following conditions are true:
    - A conditional trap instruction (TRAPcond) has been fetched,
    - The condition for the trap is false, and
    - A pipeline conflict has occurred, resulting in a delay in the decode or
      read phases of the instruction.
During the decode phase of a conditional trap, interrupts are temporarily
disabled to ensure that the trap will execute before a subsequent interrupt.
If a pipeline conflict occurs and causes a delay in execution of the condi-
tional trap, the interrupt disabled condition may become the last known
condition of the GIE bit. In the case that the trap condition is false, inter-
rupts will be permanently disabled until the GIE bit is intentionally set. The
condition does not present itself when the trap condition is true, because
normal operation of the instruction causes the GIE to be reset, and stan-
dard coding practice will set the GIE to 1 before the trap routine is exited.
Several instruction sequences that can cause pipeline conflicts have been
found:
    - LDI mem,SP
      TRAPcond n
    - LDI mem,SP
      NOP
      TRAPcond n
    - STI SP,mem
      TRAPcond n
    - STI Rx,*ARy
      LDI *ARx,Ry
      ||LDI *ARz,Rw
      TRAPcond n

Other similar conditions may also cause a delay in the execution. There-
fore, the following solution is recommended to avoid or rectify the problem.
Insert two NOP instructions immediately prior to the TRAPcond instruc-
tion. One NOP is insufficient in some cases, as illustrated in the second
bulleted item, above. This eliminates the opportunity for any pipeline con-
flicts in the immediately preceding instructions and enables the conditional
trap instruction to execute without delays.

- Asynchronous accesses to the interrupt flag register (IF) can cause the
TMS320C3x to fail to recognize and service an interrupt. This may occur
when an interrupt is generated and is ready to be latched into the IF regis-
ter on the same cycle that the IF is being written to by the CPU. Note that
logic operations (AND, OR, XOR) may write to the IF register.
The logic currently gives the CPU write priority; consequently, the as-
serted interrupt might be lost. This is particularly true if the asserted inter-
rupt has been generated internally (for example, a direct memory access
(DMA) interrupt). This situation can arise as a result of a decision to poll
certain interrupts or a desire to clear pending interrupts due to a long pulse
width. In the case of a long pulse width, the interrupt may be generated
after the CPU responds to the interrupt and attempts to automatically clear
it by the interrupt vector process.
The recommended solution is not to use the interrupt polling technique but
to design the external interrupt inputs to have pulse widths of between 1
and 2 instruction cycles. The alternative to strict polling is to periodically
enable and disable the interrupts that would be polled, thereby allowing
the normal interrupt vectoring to take place; that automatically clears the
interrupt flag without affecting other interrupts. If you need to clear a pend-
ing interrupt, it is recommended that you use a memory location to indicate
that the interrupt is invalid. Then the interrupt service routine can read that
location, clear it (if the pending interrupt is invalid), and return immediately.
The following code fragments show how a dummy interrupt due to a long
interrupt pulse might be handled:

ISR_n:       PUSH  ST                ;
             PUSH  DP                ; Save registers
             PUSH  R0                ;
             LDI   0, DP             ; Clear data page pointer
             LDI   @DUMMY_INT, R0    ; If DUMMY_INT is 0 or positive,
             BNN   ISR_n_START       ; go to ISR_n_START
             STI   DP, @DUMMY_INT    ; Set DUMMY_INT = 0
             POP   R0                ;
             POP   DP                ;
             POP   ST                ; Housekeeping, return from interrupt
             RETI                    ;

ISR_n_START: .
             .                       ; Normal interrupt service routine
             .                       ; code goes here
             LDI   INT_Fn, R0        ;
             AND   IF, R0            ; If no INT_Fn bit is set in IF,
             BZ    ISR_n_END         ; exit ISR
             LDI   0, DP             ; Otherwise clear
             LDI   0FFFFh, R0        ; DP and set
             STI   R0, @DUMMY_INT    ; DUMMY_INT negative and exit
ISR_n_END:
             POP   R0                ;
             POP   DP                ; Exit ISR
             POP   ST                ;
             RETI                    ;

6.6.9 Prioritization and Control

The CPU controls all prioritization of interrupts (see Table 6–8 for reset and in-
terrupt vector locations and priorities). If the DMA is not using interrupts for
synchronization of transfers, it will not be affected by the processing of the
CPU interrupts. Detected interrupts are responded to by the CPU and DMA
on instruction fetch boundaries only. If instruction fetches are halted due to
pipeline conflicts or when executing instructions in an RPTS loop, interrupts
will not be responded to until instruction fetching continues. It is therefore pos-
sible to interrupt the CPU and DMA simultaneously with the same or different
interrupts and, in effect, synchronize their activities. For example, it may be
necessary to cause a high-priority DMA transfer that avoids bus conflicts with
the CPU, that is, make the DMA higher priority than the CPU. This may be ac-
complished by using an interrupt that causes the CPU to trap to an interrupt
routine that contains an IDLE instruction. Then if the same interrupt is used to
synchronize DMA transfers, the DMA transfer counter can be used to generate
an interrupt, thereby returning control to the CPU following the DMA transfer.

Since the DMA and CPU share the same set of interrupt flags, the DMA can
clear an interrupt flag before the CPU can respond to it. For example, if the
CPU interrupts are disabled, the DMA can respond to interrupts and thus clear
the associated interrupt flags.


Table 6–8. Reset and Interrupt Vector Locations


Reset or Vector
Interrupt Location Priority Function
RESET 0h 0 External reset signal input on the RESET pin

INT0 1h 1 External interrupt input on the INT0 pin

INT1 2h 2 External interrupt input on the INT1 pin

INT2 3h 3 External interrupt input on the INT2 pin

INT3 4h 4 External interrupt input on the INT3 pin

XINT0 5h 5 Internal interrupt generated when serial-port 0 transmit buffer is empty

RINT0 6h 6 Internal interrupt generated when serial-port 0 receive buffer is full

XINT1† 7h 7 Internal interrupt generated when serial-port 1 transmit buffer is empty

RINT1† 8h 8 Internal interrupt generated when serial-port 1 receive buffer is full

TINT0 9h 9 Internal interrupt generated by timer 0

TINT1 0Ah 10 Internal interrupt generated by timer 1

DINT 0Bh 11 Internal interrupt generated by DMA controller 0


† Reserved on TMS320C31


6.7 TMS320LC31 Power Management Modes

The TMS320LC31 CPU has been enhanced by the addition of two power man-
agement modes:
- IDLE2, and
- LOPOWER.

6.7.1 IDLE2

The H1 instruction clock is held high until one of the four external interrupts is
asserted. In IDLE2 mode, the TMS320C31 behaves as follows:

- No instructions are executed.

- The CPU, peripherals, and internal memory retain their previous states.

- The primary bus output pins are idle:

    - The address lines remain in their previous states,
    - The data lines are in the high-impedance state, and
    - The output control signals are inactive.

- When the device is in the functional (non-emulation) mode, the clocks stop
with H1 high and H3 low (see Figure 6–6).

- The ’C31 will remain in IDLE2 until one of the four external interrupts
(INT3–INT0) is asserted for at least one H1 cycle. When one of the four
interrupts is asserted, the clocks start after a delay of one H1 cycle. When
the clocks restart, they may be in the opposite phase (that is, H1 may be
high if H3 was high before the clocks were stopped; H3 may be high if H1
was previously high). The H1 and H3 clocks will remain 180° out of phase
with each other (see Figure 6–7).

- For one of the four external interrupts to be recognized and serviced by
the CPU during the IDLE2 operation, the interrupt must be asserted for
less than three cycles but more than two cycles.

- The instruction following the IDLE2 instruction will not be executed until
after the return from interrupt instruction (RETI) is executed.

- When the device is in emulation mode, the H1 and H3 clocks will continue
to run normally and the CPU will operate as if an IDLE instruction had been
executed. The clocks continue to run for correct operation of the emulator.


Delayed Branch
For correct device operation, the three instructions after a delayed
branch should not be IDLE or IDLE2 instructions.

Figure 6–6. IDLE2 Timing

[Timing diagram not reproduced. Traces: CLKIN, H3, H1, ADDR, Data.
Marked event: IDLE2 execution.]

Figure 6–7. Interrupt Response Timing After IDLE2 Operation

[Timing diagram not reproduced. Traces: CLKIN, H3, H1, INT3 to INT0,
INT3 to INT0 flag, ADDR (vector address, then first address), Data.
Marked events: clocks driven, interrupt vector read, fetch of first
instruction of service routine.]


6.7.2 LOPOWER
In the LOPOWER (low power) mode, the CPU continues to execute instructions,
and the DMA can continue to perform transfers, but at a reduced clock
rate of CLKIN frequency / 16.

For example, in LOPOWER mode a TMS320C31 with a CLKIN frequency of 32 MHz
will perform identically to a 2 MHz TMS320C31 with an instruction cycle time
of 1,000 ns.

During the read phase of the . . .      The TMS320C31 . . .
LOPOWER instruction (Figure 6–8)        slows to 1/16 of full-speed operation.
MAXSPEED instruction (Figure 6–9)       resumes full-speed operation.
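
The arithmetic behind the 32 MHz example can be checked with a short Python sketch. It assumes that one instruction (H1) cycle equals two CLKIN periods at full speed, which is consistent with the 2 MHz and 1,000 ns figures quoted above; the function name is hypothetical.

```python
# Sketch: instruction cycle time with and without LOPOWER.
# Assumption: one H1 cycle is two CLKIN periods at full speed, which
# matches the 2 MHz / 1,000 ns example in the text.

CLKIN_PERIODS_PER_H1 = 2
LOPOWER_DIVIDER = 16


def h1_cycle_ns(clkin_hz, lopower=False):
    """Instruction cycle time in nanoseconds for a given CLKIN frequency."""
    periods = CLKIN_PERIODS_PER_H1 * (LOPOWER_DIVIDER if lopower else 1)
    return periods * 10**9 / clkin_hz


if __name__ == "__main__":
    print(h1_cycle_ns(32_000_000))                 # full speed
    print(h1_cycle_ns(32_000_000, lopower=True))   # LOPOWER
    print(h1_cycle_ns(2_000_000))                  # 2 MHz part at full speed
```

Under this assumption one LOPOWER instruction cycle spans 32 CLKIN periods, which agrees with the 32 CLKIN interval marked in the LOPOWER and MAXSPEED timing figures.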

Figure 6–8. LOPOWER Timing

[Timing diagram not reproduced. Traces: CLKIN, H3, H1.
Marked event: LOPOWER read. Marked interval: 32 CLKIN.]

Figure 6–9. MAXSPEED Timing

[Timing diagram not reproduced. Traces: CLKIN, H3, H1.
Marked event: MAXSPEED read. Marked interval: 32 CLKIN.]

Chapter 7

External Bus Operation

Memories and external peripheral devices are accessible through two external
interfaces on the TMS320C30:
- the primary bus, and
- the expansion bus.

On the TMS320C31, one bus, the primary bus, is available to access external
memories and peripheral devices. You can control wait-state generation, per-
mitting access to slower memories and peripherals, by manipulating
memory-mapped control registers associated with the interfaces and by using
an external input signal.

Major topics discussed in this chapter are listed below.

Topic                                          Page
7.1  External Interface Control Registers      7-2
7.2  External Interface Timing                 7-6
7.3  Programmable Wait States                  7-28
7.4  Programmable Bank Switching               7-30


7.1 External Interface Control Registers


The TMS320C30 provides two external interfaces: the primary bus and the ex-
pansion bus. The TMS320C31 provides one external interface: the primary
bus. The primary bus consists of a 32-bit data bus, a 24-bit address bus, and
a set of control signals. The expansion bus consists of a 32-bit data bus, a
13-bit address bus, and a set of control signals. Both buses support soft-
ware-controlled wait states and an external ready input signal, and both buses
are useful for data, program, and I/O accesses.
Access is determined by an active strobe signal (STRB, MSTRB, or IOSTRB).
When a primary bus access is performed, STRB is low. The expansion bus of
the TMS320C30 supports two types of accesses:
- Memory access is signaled by MSTRB low. The timing for an MSTRB access
is the same as that of the STRB access on the primary bus.
- External peripheral device access is signaled by IOSTRB low.
Each of the buses (primary and expansion) has an associated control register.
These registers are memory-mapped as shown in Figure 7–1.

Figure 7–1. Memory-Mapped External Interface Control Registers


Register                                        Peripheral Address
Expansion-Bus Control (see subsection 7.1.2)† 808060h
Reserved 808061h
Reserved 808062h
Reserved 808063h
Primary-Bus Control (see subsection 7.1.1) 808064h
Reserved 808065h
Reserved 808066h
Reserved 808067h
Reserved 808068h
Reserved 808069h
Reserved 80806Ah
Reserved 80806Bh
Reserved 80806Ch
Reserved 80806Dh
Reserved 80806Eh
Reserved 80806Fh

† Reserved on the TMS320C31


7.1.1 Primary-Bus Control Register


The primary bus control register is a 32-bit register that contains the control
bits for the primary bus (see Figure 7–2). Table 7–1 lists the register bits with
the bit names and functions.

Figure 7–2. Primary-Bus Control Register

Bits     Field     Access
31–13    xx
12–8     BNKCMP    R/W
7–5      WTCNT     R/W
4–3      SWW       R/W
2        HIZ       R/W
1        NOHOLD    R/W
0        HOLDST    R

NOTE: xx = reserved bit, read as 0.
      R = read, W = write.


Table 7–1. Primary-Bus Control Register Bits Summary


Bit Name Reset Value Function
0 HOLDST x† Hold status bit. This bit signals whether the port is being held
(HOLDST = 1) or is not being held (HOLDST = 0). This status bit is valid
whether the port has been held via hardware or software.

1 NOHOLD 0 Port hold signal. NOHOLD allows or disallows the port to be held by an
external HOLD signal. When NOHOLD = 1, the TMS320C3x takes over
the external bus and controls it, regardless of serviced or pending re-
quests by external devices. No hold acknowledge (HOLDA) is asserted
when a HOLD is received. However, it is asserted if an internal hold is
generated (HIZ = 1). NOHOLD is set to 0 at reset.

2 HIZ 0 Internal hold. When set (HIZ = 1), the port is put in hold mode. This is
equivalent to the external HOLD signal. By forcing a high-impedance
condition, the TMS320C3x can relinquish the external memory port
through software. HOLDA goes low when the port is placed in the
high-impedance state. HIZ is set to 0 at reset.

4–3 SWW 11 Software wait mode. In conjunction with WTCNT, this two-bit field de-
fines the mode of wait-state generation. It is set to 1 1 at reset.

7–5 WTCNT 111 Software wait mode. This three-bit field specifies the number of cycles
to use when in software wait mode for the generation of internal wait
states. The range is 0 (WTCNT = 0 0 0) to 7 (WTCNT = 1 1 1) H1/H3
cycles. It is set to 1 1 1 at reset.

12–8 BNKCMP 10000 Bank compare. This five-bit field specifies the number of MSBs of the
address to be used to define the bank size. It is set to 1 0 0 0 0 at reset.

31–13 Reserved 0–0 Read as 0.


† x = 0 or 1
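
The field layout in Table 7–1 can be exercised with a small Python sketch that packs and unpacks the register word. The bit positions and reset values are taken from the table; the helper names pack_primary_bus_control and unpack_primary_bus_control are assumptions made for illustration, and the sketch does not access any hardware.

```python
# Sketch: pack and unpack the primary-bus control register fields of
# Table 7-1. Helper names are illustrative assumptions; no hardware access.

# name: (least significant bit, width in bits)
FIELDS = {
    "HOLDST": (0, 1),   # read-only status bit
    "NOHOLD": (1, 1),
    "HIZ": (2, 1),
    "SWW": (3, 2),
    "WTCNT": (5, 3),
    "BNKCMP": (8, 5),
}


def pack_primary_bus_control(**values):
    """Build a register word from field values; unnamed fields are 0."""
    word = 0
    for name, value in values.items():
        lsb, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} must fit in {width} bit(s)")
        word |= value << lsb
    return word


def unpack_primary_bus_control(word):
    """Split a register word into its named fields."""
    return {name: (word >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in FIELDS.items()}


if __name__ == "__main__":
    # Reset values from Table 7-1 (HOLDST is 0 or 1 at reset; 0 used here).
    reset = pack_primary_bus_control(SWW=0b11, WTCNT=0b111, BNKCMP=0b10000)
    print(hex(reset))
    print(unpack_primary_bus_control(reset))
```

With HOLDST taken as 0, the reset fields combine to 10F8h. Bits 31–13 are reserved and read as 0, so the helpers never set them.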


7.1.2 Expansion-Bus Control Register


The expansion-bus control register is a 32-bit register that contains control bits
for the expansion bus (see Figure 7–3 and Table 7–2).

Figure 7–3. Expansion-Bus Control Register

Bits     Field     Access
31–8     xx
7–5      WTCNT     R/W
4–3      SWW       R/W
2–0      xx

NOTE: xx = reserved bit, read as 0.
      R = read, W = write.

Table 7–2. Expansion-Bus Control Register Bits Summary


Bit Name Reset Value Function
2–0 Reserved 000 Read as 0.

4–3 SWW 11 Software wait-state generation. In conjunction with the WTCNT, this
two-bit field defines the mode of wait-state generation. It is set to 1 1
at reset.

7–5 WTCNT 111 Software wait mode. This three-bit field specifies the number of cycles
to use when in software wait mode for the generation of internal wait
states. The range is 0 (WTCNT = 0 0 0) to 7 ( WTCNT = 1 1 1) H1/H3
clock cycles. It is set to 1 1 1 at reset.

31–8 Reserved 0–0 Read as 0.
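
As an illustration of the layout in Table 7–2, the following C sketch packs SWW and WTCNT into an expansion-bus control word. The function and macro names are hypothetical and do not come from any TI header; only the bit positions are taken from the table.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions from Table 7-2; the names are illustrative only. */
#define XBUS_SWW_SHIFT   3u   /* SWW occupies bits 4-3   */
#define XBUS_WTCNT_SHIFT 5u   /* WTCNT occupies bits 7-5 */

/* Build an expansion-bus control word; reserved bits stay 0. */
static uint32_t xbus_control_word(uint32_t sww, uint32_t wtcnt)
{
    assert(sww <= 3u);    /* two-bit field   */
    assert(wtcnt <= 7u);  /* three-bit field */
    return (wtcnt << XBUS_WTCNT_SHIFT) | (sww << XBUS_SWW_SHIFT);
}
```

With the reset values (SWW = 1 1, WTCNT = 1 1 1), xbus_control_word(3, 7) evaluates to 0F8h.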


7.2 External Interface Timing


This section discusses functional timing of operations on the primary bus and
the expansion bus, the TMS320C3x’s two independent parallel buses.
Detailed timing specifications for all TMS320C3x signals are contained in Sec-
tion 13.5 on page 13-30.

The parallel buses implement three mutually exclusive address spaces distin-
guished through the use of three separate control signals: STRB, MSTRB, and
IOSTRB. The STRB signal controls accesses on the primary bus, and the
MSTRB and IOSTRB control accesses on the expansion bus. Since the two
buses are independent, you can make two accesses in parallel.

With the exception of bank switching and the external HOLD function (dis-
cussed later in this section), timing of primary bus cycles and MSTRB expan-
sion bus cycles are identical and are discussed collectively. The acronym
(M)STRB is used in references that pertain equally to STRB and MSTRB. Sim-
ilarly, (X)R/W, (X)A, (X)D, and (X)RDY are used to symbolize the equivalent
primary and expansion bus signals. The IOSTRB expansion bus cycles are
timed differently and are discussed independently.

7.2.1 Primary-Bus Cycles


All bus cycles comprise integral numbers of H1 clock cycles. One H1 cycle is
defined to be from one falling edge of H1 to the next falling edge of H1. For
full-speed (zero wait-state) accesses, writes require two H1 cycles and reads
require one cycle; however, if a read follows a write, the read requires two
cycles. This applies to both primary-bus and MSTRB expansion-bus accesses.
Recall that, internally (from the perspective of the CPU and DMA), writes
require only one cycle if no accesses to that interface are in progress. The
following discussions pertain to zero wait-state accesses unless otherwise
specified.
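
The cycle counts above can be sketched as a small C helper. This is a simplified model, not a timing specification: it ignores bank switching and HOLD, it adds the same number of wait states to every access, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Count H1 cycles for a sequence of (M)STRB accesses such as "RRW",
 * where 'R' is a read and 'W' is a write. wait_states is added to
 * every access. Model taken from the text: a write takes two cycles,
 * a read takes one, and a read directly after a write takes two. */
static unsigned mstrb_cycles(const char *seq, unsigned wait_states)
{
    unsigned total = 0;
    char prev = '\0';
    for (size_t i = 0; seq[i] != '\0'; ++i) {
        char op = seq[i];
        assert(op == 'R' || op == 'W');
        unsigned base = (op == 'W' || prev == 'W') ? 2u : 1u;
        total += base + wait_states;
        prev = op;
    }
    return total;
}
```

Under this model, a read-read-write sequence with no wait states takes four H1 cycles, and a write-write-read sequence takes six.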

The (M)STRB signal is low for the active portion of both reads and writes. The
active portion lasts one H1 cycle. Additionally, before and after the active por-
tion ((M)STRB low) of writes only, there is a transition cycle of H1. This transi-
tion cycle consists of the following sequence:

1) (M)STRB is high.

2) If required, (X)R/W changes state on H1 rising.

3) If required, address changes on H1 rising if the previous H1 cycle was the
active portion of a write. If the previous H1 cycle was a read, address
changes on H1 falling.


Figure 7–4 illustrates a read-read-write sequence for (M)STRB active and no
wait states. The data is read as late in the cycle as possible to allow maximum
access time from address valid. Note that although external writes require two
cycles, internally (from the perspective of the CPU and DMA) they require only
one cycle if no accesses to that interface are in progress. In the typical timing
for all external interfaces, the (X)R/W strobe does not change until (M)STRB
or IOSTRB goes inactive.

Figure 7–4. Read-Read-Write for (M)STRB = 0

H3

H1

(M)STRB

(X)R/W

(X)A

(X)D Read Read Write Data

(X)RDY

Note: Back-to-Back Read Operations


(M)STRB will remain low during back-to-back read operations.


Figure 7–5 illustrates a write-write-read sequence for (M)STRB active and no
wait states. The address and data written are held valid approximately
one-half cycle after (M)STRB changes.

Figure 7–5. Write-Write-Read for (M)STRB = 0

H3

H1

(M)STRB

(X)R/W

(X)A

(X)D Write Data Write Data Read

(X)RDY


Figure 7–6 illustrates a read cycle with one wait state. Since (X)RDY = 1, the
read cycle is extended. (M)STRB, (X)R/W, and (X)A are also extended one
cycle. The next time (X)RDY is sampled, it is 0.

Figure 7–6. Use of Wait States for Read for (M)STRB = 0

H3

H1

(M)STRB

XR/W

(X)A

(X)D Read Write Data

(X)RDY

Extra
Cycle


Figure 7–7 illustrates a write cycle with one wait state. Since initially (X)RDY =
1, the write cycle is extended. (M)STRB, (X)R/W, and (X)A are extended one
cycle. The next time (X)RDY is sampled, it is 0.

Figure 7–7. Use of Wait States for Write for (M)STRB = 0

H3

H1

(M)STRB

(X)R/W

(X)A

(X)D Write Data Write Data

(X)RDY

Extra
Cycle


7.2.2 Expansion-Bus I/O Cycles


In contrast to primary bus and MSTRB cycles, IOSTRB reads and writes are
both two cycles in duration (with no wait states) and exhibit the same timing.
During these cycles, address always changes on the falling edge of H1, and
IOSTRB is low from the rising edge of the first H1 cycle to the rising edge of
the second H1 cycle. The IOSTRB signal always goes inactive (high) between
cycles, and XR/W is high for reads and low for writes.

Figure 7–8 illustrates read and write cycles when IOSTRB is active and there
are no wait states. For IOSTRB accesses, reads and writes require a minimum
of two cycles. Some off-chip peripherals might change their status bits when
read or written to. Therefore, it is important to maintain valid addresses when
communicating with these peripherals. For reads and writes when IOSTRB is
active, IOSTRB is completely framed by the address.

Figure 7–8. Read and Write for IOSTRB = 0

H3

H1

IOSTRB

XR/W

XA

XD Read Write Data

XRDY


Figure 7–9 illustrates a read with one wait state when IOSTRB is active, and
Figure 7–10 illustrates a write with one wait state when IOSTRB is active. For
each wait state added, IOSTRB, XR/W, and XA are extended one clock cycle.
Writes hold the data on the bus one additional cycle. The sampling of XRDY
is repeated each cycle.

Figure 7–9. Read With One Wait State for IOSTRB = 0

H3

H1

IOSTRB

XR/W

XA

XD Read

XRDY

Extra
Cycle


Figure 7–10. Write With One Wait State for IOSTRB = 0

H3

H1

IOSTRB

XR/W

XA

XD Write Data

XRDY

Extra
Cycle


Figures 7–11 through 7–21 illustrate the various transitions between memory
reads and writes and I/O reads and writes over the expansion bus.

Figure 7–11. Memory Read and I/O Write for Expansion Bus

H3

H1

MSTRB

IOSTRB

XR/W

XA Memory Address I/O Address

XD Read I/O Write

XRDY


Figure 7–12. Memory Read and I/O Read for Expansion Bus

H3

H1

MSTRB

IOSTRB

XR/W

XA Memory
I/O Address
Address

XD Read Read

XRDY


Figure 7–13. Memory Write and I/O Write for Expansion Bus

H3

H1

MSTRB

IOSTRB

XR/W

XA Memory Address I/O Address

XD Memory Write I/O Write

XRDY


Figure 7–14. Memory Write and I/O Read for Expansion Bus

H3

H1

MSTRB

IOSTRB

XR/W

XA Memory Address I/O Address

XD Memory Write I/O Read

XRDY


Figure 7–15. I/O Write and Memory Write for Expansion Bus

H3

H1

MSTRB

IOSTRB

XR/W

XA I/O Address Memory Address

XD I/O Write Memory Write

XRDY


Figure 7–16. I/O Write and Memory Read for Expansion Bus

H3

H1

MSTRB

IOSTRB

XR/W

XA I/O Address Memory Address

XD I/O Write Read

XRDY


Figure 7–17. I/O Read and Memory Write for Expansion Bus

H3

H1

MSTRB

IOSTRB

XR/W

XA I/O Address Memory Address

XD Read Memory Write

XRDY


Figure 7–18. I/O Read and Memory Read for Expansion Bus

H3

H1

MSTRB

IOSTRB

XR/W

XA I/O Address Memory Address

XD Read Read

XRDY


Figure 7–19. I/O Write and I/O Read for Expansion Bus

H3

H1

MSTRB

IOSTRB

XR/W

XA

XD Write Data Read

XRDY


Figure 7–20. I/O Write and I/O Write for Expansion Bus

H3

H1

MSTRB

IOSTRB

XR/W

XA

XD Write Data Write Data

XRDY


Figure 7–21. I/O Read and I/O Read for Expansion Bus

H3

H1

MSTRB

IOSTRB

XR/W

XA

XD Read Read

XRDY


Figure 7–22 and Figure 7–23 illustrate the signal states when a bus is inactive
(after an IOSTRB or (M)STRB access, respectively). The strobes (STRB,
MSTRB, and IOSTRB) and (X)R/W go to 1. The address is undefined, and the
ready signal (XRDY or RDY) is ignored.

Figure 7–22. Inactive Bus States for IOSTRB

H3

H1

IOSTRB

XR/W

XA

XD Write Data

XRDY XRDY Ignored

Bus Inactive


Figure 7–23. Inactive Bus States for STRB and MSTRB

H3

H1

(M)STRB

(X)R/W

(X)A

(X)D Write Data

(X)RDY (X)RDY Ignored

Bus Inactive


Figure 7–24 illustrates the timing for HOLD and HOLDA. HOLD is an external
asynchronous input. There is a minimum of one cycle delay from the time when
the processor recognizes HOLD = 0 until HOLDA = 0. When HOLDA = 0, the
address, data buses, and associated strobes are placed in a high-impedance
state. All accesses occurring over an interface are complete before a hold is
acknowledged.

Figure 7–24. HOLD and HOLDA Timing

H3

H1

HOLD

HOLDA

STRB

R/W

D Write Data

Bus
Inactive


7.3 Programmable Wait States


You can control wait-state generation by manipulating memory-mapped con-
trol registers associated with both the primary and expansion interfaces. Use
the WTCNT field to load an internal timer, and use the SWW field to select one
of the following four modes of wait-state generation:
- External RDY
- WTCNT-generated RDYwtcnt
- Logical-AND of RDY and RDYwtcnt
- Logical-OR of RDY and RDYwtcnt

The four modes are used to generate the internal ready signal, RDYint, that
controls accesses. As long as RDYint = 1, the current external access is
delayed. When RDYint = 0, the current access completes. Since the use of
programmable wait states for both external interfaces is identical, only the pri-
mary bus interface is described in the following paragraphs.

RDYwtcnt is an internally generated ready signal. When an external access is
begun, the value in WTCNT is loaded into a counter. WTCNT can be any value
from 0 through 7. The counter is decremented every H1/H3 clock cycle until
it becomes 0. Once the counter reaches 0, it remains 0 until the next access.
While the counter is nonzero, RDYwtcnt = 1. While the counter is 0,
RDYwtcnt = 0.
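
The behavior of the WTCNT counter can be sketched in C as follows. The type and function names are hypothetical, and the model assumes that RDYwtcnt is evaluated before the counter is decremented in each cycle, which the text does not state explicitly.

```c
#include <assert.h>

/* Hypothetical model of the WTCNT down-counter for one access. */
typedef struct { unsigned count; } wtcnt_timer;

/* Called when an external access begins: load WTCNT (0 through 7). */
static void wtcnt_start(wtcnt_timer *t, unsigned wtcnt)
{
    assert(wtcnt <= 7u);
    t->count = wtcnt;
}

/* Called once per H1/H3 cycle. Returns RDYwtcnt for that cycle
 * (1 while the counter is nonzero, 0 once it is 0), then decrements
 * the counter unless it has already reached 0. */
static int wtcnt_tick(wtcnt_timer *t)
{
    int rdy_wtcnt = (t->count != 0u);
    if (t->count != 0u)
        t->count--;
    return rdy_wtcnt;
}
```

In this model, WTCNT = 2 holds RDYwtcnt at 1 for two cycles, and WTCNT = 0 never delays the access.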


When SWW = 0 0, RDYint depends only on RDY. RDYwtcnt is ignored.
Table 7–3 is the truth table for this mode.

Table 7–3. Wait-State Generation When SWW = 0 0


RDY RDYwtcnt RDYint
0 0 0
0 1 0
1 0 1
1 1 1

When SWW = 0 1, RDYint depends only on RDYwtcnt. RDY is ignored.
Table 7–4 is the truth table for this mode.

Table 7–4. Wait-State Generation When SWW = 0 1


RDY RDYwtcnt RDYint
0 0 0
0 1 1
1 0 0
1 1 1

When SWW = 1 0, RDYint is the logical-OR (electrical-AND, since these
signals are low true) of RDY and RDYwtcnt (see Table 7–5).

Table 7–5. Wait-State Generation When SWW = 1 0


RDY RDYwtcnt RDYint
0 0 0
0 1 0
1 0 0
1 1 1

When SWW = 1 1, RDYint is the logical-AND (electrical-OR, since these
signals are low true) of RDY and RDYwtcnt. The truth table for this mode is
Table 7–6.

Table 7–6. Wait-State Generation When SWW = 1 1


RDY RDYwtcnt RDYint
0 0 0
0 1 1
1 0 1
1 1 1
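
Tables 7–3 through 7–6 can be condensed into one C function. This is a minimal sketch with a hypothetical function name; the signal values are the electrical levels used in the tables (0 = ready, 1 = not ready).

```c
#include <assert.h>

/* RDYint for each SWW mode. All signals are low true:
 * 0 = ready, 1 = not ready (the access is delayed). */
static int rdy_int(unsigned sww, int rdy, int rdy_wtcnt)
{
    assert(sww <= 3u);
    switch (sww) {
    case 0u: return rdy != 0;            /* external RDY only          */
    case 1u: return rdy_wtcnt != 0;      /* WTCNT counter only         */
    case 2u: return rdy && rdy_wtcnt;    /* ready when either is 0     */
    default: return rdy || rdy_wtcnt;    /* ready only when both are 0 */
    }
}
```

The electrical AND in mode 1 0 completes the access as soon as either source signals ready, while the electrical OR in mode 1 1 waits for both.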


7.4 Programmable Bank Switching


Programmable bank switching allows you to switch between external memory
banks without externally inserting wait states due to memories that require
several cycles to turn off. Bank switching is implemented on the primary bus
and not on the expansion bus.

The size of a bank is determined by the number of bits specified to be
examined in the BNKCMP field of the primary bus control register (see
Table 7–1 on page 7-4). For example (see Figure 7–25), if BNKCMP = 16,
the 16 MSBs of the address are used to define a bank. Since addresses are
24 bits, the bank size is specified by the eight LSBs, yielding a bank size of 256
words. If BNKCMP ≥ 16, only the 16 MSBs are compared. Bank sizes from
2^8 = 256 to 2^24 = 16M are allowed. Table 7–7 summarizes the relationship
between BNKCMP, the address bits used to define a bank, and the resulting bank
size.

Figure 7–25. BNKCMP Example

24-bit address

23 8 7 0

Number of bits to compare Defines bank size

Table 7–7. BNKCMP and Bank Size


BNKCMP         MSBs Defining a Bank   Bank Size (32-Bit Words)
00000          None                   2^24 = 16M
00001          23                     2^23 = 8M
00010          23–22                  2^22 = 4M
00011          23–21                  2^21 = 2M
00100          23–20                  2^20 = 1M
00101          23–19                  2^19 = 512K
00110          23–18                  2^18 = 256K
00111          23–17                  2^17 = 128K
01000          23–16                  2^16 = 64K
01001          23–15                  2^15 = 32K
01010          23–14                  2^14 = 16K
01011          23–13                  2^13 = 8K
01100          23–12                  2^12 = 4K
01101          23–11                  2^11 = 2K
01110          23–10                  2^10 = 1K
01111          23–9                   2^9 = 512
10000          23–8                   2^8 = 256
10001–11111    Reserved               Undefined
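
The relationship in Table 7–7 reduces to one shift. The following C sketch assumes the 24-bit address width stated in the text; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define C3X_ADDR_BITS 24u  /* address width stated in the text */

/* Bank size in 32-bit words for a BNKCMP value of 0 through 16. */
static uint32_t bank_size_words(unsigned bnkcmp)
{
    assert(bnkcmp <= 16u);  /* larger values are reserved */
    return (uint32_t)1u << (C3X_ADDR_BITS - bnkcmp);
}
```

For example, bank_size_words(16) gives 256 words, and bank_size_words(0) gives 16M words (the whole address space is one bank).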


The TMS320C3x has an internal register that contains the MSBs (as defined
by the BNKCMP field) of the last address used for a read or write over the pri-
mary interface. At reset, the register bits are set to 0. If the MSBs of the address
being used for the current primary interface read do not match those contained
in this internal register, a read cycle is not asserted for one H1/H3 clock cycle.
During this extra clock cycle, the address bus switches over to the new ad-
dress, but STRB is inactive (high). The contents of the internal register are re-
placed with the MSBs being used for the current read of the current address.
If the MSBs of the address being used for the current read match the bits in
the register, a normal read cycle takes place.

If repeated reads are performed from the same memory bank, no extra cycles
are inserted. When a read is performed from a different memory bank, memory
conflicts are avoided by the insertion of an extra cycle. This feature can be dis-
abled by setting BNKCMP to 0. The insertion of the extra cycle occurs only
when a read is performed. The changing of the MSBs in the internal register
occurs for all reads and writes over the primary interface.
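
The bank-compare behavior described above can be modeled in a few lines of C. This is a behavioral sketch only; the structure and function names are hypothetical, and it reports whether an extra cycle would be inserted rather than reproducing bus timing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the internal bank-compare register. */
typedef struct {
    unsigned bnkcmp;     /* number of address MSBs compared (0 to 16) */
    uint32_t last_msbs;  /* MSBs of the last primary-bus address      */
} bank_state;

static void bank_reset(bank_state *b, unsigned bnkcmp)
{
    assert(bnkcmp <= 16u);
    b->bnkcmp = bnkcmp;
    b->last_msbs = 0u;   /* register bits are 0 at reset */
}

/* Model one primary-bus access to a 24-bit address. Returns 1 if an
 * extra (inactive) cycle would be inserted, else 0. */
static int bank_access(bank_state *b, uint32_t addr, int is_read)
{
    /* Mask selecting the top bnkcmp bits of a 24-bit address. */
    unsigned low_bits = 24u - b->bnkcmp;
    uint32_t mask = (0xFFFFFFu >> low_bits) << low_bits;
    uint32_t msbs = addr & mask;
    int extra = is_read && (msbs != b->last_msbs);
    b->last_msbs = msbs;  /* updated on every read and write */
    return extra;
}
```

With BNKCMP = 0 the mask is empty, so no access ever mismatches and the feature is effectively disabled, as the text notes.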

Figure 7–26 illustrates the addition of an inactive cycle when switches be-
tween banks of memory occur.

Figure 7–26. Bank-Switching Example

H3

H1

STRB

R/W

D Read Read Read

RDY

Extra
Cycle

Chapter 8

Peripherals

The TMS320C3x features two timers, two serial ports (one on the
TMS320C31), and an on-chip direct memory access (DMA) controller. These
peripheral modules are controlled through memory-mapped registers located
on the dedicated peripheral bus.

The DMA controller is used to perform input/output operations without
interfering with the operation of the CPU. Therefore, it is possible to interface the
TMS320C3x to slow external memories and peripherals (A/Ds, serial ports,
etc.) without reducing the computational throughput of the CPU. The result is
improved system performance and decreased system cost.

Major topics discussed in this chapter on peripherals are listed below.

8.1 Timers
8.2 Serial Ports
8.3 DMA Controller


8.1 Timers
The TMS320C3x timer modules are general-purpose, 32-bit, timer/event
counters, with two signaling modes and internal or external clocking (see
Figure 8–1). You can use the timer modules to signal to the TMS320C3x or the
external world at specified intervals or to count external events. With an inter-
nal clock, you can use the timer to signal an external A/D converter to start a
conversion, or it can interrupt the TMS320C3x DMA controller to begin a data
transfer. The timer interrupt is one of the internal interrupts. With an external
clock, the timer can count external events and interrupt the CPU after a speci-
fied number of events. Each timer has an I/O pin that you can use as an input
clock to the timer, an output clock signal, or a general-purpose I/O pin.

Figure 8–1. Timer Block Diagram


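[Block diagram: the internal clock/2 or the external clock (through INV) feeds
the 32-bit counter. A comparator checks the counter register (31–0) against the
period register (31–0) for period = counter and feeds the pulse generator, which
drives TSTAT and, through INV, the timer output.]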

Three memory-mapped registers are used by each timer:

- Global-Control Register
The global-control register determines the operating mode of the timer,
monitors the timer status, and controls the function of the I/O pin of the timer.

- Period Register
The period register specifies the timer’s signaling frequency.


- Counter Register
The counter register contains the current value of the incrementing count-
er. You can increment the timer on the rising edge or the falling edge of the
input clock. The counter is zeroed and can cause an internal interrupt
whenever its value equals that in the period register. The pulse generator
generates two types of external clock signals: pulse or clock. The memory
map for the timer modules is shown in Figure 8–2.

Figure 8–2. Memory-Mapped Timer Locations


Register Peripheral Address

Timer 0 Timer 1

Timer Global Control (See Table 8–1) 808020h 808030h


Reserved 808021h 808031h
Reserved 808022h 808032h
Reserved 808023h 808033h
Timer Counter (See subsection 8.1.2) 808024h 808034h
Reserved 808025h 808035h
Reserved 808026h 808036h
Reserved 808027h 808037h
Timer Period (See subsection 8.1.2) 808028h 808038h
Reserved 808029h 808039h
Reserved 80802Ah 80803Ah
Reserved 80802Bh 80803Bh
Reserved 80802Ch 80803Ch
Reserved 80802Dh 80803Dh
Reserved 80802Eh 80803Eh
Reserved 80802Fh 80803Fh
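
The addresses in Figure 8–2 follow a regular pattern, which the following C sketch captures. The enum and function names are hypothetical; the base address and offsets are taken from the figure.

```c
#include <assert.h>
#include <stdint.h>

/* Word offsets of the non-reserved registers within one timer block. */
enum timer_reg {
    TIMER_GLOBAL_CONTROL = 0x0,
    TIMER_COUNTER        = 0x4,
    TIMER_PERIOD         = 0x8
};

/* Peripheral address of a register of timer 0 or timer 1. */
static uint32_t timer_reg_addr(unsigned timer, enum timer_reg reg)
{
    assert(timer <= 1u);
    return 0x808020u + 0x10u * timer + (uint32_t)reg;
}
```

For example, timer_reg_addr(1, TIMER_PERIOD) evaluates to 808038h.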

8.1.1 Timer Global-Control Register


The timer global-control register is a 32-bit register that contains the global and
port control bits for the timer module. Table 8–1 defines this register’s bits,
names, and functions. Bits 3–0 are the port control bits; bits 11–6 are the
timer global control bits. Figure 8–3 shows the 32-bit register. Note that at reset,
all bits are set to 0 except for DATIN (which is set to the value read on TCLK).


Figure 8–3. Timer Global-Control Register


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
xx xx xx xx TSTAT INV CLKSRC C/P HLD GO xx xx DATIN DATOUT I/O FUNC
R R/W R/W R/W R/W R/W R R/W R/W R/W

R = Read, W = Write, xx = reserved bit, read as 0

Table 8–1. Timer Global-Control Register Bits Summary

Bits Name Reset Value Function


0 FUNC 0 FUNC controls the function of TCLK. If FUNC = 0, TCLK is
configured as a general-purpose digital I/O port. If FUNC = 1, TCLK is
configured as a timer pin (see Figure 8–4 for a description of the
relationship between FUNC and CLKSRC).

1 I/O 0 If FUNC = 0 and CLKSRC = 0, TCLK is configured as a
general-purpose I/O pin. In this case, if I/O = 0, TCLK is configured as a
general-purpose input pin. If I/O = 1, TCLK is configured as a
general-purpose output pin.

2 DATOUT 0 DATOUT drives TCLK when the TMS320C3x is in I/O port mode.
You can use DATOUT as an input to the timer.

3 DATIN x† Data input on TCLK or DATOUT. A write has no effect.

5–4 Reserved 0–0 Read as 0.

6 GO 0 The GO bit resets and starts the timer counter. When GO = 1 and
the timer is not held, the counter is zeroed and begins increment-
ing on the next rising edge of the timer input clock. The GO bit is
cleared on the same rising edge. GO = 0 has no effect on the
timer.

7 HLD 0 Counter hold signal. When this bit is 0, the counter is disabled and
held in its current state. If the timer is driving TCLK, the state of
TCLK is also held. The internal divide-by-two counter is also held
so that the counter can continue where it left off when HLD is set to
1. You can read and modify the timer registers while the timer is
being held. RESET has priority over HLD. Table 8–2 shows the
effect of writing to GO and HLD.

8 C/P 0 Clock/Pulse mode control. When C/P = 1, clock mode is chosen,
and the signaling of the TSTAT flag and external output will have a
50 percent duty cycle. When C/P = 0, the status flag and external
output will be active for one H1 cycle during each timer period (see
Figure 8–5 on page 8-7).

9 CLKSRC 0 Specifies the source of the timer clock. When CLKSRC = 1, an inter-
nal clock with frequency equal to one-half of the H1 frequency is
used to increment the counter. The INV bit has no effect on the inter-
nal clock source. When CLKSRC = 0, you can use an external signal
from the TCLK pin to increment the counter. The external clock is
synchronized internally, thus allowing external asynchronous clock
sources that do not exceed the specified maximum allowable exter-
nal clock frequency. This will be less than f(H1)/2. (See Figure 8–4
for a description of the relationship between FUNC and CLKSRC).

10 INV 0 Inverter control bit. If an external clock source is used and INV = 1, the
external clock is inverted as it goes into the counter. If the output of the
pulse generator is routed to TCLK and INV = 1, the output is inverted
before it goes to TCLK (see Figure 8–1). If INV = 0, no inversion is
performed on the input or output of the timer. The INV bit has no effect,
regardless of its value, when TCLK is used in I/O port mode.

11 TSTAT 0 This bit indicates the status of the timer. It tracks the output of the
uninverted TCLK pin. This flag sets a CPU interrupt on a transition from
0 to 1. A write has no effect.

31–12 Reserved 0–0 Read as 0.


† x = 0 or 1


Figure 8–4. Timer Modes as Defined by CLKSRC and FUNC

[Four block diagrams showing the connections among the internal clock, Timer In,
Timer Out, TSTAT, DATIN, the I/O port control, and the TCLK pin:]

(a) CLKSRC = 1 (Internal), FUNC = 0 (I/O Pin)
(b) CLKSRC = 1 (Internal), FUNC = 1 (Timer Pin)
(c) CLKSRC = 0 (External), FUNC = 0 (I/O Pin)
(d) CLKSRC = 0 (External), FUNC = 1 (Timer Pin)


Figure 8–5. Timer Timing


(a) TSTAT and timer output (INV = 0) when C/P = 0 (pulse mode)
    Intervals marked: 2/f(H1), 1/f(H1), 1/f(CLKSRC), and
    period register/f(CLKSRC). TINT marks the timer interrupts.

(b) TSTAT and timer output (INV = 0) when C/P = 1 (clock mode)
    Intervals marked: 1/f(CLKSRC), 2/f(H1), period register/f(CLKSRC), and
    2 x period register/f(CLKSRC). TINT marks the timer interrupts.

The rate of timer signaling is determined by the frequency of the timer input
clock and the period register. The following equations are valid with either an
internal or an external timer clock:

f(pulse mode) = f(timer clock) / period register

f(clock mode) = f(timer clock) / (2 x period register)

Note: Period Register


If the period register equals 0, refer to Section 8.1.2.
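
The two rate equations can be written as a short C function. This is a minimal sketch with a hypothetical function name; it applies only to a nonzero period register.

```c
#include <assert.h>
#include <stdint.h>

/* Timer signaling rate in Hz. clock_mode is the C/P bit:
 * 0 = pulse mode, 1 = clock mode. period must be nonzero. */
static double timer_signal_hz(double timer_clock_hz, uint32_t period,
                              int clock_mode)
{
    assert(period != 0u);
    double divisor = clock_mode ? 2.0 * (double)period : (double)period;
    return timer_clock_hz / divisor;
}
```

For example, with an 8-MHz timer clock and a period register of 1000, the rate is 8000 Hz in pulse mode and 4000 Hz in clock mode.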

Table 8–2 shows the result of a write using specified values of the GO and HLD
bits in the global control register.


Table 8–2. Result of a Write of Specified Values of GO and HLD


GO HLD Result
0 0 All timer operations are held. No reset is performed. (Reset value)

0 1 Timer proceeds from state before write.

1 0 All timer operations are held, including zeroing of the counter. The
GO bit is not cleared until the timer is taken out of hold.

1 1 Timer resets and starts.

8.1.2 Timer Period and Counter Registers


The 32-bit timer period register is used to specify the frequency of the timer
signaling. The timer counter register is a 32-bit register, which is reset to 0
whenever it increments to the value of the period register. Both registers are
set to 0 at reset.

Certain boundary conditions affect timer operation. These conditions are listed
below:

- When the period and counter registers are 0, the operation of the timer is
dependent upon the C/P mode selected. In pulse mode (C/P = 0), TSTAT
is set and remains set. In clock mode (C/P = 1), the width of the cycle is
2/f(H1), and the external clocks are ignored.

- When the counter register is not 0 and the period register = 0, the counter
will count, roll over to 0, and then behave as described above.

- When the counter register is set to a value greater than the period register,
the counter may overflow when being incremented. Once the counter
reaches its maximum 32-bit value (0FFFFFFFFh), it simply clocks over to
0 and continues.
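
The counter behavior for a nonzero period register can be sketched in C as follows. The names are hypothetical, the model assumes the comparison happens after each increment, and the period = 0 cases listed above are not modeled.

```c
#include <stdint.h>

/* Hypothetical model of the counter and period registers. */
typedef struct {
    uint32_t counter;
    uint32_t period;
} timer_model;

/* Advance the counter by one timer input clock. Returns 1 when the
 * counter reaches the period value and is zeroed, else 0. */
static int timer_tick(timer_model *t)
{
    t->counter++;               /* unsigned wraparound: 0FFFFFFFFh -> 0 */
    if (t->counter == t->period) {
        t->counter = 0u;
        return 1;
    }
    return 0;
}
```

A counter that starts above the period value keeps counting, wraps through 0FFFFFFFFh to 0, and only then reaches the period value.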

Writes from the peripheral bus override register updates from the counter and
new status updates to the control register.

8.1.3 Timer Pulse Generation


The timer pulse generator (see Figure 8–1 on page 8-2) can generate several
external signals. You can invert these signals with the INV bit. The two basic
modes are pulse mode and clock mode, as shown in Figure 8–5 on page 8-7.
In both modes, an internal clock source f(timer clock) has a frequency of
f(H1)/2, and an externally generated clock source f(timer clock) can have a
maximum frequency of f(H1)/2.6. Refer to timer timing in subsection 13.5.16
on page 13-66. In pulse mode (C/P = 0), the width of the pulse is 1/f(H1).


Figure 8–6 provides some examples of the TCLKx output when the period reg-
ister is set to various values and clock or pulse mode is selected.

Figure 8–6. Timer Output Generation Examples


Waveform   INV   C/P              Timer Period   Intervals Marked
(a)        0     0 (Pulse Mode)   1              2H1, H1
(b)        0     0 (Pulse Mode)   2              4H1, H1
(c)        0     0 (Pulse Mode)   3              6H1, H1
(d)        0     1 (Clock Mode)   1              4H1, 2H1
(e)        0     1 (Clock Mode)   2              8H1, 4H1
(f)        0     1 (Clock Mode)   3              12H1, 6H1

Waveform (a) also applies when INV = 0, C/P = 1 (Clock Mode), and
Timer Period = 0.


8.1.4 Timer Operation Modes


The timer can receive its input and send its output in several different modes,
depending upon the setting of CLKSRC, FUNC, and I/O. The four timer modes
of operation are defined as follows:
- If CLKSRC = 1 and FUNC = 0, the timer input comes from the internal
clock. The internal clock is not affected by the INV bit. In this mode, TCLK
is connected to the I/O port control, and you use TCLK as a general-pur-
pose I/O pin (see Figure 8–7). If I/O = 0, TCLK is configured as a general-
purpose input pin whose state you can read in DATIN. DATOUT has no
effect on TCLK or DATIN. If I/O = 1, TCLK is configured as a
general-purpose output pin. DATOUT is placed on TCLK and can be read
in DATIN.

Figure 8–7. Timer I/O Port Configurations

Internal External

DATOUT (NC) TCLK

DATIN
I/O = 0
(a)

Internal External

DATOUT TCLK

DATIN
I/O = 1
(b)

- If CLKSRC = 1 and FUNC = 1, the timer input comes from the internal
clock, and the timer output goes to TCLK. This value can be inverted using
INV, and you can read in DATIN the value output on TCLK.
- If CLKSRC = 0 and FUNC = 0, the timer is driven according to the status
of the I/O bit. If I/O = 0, the timer input comes from TCLK. This value can
be inverted using INV, and you can read in DATIN the value of TCLK. If I/O
= 1, TCLK is an output pin. Then, TCLK and the timer are both driven by
DATOUT. All 0-to-1 transitions of DATOUT increment the counter. INV has
no effect on DATOUT. You can read in DATIN the value of DATOUT.
- If CLKSRC = 0 and FUNC = 1, TCLK drives the timer. If INV = 0, all 0-to-1
transitions of TCLK increment the counter. If INV = 1, all 1-to-0 transitions
of TCLK increment the counter. You can read in DATIN the value of TCLK.
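
The four cases above can be summarized by a function that reports what drives the timer input for a given global-control value. This is a minimal C sketch; the enum, macro, and function names are hypothetical, and the bit positions are those of Table 8–1.

```c
#include <stdint.h>

/* Bit masks from Table 8-1. */
#define TIMER_FUNC   (1u << 0)
#define TIMER_IO     (1u << 1)
#define TIMER_CLKSRC (1u << 9)

typedef enum {
    TIMER_IN_INTERNAL,  /* internal clock, f(H1)/2     */
    TIMER_IN_TCLK,      /* external signal on TCLK     */
    TIMER_IN_DATOUT     /* DATOUT bit drives the timer */
} timer_input;

/* Decode the timer input source from a global-control value. */
static timer_input timer_input_source(uint32_t ctrl)
{
    if (ctrl & TIMER_CLKSRC)
        return TIMER_IN_INTERNAL;
    if (!(ctrl & TIMER_FUNC) && (ctrl & TIMER_IO))
        return TIMER_IN_DATOUT;
    return TIMER_IN_TCLK;
}
```

Only the combination CLKSRC = 0, FUNC = 0, I/O = 1 lets software clock the timer by toggling DATOUT.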


Figure 8–4 on page 8-6 shows the four timer modes of operation.

8.1.5 Timer Interrupts


A timer interrupt is generated whenever the TSTAT bit of the timer control reg-
ister changes from a 0 to a 1. The frequency of timer interrupts depends on
whether the timer is set up in pulse mode or clock mode.

- In pulse mode, the interrupt frequency is determined by the following
  equation:

      f(interrupt) = f(timer clock) / period register

  where
      f(interrupt) = interrupt frequency
      f(timer clock) = timer frequency

- In clock mode, the interrupt frequency is determined by the following
  equation:

      f(interrupt) = f(timer clock) / (2 x period register)

  where
      f(interrupt) = interrupt frequency
      f(timer clock) = timer frequency
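
Solving the interrupt-frequency equations for the period register gives a simple way to pick a period value. The following C sketch assumes the internal clock source, f(H1)/2; the function name is hypothetical, and integer division truncates the result.

```c
#include <assert.h>
#include <stdint.h>

/* Period-register value that gives roughly interrupt_hz interrupts per
 * second when the timer runs from the internal clock, f(H1)/2.
 * clock_mode is the C/P bit (0 = pulse mode, 1 = clock mode). */
static uint32_t period_for_interrupt_rate(uint32_t h1_hz,
                                          uint32_t interrupt_hz,
                                          int clock_mode)
{
    assert(interrupt_hz != 0u);
    uint32_t timer_clock_hz = h1_hz / 2u;
    uint32_t divisor = clock_mode ? 2u * interrupt_hz : interrupt_hz;
    return timer_clock_hz / divisor;
}
```

For example, with f(H1) = 16 MHz and a target of 1000 interrupts per second, the period value is 8000 in pulse mode and 4000 in clock mode.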

The timer counter is automatically reset to 0 whenever it is equal to the value
in the timer period register. You can use the timer interrupt for either the CPU
or the DMA. Interrupt enable control for each timer, for either the CPU or the
DMA, is found in the CPU/DMA interrupt enable register. Refer to subsection
3.1.8 on page 3-7 for more information on the CPU/DMA interrupt enable
register.

When a timer interrupt occurs, a change in the state of the corresponding
TCLK pin will be observed if FUNC = 1 and CLKSRC = 1 in the timer
global-control register. The exact change in the state depends on the state of the
C/P bit.


8.1.6 Timer Initialization/Reconfiguration


The timers are controlled through memory-mapped registers located on the
dedicated peripheral bus. Following is the general procedure for initializing
and/or reconfiguring the timers:

1) Halt the timer by clearing the GO/HLD bits of the timer global-control regis-
ter. To do this, write a 0 to the timer global-control register. Note that the
timers are halted on RESET.

2) Configure the timer via the timer global-control register (with GO = HLD
= 0 ), the timer counter register, and timer period register, if necessary.

3) Start the timer by setting the GO/HLD bits of the timer global-control
register.
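
The three steps above can be sketched in C as follows. The function and macro names are hypothetical. The register block is passed as a pointer so that the routine can be exercised against an ordinary array on a host; on the device, the pointer would refer to the timer's global-control register (808020h for timer 0, per Figure 8–2).

```c
#include <stdint.h>

/* Bit masks from Table 8-1 and word offsets from Figure 8-2. */
#define TIMER_GO      (1u << 6)
#define TIMER_HLD     (1u << 7)
#define TIMER_CP      (1u << 8)   /* 1 = clock mode, 0 = pulse mode */
#define TIMER_CLKSRC  (1u << 9)   /* 1 = internal clock             */
#define TIMER_CTRL    0x0
#define TIMER_COUNT   0x4
#define TIMER_PERIOD  0x8

/* regs points at the timer's global-control register; mode holds the
 * configuration bits other than GO and HLD. */
static void timer_init(volatile uint32_t *regs, uint32_t mode,
                       uint32_t period)
{
    mode &= ~(TIMER_GO | TIMER_HLD);
    regs[TIMER_CTRL]   = 0u;                          /* 1) halt      */
    regs[TIMER_CTRL]   = mode;                        /* 2) configure */
    regs[TIMER_COUNT]  = 0u;
    regs[TIMER_PERIOD] = period;
    regs[TIMER_CTRL]   = mode | TIMER_GO | TIMER_HLD; /* 3) start     */
}
```

Writing GO = 1 and HLD = 1 in the final step corresponds to the "timer resets and starts" row of Table 8–2.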


8.2 Serial Ports


The TMS320C30 has two totally independent bidirectional serial ports. Both
serial ports are identical, and there is a complementary set of control registers
in each one. Only one serial port is available on the TMS320C31. You can con-
figure each serial port to transfer 8, 16, 24, or 32 bits of data per word simulta-
neously in both directions. The clock for each serial port can originate either
internally, via the serial port timer and period registers, or externally, via a
supplied clock. An internally generated clock is a divide-down of the clockout
frequency, f(H1). A continuous transfer mode is available, which allows the se-
rial port to transmit and receive any number of words without new synchroniza-
tion pulses.

Eight memory-mapped registers are provided for each serial port:


- Global-control register
- Two control registers for the six serial I/O pins
- Three receive/transmit timer registers
- Data-transmit register
- Data-receive register

The global-control register controls the global functions of the serial port and
determines the serial-port operating mode. Two port control registers control
the functions of the six serial port pins. The transmit buffer contains the next
complete word to be transmitted. The receive buffer contains the last complete
word received. Three additional registers are associated with the transmit/re-
ceive sections of the serial-port timer. A serial-port block diagram is shown in
Figure 8–8 on page 8-14, and the memory map of the serial ports is shown in
Figure 8–9 on page 8-15.


Figure 8–8. Serial-Port Block Diagram

[Block diagram with two halves. Receive section: CLKR, FSR, and DR pins, receive
timer (16) with TSTAT, receive clock, bit counter (8/16/24/32), RSR (32),
DRR (32), load control, and RINT. Transmit section: CLKX, FSX, and DX pins,
transmit timer (16) with TSTAT, bit counter (8/16/24/32), XSR (32), DXR (32),
load control, and XINT.]


Figure 8–9. Memory-Mapped Locations for the Serial Ports


Register                                      Peripheral Address
                                              Serial Port 0   Serial Port 1†
Serial-Port Global Control (See Figure 8–10) 808040h 808050h
Reserved 808041h 808051h
FSX/DX/CLKX Port Control (See Figure 8–11) 808042h 808052h
FSR/DR/CLKR Port Control (See Figure 8–12) 808043h 808053h
R/X Timer Control (See Figure 8–13) 808044h 808054h
R/X Timer Counter (See Figure 8–14) 808045h 808055h
R/X Timer Period (See Figure 8–15) 808046h 808056h
Reserved 808047h 808057h
Data Transmit (See Figure 8–16) 808048h 808058h
Reserved 808049h 808059h
Reserved 80804Ah 80805Ah
Reserved 80804Bh 80805Bh
Data Receive (See Figure 8–17) 80804Ch 80805Ch
Reserved 80804Dh 80805Dh
Reserved 80804Eh 80805Eh
Reserved 80804Fh 80805Fh

† Reserved locations on the TMS320C31
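
The memory map in Figure 8–9 can be captured as a set of word offsets from each port's base address. The following C fragment is a minimal sketch of that idea; the enum, macro, and function names are illustrative assumptions and are not part of any TI header file.

```c
#include <stdint.h>

/* Word offsets of the serial-port registers from the port base,
   transcribed from Figure 8-9. All names here are hypothetical. */
enum sp_reg {
    SP_GCTRL   = 0x0,  /* serial-port global control */
    SP_XPCTRL  = 0x2,  /* FSX/DX/CLKX port control   */
    SP_RPCTRL  = 0x3,  /* FSR/DR/CLKR port control   */
    SP_TCTRL   = 0x4,  /* R/X timer control          */
    SP_TCOUNT  = 0x5,  /* R/X timer counter          */
    SP_TPERIOD = 0x6,  /* R/X timer period           */
    SP_DXR     = 0x8,  /* data transmit              */
    SP_DRR     = 0xC   /* data receive               */
};

#define SP0_BASE   0x808040u  /* serial port 0                              */
#define SP_STRIDE  0x10u      /* serial port 1 starts at 808050h (C30 only) */

/* Word address of a register on serial port 0 or 1. */
uint32_t sp_reg_addr(unsigned port, enum sp_reg reg)
{
    return SP0_BASE + port * SP_STRIDE + (uint32_t)reg;
}
```

For example, `sp_reg_addr(1, SP_DRR)` evaluates to 80805Ch, which matches the Data Receive entry for serial port 1. On the TMS320C31, the port 1 addresses are reserved and should not be used.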

8.2.1 Serial-Port Global-Control Register


The serial-port global-control register is a 32-bit register that contains the
global control bits for the serial port. Table 8–3 defines the register bits,
bit names, and bit functions. The register is shown in Figure 8–10.

Table 8–3. Serial-Port Global-Control Register Bits Summary


Bit Name Reset Value Function
0 RRDY 0 If RRDY = 1, the receive buffer has new data and is ready to be read. A
three H1/H3 cycle delay occurs from the loading of DRR to RRDY = 1. The
rising edge of this signal sets RINT. If RRDY = 0, the receive buffer
does not have new data since the last read. RRDY = 0 at reset and after
the receive buffer is read.

1 XRDY 1 If XRDY = 1, the transmit buffer has written the last bit of data to the shifter
and is ready for a new word. A three H1/H3 cycle delay occurs from the
loading of the transmit shifter until XRDY is set to 1. The rising edge of this
signal sets XINT. If XRDY = 0, the transmit buffer has not written the last
bit of data to the transmit shifter and is not ready for a new word. XRDY =
1 at reset.

2 FSXOUT 0 This bit configures the FSX pin as an input (FSXOUT = 0) or an output
(FSXOUT = 1).


3 XSREMPTY 0 If XSREMPTY = 0, the transmit shift register is empty. If XSREMPTY = 1,
the transmit shift register is not empty. Reset or XRESET sets this bit
to 0.

4 RSRFULL 0 If RSRFULL = 1, an overrun of the receiver has occurred. In continuous
mode, RSRFULL is set to 1 when both RSR and DRR are full. In
noncontinuous mode, RSRFULL is set to 1 when RSR and DRR are full and a
new FSR is received. A read causes this bit to be set to 0. This bit can be
set to 0 only by a system reset, a serial-port receive reset (RRESET = 0),
or a read. When the receiver tries to set RSRFULL to 1 at the same time
that the global register is read, the receiver will dominate, and RSRFULL
is set to 1. If RSRFULL = 0, no overrun of the receiver has occurred.

5 HS 0 If HS = 1, the handshake mode is enabled. If HS = 0, the handshake mode
is disabled.

6 XCLKSRCE 0 If XCLKSRCE = 1, the internal transmit clock is used. If XCLKSRCE = 0,
the external transmit clock is used.

7 RCLKSRCE 0 If RCLKSRCE = 1, the internal receive clock is used. If RCLKSRCE = 0,
the external receive clock is used.

8 XVAREN 0 This bit specifies fixed (XVAREN = 0) or variable (XVAREN = 1) data rate
signaling when transmitting. With a fixed data rate, FSX is active for at least
one XCLK cycle and then goes inactive before transmission begins. With
variable data rate, FSX is active while all bits are being transmitted. When
you use an external FSX and variable data rate signaling, the DX pin is
driven by the transmitter when FSX is held active or when a word is being
shifted out.

9 RVAREN 0 This bit specifies fixed (RVAREN = 0) or variable (RVAREN = 1) data rate
signaling when receiving. With a fixed data rate, FSR is active for at least
one RCLK cycle and then goes inactive before the reception begins. With
variable data rate, FSR is active while all bits are being received.

10 XFSM 0 Transmit frame sync mode. Configures the port for continuous mode
(XFSM = 1) or standard mode (XFSM = 0) operation. In continuous mode, only
the first word of a block generates a sync pulse, and the rest are simply
transmitted continuously to the end of the block. In standard mode, each
word has an associated sync pulse.

11 RFSM 0 Receive frame sync mode. Configures the port for continuous mode
(RFSM = 1) or standard mode (RFSM = 0) operation. In continuous mode,
only the first word of a block generates a sync pulse, and the rest are simply
received continuously without expectation of another sync pulse. In
standard mode, each word received has an associated sync pulse.

12 CLKXP 0 CLKX polarity. If CLKXP = 0, CLKX is active high. If CLKXP = 1, CLKX is
active low.



13 CLKRP 0 CLKR polarity. If CLKRP = 0, CLKR is active (high). If CLKRP =1, CLKR
is active (low).

14 DXP 0 DX polarity. If DXP = 0, DX is active (high). If DXP = 1, DX is active (low).

15 DRP 0 DR polarity. If DRP = 0, DR is active (high). If DRP = 1, DR is active (low).

16 FSXP 0 FSX polarity. If FSXP = 0, FSX is active (high). If FSXP = 1, FSX is
active (low).

17 FSRP 0 FSR polarity. If FSRP = 0, FSR is active (high). If FSRP = 1, FSR is
active (low).

19–18 XLEN 00 These two bits define the word length of serial data transmitted. All data
is assumed to be right-justified in the transmit buffer when fewer than 32
bits are specified.
00 = 8 bits     01 = 16 bits     10 = 24 bits     11 = 32 bits

21–20 RLEN 00 These two bits define the word length of serial data received. All data is
right-justified in the receive buffer.
00 = 8 bits     01 = 16 bits     10 = 24 bits     11 = 32 bits

22 XTINT 0 Transmit timer interrupt enable. If XTINT = 0, the transmit timer interrupt
is disabled. If XTINT = 1, the transmit timer interrupt is enabled.

23 XINT 0 Transmit interrupt enable. If XINT = 0, the transmit interrupt is disabled. If
XINT = 1, the transmit interrupt is enabled. Note that the CPU transmit flag
XINT and the serial-port-to-DMA interrupt (EXINT0 in the IE register) are the
OR of the enabled transmit timer interrupt and the enabled transmit
interrupt.

24 RTINT 0 Receive timer interrupt enable. If RTINT = 0, the receive timer interrupt is
disabled. If RTINT = 1, the receive timer interrupt is enabled.

25 RINT 0 Receive interrupt enable. If RINT = 0, the receive interrupt is disabled. If
RINT = 1, the receive interrupt is enabled. Note that the CPU receive flag
RINT and the serial-port-to-DMA interrupt (ERINT0 in the IE register) are the
OR of the enabled receive timer interrupt and the enabled receive
interrupt.

26 XRESET 0 Transmit reset. If XRESET = 0, the transmit side of the serial port is reset.
To take the transmit side of the serial port out of reset, set XRESET to 1.
However, do not set XRESET to 1 until at least three cycles after RESET
goes inactive. This applies only to system reset. Setting XRESET to 0 does
not change the contents of any of the serial-port control registers. It places
the transmitter in a state corresponding to the beginning of a frame of data.
Resetting the transmitter generates a transmit interrupt. Reset this bit
during the time the mode of the transmitter is set. You can toggle XFSM
without resetting the global-control register.

27 RRESET 0 Receive reset. If RRESET = 0, the receive side of the serial port is reset.
To take the receive side of the serial port out of reset, set RRESET to 1.
Setting RRESET to 0 does not change the contents of any of the serial-
port control registers. It places the receiver in a state corresponding to the
beginning of a frame of data. Reset this bit at the same time that the mode
of the receiver is set. RFSM can be toggled without resetting the global-
control register.

31–28 Reserved 0–0 Read as 0.

Figure 8–10. Serial-Port Global-Control Register


31–28  27      26      25    24     23    22     21–20  19–18  17    16
xx     RRESET  XRESET  RINT  RTINT  XINT  XTINT  RLEN   XLEN   FSRP  FSXP
       R/W     R/W     R/W   R/W    R/W   R/W    R/W    R/W    R/W   R/W

15   14   13     12     11    10    9       8       7         6         5    4        3         2       1     0
DRP  DXP  CLKRP  CLKXP  RFSM  XFSM  RVAREN  XVAREN  RCLKSRCE  XCLKSRCE  HS   RSRFULL  XSREMPTY  FSXOUT  XRDY  RRDY
R/W  R/W  R/W    R/W    R/W   R/W   R/W     R/W     R/W       R/W       R/W  R        R         R/W     R     R

R = Read, W = Write, xx = reserved bit, read as 0
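
The bit assignments in Table 8–3 translate directly into masks. The sketch below shows one way to compose a global-control value from mode flags and word lengths. The macro and function names are assumptions made for illustration, and the particular mode chosen in the usage note is only an example, not a recommended configuration.

```c
#include <stdint.h>

/* Bit positions from Table 8-3; the macro names are illustrative. */
#define SP_FSXOUT     (1u << 2)
#define SP_HS         (1u << 5)
#define SP_XCLKSRCE   (1u << 6)
#define SP_RCLKSRCE   (1u << 7)
#define SP_XVAREN     (1u << 8)
#define SP_RVAREN     (1u << 9)
#define SP_XFSM       (1u << 10)
#define SP_RFSM       (1u << 11)
#define SP_XINT       (1u << 23)
#define SP_RINT       (1u << 25)
#define SP_XRESET     (1u << 26)
#define SP_RRESET     (1u << 27)
#define SP_XLEN_SHIFT 18
#define SP_RLEN_SHIFT 20

/* Map a word length in bits (8, 16, 24, 32) to the 2-bit XLEN/RLEN code.
   Returns -1 for an unsupported length. */
int sp_len_code(unsigned bits)
{
    if (bits < 8 || bits > 32 || bits % 8 != 0)
        return -1;
    return (int)(bits / 8) - 1;
}

/* Compose a global-control word from mode flags and word lengths.
   Returns 0xFFFFFFFF on error; this sentinel is a convention of this
   sketch (reserved bits 31-28 are never set in a valid value). */
uint32_t sp_gctrl_build(uint32_t flags, unsigned xbits, unsigned rbits)
{
    int x = sp_len_code(xbits);
    int r = sp_len_code(rbits);
    if (x < 0 || r < 0)
        return 0xFFFFFFFFu;
    return flags | ((uint32_t)x << SP_XLEN_SHIFT)
                 | ((uint32_t)r << SP_RLEN_SHIFT);
}
```

For instance, `sp_gctrl_build(SP_FSXOUT | SP_XCLKSRCE | SP_XINT | SP_RINT, 16, 16)` yields 02940044h: FSX as an output, internal transmit clock, both interrupts enabled, and 16-bit words in each direction, with XRESET and RRESET still 0 so that the port remains in reset.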

8.2.2 FSX/DX/CLKX Port-Control Register


This 32-bit port control register controls the function of the serial port FSX, DX,
and CLKX pins. At reset, all bits are set to 0. Table 8–4 defines the register bits,
bit names, and functions. Figure 8–11 shows this port control register.


Table 8–4. FSX/DX/CLKX Port-Control Register Bits Summary

Bit Name Reset Value Function


0 CLKXFUNC 0 CLKXFUNC controls the function of CLKX. If CLKXFUNC = 0,
CLKX is configured as a general-purpose digital I/O port. If
CLKXFUNC = 1, CLKX is a serial port pin.

1 CLKX I/O 0 If CLKX I/O = 0, CLKX is configured as a general-purpose input
pin. If CLKX I/O = 1, CLKX is configured as a general-purpose
output pin.

2 CLKXDATOUT 0 Data output on CLKX.

3 CLKXDATIN x† Data input on CLKX. A write has no effect.

4 DXFUNC 0 DXFUNC controls the function of DX. If DXFUNC = 0, DX is
configured as a general-purpose digital I/O port. If DXFUNC = 1, DX is
a serial port pin.

5 DX I/O 0 If DX I/O = 0, DX is configured as a general-purpose input pin. If
DX I/O = 1, DX is configured as a general-purpose output pin.

6 DXDATOUT 0 Data output on DX.

7 DXDATIN x† Data input on DX. A write has no effect.

8 FSXFUNC 0 FSXFUNC controls the function of FSX. If FSXFUNC = 0, FSX is
configured as a general-purpose digital I/O port. If FSXFUNC = 1,
FSX is a serial port pin.

9 FSX I/O 0 If FSX I/O = 0, FSX is configured as a general-purpose input pin.
If FSX I/O = 1, FSX is configured as a general-purpose output pin.

10 FSXDATOUT 0 Data output on FSX.

11 FSXDATIN x† Data input on FSX. A write has no effect.

31–12 Reserved 0–0 Read as 0.


† x = 0 or 1

Figure 8–11. FSX/DX/CLKX Port-Control Register


31–12  11        10          9        8        7        6         5       4       3          2           1         0
xx     FSXDATIN  FSXDATOUT   FSX I/O  FSXFUNC  DXDATIN  DXDATOUT  DX I/O  DXFUNC  CLKXDATIN  CLKXDATOUT  CLKX I/O  CLKXFUNC
       R         R/W         R/W      R/W      R        R/W       R/W     R/W     R          R/W         R/W       R/W

R = Read, W = Write, xx = reserved bit, read as 0
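
Because each pin in Table 8–4 owns a 4-bit field with the same layout (FUNC, I/O, DATOUT, DATIN), the port-control value can be built with one set of masks shifted to the pin's field. The following is a minimal sketch; the type, macro, and function names are hypothetical.

```c
#include <stdint.h>

/* Least significant bit of each pin's 4-bit field (Table 8-4). */
enum sp_pin { SP_PIN_CLKX = 0, SP_PIN_DX = 4, SP_PIN_FSX = 8 };

#define PIN_FUNC   0x1u  /* 1 = serial-port pin, 0 = general-purpose I/O */
#define PIN_IO     0x2u  /* 1 = output, 0 = input (I/O mode only)        */
#define PIN_DATOUT 0x4u  /* level driven when configured as an output    */
#define PIN_DATIN  0x8u  /* level read from the pin; writes are ignored  */

/* Port-control value that assigns all three pins to the serial port. */
uint32_t sp_pctrl_all_serial(void)
{
    return (PIN_FUNC << SP_PIN_CLKX) | (PIN_FUNC << SP_PIN_DX) |
           (PIN_FUNC << SP_PIN_FSX);
}

/* Return ctrl modified so that pin is a general-purpose output
   driving the given level (0 or nonzero). */
uint32_t sp_pin_drive(uint32_t ctrl, enum sp_pin pin, int level)
{
    ctrl &= ~(0xFu << pin);                              /* clear the field */
    ctrl |= (PIN_IO | (level ? PIN_DATOUT : 0u)) << pin;
    return ctrl;
}

/* Extract the DATIN bit for pin from a value read from the register. */
int sp_pin_read(uint32_t ctrl, enum sp_pin pin)
{
    return ((ctrl >> pin) & PIN_DATIN) != 0;
}
```

`sp_pctrl_all_serial()` returns 111h. Starting from that value, `sp_pin_drive(0x111, SP_PIN_DX, 1)` returns 161h, which leaves CLKX and FSX as serial-port pins and turns DX into a general-purpose output driving 1.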


8.2.3 FSR/DR/CLKR Port-Control Register

This 32-bit port control register controls the function of the serial port
FSR, DR, and CLKR pins. At reset, all bits are set to 0. Table 8–5 defines the
register bits, the bit names, and functions. Figure 8–12 illustrates this port
control register.

Table 8–5. FSR/DR/CLKR Port-Control Register Bits Summary


Bit Name Reset Value Function
0 CLKRFUNC 0 CLKRFUNC controls the function of CLKR. If CLKRFUNC = 0,
CLKR is configured as a general-purpose digital I/O port. If
CLKRFUNC = 1, CLKR is a serial port pin.
1 CLKR I/O 0 If CLKR I/O = 0, CLKR is configured as a general-purpose input pin.
If CLKR I/O = 1, CLKR is configured as a general-purpose output pin.
2 CLKRDATOUT 0 Data output on CLKR.
3 CLKRDATIN x† Data input on CLKR. A write has no effect.
4 DRFUNC 0 DRFUNC controls the function of DR. If DRFUNC = 0, DR is
configured as a general-purpose digital I/O port. If DRFUNC = 1, DR
is a serial port pin.
5 DR I/O 0 If DR I/O = 0, DR is configured as a general-purpose input pin.
If DR I/O = 1, DR is configured as a general-purpose output pin.
6 DRDATOUT 0 Data output on DR.
7 DRDATIN x† Data input on DR. A write has no effect.
8 FSRFUNC 0 FSRFUNC controls the function of FSR. If FSRFUNC = 0, FSR is
configured as a general-purpose digital I/O port. If
FSRFUNC = 1, FSR is a serial port pin.
9 FSR I/O 0 If FSR I/O = 0, FSR is configured as a general-purpose input pin. If
FSR I/O = 1, FSR is configured as a general-purpose output pin.
10 FSRDATOUT 0 Data output on FSR.
11 FSRDATIN x† Data input on FSR. A write has no effect.

31–12 Reserved 0–0 Read as 0.


† x = 0 or 1

Figure 8–12. FSR/DR/CLKR Port-Control Register


31–12  11        10          9        8        7        6         5       4       3          2           1         0
xx     FSRDATIN  FSRDATOUT   FSR I/O  FSRFUNC  DRDATIN  DRDATOUT  DR I/O  DRFUNC  CLKRDATIN  CLKRDATOUT  CLKR I/O  CLKRFUNC
       R         R/W         R/W      R/W      R        R/W       R/W     R/W     R          R/W         R/W       R/W

R = Read, W = Write, xx = reserved bit, read as 0


8.2.4 Receive/Transmit Timer-Control Register


A 32-bit receive/transmit timer-control register contains the control bits for
the timer module. At reset, all bits are set to 0. Table 8–6 lists the register
bits, bit names, and functions. Bits 5–0 control the transmitter timer. Bits
11–6 control the receiver timer. Figure 8–13 shows the register. The serial-port
receive/transmit timer function is similar to timer module operation. It can be
considered a 16-bit-wide timer. Refer to Section 8.1 on page 8-2 for more
information on timers.

Table 8–6. Receive/Transmit Timer-Control Register


Bit Name Reset Value Function
0 XGO 0 The XGO bit resets and starts the transmit timer counter. When XGO
is set to 1 and the timer is not held, the counter is zeroed and begins
incrementing on the next rising edge of the timer input clock. The XGO
bit is cleared on the same rising edge. Writing 0 to XGO has no effect
on the transmit timer.
1 XHLD 0 Transmit counter hold signal. When this bit is set to 0, the counter is dis-
abled and held in its current state. The internal divide-by-two counter
is also held so that the counter will continue where it left off when XHLD
is set to 1. You can read and modify the timer registers while the timer
is being held. RESET has priority over XHLD.
2 XC/P 0 XClock/Pulse mode control. When XC/P = 1, the clock mode is chosen.
The signaling of the status flag and external output has a 50 percent
duty cycle. When XC/P = 0, the status flag and external output are ac-
tive for one CLKOUT cycle during each timer period.
3 XCLKSRC 0 This bit specifies the source of the transmit timer clock. When
XCLKSRC = 1, an internal clock with frequency equal to one-half the
CLKOUT frequency is used to increment the counter. When XCLKSRC
= 0, you can use an external signal from the CLKX pin to increment the
counter. The external clock source is synchronized internally, thus al-
lowing for external asynchronous clock sources that do not exceed the
specified maximum allowable external clock frequency, that is, less
than f(H1)/2.6.
4 Reserved 0 Read as zero.
5 XTSTAT 0 This bit indicates the status of the transmit timer. It tracks what would
be the output of the uninverted CLKX pin. This flag sets a CPU interrupt
on a transition from 0 to 1. A write has no effect.
6 RGO 0 The RGO bit resets and starts the receive timer counter. When RGO
is set to 1 and the timer is not held, the counter is zeroed and begins
incrementing on the next rising edge of the timer input clock. The RGO
bit is cleared on the same rising edge. Writing 0 to RGO has no effect
on the receive timer.
7 RHLD 0 Receive counter hold signal. When this bit is set to 0, the counter is dis-
abled and held in its current state. The internal divide-by-two counter
is also held so that the counter will continue where it left off when RHLD
is set to 1. You can read and modify the timer registers while the timer
is being held. RESET has priority over RHLD.

8 RC/P 0 RClock/Pulse mode control. When RC/P = 1, the clock mode is cho-
sen. The signaling of the status flag and external output has a 50 per-
cent duty cycle. When RC/P = 0, the status flag and external output
are active for one CLKOUT cycle during each timer period.
9 RCLKSRC 0 This bit specifies the source of the receive timer clock. When
RCLKSRC = 1, an internal clock with frequency equal to one-half the
CLKOUT frequency is used to increment the counter. When
RCLKSRC = 0, you can use an external signal from the CLKR pin to
increment the counter. The external clock source is synchronized in-
ternally, thus allowing for external asynchronous clock sources that
do not exceed the specified maximum allowable external clock fre-
quency, that is, less than f(H1)/2.6.
10 Reserved 0 Read as zero.
11 RTSTAT 0 This bit indicates the status of the receive timer. It tracks what would
be the output of the uninverted CLKR pin. This flag sets a CPU inter-
rupt on a transition from 0 to 1. A write has no effect.
31–12 Reserved 0–0 Read as 0.

Figure 8–13. Receive/Transmit Timer-Control Register


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
xx xx xx xx RTSTAT xx RCLKSRC RC/P RHLD RGO XTSTAT xx XCLKSRC XC/P XHLD XGO
R R/W R/W R R/W R/W R R/W R/W R/W

R = Read, W = Write, xx = reserved bit, read as 0
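
Since the transmit and receive timers use the same control-bit layout in bits 5–0 and bits 11–6, a single set of masks can serve both fields. The sketch below assumes hypothetical macro and function names and simply packs the two fields into one register value.

```c
#include <stdint.h>

/* Per-timer control bits from Table 8-6, relative to the start of each
   6-bit field. Names are illustrative. */
#define T_GO      (1u << 0)  /* reset and start the counter          */
#define T_HLD     (1u << 1)  /* 0 = hold the counter, 1 = let it run */
#define T_CP      (1u << 2)  /* 1 = clock mode, 0 = pulse mode       */
#define T_CLKSRC  (1u << 3)  /* 1 = internal clock, 0 = external pin */
#define T_TSTAT   (1u << 5)  /* timer status; a write has no effect  */

#define XTIMER_SHIFT 0       /* bits 5-0 control the transmit timer */
#define RTIMER_SHIFT 6       /* bits 11-6 control the receive timer */

/* Combine transmit and receive timer settings into one register value.
   Bits that cannot be written (TSTAT, reserved) are masked off. */
uint32_t sp_tctrl_build(uint32_t xbits, uint32_t rbits)
{
    const uint32_t writable = T_GO | T_HLD | T_CP | T_CLKSRC;
    return ((xbits & writable) << XTIMER_SHIFT) |
           ((rbits & writable) << RTIMER_SHIFT);
}
```

Starting both timers in clock mode from the internal clock, that is, passing `T_GO | T_HLD | T_CP | T_CLKSRC` for each argument, gives 3CFh.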

8.2.5 Receive/Transmit Timer-Counter Register


The receive/transmit timer counter register is a 32-bit register (see
Figure 8–14). Bits 15–0 are the transmit timer counter, and bits 31–16 are the
receive timer counter. Each counter is cleared to 0 whenever it increments to
the value of the period register (see Section 8.2.6). It is also set to 0 at reset.

Figure 8–14. Receive/Transmit Timer Counter Register


31 16
Receive Counter

15 0
Transmit Counter

NOTE: All bits are read/write.


8.2.6 Receive/Transmit Timer-Period Register


The receive/transmit timer period register is a 32-bit register (see
Figure 8–15). Bits 15–0 are the transmit timer period, and bits 31–16 are the
receive timer period. Each field specifies the period of its timer. Both fields
are cleared to 0 at reset.

Figure 8–15. Receive/Transmit Timer-Period Register


31 16
Receive Period

15 0
Transmit Period

Note: All bits are read/write.
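
The counter and period registers each pack two 16-bit fields into one 32-bit word. The following sketch shows the packing and a one-step model of the counter behavior described in Section 8.2.5 (the counter clears when it increments to the period value). The function names are assumptions, and the model ignores the hold and clock-source controls.

```c
#include <stdint.h>

/* Pack the receive (bits 31-16) and transmit (bits 15-0) fields. */
uint32_t sp_period_pack(uint16_t rx_period, uint16_t tx_period)
{
    return ((uint32_t)rx_period << 16) | tx_period;
}

uint16_t sp_tx_field(uint32_t reg) { return (uint16_t)(reg & 0xFFFFu); }
uint16_t sp_rx_field(uint32_t reg) { return (uint16_t)(reg >> 16); }

/* One increment of a 16-bit timer counter: the counter is cleared to 0
   when it increments to the value of the period field. */
uint16_t sp_counter_tick(uint16_t counter, uint16_t period)
{
    uint16_t next = (uint16_t)(counter + 1u);
    return (next == period) ? 0u : next;
}
```

With a period of 4, successive calls to `sp_counter_tick` starting from 0 produce 1, 2, 3, 0, so the counter completes one period every four input clocks.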

8.2.7 Data-Transmit Register


When the data-transmit register (DXR) is loaded, the transmitter loads the
word into the transmit shift register (XSR), and the bits are shifted out. The
delay from a write to DXR until an FSX occurs (or can be accepted) is two
CLKX cycles. The word is not loaded into the shift register until the shifter is
empty. When DXR is loaded into XSR, the XRDY bit is set, specifying that the
buffer is available to receive the next word. Four tap points within the transmit
shift register are used to transmit the word. These tap points correspond to the
four data word sizes and are illustrated in Figure 8–16. The shift is a left-shift
(LSB to MSB) with the data shifted out of the MSB corresponding to the
appropriate tap point.

Figure 8–16. Transmit Buffer Shift Operation


← Shift Direction ←

Bit 31: 32-bit word tap
Bit 23: 24-bit word tap
Bit 15: 16-bit word tap
Bit 7:  8-bit word tap
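
The tap-point behavior can be illustrated with a small host-side model. This is a sketch only: it assumes the tap for an N-bit word is bit N–1 of the shift register, which follows from the data being right-justified and shifted out MSB first, and the function name is hypothetical.

```c
#include <stdint.h>

/* Return the bit presented at the tap after n left shifts of the
   transmit shift register, for a word length of len bits
   (8, 16, 24, or 32) and 0 <= n < len. */
unsigned sp_tx_bit(uint32_t dxr, unsigned len, unsigned n)
{
    uint32_t xsr = dxr;              /* DXR-to-XSR load                 */
    while (n--)
        xsr <<= 1;                   /* one left shift per serial clock */
    return (xsr >> (len - 1)) & 1u;  /* bit currently at the tap        */
}
```

For an 8-bit transfer of A5h, the model produces the bit sequence 1, 0, 1, 0, 0, 1, 0, 1 for n = 0 through 7. Bits above the tap have no effect on the output, so a DXR value of FFFFFF00h sent as an 8-bit word produces eight 0 bits.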


8.2.8 Data-Receive Register


When serial data is input, the receiver shifts the bits into the receive shift
register (RSR). When the specified number of bits are shifted in, the
data-receive register (DRR) is loaded from RSR, and the RRDY status bit is set.
The receiver is double-buffered. If the DRR has not been read and the RSR is
full, the receiver is frozen. New data coming into the DR pin is ignored. The
receive shifter will not write over the DRR. The DRR must be read to allow new
data in the RSR to be transferred to the DRR. When a write to DRR occurs at the
same time that an RSR to DRR transfer takes place, the RSR to DRR transfer
has priority.

Data is shifted to the left (LSB to MSB). Figure 8–17 illustrates what happens
when words less than 32 bits are shifted into the serial port. In this figure, it is
assumed that an 8-bit word is being received and that the upper three bytes
of the receive buffer are originally undefined. In the first portion of the figure,
byte a has been shifted in. When byte b is shifted in, byte a is shifted to the left.
When the data receive register is read, both bytes a and b are read.

Figure 8–17. Receive Buffer Shift Operation


← Shift Direction ←

31 24 23 16 15 8 7 0
After Byte a X X X a

After Byte b X X a b
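
Because earlier words remain in the upper bits of the receive buffer, software that reads a word shorter than 32 bits normally masks the value. The following host-side sketch models the shift in Figure 8–17 and the masking step; the function names are hypothetical.

```c
#include <stdint.h>

/* Shift one received word of len bits into the receive register,
   moving earlier contents toward the MSB as in Figure 8-17. */
uint32_t sp_rx_shift_word(uint32_t rsr, uint32_t word, unsigned len)
{
    unsigned i;
    for (i = len; i-- > 0; )
        rsr = (rsr << 1) | ((word >> i) & 1u);  /* MSB arrives first */
    return rsr;
}

/* Keep only the most recent len-bit word from a value read from DRR. */
uint32_t sp_rx_latest(uint32_t drr, unsigned len)
{
    return (len >= 32) ? drr : (drr & ((1u << len) - 1u));
}
```

Shifting in byte AAh and then byte BBh, starting from 0, leaves AABBh in the register, which corresponds to the "After Byte b" row of the figure. `sp_rx_latest(0xAABB, 8)` then returns BBh.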

8.2.9 Serial-Port Operation Configurations


Several configurations are provided for the operation of the serial-port clocks
and timer. The clocks for each serial port can originate either internally or
externally. Figure 8–18 shows serial-port clocking in the I/O mode
(CLKXFUNC = 0) when CLKX is either an input or an output. Figure 8–19 shows
clocking in the serial-port mode (CLKXFUNC = 1). Both figures use a transmit
section as an example. The same relationship holds for a receive section.


Figure 8–18. Serial-Port Clocking in I/O Mode

[Four-panel diagram. Each panel shows, on the internal side, the timer (TSTAT
output and timer input clock), XSR, and the DATOUT and DATIN bits, and on the
external side, the CLKX pin. DATOUT is not connected (NC) in panels (c) and (d).]

(a) CLKXFUNC = 0 (I/O mode), CLKX I/O = 1 (CLKX, an output),
    XCLKSRC = 1 (internal CLK for timer)
(b) CLKXFUNC = 0 (I/O mode), CLKX I/O = 1 (CLKX, an output),
    XCLKSRC = 0 (external CLK for timer)
(c) CLKXFUNC = 0 (I/O mode), CLKX I/O = 0 (CLKX, an input),
    XCLKSRC = 1 (internal CLK for timer)
(d) CLKXFUNC = 0 (I/O mode), CLKX I/O = 0 (CLKX, an input),
    XCLKSRC = 0 (external CLK for timer)


Figure 8–19. Serial-Port Clocking in Serial-Port Mode

[Three-panel diagram. Each panel shows, on the internal side, the timer (TSTAT
output and timer clock), XSR, an inverter (INV), and the DATOUT and DATIN bits,
and on the external side, the CLKX pin. DATOUT is not connected (NC) in all
three panels.]

(a) CLKXFUNC = 1 (serial-port mode), XCLKSRCE = 1 (output serial-port CLK),
    XCLKSRC = 0 or 1
(b) CLKXFUNC = 1 (serial-port mode), XCLKSRCE = 0 (input serial-port CLK),
    XCLKSRC = 1 (internal CLK for timer)
(c) CLKXFUNC = 1 (serial-port mode), XCLKSRCE = 0 (input serial-port CLK),
    XCLKSRC = 0 (external CLK for timer)

8.2.10 Serial-Port Timing


The formula for calculating the frequency of the serial-port clock with an
internally generated clock depends on the operation mode of the serial-port
timers:

f (pulse mode) = f (timer clock) / period register

f (clock mode) = f (timer clock) / (2 × period register)

An internally generated clock source f (timer clock) has a maximum frequency
of f(H1)/2. An externally generated serial-port clock f (timer clock) (CLKX or
CLKR) has a maximum frequency of less than f(H1)/2.6. See serial port timing
in Table 13–27 on page 13-57. Also, see subsection 8.1.3 on page 8-8 for
information on timer pulse/clock generation.
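
The two formulas above are easy to evaluate in integer arithmetic. The following sketch computes the serial clock for a given period value and, conversely, the smallest period value that keeps the clock at or below a target rate. The names are assumptions, and the return value of 0 for a period of 0 is a guard in this sketch, not a statement about device behavior.

```c
#include <stdint.h>

enum sp_timer_mode { SP_PULSE_MODE, SP_CLOCK_MODE };

/* Serial clock in Hz for an internally generated clock.
   f_timer is the timer input clock in Hz (at most f(H1)/2 for the
   internal source). Returns 0 when period is 0. */
uint32_t sp_clock_hz(uint32_t f_timer, uint16_t period,
                     enum sp_timer_mode mode)
{
    uint32_t div = (mode == SP_CLOCK_MODE) ? 2u * period : period;
    return div ? f_timer / div : 0u;
}

/* Smallest period value whose clock does not exceed target_hz.
   Returns 0 if target_hz is 0 or the result does not fit in 16 bits. */
uint16_t sp_period_for(uint32_t f_timer, uint32_t target_hz,
                       enum sp_timer_mode mode)
{
    uint32_t k = (mode == SP_CLOCK_MODE) ? 2u : 1u;
    uint32_t p;
    if (target_hz == 0)
        return 0;
    p = (f_timer + k * target_hz - 1u) / (k * target_hz);  /* ceiling */
    if (p == 0)
        p = 1;
    return (p <= 0xFFFFu) ? (uint16_t)p : 0;
}
```

As a worked example, with a timer clock of 8 MHz and a period register value of 4, pulse mode gives 8 MHz / 4 = 2 MHz, and clock mode gives 8 MHz / (2 × 4) = 1 MHz.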


Transmit data is clocked out on the rising edge of the selected serial-port clock.
Receive data is latched into the receive shift register on the falling edge of the
serial-port clock. All data is transmitted and received MSB first. If fewer than
32 bits are transferred, the data is right-justified in the 32-bit transmit and
receive buffers. Therefore, the LSBs of the transmit buffer are the bits that
are transmitted.

The transmit ready (XRDY) signal specifies that the data-transmit register
(DXR) is available to be loaded with new data. XRDY goes active as soon as
the data is loaded into the transmit shift register (XSR). The last word may still
be shifting out when XRDY goes active. If DXR is loaded before the last word
has completed transmission, the data bits transmitted are consecutive; that is,
the LSB of the first word immediately precedes the MSB of the second, with
all signaling valid as in two separate transmits. XRDY goes inactive when DXR
is loaded and remains inactive until the data is loaded into the shifter.

The receive ready (RRDY) signal is active as long as a new word of data is
loaded into the data receive register and has not been read. As soon as the
data is read, the RRDY bit is turned off.
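
The XRDY and RRDY bits make polled transfers straightforward. The sketch below shows nonblocking write and read helpers that test the ready bit before touching the data register. It assumes `port` points at the serial-port register block (808040h for serial port 0 on the target); on a host, a 16-word array can stand in for the registers. The function and macro names are hypothetical.

```c
#include <stdint.h>

#define SP_RRDY (1u << 0)   /* receive buffer holds an unread word   */
#define SP_XRDY (1u << 1)   /* transmit buffer can accept a new word */

/* Word offsets from Figure 8-9. */
#define SP_GCTRL 0x0
#define SP_DXR   0x8
#define SP_DRR   0xC

/* Write word to DXR if XRDY is set. Returns 1 on success, 0 if busy. */
int sp_try_write(volatile uint32_t *port, uint32_t word)
{
    if (!(port[SP_GCTRL] & SP_XRDY))
        return 0;
    port[SP_DXR] = word;
    return 1;
}

/* Read DRR into *word if RRDY is set. Returns 1 on success, 0 if empty. */
int sp_try_read(volatile uint32_t *port, uint32_t *word)
{
    if (!(port[SP_GCTRL] & SP_RRDY))
        return 0;
    *word = port[SP_DRR];
    return 1;
}
```

A caller can loop on either helper until it returns 1, or call it once per pass through a main loop. On the device, the hardware clears XRDY when DXR is written and clears RRDY when DRR is read; an array used as a stand-in on a host does not do this.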

When FSX is specified as an output, the activity of the signal is determined
solely by the internal state of the serial port. If a fixed data rate is specified, FSX
goes active when DXR is loaded into XSR to be transmitted out. One
serial-clock cycle later, FSX turns inactive, and data transmission begins. If a
variable data rate is specified, the FSX pin is activated when the data
transmission begins and remains active during the entire transmission of the
word. Again, the data is transmitted one clock cycle after it is loaded into the
data transmit register.

An input FSX in the fixed data rate mode should go active for at least one serial
clock cycle and then inactive to initiate the data transfer. The transmitter then
sends the number of bits specified by the LEN bits. In the variable data-rate
mode, the transmitter begins sending from the time FSX goes active until the
number of specified bits has been shifted out. In the variable data-rate mode,
when the FSX status changes prior to all the data bits being shifted out, the
transmission completes, and the DX pin is placed in a high-impedance state.
An FSR input is exactly complementary to the FSX.

When using an external FSX, if DXR and XSR are empty, a write to DXR results
in a DXR-to-XSR transfer. This data is held in the XSR until an FSX occurs.
When the external FSX is received, the XSR begins shifting the data. If XSR
is waiting for the external FSX, a write to DXR will change DXR, but a DXR-to-
XSR transfer will not occur. XSR begins shifting when the external FSX is re-
ceived, or when it is reset using XRESET.


Continuous Transmit and Receive Modes

When continuous mode is chosen, consecutive writes do not generate or
expect new sync pulse signaling. Only the first word of a block begins with an
active synchronization. Thereafter, data continues to be transmitted as long as
new data is loaded into DXR before the last word has been transmitted. As
soon as XRDY is active and all of the data has been transmitted out of the
shift register, the DX pin is placed in a high-impedance state, and a subsequent
write to DXR initiates a new block and a new FSX.

Similarly with FSR, the receiver continues shifting in new data and loading
DRR. If the data-receive buffer is not read before the next word is shifted in,
you will lose subsequent incoming data. You can use the RFSM bit to terminate
the receive-continuous mode.

Handshake Mode

The handshake mode (HS = 1) allows for direct connection between proces-
sors. In this mode, all data words are transmitted with a leading 1 (see
Figure 8–20). For example, if an eight-bit word is to be transmitted, the first bit
sent is a 1, followed by the eight-bit data word.

In this mode, once the serial port transmits a word, it will not transmit another
word until it receives a separately transmitted zero bit. Therefore, the 1 bit that
precedes every data word is, in effect, a request bit.

Figure 8–20. Data Word Format in Handshake Mode


DX:  1 (leading 1), followed by the data word (8 bits)

After a serial port receives a word (with the leading 1) and that word has been
read from the DRR, the receiving serial port sends a single 0 to the transmitting
serial port. Thus, the single 0 bit acts as an acknowledge bit (see Figure 8–21).
This single acknowledge bit is sent every time the DRR is read, even if the DRR
does not contain new data.

Figure 8–21. Single Zero Sent as an Acknowledge Bit

DX:  0 (single 0)


When the serial port is placed in the handshake mode, the insertion and
deletion of a leading 1 for transmitted data, the sending of a 0 for
acknowledgement of received data, and the waiting for this acknowledge bit are
all performed automatically. Using this scheme, it is simple to connect
processors with no external hardware and to guarantee secure communication.
Figure 8–22 is a typical configuration.

In the handshake mode, FSX is automatically configured as an output.
Continuous mode is automatically disabled. After a system reset or XRESET, the
transmitter is always permitted to transmit. The transmitter and receiver must
be reset when entering the handshake mode.

Figure 8–22. Direct Connection Using Handshake Mode


TMS320C3x #1 TMS320C3x #2
CLKX CLKR
FSX FSR
DX DR

CLKR CLKX
FSR FSX
DR DX
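
The handshake protocol can be summarized as a small state model: a frame is a leading 1 followed by the data word, and the transmitter may send again only after it receives a single 0 bit. The following C fragment is a behavioral sketch for an 8-bit word, not a description of the hardware implementation. All names are hypothetical.

```c
#include <stdint.h>

/* Frame sent in handshake mode for an 8-bit word: a leading 1 followed
   by the data, 9 bits in total. */
uint32_t hs_frame(uint8_t data)
{
    return (1u << 8) | data;
}

/* Minimal transmitter model: may_send is 1 after reset and after each
   acknowledge, and 0 while an acknowledge is outstanding. */
struct hs_tx { int may_send; };

/* After a system reset or XRESET, the transmitter may transmit. */
void hs_tx_reset(struct hs_tx *t) { t->may_send = 1; }

/* Try to send a word. Returns 1 if the word went out, 0 if the
   transmitter is still waiting for the acknowledge bit. */
int hs_tx_send(struct hs_tx *t)
{
    if (!t->may_send)
        return 0;
    t->may_send = 0;
    return 1;
}

/* Called when a single 0 bit arrives (the remote DRR was read). */
void hs_tx_ack(struct hs_tx *t) { t->may_send = 1; }
```

In this model, two consecutive calls to `hs_tx_send` succeed only if `hs_tx_ack` is called between them, which mirrors the request/acknowledge pairing described above.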

8.2.11 Serial-Port Interrupt Sources


A serial port has the following interrupt sources:

- The transmit timer interrupt: The rising edge of XTSTAT causes a
single-cycle interrupt pulse to occur. When XTINT is 0, this interrupt pulse is
disabled.

- The receive timer interrupt: The rising edge of RTSTAT causes a
single-cycle interrupt pulse to occur. When RTINT is 0, this interrupt pulse is
disabled.

- The transmitter interrupt: Occurs immediately following a DXR-to-XSR
transfer. The transmitter interrupt is a single-cycle pulse. When the
serial-port global-control register bit XINT is 0, this interrupt pulse is
disabled.

- The receiver interrupt: Occurs immediately following an RSR-to-DRR
transfer. The receiver interrupt is a single-cycle pulse. When the
serial-port global-control register bit RINT is 0, this interrupt pulse is
disabled.
The transmit timer interrupt pulse is ORed with the transmitter interrupt pulse to create
the CPU transmit interrupt flag XINT. The receive timer interrupt pulse is ORed with the
receiver interrupt pulse to create the CPU receive interrupt flag RINT.
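
The way the four sources combine into the two CPU flags can be written as a pair of logic expressions. This is a behavioral sketch with hypothetical function names; each argument is a 0/1 value for a single cycle.

```c
/* CPU transmit interrupt flag: the enabled transmit timer pulse ORed
   with the enabled transmitter (DXR-to-XSR) pulse. */
int sp_cpu_xint(int xtint_en, int xtstat_rise, int xint_en, int dxr_to_xsr)
{
    return (xtint_en && xtstat_rise) || (xint_en && dxr_to_xsr);
}

/* CPU receive interrupt flag: the enabled receive timer pulse ORed
   with the enabled receiver (RSR-to-DRR) pulse. */
int sp_cpu_rint(int rtint_en, int rtstat_rise, int rint_en, int rsr_to_drr)
{
    return (rtint_en && rtstat_rise) || (rint_en && rsr_to_drr);
}
```

One consequence is that an interrupt service routine cannot tell from the CPU flag alone which source fired when both enables are set. If that distinction matters, enable only one source per flag or check status bits such as XRDY and RRDY in the routine.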


8.2.12 Serial-Port Functional Operation

The following paragraphs and figures illustrate the functional timing of the vari-
ous serial-port modes of operation. The timing descriptions are presented with
the assumption that all signal polarities are configured to be positive, that is,
CLKXP = CLKRP = DXP = DRP = FSXP = FSRP = 0. Logical timing, in situa-
tions where one or more of these polarities are inverted, is the same except
with respect to the opposite polarity reference points, that is, rising vs. falling
edges, etc.

These discussions pertain to the numerous operating modes and configurations
of the serial-port logic. When it is necessary to switch operating modes
or change configurations of the serial port, you should do so only when
XRESET or RRESET is asserted (low), as appropriate. Therefore, when
transmit configurations are modified, XRESET should be low, and when
receive configurations are modified, RRESET should be low. When you use
handshake mode, however, since the transmitter and receiver are interrelated,
you should make any configuration changes with XRESET and RRESET both
low.
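
The rule above suggests a two-step write sequence: write the new mode with the reset bits low, then release the resets. The following sketch resets both halves of the port, which is the conservative choice and the one required for handshake mode. It assumes `port` points at the serial-port register block; the names are hypothetical.

```c
#include <stdint.h>

#define SP_XRESET (1u << 26)
#define SP_RRESET (1u << 27)
#define SP_GCTRL  0x0   /* word offset of the global-control register */

/* Change the operating mode with both halves of the port held in
   reset, then release them. Any reset bits set in mode are ignored. */
void sp_reconfigure(volatile uint32_t *port, uint32_t mode)
{
    mode &= ~(SP_XRESET | SP_RRESET);
    port[SP_GCTRL] = mode;                          /* both sides in reset */
    port[SP_GCTRL] = mode | SP_XRESET | SP_RRESET;  /* release both sides  */
}
```

After the call, the global-control register holds the requested mode bits with XRESET = RRESET = 1. Note that resetting the transmitter generates a transmit interrupt, so an application that uses XINT should be prepared for it.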

All of the serial-port operating configurations can be broadly classified in two
categories: fixed data-rate timing and variable data-rate timing. The following
paragraphs discuss fixed and variable data-rate operation and all of their
variations.

Fixed Data-Rate Timing Operation

Fixed data-rate serial-port transfers can occur in two varieties: burst mode and
continuous mode. In burst mode, transfers of single words are separated by
periods of inactivity on the serial port. In continuous mode, there are no gaps
between successive word transfers; the first bit of a new word is transferred
on the next CLKX/R pulse following the last bit of the previous word. This oc-
curs continuously until the process is terminated.

In burst mode with fixed data-rate timing, FSX/FSR pulses initiate transfers,
and each transfer involves a single word. With an internally generated FSX
(see Figure 8–23), transmission is initiated by loading DXR. In this mode,
there is a delay of approximately 2.5 CLKX cycles (depending on CLKX and
H1 frequencies) from the time DXR is loaded until FSX occurs. With an exter-
nal FSX, the FSX pulse initiates the transfer, and the 2.5-cycle delay effectively
becomes a setup requirement for loading DXR with respect to FSX. Therefore,
in this case, you must load DXR no later than three CLKX cycles before FSX
occurs. Once the XSR is loaded from the DXR, an XINT is generated.


Figure 8–23. Fixed Burst Mode

[Timing diagram. Signals: CLKX/R, FSR/FSX (external), FSX (internal), DX/DR
(bits A1 through AN). Marked events: DXR loaded, XINT, RINT.]

In receive operations, once a transfer is initiated, FSR is ignored until the last
bit. For burst-mode transfers, FSR must be low during the last bit, or another
transfer will be initiated. After a full word has been received and transferred to
the DRR, an RINT is generated.

In fixed data-rate mode, you can perform continuous transfers even if R/XFSM
= 0, as long as properly timed frame synchronization is provided, or as long
as DXR is reloaded each cycle with an internally generated FSX (see
Figure 8–24).

Figure 8–24. Fixed Continuous Mode With Frame Sync

[Timing diagram. Signals: CLKX/R, FSX (internal), FSR/FSX (external), DR/DX
(bits A1 through AN, B1 through BN, C1). Marked events: DXR loaded, XINT,
RINT, load DXR, read DRR.]


For receive operations and with externally generated FSX, once transfers
have begun, frame sync pulses are required only during the last bit transferred
to initiate another contiguous transfer. Otherwise, frame sync inputs are ig-
nored. Therefore, continuous transfers will occur if frame sync is held high.
With an internally generated FSX, there is a delay of approximately 2.5 CLKX
cycles from the time DXR is loaded until FSX occurs. This delay occurs each
time DXR is loaded; therefore, during continuous transmission, the instruction
that loads DXR must be executed by the N–3 bit for an N-bit transmission.
Since delays due to pipelining may vary, you should incorporate a conserva-
tive margin of safety in allowing for this delay.

Once the process begins, an XINT and an RINT are generated at the begin-
ning of each transfer. The XINT indicates that the XSR has been loaded from
DXR and can be used to cause DXR to be reloaded. To maintain continuous
transmission in fixed rate mode with frame sync, especially with an internally
generated FSX, DXR must be reloaded early in the ongoing transfer.

The RINT indicates that a full word has been received and transferred into the
DRR. RINT is therefore commonly used to indicate an appropriate time to read
DRR.

Continuous transfers are terminated by discontinuing frame sync pulses or, in
the case of internally generated FSX, not reloading DXR.

You can accomplish continuous serial-port transfers without the use of frame
sync pulses if R/XFSM are set to 1. In this mode, operation of the serial port
is similar to continuous operation with frame sync, except that a frame sync
pulse is involved only in the first word transferred, and no further frame sync
pulses are used. Following the first word transferred (see Figure 8–25), no in-
ternal frame sync pulses are generated, and frame sync inputs are ignored.
Additionally, you should set R/XFSM prior to or during the first word
transferred; you must set R/XFSM no later than the transfer of the N–1 bit of
the first word, except for transmit operations. For transmit operations in
the fixed data-rate mode, XFSM must be set no later than the N–2 bit. To be
recognized in the current cycle, R/XFSM must be cleared no later than the
N–1 bit.


Figure 8–25. Fixed Continuous Mode Without Frame Sync

[Timing diagram. Signals: CLKX/R, FSR/FSX (external), FSX (internal), DR/DX
(bits A1 through AN, B1 through BN, C1). Marked events: DXR loaded, set
R/XFSM, XINT, RINT, load DXR, read DRR.]

The timing of RINT and XINT, and of data transfers to DXR and from DRR, is
the same as in fixed data-rate continuous mode with frame sync. This mode of
operation also exhibits the same delay of 2.5 CLKX cycles after DXR is loaded
before an internal FSX is generated. As in the case of continuous operation
in fixed data-rate mode with frame sync, you must reload DXR no later than
transmission of the N–3 bit.

When you use continuous operation in fixed data-rate mode, R/XFSM can be
set and cleared as desired, even during active transfers, to enable or disable
the use of frame sync pulses as dictated by system requirements. Under most
conditions, the effect of changing the state of R/XFSM occurs during the trans-
fer in which the R/XFSM change was made, provided the change was made
early enough in the transfer. For transmit operations with internal FSX in fixed
data-rate mode, however, a one-word delay occurs before frame sync pulse
generation resumes when clearing XFSM to 0 (see Figure 8–26). Therefore,
in this case, one additional word is transferred before the next FSX pulse is
generated. Also note that, as discussed previously, the clearing of XFSM is
recognized during the transmission of the word currently being transmitted as
long as XFSM is cleared no later than the N–1 bit. The setting of XFSM is rec-
ognized as long as XFSM is set no later than the N–2 bit.


Figure 8–26. Exiting Fixed Continuous Mode Without Frame Sync, FSX Internal

[Timing diagram spanning the 1st through 5th words. Signals: CLKX, FSX
(internal), DX (words A through F, each shown as bits 1 through N). Marked
events: load DXR, set XFSM, reset XFSM.]

Variable Data-Rate Timing Operation

Variable data-rate timing also supports operation in either burst or
continuous mode. Burst-mode operation with variable data-rate timing is
similar to burst-mode operation with fixed data-rate timing. With variable
data-rate timing (see Figure 8–27), however, FSX/R and data timing differ
slightly at the beginning and end of transfers. Specifically, there are three
major differences between fixed and variable data-rate timing:

- FSX/R pulses typically last for the entire transfer interval, although FSR
and external FSX are ignored after the first bit transferred. FSX/R pulses
in fixed data-rate mode typically last only one CLKX/R cycle but can last
longer.

- Data transfer begins during the CLKX/R cycle in which FSX/R occurs,
rather than the CLKX/R cycle following FSX/R, as is the case with fixed
data-rate timing.

- With variable data-rate timing, frame sync inputs are ignored until the end
of the last bit transferred, rather than the beginning of the last bit trans-
ferred, as is the case with fixed data-rate timing.


Figure 8–27. Variable Burst Mode

[Timing diagram. Signals: CLKX/R, FSR/FSX (external), FSX (internal), DX/DR
(bits A1 through AN). Marked events: DXR loaded, XINT, RINT.]

When you transmit continuously in variable data-rate mode with frame sync,
timing is the same as for fixed data-rate mode, except for the differences be-
tween these two modes as described under Variable Data-Rate Timing Opera-
tion. The only other exception is that you must reload DXR no later than the
N–4 bit to maintain continuous operation of the variable data-rate mode (see
Figure 8–28); you must reload DXR no later than the N–3 bit to maintain con-
tinuous operation of the fixed data-rate mode.
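
The two reload deadlines can be captured in a small helper. This Python
sketch is only an illustration; the function name is hypothetical, and it
assumes that "the N–3 bit" means bit number N–3 when the bits of a word are
counted from 1 to N in transmission order.

```python
def latest_dxr_reload_bit(word_bits, variable_rate):
    """Last bit (counted 1..N) by which DXR must be reloaded to keep
    continuous transmission going."""
    # N-4 for variable data-rate mode, N-3 for fixed data-rate mode.
    margin = 4 if variable_rate else 3
    if word_bits <= margin:
        raise ValueError("word is too short for this deadline rule")
    return word_bits - margin
```

Because pipeline delays may vary, a real program would reload DXR
earlier than the value this helper returns.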

Figure 8–28. Variable Continuous Mode With Frame Sync

[Timing diagram. Signals: CLKX/R, FSR/FSX (external), FSX (internal), DX/DR
(bits A1 through AN, B1 through BN, C1, C2). Marked events: DXR loaded, XINT,
RINT, load DXR, read DRR.]

Continuous operation in variable data-rate mode without frame sync (see
Figure 8–29) is also similar to continuous operation without frame sync in
fixed data-rate mode. As with variable data-rate mode continuous operation
with frame sync, you must reload DXR no later than the N–4 bit to maintain
continuous operation. Additionally, when R/XFSM is set or cleared in the
variable data-rate mode, you must make the modification no later than the
N–1 bit for the change to take effect in the current transfer.


Figure 8–29. Variable Continuous Mode Without Frame Sync

[Timing diagram. Signals: CLKX/R, FSR/FSX (external), FSX (internal), DX/DR
(bits A1 through AN, B1 through BN, C1, C2). Marked events: DXR loaded, set
R/XFSM, XINT, RINT, load DXR, read DRR.]

8.2.13 Serial-Port Initialization/Reconfiguration

The serial ports are controlled through memory-mapped registers on the dedi-
cated peripheral bus. Following is a general procedure for initializing and/or
reconfiguring the serial ports.

1) Halt the serial port by clearing the XRESET and/or RRESET bits of the ser-
ial-port global-control register. To do this, write a 0 to the serial-port global-
control register. Note that the serial ports are halted on RESET.

2) Configure the serial port via the serial-port global-control register (with
XRESET = RRESET = 0) and the FSX/DX/CLKX and FSR/DR/CLKR port-
control registers. If necessary, configure the receive/transmit registers:
timer control (with XHLD = RHLD = 0), timer counter, and timer period. Re-
fer to subsection 8.2.14 for more information.

3) Start the serial port operation by setting the XRESET and RRESET bits
of the serial-port global-control register and the XHLD and RHLD bits of
the serial-port receive/transmit timer-control register, if necessary.
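
The three steps above can be sketched as a sequence of register writes. The
Python model below is a simulation, not device code: the register names are
hypothetical dictionary keys, and the bit positions of XRESET, RRESET, XHLD,
and RHLD are assumptions taken from the register descriptions elsewhere in
this chapter and should be checked against them. Writing the timer-control
register before the global-control register in step 3 is also an assumption
(the procedure does not fix the order).

```python
XRESET = 1 << 26  # assumed bit position in the global-control register
RRESET = 1 << 27  # assumed bit position in the global-control register
XHLD = 1 << 1     # assumed bit position in the timer-control register
RHLD = 1 << 7     # assumed bit position in the timer-control register

def init_serial_port(regs, global_ctrl, tx_port_ctrl, rx_port_ctrl,
                     timer_ctrl=None, timer_period=0):
    """Apply the halt/configure/start sequence to a dict of registers and
    return the ordered list of (register, value) writes."""
    log = []

    def write(name, value):
        regs[name] = value & 0xFFFFFFFF
        log.append((name, regs[name]))

    # Step 1: halt the port by writing 0 to the global-control register.
    write("global_control", 0)
    # Step 2: configure with XRESET = RRESET = 0 and the timer held.
    write("global_control", global_ctrl & ~(XRESET | RRESET))
    write("tx_port_control", tx_port_ctrl)
    write("rx_port_control", rx_port_ctrl)
    if timer_ctrl is not None:
        write("timer_control", timer_ctrl & ~(XHLD | RHLD))
        write("timer_counter", 0)
        write("timer_period", timer_period)
        # Step 3a: release the timer hold bits.
        write("timer_control", timer_ctrl)
    # Step 3b: release the reset bits to start the port.
    write("global_control", global_ctrl)
    return log
```

Called with the values used in Example 8–3 (global control 048C0044h, port
control 111h, timer control 0Fh, period 2), the step 2 global-control write
is 008C0044h, which matches the reset word used in that example.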

8.2.14 TMS320C3x Serial-Port Interface Examples

In addition to the examples presented in this section, DMA/serial port
initialization examples can be found in Example 8–6 and Example 8–7 on pages
8-59 and 8-61, respectively.


8.2.14.1 Handshake Mode Example

When handshake mode is used, the transmit (FSX/DX/CLKX) and receive
(FSR/DR/CLKR) signals transmit and receive data, respectively. In other
words, even if the TMS320C3x serial port is receiving data only with
handshake mode, the transmit signals are still needed to transmit the
acknowledge signal. This is the serial port register setup for the
TMS320C3x serial port handshake communication, as shown in Figure 8–22 on
page 8-29:

Global control        = 011x0x0xxxx00000000xx01100100b
Transmit port control = 0111h
Receive port control  = 0111h
S_port timer control  = 0Fh
S_port timer count    = 0h
S_port timer period   ≥ 01h (if two C3xs have the same system clock)

x = user-configurable

Since the FSX is set as an output and continuous mode is disabled when hand-
shake mode is selected, you should set the XFSM and RFSM bits to 0 and the
FSXOUT bit to 1 in the global control register. You should set the XRESET,
RRESET, and HS bits to 1 in order to start the handshake communication. You
should set the polarity of the serial port pins active (high) for simplification. Al-
though the CLKX/CLKR can be set as either input or output, you should set
the CLKX as output and the CLKR as input. The rest of the bits are user-confi-
gurable as long as both serial ports have consistent setup.

You need the serial port timer only if the CLKX or CLKR is configured as an
output. Since only the CLKX is configured as an output, you should set the tim-
er control register to 0Fh. When the serial port timer is used, you should also
set the serial timer register to the proper value for the clock speed. The serial
port timer clock speed setup is similar to the TMS320C3x timer. Refer to Sec-
tion 8.1 on page 8-2 for detailed information on timer clock generation.

The maximum clock frequency for serial transfers is F(CLKIN)/4 if the internal
clock is used and F(CLKIN)/5.2 if an external clock is used. Therefore, if two
TMS320C3xs have the same system clock, the timer period register should
be set equal to or greater than 1, which makes the clock frequency equal to
F(CLKIN)/8.
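
The clock arithmetic can be checked with a short calculation. The Python
sketch below assumes that, with the timer-control value 0Fh used here, the
serial clock is F(CLKIN)/(8 × period); this is inferred from the statement
that a period of 1 gives F(CLKIN)/8 and is not a formula stated in this
section. The function names are hypothetical.

```python
def serial_clock_hz(clkin_hz, timer_period):
    """Serial clock produced by the serial-port timer (assumed formula)."""
    if timer_period < 1:
        raise ValueError("timer period must be at least 1")
    return clkin_hz / (8 * timer_period)

def within_clock_limit(clock_hz, clkin_hz, external_clock):
    """Check a serial clock against the stated maximum frequency."""
    # F(CLKIN)/5.2 applies to an external clock, F(CLKIN)/4 to an internal one.
    limit = clkin_hz / 5.2 if external_clock else clkin_hz / 4
    return clock_hz <= limit
```

With a 32-MHz CLKIN and a period of 1, the clock is 4 MHz, which is below the
external-clock limit of about 6.15 MHz seen by the receiving device.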

Example 8–1 and Example 8–2 are serial port register setups for the above
case. (Assume two TMS320C3xs have the same system clock.)


Example 8–1. Serial-Port Register Setup #1

Global control        = 0EBC0064h ; 32 bits, fixed data rate, burst mode,
Transmit port control = 0111h     ; FSX (output), CLKX (output) = F(CLKIN)/8,
Receive port control  = 0111h     ; CLKR (input), handshake mode, transmit
S_port timer control  = 0Fh       ; and receive interrupt is enabled.
S_port timer count    = 0h
S_port timer period   ≥ 01h

Example 8–2. Serial-Port Register Setup #2

Global control        = 0C000364h ; 8 bits, variable data rate, burst mode,
Transmit port control = 0111h     ; FSX (output), CLKX (output) = F(CLKIN)/24,
Receive port control  = 0111h     ; CLKR (input), handshake mode, transmit
S_port timer control  = 0Fh       ; and receive interrupt is disabled.
S_port timer count    = 0h
S_port timer period   ≥ 01h
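
To see how the two global-control values map onto the comments beside them,
the words can be split into bit fields. The Python sketch below is a
hypothetical decoder, not a TI utility. Its field positions are assumptions
based on the serial-port global-control register description earlier in this
chapter; they agree with the comments in Example 8–1 and Example 8–2 but
should be verified against that description.

```python
# (least significant bit, width) for each assumed field
GLOBAL_CONTROL_FIELDS = {
    "RRDY": (0, 1), "XRDY": (1, 1), "FSXOUT": (2, 1), "XSREMPTY": (3, 1),
    "RSRFULL": (4, 1), "HS": (5, 1), "XCLKSRCE": (6, 1), "RCLKSRCE": (7, 1),
    "XVAREN": (8, 1), "RVAREN": (9, 1), "XFSM": (10, 1), "RFSM": (11, 1),
    "CLKXP": (12, 1), "CLKRP": (13, 1), "DXP": (14, 1), "DRP": (15, 1),
    "FSXP": (16, 1), "FSRP": (17, 1), "XLEN": (18, 2), "RLEN": (20, 2),
    "XTINT": (22, 1), "XINT": (23, 1), "RTINT": (24, 1), "RINT": (25, 1),
    "XRESET": (26, 1), "RRESET": (27, 1),
}

def decode_global_control(value):
    """Split a 32-bit global-control word into named fields."""
    return {name: (value >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in GLOBAL_CONTROL_FIELDS.items()}

def word_length_bits(len_field):
    """Assumed XLEN/RLEN encoding: 0, 1, 2, 3 select 8, 16, 24, 32 bits."""
    return 8 * (len_field + 1)
```

Decoding 0EBC0064h gives HS = 1, FSXOUT = 1, 32-bit word lengths, fixed data
rate (XVAREN = RVAREN = 0), and both XINT and RINT enabled. Decoding
0C000364h gives 8-bit words, variable data rate, and both interrupts
disabled.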

Since the data has a leading 1 and the acknowledge signal is a 0 in the hand-
shake mode, the TMS320C3x serial port can distinguish between the data and
the acknowledge signal. Therefore, even if the TMS320C3x serial port re-
ceives the data before the acknowledge signal, the data will not be misinter-
preted as the acknowledge signal and be lost. In addition, the acknowledge
signal is not generated until the data is read from the data receive register
(DRR). Therefore, the TMS320C3x will not transmit the data and the acknowl-
edge signal simultaneously.

8.2.14.2 CPU Transfer With Serial-Port Transmit Polling Method

Example 8–3 sets up the CPU to transfer data (128 words) from an array buffer
to the serial port 0 output register when the previous value stored in the serial
port output register has been sent. Serial port 0 is initialized to transmit 32-bit
data words with an internally generated frame sync and a bit-transfer rate of
8 H1 cycles per bit.


Example 8–3. CPU Transfer With Serial-Port Transmit Polling Method

* TITLE: CPU TRANSFER WITH SERIAL-PORT TRANSMIT POLLING METHOD
*
.GLOBAL START
.DATA
SOURCE .WORD _ARRAY
.BSS _ARRAY,128 ; DATA ARRAY LOCATED IN .BSS SECTION
; THE UNDERSCORE USED IS JUST TO MAKE IT
; ACCESSIBLE FROM C (OPTIONAL)
SPORT .WORD 808040H ; SERIAL-PORT GLOBAL CONTROL REG ADDRESS
SPRESET .WORD 008C0044H ; SERIAL-PORT RESET
SGCCTRL .WORD 048C0044H ; SERIAL-PORT GLOBAL CONTROL REG INITIALIZATION
SXCTRL .WORD 111H ; SERIAL-PORT TX PORT CONTROL REG INITIALIZATION
STCTRL .WORD 00FH ; SERIAL-PORT TIMER CONTROL REG INITIALIZATION
STPERIOD .WORD 00000002h ; SERIAL-PORT TIMER PERIOD
RESET .WORD 0H ; SERIAL-PORT TIMER RESET VALUE
.TEXT
START LDP RESET ; LOAD DATA PAGE POINTER
ANDN 10H,IE ; DISABLE SERIAL-PORT TRANSMIT INTERRUPT TO CPU

* SERIAL PORT INITIALIZATION


LDI @SPORT,AR1
LDI @RESET,R0
LDI 4,IR0
STI R0,*+AR1(IR0) ; SERIAL-PORT TIMER RESET
LDI @SPRESET,R0
STI R0,*AR1 ; SERIAL-PORT RESET
LDI @SXCTRL,R0 ; SERIAL-PORT TX CONTROL REG INITIALIZATION
STI R0,*+AR1(3)
LDI @STPERIOD,R0 ; SERIAL-PORT TIMER PERIOD INITIALIZATION
STI R0,*+AR1(6)
LDI @STCTRL,R0 ; SERIAL-PORT TIMER CONTROL REG INITIALIZATION
STI R0,*+AR1(4)
LDI @SGCCTRL,R0 ; SERIAL-PORT GLOBAL CONTROL REG INITIALIZATION
STI R0,*AR1

* CPU WRITES THE FIRST WORD

LDI @SOURCE,AR0
LDI *AR0++,R1
STI R1,*+AR1(8)

* CPU WRITES 127 WORDS TO THE SERIAL PORT OUTPUT REG

LDI 8,IR0
LDI 2,R0
LDI 126,RC
RPTB LOOP
WAIT AND *AR1,R0,R2 ; WAIT UNTIL XRDY BIT = 1
BZ WAIT
LOOP STI R1,*+AR1(IR0)
|| LDI *++AR0(1),R1
BU $
.END


8.2.14.3 Serial AIC Interface Example

The TLC3204x analog interface chips (AIC) from Texas Instruments offer a
zero-glue-logic interface to the TMS320C3x family of DSPs. The interface is
shown in Figure 8–30 as an example of the TMS320C3x serial-port
configuration and operation.

Figure 8–30. TMS320C3x Zero-Glue-Logic Interface to TLC3204x Example

TMS320C3x pin      TLC3204x pin
XF0                RESET
CLKR0, CLKX0       SCLK
FSR0               FSR
DR0                DR
FSX0               FSX
DX0                DX
TCLK0              MCLK

Other TLC3204x pins shown: WORD, VCC, GND, OUT+/OUT– (analog out), and
IN+/IN– (analog in).

The TMS320C3x resets the AIC through the external pin XF0. It also gener-
ates the master clock for the AIC through the timer 0 output pin, TCLK0. (Pre-
cise selection of a sample rate may require the use of an external oscillator
rather than the TCLK0 output to drive the AIC MCLK input.) In turn, the AIC
generates the CLKR0 and CLKX0 shift clocks as well as the FSR0 and FSX0
frame synchronization signals.

A typical use of the AIC requires an 8-kHz sample rate of the analog signal.
If the clock input frequency to the TMS320C3x device is 30 MHz, you should
load the following values into the serial port and timer registers.

Serial Port:
Port global control register: 0E970300h
FSX/DX/CLKX port control register 00000111h
FSR/DR/CLKR port control register 00000111h

Timer:
Timer global control register 000002C1h
Timer period register 00000001h

8.2.14.4 Serial A/D and D/A Interface Example

The DSP201/2 and DSP101/2 family of D/As and A/Ds from Burr Brown also
offer a zero-glue-logic interface to the TMS320C3x family of DSPs. The inter-
face is shown in Example 8–4. This interface is used as an example of the
TMS320C3x serial-port configuration and operation.


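Example 8–4. TMS320C3x Zero-Glue-Logic Interface to Burr Brown A/D and D/A

[Connection diagram. Burr Brown DSP102 A/D to TMS320C3x: SOUTA to DR0, SYNC
to FSR0, XCLK from CLKX0 (which also drives CLKR0), CONV from TCLK0; analog
inputs VINA and VINB (±2.75 V). TMS320C3x to Burr Brown DSP202 D/A: DX0 to
SINA and SINB, SYNC to FSX0, XCLK from CLKX0, CONV from TCLK0; analog outputs
VOUTA and VOUTB (±3 V). Other labels shown: CASC, SSF, and SWL tied to +5 V;
OSC0 and OSC1 with a 12.29-MHz crystal, a 1-MOhm resistor, and two 22-pF
capacitors.]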

The DSP102 A/D is interfaced to the TMS320C3x serial port receive side; the
DSP202 D/A is interfaced to the transmit side. The A/Ds and D/As are hard-
wired to run in cascade mode. In this mode, when the TMS320C3x initiates a
convert command to the A/D via the TCLK0 pin, both analog inputs are con-
verted into two 16-bit words, which are concatenated to form one 32-bit word.
The A/D signals the TMS320C3x via the A/D’s SYNC signal (connected to the
TMS320C3x FSR0 pin) that serial data is to be transmitted. The 32-bit word
is then serially transmitted, MSB first, out the SOUTA serial pin of the DSP102
to the DR0 pin of the TMS320C3x serial port. The TMS320C3x is programmed
to drive the analog interface bit clock from the CLKX0 pin of the TMS320C3x.
The bit clock drives both the A/D’s and D/A’s XCLK input. The TMS320C3x
transmit clock also acts as the input clock on the receive side of the
TMS320C3x serial port. Since the receive clock is synchronous to the internal
clock of the TMS320C3x, the receive clock can run at full speed (that is,
f(H1)/2).


Similarly, on receiving a convert command, the pipelined D/A converts the last
word received from the TMS320C3x and signals the TMS320C3x via the
SYNC signal (connected to the TMS320C3x FSX0 pin) to begin transmitting
a 32-bit word representing the two channels of data to be converted. The data
transmitted from the TMS320C3x DX0 pin is input to both the SINA and SINB
inputs of the D/A as shown in the figure.

The TMS320C3x is set up to transfer bits at the maximum rate of about eight
Mbps, with a dual-channel sample rate of about 44.1 kHz. Assuming a 32-MHz
CLKIN, you can configure this standard-mode fixed-data-rate signaling inter-
face by setting the registers as described below:

Serial Port:
Port global-control register: 0EBC0040h
FSX/DX/CLKX port-control register 00000111h
FSR/DR/CLKR port-control register 00000111h
Receive/transmit timer-control register 0000000Fh

Timer:
Timer global-control register 000002C1h
Timer period register 000000B5h
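
The figures quoted above can be reproduced with a short calculation. This
Python sketch assumes f(H1) = F(CLKIN)/2 and that timer 0, with the
global-control value 2C1h, produces one output pulse every `period` counts of
an f(H1)/2 timer clock. Both assumptions come from the timer description in
Section 8.1 rather than from this example, and the function names are
hypothetical.

```python
def h1_hz(clkin_hz):
    """H1 clock frequency, assumed to be half of CLKIN."""
    return clkin_hz / 2

def full_speed_bit_rate_hz(clkin_hz):
    """Full-speed serial clock, f(H1)/2, as stated for this interface."""
    return h1_hz(clkin_hz) / 2

def timer_pulse_rate_hz(clkin_hz, period):
    """Convert-command rate on TCLK0 (assumed pulse-mode timer formula)."""
    if period < 1:
        raise ValueError("timer period must be at least 1")
    return h1_hz(clkin_hz) / 2 / period
```

For a 32-MHz CLKIN, the bit rate is 8 MHz and a period of B5h (181) gives a
convert rate of roughly 44.2 kHz, in line with the "about 44.1 kHz" quoted
above.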


8.3 DMA Controller


The TMS320C3x has an on-chip direct memory access (DMA) controller that
reduces the need for the CPU to perform input/output functions. The DMA con-
troller can perform input/output operations without interfering with the opera-
tion of the CPU. Therefore, it is possible to interface the TMS320C3x to slow
external memories and peripherals (A/Ds, serial ports, etc.) without reducing
the computational throughput of the CPU. The result is improved system per-
formance and decreased system cost.

A DMA transfer consists of two operations: a read from a memory location and
a write to a memory location. The DMA controller can read from and write to
any location in the TMS320C3x memory map. This includes all
memory-mapped peripherals. The operation of the DMA is controlled with the
following set of memory-mapped registers:
- DMA global-control register
- DMA source-address register
- DMA destination-address register
- DMA transfer-counter register

Table 8–7 shows these registers, their memory-mapped addresses, and their
functions. Each of these DMA registers is discussed in the succeeding subsec-
tions.


Table 8–7. Memory-Mapped Locations for a DMA Channel

Register                                           Peripheral Address
DMA Global Control (see Table 8–8)                 808000h
Reserved                                           808001h
Reserved                                           808002h
Reserved                                           808003h
DMA Source Address (see subsection 8.3.2)          808004h
Reserved                                           808005h
DMA Destination Address (see subsection 8.3.2)     808006h
Reserved                                           808007h
DMA Transfer Counter (see subsection 8.3.3)        808008h
Reserved                                           808009h
Reserved                                           80800Ah
Reserved                                           80800Bh
Reserved                                           80800Ch
Reserved                                           80800Dh
Reserved                                           80800Eh
Reserved                                           80800Fh


Table 8–8. DMA Global-Control Register Bits

Bit     Name      Reset Value  Function
1–0     START     0–0          These bits control the state in which the DMA
                               starts and stops. The DMA may be stopped
                               without any loss of data (see Table 8–9).
3–2     STAT      0–0          These bits indicate the status of the DMA and
                               change every cycle (see Table 8–10).
4       INCSRC    0            If INCSRC = 1, the source address is
                               incremented after every read.
5       DECSRC    0            If DECSRC = 1, the source address is
                               decremented after every read. If INCSRC =
                               DECSRC, the source address is not modified
                               after a read.
6       INCDST    0            If INCDST = 1, the destination address is
                               incremented after every write.
7       DECDST    0            If DECDST = 1, the destination address is
                               decremented after every write. If INCDST =
                               DECDST, the destination address is not
                               modified after a write.
9–8     SYNC      0–0          The SYNC bits determine the timing
                               synchronization between the events initiating
                               the source and the destination transfers. The
                               interpretation of the SYNC bits is shown in
                               Table 8–11.
10      TC        0            The TC bit affects the operation of the
                               transfer counter. If TC = 0, transfers are not
                               terminated when the transfer counter becomes
                               0. If TC = 1, transfers are terminated when
                               the transfer counter becomes 0.
11      TCINT     0            If TCINT = 1, the DMA interrupt is set when
                               the transfer counter makes a transition to 0.
                               If TCINT = 0, the DMA interrupt is not set
                               when the transfer counter makes a transition
                               to 0.
31–12   Reserved  0–0          Read as 0.

Note: When the DMA completes a transfer, the START bits remain in 11
(base 2). The DMA starts when the START bits are set to 11 and one of the
following conditions applies:

- The transfer counter is set to a value different from 0x0, or
- The TC bit is set to 0.
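
The bit assignments in Table 8–8 can be packed into a register value as
follows. This Python sketch is illustrative; the function and argument names
are hypothetical, and only the bit positions come from the table.

```python
def dma_global_control(start=0, incsrc=False, decsrc=False,
                       incdst=False, decdst=False, sync=0,
                       tc=False, tcint=False):
    """Build a DMA global-control word from the Table 8-8 fields."""
    if not 0 <= start <= 3 or not 0 <= sync <= 3:
        raise ValueError("START and SYNC are 2-bit fields")
    value = start                 # bits 1-0 (STAT, bits 3-2, is read-only)
    value |= int(incsrc) << 4
    value |= int(decsrc) << 5
    value |= int(incdst) << 6
    value |= int(decdst) << 7
    value |= sync << 8            # bits 9-8
    value |= int(tc) << 10
    value |= int(tcint) << 11
    return value

def dma_status(value):
    """Extract the read-only STAT field (bits 3-2)."""
    return (value >> 2) & 0b11
```

For example, starting the DMA (START = 11) with incrementing source and
destination addresses, TC = 1, and TCINT = 1 gives C53h.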


Table 8–9. START Bits and Operation of the DMA (Bits 0–1)

START   Function
00      DMA read or write cycles in progress will be completed; any data read
        will be ignored. Any pending read or write will be cancelled. The DMA
        is reset so that when it starts a new transaction begins; that is, a
        read is performed. (Reset value)
01      If a read or write has begun, it is completed before it stops. If a
        read or write has not begun, no read or write is started.
10      If a DMA transfer has begun, the entire transfer is completed
        (including both read and write operations) before stopping. If a
        transfer has not begun, none is started.
11      DMA starts from reset or restarts from the previous state.

Table 8–10. STAT Bits and Status of the DMA (Bits 2–3)

STAT    Function
00      DMA is being held between DMA transfers (between a write and a read).
        (Reset value)
01      DMA is being held in the middle of a DMA transfer, that is, between a
        read and a write.
10      Reserved.
11      DMA busy; that is, DMA is performing a read or write or waiting for a
        source or destination synchronization interrupt.

Table 8–11. SYNC Bits and Synchronization of the DMA (Bits 8–9)

SYNC    Function
00      No synchronization. Enabled interrupts are ignored. (Reset value)
01      Source synchronization. A read is performed when an enabled interrupt
        occurs.
10      Destination synchronization. A write is performed when an enabled
        interrupt occurs.
11      Source and destination synchronization. A read is performed when an
        enabled interrupt occurs. A write is then performed when the next
        enabled interrupt occurs.


8.3.1 DMA Global-Control Register


The global-control register controls the state in which the DMA controller oper-
ates. This register also indicates the status of the DMA, which changes every
cycle. Source and destination addresses can be incremented, decremented,
or synchronized using specified global-control register bits. At system reset,
all bits in the DMA control register are cleared to 0. Table 8–8 on page 8-45
lists the register bits, names, and functions. Figure 8–31 shows the bit config-
uration of the global-control register.

Figure 8–31. DMA Global-Control Register

Bits    Field     Access
31–12   xx        (reserved)
11      TCINT     R/W
10      TC        R/W
9–8     SYNC      R/W
7       DECDST    R/W
6       INCDST    R/W
5       DECSRC    R/W
4       INCSRC    R/W
3–2     STAT      R
1–0     START     R/W

R = Read, W = Write, xx = reserved bit, read as 0

8.3.2 Destination- and Source-Address Registers


The DMA destination- and source-address registers are 24-bit registers whose
contents specify destination and source addresses. As specified by control
bits DECSRC, INCSRC, DECDST, and INCDST of the DMA global-control register,
these registers are incremented or decremented at the end of the
corresponding memory access, that is, the source register for a read and the
destination register for a write. On system reset, 0 is written to these
registers.

8.3.3 Transfer-Counter Register


The transfer-counter register is a 24-bit register, controlled by a 24-bit counter
that counts down. The counter decrements at the beginning of a DMA memory
write. In this way, it can control the size of a block of data transferred. The trans-
fer counter register is set to 0 at system reset. When the TCINT bit of DMA
global-control register is set, the transfer-counter register will cause a DMA in-
terrupt flag to be set upon count down to 0.

8.3.4 CPU/DMA Interrupt-Enable Register


The CPU/DMA interrupt-enable register (IE) is a 32-bit register located in
the CPU register file. The CPU interrupt-enable bits are in locations 10–0.
The DMA interrupt-enable bits are in locations 26–16. A 1 in a CPU/DMA
interrupt-enable register bit enables the corresponding interrupt. A 0
disables the corresponding interrupt. At reset, 0 is written to this
register.


Table 8–12 lists the bits, names, and functions of the CPU/DMA interrupt en-
able register. Figure 8–32 shows the IE register. The priority and decoding
schemes of CPU and DMA interrupts are identical. Note that when the DMA
receives an interrupt, this interrupt is acted upon according to the SYNC field
of the DMA control register. Also note that an interrupt can affect the DMA but
not the CPU and can affect the CPU but not the DMA. Refer to subsection 3.1.8
on page 3-7 and to Chapter 6.

Table 8–12. CPU/DMA Interrupt-Enable Register Bits

Bit Name Function
0 EINT0 Enable external interrupt 0 (CPU)
1 EINT1 Enable external interrupt 1 (CPU)
2 EINT2 Enable external interrupt 2 (CPU)
3 EINT3 Enable external interrupt 3 (CPU)
4 EXINT0 Enable serial-port 0 transmit interrupt (CPU)
5 ERINT0 Enable serial-port 0 receive interrupt (CPU)
6 EXINT1 Enable serial-port 1 transmit interrupt (CPU)
7 ERINT1 Enable serial-port 1 receive interrupt (CPU)
8 ETINT0 Enable timer 0 interrupt (CPU)
9 ETINT1 Enable timer 1 interrupt (CPU)
10 EDINT Enable DMA controller interrupt (CPU)
15–11 Reserved Read as 0
16 EINT0 Enable external interrupt 0 (DMA)
17 EINT1 Enable external interrupt 1 (DMA)
18 EINT2 Enable external interrupt 2 (DMA)
19 EINT3 Enable external interrupt 3 (DMA)
20 EXINT0 Enable serial-port 0 transmit interrupt (DMA)
21 ERINT0 Enable serial-port 0 receive interrupt (DMA)
22 EXINT1 Enable serial-port 1 transmit interrupt (DMA)
23 ERINT1 Enable serial-port 1 receive interrupt (DMA)
24 ETINT0 Enable timer 0 interrupt (DMA)
25 ETINT1 Enable timer 1 interrupt (DMA)
26 EDINT Enable DMA controller interrupt (DMA)
31–27 Reserved Read as 0
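
Because the CPU and DMA halves of the register use the same ordering, a mask
can be built from the names in Table 8–12. This Python sketch is
illustrative; the helper name and argument names are hypothetical.

```python
# Bit order shared by the CPU half (bits 0-10) and the DMA half (bits 16-26)
IE_SOURCES = ["EINT0", "EINT1", "EINT2", "EINT3", "EXINT0", "ERINT0",
              "EXINT1", "ERINT1", "ETINT0", "ETINT1", "EDINT"]

def ie_mask(cpu=(), dma=()):
    """Return the IE value that enables the named CPU and DMA interrupts."""
    mask = 0
    for name in cpu:
        mask |= 1 << IE_SOURCES.index(name)
    for name in dma:
        mask |= 1 << (16 + IE_SOURCES.index(name))
    return mask
```

For example, `ie_mask(cpu=["EXINT0"])` is 10h, the same bit that Example 8–3
clears with `ANDN 10H,IE`, and `ie_mask(dma=["EXINT0"])` is 100000h.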


Figure 8–32. CPU/DMA Interrupt-Enable Register

Bit     Field            Access      Bit     Field            Access
31–27   xx               (reserved)  15–11   xx               (reserved)
26      EDINT (DMA)      R/W         10      EDINT (CPU)      R/W
25      ETINT1 (DMA)     R/W         9       ETINT1 (CPU)     R/W
24      ETINT0 (DMA)     R/W         8       ETINT0 (CPU)     R/W
23      ERINT1 (DMA)     R/W         7       ERINT1 (CPU)     R/W
22      EXINT1 (DMA)     R/W         6       EXINT1 (CPU)     R/W
21      ERINT0 (DMA)     R/W         5       ERINT0 (CPU)     R/W
20      EXINT0 (DMA)     R/W         4       EXINT0 (CPU)     R/W
19      EINT3 (DMA)      R/W         3       EINT3 (CPU)      R/W
18      EINT2 (DMA)      R/W         2       EINT2 (CPU)      R/W
17      EINT1 (DMA)      R/W         1       EINT1 (CPU)      R/W
16      EINT0 (DMA)      R/W         0       EINT0 (CPU)      R/W

Note: xx = Reserved bit, read as 0
      R = read, W = write

8.3.5 DMA Memory Transfer Operation

Each DMA memory transfer consists of two parts:

- Read data from the address specified by the DMA source register

- Write data that has been read to the address specified by the DMA desti-
nation register

A transfer is complete only when the read and write are complete. You can
stop a transfer by setting the START bits to the desired value. When the DMA
is restarted (START = 11), it completes any pending transfer.

At the end of a DMA read, the source address is modified as specified by the
INCSRC and DECSRC bits of the DMA global-control register. At the end of a
DMA write, the destination address is modified as specified by the INCDST
and DECDST bits of the DMA global-control register. At the end of every DMA
write, the DMA transfer counter is decremented.
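
The read, write, address-update, and counter-decrement sequence can be
modeled in a few lines. The Python sketch below is a behavioral illustration
only: memory is a plain dictionary, synchronization (the SYNC bits) and bus
timing are ignored, and all names are hypothetical.

```python
MASK24 = 0xFFFFFF  # address and counter registers are 24 bits wide

def _step(addr, inc, dec):
    """Apply the INC/DEC rule: no change when both bits are equal."""
    if inc == dec:
        return addr
    return (addr + (1 if inc else -1)) & MASK24

def dma_run(memory, src, dst, count, incsrc=False, decsrc=False,
            incdst=False, decdst=False, tc=True, max_transfers=None):
    """Run transfers and return (src, dst, counter, transfers_done)."""
    if tc and count == 0:
        return src, dst, count, 0   # DMA does not start in this case
    if not tc and max_transfers is None:
        raise ValueError("with TC = 0 a max_transfers bound is required")
    done = 0
    while max_transfers is None or done < max_transfers:
        data = memory.get(src, 0)           # read from the source address
        src = _step(src, incsrc, decsrc)    # source modified after the read
        memory[dst] = data                  # write to the destination
        dst = _step(dst, incdst, decdst)    # destination modified after write
        count = (count - 1) & MASK24        # counter decremented per write
        done += 1
        if tc and count == 0:
            break                           # TC = 1: stop when counter is 0
    return src, dst, count, done
```

A block copy uses INCSRC = INCDST = 1. A transfer to a fixed peripheral
address, such as a serial-port data register, leaves both destination bits
at 0.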

DMA on-chip reads and writes (reads and writes from on-chip memory and pe-
ripherals) are single-cycle. DMA off-chip reads are two cycles. The first cycle
is the external read, and the second cycle loads the DMA register. The external
read cycle is identical to a CPU read cycle. DMA off-chip writes are identical
to CPU off-chip writes. If the DMA has been started and is transferring data
over either external bus, you should not modify the bus-control register asso-
ciated with that bus. If you must modify the bus-control register (see Chapter
7), stop the DMA, make the modification, and then restart the DMA. Failure to
do this may produce an unexpected zero-wait-state bus access.


Through the 24-bit source and destination registers, the DMA is capable of ac-
cessing any memory-mapped location in the TMS320C3x memory map.
Table 8–13, Table 8–14, and Table 8–15 show the number of cycles a DMA
transfer requires, depending on whether the source and destination are on-
chip memory and peripherals, the external port, or the I/O port. T represents
the number of transfers to be performed, Cr represents the number of wait-
states for the source read, and Cw represents the number of wait-states for the
destination write. Each entry in the table represents the total cycles required
to do the T transfers, assuming that there are no pipeline conflicts.

Accompanying each table is a figure illustrating the timing of the DMA transfer.
|R| and |W| represent single-cycle reads and writes, respectively. |R.R| and
|W.W| represent multicycle reads and writes. |Cr| and |Cw| show the number
of wait cycles for a read and write.

Table 8–13.DMA Timing When Destination Is On-Chip

Cycles (H1) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Source On-Chip R R R : : : : : : : : : : : : :
: : : : : : : : : : : : : : : : : :
Destination On-Chip W W W : : : : : : : : : : : :

Source Primary Bus R .R .R: I R.R . R: I R .R .R :I : : : :


Cr : : Cr : : Cr : : : : :
: : : : : : : : : : : : : : : : : :
Destination On-Chip : : : W : : : W : : : W : : :
Source Expansion Bus R .R .R: I R .R. R : I R .R .R: I : : : :
: Cr : : Cr : : Cr : : : : :
: : : : : : : : : : : : : : : : : :
Destination On-Chip : : : W : : : W : : : W : : :

Source Destination On-Chip


On-Chip (1 + 1)T
Primary Bus (2 + Cr + 1)T
Expansion Bus (2 + Cr + 1)T

Legend:
T = Number of transfers
Cr = Source-read wait states
Cw = Destination-write wait states
|R| = Single-cycle reads
|W| = Single-cycle writes
|R.R| = Multicycle reads
|W.W| = Multicycle writes
|I| = Internal register cycle


Table 8–14. DMA Timing When Destination Is a Primary Bus

[Timing diagram not reproduced: it plots the R, R.R, I, and W.W cycles, with
the Cr and Cw wait cycles, against H1 cycles 1–19 for on-chip, primary-bus,
and expansion-bus sources.]

Source            Destination Primary Bus
On-Chip           1 + (2 + Cw)T
Primary Bus       (2 + Cr + 2 + Cw)T
Expansion Bus     (2 + Cr + 2 + Cw) + (2 + Cw + max(1, Cr – Cw + 1))(T – 1)

Legend:

T = Number of transfers
Cr = Source-read wait states
Cw = Destination-write wait states
|R| = Single-cycle reads
|W| = Single-cycle writes
|R.R| = Multicycle reads
|W.W| = Multicycle writes
|I| = Internal register cycle


Table 8–15. DMA Timing When Destination Is an Expansion Bus

[Timing diagram not reproduced: it plots the R, R.R, I, and W.W cycles, with
the Cr and Cw wait cycles, against H1 cycles 1–19 for on-chip, primary-bus,
and expansion-bus sources.]

Source            Destination Expansion Bus
On-Chip           1 + (2 + Cw)T
Primary Bus       (2 + Cr + 2 + Cw) + (2 + Cw + max(1, Cr – Cw + 1))(T – 1)
Expansion Bus     (2 + Cr + 2 + Cw)T

Legend:

T = Number of transfers
Cr = Source-read wait states
Cw = Destination-write wait states
|R| = Single-cycle reads
|W| = Single-cycle writes
|R.R| = Multicycle reads
|W.W| = Multicycle writes
|I| = Internal register cycle
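
The cycle formulas in Table 8–13 through Table 8–15 can be evaluated mechanically. The following Python sketch is an illustration only and is not part of any TMS320C3x tool; the function name dma_cycles and the bus labels "on-chip", "primary", and "expansion" are assumptions made for this example. Like the tables, it assumes that there are no pipeline conflicts.

```python
BUSES = ("on-chip", "primary", "expansion")


def dma_cycles(source, dest, transfers, cr=0, cw=0):
    """Total H1 cycles for `transfers` DMA transfers (Tables 8-13 to 8-15).

    cr = source-read wait states, cw = destination-write wait states.
    """
    if source not in BUSES or dest not in BUSES:
        raise ValueError("bus must be one of %r" % (BUSES,))
    if transfers < 1:
        raise ValueError("transfers must be at least 1")
    t = transfers
    if dest == "on-chip":
        # Table 8-13: destination is on-chip.
        if source == "on-chip":
            return (1 + 1) * t
        return (2 + cr + 1) * t
    if source == "on-chip":
        # On-chip source, external destination.
        return 1 + (2 + cw) * t
    if source == dest:
        # Read and write share one external bus.
        return (2 + cr + 2 + cw) * t
    # Source and destination are on different external buses.
    return (2 + cr + 2 + cw) + (2 + cw + max(1, cr - cw + 1)) * (t - 1)


if __name__ == "__main__":
    # 128 words from the primary bus to on-chip RAM with one read wait state.
    print(dma_cycles("primary", "on-chip", 128, cr=1))
```

For a long block, the per-transfer term dominates, so the one-time terms in the mixed-bus formulas matter mainly for short transfers.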


Table 8–16 shows the maximum DMA transfer rates, assuming that there are
no wait states (Cr = Cw = 0). Table 8–17 shows the maximum DMA transfer
rates, assuming there is one wait state for the read (Cr = 1) and no wait states
for the write (Cw = 0). Table 8–18 shows the maximum DMA transfer rates,
assuming there is one wait state for the read (Cr = 1) and one wait state for the
write (Cw = 1).

In each table, the time for the complete transfer (the read and the write) is con-
sidered. Since one bus access is required for the read and another for the
write, internal bus transfer rates will be twice the DMA transfer rate. It is also
assumed that no conflicts with the CPU exist. Rates are listed in Mwords/sec.
A word is 32 bits (4 bytes).

Table 8–16. Maximum DMA Transfer Rates When Cr = Cw = 0

                              Destination
Source       Internal           Primary            Expansion
Internal     8.33 Mwords/sec    8.33 Mwords/sec    8.33 Mwords/sec
Primary      5.56 Mwords/sec    4.17 Mwords/sec    5.56 Mwords/sec
Expansion    5.56 Mwords/sec    5.56 Mwords/sec    4.17 Mwords/sec

Table 8–17. Maximum DMA Transfer Rates When Cr = 1, Cw = 0

                              Destination
Source       Internal           Primary            Expansion
Internal     8.33 Mwords/sec    8.33 Mwords/sec    8.33 Mwords/sec
Primary      4.17 Mwords/sec    3.33 Mwords/sec    4.17 Mwords/sec
Expansion    4.17 Mwords/sec    4.17 Mwords/sec    3.33 Mwords/sec

Table 8–18. Maximum DMA Transfer Rates When Cr = 1, Cw = 1

                              Destination
Source       Internal           Primary            Expansion
Internal     8.33 Mwords/sec    5.56 Mwords/sec    5.56 Mwords/sec
Primary      4.17 Mwords/sec    2.78 Mwords/sec    4.17 Mwords/sec
Expansion    4.17 Mwords/sec    4.17 Mwords/sec    2.78 Mwords/sec
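
The rates in Table 8–16 through Table 8–18 follow from the steady-state cycles per transfer in the timing tables. The Python sketch below reproduces them under one assumption that is not stated in this section: an H1 clock of 16.67 MHz (a 60-ns H1 cycle), which is the value implied by the 8.33-Mwords/sec rate for a two-cycle internal transfer. The function names are hypothetical.

```python
H1_MHZ = 50.0 / 3.0  # assumed H1 clock: 16.67 MHz (60-ns H1 cycle)

EXTERNAL = ("primary", "expansion")


def cycles_per_transfer(source, dest, cr=0, cw=0):
    """Steady-state H1 cycles per word, from the per-transfer terms of the
    DMA timing formulas (one-time start-up terms are ignored)."""
    for bus in (source, dest):
        if bus != "internal" and bus not in EXTERNAL:
            raise ValueError("unknown bus: %r" % (bus,))
    if dest == "internal":
        return 2 if source == "internal" else 2 + cr + 1
    if source == "internal":
        return 2 + cw
    if source == dest:
        return 2 + cr + 2 + cw
    # Different external buses: the next read overlaps the current write.
    return 2 + cw + max(1, cr - cw + 1)


def max_rate_mwords(source, dest, cr=0, cw=0):
    """Maximum DMA transfer rate in Mwords/sec, rounded as in the tables."""
    return round(H1_MHZ / cycles_per_transfer(source, dest, cr, cw), 2)


if __name__ == "__main__":
    for src in ("internal", "primary", "expansion"):
        print(src, [max_rate_mwords(src, dst, cr=1, cw=1)
                    for dst in ("internal", "primary", "expansion")])
```

For a device with a different H1 frequency, scale the rates in proportion by changing H1_MHZ.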


8.3.6 Synchronization of DMA Channels


You can synchronize a DMA channel with interrupts. Refer to Table 8–11 on
page 8-46 for the relationship between the SYNC bits of the DMA global con-
trol register and the synchronization performed. This section describes the fol-
lowing four synchronization mechanisms:
- No synchronization (SYNC = 0 0)
- Source synchronization (SYNC = 0 1)
- Destination synchronization (SYNC = 1 0)
- Source and destination synchronization (SYNC = 1 1)

No Synchronization

When SYNC = 0 0, no synchronization is performed. The DMA performs reads
and writes whenever there are no conflicts. All interrupts are ignored and
therefore are considered to be globally disabled. However, no bits in the DMA
interrupt-enable register are changed. Figure 8–33 shows the
synchronization mechanism when SYNC = 0 0.

Figure 8–33. No DMA Synchronization


Start

Disable DMA Interrupts

DMA Channel Performs a Read

DMA Channel Performs a Write

Go to Start

Source Synchronization

When SYNC = 0 1, the DMA is synchronized to the source (see Figure 8–34).
A read will not be performed until an interrupt is received by the DMA. Then
all DMA interrupts are disabled globally. However, no bits in the DMA interrupt
enable register are changed.


Figure 8–34. DMA Source Synchronization


Start

Idle Until Enabled Interrupt Is Received

Disable DMA Interrupts Globally

DMA Channel Performs a Read

Enable DMA Interrupts Globally

DMA Channel Performs a Write

Go to Start

Destination Synchronization

When SYNC = 1 0, the DMA is synchronized to the destination. First, all
interrupts are ignored until the read is complete. Though the DMA interrupts
are considered globally disabled, no bits in the DMA interrupt-enable register
are changed. A write will not be performed until an interrupt is received by
the DMA. Figure 8–35 shows the synchronization mechanism when SYNC = 1 0.

Figure 8–35. DMA Destination Synchronization


Start

DMA Channel Performs a Read

Idle Until Enabled Interrupt Is Received

Disable DMA Interrupts Globally

DMA Channel Performs a Write

DMA Interrupts Are Enabled Globally

Go to Start

Source and Destination Synchronization

When SYNC = 1 1, the DMA is synchronized to both the source and destina-
tion. A read is performed when an interrupt is received. A write is performed
on the following interrupt. Source and destination synchronization when
SYNC = 1 1 is shown in Figure 8–36.


Figure 8–36. DMA Source and Destination Synchronization

Start

Idle Until Enabled Interrupt Is Received

Disable DMA Interrupts Globally

DMA Channel Performs a Read

Enable DMA Interrupts Globally

Idle Until Enabled Interrupt Is Received

Disable DMA Interrupts Globally

DMA Channel Performs a Write

Enable DMA Interrupts Globally

Go to Start
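
The four flowcharts differ only in where the DMA idles for an enabled interrupt. The following Python sketch models that ordering on a host machine; it is an illustration, not a description of the DMA hardware, and the function names are hypothetical. The sync argument is the two-bit SYNC field taken as an integer (0 through 3).

```python
def sync_steps(sync, transfers=1):
    """Return the ordered events for `transfers` DMA transfers.

    sync: SYNC field value (0b00 none, 0b01 source, 0b10 destination,
    0b11 source and destination).
    """
    if sync not in (0b00, 0b01, 0b10, 0b11):
        raise ValueError("SYNC is a two-bit field")
    wait_before_read = sync in (0b01, 0b11)
    wait_before_write = sync in (0b10, 0b11)
    steps = []
    for _ in range(transfers):
        if wait_before_read:
            steps.append("wait for interrupt")
        steps.append("read")
        if wait_before_write:
            steps.append("wait for interrupt")
        steps.append("write")
    return steps


def interrupts_required(sync, transfers):
    """Number of enabled interrupts the DMA must receive to finish."""
    return sync_steps(sync, transfers).count("wait for interrupt")


if __name__ == "__main__":
    print(sync_steps(0b11))
    print(interrupts_required(0b01, 128))
```

The count shows why a missed interrupt stalls a synchronized block transfer: with SYNC = 1 1, a 128-word block needs 256 interrupts to complete.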

8.3.7 DMA Interrupts

You can generate a DMA interrupt to the CPU whenever the transfer count
reaches 0, indicating that the last transfer has taken place. The TCINT bit in
the DMA global control register determines whether the interrupt will be gener-
ated. If TCINT = 1, the DMA interrupt is generated. If TCINT = 0, the DMA inter-
rupt is not generated. If the DMA interrupt is generated, the EDINT bit, bit 10
in the interrupt enable register, must also be set to enable the CPU to be inter-
rupted by the DMA.

A second bit in the DMA global control register, the TC bit, is also generally
associated with the state of the TCINT bit and the interrupt operation. The TC
bit determines whether transfers are terminated when the transfer counter be-
comes 0 or whether they are allowed to continue. If TC = 1, transfers are termi-
nated when the transfer count becomes 0. If TC = 0, transfers are not termi-
nated when the transfer count becomes 0.

In general, if TCINT is 0, TC should also be cleared to 0. Otherwise, the DMA
transfer will terminate, and the CPU will not be notified. If TCINT is 1, TC should
also be 1 in most cases. In this case, the CPU will be notified when the transfer
completes, and the DMA will be halted and ready to start a new transfer.
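
The interaction of TCINT, TC, and EDINT can be tabulated with a short Python sketch. It is a host-side model for illustration; the function name and the result keys are hypothetical. It does not model the global interrupt enable in the ST register, which must also be set before the CPU responds to the interrupt.

```python
EDINT = 1 << 10  # bit 10 of the interrupt enable (IE) register


def count_zero_outcome(tcint, tc, ie):
    """Describe what happens when the transfer counter reaches 0.

    tcint, tc: bit values (0 or 1) from the DMA global control register.
    ie: value of the CPU interrupt enable register.
    """
    interrupt_generated = tcint == 1
    return {
        "transfers_stop": tc == 1,
        "interrupt_generated": interrupt_generated,
        "cpu_notified": interrupt_generated and bool(ie & EDINT),
    }


if __name__ == "__main__":
    for tcint in (0, 1):
        for tc in (0, 1):
            print(tcint, tc, count_zero_outcome(tcint, tc, ie=0x400))
```

The combination TCINT = 0 with TC = 1 is the case to avoid: transfers stop, but nothing notifies the CPU.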


8.3.8 DMA Initialization/Reconfiguration


You can control the DMA through memory-mapped registers located on the
dedicated peripheral bus. Following is the general procedure for initializing
and/or reconfiguring the DMA:
1) Halt the DMA by clearing the START bits of the DMA global-control regis-
ter. You can do this by writing a 0 to the DMA global-control register. Note
that the DMA is halted on RESET.
2) Configure the DMA via the DMA global-control register (with START = 00),
as well as the DMA source, destination, and transfer-counter registers, if
necessary. Refer to subsection 8.3.10 on page 8-58 for more information.
3) Start the DMA by setting the START bits of the DMA global-control register
as necessary.
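
The three steps can be sketched as an ordered sequence of register writes. The Python model below is an illustration that records writes instead of touching hardware. The base address and the register offsets (4, 6, and 8) are taken from the assembly examples in subsection 8.3.10; the assumption that the START bits are the two low-order bits is inferred from the control values used there (0C40H with the DMA stopped, 0C43H to start it). The function name and the write callback are hypothetical.

```python
DMA_GLOBAL_CONTROL = 0x808000  # DMA global control register address
SOURCE_OFFSET = 4              # DMA source address register
DEST_OFFSET = 6                # DMA destination address register
COUNT_OFFSET = 8               # DMA transfer counter register
START_MASK = 0x3               # assumed: START field is the two low-order bits


def reconfigure_dma(write, control, source, dest, count):
    """Issue the halt / configure / start sequence through write(addr, value)."""
    base = DMA_GLOBAL_CONTROL
    write(base, 0)                      # 1) halt: clear the START bits
    write(base, control & ~START_MASK)  # 2) configure with START = 00
    write(base + SOURCE_OFFSET, source)
    write(base + DEST_OFFSET, dest)
    write(base + COUNT_OFFSET, count)
    write(base, control)                # 3) start: set the START bits


# Record the writes in a list; the addresses below are example values only.
log = []
reconfigure_dma(lambda addr, value: log.append((addr, value)),
                control=0x0C43, source=0x809800, dest=0x809C00, count=128)

if __name__ == "__main__":
    for addr, value in log:
        print("%06Xh <- %08Xh" % (addr, value))
```

Keeping the final write to the global control register last ensures that the source, destination, and counter registers are valid before the DMA starts.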

8.3.9 Hints for DMA Programming


The following hints help you improve your DMA programming and avoid unex-
pected results:
- Reset the DMA register before starting it. This clears any previously
latched interrupt that may no longer exist.
- In the event of a CPU-DMA access conflict, the CPU always prevails.
Carefully allocate the different sections of the program in memory for fast-
er execution. If a CPU program access conflicts with a DMA access, enab-
ling the cache helps if the program is located in external memory. DMA on-
chip access happens during the H3 phase. Refer to Chapter 9 for details
on CPU accesses.

Note: Expansion and Peripheral Buses


The expansion and peripheral buses cannot be accessed simultaneously
because they are multiplexed into a common port (see Figure 2–1 on page
2-3). This might increase CPU-DMA access conflicts.

- Ensure that each interrupt is received when you use interrupt
synchronization; otherwise, the DMA will never complete the block transfer.
- Use read/write synchronization when reading from or writing to serial ports
to guarantee data validity.

The following are indications that the DMA has finished a set of transfers:
- The DINT bit in the IIF register is set to 1 (interrupt polling). This requires
that the TCINT bit in the DMA control register be set first. This
interrupt-polling method does not cause any additional CPU-DMA access conflict.


- The transfer counter has a zero value. However, notice that the transfer
counter is decremented after the DMA read operation finishes (not after
the write operation). Nevertheless, a transfer counter with a 0 value can
be used as an indication of a transfer completion.

- The STAT bits in the DMA channel control register are set to 00 (binary).
You can poll the DMA channel control register for this value. However, because
the DMA registers are memory-mapped into the peripheral bus address
space, this option can cause further CPU-DMA access conflicts.

8.3.10 DMA Programming Examples

Example 8–5, Example 8–6, and Example 8–7 illustrate initialization proce-
dures for the DMA.

When linking the examples, you should allocate section memory addresses
carefully to avoid CPU-DMA conflict. In the ’C3x, the CPU always prevails in
cases of conflict. In the event of a CPU program–DMA data conflict, the enab-
ling of the cache helps if the .text section is in external memory. For example,
when linking the code in Example 8–5, Example 8–6, and Example 8–7, the
.text section can be allocated into RAM0, .data into RAM1, and .bss into
RAM1, where RAM0 and RAM1 correspond to on-chip RAM block 0 and block
1, respectively.

In Example 8–5, the DMA initializes a 128-element array to 0. The DMA sends
an interrupt to the CPU after the transfer is completed. This program assumes
previous initialization of the CPU interrupt vector table (specifically the DMA-
to-CPU interrupt). The program initializes the ST and IE registers for interrupt
processing.

Example 8–5. Array Initialization With DMA


* TITLE: ARRAY INITIALIZATION WITH DMA
*
.GLOBAL START
.DATA
DMA .WORD 808000H ; DMA GLOBAL CONTROL REG ADDRESS
RESET .WORD 0C40H ; DMA GLOBAL CONTROL REG RESET VALUE
CONTROL .WORD 0C43H ; DMA GLOBAL CONTROL REG INITIALIZATION
SOURCE .WORD ZERO ; DATA SOURCE ADDRESS
DESTIN .WORD _ARRAY ; DATA DESTINATION ADDRESS
COUNT .WORD 128 ; NUMBER OF WORDS TO TRANSFER
ZERO .FLOAT 0.0 ; ARRAY INITIALIZATION VALUE 0.0 = 0x80000000
.BSS _ARRAY,128 ; DATA ARRAY LOCATED IN .BSS SECTION
.TEXT

START LDP DMA ; LOAD DATA PAGE POINTER


LDI @DMA,AR0 ; POINT TO DMA GLOBAL CONTROL REGISTER
LDI @RESET,R0 ; RESET DMA
STI R0,*AR0
LDI @SOURCE,R0 ; INITIALIZE DMA SOURCE ADDRESS REGISTER
STI R0,*+AR0(4)
LDI @DESTIN,R0 ; INITIALIZE DMA DESTINATION ADDRESS REGISTER
STI R0,*+AR0(6)
LDI @COUNT,R0 ; INITIALIZE DMA TRANSFER COUNTER REGISTER
STI R0,*+AR0(8)
OR 400H,IE ; ENABLE INTERRUPT FROM DMA TO CPU
OR 2000H,ST ; ENABLE CPU INTERRUPTS GLOBALLY
LDI @CONTROL,R0 ; INITIALIZE DMA GLOBAL CONTROL REGISTER
STI R0,*AR0 ; START DMA TRANSFER
BU $
.END

Example 8–6 sets up the DMA to transfer data (128 words) from the serial port
0 input register to an array buffer with serial port receive interrupt (RINT0). The
DMA sends an interrupt to the CPU when the data transfer completes.

Serial port 0 is initialized to receive 32-bit data words with an internally
generated receive-bit clock and a bit-transfer rate of 8 H1 cycles/bit.

This program assumes previous initialization of the CPU interrupt vector table
(specifically the DMA-to-CPU interrupt). The serial port interrupt directly
affects only the DMA; therefore, no CPU serial port interrupt vector setting is
required.

Example 8–6. DMA Transfer With Serial-Port Receive Interrupt


* TITLE: DMA TRANSFER WITH SERIAL PORT RECEIVE INTERRUPT
*
.GLOBAL START
.DATA
DMA .WORD 808000H ; DMA GLOBAL CONTROL REG ADDRESS
CONTROL .WORD 0D43H ; DMA GLOBAL CONTROL REG INITIALIZATION
SOURCE .WORD 80804CH ; DATA SOURCE ADDRESS: SERIAL PORT INPUT REG
DESTIN .WORD _ARRAY ; DATA DESTINATION ADDRESS
COUNT .WORD 128 ; NUMBER OF WORDS TO TRANSFER
IEVAL .WORD 00200400H ; IE REGISTER VALUE
RESET1 .WORD 0D40H ; DMA RESET
.BSS _ARRAY,128 ; DATA ARRAY LOCATED IN .BSS SECTION
; THE UNDERSCORE USED IS JUST TO MAKE IT
; ACCESSIBLE FROM C (OPTIONAL)
SPORT .WORD 808040H ; SERIAL PORT GLOBAL CONTROL REG ADDRESS
SGCCTRL .WORD 0A300080H ; SERIAL PORT GLOBAL CONTROL REG INITIALIZATION
SRCTRL .WORD 111H ; SERIAL PORT RX PORT CONTROL REG INITIALIZATION
STCTRL .WORD 3C0H ; SERIAL PORT TIMER CONTROL REG INITIALIZATION
STPERIOD .WORD 00020000H ; SERIAL PORT TIMER PERIOD
SPRESET .WORD 01300080H ; SERIAL PORT RESET
RESET .WORD 0H ; SERIAL-PORT TIMER RESET
.TEXT
START LDP DMA ; LOAD DATA PAGE POINTER


* DMA INITIALIZATION
LDI @DMA,AR0 ; POINT TO DMA GLOBAL CONTROL REGISTER
LDI @SPORT,AR1
LDI @RESET,R0
STI R0,*+AR1(4) ; RESET SPORT TIMER
LDI @RESET1,R0
STI R0,*AR0 ; RESET DMA
LDI @SPRESET,R0
STI R0,*AR1 ; RESET SPORT
LDI @SOURCE,R0 ; INITIALIZE DMA SOURCE ADDRESS REGISTER
STI R0,*+AR0(4)
LDI @DESTIN,R0 ; INITIALIZE DMA DESTINATION ADDRESS REGISTER
STI R0,*+AR0(6)
LDI @COUNT,R0 ; INITIALIZE DMA TRANSFER COUNTER REGISTER
STI R0,*+AR0(8)
OR @IEVAL,IE ; ENABLE INTERRUPTS
OR 2000H,ST ; ENABLE CPU INTERRUPTS GLOBALLY
LDI @CONTROL,R0 ; INITIALIZE DMA GLOBAL CONTROL REGISTER
STI R0,*AR0 ; START DMA TRANSFER
* SERIAL PORT INITIALIZATION
LDI @SRCTRL,R0 ; SERIAL-PORT RECEIVE CONTROL REG INITIALIZATION
STI R0,*+AR1(3)
LDI @STPERIOD,R0 ; SERIAL-PORT TIMER PERIOD INITIALIZATION
STI R0,*+AR1(6)
LDI @STCTRL,R0 ; SERIAL-PORT TIMER CONTROL REG INITIALIZATION
STI R0,*+AR1(4)
LDI @SGCCTRL,R0 ; SERIAL-PORT GLOBAL CONTROL REG INITIALIZATION
STI R0,*AR1
BU $
.END

Example 8–7 sets up the DMA to transfer data (128 words) from an array buffer
to the serial port 0 output register with serial port transmit interrupt XINT0.
The DMA sends an interrupt to the CPU when the data transfer completes.

Serial port 0 is initialized to transmit 32-bit data words with an internally
generated frame sync and a bit-transfer rate of 8 H1 cycles/bit. The receive-bit
clock is internally generated and equal in frequency to one-half of the ’C3x H1
frequency.

This program assumes previous initialization of the CPU interrupt vector table
(specifically the DMA-to-CPU interrupt). The serial port interrupt directly
affects only the DMA; therefore, no CPU serial port interrupt vector setting is
required.

Note: Serial Port Transmit Synchronization


The DMA uses serial port transmit interrupt XINT0 to synchronize transfers.
Because the XINT0 is generated when the transmit buffer has written the last
bit of data to the shifter, an initial CPU write to the serial port is required to
trigger XINT0 to enable the first DMA transfer.


Example 8–7. DMA Transfer With Serial-Port Transmit Interrupt


* TITLE: DMA TRANSFER WITH SERIAL PORT TRANSMIT INTERRUPT
*
.GLOBAL START
.DATA
DMA .WORD 808000H ; DMA GLOBAL CONTROL REG ADDRESS
CONTROL .WORD 0E13H ; DMA GLOBAL CONTROL REG INITIALIZATION
SOURCE .WORD (_ARRAY+1) ; DATA SOURCE ADDRESS
DESTIN .WORD 808048H ; DATA DESTIN ADDRESS: SERIAL-PORT OUTPUT REG
COUNT .WORD 127 ; NUMBER OF WORDS TO TRANSFER = (MSG LENGTH - 1)
IEVAL .WORD 00100400H ; IE REGISTER VALUE
.BSS _ARRAY,128 ; DATA ARRAY LOCATED IN .BSS SECTION
; THE UNDERSCORE USED IS JUST TO MAKE IT
; ACCESSIBLE FROM C (OPTIONAL)
RESET1 .WORD 0E10H ; DMA RESET
SPORT .WORD 808040H ; SERIAL-PORT GLOBAL CONTROL REG ADDRESS
SGCCTRL .WORD 04880044H ; SERIAL-PORT GLOBAL CONTROL REG INITIALIZATION
SXCTRL .WORD 111H ; SERIAL-PORT TX PORT CONTROL REG INITIALIZATION
STCTRL .WORD 00FH ; SERIAL-PORT TIMER CONTROL REG INITIALIZATION
STPERIOD .WORD 00000002H ; SERIAL-PORT TIMER PERIOD
SPRESET .WORD 00880044H ; SERIAL-PORT RESET
RESET .WORD 0H ; SERIAL-PORT TIMER RESET
.TEXT
START LDP DMA ; LOAD DATA PAGE POINTER
* DMA INITIALIZATION
LDI @DMA,AR0 ; POINT TO DMA GLOBAL CONTROL REGISTER
LDI @SPORT,AR1
LDI @RESET,R0
STI R0,*+AR1(4) ; RESET SPORT TIMER
STI R0,*AR0 ; RESET DMA
STI R0,*AR1 ; RESET SPORT
LDI @SOURCE,R0 ; INITIALIZE DMA SOURCE ADDRESS REGISTER
STI R0,*+AR0(4)
LDI @DESTIN,R0 ; INITIALIZE DMA DESTINATION ADDRESS REGISTER
STI R0,*+AR0(6)
LDI @COUNT,R0 ; INITIALIZE DMA TRANSFER COUNTER REGISTER
STI R0,*+AR0(8)
OR @IEVAL,IE ; ENABLE INTERRUPT FROM DMA TO CPU
OR 2000H,ST ; ENABLE CPU INTERRUPTS GLOBALLY
LDI @CONTROL,R0 ; INITIALIZE DMA GLOBAL CONTROL REGISTER
STI R0,*AR0 ; START DMA TRANSFER


* SERIAL PORT INITIALIZATION


LDI @SXCTRL,R0 ; SERIAL-PORT TX CONTROL REG INITIALIZATION
STI R0,*+AR1(2)
LDI @STPERIOD,R0 ; SERIAL-PORT TIMER PERIOD INITIALIZATION
STI R0,*+AR1(6)
LDI @STCTRL,R0 ; SERIAL-PORT TIMER CONTROL REG INITIALIZATION
STI R0,*+AR1(4)
LDI @SGCCTRL,R0 ; SERIAL-PORT GLOBAL CONTROL REG INITIALIZATION
STI R0,*AR1

* CPU WRITES THE FIRST WORD (TRIGGERING EVENT –––> XINT IS GENERATED)
LDI @SOURCE,AR0
LDI *–AR0(1),R0
STI R0,*+AR1(8)
BU $
.END

Other examples are as follows:

- Transfer a 256-word block of data from off-chip memory to on-chip
memory and generate an interrupt on completion. The order of memory
is to be maintained.

DMA source address:               800000h
DMA destination address:          809800h
DMA transfer counter:             00000100h
DMA global control:               00000C53h
CPU/DMA interrupt enable (IE):    00000400h

- Transfer a 128-word block of data from on-chip memory to off-chip
memory and generate an interrupt on completion. The order of memory
is to be inverted; that is, the highest addressed member of the block is to
become the lowest addressed member.

DMA source address:               809800h
DMA destination address:          800000h
DMA transfer counter:             00000080h
DMA global control:               00000C93h
CPU/DMA interrupt enable (IE):    00000400h

- Transfer a 200-word block of data from the serial-port-0 receive register
to on-chip memory and generate an interrupt on completion. The transfer
is to be synchronized with the serial-port-0 receive interrupt.

DMA source address:               80804Ch
DMA destination address:          809C00h
DMA transfer counter:             000000C8h
DMA global control:               00000D43h
CPU/DMA interrupt enable (IE):    00200400h


- Transfer a 200-word block of data from off-chip memory to the serial-port-0
transmit register and generate an interrupt on completion. The transfer is
to be synchronized with the serial-port-0 transmit interrupt.

DMA source address:               809C00h
DMA destination address:          808048h
DMA transfer counter:             000000C8h
DMA global control:               00000E13h
CPU/DMA interrupt enable (IE):    00100400h

- Transfer data continuously between the serial-port-0 receive register and
the serial-port-0 transmit register to create a digital loop back. The transfer
is to be synchronized with the serial-port-0 receive and transmit interrupts.

DMA source address:               80804Ch
DMA destination address:          808048h
DMA transfer counter:             00000000h
DMA global control:               00000303h
CPU/DMA interrupt enable (IE):    00300000h
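
The global-control values in these examples are easier to check when they are split into fields. The Python sketch below decodes the low 12 bits of a control word. The field names and bit positions are assumptions recalled from the DMA global control register description; they are not reproduced in this subsection, although they agree with every value listed above. Verify them against Table 8–11 before relying on them.

```python
# (name, least-significant bit, width) -- assumed field layout
FIELDS = (
    ("START", 0, 2),
    ("STAT", 2, 2),
    ("INCSRC", 4, 1),
    ("DECSRC", 5, 1),
    ("INCDST", 6, 1),
    ("DECDST", 7, 1),
    ("SYNC", 8, 2),
    ("TC", 10, 1),
    ("TCINT", 11, 1),
)


def decode_global_control(value):
    """Split a DMA global control word into a dict of field values."""
    return {name: (value >> lsb) & ((1 << width) - 1)
            for name, lsb, width in FIELDS}


if __name__ == "__main__":
    for word in (0x0C53, 0x0C93, 0x0D43, 0x0E13, 0x0303):
        print("%04Xh" % word, decode_global_control(word))
```

Under this layout, 00000C93h increments the source and decrements the destination (inverted order), and 00000303h sets SYNC = 1 1 with TC and TCINT cleared, which matches the continuous loop-back case.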

Chapter 9

Pipeline Operation

Two characteristics of the TMS320C3x that contribute to its high performance
are:
- Pipelining, and
- Concurrent I/O and CPU operation.

Five functional units control TMS320C3x operation:
- Fetch
- Decode
- Read
- Execute
- Direct memory access (DMA)

Pipelining is the overlapped, or parallel, operation of the fetch, decode,
read, and execute levels of a basic instruction.

By performing input/output operations, the DMA controller reduces the need
for the CPU to do so, thereby decreasing pipeline interference and enhancing
the CPU’s computational throughput.

Major topics discussed in this chapter are as follows:

9.1 Pipeline Structure
9.2 Pipeline Conflicts
9.3 Resolving Register Conflicts
9.4 Resolving Memory Conflicts
9.5 Clocking of Memory Accesses


9.1 Pipeline Structure


The five major units of the TMS320C3x pipeline structure and their functions
are as follows:

- Fetch Unit (F)
This unit fetches the instruction words from memory and updates the program
counter (PC).

- Decode Unit (D)
This unit decodes the instruction word and performs address generation.
The unit also controls any modifications to the auxiliary registers and the
stack pointer.

- Read Unit (R)
This unit, if required, reads the operands from memory.

- Execute Unit (E)
This unit, if required, reads the operands from the register file, performs
any necessary operation, and writes results to the register file. If required,
the unit writes results of previous operations to memory.

- DMA Channel (DMA)
The DMA channel reads and writes to memory.

A basic instruction has four levels:
- Fetch
- Decode
- Read
- Execute

Figure 9–1 illustrates these four levels of the pipeline structure. The levels are
indexed according to instruction and execution cycle. The perfect overlap in
the pipeline, where all four units operate in parallel, occurs at cycle m. Those
levels about to be executed are at m + 1, and those just executed are at m – 1.
The TMS320C3x pipeline control allows a high-speed execution rate of one
instruction per cycle. It also manages pipeline conflicts so that they are
transparent to the user. You do not need to take any special precautions to
guarantee correct operation.


Figure 9–1. TMS320C3x Pipeline Structure

CYCLE   F   D   R   E
m–3     W   –   –   –
m–2     X   W   –   –
m–1     Y   X   W   –
m       Z   Y   X   W    Perfect overlap
m+1     –   Z   Y   X
m+2     –   –   Z   Y
m+3     –   –   –   Z

D = Decode, E = Execute, F = Fetch, R = Read; W, X, Y, Z = Instruction Representations
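
The conflict-free pattern in Figure 9–1 can be generated for any instruction sequence. The following Python sketch is an illustration; the function name is hypothetical, and it models only the ideal case in which every instruction advances one level per cycle.

```python
def pipeline_table(instructions, stages=("F", "D", "R", "E")):
    """Return one row per cycle; each row lists the instruction occupying
    each stage, or "-" when the stage is empty."""
    count = len(instructions)
    rows = []
    for cycle in range(count + len(stages) - 1):
        row = []
        for depth in range(len(stages)):
            index = cycle - depth  # instruction that entered `depth` cycles ago
            row.append(instructions[index] if 0 <= index < count else "-")
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("F D R E")
    for row in pipeline_table(["W", "X", "Y", "Z"]):
        print(" ".join(row))
```

With four instructions and four levels, the fourth row is the one in which all levels are occupied, corresponding to cycle m in the figure.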

Priorities from highest to lowest have been assigned to each of the functional
units as follows:
1) Execute (highest)
2) Read
3) Decode
4) Fetch
5) DMA (lowest)

When the processing of an instruction is ready to pass to the next higher pipe-
line level, but that level is not ready to accept a new input, a pipeline conflict
occurs. In this case, the lower-priority unit waits until the higher-priority unit
completes its currently executing function.

Despite the DMA controller’s low priority, you can minimize or even eliminate
conflicts with the CPU through suitable data structuring because the DMA con-
troller has its own data and address buses.


9.2 Pipeline Conflicts


The pipeline conflicts of the TMS320C3x can be grouped into the following
categories:

- Branch Conflicts
Branch conflicts involve most of those instructions or operations that read
and/or modify the PC.

- Register Conflicts
Register conflicts involve delays that can occur when reading from or writ-
ing to registers that are used for address generation.

- Memory Conflicts
Memory conflicts occur when the internal units of the TMS320C3x com-
pete for memory resources.

Each of these three categories is discussed in the following sections.
Examples are included. Note that in these examples, when data is refetched or an
operation is repeated, the symbol representing the stage of the pipeline is
appended with a number. For example, if a fetch is performed again, the
instruction mnemonic is repeated. When an access is detained for multiple cycles
because of not ready, the symbols RDY (printed with an overbar) and RDY
(without an overbar) are used to indicate not ready and ready, respectively.

9.2.1 Branch Conflicts


The first class of pipeline conflicts occurs with standard (nondelayed)
branches, that is, BR, Bcond, DBcond, CALL, IDLE, RPTB, RPTS, RETIcond,
RETScond, interrupts, and reset. Conflicts arise with these instructions and
operations because during their execution, the pipeline is used only for the
completion of the operation; other information fetched into the pipeline is dis-
carded or refetched, or the pipeline is inactive. This is referred to as flushing
the pipeline. Flushing the pipeline is necessary in these cases to guarantee
that portions of succeeding instructions do not inadvertently get partially ex-
ecuted. TRAPcond and CALLcond are classified differently from the other
types of branches and are considered later.

Example 9–1 shows the code and pipeline operation for a standard branch.

Note: Dummy Fetch


One dummy fetch (an MPYF instruction) is performed, which affects the
cache. After the branch address is available, a new fetch (an OR instruction)
is performed.


Example 9–1. Standard Branch


BR THREE ; Unconditional branch
MPYF ; Not executed
ADD ; Not executed
SUBF ; Not executed
AND ; Not executed
.
.
.
THREE OR ; Fetched after BR is fetched
STI
.
.

PIPELINE OPERATION

PC      F       D       R       E
n       BR      –       –       –
n+1     MPYF    BR      –       –
n+1     (nop)   (nop)   BR      –
n+1     (nop)   (nop)   (nop)   BR
THREE   OR      (nop)   (nop)   (nop)
        STI     OR      (nop)   (nop)

THREE → PC
Fetch held for new PC value

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter

RPTS and RPTB both flush the pipeline, allowing the RS, RE, and RC registers
to be loaded at the proper time relative to the flow of the pipeline. If these regis-
ters are loaded without the use of RPTS or RPTB, no flushing of the pipeline
occurs. If you are not using any of the repeat modes, then you can use RS, RE,
and RC as general-purpose 32-bit registers and not cause any pipeline con-
flicts. In cases such as the nesting of RPTB due to nested interrupts, it might
be necessary to load and store these registers directly while using the repeat
modes. Since up to four instructions can be fetched before entering the repeat
mode, you should follow loads by a branch to flush the pipeline. If the RC is
changing when an instruction is loading it, the direct load takes priority over
the modification made by the repeat mode logic.


Delayed branches are implemented to guarantee the fetching of the next three
instructions. The delayed branches include BRD, BcondD, and DBcondD.
Example 9–2 shows the code and pipeline operation for a delayed branch.

Example 9–2. Delayed Branch


BRD THREE ; Unconditional delayed branch
MPYF ; Executed
ADDF ; Executed
SUBF ; Executed
AND ; Not executed
.
.
.
THREE MPYF ; Fetched after SUBF is fetched
.
.
.

PIPELINE OPERATION

PC      F       D       R       E
n       BRD     –       –       –
n+1     MPYF    BRD     –       –       No execute delay
n+2     ADDF    MPYF    BRD     –
n+3     SUBF    ADDF    MPYF    BRD
THREE   MPYF    SUBF    ADDF    MPYF

THREE → PC

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter


9.2.2 Register Conflicts


Register conflicts involve reading or writing registers used for addressing.
These conflicts occur when the pertinent register is not ready to be used. Some
conditions under which you can avoid register conflicts are discussed in Sec-
tion 9.3 on page 9-18.

The registers comprise the following three functional groups:

- Group 1
This group includes auxiliary registers (AR0–AR7), index registers (IR0,
IR1), and block size register (BK).

- Group 2
This group includes the data page pointer (DP).

- Group 3
This group includes the system stack pointer (SP).

If an instruction writes to one of these three groups, the decode unit cannot use
any register within that particular group until the write is complete, that is, in-
struction execution is completed. In Example 9–3, an auxiliary register is
loaded, and a different auxiliary register is used on the next instruction. Since
the decode stage needs the result of the write to the auxiliary register, the de-
code of this second instruction is delayed two cycles. Every time the decode
is delayed, a refetch of the program word is performed; that is, the ADDF is
fetched three times. Since these are actual refetches, they can cause not only
conflicts with the DMA controller but also cache hits and misses.


Example 9–3. Write to an AR Followed by an AR for Address Generation


LDI 7,AR1 ; 7 → AR1
NEXT MPYF *AR2,R0 ; Decode delayed 2 cycles
ADDF
FLOAT

PIPELINE OPERATION

PC      F       D       R       E
n       LDI     –       –       –
n+1     MPYF    LDI     –       –
n+2     ADDF    MPYF    LDI     –
n+2     ADDF    MPYF    (nop)   LDI 7,AR1
n+2     ADDF    MPYF    (nop)   (nop)
n+3     FLOAT   ADDF    MPYF    (nop)

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter

The case for reads of these groups is similar to the case for writes. If an
instruction must read a member of one of these groups, the use of that particu-
lar group by the decode for the following instruction is delayed until the read
is complete. The registers are read at the start of the execute cycle and there-
fore require only a one-cycle delay of the following decode. For four registers
(IR0, IR1, BK, or DP), there is no delay. For all other registers, including the
SP, the delay occurs.

In Example 9–4, two auxiliary registers are added together, with the result go-
ing to an extended-precision register. The next instruction uses a different aux-
iliary register as an address register.


Example 9–4. A Read of ARs Followed by ARs for Address Generation


ADDI AR0,AR1,R1 ; AR0 + AR1 → R1
NEXT MPYF *++AR2,R0 ; Decode delayed one cycle
ADDF
FLOAT

PIPELINE OPERATION

PC      F       D       R       E
n       ADDI    –       –       –
n+1     MPYF    ADDI    –       –
n+2     ADDF    MPYF    ADDI    –
n+2     ADDF    MPYF    (nop)   ADDI AR0,AR1,R1
n+3     FLOAT   ADDF    MPYF    (nop)

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter

Loop counter auxiliary registers for the decrement and branch (DBR)
instruction are regarded in the same way as they are for addressing. Therefore,
the operation shown in Example 9–3 and Example 9–4 can also occur for this
instruction.
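
The register-conflict rules in this subsection can be summarized as a small lookup. The Python sketch below is an illustration only; the function name is hypothetical, and it covers just the case shown in Example 9–3 and Example 9–4, in which the instruction that needs the register for decode immediately follows the instruction that writes or reads a register of the same group.

```python
# Register groups as listed in this subsection.
GROUP = {("AR%d" % i): 1 for i in range(8)}
GROUP.update({"IR0": 1, "IR1": 1, "BK": 1, "DP": 2, "SP": 3})

# Reads of these registers do not delay the following decode.
NO_READ_DELAY = {"IR0", "IR1", "BK", "DP"}


def decode_delay(access, reg, decode_reg):
    """Cycles by which the decode of the next instruction is delayed.

    access: "write" or "read", the way the first instruction uses `reg`.
    decode_reg: register the next instruction needs in its decode stage.
    """
    if access not in ("write", "read"):
        raise ValueError("access must be 'write' or 'read'")
    if reg not in GROUP or decode_reg not in GROUP:
        return 0  # not an addressing register: no register conflict
    if GROUP[reg] != GROUP[decode_reg]:
        return 0  # different groups do not interfere
    if access == "write":
        return 2
    return 0 if reg in NO_READ_DELAY else 1


if __name__ == "__main__":
    print(decode_delay("write", "AR1", "AR2"))  # case of Example 9-3
    print(decode_delay("read", "AR0", "AR2"))   # case of Example 9-4
```

Note that the conflict is per group, not per register: writing AR1 delays a following decode that uses AR2, as Example 9–3 shows.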


9.2.3 Memory Conflicts


Memory conflicts can occur when the memory bandwidth of a physical
memory space is exceeded. For example, RAM blocks 0 and 1 and the ROM
block can support only two accesses per cycle. The external interface can sup-
port only one access per cycle. Section 9.4 on page 9-21 contains some condi-
tions under which you can avoid memory conflicts.

Memory pipeline conflicts consist of the following four types:

- Program wait
A program fetch is prevented from beginning.

- Program fetch incomplete
A program fetch has begun but is not yet complete.

- Execute only
An instruction sequence requires three CPU data accesses in a single
cycle.

- Hold everything
A primary or expansion bus operation must complete before another one
can proceed.

These four types of memory conflicts are illustrated in examples and dis-
cussed in the paragraphs that follow.

Program Wait

Two conditions can prevent the program fetch from beginning:

- The start of a CPU data access when:

  - Two CPU data accesses are made to an internal RAM or ROM block,
    and a program fetch from the same block is necessary.
  - One of the external ports is starting a CPU data access, and a program
    fetch from the same port is necessary.

- A multicycle CPU data access or DMA data access over the external bus
is needed.


Example 9–5 illustrates a program wait until a CPU data access completes.
In this case, *AR0 and *AR1 are both pointing to data in RAM block 0, and the
MPYF instruction will be fetched from RAM block 0. This results in the conflict
shown in Example 9–5. Since no more than two accesses can be made to
RAM block 0 in a single cycle, the program fetch cannot begin and must wait
until the CPU data accesses are complete.

Example 9–5. Program Wait Until CPU Data Access Completes


ADDF3 *AR0,*AR1,R0
FIX
MPYF
ADDF3
NEGB

PIPELINE OPERATION

PC F D R E

n ADDF3 — — —

n+1 FIX ADDF3 — —

n+2 (WAIT) FIX ADDF3 —

n+2 MPYF (nop) FIX ADDF3 *AR0,*AR1,R0

n+3 ADDF3 MPYF (nop) FIX

n+4 NEGB ADDF3 MPYF (nop)


D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter
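
The bandwidth limits behind this kind of program wait can be sketched as a
simple check. The following Python model is an assumption-laden illustration:
the space labels (`RAM0`, `PRIMARY`, and so on) and the function name are
hypothetical, and only the per-cycle limits stated in this section (two
accesses for each internal RAM or ROM block, one for each external port) are
encoded.

```python
from collections import Counter

# Accesses per cycle that each memory space can sustain (Section 9.2.3).
# The labels are hypothetical names chosen for this sketch.
BANDWIDTH = {"RAM0": 2, "RAM1": 2, "ROM": 2, "PRIMARY": 1, "EXPANSION": 1}


def program_fetch_must_wait(data_accesses, fetch_space):
    """Return True if the data accesses leave no slot for the program fetch."""
    used = Counter(data_accesses)
    return used[fetch_space] + 1 > BANDWIDTH[fetch_space]
```

With two data accesses to RAM block 0 and a fetch from the same block, as in
Example 9–5, the function reports that the fetch must wait.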

Example 9–6 shows a program wait due to a multicycle CPU data access or
a multicycle DMA access. The ADDF, MPYF, and SUBF are fetched from a
portion of memory other than the external port that the DMA requires. The
DMA begins a multicycle access. The program fetch corresponding to the
CALL is made to the same external port that the DMA is using.

Either of two cases may produce this situation:

- One of the following two memory boundaries is crossed:

  - From 7F FFFFh to 80 0000h, or
  - From 80 9FFFh to 80 A000h.

- Code that has been cached is executed, and the instruction prior to the
  ADDF is one of the following (conditional or unconditional):

  - a delayed branch instruction, or
  - a delayed decrement and branch instruction.


Even though the DMA has the lowest priority, multicycle access cannot be
aborted. The program fetch must therefore wait until the DMA access com-
pletes.

Example 9–6. Program Wait Due to Multicycle Access

PIPELINE OPERATION

PC F D R E

n ADDF — — —

n+1 MPYF ADDF — —

n+2 SUBF MPYF ADDF —

n+3 (WAIT) SUBF MPYF ADDF

n+3 CALL (nop) SUBF MPYF

n+4 — CALL (nop) SUBF


D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter

Program Fetch Incomplete

A program fetch incomplete occurs when a program fetch requires more than
one cycle to complete due to wait states. In Example 9–7, the MPYF and
ADDF are fetched from memory that supports single-cycle accesses. The
SUBF is fetched from memory that requires one wait state. One example
that demonstrates this conflict is a fetch across a bank boundary on the
primary port. See Section 7.4 on page 7-30.

Example 9–7.Multicycle Program Memory Fetches

PIPELINE OPERATION

PC F D R E

n MPYF — — —

n+1 ADDF MPYF — —

n + 2 RDY SUBF ADDF MPYF —

n + 2 RDY SUBF (nop) ADDF MPYF

n+3 ADDI SUBF (nop) ADDF


D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter


Execute Only

The execute only type of memory pipeline conflict occurs when performing an
interlocked load or when a sequence of instructions requires three CPU data
accesses in a single cycle. There are three cases in which this occurs:

- An instruction performs a store and is followed by an instruction that does
  two memory reads.

- An instruction performs two stores and is followed by an instruction that
  performs at least one memory read.

- An interlocked load (LDII or LDFI) instruction is performed, and XF1 = 1.

The first case is shown in Example 9–8. Since this sequence requires three
data memory accesses and only two are available, only the execute phase of
the pipeline is allowed to proceed. The dual reads required by the LDF || LDF
are delayed one cycle. Note that a refetch of the next instruction can occur.

Example 9–8. Single Store Followed by Two Reads


STF R0,*AR1 ; R0 → *AR1
LDF *AR2,R1 ; *AR2 → R1 in parallel with
|| LDF *AR3,R2 ; *AR3 → R2

PIPELINE OPERATION

PC F D R E

n STF — — —

n+1 LDF || LDF STF — —

n+2 W LDF || LDF STF —

n+3 X W LDF || LDF STF

n+4 X W LDF || LDF (nop)

n+4 Y X W LDF || LDF *AR2,R1 and *AR3,R2

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter, W,X,Y = Instruction Representations


Example 9–9 shows a parallel store followed by a single load or read. Since
the two parallel stores are required, the next CPU data memory read must wait
a cycle before beginning. One program memory refetch can occur.

Example 9–9. Parallel Store Followed by Single Read


STF R0,*AR0 ; R0 → *AR0 in parallel with
|| STF R2,*AR1 ; R2 → *AR1
ADDF @SUM,R1 ; R1 + @SUM → R1
IACK
ASH

PIPELINE OPERATION

PC F D R E

n STF || STF — — —

n+1 ADDF STF || STF — —

n+2 IACK ADDF STF || STF —

n+3 ASH IACK ADDF STF || STF

n+4 ASH IACK ADDF (nop)

n+4 — ASH IACK ADDF

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter
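
The first two execute-only cases (Example 9–8 and Example 9–9) both come down
to counting CPU data accesses in one cycle. The Python sketch below expresses
that count under stated assumptions: the function and parameter names are
hypothetical, the limit of two data accesses per cycle is the one quoted in
the text, and the interlocked-load case (XF1 = 1) is deliberately not modeled.

```python
MAX_DATA_ACCESSES_PER_CYCLE = 2  # limit quoted in the text for CPU data accesses


def execute_only_stall(stores_in_execute: int, reads_in_read_phase: int) -> bool:
    """True when an instruction's stores collide with the next instruction's reads."""
    if stores_in_execute == 0 or reads_in_read_phase == 0:
        return False  # no store/read overlap, so this conflict type cannot occur
    total = stores_in_execute + reads_in_read_phase
    return total > MAX_DATA_ACCESSES_PER_CYCLE
```

A single store followed by two reads, or two stores followed by one read,
both exceed the limit and delay the reads by one cycle.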


The final case involves an interlocked load (LDII or LDFI) instruction and XF1
= 1. Since the interlocked loads use the XF1 pin as an acknowledge that the
read can complete, the loads might need to extend the read cycle, as shown
in Example 9–10. Note that a program refetch can occur.

Example 9–10. Interlocked Load


NOT R1,R0
LDII 300h,AR2
ADDI *AR2,R2
CMPI R0,R2

PIPELINE OPERATION

PC F D R E

n NOT — — —

n+1 LDII NOT — —

n+2 ADDI LDII NOT —

n+3 CMPI ADDI LDII NOT

n+3 — CMPI ADDI LDII

n+4 — CMPI ADDI LDII

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter

Hold Everything

Three situations result in hold-everything memory pipeline conflicts:

- A CPU data load or store cannot be performed because an external port is
  busy.

- An external load takes more than one cycle.

- Conditional calls and traps are processed.


The first type of hold everything conflict occurs when one of the external ports
is busy due to an access that has started but is not complete. In Example 9–11,
the first store is a two-cycle store. The CPU writes the data to an external port.
The port control then takes two cycles to complete the data write. The
LDF is a read over the same external port. Since the store is not complete, the
CPU continues to attempt the LDF until the port is available.

Example 9–11. Busy External Port


STF R0,@DMA1
LDF @DMA2,R0

PIPELINE OPERATION

PC F D R E

n STF — — —

n+1 LDF STF — —

n+2 W LDF STF —

n+2 W LDF (nop) STF

n+2 W LDF (nop) (nop)

n+3 X W LDF (nop)

n+4 Y X W LDF

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter, W, X, Y = Instruction Representations


The second type of hold everything conflict involves multicycle data reads. The
read has begun and continues until completed. In Example 9–12, the LDF is
performed from an external memory that requires several cycles to complete.

Example 9–12. Multicycle Data Reads


LDF @DMA,R0

PIPELINE OPERATION

PC F D R E

n LDF — — —

n+1 I LDF — —

n+2 J I LDF —

n+3 K ,(dummy) I LDF —

n+3 K2 J I LDF
D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter, I, J, K = Instruction Representations

The final type of hold everything conflict involves conditional calls and traps,
which are different from the other branch instructions. Whereas the other
branch instructions are conditional loads, the conditional calls and traps are
conditional stores, which require one cycle more than a conditional branch
(see Example 9–13). The added cycle is used to push the return address after
the call condition is evaluated.

Example 9–13. Conditional Calls and Traps

PIPELINE OPERATION

PC F D R E

n CALLcond — — —

n+1 I CALLcond — —

n+1 (nop) (nop) CALLcond —

n+1 (nop) (nop) (nop) CALLcond

n+1 (nop) (nop) (nop) CALLcond

n + 2 / CALLaddr I (nop) (nop) (nop)


D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter, I = Instruction Representation


9.3 Resolving Register Conflicts


If the auxiliary registers (AR7–AR0), the index registers (IR1–IR0), data page
pointer (DP), or stack pointer (SP) are accessed for any reason other than ad-
dress generation, pipeline conflicts associated with the next memory access
can occur. The pipeline conflicts and delays are presented in subsection 9.2
on page 9-4.

Example 9–14, Example 9–15, and Example 9–16 demonstrate either some
common uses of these registers that do not produce a conflict or ways that you
can avoid the conflict.

Example 9–14. Address Generation Update of an AR Followed by an AR for Address Generation

LDF 7.0,R0 ; 7.0 → R0
MPYF *++AR0(IR1),R0
ADDF *AR2,R0
FIX
MPYF
ADDF

PIPELINE OPERATION

PC F D R E

n LDF — — —

n+1 MPYF LDF — —

n+2 ADDF MPYF LDF —

n+3 FIX ADDF MPYF LDF

n+4 MPYF FIX ADDF MPYF

n+5 ADDF MPYF FIX ADDF

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter, W, X, Y, Z = Instruction Representations


Example 9–15. Write to an AR Followed by an AR for Address Generation Without a Pipeline Conflict

LDI @TABLE,AR2
MPYF @VALUE,R1
ADDF R2,R1
MPYF *AR2++,R1
SUBF
STF

PIPELINE OPERATION

PC F D R E

n LDI — — —

n+1 MPYF LDI — —

n+2 ADDF MPYF LDI —

n+3 MPYF ADDF MPYF LDI @TABLE,AR2

n+4 SUBF MPYF ADDF MPYF

n+5 STF SUBF MPYF ADDF

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter


Example 9–16. Write to DP Followed by a Direct Memory Read Without a Pipeline Conflict
LDP TABLE_ADDR
POP R0
LDF *–AR3(2),R1
LDI @TABLE_ADDR,AR0
PUSHF R6
PUSH R4

PIPELINE OPERATION

PC F D R E

n LDP — — —

n+1 POP LDP — —

n+2 LDF POP LDP —

n+3 LDI LDF POP LDP

n+4 PUSHF LDI LDF POP

n+5 PUSH PUSHF LDI LDF

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter


9.4 Resolving Memory Conflicts


If program fetches and data accesses are performed in such a manner that the
resources being used cannot provide the necessary bandwidth, the program
fetch is delayed until the data access is complete. Certain configurations of
program fetch and data accesses yield conditions under which the
TMS320C3x can achieve maximum throughput.

Table 9–1 shows how many accesses can be performed from the different
memory spaces when it is necessary to do a program fetch and a single data
access and still achieve maximum performance (one cycle). As shown in
Table 9–1, four cases achieve one-cycle maximization.

Table 9–1. One Program Fetch and One Data Access for Maximum Performance

          Primary Bus   Accesses From Dual-Access   Expansion Bus† or
Case #    Accesses      Internal Memory             Peripheral Accesses
1         1             1                           –
2         1             –                           1
3         –             2 from any combination      –
                        of internal memory
4         –             1                           1

† The expansion bus is available only on the TMS320C30.
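
Table 9–1 can be read as a lookup from an access mix to a case number. The
Python sketch below transcribes the four rows; the tuple encoding and the
function name are assumptions made for illustration only.

```python
# Keys are (primary bus, dual-access internal memory, expansion/peripheral bus)
# access counts, transcribed from Table 9-1. The encoding is an assumption.
TABLE_9_1 = {
    (1, 1, 0): 1,
    (1, 0, 1): 2,
    (0, 2, 0): 3,
    (0, 1, 1): 4,
}


def table_9_1_case(primary: int, internal: int, expansion: int):
    """Return the Table 9-1 case number, or None if the mix is not listed."""
    return TABLE_9_1.get((primary, internal, expansion))
```

A mix that is not listed, such as two accesses over the primary bus, returns
`None`, meaning the table does not promise single-cycle operation for it.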


Table 9–2 shows how many accesses can be performed from the different
memory spaces when it is necessary to do a program fetch and two data ac-
cesses and still achieve maximum performance (one cycle). Six conditions
achieve this maximization.

Table 9–2. One Program Fetch and Two Data Accesses for Maximum Performance

          Primary Bus   Accesses From Dual-Access        Expansion† or Peripheral
Case #    Accesses      Internal Memory                  Bus Accesses
1         1             2 from any combination           –
                        of internal memory
2†        1 Program     1 Data                           1 Data
3†        1 Data        1 Data                           1 Program
4         –             2 from same internal memory      –
                        block and 1 from a different
                        internal memory block
5         –             3 from different internal        –
                        memory blocks
6         –             2 from any combination           1
                        of internal memory

† The expansion bus is available only on the TMS320C30.
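
The six rows of Table 9–2 can also be written as a classification function.
The following Python sketch is a literal transcription of the table, not a
derived rule. The function name and the space labels are hypothetical:
`"primary"` and `"expansion"` name the external buses, and any other string
(for example `"ram0"`, `"ram1"`, `"rom"`) is treated as an internal block.

```python
def table_9_2_case(program: str, data1: str, data2: str):
    """Return the Table 9-2 case for one program fetch and two data accesses."""
    spaces = [program, data1, data2]
    internal = [s for s in spaces if s not in ("primary", "expansion")]
    n_primary = spaces.count("primary")
    n_expansion = spaces.count("expansion")

    if n_primary == 1 and n_expansion == 0 and len(internal) == 2:
        return 1
    if n_primary == 1 and n_expansion == 1 and len(internal) == 1:
        if program == "primary":
            return 2      # program on primary, data on internal and expansion
        if program == "expansion":
            return 3      # program on expansion, data on primary and internal
        return None       # program from internal memory is not listed
    if len(internal) == 3:
        distinct = len(set(internal))
        if distinct == 2:
            return 4      # two from the same block, one from a different block
        if distinct == 3:
            return 5      # three different blocks
        return None       # three accesses to one block are not listed
    if n_primary == 0 and n_expansion == 1 and len(internal) == 2:
        return 6
    return None
```

Any mix that returns `None` is simply not listed in the table; the sketch makes
no claim about how many cycles such a mix takes.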


9.5 Clocking of Memory Accesses


This section uses the relationship between the internal clock phases (H1 and
H3) and memory accesses to illustrate how the TMS320C3x handles multiple
memory accesses. Whereas the previous section discusses the interaction
between sequences of instructions, this section discusses the flow of data on
an individual instruction basis.

Each major clock period of 60 ns is composed of two minor clock periods of
30 ns, labeled H3 and H1. The active clock period for H3 and H1 is the time
when that signal is high.

[Timing diagram: one major clock period, showing the H1 and H3 clock phases]

The precise operation of memory reads and writes can be defined according
to these minor clock periods. The types of memory operations that can occur
are program fetches, data loads and stores, and DMA accesses.

9.5.1 Program Fetches


Internal program fetches are always performed during H3 unless a single data
store must occur at the same time due to another instruction in the pipeline.
In this case, the program fetch occurs during H1, and the data store during H3.

External program fetches always start at the beginning of H3, with the address
being presented on the external bus. At the end of H1, they are completed with
the latching of the instruction word.


9.5.2 Data Loads and Stores


Four types of instructions perform loads, memory reads, and stores:
- Two-operand instructions,
- Three-operand instructions,
- Multiplier/ALU operation with store instructions, and
- Parallel multiply and add instructions.

See Chapter 5 for detailed information on addressing modes.

As discussed in Chapter 7, the number of bus cycles for external memory
accesses differs in some cases from the number of CPU execution cycles. For
external reads, the number of bus cycles and CPU execution cycles is identical.
For external writes, there are always at least two bus cycles, but unless
there is a port access conflict, there is only one CPU execution cycle. In the
following examples, any difference in the number of bus cycles and CPU
cycles is noted.

Two-Operand Instruction Memory Accesses

Two-operand instructions include all instructions whose bits 31–29 are 000 or
010 (see Figure 9–2). In the case of a data read, bits 15–0 represent the src
operand. Internal data reads are always performed during H1. External data
reads always start at the beginning of H3, with the address being presented
on the external bus; they complete with the latching of the data word at the end
of H1.

Figure 9–2. Two-Operand Instruction Word

31 24 23 16 15 8 7 0

0 X 0 Operation G dst(src) src(dst)

In the case of a data store, bits 15–0 represent the dst operand. Internal data
stores are performed during H3. External data stores always start at the begin-
ning of H3, with the address and data being presented on the external bus.

Three-Operand Instruction Memory Reads

Three-operand instructions include all instructions whose bits 31–29 are 001
(see Figure 9–3). The source operands, src1 and src2, come from either regis-
ters or memory. When one or more of the source operands are from memory,
these instructions are always memory reads.


Figure 9–3. Three-Operand Instruction Word

31 24 23 16 15 8 7 0

0 0 1 Operation T dst src1 src2

If only one of the source operands is from memory (either src1 or src2) and is
located in internal memory, the data is read during H1. If the single memory
source operand is in external memory, the read starts at the beginning of H3,
with the address being presented on the external bus, and completes with the
latching of the data word at the end of H1.

If both source operands are to be fetched from memory, several cases occur.
If both operands are located in internal memory, the src1 read is performed
during H3 and the src2 read during H1, thus completing two memory reads in
a single cycle.

If src1 is in internal memory and src2 is in external memory, the src2 access
begins at the start of H3 and latches at the end of H1. At the same time, the
src1 access to internal memory is performed during H3. Again, two memory
reads are completed in a single cycle.

If src1 is in external memory and src2 is in internal memory, two cycles are nec-
essary to complete the two reads. In the first cycle, both operands are ad-
dressed. Since src1 takes an entire cycle to be read and latched from external
memory, the internal operation on src2 cannot be completed until the second
cycle. Ordering the operands so that src1 is located internally is necessary to
achieve single-cycle execution.

If src1 and src2 are both from external memory, two cycles are required to com-
plete the two reads. In the first cycle, the src1 access is performed and loaded
on the next H3; in the second cycle, the src2 access is performed and loaded
on that cycle’s H1.
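
The four cases for two memory source operands reduce to one question: is src1
internal? The Python sketch below captures this. The function name is
hypothetical, and the model assumes zero-wait-state external memory and no
other conflicts in the pipeline.

```python
def three_operand_read_cycles(src1_external: bool, src2_external: bool) -> int:
    """Cycles needed to read both memory source operands (no wait states assumed)."""
    if not src1_external:
        # src1 is read internally during H3. src2 completes in the same cycle
        # whether it is internal (H1) or external (starts in H3, latched at
        # the end of H1).
        return 1
    # An external src1 read takes the entire first cycle, so the src2 read
    # cannot complete until the second cycle.
    return 2
```

This matches the advice in the text: place the internal operand in src1 when
only one of the two operands is external.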

If src2 is in external memory, src1 is in on-chip or external memory, and the
instruction is immediately preceded by a single store instruction to external
memory, a dummy src2 read can occur between the execution of the store
instruction and the src2 read, regardless of which memory space is accessed
(STRB, MSTRB, or IOSTRB). The dummy read can cause an externally interfaced
FIFO address pointer to be incremented prematurely, thereby causing the loss
of FIFO data. Example 9–17 illustrates how the dummy read can occur.
Example 9–18 offers an alternative code segment that suppresses the dummy
read. In the alternative code segment, the dummy read is eliminated by
swapping the order of the source operands.


Example 9–17. Dummy src2 Read


STI R0,*AR6 ; AR6 points to MSTRB space
ADDI3 *AR3,*AR1,R0 ; AR3 points to on-chip RAM
; AR1 points to MSTRB space

[Timing diagram: H1 and H3 clock phases]

PIPELINE OPERATION
PC F D R E

n STI

n+1 ADDI3 STI

n+2 ADDI3 STI

n+3 — STI R0,*AR6


n+4 — — The read of src2 cannot start until the store is complete.

n+5 ADDI3 — dummy load of src2

n+6 — — second cycle of dummy load

n+7 ADDI3 — actual read of src2 and src1

n+8 ADDI3 *AR3,*AR1,R0

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter

Two cycles are required for the MSTRB store. Two other cycles are required for the
dummy MSTRB read of *AR3 (because the read follows a write). One cycle is required
for an actual MSTRB read of *AR3.


Example 9–18. Operand Swapping Alternative


Switch the operands of the three-operand instruction so that the internal read
is performed first.
STI R0,*AR6 ;AR6 points to MSTRB space
ADDI3 *AR1,*AR3,R0 ;AR3 points to on-chip RAM
;AR1 points to MSTRB space

[Timing diagram: H1 and H3 clock phases]

PIPELINE OPERATION
PC F D R E

n STI

n+1 ADDI3 STI

n+2 ADDI3 STI

n+3 — STI R0,*AR6

n+4 — — The read of src2 cannot start until the store is complete.
n+5 ADDI3 — actual read of src2 and src1

n+6 — — second cycle of src2 read

n+7 — ADDI3 *AR1,*AR3,R0

D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter

Operations with Parallel Stores

The next class of instructions includes every instruction that has a store in par-
allel with another instruction. Bits 31 and 30 for these instructions are equal
to 1 1.

The instruction word format for those operations that perform a multiply or ALU
operation in parallel with a store is shown in Figure 9–4. Whether the store
to dst2 is internal or external, it is performed during H3. Two bus cycles are
required for external stores, but only one CPU cycle is necessary to complete
the write.

If the memory read operation is external, it starts at the beginning of H3 and
latches at the end of H1. If the memory read operation is internal, it is
performed during H1. Note that memory reads are performed by the CPU during
the read (R) phase of the pipeline, and stores are performed during the
execute (E) phase.

Figure 9–4. Multiply or CPU Operation With a Parallel Store

31 24 23 16 15 8 7 0

1 1 Operation dst1 src1 src3 dst2 src2

The instruction word format for those instructions that have parallel stores to
memory is shown in Figure 9–5. If both destination operands, dst1 and dst2,
are located in internal memory, dst1 is stored during H3 and dst2 during H1,
thus completing two memory stores in a single cycle.

If dst1 is in external memory and dst2 is in internal memory, the dst1 store be-
gins at the start of H3. The dst2 store to internal memory is performed during
H1. Two bus cycles are required for the external store, but only one CPU cycle
is necessary to complete the write. Again, two memory stores are completed
in a single cycle.

If dst1 is in internal memory and dst2 is in external memory, an additional bus
cycle is necessary to complete the dst2 store. Only one CPU cycle is necessary
to complete the write, but the port access requires three bus cycles. In the
first cycle, the internal dst1 store is performed during H3, and dst2 is written
to the port during H1. During the next cycle, the dst2 store is performed on the
external bus, beginning in H3, and executes as normal through the following
cycle.

If dst1 and dst2 are both written to external memory, a single CPU cycle is still
all that is necessary to complete the stores. In this case, four bus cycles are
required.

1) In the first cycle, both dst1 and dst2 are written to the port, and the external
bus access for dst1 begins.

2) The store for dst1 is completed on the second cycle, and the store for dst2
begins on the third external bus cycle.

3) Finally, the store for dst2 is completed on the fourth external bus cycle.
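
The cycle counts for two parallel stores can be tabulated in a few lines of
Python. This is a sketch under stated assumptions: the function name is
hypothetical, no port access conflict or wait states are modeled, and the
value of zero external bus cycles for two internal stores is an inference
(the text only states that both stores complete in a single cycle).

```python
def parallel_store_cycles(dst1_external: bool, dst2_external: bool):
    """Return (cpu_cycles, external_bus_cycles) for two parallel stores."""
    if dst1_external and dst2_external:
        bus_cycles = 4   # dst1 takes bus cycles 1-2, dst2 takes bus cycles 3-4
    elif dst2_external:
        bus_cycles = 3   # dst2 reaches the port in H1, then needs two bus cycles
    elif dst1_external:
        bus_cycles = 2   # the external dst1 store begins at the start of H3
    else:
        bus_cycles = 0   # assumption: internal stores use no external bus cycles
    return 1, bus_cycles  # one CPU cycle in every case described in the text
```

In every case the CPU sees a single cycle; only the external port stays busy
for longer, which matters if the next instruction uses the same port.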


Figure 9–5. Two Parallel Stores

31 24 23 16 15 8 7 0

1 1 ST || ST src2 0 0 0 src1 dst1 dst2

Parallel Multiplies and Adds

Memory addressing for parallel multiplies and adds is similar to that for three-
operand instructions. The parallel multiplies and adds include all instructions
whose bits 31–30 = 10 (see Figure 9–6).

For these operations, src3 and src4 are both located in memory. If both
operands are located in internal memory, the src3 read is performed during H3,
and the src4 read is performed during H1, thus completing two memory reads in
a single cycle.

If src3 is in internal memory and src4 is in external memory, the src4 access
begins at the start of H3 and latches at the end of H1. At the same time, the
src3 access to internal memory is performed during H3. Again, two memory
reads are completed in one cycle.

If src3 is in external memory and src4 is in internal memory, two cycles are nec-
essary to complete the two reads. In the first cycle, the internal src4 access
is performed. During the H3 of the next cycle, the src3 access is performed.

If src3 and src4 are both from external memory, two cycles are necessary to
complete the two reads. In the first cycle, the src3 access is performed; in the
second cycle, the src4 access is performed.

Figure 9–6. Parallel Multiplies and Adds

31 24 23 16 15 8 7 0

1 0 Operation P d1 d2 src1 src2 src3 src4
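
The bit patterns quoted in this section for each instruction class can be
collected into one classification routine. The Python sketch below uses only
the values of bits 31–29 and 31–30 given in the text; the function name and
the returned strings are hypothetical, and any pattern the section does not
describe is reported as "other".

```python
def instruction_class(word: int) -> str:
    """Classify a 32-bit instruction word by its most significant bits."""
    top3 = (word >> 29) & 0b111          # bits 31-29
    if top3 in (0b000, 0b010):
        return "two-operand"
    if top3 == 0b001:
        return "three-operand"
    top2 = top3 >> 1                     # bits 31-30
    if top2 == 0b11:
        return "parallel store"
    if top2 == 0b10:
        return "parallel multiply and add"
    return "other"                       # not covered by this section
```

For example, a word whose upper bits are 001 is classified as a three-operand
instruction, so its memory reads follow the rules given for Figure 9–3.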

Chapter 10

Assembly Language Instructions

The TMS320C3x assembly language instruction set supports numeric-intensive,
signal-processing, and general-purpose applications. The instructions
are organized into major groups consisting of load-and-store, two- or
three-operand arithmetic/logical, parallel, program-control, and interlocked
operations instructions. The addressing modes used with the instructions are
described in Chapter 5.

The TMS320C3x instruction set can also use one of 20 condition codes with
any of the 10 conditional instructions, such as LDFcond. This chapter defines
the condition codes and flags.

The assembler allows optional syntax forms to simplify the assembly language
for special-case instructions. These optional forms are listed and explained.

Each of the individual instructions is described and listed in alphabetical order
(see subsection 10.3.2 on page 10-16). Example instructions demonstrate the
special format and explain its content.

This chapter discusses the following major topics:

Topic                                   Page

10.1   Instruction Set                  10-2
10.2   Condition Codes and Flags        10-10
10.3   Individual Instructions          10-14


10.1 Instruction Set


All of the instructions in the TMS320C3x instruction set are one machine word
long, and most require one cycle to execute. In addition to multiply and
accumulate instructions, the TMS320C3x possesses a full complement of
general-purpose instructions.

The instruction set contains 113 instructions organized into the following func-
tional groups:
- Load-and-store
- Two-operand arithmetic/logical
- Three-operand arithmetic/logical
- Program control
- Interlocked operations
- Parallel operations

Each of these groups is discussed in the succeeding subsections.

10.1.1 Load-and-Store Instructions


The TMS320C3x supports 12 load-and-store instructions (see Table 10–1).
These instructions can:
- Load a word from memory into a register,
- Store a word from a register into memory, or
- Manipulate data on the system stack.

Two of these instructions can load data conditionally. This is useful for locating
the maximum or minimum value in a data set. See Section 10.2 on page 10-10
for detailed information on condition codes.

Table 10–1. Load-and-Store Instructions

Instruction  Description                              Instruction  Description
LDE          Load floating-point exponent             POP          Pop integer from stack
LDF          Load floating-point value                POPF         Pop floating-point value from stack
LDFcond      Load floating-point value conditionally  PUSH         Push integer on stack
LDI          Load integer                             PUSHF        Push floating-point value on stack
LDIcond      Load integer conditionally               STF          Store floating-point value
LDM          Load floating-point mantissa             STI          Store integer
LDP          Load data page pointer


10.1.2 Two-Operand Instructions


The TMS320C3x supports 35 two-operand arithmetic and logical instructions.
The two operands are the source and destination. The source operand can be
a memory word, a register, or a part of the instruction word. The destination
operand is always a register.

As shown in Table 10–2, these instructions provide integer, floating-point, or
logical operations, and multiprecision arithmetic.

Table 10–2. Two-Operand Instructions

Instruction  Description                                Instruction  Description
ABSF         Absolute value of a floating-point number  NORM         Normalize floating-point value
ABSI         Absolute value of an integer               NOT          Bitwise logical-complement
ADDC†        Add integers with carry                    OR†          Bitwise logical-OR
ADDF†        Add floating-point values                  RND          Round floating-point value
ADDI†        Add integers                               ROL          Rotate left
AND†         Bitwise logical-AND                        ROLC         Rotate left through carry
ANDN†        Bitwise logical-AND with complement        ROR          Rotate right
ASH†         Arithmetic shift                           RORC         Rotate right through carry
CMPF†        Compare floating-point values              SUBB†        Subtract integers with borrow
CMPI†        Compare integers                           SUBC         Subtract integers conditionally
FIX          Convert floating-point value to integer    SUBF         Subtract floating-point values
FLOAT        Convert integer to floating-point value    SUBI         Subtract integer
LSH†         Logical shift                              SUBRB        Subtract reverse integer with borrow
MPYF†        Multiply floating-point values             SUBRF        Subtract reverse floating-point value
MPYI†        Multiply integers                          SUBRI        Subtract reverse integer
NEGB         Negate integer with borrow                 TSTB†        Test bit fields
NEGF         Negate floating-point value                XOR†         Bitwise exclusive-OR
NEGI         Negate integer

† Two- and three-operand versions


10.1.3 Three-Operand Instructions


Most instructions have only two operands; however, some arithmetic and log-
ical instructions have three-operand versions. The 17 three-operand instruc-
tions allow the TMS320C3x to read two operands from memory or the CPU
register file in a single cycle and store the results in a register. The following
factors differentiate the two- and three-operand instructions:

- Two-operand instructions have a single source operand (or shift count)
  and a destination operand.

- Three-operand instructions can have two source operands (or one source
operand and a count operand) and a destination operand. A source oper-
and can be a memory word or a register. The destination of a three-oper-
and instruction is always a register.

Table 10–3 lists the instructions that have three-operand versions. Note that
you can omit the 3 in the mnemonic from three-operand instructions (see sub-
section 10.3.2 on page 10-16).

Table 10–3. Three-Operand Instructions

Instruction  Description                          Instruction  Description
ADDC3        Add with carry                       MPYF3        Multiply floating-point values
ADDF3        Add floating-point values            MPYI3        Multiply integers
ADDI3        Add integers                         OR3          Bitwise logical-OR
AND3         Bitwise logical-AND                  SUBB3        Subtract integers with borrow
ANDN3        Bitwise logical-AND with complement  SUBF3        Subtract floating-point values
ASH3         Arithmetic shift                     SUBI3        Subtract integers
CMPF3        Compare floating-point values        TSTB3        Test bit fields
CMPI3        Compare integers                     XOR3         Bitwise exclusive-OR
LSH3         Logical shift


10.1.4 Program-Control Instructions


The program-control instruction group consists of the 17 instructions that
affect program flow. The repeat mode allows repetition of a block of code
(RPTB) or of a single line of code (RPTS). Both standard and delayed
(single-cycle) branching are supported. Several of the program-control
instructions are capable of conditional operations (see Section 10.2 on
page 10-10 for detailed information on condition codes). Table 10–4 lists the
program-control instructions.

Table 10–4. Program Control Instructions

Instruction  Description                                    Instruction  Description
Bcond        Branch conditionally (standard)                IDLE         Idle until interrupt
BcondD       Branch conditionally (delayed)                 NOP          No operation
BR           Branch unconditionally (standard)              RETIcond     Return from interrupt conditionally
BRD          Branch unconditionally (delayed)               RETScond     Return from subroutine conditionally
CALL         Call subroutine                                RPTB         Repeat block of instructions
CALLcond     Call subroutine conditionally                  RPTS         Repeat single instruction
DBcond       Decrement and branch conditionally (standard)  SWI          Software interrupt
DBcondD      Decrement and branch conditionally (delayed)   TRAPcond     Trap conditionally
IACK         Interrupt acknowledge

10.1.5 Low-Power Control Instructions


The low-power control instruction group consists of three instructions that
affect the low-power modes. The low-power idle (IDLE2) instruction places the
device in an extremely low-power mode. The divide-clock-by-16 (LOPOWER)
instruction reduces the rate of the input clock by a factor of 16. The
restore-clock-to-regular-speed (MAXSPEED) instruction resumes full-speed
operation. Table 10–5 lists the low-power control instructions.

Table 10–5. Low-Power Control Instructions

Instruction  Description         Instruction  Description
IDLE2        Low-power idle      MAXSPEED     Restore clock to regular speed
LOPOWER      Divide clock by 16


10.1.6 Interlocked-Operations Instructions


The interlocked-operations instructions (Table 10–6) support multiprocessor
communication and, through the use of external signals, allow powerful
synchronization mechanisms. They also guarantee the integrity of the
communication and provide high-speed operation. Refer to Chapter 6 for
examples of the use of interlocked instructions.

Table 10–6. Interlocked Operations Instructions

Instruction   Description                             Instruction   Description
LDFI          Load floating-point value, interlocked  STFI          Store floating-point value, interlocked
LDII          Load integer, interlocked               STII          Store integer, interlocked
SIGI          Signal, interlocked


10.1.7 Parallel-Operations Instructions

The parallel-operations instruction group makes a high degree of parallelism
possible. Some of the TMS320C3x instructions can occur in pairs that are
executed in parallel. These instructions offer the following features:
- Parallel loading of registers
- Parallel arithmetic operations
- Arithmetic/logical instructions used in parallel with a store instruction

Each instruction in a pair is entered as a separate source statement. The
second instruction in the pair must be preceded by two vertical bars (||).
Table 10–7 lists the valid instruction pairs.

Table 10–7. Parallel Instructions

Mnemonic Description
Parallel Arithmetic with Store Instructions
ABSF Absolute value of a floating-point number and store floating-point value
|| STF
ABSI Absolute value of an integer and store integer
|| STI
ADDF3 Add floating-point values and store floating-point value
|| STF
ADDI3 Add integers and store integer
|| STI
AND3 Bitwise logical-AND and store integer
|| STI
ASH3 Arithmetic shift and store integer
|| STI
FIX Convert floating-point to integer and store integer
|| STI
FLOAT Convert integer to floating-point value and store floating-point value
|| STF
LDF Load floating-point value and store floating-point value
|| STF
LDI Load integer and store integer
|| STI
LSH3 Logical shift and store integer
|| STI
MPYF3 Multiply floating-point values and store floating-point value
|| STF
MPYI3 Multiply integer and store integer
|| STI

NEGF Negate floating-point value and store floating-point value
|| STF
NEGI Negate integer and store integer
|| STI
NOT Complement value and store integer
|| STI
OR3 Bitwise logical-OR value and store integer
|| STI
STF Store floating-point values
|| STF
STI Store integers
|| STI
SUBF3 Subtract floating-point value and store floating-point value
|| STF
SUBI3 Subtract integer and store integer
|| STI
XOR3 Bitwise exclusive-OR values and store integer
|| STI
Parallel Load Instructions
LDF Load floating-point
|| LDF
LDI Load integer
|| LDI
Parallel Multiply and Add/Subtract Instructions
MPYF3 Multiply and add floating-point
|| ADDF3
MPYF3 Multiply and subtract floating-point
|| SUBF3
MPYI3 Multiply and add integer
|| ADDI3
MPYI3 Multiply and subtract integer
|| SUBI3


10.1.8 Illegal Instructions


The TMS320C3x has no illegal instruction-detection mechanism. Fetching an
illegal (undefined) opcode can cause the execution of an undefined operation.
Proper use of the TI TMS320 floating-point software tools will not generate an
illegal opcode. Only the following can cause the generation of an illegal op-
code:
- Misuse of the tools
- An error in the ROM code
- Defective RAM


10.2 Condition Codes and Flags


The TMS320C3x provides 20 condition codes (00000–10100, excluding
01011) that you can place in the cond field of any of the conditional instructions,
such as RETScond or LDFcond. The conditions include signed and unsigned
comparisons, comparisons to 0, and comparisons based on the status of indi-
vidual condition flags. Note that all conditional instructions can accept the suf-
fix U to indicate unconditional operation.
Seven condition flags provide information about properties of the result of
arithmetic and logical instructions. The condition flags are stored in the status
register (ST) and are affected by an instruction only when either of the follow-
ing two cases occurs:
- The destination register is one of the extended-precision registers
(R7–R0). (This allows for modification of the registers used for addressing
but does not affect the condition flags during computation.)
- The instruction is one of the compare instructions (CMPF, CMPF3, CMPI,
CMPI3, TSTB, or TSTB3). (This makes it possible to set the condition flags
according to the contents of any of the CPU registers.)
The condition flags can be modified by most instructions when either of the
preceding conditions is established and either of the following two cases oc-
curs:
- A result is generated when the specified operation is performed to infinite
precision. This is appropriate for compare and test instructions that do not
store results in a register. It is also appropriate for arithmetic instructions
that produce underflow or overflow.
- The output is written to the destination register, as shown in Table 10–8.
This is appropriate for other instructions that modify the condition flags.

Table 10–8. Output Value Formats

Type of Operation   Output Format
Floating-point      8-bit exponent, one sign bit, 31-bit fraction
Integer             32-bit integer
Logical             32-bit unsigned integer
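
The floating-point output format in Table 10–8 can be checked against the register values printed in the instruction examples with a small host-side decoder. The following Python sketch is not part of the TMS320C3x tools. It assumes the usual TMS320C3x interpretation of the fields (a two's-complement exponent, an implied most significant bit that is the complement of the sign bit, and an exponent of –128 reserved for zero). The function name decode_extended is hypothetical.

```python
def decode_extended(word40: int) -> float:
    """Decode a 40-bit extended-precision register image to a Python float.

    Assumed layout: bits 39-32 exponent, bit 31 sign, bits 30-0 fraction.
    """
    exp = (word40 >> 32) & 0xFF       # 8-bit exponent field
    sign = (word40 >> 31) & 0x1       # one sign bit
    frac = word40 & 0x7FFFFFFF        # 31-bit fraction
    if exp >= 0x80:                   # interpret exponent as two's complement
        exp -= 0x100
    if exp == -128:                   # assumed: exponent -128 represents zero
        return 0.0
    # Mantissa is 1.f for positive values and -2 + 0.f for negative values.
    mantissa = (-2.0 if sign else 1.0) + frac / 2**31
    return mantissa * 2.0**exp


print(decode_extended(0x0579800000))  # R5 value used in the ADDF example
```

Decoding the operands and result of the ADDF example in this way confirms that 62.375 + 470.3125 = 532.6875.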

Figure 10–1 on page 10-11 shows the condition flags in the low-order bits of
the status register. Following the figure is a list of status register condition flags
and descriptions of how the flags are set by most instructions. For specific de-
tails of the effect of a particular instruction on the condition flags, see the de-
scription of that instruction in subsection 10.3.3 on page 10-18.


Figure 10–1. Status Register


Bits 31–16: reserved (xx)

Bit:     15   14   13   12   11   10   9    8    7    6    5    4    3    2    1    0
Field:   xx   xx   GIE  CC   CE   CF   xx   RM   OVM  LUF  LV   UF   N    Z    V    C
Access:            R/W  R/W  R/W  R/W       R/W  R/W  R/W  R/W  R/W  R/W  R/W  R/W  R/W

NOTE: xx = reserved bit; R = read, W = write

LUF Latched Floating-Point Underflow Condition Flag

LUF is set whenever UF (floating-point underflow flag) is set. LUF can be
cleared only by a processor reset or by modifying it in the status register (ST).

LV Latched Overflow Condition Flag

LV is set whenever V (overflow condition flag) is set. Otherwise, it is
unchanged. LV can be cleared only by a processor reset or by modifying it in
the status register (ST).

UF Floating-Point Underflow Condition Flag

A floating-point underflow occurs whenever the exponent of the result is less
than or equal to –128. If a floating-point underflow occurs, UF is set, and the
output value is set to 0. UF is cleared if a floating-point underflow does not
occur.

N Negative Condition Flag

Logical operations assign N the state of the MSB of the output value. For inte-
ger and floating-point operations, N is set if the result is negative, and cleared
otherwise. Zero is positive.

Z Zero Condition Flag

For logical, integer, and floating-point operations, Z is set if the output is 0 and
cleared otherwise.


V Overflow Condition Flag

For integer operations, V is set if the result does not fit into the format
specified for the destination (that is, if the result lies outside the range
–2^31 ≤ result ≤ 2^31 – 1). Otherwise, V is cleared. For floating-point
operations, V is set if the exponent of the result is greater than 127;
otherwise, V is cleared. Logical operations always clear V.

C Carry Flag

When an integer addition is performed, C is set if a carry occurs out of the bit
corresponding to the MSB of the output. When an integer subtraction is per-
formed, C is set if a borrow occurs into the bit corresponding to the MSB of the
output. Otherwise, for integer operations, C is cleared. The carry flag is unaf-
fected by floating-point and logical operations. For shift instructions, this flag
is set to the final value shifted out; for a 0 shift count, this is set to 0.
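
The flag rules for integer addition can be summarized in a short model. This Python sketch is illustrative only; the function name add_with_flags is hypothetical. It assumes a 32-bit destination in R7–R0, takes N from the stored 32-bit result, and does not model the saturation controlled by OVM.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a: int, b: int, carry_in: int = 0) -> dict:
    """Return the 32-bit sum of a + b + carry_in and the N, Z, V, C flags."""
    a &= MASK32
    b &= MASK32
    total = a + b + carry_in              # exact (unbounded) sum
    result = total & MASK32               # value written to the destination
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    return {
        "result": result,
        "N": sign_r,                                          # MSB of the stored result
        "Z": int(result == 0),                                # result is 0
        "V": int(sign_a == sign_b and sign_r != sign_a),      # signed overflow
        "C": int(total > MASK32),                             # carry out of the MSB
    }


# Operands of ADDC3 Example 1: 102 + (-53) + carry of 1
print(add_with_flags(0x66, 0xFFFFFFCB, 1))
```

With these operands the model produces a result of 32h (50) with C = 1, which agrees with the register and flag values listed for ADDC3 Example 1.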

Table 10–9 lists the condition mnemonic, code, description, and flag for each
of the 20 condition codes.


Table 10–9. Condition Codes and Flags


Condition   Code    Description                           Flag†
Unconditional Compares
U           00000   Unconditional                         Don’t care
Unsigned Compares
LO          00001   Lower than                            C
LS          00010   Lower than or same as                 C OR Z
HI          00011   Higher than                           ∼C AND ∼Z
HS          00100   Higher than or same as                ∼C
EQ          00101   Equal to                              Z
NE          00110   Not equal to                          ∼Z
Signed Compares
LT          00111   Less than                             N
LE          01000   Less than or equal to                 N OR Z
GT          01001   Greater than                          ∼N AND ∼Z
GE          01010   Greater than or equal to              ∼N
EQ          00101   Equal to                              Z
NE          00110   Not equal to                          ∼Z
Compare to Zero
Z           00101   Zero                                  Z
NZ          00110   Not zero                              ∼Z
P           01001   Positive                              ∼N AND ∼Z
N           00111   Negative                              N
NN          01010   Nonnegative                           ∼N
Compare to Condition Flags
NN          01010   Nonnegative                           ∼N
N           00111   Negative                              N
NZ          00110   Nonzero                               ∼Z
Z           00101   Zero                                  Z
NV          01100   No overflow                           ∼V
V           01101   Overflow                              V
NUF         01110   No underflow                          ∼UF
UF          01111   Underflow                             UF
NC          00100   No carry                              ∼C
C           00001   Carry                                 C
NLV         10000   No latched overflow                   ∼LV
LV          10001   Latched overflow                      LV
NLUF        10010   No latched floating-point underflow   ∼LUF
LUF         10011   Latched floating-point underflow      LUF
ZUF         10100   Zero or floating-point underflow      Z OR UF
† ∼ = logical complement (not-true condition)
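
Table 10–9 can be transcribed directly into a lookup that evaluates a condition against a set of flag values. The Python sketch below is an illustration, not a TI tool; the names CONDITIONS, ALIASES, and condition_true are hypothetical. Mnemonics that share a code in the table (for example, C and LO) are treated as aliases.

```python
# Condition mnemonic -> (5-bit code, test on the ST flags), transcribed from Table 10-9.
CONDITIONS = {
    "U":    (0b00000, lambda f: True),
    "LO":   (0b00001, lambda f: f["C"]),
    "LS":   (0b00010, lambda f: f["C"] or f["Z"]),
    "HI":   (0b00011, lambda f: not f["C"] and not f["Z"]),
    "HS":   (0b00100, lambda f: not f["C"]),
    "EQ":   (0b00101, lambda f: f["Z"]),
    "NE":   (0b00110, lambda f: not f["Z"]),
    "LT":   (0b00111, lambda f: f["N"]),
    "LE":   (0b01000, lambda f: f["N"] or f["Z"]),
    "GT":   (0b01001, lambda f: not f["N"] and not f["Z"]),
    "GE":   (0b01010, lambda f: not f["N"]),
    "NV":   (0b01100, lambda f: not f["V"]),
    "V":    (0b01101, lambda f: f["V"]),
    "NUF":  (0b01110, lambda f: not f["UF"]),
    "UF":   (0b01111, lambda f: f["UF"]),
    "NLV":  (0b10000, lambda f: not f["LV"]),
    "LV":   (0b10001, lambda f: f["LV"]),
    "NLUF": (0b10010, lambda f: not f["LUF"]),
    "LUF":  (0b10011, lambda f: f["LUF"]),
    "ZUF":  (0b10100, lambda f: f["Z"] or f["UF"]),
}

# Alternate mnemonics that Table 10-9 lists with the same code.
ALIASES = {"C": "LO", "NC": "HS", "Z": "EQ", "NZ": "NE",
           "N": "LT", "NN": "GE", "P": "GT"}


def condition_true(mnemonic: str, flags: dict) -> bool:
    """Evaluate a condition mnemonic against a dict of flag values (0 or 1)."""
    name = ALIASES.get(mnemonic, mnemonic)
    _code, test = CONDITIONS[name]
    return bool(test(flags))


flags = {"LUF": 0, "LV": 0, "UF": 0, "N": 0, "Z": 0, "V": 0, "C": 1}
print(condition_true("LO", flags), condition_true("HI", flags))
```

The dictionary holds exactly the 20 distinct codes; code 01011 does not appear, in agreement with the range stated at the beginning of this section.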


10.3 Individual Instructions


This section contains the individual assembly language instructions for the
TMS320C3x. The instructions are listed in alphabetical order. Information for
each instruction includes assembler syntax, operation, operands, encoding,
description, cycles, status bits, mode bit, and examples.

Definitions of the symbols and abbreviations, as well as optional syntax forms
allowed by the assembler, precede the individual instruction description
section. Also, an example instruction shows the special format used and
explains its content.

A functional grouping of the instructions, as well as a complete instruction
set summary, can be found in Section 10.1 on page 10-2. Appendix A lists the
opcodes for all of the instructions. Refer to Chapter 5 for information on
memory addressing. Code examples using many of the instructions are provided
in Chapter 11.

10.3.1 Symbols and Abbreviations


Table 10–10 lists the symbols and abbreviations used in the individual instruc-
tion descriptions.


Table 10–10. Instruction Symbols

Symbol       Meaning
src          Source operand
src1         Source operand 1
src2         Source operand 2
src3         Source operand 3
src4         Source operand 4
dst          Destination operand
dst1         Destination operand 1
dst2         Destination operand 2
disp         Displacement
cond         Condition
count        Shift count
G            General addressing modes
T            Three-operand addressing modes
P            Parallel addressing modes
B            Conditional-branch addressing modes
|x|          Absolute value of x
x→y          Assign the value of x to destination y
x(man)       Mantissa field (sign + fraction) of x
x(exp)       Exponent field of x
op1 || op2   Operation 1 performed in parallel with operation 2
x AND y      Bitwise logical-AND of x and y
x OR y       Bitwise logical-OR of x and y
x XOR y      Bitwise logical-XOR of x and y
∼x           Bitwise logical-complement of x
x << y       Shift x to the left y bits
x >> y       Shift x to the right y bits
*++SP        Increment SP and use incremented SP as address
*SP– –       Use SP as address and decrement SP
ARn          Auxiliary register n
IRn          Index register n
Rn           Register address n
RC           Repeat count register
RE           Repeat end address register
RS           Repeat start address register
ST           Status register
C            Carry bit
GIE          Global interrupt enable bit
N            Trap vector
PC           Program counter
RM           Repeat mode flag
SP           System stack pointer


10.3.2 Optional Assembler Syntax


The assembler allows a relaxed syntax form for some instructions. These op-
tional forms simplify the assembly language so that special-case syntax can
be ignored. Following is a list of these optional syntax forms.
- You can omit the destination register on unary arithmetic and logical oper-
ations when the same register is used as a source. For example,
ABSI R0,R0 can be written as ABSI R0.
Instructions affected: ABSI, ABSF, FIX, FLOAT, NEGB, NEGF, NEGI,
NORM, NOT, RND
- You can write all three-operand instructions without the 3. For example,
ADDI3 R0,R1,R2 can be written as ADDI R0,R1,R2.
Instructions affected: ADDC3, ADDF3, ADDI3, AND3, ANDN3, ASH3,
LSH3, MPYF3, MPYI3, OR3, SUBB3, SUBF3, SUBI3, XOR3
This also applies to all of the pertinent parallel instructions.
- You can write all three-operand comparison instructions without the 3. For
example,
CMPI3 R0,*AR0 can be written as CMPI R0,*AR0.
Instructions affected: CMPI3, CMPF3, TSTB3
- Indirect operands with an explicit 0 displacement are allowed. In three-op-
erand or parallel instructions, operands with 0 displacement are automati-
cally converted to no-displacement mode. For example:
LDI *+AR0(0),R1 is legal.
Also
ADDI3 *+AR0(0),R1,R2 is equivalent to ADDI3 *AR0,R1,R2.
- You can write indirect operands with no displacement, in which case a dis-
placement of 1 is assumed. For example,
LDI *AR0++(1),R0 can be written as LDI *AR0++,R0.
- All conditional instructions accept the suffix U to indicate unconditional op-
eration. Also, you can omit the U from unconditional short branch instruc-
tions. For example:
BU label can be written as B label.
- You can write labels with or without a trailing colon. For example:
label0: NOP
label1 NOP
label2: (Label assembles to next source line.)


- Empty expressions are not allowed for the displacement in indirect mode:
LDI *+AR0(),R0 is not legal.

- You can precede long immediate mode operands (destination of BR and
CALL) with an @ sign:
BR label can be written as BR @label.

- You can use the LDP pseudo-op to load a register (usually DP) with the
eight MSBs of a relocatable address:
LDP addr,REG or LDP @addr,REG
The @ sign is optional.
If the destination REG is the DP, you can omit the DP in the operand. LDP
generates an LDI instruction with an immediate operand and a special re-
location type.

- You can write parallel instructions in either order. For example:
   ADDI        can be written as        STI
|| STI                               || ADDI.

- You can write the parallel bars indicating part 2 of a parallel instruction
anywhere on the line from column 0 to the mnemonic. For example:
   ADDI        can be written as        ADDI
|| STI                                     || STI.

- If the second operand of a parallel instruction is the same as the third
(destination register) operand, you can omit the third operand. This allows
you to write three-operand parallel instructions that look like normal
two-operand instructions. For example,
   ADDI *AR0,R2,R2      can be written as      ADDI *AR0,R2
|| MPYI *AR1,R0,R0                          || MPYI *AR1,R0.
Instructions affected (applies to all parallel instructions that have a
register second operand): ADDI, ADDF, AND, MPYI, MPYF, OR, SUBI, SUBF,
and XOR.

- You can write all commutative operations in parallel instructions in either
order. For example, you can write the ADDI part of a parallel instruction in
either of two ways:
ADDI *AR0,R1,R2 or ADDI R1,*AR0,R2.
Instructions affected: parallel instructions containing any of ADDI, ADDF,
MPYI, MPYF, AND, OR, and XOR.


- Use the syntax in Table 10–11 to designate CPU registers in operands.
Note the alternate notation Rn, 0 ≤ n ≤ 27, which is used to designate
any CPU register.

Table 10–11. CPU Register Syntax


Assembler   Alternate
Syntax      Register Syntax   Assigned Function
R0          R0                Extended-precision register
R1          R1                Extended-precision register
R2          R2                Extended-precision register
R3          R3                Extended-precision register
R4          R4                Extended-precision register
R5          R5                Extended-precision register
R6          R6                Extended-precision register
R7          R7                Extended-precision register
AR0         R8                Auxiliary register
AR1         R9                Auxiliary register
AR2         R10               Auxiliary register
AR3         R11               Auxiliary register
AR4         R12               Auxiliary register
AR5         R13               Auxiliary register
AR6         R14               Auxiliary register
AR7         R15               Auxiliary register
DP          R16               Data-page pointer
IR0         R17               Index register 0
IR1         R18               Index register 1
BK          R19               Block-size register
SP          R20               Active stack pointer
ST          R21               Status register
IE          R22               CPU/DMA interrupt enable
IF          R23               CPU interrupt flags
IOF         R24               I/O flags
RS          R25               Repeat start address
RE          R26               Repeat end address
RC          R27               Repeat counter
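
Because the alternate syntax in Table 10–11 is simply the position of each register in the list, the mapping can be generated rather than memorized. The following Python sketch is illustrative; the names CPU_REGISTERS and alternate_name are hypothetical and are not part of the assembler.

```python
# Register names in the order given by Table 10-11 (R0 ... R27).
CPU_REGISTERS = (
    [f"R{n}" for n in range(8)]        # extended-precision registers
    + [f"AR{n}" for n in range(8)]     # auxiliary registers
    + ["DP", "IR0", "IR1", "BK", "SP", "ST", "IE", "IF", "IOF", "RS", "RE", "RC"]
)


def alternate_name(name: str) -> str:
    """Return the alternate Rn notation for an assembler register name."""
    return f"R{CPU_REGISTERS.index(name.upper())}"


print(alternate_name("AR0"), alternate_name("DP"), alternate_name("RC"))
```

The list contains 28 names, which matches the range 0 ≤ n ≤ 27 of the alternate notation.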

10.3.3 Individual Instruction Descriptions


Each assembly language instruction for the TMS320C3x is described in
this section in alphabetical order. The description includes the assembler syn-
tax, operation, operands, encoding, description, cycles, status bits, mode bit,
and examples.

Example Instruction EXAMPLE

Syntax INST src, dst

or

INST1 src2, dst1
|| INST2 src3, dst2

Each instruction begins with an assembler syntax expression. You can place
labels either before the command (instruction mnemonic) on the same line or
on the preceding line in the first column. The optional comment field that con-
cludes the syntax is not included in the syntax expression. Space(s) are
required between each field (label, command, operand, and comment fields).

The syntax examples illustrate the common one-line syntax and the two-line
syntax used in parallel addressing. Note that the two vertical bars || that indi-
cate a parallel addressing pair can be placed anywhere before the mnemonic
on the second line. The first instruction in the pair can have a label, but the sec-
ond instruction cannot have a label.

Operation |src | → dst

or

|src2 | → dst1
|| src3 → dst2

The instruction operation sequence describes the processing that occurs
when the instruction is executed. For parallel instructions, the operation
sequence is performed in parallel. For conditional instructions such as
Bcond, the conditional effects of modes specified in the status register are
listed.

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 27)
or
src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Operands are defined according to the addressing mode and/or the type of ad-
dressing used. Note that indirect addressing uses displacements and the in-
dex registers. Refer to Chapter 5 for detailed information on addressing.


Encoding
31 24 23 16 15 8 7 0

0 0 0 INST G dst src

or
31 24 23 16 15 8 7 0

1 1 INST1||INST2 dst1 0 0 0 src3 dst2 src2

Encoding examples are shown using general addressing and parallel addres-
sing. The instruction pair for the parallel addressing example consists of
INST1 and INST2.
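
The general-addressing encoding can be illustrated by packing the fields into a 32-bit word. The Python sketch below is not taken from the TI tools. It assumes the field boundaries that the diagram implies for the TMS320C3x (bits 31–29 = 000, bits 28–23 = opcode, bits 22–21 = G, bits 20–16 = dst, bits 15–0 = src), and the function name encode_general is hypothetical. Only register-mode operands (G = 00) are exercised, where the src field holds a register number.

```python
def encode_general(opcode: int, g: int, dst: int, src: int) -> int:
    """Pack a general-addressing instruction word (assumed field layout)."""
    if not 0 <= opcode < 64:
        raise ValueError("opcode must fit in 6 bits")
    if not 0 <= g < 4:
        raise ValueError("G must fit in 2 bits")
    if not 0 <= dst < 32:
        raise ValueError("dst must fit in 5 bits")
    if not 0 <= src <= 0xFFFF:
        raise ValueError("src must fit in 16 bits")
    # Bits 31-29 are 0 for this instruction format.
    return (opcode << 23) | (g << 21) | (dst << 16) | src


# ABSF R4,R7: opcode bits 000000, register mode, dst = R7, src = R4
print(f"{encode_general(0b000000, 0b00, 7, 4):08X}")
```

The opcode values used here are read from the bit patterns printed in the Encoding field of the ABSF and ABSI descriptions.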

Description Instruction execution and its effect on the rest of the processor or memory con-
tents is described. Any constraints on the operands imposed by the processor
or the assembler are discussed. The description parallels and supplements
the information given by the operation block.

Cycles 1

The digit specifies the number of cycles required to execute the instruction.

Status Bits LUF Latched Floating-Point Underflow Condition Flag. 1 if a
floating-point underflow occurs; unchanged otherwise.
LV Latched Overflow Condition Flag. 1 if an integer or floating-point
overflow occurs; unchanged otherwise.
UF Floating-Point Underflow Condition Flag. 1 if a floating-point un-
derflow occurs; 0 otherwise.
N Negative Condition Flag. 1 if a negative result is generated; 0 other-
wise. In some instructions, this flag is the MSB of the output.
Z Zero Condition Flag. 1 if a 0 result is generated; 0 otherwise. For log-
ical and shift instructions, 1 if a 0 output is generated; 0 otherwise.
V Overflow Condition Flag. 1 if an integer or floating-point overflow oc-
curs; 0 otherwise.
C Carry Flag. 1 if a carry or borrow occurs; 0 otherwise. For shift instruc-
tions, this flag is set to the value of the last bit shifted out; 0 for a shift
count of 0.

The seven condition flags stored in the status register (ST) are modified by the
majority of instructions only if the destination register is R7–R0. The flags pro-
vide information about the properties of the result or the output of arithmetic
or logical operations.


Mode Bit OVM Overflow Mode Flag. In general, integer operations are affected by the
OVM bit value (described in Table 3–2 on page 3-6).

Example INST @98AEh,R5

Before Instruction:

DP = 80h
R5 = 0766900000h = 2.30562500e+02
Memory at 8098AEh = 5CDFh = 1.00001107e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

DP = 80h
R5 = 0066900000h = 1.80126953e + 00
Memory at 8098AEh = 5CDFh = 1.00001107e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0

The sample code presented in the above format shows the effect of the code
on system pointers (for example, DP or SP), registers (for example, R1 or R5),
memory at specific locations, and the seven status bits. The values given for
the registers include the leading 0s to show the exponent in floating-point oper-
ations. Decimal conversions are provided for all register and memory loca-
tions. The seven status bits are listed in the order in which they appear in the
assembler and simulator (see Section 10.2 on page 10-10 and Table 10–9 on
page 10-13 for further information on these seven status bits).
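
In the example above, the operand @98AEh with DP = 80h refers to memory location 8098AEh. The Python sketch below shows that address formation, together with the value that the LDP pseudo-op described in subsection 10.3.2 would load. It assumes a 24-bit address built from the eight LSBs of DP and a 16-bit direct operand; the function names are hypothetical.

```python
def direct_address(dp: int, operand16: int) -> int:
    """Form a direct address: the eight LSBs of DP followed by the 16-bit operand."""
    return ((dp & 0xFF) << 16) | (operand16 & 0xFFFF)


def ldp_value(address: int) -> int:
    """Return the eight MSBs of a 24-bit address (the page value LDP loads)."""
    return (address >> 16) & 0xFF


print(f"{direct_address(0x80, 0x98AE):06X}h")   # address used in the example
print(f"{ldp_value(0x8098AE):02X}h")            # page value for that address
```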



ABSF Absolute Value of Floating-Point

Syntax ABSF src, dst

Operation |src| → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 0 0 0 0 G dst src

Description The absolute value of the src operand is loaded into the dst register. The src
and dst operands are assumed to be floating-point numbers.

An overflow occurs if src (man) = 80000000h and src (exp) = 7Fh. The result
is dst (man) = 7FFFFFFFh and dst (exp) = 7Fh.
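
The overflow case can be seen by working directly on the exponent and mantissa fields. The following Python sketch models ABSF on those fields. It assumes the TMS320C3x mantissa interpretation (1.f for a positive value and –2 + 0.f for a negative value) and an exponent of –128 for zero; the function name absf is hypothetical.

```python
def absf(exp: int, man: int):
    """Model ABSF on (exp, man); return (exp, man, overflow).

    exp is the signed exponent (-128..127); man is the 32-bit mantissa
    field (sign bit plus 31-bit fraction).
    """
    if exp == -128:                       # assumed encoding of zero
        return -128, 0, False
    if not man & 0x80000000:              # already positive
        return exp, man, False
    frac = man & 0x7FFFFFFF
    if frac:                              # |(-2 + 0.f)| = 1 + (1 - 0.f)
        return exp, 0x80000000 - frac, False
    if exp == 127:                        # |-2 x 2^127| is not representable
        return 127, 0x7FFFFFFF, True      # saturate to the largest positive value
    return exp + 1, 0, False              # |-2 x 2^e| = 1.0 x 2^(e+1)


print(absf(0x5C, 0x8000F971))             # R4 operand from the ABSF example
```

The first case reproduces the R7 result of the ABSF example (mantissa 7FFF068Fh with exponent 5Ch); the saturating case is the overflow described above.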

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7– R0.
LUF Unaffected
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 0
N 0
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example ABSF R4,R7

Before Instruction:

R4 = 05C8000F971h = –9.90337307e + 27
R7 = 07D251100AEh = 5.48527255e + 37
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:

R4 = 05C8000F971h = –9.90337307e + 27
R7 = 05C7FFF068Fh = 9.90337307e + 27
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Parallel ABSF and STF ABSF||STF

Syntax ABSF src2, dst1


|| STF src3, dst2

Operation |src2 | → dst1


|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)


dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding

31 24 23 16 15 8 7 0

1 1 0 0 1 0 0 dst1 0 0 0 src3 dst2 src2

Description A floating-point absolute value and a floating-point store are performed in
parallel. All registers are read at the beginning and loaded at the end of the
execute cycle. This means that if one of the parallel operations (STF) reads
from a register and the operation being performed in parallel (ABSF) writes to
the same register, STF accepts as input the contents of the register before it
is modified by the ABSF.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.
If src3 and dst1 point to the same register, src3 is read before the write to dst1.

An overflow occurs if src (man) = 80000000h and src (exp) = 7Fh. The result
is dst (man) = 7FFFFFFFh and dst (exp) = 7Fh.
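
The read-before-write rule can be modeled as two phases: every source is sampled first, and only then are the destinations updated. The Python sketch below illustrates this with ordinary Python floats in place of the device formats. The function name and the dictionary-based register file and memory are hypothetical.

```python
def absf_parallel_stf(regs: dict, mem: dict, src2_addr: int, dst1: str,
                      src3: str, dst2_addr: int) -> None:
    """Model ABSF src2,dst1 || STF src3,dst2 with read-before-write ordering."""
    # Read phase: sample every source before anything is written.
    loaded = mem[src2_addr]
    stored = regs[src3]
    # Write phase: update both destinations.
    regs[dst1] = abs(loaded)
    mem[dst2_addr] = stored


# Values from the ABSF||STF example: R4 is both the STF source and the ABSF destination.
regs = {"R4": 179.75}
mem = {0x8098AF: -61.1875, 0x8098C4: 0.0}
absf_parallel_stf(regs, mem, 0x8098AF, "R4", "R4", 0x8098C4)
print(regs["R4"], mem[0x8098C4])
```

STF stores the old contents of R4 (179.75), and R4 then receives the new absolute value (61.1875), as in the example for this instruction.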

Cycles 1

Status Bits LUF Unaffected


LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 0
N 0
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected

Mode Bit OVM Operation is not affected by OVM bit value.

Example ABSF *++AR3(IR1),R4
|| STF R4,*–AR7(1)


Before Instruction:

AR3 = 809800h
IR1 = 0AFh
R4 = 733C00000h = 1.79750e + 02
AR7 = 8098C5h
Data at 8098AFh = 58B4000h = – 6.118750e + 01
Data at 8098C4h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR3 = 8098AFh
IR1 = 0AFh
R4 = 574C00000h = 6.118750e + 01
AR7 = 8098C5h
Data at 8098AFh = 58B4000h = –6.118750e + 01
Data at 8098C4h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Absolute Value of Integer ABSI

Syntax ABSI src, dst

Operation |src| → dst

Operands src general addressing modes (G):


00 any CPU register
01 direct
10 indirect
11 immediate

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 0 0 0 1 G dst src

Description The absolute value of the src operand is loaded into the dst register. The src
and dst operands are assumed to be signed integers.

An overflow occurs if src = 80000000h. If ST(OVM) = 1, the result is
dst = 7FFFFFFFh. If ST(OVM) = 0, the result is dst = 80000000h.
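
The effect of the OVM bit on the single overflow case can be expressed in a few lines. This Python sketch models only the result value and the overflow indication; the function name absi is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def absi(src: int, ovm: int):
    """Model ABSI on a 32-bit two's-complement value; return (dst, overflow)."""
    src &= MASK32
    if src == 0x80000000:                 # the magnitude does not fit in 32 bits
        return (0x7FFFFFFF if ovm else 0x80000000), True
    if src & 0x80000000:                  # negative: two's-complement negate
        return (-src) & MASK32, False
    return src, False                     # already nonnegative


print(absi(0xFFFFFFCB, 0))                # -53, as in Example 1
print(absi(0x80000000, 1))                # overflow with saturation enabled
```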

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7– R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 0
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.

Example 1 ABSI R0,R0


or
ABSI R0

Before Instruction:

R0 = 0FFFFFFCBh = – 53

After Instruction:

R0 = 035h = 53


Example 2 ABSI *AR1,R3

Before Instruction:

AR1 = 20h
R3 = 0h
Data at 20h = 0FFFFFFCBh = – 53

After Instruction:

AR1 = 20h
R3 = 35h = 53
Data at 20h = 0FFFFFFCBh = – 53

Parallel ABSI and STI ABSI||STI

Syntax ABSI src2, dst1

|| STI src3, dst2

Operation |src2 | → dst1


|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)


dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 0 0 1 0 1 dst1 0 0 0 src3 dst2 src2

Description An integer absolute value and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle.
This means that, if one of the parallel operations (STI) reads from a register
and the operation being performed in parallel (ABSI) writes to the same regis-
ter, STI accepts as input the contents of the register before it is modified by the
ABSI.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

An overflow occurs if src = 80000000h. If ST(OVM) = 1, the result is
dst = 7FFFFFFFh. If ST(OVM) = 0, the result is dst = 80000000h.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 0
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.


Example ABSI *–AR5(1),R5


|| STI R1,*AR2– –(IR1)

Before Instruction:

AR5 = 8099E2h
R5 = 0h
R1 = 42h = 66
AR2 = 8098FFh
IR1 = 0Fh
Data at 8099E1h = 0FFFFFFCBh = – 53
Data at 8098FFh = 2h = 2
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR5 = 8099E2h
R5 = 35h = 53
R1 = 42h = 66
AR2 = 8098F0h
IR1 = 0Fh
Data at 8099E1h = 0FFFFFFCBh = – 53
Data at 8098FFh = 42h = 66
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.
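
The examples in this section use several indirect operand forms, and the changes to the auxiliary registers follow from whether the displacement is applied before or after the access. The Python sketch below models six of these forms. It is an illustration based on the addressing rules of Chapter 5 and assumes 24-bit address arithmetic without circular or bit-reversed addressing; the function name indirect and the form strings are hypothetical.

```python
ADDRESS_MASK = 0xFFFFFF   # assumed 24-bit address arithmetic


def indirect(ar: int, disp: int, form: str):
    """Return (effective address, new ARn value) for an indirect operand form."""
    if form == "*+ARn":        # predisplacement add, ARn unchanged
        return (ar + disp) & ADDRESS_MASK, ar
    if form == "*-ARn":        # predisplacement subtract, ARn unchanged
        return (ar - disp) & ADDRESS_MASK, ar
    if form == "*++ARn":       # add first, then use and keep the new value
        new = (ar + disp) & ADDRESS_MASK
        return new, new
    if form == "*--ARn":       # subtract first, then use and keep the new value
        new = (ar - disp) & ADDRESS_MASK
        return new, new
    if form == "*ARn++":       # use ARn, then add
        return ar, (ar + disp) & ADDRESS_MASK
    if form == "*ARn--":       # use ARn, then subtract
        return ar, (ar - disp) & ADDRESS_MASK
    raise ValueError(f"unknown form: {form}")


# *AR2--(IR1) from the ABSI||STI example: AR2 = 8098FFh, IR1 = 0Fh
addr, ar2 = indirect(0x8098FF, 0x0F, "*ARn--")
print(f"{addr:06X}h {ar2:06X}h")
```

The same function reproduces the other examples: *++AR3(IR1) with AR3 = 809800h and IR1 = 0AFh accesses 8098AFh and leaves AR3 = 8098AFh, and *–AR5(1) with AR5 = 8099E2h accesses 8099E1h without changing AR5.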

Add Integer With Carry ADDC

Syntax ADDC src, dst

Operation dst + src + C → dst

Operands src general addressing modes (G):

00 any CPU register


01 direct
10 indirect
11 immediate

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 0 0 1 0 G dst src

Description The sum of the dst and src operands and the carry (C) flag is loaded into the
dst register. The dst and src operands are assumed to be signed integers.
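
One common use of an add-with-carry instruction is multiword addition: the low-order words are added first, and the carry they produce is included when the high-order words are added. This use is given here as an illustration and is not stated in this section. The Python sketch models a 64-bit addition built from two 32-bit steps; the function name add64 is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add64(a_hi: int, a_lo: int, b_hi: int, b_lo: int):
    """Add two 64-bit values held as (high word, low word) pairs."""
    low_sum = (a_lo & MASK32) + (b_lo & MASK32)   # first step: plain integer add
    carry = low_sum >> 32                         # C flag produced by that add
    lo = low_sum & MASK32
    # Second step: dst + src + C, the operation performed by ADDC.
    hi = ((a_hi & MASK32) + (b_hi & MASK32) + carry) & MASK32
    return hi, lo


hi, lo = add64(0x00000001, 0xFFFFFFFF, 0x00000000, 0x00000001)
print(f"{hi:08X}{lo:08X}h")
```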

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a carry occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.

Example ADDC R1,R5

Before Instruction:

R1 = 00FFFF5C25h = – 41,947
R5 = 00FFFF019Eh = – 65,122
LUF LV UF N Z V C = 0 0 0 0 0 0 1

After Instruction:

R1 = 00FFFF5C25h = – 41,947
R5 = 00FFFE5DC4h = – 107,068
LUF LV UF N Z V C = 0 0 0 1 0 0 1



ADDC3 Add Integer With Carry, 3-Operand

Syntax ADDC3 src2, src1, dst

Operation src1 + src2 + C → dst

Operands src1 three-operand addressing modes (T):


00 any CPU register
01 indirect (disp = 0, 1, IR0, IR1)
10 any CPU register
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


00 any CPU register
01 any CPU register
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 0 0 0 0 T dst src1 src2

Description The sum of the src1 and src2 operands and the carry (C) flag is loaded into
the dst register. The src1, src2, and dst operands are assumed to be signed
integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a carry occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.


Example 1 ADDC3 *AR5++(IR0),R5,R2


or
ADDC3 R5,*AR5++(IR0),R2

Before Instruction:

AR5 = 809908h
IR0 = 10h
R5 = 066h = 102
R2 = 0h
Data at 809908h = 0FFFFFFCBh = – 53
LUF LV UF N Z V C = 0 0 0 0 0 0 1

After Instruction:

AR5 = 809918h
IR0 = 10h
R5 = 066h = 102
R2 = 032h = 50
Data at 809908h = 0FFFFFFCBh = – 53
LUF LV UF N Z V C = 0 0 0 0 0 0 1

Example 2 ADDC3 R2, R7, R0

Before Instruction:

R2 = 02BCh = 700
R7 = 0F82h = 3970
R0 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 1

After Instruction:

R2 = 02BCh = 700
R7 = 0F82h = 3970
R0 = 0123Fh = 4671
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



ADDF Add Floating-Point

Syntax ADDF src, dst


Operation dst + src → dst
Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 0 0 1 1 G dst src

Description The sum of the dst and src operands is loaded into the dst register. The dst and
src operands are assumed to be floating-point numbers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example ADDF *AR4++(IR1),R5
Before Instruction:
AR4 = 809800h
IR1 = 12Bh
R5 = 0579800000h = 6.23750e+01
Data at 809800h = 86B2800h = 4.7031250e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR4 = 80992Bh
IR1 = 12Bh
R5 = 09052C0000h = 5.3268750e+02
Data at 809800h = 86B2800h = 4.7031250e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Add Floating-Point, 3-Operand ADDF3

Syntax ADDF3 src2, src1, dst


Operation src1 + src2 → dst
Operands src1 three-operand addressing modes (T):
00 register (Rn1, 0 ≤ n1 ≤ 7)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 7)
11 indirect (disp = 0, 1, IR0, IR1)
src2 three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 7)
01 register (Rn2, 0 ≤ n2 ≤ 7)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)
dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 0 0 0 1 T dst src1 src2

Description The sum of the src1 and src2 operands is loaded into the dst register. The src1,
src2, and dst operands are assumed to be floating-point numbers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example 1 ADDF3 R6,R5,R1
or
ADDF3 R5,R6,R1
Before Instruction:
R6 = 086B280000h = 4.7031250e + 02
R5 = 0579800000h = 6.23750e+01
R1 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R6 = 086B280000h = 4.7031250e + 02
R5 = 0579800000h = 6.23750e + 01
R1 = 09052C0000h = 5.3268750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Example 2 ADDF3 *+AR1(1),*AR7++(IR0),R4

Before Instruction:

AR1 = 809820h
AR7 = 8099F0h
IR0 = 8h
R4 = 0h
Data at 809821h = 700F000h = 1.28940e + 02
Data at 8099F0h = 34C2000h = 1.27590e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR1 = 809820h
AR7 = 8099F8h
IR0 = 8h
R4 = 070DB20000h = 1.41695313e + 02
Data at 809821h = 700F000h = 1.28940e + 02
Data at 8099F0h = 34C2000h = 1.27590e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Parallel ADDF3 and STF ADDF3||STF

Syntax ADDF3 src2, src1, dst1


|| STF src3, dst2

Operation src1 + src2 → dst1


|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)


src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 0 0 1 1 0 dst1 src1 src3 dst2 src2

Description A floating-point addition and a floating-point store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle.
This means that if one of the parallel operations (STF) reads from a register
and the operation being performed in parallel (ADDF3) writes to the same
register, STF accepts as input the contents of the register before it is modified
by the ADDF3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected

Mode Bit OVM Operation is not affected by OVM bit value.

Example ADDF3 *+AR3(IR1),R2,R5


|| STF R4,*AR2

Before Instruction:

AR3 = 809800h
IR1 = 0A5h
R2 = 070C800000h = 1.4050e + 02
R5 = 0h
R4 = 057B400000h = 6.281250e + 01
AR2 = 8098F3h
Data at 8098A5h = 733C000h = 1.79750e + 02
Data at 8098F3h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR3 = 809800h
IR1 = 0A5h
R2 = 070C800000h = 1.4050e+02
R5 = 0820200000h = 3.20250e + 02
R4 = 057B400000h = 6.281250e + 01
AR2 = 8098F3h
Data at 8098A5h = 733C000h = 1.79750e + 02
Data at 8098F3h = 57B4000h = 6.28125e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.
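The read-before-write rule in the Description can be modeled as two phases. The sketch below is a behavioral illustration only: Python floats stand in for the hardware formats, and parallel_add_store and its parameter names are hypothetical.

```python
def parallel_add_store(regs, mem, src1, src2_addr, dst1, src3, dst2_addr):
    """Model ADDF3 || STF: sample every operand first, then write results."""
    # Read phase: all sources are sampled before anything is modified.
    a = regs[src1]
    b = mem[src2_addr]
    to_store = regs[src3]
    # Write phase: register and memory destinations are updated together.
    regs[dst1] = a + b
    mem[dst2_addr] = to_store


# Values from the example: ADDF3 *+AR3(IR1),R2,R5 || STF R4,*AR2
regs = {"R2": 140.5, "R4": 62.8125, "R5": 0.0}
mem = {0x8098A5: 179.75, 0x8098F3: 0.0}
parallel_add_store(regs, mem, "R2", 0x8098A5, "R5", "R4", 0x8098F3)

# Same register used as ADDF3 destination and STF source:
regs2 = {"R2": 1.0, "R5": 7.0}
mem2 = {0x100: 2.0, 0x200: 0.0}
parallel_add_store(regs2, mem2, "R2", 0x100, "R5", "R5", 0x200)
print(regs["R5"], mem[0x8098F3], regs2["R5"], mem2[0x200])
```

In the second case the store writes the old contents of R5 (7.0) even though the addition overwrites R5 in the same cycle.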

Add Integer ADDI

Syntax ADDI src, dst

Operation dst + src → dst

Operands src general addressing modes (G):


00 any CPU register
01 direct
10 indirect
11 immediate

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 0 1 0 0 G dst src

Description The sum of the dst and src operands is loaded into the dst register. The
dst and src operands are assumed to be signed integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a carry occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.

Example ADDI R3,R7

Before Instruction:

R3 = 0FFFFFFCBh = – 53
R7 = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R3 = 0FFFFFFCBh = – 53
R7 = 0h
LUF LV UF N Z V C = 0 0 0 0 1 0 1
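The Status Bits rules for a 32-bit integer add can be worked through in a few lines. This is a minimal sketch that assumes OVM = 0 (no saturation) and does not model the latched LV flag; the function name addi_flags is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def addi_flags(dst: int, src: int):
    """Return (result, flags) for a 32-bit add, following the Status Bits rules."""
    a, b = dst & MASK32, src & MASK32
    full = a + b
    result = full & MASK32
    carry = 1 if full > MASK32 else 0          # carry out of bit 31
    negative = result >> 31                    # sign bit of the result
    zero = 1 if result == 0 else 0
    # Signed overflow: operands have the same sign, result has the other sign.
    overflow = ((~(a ^ b) & (a ^ result)) >> 31) & 1
    return result, {"N": negative, "Z": zero, "V": overflow, "C": carry}


result, flags = addi_flags(0x35, 0xFFFFFFCB)   # R7 = 53, R3 = -53
print(hex(result), flags)
```

For the operands of the example (53 and –53) the sketch gives a 0 result with Z = 1 and C = 1, since the unsigned sum carries out of bit 31.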

ADDI3 Add Integer, 3-Operand

Syntax ADDI3 src2, src1, dst

Operation src1 + src2 → dst

Operands src1 three-operand addressing modes (T):


00 any CPU register
01 indirect (disp = 0, 1, IR0, IR1)
10 any CPU register
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


00 any CPU register
01 any CPU register
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 0 0 1 0 T dst src1 src2

Description The sum of the src1 and src2 operands is loaded into the dst register. The src1,
src2, and dst operands are assumed to be signed integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a carry occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.

Example 1 ADDI3 R4,R7,R5

Before Instruction:

R4 = 0DCh = 220
R7 = 0A0h = 160
R5 = 10h = 16
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R4 = 0DCh = 220
R7 = 0A0h = 160
R5 = 017Ch = 380
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Example 2 ADDI3 *–AR3(1),*AR6– –(IR0),R2

Before Instruction:

AR3 = 809802h
AR6 = 809930h
IR0 = 18h
R2 = 10h = 16
Data at 809801h = 2AF8h = 11,000
Data at 809930h = 3A98h = 15,000
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR3 = 809802h
AR6 = 809918h
IR0 = 18h
R2 = 06590h = 26,000
Data at 809801h = 2AF8h = 11,000
Data at 809930h = 3A98h = 15,000
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

ADDI3||STI Parallel ADDI3 and STI

Syntax ADDI3 src2, src1, dst1


|| STI src3, dst2

Operation src1 + src2 → dst1


|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)


src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 0 0 1 1 1 dst1 src1 src3 dst2 src2

Description An integer addition and an integer store are performed in parallel. All registers
are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (ADDI3) writes to the same register, STI
accepts as input the contents of the register before it is modified by the ADDI3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a carry occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.

Example ADDI3 *AR0– –(IR0),R5,R0


|| STI R3,*AR7

Before Instruction:

AR0 = 80992Ch
IR0 = 0Ch
R5 = 0DCh = 220
R0 = 0h
R3 = 35h = 53
AR7 = 80983Bh
Data at 80992Ch = 12Ch = 300
Data at 80983Bh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR0 = 809920h
IR0 = 0Ch
R5 = 0DCh = 220
R0 = 208h = 520
R3 = 35h = 53
AR7 = 80983Bh
Data at 80992Ch = 12Ch = 300
Data at 80983Bh = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

AND Bitwise Logical-AND

Syntax AND src, dst

Operation dst AND src → dst

Operands src general addressing modes (G):


00 any CPU register
01 direct
10 indirect
11 immediate (not sign-extended)

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 0 1 0 1 G dst src

Description The bitwise logical-AND between the dst and src operands is loaded into the
dst register. The dst and src operands are assumed to be unsigned integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example AND R1,R2

Before Instruction:

R1 = 80h
R2 = 0AFFh
LUF LV UF N Z V C = 0 0 0 0 0 0 1

After Instruction:

R1 = 80h
R2 = 80h
LUF LV UF N Z V C = 0 0 0 0 0 0 1

Bitwise Logical-AND, 3-Operand AND3

Syntax AND3 src2, src1, dst

Operation src1 AND src2 → dst

Operands src1 three-operand addressing modes (T):


00 any CPU register
01 indirect (disp = 0, 1, IR0, IR1)
10 any CPU register
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


00 any CPU register
01 any CPU register
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 0 0 1 1 T dst src1 src2

Description The bitwise logical-AND between the src1 and src2 operands is loaded into
the destination register. The src1, src2, and dst operands are assumed to be
unsigned integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example 1 AND3 *AR0– –(IR0),*+AR1,R4

Before Instruction:

AR0 = 8098F4h
IR0 = 50h
AR1 = 809951h
R4 = 0h
Data at 8098F4h = 30h
Data at 809952h = 123h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR0 = 8098A4h
IR0 = 50h
AR1 = 809951h
R4 = 020h
Data at 8098F4h = 30h
Data at 809952h = 123h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Example 2 AND3 *–AR5,R7,R4

Before Instruction:

AR5 = 80985Ch
R7 = 2h
R4 = 0h
Data at 80985Bh = 0AFFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR5 = 80985Ch
R7 = 2h
R4 = 2h
Data at 80985Bh = 0AFFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Parallel AND3 and STI AND3||STI

Syntax AND3 src2, src1, dst1


|| STI src3, dst2

Operation src1 AND src2 → dst1


|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)


src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 0 1 0 0 0 dst1 src1 src3 dst2 src2

Description A bitwise logical-AND and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (AND3) writes to the same register, STI
accepts as input the contents of the register before it is modified by the AND3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example AND3 *+AR1(IR0),R4,R7


|| STI R3,*AR2

Before Instruction:

AR1 = 8099F1h
IR0 = 8h
R4 = 0A323h
R7 = 0h
R3 = 35h = 53
AR2 = 80983Fh
Data at 8099F9h = 5C53h
Data at 80983Fh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR1 = 8099F1h
IR0 = 8h
R4 = 0A323h
R7 = 03h
R3 = 35h = 53
AR2 = 80983Fh
Data at 8099F9h = 5C53h
Data at 80983Fh = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Bitwise Logical-AND With Complement ANDN

Syntax ANDN src, dst

Operation dst AND ∼src → dst


Operands src general addressing modes (G):
00 any CPU register
01 direct
10 indirect
11 immediate (not sign-extended)
dst any CPU register
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 0 1 1 0 G dst src

Description The bitwise logical-AND between the dst operand and the bitwise logical
complement (∼) of the src operand is loaded into the dst register. The dst and src
operands are assumed to be unsigned integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example ANDN @980Ch,R2

Before Instruction:

DP = 80h
R2 = 0C2Fh
Data at 80980Ch = 0A02h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:

DP = 80h
R2 = 042Dh
Data at 80980Ch = 0A02h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
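The operation dst AND ∼src is easy to check by hand or with a short model. The sketch below is illustrative only; the function name andn is hypothetical, and only the N and Z rules from Status Bits are modeled.

```python
MASK32 = 0xFFFFFFFF


def andn(dst: int, src: int):
    """Return (result, flags) for dst AND (NOT src) on 32-bit words."""
    result = dst & ~src & MASK32
    flags = {"N": result >> 31, "Z": int(result == 0)}   # N is the MSB of the output
    return result, flags


result, flags = andn(0x0C2F, 0x0A02)   # R2 and the data at 80980Ch
print(hex(result), flags)
```

With the operands of the example the result is 042Dh and neither N nor Z is set.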

ANDN3 Bitwise Logical-ANDN, 3-Operand

Syntax ANDN3 src2, src1, dst

Operation src1 AND ∼src2 → dst

Operands src1 three-operand addressing modes (T):


00 any CPU register
01 indirect (disp = 0, 1, IR0, IR1)
10 any CPU register
11 indirect (disp = 0, 1, IR0, IR1)
src2 three-operand addressing modes (T):
00 any CPU register
01 any CPU register
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)
dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 0 1 0 0 T dst src1 src2

Description The bitwise logical-AND between the src1 operand and the bitwise logical
complement (∼) of the src2 operand is loaded into the dst register. The src1,
src2, and dst operands are assumed to be unsigned integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example 1 ANDN3 R5,R3,R7

Before Instruction:

R5 = 0A02h
R3 = 0C2Fh
R7 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R5 = 0A02h
R3 = 0C2Fh
R7 = 042Dh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Example 2 ANDN3 R1,*AR5++(IR0),R0

Before Instruction:

R1 = 0CFh
AR5 = 809825h
IR0 = 5h
R0 = 0h
Data at 809825h = 0FFFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R1 = 0CFh
AR5 = 80982Ah
IR0 = 5h
R0 = 0F30h
Data at 809825h = 0FFFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

ASH Arithmetic Shift

Syntax ASH count, dst

Operation If (count ≥ 0):


dst << count → dst

Else:
dst >> |count | → dst

Operands count general addressing modes (G):


00 any CPU register
01 direct
10 indirect
11 immediate

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 0 1 1 1 G dst count

Description The seven least significant bits of the count operand are used to generate the
two’s complement shift count of up to 32 bits.

If the count operand is greater than 0, the dst operand is left-shifted by the
value of the count operand. Low-order bits shifted in are 0-filled, and
high-order bits are shifted out through the carry (C) bit.

Arithmetic left-shift:
C ← dst ← 0

If the count operand is less than 0, the dst operand is right-shifted by the
absolute value of the count operand. The high-order bits of the dst operand are
sign-extended as it is right-shifted. Low-order bits are shifted out through the C bit.

Arithmetic right-shift:
sign of dst → dst → C

If the count operand is 0, no shift is performed, and the C bit is set to 0. The
count and dst operands are assumed to be signed integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Mode Bit OVM Operation is not affected by OVM bit value.

Example 1 ASH R1,R3

Before Instruction:

R1 = 10h = 16
R3 = 0AE000h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R1 = 10h
R3 = 0E0000000h
LUF LV UF N Z V C = 0 1 0 1 0 1 0

Example 2 ASH @98C3h,R5

Before Instruction:

DP = 80h
R5 = 0AEC00001h
Data at 8098C3h = 0FFE8h = – 24
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

DP = 80h
R5 = 0FFFFFFAEh
Data at 8098C3h = 0FFE8h = – 24
LUF LV UF N Z V C = 0 0 0 1 0 0 1
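The shift rules can be sketched as a small model. This is an illustration under stated assumptions: the count is taken from the seven least significant bits as a two's-complement value, V is treated as set when a left shift changes the signed value, and a right shift is assumed never to overflow. The names to_signed and ash are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def ash(count_word: int, dst: int):
    """Return (result, C, V) for an arithmetic shift of a 32-bit word."""
    count = to_signed(count_word, 7)        # only the 7 LSBs of count are used
    x = to_signed(dst, 32)
    if count > 0:                           # left shift, zero fill
        full = x << count
        result = full & MASK32
        carry = (((dst & MASK32) << count) >> 32) & 1   # last bit shifted out
        overflow = int(to_signed(result, 32) != full)
    elif count < 0:                         # right shift, sign extend
        n = -count
        result = (x >> n) & MASK32
        carry = (x >> (n - 1)) & 1          # last bit shifted out
        overflow = 0
    else:                                   # no shift: C is cleared
        result, carry, overflow = dst & MASK32, 0, 0
    return result, carry, overflow


ex1 = ash(0x10, 0x000AE000)      # Example 1: count = 16
ex2 = ash(0xFFE8, 0xAEC00001)    # Example 2: 7 LSBs of 0FFE8h give -24
print([hex(v) for v in ex1], [hex(v) for v in ex2])
```

Both examples reproduce the After Instruction values: Example 1 overflows with C = 0, and Example 2 shifts a 1 into C.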

ASH3 Arithmetic Shift, 3-Operand

Syntax ASH3 count, src, dst

Operation If (count ≥ 0):


src << count → dst
Else:
src >> |count | → dst
Operands count three-operand addressing modes (T):
00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

src three-operand addressing modes (T):


00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)
dst register (Rn, 0 ≤ n ≤ 27)
Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 0 1 0 1 T dst src count

Description The seven least significant bits of the count operand are used to generate the
two’s complement shift count of up to 32 bits.

If the count operand is greater than 0, the src operand is left-shifted by the
value of the count operand. Low-order bits shifted in are 0-filled, and
high-order bits are shifted out through the status register’s C bit.

Arithmetic left-shift:
C ← src ← 0
If the count operand is less than 0, the src operand is right-shifted by the
absolute value of the count operand. The high-order bits of the src operand are
sign-extended as they are right-shifted. Low-order bits are shifted out through the
C (carry) bit.

Arithmetic right-shift:
sign of src → src → C
If the count operand is 0, no shift is performed, and the C bit is set to 0. The
count, src, and dst operands are assumed to be signed integers.

Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Mode Bit OVM Operation is not affected by OVM bit value.
Example 1 ASH3 *AR3– –(1),R5,R0
Before Instruction:
AR3 = 809921h
R5 = 02B0h
R0 = 0h
Data at 809921h = 10h = 16
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR3 = 809920h
R5 = 000002B0h
R0 = 02B00000h
Data at 809921h = 10h = 16
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Example 2 ASH3 R1,R3,R5
Before Instruction:
R1 = 0FFFFFFF8h = – 8
R3 = 0FFFFCB00h
R5 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R1 = 0FFFFFFF8h = – 8
R3 = 0FFFFCB00h
R5 = 0FFFFFFCBh
LUF LV UF N Z V C = 0 0 0 1 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

ASH3||STI Parallel ASH3 and STI

Syntax ASH3 count, src2, dst1


|| STI src3, dst2
Operation If (count ≥ 0):
src2 << count → dst1
Else:
src2 >> |count| → dst1
|| src3 → dst2
Operands count register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)
Encoding
31 24 23 16 15 8 7 0

1 1 0 1 0 0 1 dst1 count src3 dst2 src2

Description The seven least significant bits of the count operand register are used to
generate the two’s complement shift count of up to 32 bits.
If the count operand is greater than 0, the src2 operand is left-shifted by the
value of the count operand. Low-order bits shifted in are 0-filled, and
high-order bits are shifted out through the C bit.
Arithmetic left-shift:
C ← src2 ← 0
If the count operand is less than 0, the src2 operand is right-shifted by the
absolute value of the count operand. The high-order bits of the src2 operand are
sign-extended as it is right-shifted. Low-order bits are shifted out through the
C bit.
Arithmetic right-shift:
sign of src2 → src2 → C
If the count operand is 0, no shift is performed, and the C bit is set to 0. The
count and dst operands are assumed to be signed integers.
All registers are read at the beginning and loaded at the end of the execute
cycle. This means that, if one of the parallel operations (STI) reads from a
register and the operation being performed in parallel (ASH3) writes to the same
register, STI accepts as input the contents of the register before it is modified
by the ASH3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N MSB of the output
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Mode Bit OVM Operation is not affected by OVM bit value.

Example ASH3 R1,*AR6++(IR1),R0


|| STI R5,*AR2

Before Instruction:

AR6 = 809900h
IR1 = 8Ch
R1 = 0FFE8h = – 24
R0 = 0h
R5 = 35h = 53
AR2 = 8098A2h
Data at 809900h = 0AE000000h
Data at 8098A2h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR6 = 80998Ch
IR1 = 8Ch
R1 = 0FFE8h = – 24
R0 = 0FFFFFFAEh
R5 = 35h = 53
AR2 = 8098A2h
Data at 809900h = 0AE000000h
Data at 8098A2h = 35h = 53
LUF LV UF N Z V C = 0 0 0 1 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Bcond Branch Conditionally (Standard)

Syntax Bcond src

Operation If cond is true:


If src is in register-addressing mode (Rn, 0 ≤ n ≤ 27),
src → PC.
If src is in PC-relative mode (label or address),
displacement + PC + 1 → PC.
Else, continue.

Operands src conditional-branch addressing modes (B):


0 register
1 PC-relative

Encoding
31 24 23 16 15 8 7 0

0 1 1 0 1 0 B 0 0 0 0 cond register or displacement

Description Bcond signifies a standard branch that executes in four cycles (since a pipeline
flush also occurs on a true condition; see Section 9.2 on page 9-4). A branch is
performed if the condition is true. If the src operand is expressed in register
addressing mode, the contents of the specified register are loaded into the PC.
If the src operand is expressed in PC-relative mode, the assembler generates
a displacement: displacement = label – (PC of branch instruction + 1). This
displacement is stored as a 16-bit signed integer in the 16 least significant bits
of the branch instruction word. This displacement is added to the PC of the
branch instruction plus 1 to generate the new PC.

The TMS320C3x provides 20 condition codes that you can use with this
instruction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction only
when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI, CMPI3,
TSTB, or TSTB3) is executed.

Cycles 4

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example BZ R0

Before Instruction:

PC = 2B00h
R0 = 0003FF00h
LUF LV UF N Z V C = 0 0 0 0 1 0 0

After Instruction:

PC = 3FF00h
R0 = 0003FF00h
LUF LV UF N Z V C = 0 0 0 0 1 0 0

Note:
If a BZ instruction is executed immediately following a RND instruction with
a 0 operand, the branch is not performed, because the zero (Z) flag is not set. To
circumvent this problem, execute a BZUF instead of a BZ instruction.
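The PC-relative displacement described for Bcond can be sketched as the computation an assembler would perform. The function names and the addresses used below are hypothetical and serve only to illustrate the formula displacement = label – (PC of branch instruction + 1).

```python
def encode_displacement(label: int, branch_pc: int) -> int:
    """Return the 16-bit field stored in a standard PC-relative branch."""
    disp = label - (branch_pc + 1)
    if not -0x8000 <= disp <= 0x7FFF:
        raise ValueError("target is out of range for a 16-bit signed displacement")
    return disp & 0xFFFF                  # two's-complement encoding


def branch_target(field16: int, branch_pc: int) -> int:
    """Recover the new PC from the stored field: displacement + PC + 1."""
    disp = field16 - 0x10000 if field16 & 0x8000 else field16
    return branch_pc + 1 + disp


forward = encode_displacement(0x2B10, 0x2B00)    # branch ahead
backward = encode_displacement(0x2AF0, 0x2B00)   # branch back
print(hex(forward), hex(backward), hex(branch_target(backward, 0x2B00)))
```

A backward branch yields a negative displacement, which is stored in two's-complement form and sign-extended again when the new PC is formed.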

BcondD Branch Conditionally (Delayed)

Syntax BcondD src


Operation If cond is true:
If src is in register-addressing mode (Rn, 0 ≤ n ≤ 27),
src → PC.
If src is in PC-relative mode (label or address),
displacement + PC + 3 → PC.
Else, continue.
Operands src conditional-branch addressing modes (B):
0 register
1 PC-relative
Encoding
31 24 23 16 15 8 7 0

0 1 1 0 1 0 B 0 0 0 1 cond register or displacement

Description BcondD signifies a delayed branch that allows the three instructions after the
delayed branch to be fetched before the PC is modified. The effect is a
single-cycle branch, and the three instructions following BcondD will not affect the
cond.
A branch is performed if the condition is true. If the src operand is expressed
in register-addressing mode, the contents of the specified register are loaded
into the PC. If the src operand is expressed in PC-relative mode, the assembler
generates a displacement: displacement = label – (PC of branch instruction
+ 3). This displacement is stored as a 16-bit signed integer in the 16 least
significant bits of the branch instruction. This displacement is added to the PC of
the branch instruction plus 3 to generate the new PC. The TMS320C3x
provides 20 condition codes that you can use with this instruction (see Table 10–9
on page 10-13 for a list of condition mnemonics, condition codes, and flags).
Condition flags are set on a previous instruction only when the destination register
is one of the extended-precision registers (R7–R0) or when one of the
compare instructions (CMPF, CMPF3, CMPI, CMPI3, TSTB, or TSTB3) is
executed.
Cycles 1
Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example BNZD 36 (36 = 24h)

Before Instruction:

PC = 50h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

PC = 77h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
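The order in which addresses are executed around a delayed branch can be traced with a short model. This is a sketch only: it assumes that none of the three instructions after the branch changes the PC itself, and the function name delayed_branch_trace is hypothetical.

```python
def delayed_branch_trace(branch_pc: int, disp: int, taken: bool):
    """List the PCs executed from a PC-relative delayed branch onward."""
    # The branch and the three instructions after it always execute.
    trace = [branch_pc, branch_pc + 1, branch_pc + 2, branch_pc + 3]
    # If the condition is true: displacement + PC + 3 -> PC; else fall through.
    next_pc = branch_pc + 3 + disp if taken else branch_pc + 4
    trace.append(next_pc)
    return trace


taken = delayed_branch_trace(0x50, 0x24, taken=True)     # BNZD 36 with Z = 0
not_taken = delayed_branch_trace(0x50, 0x24, taken=False)
print([hex(pc) for pc in taken], [hex(pc) for pc in not_taken])
```

For the example above (PC = 50h, displacement 24h) the three following instructions at 51h to 53h execute first, and execution then continues at 77h.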

BR Branch Unconditionally (Standard)

Syntax BR src

Operation src → PC or PC + disp → PC, where disp = src – (PC + 1)

Operands src long-immediate addressing mode

Encoding
31 24 23 16 15 8 7 0

0 1 1 0 0 0 0 0 disp

Description BR performs a PC-relative branch that executes in four cycles, since a pipeline
flush also occurs upon execution of the branch; see Section 9.2 on page 9-4.
An unconditional branch is performed. The src operand is assumed to be a
24-bit unsigned integer. Note that bit 24 = 0 for a standard branch.

Cycles 4

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example BR 805Ch

Before Instruction:

PC = 80h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

PC = 805Ch
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Branch Unconditionally (Delayed) BRD

Syntax BRD src

Operation src → PC

Operands src long-immediate addressing mode

Encoding
31 24 23 16 15 8 7 0

0 1 1 0 0 0 0 1 src

Description BRD signifies a delayed branch that allows the three instructions after the
delayed branch to be fetched before the PC is modified. The effect is a
single-cycle branch.

An unconditional branch is performed. The src operand is assumed to be a
24-bit unsigned integer. Note that bit 24 = 1 for a delayed branch.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example BRD 2Ch

Before Instruction:

PC = 1Bh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

PC = 2Ch
LUF LV UF N Z V C = 0 0 0 0 0 0 0

CALL Call Subroutine

Syntax CALL src

Operation Next PC → *++SP


src → PC

Operands src long-immediate addressing mode

Encoding
31 24 23 16 15 8 7 0

0 1 1 0 0 0 1 0 src

Description A call is performed. The next PC value is pushed onto the system stack. The
src operand is loaded into the PC. The src operand is assumed to be a 24-bit
unsigned immediate operand.

Cycles 4

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example CALL 123456h

Before Instruction:

PC = 5h
SP = 809801h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

PC = 123456h
SP = 809802h
Data at 809802h = 6h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
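The two steps of the Operation (Next PC → *++SP, then src → PC) can be modeled directly. The sketch below is illustrative: the dictionary-based state and memory and the function name call are hypothetical, and "next PC" is taken to be PC + 1 because CALL occupies one instruction word.

```python
def call(state: dict, memory: dict, target: int) -> None:
    """Model CALL: push the return address, then load the 24-bit target."""
    state["SP"] += 1                          # pre-increment the stack pointer
    memory[state["SP"]] = state["PC"] + 1     # return address = next PC
    state["PC"] = target & 0xFFFFFF           # src is a 24-bit immediate


state = {"PC": 0x5, "SP": 0x809801}
memory = {}
call(state, memory, 0x123456)
print(hex(state["PC"]), hex(state["SP"]), hex(memory[0x809802]))
```

The model reproduces the example: the return address 6h is stored at 809802h and SP is left pointing at it.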

Call Subroutine Conditionally CALLcond

Syntax CALLcond src

Operation If cond is true:


Next PC → *++SP
If src is in register addressing mode (Rn, 0 ≤ n ≤ 27),
src → PC.
If src is in PC-relative mode (label or address),
displacement + PC + 1 → PC.
Else, continue.

Operands src conditional-branch addressing modes (B):


0 register
1 PC-relative
Encoding
31 24 23 16 15 8 7 0

0 1 1 1 0 0 B 0 0 0 0 cond register or displacement

Description A call is performed if the condition is true. If the condition is true, the next PC
value is pushed onto the system stack. If the src operand is expressed in
register addressing mode, the contents of the specified register are loaded into the
PC. If the src operand is expressed in PC-relative mode, the assembler
generates a displacement: displacement = label – (PC of call instruction + 1). This
displacement is stored as a 16-bit signed integer in the 16 least significant bits
of the call instruction word. This displacement is added to the PC of the call
instruction plus 1 to generate the new PC.

The TMS320C3x provides 20 condition codes that can be used with this
instruction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction only
when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI, CMPI3,
TSTB, or TSTB3) is executed.

Cycles 5

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example CALLNZ R5

Before Instruction:

PC = 123h
SP = 809835h
R5 = 789h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

PC = 789h
SP = 809836h
R5 = 789h
Data at 809836h = 124h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Compare Floating-Point CMPF

Syntax CMPF src, dst

Operation dst – src


Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 1 0 0 0 G dst src

Description The src operand is subtracted from the dst operand. The result is not loaded
into any register, thus allowing for nondestructive compares. The dst and src
operands are assumed to be floating-point numbers.

Cycles 1

Status Bits These condition flags are modified for all destination registers (R27 – R0).
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example CMPF *+AR4,R6

Before Instruction:

AR4 = 8098F2h
R6 = 070C800000h = 1.4050e+02
Data at 8098F3h = 070C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:

AR4 = 8098F2h
R6 = 070C800000h = 1.4050e + 02
Data at 8098F3h = 070C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 1 0 0
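
The hexadecimal words in the example can be checked with a short Python decoder. This is a sketch that assumes the device's 32-bit single-precision layout (8-bit two's-complement exponent, sign bit, 23-bit fraction); the function name is hypothetical, and only the N and Z flags are derived.

```python
def c3x_single_to_float(word):
    """Decode a 32-bit TMS320C3x single-precision word to a Python float."""
    exp = (word >> 24) & 0xFF
    if exp >= 0x80:
        exp -= 0x100                      # two's-complement exponent
    if exp == -128:
        return 0.0                        # reserved exponent encodes zero
    sign = (word >> 23) & 1
    frac = (word & 0x7FFFFF) / 2**23
    mantissa = (-2.0 + frac) if sign else (1.0 + frac)
    return mantissa * 2.0**exp

dst = c3x_single_to_float(0x070C8000)     # upper 32 bits of R6 (070C800000h)
src = c3x_single_to_float(0x070C8000)     # data at 8098F3h
diff = dst - src                          # CMPF computes dst - src
n_flag, z_flag = int(diff < 0), int(diff == 0)
```

Both operands decode to 140.5, so the difference is zero and only Z is set, as the example shows.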



CMPF3 Compare Floating-Point, 3-Operand

Syntax CMPF3 src2, src1

Operation src1 – src2

Operands src1 three-operand addressing modes (T):


00 register (Rn1, 0 ≤ n1 ≤ 7)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 7)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


00 register (Rn2, 0 ≤ n2 ≤ 7)
01 register (Rn2, 0 ≤ n2 ≤ 7)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 0 1 1 0 T 0 0 0 0 0 src1 src2

Description The src2 operand is subtracted from the src1 operand. The result is not loaded
into any register, thus allowing for nondestructive compares. The src1 and
src2 operands are assumed to be floating-point numbers. Although this in-
struction has only two operands, it is designated as a three-operand instruc-
tion because operands are specified in the three-operand format.

Cycles 1

Status Bits These condition flags are modified for all destination registers (R27 – R0).
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example CMPF3 *AR2,*AR3--(1)

Before Instruction:

AR2 = 809831h
AR3 = 809852h
Data at 809831h = 77A7000h = 2.5044e + 02
Data at 809852h = 57A2000h = 6.253125e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR2 = 809831h
AR3 = 809851h
Data at 809831h = 77A7000h = 2.5044e + 02
Data at 809852h = 57A2000h = 6.253125e + 01
LUF LV UF N Z V C = 0 0 0 1 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



CMPI Compare Integer

Syntax CMPI src, dst

Operation dst – src

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 1 0 0 1 G dst src

Description The src operand is subtracted from the dst operand. The result is not loaded
into any register, thus allowing for nondestructive compares. The dst and src
operands are assumed to be signed integers.

Cycles 1

Status Bits These condition flags are modified for all destination registers (R27 – R0).
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is not affected by OVM bit value.

Example CMPI R3,R7

Before Instruction:

R3 = 898h = 2200
R7 = 3E8h = 1000
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R3 = 898h = 2200
R7 = 3E8h = 1000
LUF LV UF N Z V C = 0 0 0 1 0 0 1
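
The flag results in the example can be reproduced with a Python sketch of a 32-bit subtraction. The function name cmpi_flags is hypothetical, and taking N from bit 31 of the result when an overflow occurs is an assumption; the non-overflow cases follow directly from the Status Bits table.

```python
MASK32 = 0xFFFFFFFF

def cmpi_flags(src, dst):
    """Return N, Z, V, C for the 32-bit subtraction dst - src."""
    s, d = src & MASK32, dst & MASK32
    result = (d - s) & MASK32
    n = result >> 31                       # sign bit of the result
    z = int(result == 0)
    # Signed overflow: operands differ in sign and the result's sign
    # differs from the sign of dst.
    v = ((d ^ s) & (d ^ result)) >> 31
    c = int(d < s)                         # borrow from the unsigned subtraction
    return {"N": n, "Z": z, "V": v, "C": c}

flags = cmpi_flags(src=0x898, dst=0x3E8)   # CMPI R3,R7: 1000 - 2200
```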

Compare Integer, 3-Operand CMPI3

Syntax CMPI3 src2, src1

Operation src1 – src2

Operands src1 three-operand addressing modes (T):


00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 0 1 1 1 T 0 0 0 0 0 src1 src2

Description The src2 operand is subtracted from the src1 operand. The result is not loaded
into any register, thus allowing for nondestructive compares. The src1 and
src2 operands are assumed to be signed integers. Although this instruction
has only two operands, it is designated as a three-operand instruction be-
cause operands are specified in the three-operand format.

Cycles 1

Status Bits These condition flags are modified for all destination registers (R27 – R0).
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise

Mode Bit OVM Operation is not affected by OVM bit value.


Example CMPI3 R7,R4

Before Instruction:

R7 = 03E8h = 1000
R4 = 0898h = 2200
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R7 = 03E8h = 1000
R4 = 0898h = 2200
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Decrement and Branch Conditionally (Standard) DBcond

Syntax DBcond ARn, src


Operation ARn – 1 → ARn
If cond is true and ARn ≥ 0 :
If src is in register addressing mode (Rn, 0 ≤ n ≤ 27),
src → PC.
If src is in PC-relative mode (label or address),
displacement + PC + 1 → PC.
Else, continue.
Operands src conditional-branch addressing modes (B):
0 register
1 PC-relative
ARn register (0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

0 1 1 0 1 1 B ARn 0 cond register or displacement

Description DBcond signifies a standard branch that executes in four cycles because the
pipeline must be flushed if cond is true. The specified auxiliary register is
decremented, and a branch is performed if the condition is true and the
specified auxiliary register is greater than or equal to 0. The condition flags
are those set by the most recent instruction that affects the status bits.

The auxiliary register is treated as a 24-bit signed integer. The most significant
eight bits are unmodified by the decrement operation. The comparison of the
auxiliary register uses only the 24 least significant bits of the auxiliary register.
Note that the branch condition does not depend on the auxiliary register
decrement.

If the src operand is expressed in register addressing mode, the contents of
the specified register are loaded into the PC. If the src operand is expressed
in PC-relative addressing mode, the assembler generates a displacement:
displacement = label – (PC of branch instruction + 1). This displacement is
stored as a 16-bit signed integer in the 16 least significant bits of the
branch instruction word. This displacement is added to the PC of the branch
instruction plus 1 to generate the new PC.

The TMS320C3x provides 20 condition codes that can be used with this
instruction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction
only when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI,
CMPI3, TSTB, or TSTB3) is executed.


Cycles 4

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example CMPI 200,R3


DBLT AR3,R2

Before Instruction:

PC = 5Fh
AR3 = 12h
R2 = 9Fh
R3 = 80h
LUF LV UF N Z V C = 0 0 0 1 0 0 0

After Instruction:

PC = 9Fh
AR3 = 11h
R2 = 9Fh
R3 = 80h
LUF LV UF N Z V C = 0 0 0 1 0 0 0
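
The decrement-and-test behavior can be sketched in Python. This is an illustrative model, assuming the register value is held as a 32-bit integer; the function name dbcond is hypothetical.

```python
def dbcond(ar, cond):
    """Return (new_ar, branch_taken) for one DBcond step on a 32-bit ARn."""
    low = (ar - 1) & 0xFFFFFF               # decrement the 24 LSBs only
    new_ar = (ar & 0xFF000000) | low        # upper eight bits are unmodified
    signed = low - 0x1000000 if low & 0x800000 else low
    return new_ar, bool(cond) and signed >= 0

# DBLT AR3,R2 with AR3 = 12h and LT true (N = 1 after CMPI 200,R3).
new_ar, taken = dbcond(0x12, cond=True)
```

Because the test uses the decremented value, a register loaded with n allows the branch to be taken n times before the count drops below 0.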

Decrement and Branch Conditionally (Delayed) DBcondD

Syntax DBcondD ARn, src

Operation ARn – 1 → ARn

If cond is true and ARn ≥ 0:
If src is in register addressing mode (Rn, 0 ≤ n ≤ 27)
src → PC
If src is in PC-relative mode (label or address)
displacement + PC + 3 → PC.

Else, continue.

Operands src conditional-branch addressing modes (B):


0 register
1 PC-relative

ARn register (0 ≤ n ≤ 7)

Encoding

31 24 23 16 15 8 7 0

0 1 1 0 1 1 B ARn 1 cond register or displacement

Description DBcondD signifies a delayed branch that allows the three instructions after
the delayed branch to be fetched before the PC is modified. The effect is a
single-cycle branch. The specified auxiliary register is decremented, and a
branch is performed if the condition is true and the specified auxiliary
register is greater than or equal to 0. The condition flags are those set by
the most recent instruction that affects the status bits. The three
instructions following the DBcondD do not affect the cond.

The auxiliary register is treated as a 24-bit signed integer. The most significant
eight bits are unmodified by the decrement operation. The comparison of the
auxiliary register uses only the 24 least significant bits of the auxiliary register.
Note that the branch condition does not depend on the auxiliary register
decrement.

If the src operand is expressed in register addressing mode, the contents of
the specified register are loaded into the PC. If the src is expressed in
PC-relative addressing mode, the assembler generates a displacement:
displacement = label – (PC of branch instruction + 3). This displacement is
added to the PC of the branch instruction plus 3 to generate the new PC. Note
that bit 21 = 1 for a delayed branch.


The TMS320C3x provides 20 condition codes that you can use with this in-
struction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction
only when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI,
CMPI3, TSTB, or TSTB3) is executed.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example CMPI 26h,R2


DBZD AR5, $+110h

Before Instruction:

PC = 100h
R2 = 26h
AR5 = 67h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

PC = 210h
R2 = 26h
AR5 = 66h
LUF LV UF N Z V C = 0 0 0 0 1 0 0
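
For the delayed form, the displacement is taken relative to PC + 3 rather than PC + 1. The following Python sketch works through the example; the function names are hypothetical, and $ is assumed to denote the address of the branch instruction itself.

```python
def delayed_disp(label, pc):
    """16-bit displacement field for a delayed PC-relative branch."""
    disp = label - (pc + 3)               # displacement = label - (PC + 3)
    if not -0x8000 <= disp <= 0x7FFF:
        raise ValueError("target out of range for a 16-bit displacement")
    return disp & 0xFFFF

def delayed_target(field, pc):
    """Sign-extend the field and add it to PC + 3."""
    disp = field - 0x10000 if field & 0x8000 else field
    return pc + 3 + disp

# DBZD AR5, $+110h assembled at PC = 100h targets address 210h.
field = delayed_disp(0x100 + 0x110, 0x100)
target = delayed_target(field, 0x100)
```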

Floating-Point-to-Integer Conversion FIX

Syntax FIX src, dst

Operation fix(src) → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 1 0 1 0 G dst src

Description The floating-point operand src is converted to the nearest integer less than or
equal to it in value, and the result is loaded into the dst register. The src oper-
and is assumed to be a floating-point number and the dst operand a signed
integer.

The exponent field of the result register (if it has one) is not modified.

Integer overflow occurs when the floating-point number is too large to be rep-
resented as a 32-bit two’s complement integer. In the case of integer overflow,
the result will be saturated in the direction of overflow.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example FIX R1,R2

Before Instruction:

R1 = 0A28200000h = 1.3454e + 3
R2 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R1 = 0A28200000h = 1.3454e + 3
R2 = 541h = 1345
LUF LV UF N Z V C = 0 0 0 0 0 0 0
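
The rounding and saturation rules in the Description can be sketched in Python. This is a minimal model that works on Python floats rather than on the device's register format; the function name fix and the returned overflow indicator are illustrative.

```python
import math

INT32_MAX = 0x7FFFFFFF
INT32_MIN = -0x80000000

def fix(value):
    """Return (result, overflow) for a floating-point-to-integer conversion."""
    n = math.floor(value)                # nearest integer less than or equal
    if n > INT32_MAX:
        return INT32_MAX, True           # saturate in the positive direction
    if n < INT32_MIN:
        return INT32_MIN, True           # saturate in the negative direction
    return n, False

result, overflow = fix(1345.4)
```

Note that rounding is toward negative infinity, so fix(-0.5) gives -1 rather than 0.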

Parallel FIX and STI FIX||STI

Syntax FIX src2, dst1


|| STI src3, dst2

Operation fix(src2 ) → dst1


|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)


dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 0 1 0 1 0 dst1 0 0 0 src3 dst2 src2

Description A floating-point to integer conversion is performed. All registers are read at the
beginning and loaded at the end of the execute cycle. This means that, if one
of the parallel operations (STI) reads from a register, and the operation being
performed in parallel (FIX) writes to the same register, STI accepts as input the
contents of the register before it is modified by FIX.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Integer overflow occurs when the floating-point number is too large to be rep-
resented as a 32-bit two’s complement integer. In the case of integer overflow,
the result will be saturated in the direction of overflow.
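
The read-before-write ordering of the parallel pair can be sketched as follows. This is an illustrative Python model: registers and memory are dictionaries, memory holds the floating-point source as a Python float, saturation is omitted, and the case shown (STI storing the register that FIX overwrites) is a hypothetical operand choice.

```python
import math

def fix_sti(regs, mem, src2_addr, dst1, src3, dst2_addr):
    """Model FIX src2,dst1 || STI src3,dst2."""
    # All source operands are read at the beginning of the execute cycle.
    fix_input = mem[src2_addr]
    store_value = regs[src3]
    # Results are written at the end of the execute cycle.
    regs[dst1] = math.floor(fix_input)
    mem[dst2_addr] = store_value

regs = {"R1": 220}
mem = {0x8098A3: 179.75, 0x80983C: 0}
# STI stores the old contents of R1, even though FIX writes R1.
fix_sti(regs, mem, 0x8098A3, "R1", "R1", 0x80983C)
```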

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example FIX *++AR4(1),R1


|| STI R0,*AR2

Before Instruction:

AR4 = 8098A2h
R1 = 0h
R0 = 0DCh = 220
AR2 = 80983Ch
Data at 8098A3h = 733C000h = 1.79750e + 02
Data at 80983Ch = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR4 = 8098A3h
R1 = 0B3h = 179
R0 = 0DCh = 220
AR2 = 80983Ch
Data at 8098A3h = 733C000h = 1.79750e + 02
Data at 80983Ch = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Integer-to-Floating-Point Conversion FLOAT

Syntax FLOAT src, dst

Operation float (src) → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 1 0 1 1 G dst src

Description The integer operand src is converted to the floating-point value equal to it, and
the result loaded into the dst register. The src operand is assumed to be a
signed integer, and the dst operand a floating-point number.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example FLOAT *++AR2(2),R5

Before Instruction:

AR2 = 809800h
R5 = 034C200000h = 1.27578125e + 01
Data at 809802h = 0AEh = 174
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:

AR2 = 809802h
R5 = 072E000000h = 1.74e + 02
Data at 809802h = 0AEh = 174
LUF LV UF N Z V C = 0 0 0 0 0 0 0
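
The register image produced by the conversion can be derived with a Python sketch. It assumes the 40-bit extended-precision layout (8-bit two's-complement exponent in bits 39–32, sign in bit 31, 31-bit fraction); the function name is hypothetical.

```python
def float_to_c3x_extended(n):
    """Encode a signed 32-bit integer as a 40-bit extended-precision word."""
    if not -2**31 <= n < 2**31:
        raise ValueError("n must fit in 32 bits")
    if n == 0:
        return 0x80 << 32                    # exponent -128 encodes zero
    if n > 0:
        exp = n.bit_length() - 1             # n = 1.f * 2**exp
        frac = n * 2**(31 - exp) - 2**31
        sign = 0
    else:
        exp = (-n - 1).bit_length() - 1      # n = (-2 + 0.f) * 2**exp
        frac = n * 2**(31 - exp) + 2**32
        sign = 1
    return ((exp & 0xFF) << 32) | (sign << 31) | frac

r5 = float_to_c3x_extended(0xAE)             # 174, as in the example
```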



FLOAT||STF Parallel FLOAT and STF

Syntax FLOAT src2, dst1


|| STF src3, dst2

Operation float(src2 ) → dst1


|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)


dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 0 1 0 1 1 dst1 0 0 0 src3 dst2 src2

Description An integer to floating-point conversion is performed. All registers are read at
the beginning and loaded at the end of the execute cycle. This means that if
one of the parallel operations (STF) reads from a register and the operation
being performed in parallel (FLOAT) writes to the same register, then STF
accepts as input the contents of the register before it is modified by FLOAT.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example FLOAT *+AR2(IR0),R6


|| STF R7,*AR1

Before Instruction:

AR2 = 8098C5h
IR0 = 8h
R6 = 0h
R7 = 034C200000h = 1.27578125e + 01
AR1 = 809933h
Data at 8098CDh = 0AEh = 174
Data at 809933h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR2 = 8098C5h
IR0 = 8h
R6 = 072E000000h = 1.740e + 02
R7 = 034C200000h = 1.27578125e + 01
AR1 = 809933h
Data at 8098CDh = 0AEh = 174
Data at 809933h = 034C2000h = 1.27578125e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



IACK Interrupt Acknowledge

Syntax IACK src

Operation Perform a dummy read operation with IACK = 0.


At end of dummy read, set IACK to 1.

Operands src general addressing modes (G):


01 direct
10 indirect
Encoding
31 24 23 16 15 8 7 0

0 0 0 1 1 0 1 1 0 G 0 0 0 0 0 src

Description A dummy read operation is performed. If off-chip memory is specified, IACK
is set to 0 one-half H1 cycle after the beginning of the decode phase of the
IACK instruction. At the first half of the H1 cycle of the dummy read, IACK is
set to 1. The IACK signal is not extended for a multicycle read. This
instruction can be used to generate an external interrupt acknowledge. The
IACK signal and the address can be used to signal interrupt acknowledge to
external devices. The data read by the processor is unused.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit OVM Operation is not affected by OVM bit value.


Example IACK *AR5

Before Instruction:

IACK = 1
PC = 300h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:

IACK = 1
PC = 301h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Idle Until Interrupt IDLE

Syntax IDLE

Operation 1 → ST(GIE)
Next PC → PC
Idle until interrupt.

Operands None

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

Description The global interrupt enable bit is set, the next PC value is loaded into the PC,
and the CPU idles until an interrupt is received. When the interrupt is received,
the contents of the PC are pushed onto the active system stack.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example IDLE ; The processor idles until a reset


; or unmasked interrupt occurs.



IDLE2 Low-Power Idle

Syntax IDLE2 (TMS320LC31 Only)


Operation 1 → ST(GIE)
Next PC → PC
Idle until interrupt.
Operands None
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

Description The IDLE2 instruction serves the same function as IDLE, except that it re-
moves the functional clock input from the internal device. This allows for ex-
tremely low power mode. The PC is incremented once, and the device remains
in an idle state until one of the external interrupts (INT0–3) is asserted.
In IDLE2 mode, the ’C31 will behave as follows:
- The CPU, peripherals, and memory will retain their previous states.
- When the device is in the functional (nonemulation) mode, the clocks will
stop with H1 high and H3 low.
- The ’LC31 will remain in IDLE2 until one of the four external interrupts
(INT3 – INT0) is asserted for at least two H1 cycles. When one of the four
interrupts is asserted, the clocks start after a delay of one H1 cycle. The
clocks can start up in the phase opposite that in which they were stopped
(that is, H1 might start high when H3 was high before stopping, and H3
might start high when H1 was high before stopping). However, the H1 and
H3 clocks remain 180° out of phase with each other.
- During IDLE2 operation, for one of the four external interrupts to be recog-
nized by the CPU and serviced, it must be asserted for at least two H1
cycles. For the processor to recognize only one interrupt when it restarts
operation, the interrupt must be asserted for less than three cycles.
- When the ’LC31 is in emulation mode, the H1 and H3 clocks will continue
to run normally, and the CPU will operate as if an IDLE instruction had been
executed. The clocks continue to run for correct operation of the emulator.

Delayed Branch
For correct device operation, the three instructions after a delayed
branch should not be IDLE or IDLE2 instructions.


Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example IDLE2 ; The processor idles until a reset


; or unmasked interrupt occurs.



LDE Load Floating-Point Exponent

Syntax LDE src, dst


Operation src(exp) → dst(exp)
Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 1 1 0 1 G dst src

Description The exponent field of the src operand is loaded into the exponent field of the
dst register. No modification of the dst register mantissa field is made unless
the value of the exponent loaded is the reserved value of the exponent for 0
as determined by the precision of the src operand. Then the mantissa field of
the dst register is set to 0. The src and dst operands are assumed to be float-
ing-point numbers. Immediate values are evaluated in the short floating-point
format.
Cycles 1
Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example LDE R0,R5
Before Instruction:
R0 = 0200056F30h = 4.00066337e + 00
R5 = 0A056FE332h = 1.06749648e + 03
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R0 = 0200056F30h = 4.00066337e + 00
R5 = 02056FE332h = 4.16990814e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
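
The exponent replacement in the example can be sketched in Python. This is a model for 40-bit register operands only (exponent assumed in bits 39–32, with 80h as the reserved exponent for 0); the function name lde is hypothetical, and memory and immediate sources are not modeled.

```python
def lde(src, dst):
    """Model LDE for two 40-bit register images; returns the new dst."""
    exp = (src >> 32) & 0xFF
    if exp == 0x80:                        # -128: reserved exponent for 0
        return exp << 32                   # the mantissa field is set to 0
    return (exp << 32) | (dst & 0xFFFFFFFF)

r5 = lde(0x0200056F30, 0x0A056FE332)       # LDE R0,R5
```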

Load Floating-Point LDF

Syntax LDF src, dst

Operation src → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 1 1 1 0 G dst src

Description The src operand is loaded into the dst register. The dst and src operands are
assumed to be floating-point numbers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example LDF @9800h,R2

Before Instruction:

DP = 80h
R2 = 0h
Data at 809800h = 10C52A0h = 2.19254303e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

DP = 80h
R2 = 010C52A000h = 2.19254303e + 00
Data at 809800h = 10C52A0h = 2.19254303e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
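
The direct address used in the example is formed from the data-page pointer and the 16-bit operand. The Python sketch below illustrates that concatenation; it assumes the 8 least significant bits of DP supply address bits 23–16, and the function name is hypothetical.

```python
def direct_address(dp, offset):
    """Form a 24-bit address from the 8 LSBs of DP and a 16-bit offset."""
    return ((dp & 0xFF) << 16) | (offset & 0xFFFF)

addr = direct_address(0x80, 0x9800)        # LDF @9800h,R2 with DP = 80h
```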



LDFcond Load Floating-Point Conditionally

Syntax LDFcond src, dst

Operation If cond is true:


src → dst.

Else:
dst is unchanged.

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

0 1 0 0 cond G dst src

Description If the condition is true, the src operand is loaded into the dst register; otherwise,
the dst register is unchanged. The dst and src operands are assumed to be
floating-point numbers.

The TMS320C3x provides 20 condition codes that can be used with this in-
struction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Note that an LDFU (load floating-point uncondi-
tionally) instruction is useful for loading R7–R0 without affecting condition
flags. Condition flags are set on a previous instruction only when the destina-
tion register is one of the extended-precision registers (R7–R0) or when one
of the compare instructions (CMPF, CMPF3, CMPI, CMPI3, TSTB, or TSTB3)
is executed.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example LDFZ R3,R5

Before Instruction:

R3 = 2CFF2CD500h = 1.77055560e +13


R5 = 5F0000003Eh = 3.96140824e + 28
LUF LV UF N Z V C = 0 0 0 0 1 0 0

After Instruction:

R3 = 2CFF2CD500h = 1.77055560e +13


R5 = 2CFF2CD500h = 1.77055560e +13
LUF LV UF N Z V C = 0 0 0 0 1 0 0



LDFI Load Floating-Point, Interlocked

Syntax LDFI src, dst

Operation Signal interlocked operation


src → dst
Operands src general addressing modes (G):
01 direct
10 indirect
dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 0 1 1 1 1 G dst src

Description The src operand is loaded into the dst register. An interlocked operation is sig-
naled over XF0 and XF1. The src and dst operands are assumed to be floating-
point numbers. Note that only direct and indirect modes are allowed. Refer to
Section 6.4 on page 6-12 for detailed description.

Cycles 1 if XF1 = 0 (See Section 6.4 on page 6-12)

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example LDFI *+AR2,R7

Before Instruction:

AR2 = 8098F1h
R7 = 0h
Data at 8098F2h = 584C000h = – 6.28125e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:

AR2 = 8098F1h
R7 = 0584C00000h = – 6.28125e + 01
Data at 8098F2h = 584C000h = – 6.28125e + 01
LUF LV UF N Z V C = 0 0 0 1 0 0 0

Parallel LDF and LDF LDF||LDF

Syntax LDF src2, dst2


|| LDF src1, dst1

Operation src2 → dst2


|| src1 → dst1

Operands src1 indirect (disp = 0, 1, IR0, IR1)


dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst2 register (Rn2, 0 ≤ n2 ≤ 7)

Encoding
31 24 23 16 15 8 7 0

1 1 0 0 0 1 0 dst2 dst1 0 0 0 src1 src2

Description Two floating-point loads are performed in parallel. If the LDFs load the same
register, the assembler issues a warning. The result is that of LDF src2, dst2.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example LDF *--AR1(IR0),R7


|| LDF *AR7++(1),R3

Before Instruction:

AR1 = 80985Fh
IR0 = 8h
R7 = 0h
AR7 = 80988Ah
R3 = 0h
Data at 809857h = 70C8000h = 1.4050e + 02
Data at 80988Ah = 57B4000h = 6.281250e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR1 = 809857h
IR0 = 8h
R7 = 070C800000h = 1.4050e + 02
AR7 = 80988Bh
R3 = 057B400000h = 6.281250e + 01
Data at 809857h = 70C8000h = 1.4050e + 02
Data at 80988Ah = 57B4000h = 6.281250e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
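
The address calculations behind the indirect operands in these examples can be sketched in Python. This is an illustrative model: the function names are hypothetical, and wrapping the arithmetic to 24 bits is an assumption of the sketch.

```python
MASK24 = 0xFFFFFF

def pre_add(ar, disp):
    """*+ARn(disp): address is ARn + disp; ARn is not modified."""
    return (ar + disp) & MASK24, ar

def pre_sub_modify(ar, disp):
    """*--ARn(disp): ARn is decremented first, then used as the address."""
    ar = (ar - disp) & MASK24
    return ar, ar

def post_add_modify(ar, disp):
    """*ARn++(disp): ARn is the address, then ARn is incremented."""
    return ar, (ar + disp) & MASK24

# Each function returns (effective_address, new_ar).
addr1, ar1 = pre_sub_modify(0x80985F, 0x8)     # *--AR1(IR0), IR0 = 8h
addr7, ar7 = post_add_modify(0x80988A, 1)      # *AR7++(1)
```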

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Parallel LDF and STF LDF||STF

Syntax LDF src2, dst1


|| STF src3, dst2

Operation src2 → dst1


|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)


dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 0 1 1 0 0 dst1 0 0 0 src3 dst2 src2

Description A floating-point load and a floating-point store are performed in parallel.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example LDF *AR2--(1),R1


|| STF R3,*AR4++(IR1)

Before Instruction:

AR2 = 8098E7h
R1 = 0h
R3 = 057B400000h = 6.28125e + 01
AR4 = 809900h
IR1 = 10h
Data at 8098E7h = 70C8000h = 1.4050e + 02
Data at 809900h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR2 = 8098E6h
R1 = 070C800000h = 1.4050e + 02
R3 = 057B400000h = 6.28125e + 01
AR4 = 809910h
IR1 = 10h
Data at 8098E7h = 70C8000h = 1.4050e + 02
Data at 809900h = 57B4000h = 6.28125e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Load Integer LDI

Syntax LDI src, dst

Operation src → dst

Operands src general addressing modes (G):


00 any CPU register
01 direct
10 indirect
11 immediate

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 0 0 0 0 G dst src

Description The src operand is loaded into the dst register. The dst and src operands are
assumed to be signed integers. An alternate form of LDI, LDP, is used to load
the data page pointer register (DP). See the LDP instruction and subsec-
tion 10.3.2 on page 10-16.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example LDI *–AR1(IR0),R5

Before Instruction:

AR1 = 2Ch
IR0 = 5h
R5 = 3C5h = 965
Data at 27h = 26h = 38
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

AR1 = 2Ch
IR0 = 5h
R5 = 26h = 38
Data at 27h = 26h = 38
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Load Integer Conditionally LDIcond

Syntax LDIcond src, dst

Operation If cond is true:


src → dst.

Else:
dst is unchanged.

Operands src general addressing modes (G):

00 any CPU register


01 direct
10 indirect
11 immediate

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 1 0 1 cond G dst src

Description If the condition is true, the src operand is loaded into the dst register; otherwise,
the dst register is unchanged. Regardless of the condition, the read of the src
takes place. The dst and src operands are assumed to be signed integers.

The TMS320C3x provides 20 condition codes that can be used with this in-
struction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Note that an LDIU (load integer unconditionally)
instruction is useful for loading R7–R0 without affecting the condition flags.
Condition flags are set on a previous instruction only when the destination reg-
ister is one of the extended-precision registers (R7–R0) or when one of the
compare instructions (CMPF, CMPF3, CMPI, CMPI3, TSTB, or TSTB3) is ex-
ecuted.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example LDIZ *AR0++,R6

Before Instruction:

AR0 = 8098F0h
Data at 8098F0h = 027Ch = 636
R6 = 0FE2h = 4,066
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR0 = 8098F1h
Data at 8098F0h = 027Ch = 636
R6 = 0FE2h = 4,066
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Auxiliary Register Arithmetic


The test condition does not affect the auxiliary register arithmetic. (AR
modification will always occur.)
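
The behavior described for LDIcond, where the src read and the auxiliary register update take place whether or not the condition is true, can be sketched in Python. This is an illustrative model only, not TI code: the function name ldi_cond and the dictionary standing in for memory are assumptions made for this example, and the values come from the LDIZ example.

```python
def ldi_cond(cond_true, memory, ar, dst_value, step=1):
    """Model LDIcond *ARn++(step),dst. Returns (new_dst, new_ar)."""
    src = memory[ar]        # the read of src always takes place
    new_ar = ar + step      # AR post-increment happens regardless of cond
    new_dst = src if cond_true else dst_value
    return new_dst, new_ar

memory = {0x8098F0: 0x027C}   # hypothetical memory image
z_flag = 0                    # Z = 0, so the LDIZ condition is false
r6, ar0 = ldi_cond(z_flag == 1, memory, 0x8098F0, 0x0FE2)
print(hex(r6), hex(ar0))      # R6 is unchanged, AR0 is incremented
```

With Z = 0 the sketch leaves R6 at 0FE2h while AR0 still advances to 8098F1h, which matches the example.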

Load Integer, Interlocked LDII

Syntax LDII src, dst

Operation Signal interlocked operation


src → dst
Operands src general addressing modes (G):
01 direct
10 indirect
dst any CPU register
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 0 0 0 1 G dst src

Description The src operand is loaded into the dst register. An interlocked operation is sig-
naled over XF0 and XF1. The src and dst operands are assumed to be signed
integers. Note that only the direct and indirect modes are allowed. Refer to
Section 6.4 on page 6-12 for detailed description.

Cycles 1 if XF = 0 (See Section 6.4 on page 6-12)

Status Bits These condition flags are modified only if the destination register is R7– R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example LDII @985Fh,R3

Before Instruction:

DP = 80h
R3 = 0h
Data at 80985Fh = 0DCh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

DP = 80h
R3 = 0DCh
Data at 80985Fh = 0DCh
LUF LV UF N Z V C = 0 0 0 0 0 0 0



LDI||LDI Parallel LDI and LDI

Syntax LDI src2, dst2


|| LDI src1, dst1

Operation src2 → dst2


|| src1 → dst1

Operands src1 indirect (disp = 0, 1, IR0, IR1)


dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst2 register (Rn2, 0 ≤ n2 ≤ 7)

Encoding
31 24 23 16 15 8 7 0

1 1 0 0 0 1 1 dst2 dst1 0 0 0 src1 src2

Description Two integer loads are performed in parallel. A warning is issued by the assem-
bler if the LDIs load the same register. The result is that of LDI src2, dst2.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example LDI *–AR1(1),R7


|| LDI *AR7++(IR0),R1

Before Instruction:

AR1 = 809826h
R7 = 0h
AR7 = 8098C8h
IR0 = 10h
R1 = 0h
Data at 809825h = 0FAh = 250
Data at 8098C8h = 2EEh = 750
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR1 = 809826h
R7 = 0FAh = 250
AR7 = 8098D8h
IR0 = 10h
R1 = 02EEh = 750
Data at 809825h = 0FAh = 250
Data at 8098C8h = 2EEh = 750
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



LDI||STI Parallel LDI and STI

Syntax LDI src2, dst1


|| STI src3, dst2

Operation src2 → dst1


|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)


dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 0 1 1 0 1 dst1 0 0 0 src3 dst2 src2

Description An integer load and an integer store are performed in parallel. If src2 and dst2
point to the same location, src2 is read before the write to dst2.
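
The read-before-write ordering stated above can be illustrated with a small Python model. It is a sketch under assumptions, not a description of the hardware: the function name ldi_par_sti and the dictionaries for registers and memory are invented for this example, and auxiliary register updates are left out.

```python
def ldi_par_sti(memory, regs, load_addr, load_dst, store_src, store_addr):
    """Model LDI *src2,dst1 || STI src3,*dst2 on simple dictionaries."""
    loaded = memory[load_addr]    # src2 is read first
    stored = regs[store_src]      # src3 is sampled in the same read phase
    regs[load_dst] = loaded       # dst1 receives the old memory contents
    memory[store_addr] = stored   # only then is dst2 written

memory = {0x8098E6: 0xDC}         # hypothetical memory image
regs = {"R2": 0x0, "R7": 0x35}
# src2 and dst2 deliberately point to the same location.
ldi_par_sti(memory, regs, 0x8098E6, "R2", "R7", 0x8098E6)
print(hex(regs["R2"]), hex(memory[0x8098E6]))
```

R2 receives the value that was in memory before the store (0DCh), and the location then holds the stored value (35h).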

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example LDI *–AR1(1),R2


|| STI R7,*AR5++(IR0)

Before Instruction:

AR1 = 8098E7h
R2 = 0h
R7 = 35h = 53
AR5 = 80982Ch
IR0 = 8h
Data at 8098E6h = 0DCh = 220
Data at 80982Ch = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR1 = 8098E7h
R2 = 0DCh = 220
R7 = 35h = 53
AR5 = 809834h
IR0 = 8h
Data at 8098E6h = 0DCh = 220
Data at 80982Ch = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



LDM Load Floating-Point Mantissa

Syntax LDM src, dst

Operation src (man) → dst (man)

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 0 0 1 0 G dst src

Description The mantissa field of the src operand is loaded into the mantissa field of the
dst register. The dst exponent field is not modified. The src and dst operands
are assumed to be floating-point numbers. If the src operand is from memory,
the entire memory contents are loaded as the mantissa. If immediate address-
ing mode is used, bits 15–12 of the instruction word are forced to 0 by the as-
sembler.
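
As a rough illustration of the mantissa load, the following Python sketch treats an extended-precision register as a 40-bit integer with the exponent in bits 39–32 and the mantissa in bits 31–0. That layout is an assumption inferred from the register values shown in the example, the function name ldm is hypothetical, and the source operand is modeled as already expanded to its 40-bit form.

```python
EXP_MASK = 0xFF00000000   # bits 39-32: exponent (assumed layout)
MAN_MASK = 0x00FFFFFFFF   # bits 31-0: mantissa (sign and fraction)

def ldm(dst, src):
    """Return dst with its mantissa field replaced by the mantissa of src."""
    return (dst & EXP_MASK) | (src & MAN_MASK)

# 156.75 is 071CC00000h; the exponent of R2 (00h) is left untouched.
r2 = ldm(0x0000000000, 0x071CC00000)
print(f"{r2:010X}h")
```

The result, 001CC00000h, keeps the destination exponent and takes only the source mantissa, as in the example.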

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit OVM Operation is not affected by OVM bit value.

Example LDM 156.75,R2 (156.75 = 071CC00000h)

Before Instruction:

R2 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R2 = 001CC00000h = 1.22460938e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Load Data Page Pointer LDP

Syntax LDP src, DP

Operation src → data page pointer

Operands src is the 8 MSBs of the absolute 24-bit source address (src).
The “, DP” in the operand is optional.

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 src

Description This pseudo-op is an alternate form of the LDIU instruction, except that LDP
is always in the immediate addressing mode. The src operand field contains
the eight MSBs of the absolute 24-bit src address (essentially, only
bits 23 –16 of src are used). These eight bits are loaded into the eight LSBs
of the data page pointer.

The eight LSBs of the pointer are used in direct addressing as a pointer to the
page of data being addressed. There is a total of 256 pages, each page 64K
words long. Bits 31 – 8 of the pointer are reserved and should be kept set to 0.
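
The page arithmetic can be sketched in Python as follows. The helper names ldp and direct_address are hypothetical; the sketch assumes that a direct address is formed by concatenating the eight LSBs of DP with the 16-bit offset given in the instruction, which agrees with the 256-page, 64K-word layout described above.

```python
def ldp(address):
    """Return the data page (bits 23-16) of a 24-bit address."""
    return (address >> 16) & 0xFF

def direct_address(dp, offset):
    """Combine the 8 LSBs of DP with a 16-bit direct-addressing offset."""
    return ((dp & 0xFF) << 16) | (offset & 0xFFFF)

dp = ldp(0x809900)                  # LDP @809900h
addr = direct_address(dp, 0x985F)   # a later access such as @985Fh
print(f"DP = {dp:02X}h, address = {addr:06X}h")
```

For the address 809900h the page is 80h, and a subsequent direct access to @985Fh resolves to 80985Fh.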

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example LDP @809900h, DP


or
LDP @809900h

Before Instruction:

DP = 65h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

DP = 80h
LUF LV UF N Z V C = 0 0 0 0 0 0 0



LOPOWER Divide Clock by 16

Syntax LOPOWER (TMS320LC31 Only)

Operation H1/16 → H1

Operands None

Encoding
31 23 0

0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

Description Device continues to execute instructions, but at the reduced rate of the CLKIN
frequency divided by 16 (that is, in LOPOWER mode, an ’LC31 with a CLKIN
frequency of 32 MHz will perform in the same way as a 2-MHz ’LC31, which
has an instruction cycle time of 1000 ns). This allows for low-power operation.

The ’LC31 CPU slows down during the read phase of the LOPOWER instruc-
tion. To exit the LOPOWER power-down mode, invoke the MAXSPEED
instruction (opcode = 1080 0000 h). The ’LC31 resumes full-speed operation
during the read phase of the MAXSPEED instruction.
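
The timing figures quoted above can be checked with a short calculation. This Python sketch assumes that one instruction cycle equals two CLKIN periods, which is what the 2-MHz and 1000-ns figures in the description imply; the function name cycle_time_ns is hypothetical.

```python
def cycle_time_ns(clkin_mhz, lopower=False):
    """Instruction cycle time in ns, assuming one cycle = two CLKIN periods."""
    effective_mhz = clkin_mhz / 16 if lopower else clkin_mhz
    return 2000.0 / effective_mhz

full_speed = cycle_time_ns(32)                # normal operation
low_power = cycle_time_ns(32, lopower=True)   # after LOPOWER
print(full_speed, low_power)
```

A 32-MHz CLKIN gives 62.5 ns per cycle at full speed and 1000 ns in LOPOWER mode.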

Note: IDLE2 in LOPOWER Mode
Do not run the IDLE2 instruction in the LOPOWER mode.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example LOPOWER ; The processor slows down operation to


; 1/16th of the H1 clock.

Logical Shift LSH

Syntax LSH count, dst

Operation If count ≥ 0:
dst << count → dst

Else:
dst >> |count | → dst

Operands count general addressing modes (G):


00 any CPU register
01 direct
10 indirect
11 immediate

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 0 0 1 1 G dst count

Description The seven least significant bits of the count operand are used to generate the
two’s complement shift count. If the count operand is greater than 0, the dst
operand is left-shifted by the value of the count operand. Low-order bits shifted
in are 0-filled, and high-order bits are shifted out through the carry (C) bit.

Logical left-shift:
C ← dst ← 0

If the count operand is less than 0, the dst is right-shifted by the absolute value
of the count operand. The high-order bits of the dst operand are 0-filled as they
are shifted to the right. Low-order bits are shifted out through the C bit.

Logical right-shift:
0 → dst → C

If the count operand is 0, no shift is performed, and the C bit is set to 0. The
count operand is assumed to be a signed integer, and the dst operand is as-
sumed to be an unsigned integer.
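
The shift rules above can be modeled in a few lines of Python. This is a hedged sketch rather than a bit-exact hardware model: the names shift_count and lsh are invented for the example, and the handling of counts whose magnitude exceeds 32 (result 0, carry 0) is an assumption.

```python
MASK32 = 0xFFFFFFFF

def shift_count(count):
    """Interpret the 7 LSBs of count as a two's complement value (-64..63)."""
    c = count & 0x7F
    return c - 0x80 if c & 0x40 else c

def lsh(dst, count):
    """Return (result, carry) for a 32-bit logical shift of dst."""
    n = shift_count(count)
    dst &= MASK32
    if n == 0:
        return dst, 0                    # no shift, C is set to 0
    if n > 0:
        wide = dst << n
        carry = (wide >> 32) & 1         # last bit shifted out at the top
        return wide & MASK32, carry
    k = -n
    carry = (dst >> (k - 1)) & 1         # last bit shifted out at the bottom
    return dst >> k, carry

print([hex(v) for v in lsh(0x02AC, 0x18)])
print([hex(v) for v in lsh(0x12C00000, 0xFFFFFFF4)])
```

The two calls use the operand values from the LSH examples: a left shift by 24 and a right shift by 12 (count 0FFFFFFF4h).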

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the last bit shifted out. 0 for a shift count of 0.

Mode Bit OVM Operation is not affected by OVM bit value.

Example 1 LSH R4,R7

Before Instruction:

R4 = 018h = 24
R7 = 02ACh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R4 = 018h = 24
R7 = 0AC000000h
LUF LV UF N Z V C = 0 0 0 1 0 0 0

Example 2 LSH *–AR5(IR1),R5

Before Instruction:

AR5 = 809908h
IR1 = 4h
R5 = 0012C00000h
Data at 809904h = 0FFFFFFF4h = –12
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR5 = 809908h
IR1 = 4h
R5 = 0000012C00h
Data at 809904h = 0FFFFFFF4h = –12
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Logical Shift, 3-Operand LSH3

Syntax LSH3 count, src, dst


Operation If count ≥ 0:
src << count → dst
Else:
src >> |count | → dst
Operands src three-operand addressing modes (T):
00 any CPU register
01 indirect (disp = 0, 1, IR0, IR1)
10 any CPU register
11 indirect (disp = 0, 1, IR0, IR1)
count three-operand addressing modes (T):
00 any CPU register
01 any CPU register
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)
dst register (Rn, 0 ≤ n ≤ 27)
Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 1 0 0 0 T dst src count

Description The seven least significant bits of the count operand are used to generate the
two’s complement shift count.
If the count operand is greater than 0, a copy of the src operand is left-shifted
by the value of the count operand, and the result is written to the dst. (The src
is not changed.) Low-order bits shifted in are 0-filled, and high-order bits are
shifted out through the C (carry) bit.
Logical left-shift:
C ← src ← 0
If the count operand is less than 0, the src operand is right-shifted by the abso-
lute value of the count operand. The high-order bits of the dst operand are 0-
filled as they are shifted to the right. Low-order bits are shifted out through the
C bit.
Logical right-shift:
0 → src → C
If the count operand is 0, no shift is performed, and the C bit is set to 0. The
count operand is assumed to be a signed integer. The src and dst operands
are assumed to be unsigned integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Unaffected if dst is not R7–R0.
Mode Bit OVM Operation is not affected by OVM bit value.

Example 1 LSH3 R4,R7,R2

Before Instruction:

R4 = 018h = 24
R7 = 02ACh
R2 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R4 = 018h = 24
R7 = 02ACh
R2 = 0AC000000h
LUF LV UF N Z V C = 0 0 0 1 0 0 0

Example 2 LSH3 *–AR4(IR1),R5,R3

Before Instruction:

AR4 = 809908h
IR1 = 4h
R5 = 012C00000h
R3 = 0h
Data at 809904h = 0FFFFFFF4h = –12
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR4 = 809908h
IR1 = 4h
R5 = 012C00000h
R3 = 0000012C00h
Data at 809904h = 0FFFFFFF4h = –12
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



LSH3||STI Parallel LSH3 and STI

Syntax LSH3 count, src2, dst1


|| STI src3, dst2
Operation If count ≥ 0:
src2 << count → dst1
Else:
src2 >> |count | → dst1
|| src3 → dst2
Operands count register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn3, 0 ≤ n3 ≤ 7)
src3 register (Rn4, 0 ≤ n4 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)
Encoding
31 24 23 16 15 8 7 0

1 1 0 1 1 1 0 dst1 count src3 dst2 src2

Description The seven least significant bits of the count operand are used to generate the
two’s complement shift count.

If the count operand is greater than 0, a copy of the src2 operand is left-shifted
by the value of the count operand, and the result is written to the dst1. (The
src2 is not changed.) Low-order bits shifted in are 0-filled, and high-order bits
are shifted out through the C (carry) bit.

Logical left-shift:
C ← src2 ← 0
If the count operand is less than 0, the src2 operand is right-shifted by the
absolute value of the count operand. The high-order bits of the dst1 operand are
0-filled as they are shifted to the right. Low-order bits are shifted out through
the C (carry) bit.

Logical right-shift:
0 → src2 → C
If the count operand is 0, no shift is performed, and the carry bit is set to 0.

The count operand is assumed to be a seven-bit signed integer, and the src2
and dst1 operands are assumed to be unsigned integers. All registers are read
at the beginning and loaded at the end of the execute cycle. This means that
if one of the parallel operations (STI) reads from a register and the operation
being performed in parallel (LSH3) writes to the same register, STI accepts as
input the contents of the register before it is modified by the LSH3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.
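
The register timing rule for the parallel form can be sketched as a two-phase update in Python. The names logical_shift and lsh3_par_sti are hypothetical, registers and memory are plain dictionaries, and auxiliary register updates are omitted; the pairing below (LSH3 writing R0 while STI stores R0) is chosen only to make the ordering visible.

```python
MASK32 = 0xFFFFFFFF

def logical_shift(value, count):
    """Logical shift by a 7-bit two's complement count (left if positive)."""
    n = count & 0x7F
    if n & 0x40:
        n -= 0x80
    value &= MASK32
    return (value << n) & MASK32 if n >= 0 else value >> -n

def lsh3_par_sti(regs, memory, count_reg, src_addr, dst_reg, store_reg, store_addr):
    # Read phase: every source is sampled before anything is written.
    count = regs[count_reg]
    src = memory[src_addr]
    store_value = regs[store_reg]
    # Write phase: results are committed together at the end of the cycle.
    regs[dst_reg] = logical_shift(src, count)
    memory[store_addr] = store_value

regs = {"R0": 0x11, "R2": 0x18}
memory = {0x8098C3: 0xAC, 0x8098A2: 0x0}
lsh3_par_sti(regs, memory, "R2", 0x8098C3, "R0", "R0", 0x8098A2)
print(hex(regs["R0"]), hex(memory[0x8098A2]))
```

STI stores the value R0 held before the instruction (11h), while R0 itself ends up with the shifted result.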

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Mode Bit OVM Operation is not affected by OVM bit value.

Example 1 LSH3 R2,*++AR3(1),R0


|| STI R4,*–AR5

Before Instruction:

R2 = 18h = 24
AR3 = 8098C2h
R0 = 0h
R4 = 0DCh = 220
AR5 = 8098A3h
Data at 8098C3h = 0ACh
Data at 8098A2h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R2 = 18h = 24
AR3 = 8098C3h
R0 = 0AC000000h
R4 = 0DCh = 220
AR5 = 8098A3h
Data at 8098C3h = 0ACh
Data at 8098A2h = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 1 0 0 0

Example 2 LSH3 R7,*AR2––(1),R2


|| STI R0,*+AR0(1)

Before Instruction:

R7 = 0FFFFFFF4h = –12
AR2 = 809863h
R2 = 0h
R0 = 12Ch = 300
AR0 = 8098B7h
Data at 809863h = 2C000000h
Data at 8098B8h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R7 = 0FFFFFFF4h = –12
AR2 = 809862h
R2 = 2C000h
R0 = 12Ch = 300
AR0 = 8098B7h
Data at 809863h = 2C000000h
Data at 8098B8h = 12Ch = 300
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Restore Clock to Regular Speed MAXSPEED

Syntax MAXSPEED

Operation H1/16 → H1

Operands None

Encoding
31 23 16 15 8 7 0

0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Description Exits LOPOWER power-down mode (invoked by LOPOWER instruction with
opcode 10800001h). The ’LC31 resumes full-speed operation during the read
phase of the MAXSPEED instruction.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example MAXSPEED ; The processor resumes full-speed operation.



MPYF Multiply Floating Point

Syntax MPYF src, dst

Operation dst × src → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 0 1 0 0 G dst src

Description The product of the dst and src operands is loaded into the dst register. The src
operand is assumed to be a single-precision floating-point number, and the dst
operand is an extended-precision floating-point number.
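
To relate the hexadecimal register values in the floating-point examples to their decimal annotations, the following Python sketch decodes a 40-bit extended-precision value. It assumes the TMS320C3x floating-point format described elsewhere in this guide (an 8-bit two's complement exponent in bits 39–32, a sign bit in bit 31, a 31-bit fraction with an implied leading bit, and an exponent of –128 denoting zero). The function name c3x_to_float is hypothetical, and the multiplication uses ordinary Python floats, so it does not model the device's internal rounding.

```python
def c3x_to_float(reg40):
    """Decode a 40-bit extended-precision register value into a Python float."""
    exp = (reg40 >> 32) & 0xFF
    if exp >= 0x80:
        exp -= 0x100                   # two's complement exponent
    if exp == -128:
        return 0.0                     # reserved encoding for zero
    mantissa = reg40 & 0xFFFFFFFF
    sign = mantissa >> 31
    fraction = (mantissa & 0x7FFFFFFF) / 2.0**31
    # Implied bit: 01.f for positive values, 10.f (two's complement) for negative.
    significand = (-2.0 + fraction) if sign else (1.0 + fraction)
    return significand * 2.0**exp

r0 = c3x_to_float(0x070C800000)        # src of MPYF R0,R2
r2 = c3x_to_float(0x034C200000)        # dst before the instruction
print(r0, r2, r0 * r2)
print(c3x_to_float(0x0A600F2000))      # dst after the instruction
```

The decoded operands multiply to the same value as the decoded result register in the example.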

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example MPYF R0,R2

Before Instruction:

R0 = 070C800000h = 1.4050e + 02
R2 = 034C200000h = 1.27578125e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R0 = 070C800000h = 1.4050e + 02
R2 = 0A600F2000h = 1.79247266e + 03
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Multiply Floating Point, 3-Operand MPYF3

Syntax MPYF3 src2, src1, dst

Operation src1 × src2 → dst

Operands src1 three-operand addressing modes (T):


00 register (Rn1, 0 ≤ n1 ≤ 7)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 7)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


00 register (Rn2, 0 ≤ n2 ≤ 7)
01 register (Rn2, 0 ≤ n2 ≤ 7)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 1 0 0 1 T dst src1 src2

Description The product of the src1 and src2 operands is loaded into the dst register. The
src1 and src2 operands are assumed to be single-precision floating-point
numbers, and the dst operand is an extended-precision floating-point number.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example 1 MPYF3 R0,R7,R1

Before Instruction:

R0 = 057B400000h = 6.281250e + 01
R7 = 0733C00000h = 1.79750e + 02
R1 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R0 = 057B400000h = 6.281250e + 01
R7 = 0733C00000h = 1.79750e + 02
R1 = 0D306A3000h = 1.12905469e + 04
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Example 2 MPYF3 *+AR2(IR0),R7,R2


or
MPYF3 R7,*+AR2(IR0),R2

Before Instruction:

AR2 = 809800h
IR0 = 12Ah
R7 = 057B400000h = 6.281250e + 01
R2 = 0h
Data at 80992Ah = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR2 = 809800h
IR0 = 12Ah
R7 = 057B400000h = 6.281250e + 01
R2 = 0D09E4A000h = 8.82515625e + 03
Data at 80992Ah = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Parallel MPYF3 and ADDF3 MPYF3||ADDF3

Syntax MPYF3 srcA, srcB, dst1


|| ADDF3 srcC, srcD, dst2

Operation srcA × srcB → dst1


|| srcC + srcD → dst2

Operands srcA, srcB, srcC, srcD:
Any two indirect (disp = 0, 1, IR0, IR1)
Any two register (Rn, 0 ≤ n ≤ 7)

dst1 register (d1):


0 = R0
1 = R1

dst2 register (d2):


0 = R2
1 = R3

src1 register (Rn, 0 ≤ n ≤ 7)


src2 register (Rn, 0 ≤ n ≤ 7)
src3 indirect (disp = 0, 1, IR0, IR1)
src4 indirect (disp = 0, 1, IR0, IR1)

P parallel addressing modes (0 ≤ P ≤ 3)

Operation (P Field)

00 src3 × src4, src1 + src2


01 src3 × src1, src4 + src2
10 src1 × src2, src3 + src4
11 src3 × src1, src2 + src4

Encoding
31 24 23 16 15 8 7 0

1 0 0 0 0 0 P d1 d2 src1 src2 src3 src4

Description A floating-point multiplication and a floating-point addition are performed in
parallel. All registers are read at the beginning and loaded at the end of the
execute cycle. This means that if one of the parallel operations (MPYF3) reads
from a register and the operation being performed in parallel (ADDF3) writes
to the same register, then MPYF3 accepts as input the contents of the register
before it is modified by the ADDF3.

Any combination of addressing modes can be coded for the four possible
source operands as long as two are coded as indirect and two are register. The
assignment of the source operands srcA – srcD to the src1 – src4 fields
varies, depending on the combination of addressing modes used, and the P
field is encoded accordingly.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 0
Z 0
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example MPYF3 *AR5++(1),*––AR1(IR0),R0


|| ADDF3 R5,R7,R3

Before Instruction:

AR5 = 8098C5h
AR1 = 8098A8h
IR0 = 4h
R0 = 0h
R5 = 0733C00000h = 1.79750e + 02
R7 = 070C800000h = 1.4050e + 02
R3 = 0h
Data at 8098C5h = 34C0000h = 1.2750e + 01
Data at 8098A4h = 1110000h = 2.265625e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

After Instruction:

AR5 = 8098C6h
AR1 = 8098A4h
IR0 = 4h
R0 = 0467180000h = 2.88867188e + 01
R5 = 0733C00000h = 1.79750e + 02
R7 = 070C800000h = 1.4050e + 02
R3 = 0820200000h = 3.20250e + 02
Data at 8098C5h = 34C0000h = 1.2750e + 01
Data at 8098A4h = 1110000h = 2.265625e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0



MPYF3||STF Parallel MPYF3 and STF

Syntax MPYF3 src2, src1, dst1


|| STF src3, dst2

Operation src1 × src2 → dst1


|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)


src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn3, 0 ≤ n3 ≤ 7)
src3 register (Rn4, 0 ≤ n4 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 0 1 1 1 1 dst1 src1 src3 dst2 src2

Description A floating-point multiplication and a floating-point store are performed in
parallel. All registers are read at the beginning and loaded at the end of the execute
cycle. This means that if one of the parallel operations (MPYF3) writes to a reg-
ister and the operation being performed in parallel (STF) reads from the same
register, the STF accepts as input the contents of the register before it is modi-
fied by the MPYF3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example MPYF3 *–AR2(1),R7,R0


|| STF R3,*AR0––(IR0)

Before Instruction:

AR2 = 80982Bh
R7 = 057B400000h = 6.281250e + 01
R0 = 0h
R3 = 086B280000h = 4.7031250e + 02
AR0 = 809860h
IR0 = 8h
Data at 80982Ah = 70C8000h = 1.4050e + 02
Data at 809860h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR2 = 80982Bh
R7 = 057B400000h = 6.281250e + 01
R0 = 0D09E4A000h = 8.82515625e + 03
R3 = 086B280000h = 4.7031250e + 02
AR0 = 809858h
IR0 = 8h
Data at 80982Ah = 70C8000h = 1.4050e + 02
Data at 809860h = 86B2800h = 4.7031250e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



MPYF3||SUBF3 Parallel MPYF3 and SUBF3

Syntax MPYF3 srcA, srcB, dst1


|| SUBF3 srcC, srcD, dst2

Operation srcA × srcB → dst1


|| srcD – srcC → dst2

Operands srcA, srcB, srcC, srcD:
Any two indirect (disp = 0, 1, IR0, IR1)
Any two register (Rn, 0 ≤ n ≤ 7)

dst1 register (d1):


0 = R0
1 = R1

dst2 register (d2):


0 = R2
1 = R3

src1 register (Rn, 0 ≤ n ≤ 7)


src2 register (Rn, 0 ≤ n ≤ 7)
src3 indirect (disp = 0, 1, IR0, IR1)
src4 indirect (disp = 0, 1, IR0, IR1)

P parallel addressing modes (0 ≤ P ≤ 3)

Operation (P Field)

00 src3 × src4, src1 – src2


01 src3 × src1, src4 – src2
10 src1 × src2, src3 – src4
11 src3 × src1, src2 – src4

Encoding
31 24 23 16 15 8 7 0

1 0 0 0 0 1 P d1 d2 src1 src2 src3 src4

Description A floating-point multiplication and a floating-point subtraction are performed
in parallel. All registers are read at the beginning and loaded at the end of the
execute cycle. This means that if one of the parallel operations (MPYF3) reads
from a register and the operation being performed in parallel (SUBF3) writes
to the same register, MPYF3 accepts as input the contents of the register be-
fore it is modified by the SUBF3.

Any combination of addressing modes can be coded for the four possible
source operands as long as two are coded as indirect and two are coded as
register. The assignment of the source operands srcA – srcD to the src1 – src4
fields varies, depending on the combination of addressing modes used, and
the P field is encoded accordingly.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 0
Z 0
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example MPYF3 R5,*++AR7(IR1),R0


|| SUBF3 R7,*AR3––(1),R2
or
MPYF3 *++AR7(IR1),R5,R0
|| SUBF3 R7,*AR3––(1),R2

Before Instruction:

R5 = 034C000000h = 1.2750e + 01
AR7 = 809904h
IR1 = 8h
R0 = 0h
R7 = 0733C00000h = 1.79750e + 02
AR3 = 8098B2h
R2 = 0h
Data at 80990Ch = 1110000h = 2.265625e + 00
Data at 8098B2h = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R5 = 034C000000h = 1.2750e + 01
AR7 = 80990Ch
IR1 = 8h
R0 = 0467180000h = 2.88867188e + 01
R7 = 0733C00000h = 1.79750e + 02
AR3 = 8098B1h
R2 = 05E3000000h = – 3.9250e + 01
Data at 80990Ch = 1110000h = 2.265625e + 00
Data at 8098B2h = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Multiply Integer MPYI

Syntax MPYI src, dst


Operation dst × src → dst
Operands src general addressing modes (G):
00 any CPU register
01 direct
10 indirect
11 immediate
dst any CPU register
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 0 1 0 1 G dst src

Description The product of the dst and src operands is loaded into the dst register. The src
and dst operands, when read, are assumed to be 24-bit signed integers. The
result is assumed to be a 48-bit signed integer. The output to the dst register
is the 32 least significant bits of the result.
Integer overflow occurs when any of the most significant 16 bits of the 48-bit
result differs from the most significant bit of the 32-bit output value.
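
The multiply and its overflow test can be sketched in Python. This is an illustrative model with hypothetical names (sign_extend, mpyi); it assumes the 24-bit operands are the sign-extended low 24 bits of each register and models only the case OVM = 0, so no saturation is applied.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def mpyi(dst, src):
    """Return (32-bit signed result, overflow flag) for a 24 x 24 multiply."""
    product = sign_extend(dst, 24) * sign_extend(src, 24)  # 48-bit signed value
    result = sign_extend(product, 32)                      # 32 LSBs of the product
    # Overflow: the upper 16 bits of the 48-bit product are not all copies
    # of bit 31, i.e. the product does not fit in 32 signed bits.
    overflow = product != result
    return result, overflow

result, overflow = mpyi(0x78B600, 0x33C251)   # R5 and R1 from the MPYI example
print(result, overflow)
print(mpyi(200, 50))
```

With the operands from the example, the 32 least significant bits read as a negative number and the overflow flag is set; a small product such as 200 × 50 does not overflow.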
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.
Example MPYI R1,R5
Before Instruction:
R1 = 000033C251h = 3,392,081
R5 = 000078B600h = 7,910,912
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R1 = 000033C251h = 3,392,081
R5 = 00E21D9600h = – 501,377,536
LUF LV UF N Z V C = 0 1 0 1 0 1 0



MPYI3 Multiply Integer, 3-Operand

Syntax MPYI3 src2, src1, dst

Operation src1 × src2 → dst

Operands src1 three-operand addressing modes (T):


0 0 any CPU register
0 1 indirect (disp = 0, 1, IR0, IR1)
1 0 any CPU register
1 1 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


0 0 any CPU register
0 1 any CPU register
1 0 indirect (disp = 0, 1, IR0, IR1)
1 1 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 1 0 1 0 T dst src1 src2

Description The product of the src1 and src2 operands is loaded into the dst register. The
src1 and src2 operands are assumed to be 24-bit signed integers. The result
is assumed to be a signed 48-bit integer. The output to the dst register is the
32 least significant bits of the result.

Integer overflow occurs when any of the most significant 16 bits of the 48-bit
result differs from the most significant bit of the 32-bit output value.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.

Example 1 MPYI3 *AR4,*–AR1(1),R2

Before Instruction:

AR4 = 809850h
AR1 = 8098F3h
R2 = 0h
Data at 809850h = 0ADh = 173
Data at 8098F2h = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR4 = 809850h
AR1 = 8098F3h
R2 = 094ACh = 38,060
Data at 809850h = 0ADh = 173
Data at 8098F2h = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Example 2 MPYI3 *––AR4(IR0),R2,R7

Before Instruction:

AR4 = 8099F8h
IR0 = 8h
R2 = 0C8h = 200
R7 = 0h
Data at 8099F0h = 32h = 50
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR4 = 8099F0h
IR0 = 8h
R2 = 0C8h = 200
R7 = 02710h = 10,000
Data at 8099F0h = 32h = 50
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



MPYI3||ADDI3 Parallel MPYI3 and ADDI3

Syntax MPYI3 srcA, srcB, dst1


|| ADDI3 srcC, srcD, dst2

Operation srcA × srcB → dst1


|| srcD + srcC → dst2

Operands srcA, srcB, srcC, srcD:
Any two indirect (disp = 0, 1, IR0, IR1)
Any two register (Rn, 0 ≤ n ≤ 7)

dst1 register (d1):


0 = R0
1 = R1

dst2 register (d2):


0 = R2
1 = R3

src1 register (Rn, 0 ≤ n ≤ 7)


src2 register (Rn, 0 ≤ n ≤ 7)
src3 indirect (disp = 0, 1, IR0, IR1)
src4 indirect (disp = 0, 1, IR0, IR1)

P parallel addressing modes (0 ≤ P ≤ 3)

Operation (P Field)

00 src3 × src4, src1 + src2


01 src3 × src1, src4 + src2
10 src1 × src2, src3 + src4
11 src3 × src1, src2 + src4

Encoding
31 24 23 16 15 8 7 0

1 0 0 0 1 0 P d1 d2 src1 src2 src3 src4

Description An integer multiplication and an integer addition are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle.
This means that if one of the parallel operations (MPYI3) reads from a register
and the operation being performed in parallel (ADDI3) writes to the same
register, then MPYI3 accepts as input the contents of the register before it is
modified by the ADDI3.

Any combination of addressing modes can be coded for the four possible
source operands as long as two are coded as indirect and two are coded as
register. The assignment of the source operands srcA – srcD to the
src1 – src4 fields varies, depending on the combination of addressing modes
used, and the P field is encoded accordingly. To simplify processing when the
order is not significant, the assembler may change the order of operands in
commutative operations.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 0
Z 0
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.

Example MPYI3 R7,R4,R0


|| ADDI3 *–AR3,*AR5– –(1),R3

Before Instruction:

R7 = 14h = 20
R4 = 64h = 100
R0 = 0h
AR3 = 80981Fh
AR5 = 80996Eh
R3 = 0h
Data at 80981Eh = 0FFFFFFCBh = – 53
Data at 80996Eh = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R7 = 14h = 20
R4 = 64h = 100
R0 = 07D0h = 2000
AR3 = 80981Fh
AR5 = 80996Dh
R3 = 0h
Data at 80981Eh = 0FFFFFFCBh = – 53
Data at 80996Eh = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Parallel MPYI3 and STI MPYI3||STI

Syntax MPYI3 src2, src1, dst1


|| STI src3, dst2

Operation src1 × src2 → dst1


|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)


src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn3, 0 ≤ n3 ≤ 7)
src3 register (Rn4, 0 ≤ n4 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 1 0 0 0 0 dst1 src1 src3 dst2 src2

Description An integer multiplication and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (MPYI3) writes to the same register, STI
accepts as input the contents of the register before it is modified by the MPYI3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Integer overflow occurs when any of the most significant 16 bits of the 48-bit
result differ from the most significant bit of the 32-bit output value.
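The overflow rule above can be checked numerically. The Python sketch below assumes 24-bit signed operands, which is what a 48-bit product implies; the function names are hypothetical, and OVM saturation is deliberately not modeled.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def mpyi(a, b):
    """Return (32-bit result, overflow) for a 24-bit x 24-bit signed multiply."""
    product = sign_extend(a, 24) * sign_extend(b, 24)
    p48 = product & ((1 << 48) - 1)        # 48-bit two's complement product
    low32 = p48 & 0xFFFFFFFF               # value written to the destination
    top16 = p48 >> 32                      # 16 most significant bits
    msb = low32 >> 31                      # MSB of the 32-bit output
    overflow = top16 != (0xFFFF if msb else 0x0000)
    return low32, overflow


print(mpyi(0xC8, 0x32))            # 200 x 50, no overflow
print(mpyi(0x7FFFFF, 0x7FFFFF))    # largest positive operands, overflow
```

The first call reproduces the 200 × 50 product from the example for this instruction; the second shows a product whose upper 16 bits are not a sign extension of bit 31.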

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.


Example MPYI3 *++AR0(1),R5,R7


|| STI R2,*–AR3(1)

Before Instruction:

AR0 = 80995Ah
R5 = 32h = 50
R7 = 0h
R2 = 0DCh = 220
AR3 = 80982Fh
Data at 80995Bh = 0C8h = 200
Data at 80982Eh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR0 = 80995Bh
R5 = 32h = 50
R7 = 2710h = 10000
R2 = 0DCh = 220
AR3 = 80982Fh
Data at 80995Bh = 0C8h = 200
Data at 80982Eh = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Parallel MPYI3 and SUBI3 MPYI3||SUBI3

Syntax MPYI3 srcA, srcB, dst1


|| SUBI3 srcC, srcD, dst2

Operation srcA × srcB → dst1


|| srcD – srcC → dst2

Operands srcA, srcB, srcC, srcD:
any two indirect (disp = 0, 1, IR0, IR1)
any two register (Rn, 0 ≤ n ≤ 7)

dst1 register (d1):


0 = R0
1 = R1

dst2 register (d2):


0 = R2
1 = R3

src1 register (Rn, 0 ≤ n ≤ 7)


src2 register (Rn, 0 ≤ n ≤ 7)
src3 indirect (disp = 0, 1, IR0, IR1)
src4 indirect (disp = 0, 1, IR0, IR1)

P parallel addressing modes (0 ≤ P ≤ 3)

Operation (P Field)

00 src3 × src4, src1 – src2


01 src3 × src1, src4 – src2
10 src1 × src2, src3 – src4
11 src3 × src1, src2 – src4

Encoding
31 24 23 16 15 8 7 0

1 0 0 0 1 1 P d1 d2 src1 src2 src3 src4

Description An integer multiplication and an integer subtraction are performed in parallel.
All registers are read at the beginning and loaded at the end of the execute
cycle. This means that if one of the parallel operations (MPYI3) reads from a
register and the operation being performed in parallel (SUBI3) writes to the
same register, MPYI3 accepts as input the contents of the register before it is
modified by the SUBI3.

Any combination of addressing modes can be coded for the four possible
source operands as long as two are coded as indirect and two are coded as
register. The assignment of the source operands srcA – srcD to the src1 – src4
fields varies, depending on the combination of addressing modes used, and the
P field is encoded accordingly. To simplify processing when the order is not
significant, the assembler may change the order of operands in commutative
operations.

Integer overflow occurs when any of the most significant 16 bits of the 48-bit
result differs from the most significant bit of the 32-bit output value.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 0
Z 0
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.

Example MPYI3 R2,*++AR0(1),R0


|| SUBI3 *AR5– –(IR1),R4,R2
or
MPYI3 *++AR0(1),R2,R0
|| SUBI3 *AR5– –(IR1),R4,R2

Before Instruction:

R2 = 32h = 50
AR0 = 8098E3h
R0 = 0h
AR5 = 8099FCh
IR1 = 0Ch
R4 = 07D0h = 2000
Data at 8098E4h = 62h = 98
Data at 8099FCh = 4B0h = 1200
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R2 = 320h = 800
AR0 = 8098E4h
R0 = 01324h = 4900
AR5 = 8099F0h
IR1 = 0Ch
R4 = 07D0h = 2000
Data at 8098E4h = 62h = 98
Data at 8099FCh = 4B0h = 1200
LUF LV UF N Z V C = 0 0 0 0 0 0 0
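In this example R2 is read by MPYI3 and written by SUBI3 in the same cycle. The following Python sketch replays the example under the read-then-write rule from the description; the dictionary-based register and memory model and the function name are assumptions made for illustration, and overflow handling is omitted.

```python
def mpyi3_subi3_example(regs, mem):
    """Model MPYI3 R2,*++AR0(1),R0 || SUBI3 *AR5--(IR1),R4,R2 (sketch)."""
    regs = dict(regs)
    # Address generation: *++AR0(1) preincrements, *AR5--(IR1) postdecrements.
    regs["AR0"] += 1
    mul_operand = mem[regs["AR0"]]
    sub_operand = mem[regs["AR5"]]
    regs["AR5"] -= regs["IR1"]
    # Read phase: both operations sample their register inputs first.
    product = regs["R2"] * mul_operand        # uses the old R2
    difference = regs["R4"] - sub_operand     # srcD - srcC
    # Write phase: results are loaded at the end of the execute cycle.
    regs["R0"] = product
    regs["R2"] = difference
    return regs


before = {"R2": 0x32, "AR0": 0x8098E3, "R0": 0, "AR5": 0x8099FC,
          "IR1": 0x0C, "R4": 0x7D0}
memory = {0x8098E4: 0x62, 0x8099FC: 0x4B0}
after = mpyi3_subi3_example(before, memory)
print({name: hex(value) for name, value in after.items()})
```

Because the product is computed before R2 is overwritten, R0 receives 50 × 98 rather than 800 × 98.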

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



NEGB Negate Integer With Borrow

Syntax NEGB src, dst

Operation 0 – src – C → dst

Operands src general addressing modes (G):


00 any CPU register
01 direct
10 indirect
11 immediate

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 0 1 1 0 G dst src

Description The difference of the 0, src, and C operands is loaded into the dst register. The
dst and src are assumed to be signed integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.

Example NEGB R5,R7

Before Instruction:

R5 = 0FFFFFFCBh = – 53
R7 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 1

After Instruction:

R5 = 0FFFFFFCBh = – 53
R7 = 34h = 52
LUF LV UF N Z V C = 0 0 0 0 0 0 1
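The operation 0 – src – C can be reproduced on 32-bit values as follows. This is a minimal Python sketch with hypothetical helper names; it assumes C is set when the unsigned subtraction borrows and V when the signed result does not fit in 32 bits, and it does not model OVM saturation or the latched LV flag.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value >> 31 else value


def negb(src, carry):
    """Compute 0 - src - C; return (32-bit result, flags)."""
    unsigned = 0 - (src & MASK32) - carry
    signed = 0 - to_signed(src) - carry
    result = unsigned & MASK32
    flags = {
        "N": result >> 31,
        "Z": int(result == 0),
        "V": int(not -(1 << 31) <= signed < (1 << 31)),
        "C": int(unsigned < 0),          # borrow out of bit 31
    }
    return result, flags


print(negb(0xFFFFFFCB, 1))   # R5 = -53 with C = 1, as in the example
```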

Negate Floating Point NEGF

Syntax NEGF src, dst

Operation 0 – src → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 0 1 1 1 G dst src

Description The difference of the 0 and src operands is loaded into the dst register. The
dst and src operands are assumed to be floating-point numbers.
Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected

Mode Bit OVM Operation is not affected by OVM bit value.


Example NEGF *++AR3(2),R1

Before Instruction:

AR3 = 809800h
R1 = 057B400025h = 6.28125006e + 01
Data at 809802h = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:

AR3 = 809802h
R1 = 07F3800000h = –1.4050e + 02
Data at 809802h = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
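The hexadecimal values in this example can be checked with a small decoder. The Python sketch below assumes the floating-point layout described in the manual's data-format chapter (an 8-bit two's complement exponent, a sign bit, and a fraction with an implied leading bit, with exponent –128 reserved for zero); the function name and its frac_bits parameter are hypothetical.

```python
def c3x_float(word, frac_bits):
    """Decode exponent | sign | fraction into a Python float (sketch).

    frac_bits is 23 for a 32-bit memory word and 31 for a 40-bit register.
    """
    frac = word & ((1 << frac_bits) - 1)
    sign = (word >> frac_bits) & 1
    exp = (word >> (frac_bits + 1)) & 0xFF
    if exp >= 0x80:
        exp -= 0x100                    # two's complement exponent
    if exp == -128:
        return 0.0                      # reserved encoding for zero
    # Positive mantissas are 01.f, negative mantissas are 10.f.
    mantissa = (-2.0 if sign else 1.0) + frac / (1 << frac_bits)
    return mantissa * 2.0 ** exp


print(c3x_float(0x070C8000, 23))      # memory operand
print(c3x_float(0x07F3800000, 31))    # R1 after NEGF
```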



NEGF||STF Parallel NEGF and STF

Syntax NEGF src2, dst1


|| STF src3, dst2

Operation 0 – src2 → dst1


|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)


dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 1 0 0 0 1 dst1 0 0 0 src3 dst2 src2

Description A floating-point negation and a floating-point store are performed in parallel.
All registers are read at the beginning and loaded at the end of the execute
cycle. This means that if one of the parallel operations (STF) reads from a
register and the operation being performed in parallel (NEGF) writes to the same
register, STF accepts as input the contents of the register before it is modified
by the NEGF.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example NEGF *AR4– –(1),R7


|| STF R2,*++AR5(1)

Before Instruction:

AR4 = 8098E1h
R7 = 0h
R2 = 0733C00000h = 1.79750e + 02
AR5 = 809803h
Data at 8098E1h = 57B4000h = 6.281250e + 01
Data at 809804h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR4 = 8098E0h
R7 = 0584C00000h = – 6.281250e + 01
R2 = 0733C00000h = 1.79750e + 02
AR5 = 809804h
Data at 8098E1h = 57B4000h = 6.281250e + 01
Data at 809804h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



NEGI Negate Integer

Syntax NEGI src, dst

Operation 0 – src → dst

Operands src general addressing modes (G):


00 any CPU register
01 direct
10 indirect
11 immediate

dst any CPU register

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 1 0 0 0 G dst src

Description The difference of the 0 and src operands is loaded into the dst register. The
dst and src operands are assumed to be signed integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.

Example NEGI 174,R5 (174 = 0AEh)

Before Instruction:

R5 = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R5 = 0FFFFFF52h = –174
LUF LV UF N Z V C = 0 0 0 1 0 0 1

Parallel NEGI and STI NEGI||STI

Syntax NEGI src2, dst1


|| STI src3, dst2

Operation 0 – src2 → dst1


|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)


dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 1 0 0 1 0 dst1 0 0 0 src3 dst2 src2

Description An integer negation and an integer store are performed in parallel. All registers
are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (NEGI) writes to the same register, then
STI accepts as input the contents of the register before it is modified by the
NEGI.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.


Example NEGI *–AR3,R2


|| STI R2,*AR1++

Before Instruction:

AR3 = 80982Fh
R2 = 19h = 25
AR1 = 8098A5h
Data at 80982Eh = 0DCh = 220
Data at 8098A5h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR3 = 80982Fh
R2 = 0FFFFFF24h = – 220
AR1 = 8098A6h
Data at 80982Eh = 0DCh = 220
Data at 8098A5h = 19h = 25
LUF LV UF N Z V C = 0 0 0 1 0 0 1
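Here R2 is both the STI source and the NEGI destination, so the store writes the old value of R2. A minimal Python sketch of that ordering, using a hypothetical dictionary model for registers and memory and ignoring status flags:

```python
MASK32 = 0xFFFFFFFF


def negi_sti(regs, mem):
    """Model NEGI *-AR3,R2 || STI R2,*AR1++ (sketch)."""
    regs, mem = dict(regs), dict(mem)
    # *-AR3 reads from AR3 - 1 and leaves AR3 unchanged.
    src2 = mem[regs["AR3"] - 1]
    # STI samples R2 before NEGI writes it.
    store_value = regs["R2"]
    # *AR1++ uses AR1 as the address, then increments it.
    store_addr = regs["AR1"]
    regs["AR1"] += 1
    # Both results are committed at the end of the execute cycle.
    regs["R2"] = (0 - src2) & MASK32
    mem[store_addr] = store_value
    return regs, mem


regs, mem = negi_sti({"AR3": 0x80982F, "R2": 0x19, "AR1": 0x8098A5},
                     {0x80982E: 0xDC, 0x8098A5: 0x0})
print(hex(regs["R2"]), hex(mem[0x8098A5]))
```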

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

No Operation NOP

Syntax NOP src


Operation No ALU or multiplier operations.
ARn is modified if src is specified in indirect mode.
Operands src general addressing modes (G):
00 register (no operation)
10 indirect (modify ARn, 0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 1 0 0 1 G 0 0 0 0 0 src

Description If the src operand is specified in the indirect mode, the specified addressing
operation is performed, and a dummy memory read occurs. If the src operand
is omitted, no operation is performed.
Cycles 1
Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example 1 NOP
Before Instruction:
PC = 3Ah
After Instruction:
PC = 3Bh
Example 2 NOP *AR3– –(1)
Before Instruction:
PC = 5h
AR3 = 809900h
After Instruction:
PC = 6h
AR3 = 8098FFh



NORM Normalize

Syntax NORM src, dst

Operation norm (src) → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 1 0 1 0 G dst src

Description The src operand is assumed to be an unnormalized floating-point number; that
is, the implied bit is set equal to the sign bit. The dst is set equal to the
normalized src operand with the implied bit removed. The dst operand exponent is
set to the src operand exponent minus the size of the left-shift necessary to
normalize the src. The dst operand is assumed to be a normalized floating-point
number.

If src (exp) = –128 and src (man) = 0, then dst = 0, Z = 1, and UF = 0. If src (exp)
= –128 and src (man) ≠ 0, then dst = 0, Z = 0, and UF = 1. For all other cases
of the src, if a floating-point underflow occurs, then dst (man) is forced to 0 and
dst (exp) = –128. If src (man) = 0, then dst (man) = 0 and dst (exp) = –128.
Refer to Section 4.6 on page 4-18 for more information.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV Unaffected
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example NORM R1,R2

Before Instruction:

R1 = 0400003AF5h
R2 = 070C800000h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R1 = 0400003AF5h
R2 = F26BD40000h = 1.12451613e – 04
LUF LV UF N Z V C = 0 0 0 0 0 0 0
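The normalization rule can be sketched in Python as shown below. The function name is hypothetical, the register is modeled as a 40-bit integer (8-bit exponent above a 32-bit mantissa field), and the sketch assumes that underflow means the adjusted exponent would fall below –127; only the UF outcome is returned alongside the result.

```python
ZERO = 0x80 << 32   # exponent -128, mantissa 0


def norm(word40):
    """Return (normalized 40-bit word, UF) for an unnormalized input (sketch)."""
    exp = (word40 >> 32) & 0xFF
    exp = exp - 0x100 if exp >= 0x80 else exp
    man = word40 & 0xFFFFFFFF
    if exp == -128:
        return ZERO, int(man != 0)
    if man == 0:
        return ZERO, 0
    sign = man >> 31
    # With the implied bit equal to the sign bit, shift left until the bit
    # entering the implied position differs from the sign bit.
    probe = (~man if sign else man) & 0x7FFFFFFF
    shift = 32 - probe.bit_length()
    new_exp = exp - shift
    if new_exp < -127:
        return ZERO, 1
    new_man = (sign << 31) | ((man << shift) & 0x7FFFFFFF)
    return ((new_exp & 0xFF) << 32) | new_man, 0


result, uf = norm(0x0400003AF5)
print(f"{result:010X}h UF={uf}")
```

For the example input the left shift is 18 bits, so the exponent moves from 4 to –14 (0F2h) and the mantissa field becomes 6BD40000h.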



NOT Bitwise Logical-Complement

Syntax NOT src, dst

Operation ∼src → dst

Operands src general addressing modes (G):


00 any CPU register
01 direct
10 indirect
11 immediate
dst any CPU register
Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 1 0 1 1 G dst src

Description The bitwise logical-complement of the src operand is loaded into the dst
register. The complement is formed by a logical-NOT of each bit of the src operand.
The dst and src operands are assumed to be unsigned integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example NOT @982Ch,R4

Before Instruction:

DP = 80h
R4 = 0h
Data at 80982Ch = 5E2Fh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:

DP = 80h
R4 = 0FFFFA1D0h
Data at 80982Ch = 5E2Fh
LUF LV UF N Z V C = 0 0 0 1 0 0 0
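The example combines direct addressing with a 32-bit complement. The Python sketch below uses hypothetical helper names and assumes, as the example's values suggest, that the direct address is formed from DP as the upper 8 bits and the 16-bit operand as the lower bits.

```python
def direct_address(dp, offset):
    """Form a 24-bit address from the data-page pointer and a 16-bit offset."""
    return ((dp & 0xFF) << 16) | (offset & 0xFFFF)


def not32(src):
    """Return (32-bit complement, flags); N is the MSB of the output."""
    result = ~src & 0xFFFFFFFF
    return result, {"N": result >> 31, "Z": int(result == 0)}


memory = {0x80982C: 0x5E2F}
address = direct_address(0x80, 0x982C)
print(hex(address), not32(memory[address]))
```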

Parallel NOT and STI NOT||STI

Syntax NOT src2, dst1


|| STI src3, dst2

Operation ∼src2 → dst1


|| src3 → dst2

Operands src2 indirect (disp = 0, 1, IR0, IR1)


dst1 register (Rn1, 0 ≤ n1 ≤ 7)
src3 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 1 0 0 1 1 dst1 0 0 0 src3 dst2 src2

Description A bitwise logical-NOT and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (NOT) writes to the same register, STI
accepts as input the contents of the register before it is modified by the NOT.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example NOT *+AR2,R3


|| STI R7,*– –AR4(IR1)

Before Instruction:

AR2 = 8099CBh
R3 = 0h
R7 = 0DCh = 220
AR4 = 809850h
IR1 = 10h
Data at 8099CCh = 0C2Fh
Data at 809840h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR2 = 8099CBh
R3 = 0FFFFF3D0h
R7 = 0DCh = 220
AR4 = 809840h
IR1 = 10h
Data at 8099CCh = 0C2Fh
Data at 809840h = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 1 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Bitwise Logical-OR OR

Syntax OR src, dst


Operation dst OR src → dst
Operands src general addressing modes (G):
00 any CPU register
01 direct
10 indirect
11 immediate (not sign-extended)
dst any CPU register
Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 0 0 0 0 G dst src

Description The bitwise logical OR between the src and dst operands is loaded into the dst
register. The dst and src operands are assumed to be unsigned integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example OR *++AR1(IR1),R2
Before Instruction:
AR1 = 809800h
IR1 = 4h
R2 = 012560000h
Data at 809804h = 2BCDh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR1 = 809804h
IR1 = 4h
R2 = 012562BCDh
Data at 809804h = 2BCDh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
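The operand list notes that an immediate src is not sign-extended for OR. The Python sketch below contrasts the two treatments for a 16-bit immediate (the width of the src field in the encoding); both function names are hypothetical and the second exists only for comparison.

```python
MASK32 = 0xFFFFFFFF


def or_immediate(dst, imm16):
    """OR with a zero-extended 16-bit immediate, as stated for this instruction."""
    return (dst | (imm16 & 0xFFFF)) & MASK32


def or_sign_extended(dst, imm16):
    """Comparison only: what a sign-extended immediate would produce."""
    imm16 &= 0xFFFF
    extended = imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16
    return (dst | extended) & MASK32


print(hex(or_immediate(0x12560000, 0x8000)))
print(hex(or_sign_extended(0x12560000, 0x8000)))
```

Zero extension leaves the upper 16 bits of dst untouched, which is what makes an immediate OR usable for setting individual low-order bits.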



OR3 Bitwise Logical-OR, 3-Operand

Syntax OR3 src2, src1, dst

Operation src1 OR src2 → dst

Operands src1 three-operand addressing modes (T):


00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 1 0 1 1 T dst src1 src2

Description The bitwise logical-OR between the src1 and src2 operands is loaded into the
dst register. The src1, src2, and dst operands are assumed to be unsigned
integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example OR3 *++AR1(IR1),R2,R7

Before Instruction:

AR1 = 809800h
IR1 = 4h
R2 = 012560000h
R7 = 0h
Data at 809804h = 2BCDh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR1 = 809804h
IR1 = 4h
R2 = 012560000h
R7 = 012562BCDh
Data at 809804h = 2BCDh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



OR3||STI Parallel OR3 and STI

Syntax OR3 src2, src1, dst1


|| STI src3, dst2

Operation src1 OR src2 → dst1


|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)


src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 1 0 1 0 0 dst1 src1 src3 dst2 src2

Description A bitwise logical-OR and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (OR3) writes to the same register, then
STI accepts as input the contents of the register before it is modified by the
OR3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example OR3 *++AR2,R5,R2


|| STI R6,*AR1– –

Before Instruction:

AR2 = 809830h
R5 = 800000h
R2 = 0h
R6 = 0DCh = 220
AR1 = 809883h
Data at 809831h = 9800h
Data at 809883h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR2 = 809831h
R5 = 800000h
R2 = 809800h
R6 = 0DCh = 220
AR1 = 809882h
Data at 809831h = 9800h
Data at 809883h = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



POP Pop Integer

Syntax POP dst

Operation *SP– – → dst

Operands dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 1 1 0 0 0 1 dst 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Description The top of the current system stack is popped and loaded into the dst register
(32 LSBs). The top of the stack is assumed to be a signed integer. The POP
is performed with a postdecrement of the stack pointer. The exponent bits of
an extended precision register (R7–R0) are left unmodified.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example POP R3

Before Instruction:

SP = 809856h
R3 = 012DAh = 4,826
Data at 809856h = FFFF0DA4h = – 62,044
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

SP = 809855h
R3 = 0FFFF0DA4h = –62,044
Data at 809856h = FFFF0DA4h = – 62,044
LUF LV UF N Z V C = 0 0 0 1 0 0 0

Pop Floating Point POPF

Syntax POPF dst

Operation *SP– – → dst

Operands dst register (Rn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 1 1 0 1 0 1 dst 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Description The top of the current system stack is popped and loaded into the dst register
(32 MSBs). The top of the stack is assumed to be a floating-point number. The
POP is performed with a postdecrement of the stack pointer. The eight LSBs
of an extended precision register (R7–R0) are 0 filled.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example POPF R4

Before Instruction:

SP = 80984Ah
R4 = 025D2E0123h = 6.91186578e + 00
Data at 80984Ah = 5F2C1302h = 5.32544007e + 28
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

SP = 809849h
R4 = 5F2C130200h = 5.32544007e + 28
Data at 80984Ah = 5F2C1302h = 5.32544007e + 28
LUF LV UF N Z V C = 0 0 0 0 0 0 0



PUSH PUSH Integer

Syntax PUSH src

Operation src → *++SP

Operands src register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 1 1 1 0 0 1 src 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Description The contents of the src register (32 LSBs) are pushed on the current system
stack. The src is assumed to be a signed integer. The PUSH is performed with
a preincrement of the stack pointer. The integer or mantissa portion of an
extended precision register (R7–R0) is saved with this instruction.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example PUSH R6

Before Instruction:

SP = 8098AEh
R6 = 025C128081h (32 LSBs = 5C128081h = 1,544,716,417)
Data at 8098AFh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

SP = 8098AFh
R6 = 025C128081h (32 LSBs = 5C128081h = 1,544,716,417)
Data at 8098AFh = 5C128081h = 1,544,716,417
LUF LV UF N Z V C = 0 0 0 0 0 0 0
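PUSH and POP are mirror images: PUSH preincrements SP before the write, and POP postdecrements SP after the read. A minimal Python sketch of that stack discipline follows; the function names and the dictionary used as memory are assumptions made for illustration, and status flags are not modeled.

```python
def push(sp, mem, src):
    """src -> *++SP: preincrement SP, then store the 32 LSBs of src."""
    sp += 1
    mem[sp] = src & 0xFFFFFFFF
    return sp


def pop(sp, mem):
    """*SP-- -> dst: read the top of the stack, then postdecrement SP."""
    value = mem[sp]
    return value, sp - 1


memory = {}
sp = push(0x8098AE, memory, 0x025C128081)   # R6 from the example
print(hex(sp), hex(memory[sp]))
value, sp = pop(sp, memory)
print(hex(value), hex(sp))
```

Only the low 32 bits of the 40-bit register reach the stack, so the exponent byte (02h here) is not saved by PUSH.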

PUSH Floating Point PUSHF

Syntax PUSHF src

Operation src → *++SP

Operands src register (Rn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

0 0 0 0 1 1 1 1 1 0 1 src 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Description The contents of the src register (32 MSBs) are pushed on the current system
stack. The src is assumed to be a floating-point number. The PUSH is performed
with a preincrement of the stack pointer. The eight LSBs of the mantissa
are not saved. (Note the difference in R2 and the value on the stack in the
example below.)

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example PUSHF R2

Before Instruction:

SP = 809801h
R2 = 025C128081h = 6.87725854e + 00
Data at 809802h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

SP = 809802h
R2 = 025C128081h = 6.87725854e + 00
Data at 809802h = 025C1280h = 6.87725830e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
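The loss of the eight mantissa LSBs can be seen by treating the register as a 40-bit integer. In the Python sketch below both helper names are hypothetical; pushf_word keeps the 32 MSBs as PUSHF does, and popf_value zero-fills the eight LSBs as described for POPF.

```python
def pushf_word(reg40):
    """32 MSBs of a 40-bit extended-precision register, as stored by PUSHF."""
    return (reg40 >> 8) & 0xFFFFFFFF


def popf_value(word32):
    """40-bit register contents after POPF: the eight LSBs are zero-filled."""
    return (word32 & 0xFFFFFFFF) << 8


stacked = pushf_word(0x025C128081)
print(f"{stacked:08X}h")
print(f"{popf_value(stacked):010X}h")
```

A PUSHF followed by POPF therefore restores the register with its low mantissa byte cleared.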



RETIcond Return From Interrupt Conditionally

Syntax RETIcond

Operation If cond is true:


*SP – – → PC
1 → ST (GIE).

Else, continue.

Operands None

Encoding
31 24 23 16 15 8 7 0

0 1 1 1 1 0 0 0 0 0 0 cond 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Description A conditional return is performed. If the condition is true, the top of the stack
is popped to the PC, and a 1 is written to the global interrupt enable (GIE) bit
of the status register. This has the effect of enabling all interrupts for which the
corresponding interrupt enable bit is a 1.

The TMS320C3x provides 20 condition codes that can be used with this
instruction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction
only when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI,
CMPI3, TSTB, or TSTB3) is executed.

Cycles 4

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example RETINZ

Before Instruction:

PC = 456h
SP = 809830h
ST = 0h
Data at 809830h = 123h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

PC = 123h
SP = 80982Fh
ST = 2000h
Data at 809830h = 123h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
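The conditional return can be summarized in a few lines of Python. This sketch is illustrative only: the function name is hypothetical, the condition is passed in already evaluated, and the GIE mask of 2000h is inferred from the example, where ST changes from 0h to 2000h.

```python
GIE = 0x2000   # assumed bit position, taken from the ST values in the example


def reti(cond, pc, sp, st, mem):
    """Return (PC, SP, ST) after RETIcond."""
    if not cond:
        return pc + 1, sp, st          # condition false: continue
    # Condition true: pop the return address and re-enable interrupts.
    return mem[sp], sp - 1, st | GIE


# RETINZ with Z = 0, so the condition is true.
pc, sp, st = reti(True, 0x456, 0x809830, 0x0, {0x809830: 0x123})
print(hex(pc), hex(sp), hex(st))
```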



RETScond Return From Subroutine Conditionally

Syntax RETScond
Operation If cond is true:
*SP– – → PC.
Else, continue.
Operands None
Encoding
31 24 23 16 15 8 7 0

0 1 1 1 1 0 0 0 1 0 0 cond 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Description A conditional return is performed. If the condition is true, the top of the stack
is popped to the PC.
The TMS320C3x provides 20 condition codes that you can use with this
instruction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction only
when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI, CMPI3,
TSTB, or TSTB3) is executed.
Cycles 4
Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example RETSGE
Before Instruction:
PC = 123h
SP = 80983Ch
Data at 80983Ch = 456h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 456h
SP = 80983Bh
Data at 80983Ch = 456h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Round Floating Point RND

Syntax RND src, dst

Operation rnd(src) → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 7)

Encoding

31 24 23 16 15 8 7 0

0 0 0 1 0 0 0 1 0 G dst src

Description The result of rounding the src operand is loaded into the dst register. The src
operand is rounded to the nearest single-precision floating-point value. If the
src operand is exactly half-way between two single-precision values, it is
rounded to the most positive value.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs or the src operand is 0;
0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z Unaffected
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected

Mode Bit OVM Operation is not affected by OVM bit value.

Example RND R5,R2

Before Instruction:

R5 = 0733C16EEFh = 1.79755599e + 02
R2 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R5 = 0733C16EEFh = 1.79755599e + 02
R2 = 0733C16F00h = 1.79755600e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: BZUF Instruction


If a BZ instruction is executed immediately following an RND instruction with
a 0 operand, the branch is not performed because the zero flag is not set.
To circumvent this problem, execute a BZUF instruction instead of a BZ
instruction.
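Rounding to the nearest single-precision value, with ties going to the more positive value, amounts to adding half of the last retained mantissa bit and clearing the eight LSBs. The Python sketch below is a simplified model with a hypothetical function name: the register is a 40-bit integer, status flags are not modeled, a zero operand (exponent –128) is returned unchanged, and the case where the addition carries out of the fraction and needs renormalization is left unimplemented.

```python
def rnd(word40):
    """Round a 40-bit extended-precision value to single precision (sketch)."""
    exp = (word40 >> 32) & 0xFF
    man = word40 & 0xFFFFFFFF
    if exp == 0x80:                       # exponent -128 encodes zero
        return word40
    if (man & 0x7FFFFFFF) + 0x80 > 0x7FFFFFFF:
        raise NotImplementedError("mantissa carry needs renormalization")
    rounded = (man + 0x80) & 0xFFFFFF00   # add half an LSB, drop 8 bits
    return (exp << 32) | rounded


print(f"{rnd(0x0733C16EEF):010X}h")
```

Because the mantissa is two's complement, adding 80h moves both positive and negative values toward the more positive neighbor when the discarded byte is exactly 80h.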

Rotate Left ROL

Syntax ROL dst

Operation dst left-rotated 1 bit → dst

Operands dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 0 0 1 1 1 1 dst 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

Description The contents of the dst operand are left-rotated one bit and loaded into the dst
register. This is a circular rotation, with the MSB transferred into the LSB.

Rotate left: the MSB of dst is rotated into the LSB of dst and copied to C.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the bit rotated out of the high-order bit. Unaffected
if dst is not R7 – R0.
Mode Bit OVM Operation is not affected by OVM bit value.

Example ROL R3

Before Instruction:

R3 = 80025CD4h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R3 = 0004B9A9h
LUF LV UF N Z V C = 0 0 0 0 0 0 1
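
As a cross-check of the example, the rotate can be modeled in a few lines of Python. This is an illustrative sketch only; the function name `rol` is hypothetical and the model covers just the 32-bit result and the carry bit.

```python
def rol(dst: int) -> tuple[int, int]:
    """Return (rotated 32-bit value, carry) for a one-bit circular left rotate."""
    msb = (dst >> 31) & 1                       # bit rotated out of the high-order bit
    result = ((dst << 1) | msb) & 0xFFFFFFFF    # the same bit re-enters at the LSB
    return result, msb


value, carry = rol(0x80025CD4)
print(f"R3 = {value:08X}h, C = {carry}")
```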



ROLC Rotate Left Through Carry

Syntax ROLC dst

Operation dst left-rotated one bit through carry bit → dst

Operands dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 0 1 0 0 1 1 dst 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

Description The contents of the dst operand are left-rotated one bit through the carry bit
and loaded into the dst register. The MSB is rotated to the carry bit at the same
time the carry bit is transferred to the LSB.

Rotate left through carry bit: C ← dst(MSB); previous C → dst(LSB)

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7– R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the bit rotated out of the high-order bit. If dst is not
R7–R0, then C is shifted into the dst but not changed.
Mode Bit OVM Operation is not affected by OVM bit value.

Example 1 ROLC R3

Before Instruction:

R3 = 00000420h
LUF LV UF N Z V C = 0 0 0 0 0 0 1

After Instruction:

R3 = 00000841h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


Example 2 ROLC R3

Before Instruction:

R3 = 80004281h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R3 = 00008502h
LUF LV UF N Z V C = 0 0 0 0 0 0 1
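
Both ROLC examples can be checked with a short Python model, and the same model suggests why the instruction is useful: chaining it across two words gives a 64-bit left shift. This is a hedged sketch; `rolc` and `shl64` are hypothetical helper names, not routines from this guide.

```python
def rolc(dst: int, carry: int) -> tuple[int, int]:
    """Rotate a 32-bit value left one bit through the carry bit."""
    new_carry = (dst >> 31) & 1                       # MSB moves to C
    result = ((dst << 1) | (carry & 1)) & 0xFFFFFFFF  # old C moves to the LSB
    return result, new_carry


def shl64(hi: int, lo: int) -> tuple[int, int, int]:
    """Shift a 64-bit value held in two 32-bit words left by one bit."""
    lo, carry = rolc(lo, 0)       # low word first, with C cleared beforehand
    hi, carry = rolc(hi, carry)   # carry out of the low word enters the high word
    return hi, lo, carry


print([f"{v:08X}h C={c}" for v, c in (rolc(0x00000420, 1), rolc(0x80004281, 0))])
```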



ROR Rotate Right

Syntax ROR dst

Operation dst right-rotated one bit → dst

Operands dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 0 1 0 1 1 1 dst 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Description The contents of the dst operand are right-rotated one bit and loaded into the
dst register. The LSB is rotated into the carry bit and also transferred into the
MSB.

Rotate right: dst(LSB) → C; dst(LSB) also wraps into dst(MSB)

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7– R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the bit rotated out of the low-order bit. Unaffected
if dst is not R7–R0.
Mode Bit OVM Operation is not affected by OVM bit value.

Example ROR R7

Before Instruction:

R7 = 00000421h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R7 = 80000210h
LUF LV UF N Z V C = 0 0 0 1 0 0 1
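
A minimal Python sketch of the same operation is shown here for checking the example by hand. The function name `ror` is hypothetical; the model returns the 32-bit result, the carry, and the N flag taken from the result's MSB.

```python
def ror(dst: int) -> tuple[int, int, int]:
    """Return (result, carry, negative flag) for a one-bit circular right rotate."""
    lsb = dst & 1                                   # bit rotated out of the LSB
    result = ((dst & 0xFFFFFFFF) >> 1) | (lsb << 31)  # the same bit re-enters at the MSB
    return result, lsb, result >> 31


value, carry, negative = ror(0x00000421)
print(f"R7 = {value:08X}h, N = {negative}, C = {carry}")
```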

Rotate Right Through Carry RORC

Syntax RORC dst

Operation dst right-rotated one bit through carry bit → dst

Operands dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 0 1 1 0 1 1 dst 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Description The contents of the dst operand are right-rotated one bit through the status
register’s carry bit. This could be viewed as a 33-bit shift. The carry bit value
is rotated into the MSB of the dst, while at the same time the dst LSB is rotated
into the carry bit.

Rotate right through carry bit: previous C → dst(MSB); dst(LSB) → C

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the bit rotated out of the low-order bit. If dst is not
R7 – R0, then C is shifted in but not changed.
Mode Bit OVM Operation is not affected by OVM bit value.

Example RORC R4

Before Instruction:

R4 = 80000081h
LUF LV UF N Z V C = 0 0 0 1 0 0 0

After Instruction:

R4 = 40000040h
LUF LV UF N Z V C = 0 0 0 0 0 0 1
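
The 33-bit view mentioned in the description can be made concrete with a small Python model: the register and the carry bit together form a 33-bit ring, so 33 applications restore the starting state. This is an illustrative sketch with a hypothetical function name `rorc`.

```python
def rorc(dst: int, carry: int) -> tuple[int, int]:
    """Rotate a 32-bit value right one bit through the carry bit."""
    new_carry = dst & 1                                        # LSB moves to C
    result = ((dst & 0xFFFFFFFF) >> 1) | ((carry & 1) << 31)   # old C moves to the MSB
    return result, new_carry


value, carry = rorc(0x80000081, 0)
print(f"R4 = {value:08X}h, C = {carry}")

# Register plus carry behave as one 33-bit quantity.
state = (0x80000081, 0)
for _ in range(33):
    state = rorc(*state)
print(state == (0x80000081, 0))
```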



RPTB Repeat Block

Syntax RPTB src


Operation src → RE
1 → ST (RM)
Next PC → RS
Operands src long-immediate addressing mode
Encoding
31 24 23 16 15 8 7 0

0 1 1 0 0 1 0 0 src

Description RPTB allows a block of instructions to be repeated a number of times without
any penalty for looping. This instruction activates the block repeat mode of
updating the PC. The src operand is a 24-bit unsigned immediate value that is
loaded into the repeat end address (RE) register. A 1 is written into the repeat
mode bit of status register ST (RM) to indicate that the PC is being updated
in the repeat mode. The address of the next instruction is loaded into the repeat
start address (RS) register.
Cycles 4
Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example RPTB 127h
Before Instruction:
PC = 123h
ST = 0h
RE = 0h
RS = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 124h
ST = 100h
RE = 127h
RS = 124h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
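
The effect of RS and RE on program flow can be sketched as a PC trace in Python. This is a simplified model, not a description of the actual fetch hardware. It assumes that the repeat counter (RC) was loaded before RPTB, that the block executes RC + 1 times, that every instruction occupies one word, and that the block contains no branches. The function name `rptb_trace` is hypothetical.

```python
def rptb_trace(pc: int, end: int, rc: int) -> tuple[int, list[int]]:
    """Return (RS, list of fetched addresses) for RPTB executed at address pc."""
    rs = pc + 1                      # Next PC -> RS
    re = end                         # src -> RE
    if re < rs:
        raise ValueError("repeat end address precedes repeat start address")
    trace = []
    addr = rs
    while True:
        trace.append(addr)
        if addr != re:
            addr += 1                # ordinary sequential fetch
        elif rc > 0:
            rc -= 1                  # end of block reached: loop back to RS
            addr = rs
        else:
            break                    # counter exhausted: leave repeat mode
    return rs, trace


rs, trace = rptb_trace(0x123, 0x127, 2)
print(f"RS = {rs:X}h", [f"{a:X}h" for a in trace])
```

With the addresses from the example and an assumed RC of 2, the block 124h through 127h is fetched three times.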

Repeat Single RPTS

Syntax RPTS src

Operation src → RC
1 → ST (RM)
1 → S
Next PC → RS
Next PC → RE

Operands src general addressing modes (G):


00 register
01 direct
10 indirect
11 immediate

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 0 1 1 1 G 1 1 0 1 1 src

Description The RPTS instruction allows you to repeat a single instruction a number of
times without any penalty for looping. Fetches can also be made from the
instruction register (IR), thus avoiding repeated memory access.

The src operand is loaded into the repeat counter (RC). A 1 is written into the
repeat mode bit of the status register ST (RM). A 1 is also written into the
repeat single bit (S). This indicates that the program fetches are to be performed
only from the instruction register. The next PC is loaded into the repeat end
address (RE) register and the repeat start address (RS) register.

For the immediate mode, the src operand is assumed to be an unsigned
integer and is not sign-extended.

Cycles 4

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example RPTS AR5

Before Instruction:

PC = 123h
ST = 0h
RS = 0h
RE = 0h
RC = 0h
AR5 = 0FFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

PC = 124h
ST = 100h
RS = 124h
RE = 124h
RC = 0FFh
AR5 = 0FFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Signal, Interlocked SIGI

Syntax SIGI

Operation Signal interlocked operation.


Wait for interlock acknowledge.
Clear interlock.

Operands None

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Description An interlocked operation is signaled over XF0 and XF1. After the interlocked
operation is acknowledged, the interlocked operation ends. SIGI ignores the
external ready signals. Refer to Section 6.4 on page 6-12 for detailed
information.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example SIGI ; The processor sets XF0 to 0, idles
; until XF1 is set to 0, and then
; sets XF0 to 1.



STF Store Floating Point

Syntax STF src, dst

Operation src → dst

Operands src register (Rn, 0 ≤ n ≤ 7)

dst general addressing modes (G):


01 direct
10 indirect

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 1 0 0 0 G src dst

Description The src register is loaded into the dst memory location. The src and dst
operands are assumed to be floating-point numbers.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example STF R2,@98A1h

Before Instruction:

DP = 80h
R2 = 052C501900h = 4.30782204e + 01
Data at 8098A1h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

DP = 80h
R2 = 052C501900h = 4.30782204e + 01
Data at 8098A1h = 52C5019h = 4.30782204e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
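
The relationship between the 40-bit register value and the 32-bit word stored in the example can be sketched in Python. The model below assumes that STF writes the upper 32 bits of the register (8-bit exponent plus the top 24 mantissa bits) and that the single-precision format uses a two's-complement exponent, a sign bit, and a 23-bit fraction with an implied leading bit, with exponent -128 reserved for zero. The names `stf_word` and `decode_single` are hypothetical helpers.

```python
def stf_word(reg40: int) -> int:
    """Return the 32-bit memory word produced from a 40-bit register value."""
    return (reg40 >> 8) & 0xFFFFFFFF      # low 8 mantissa bits are not stored


def decode_single(word: int) -> float:
    """Decode a 32-bit single-precision word into a Python float (sketch)."""
    exp = (word >> 24) & 0xFF
    if exp >= 0x80:
        exp -= 0x100                      # exponent is two's complement
    if exp == -128:
        return 0.0                        # assumed encoding of zero
    sign = (word >> 23) & 1
    frac = (word & 0x7FFFFF) / float(1 << 23)
    mantissa = frac - 2.0 if sign else frac + 1.0
    return mantissa * 2.0 ** exp


word = stf_word(0x052C501900)
print(f"{word:08X}h = {decode_single(word):.7e}")
```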

Store Floating Point, Interlocked STFI

Syntax STFI src, dst

Operation src → dst


Signal end of interlocked operation.

Operands src register (Rn, 0 ≤ n ≤ 7)

dst general addressing modes (G):


01 direct
10 indirect

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 1 0 0 1 G src dst

Description The src register is loaded into the dst memory location. An interlocked
operation is signaled over pins XF0 and XF1. The src and dst operands are assumed
to be floating-point numbers. Refer to Section 6.4 on page 6-12 for detailed
information.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example STFI R3,*–AR4

Before Instruction:

R3 = 0733C00000h = 1.79750e + 02
AR4 = 80993Ch
Data at 80993Bh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R3 = 0733C00000h = 1.79750e + 02
AR4 = 80993Ch
Data at 80993Bh = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0



STF||STF Parallel Store Floating Point

Syntax STF src2, dst2


|| STF src1, dst1

Operation src2 → dst2


|| src1 → dst1

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)


dst1 indirect (disp = 0, 1, IR0, IR1)
src2 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 0 0 0 0 0 src2 0 0 0 src1 dst1 dst2

Description Two STF instructions are executed in parallel. Both src1 and src2 are assumed
to be floating-point numbers.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit OVM Operation is not affected by OVM bit value.

Example STF R4,*AR3--


|| STF R3,*++AR5

Before Instruction:

R4 = 070C800000h = 1.4050e + 02
AR3 = 809835h
R3 = 0733C00000h = 1.79750e + 02
AR5 = 8099D2h
Data at 809835h = 0h
Data at 8099D3h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:
R4 = 070C800000h = 1.4050e + 02
AR3 = 809834h
R3 = 0733C00000h = 1.79750e + 02
AR5 = 8099D3h
Data at 809835h = 070C8000h = 1.4050e + 02
Data at 8099D3h = 0733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



STI Store Integer

Syntax STI src, dst

Operation src → dst

Operands src register (Rn, 0 ≤ n ≤ 27)

dst general addressing modes (G):


01 direct
10 indirect

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 1 0 1 0 G src dst

Description The src register is loaded into the dst memory location. The src and dst
operands are assumed to be signed integers.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example STI R4,@982Bh

Before Instruction:

DP = 80h
R4 = 42BD7h = 273,367
Data at 80982Bh = 0E5FCh = 58,876
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

DP = 80h
R4 = 42BD7h = 273,367
Data at 80982Bh = 42BD7h = 273,367
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Store Integer, Interlocked STII

Syntax STII src, dst

Operation src → dst


Signal end of interlocked operation

Operands src register (Rn, 0 ≤ n ≤ 27)

dst general addressing modes (G):


01 direct
10 indirect

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 1 0 1 1 G src dst

Description The src register is loaded into the dst memory location. An interlocked
operation is signaled over pins XF0 and XF1. The src and dst operands are assumed
to be signed integers. Refer to Section 6.4 on page 6-12 for detailed
information.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example STII R1,@98AEh

Before Instruction:

DP = 80h
R1 = 78Dh
Data at 8098AEh = 25Ch

After Instruction:

DP = 80h
R1 = 78Dh
Data at 8098AEh = 78Dh



STI||STI Parallel STI and STI

Syntax STI src2, dst2


|| STI src1, dst1

Operation src2 → dst2


|| src1 → dst1

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)


dst1 indirect (disp = 0, 1, IR0, IR1)
src2 register (Rn2, 0 ≤ n2 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 0 0 0 0 1 src2 0 0 0 src1 dst1 dst2

Description Two integer stores are performed in parallel. If both stores are executed to the
same address, the value written is that of STI src2, dst2.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example STI R0,*++AR2(IR0)


|| STI R5,*AR0

Before Instruction:

R0 = 0DCh = 220
AR2 = 809830h
IR0 = 8h
R5 = 35h = 53
AR0 = 8098D3h
Data at 809838h = 0h
Data at 8098D3h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0


After Instruction:

R0 = 0DCh = 220
AR2 = 809838h
IR0 = 8h
R5 = 35h = 53
AR0 = 8098D3h
Data at 809838h = 0DCh = 220
Data at 8098D3h = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0
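
The same-address rule from the description can be illustrated with a small Python model of the two stores. This is a sketch only: memory is a plain dictionary, the addresses are assumed to be already computed by the indirect addressing modes, and the write order inside `parallel_sti` (a hypothetical name) is merely a modeling device that makes the `STI src2, dst2` value win on a collision.

```python
def parallel_sti(mem: dict, src2: int, addr2: int, src1: int, addr1: int) -> None:
    """Model STI src2,dst2 || STI src1,dst1 on a dictionary-based memory."""
    mem[addr1] = src1 & 0xFFFFFFFF
    mem[addr2] = src2 & 0xFFFFFFFF    # applied last, so it wins if addr2 == addr1


# Values from the example: *++AR2(IR0) resolves to 809830h + 8h.
ar2, ir0, ar0 = 0x809830, 0x8, 0x8098D3
ar2 += ir0                            # pre-increment with modify updates AR2
mem = {}
parallel_sti(mem, src2=0xDC, addr2=ar2, src1=0x35, addr1=ar0)
print({f"{a:X}h": f"{v:X}h" for a, v in mem.items()})
```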

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



SUBB Subtract Integer With Borrow

Syntax SUBB src, dst

Operation dst – src – C → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 1 1 0 1 G dst src

Description The difference of the dst, src, and C operands is loaded into the dst register.
The dst and src operands are assumed to be signed integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.

Example SUBB *AR5++(4),R5

Before Instruction:

AR5 = 809800h
R5 = 0FAh = 250
Data at 809800h = 0C7h = 199
LUF LV UF N Z V C = 0 0 0 0 0 0 1

After Instruction:

AR5 = 809804h
R5 = 032h = 50
Data at 809800h = 0C7h = 199
LUF LV UF N Z V C = 0 0 0 0 0 0 0
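
A typical use of a subtract-with-borrow instruction is multiword arithmetic: the low words are subtracted first and the resulting borrow is fed into the subtraction of the high words. The Python sketch below models that idea with 32-bit words. The helper names `subb` and `sub64` are hypothetical, and the model ignores overflow handling and the OVM bit.

```python
MASK32 = 0xFFFFFFFF


def subb(dst: int, src: int, carry: int) -> tuple[int, int]:
    """Return (dst - src - carry as a 32-bit word, borrow out)."""
    diff = (dst & MASK32) - (src & MASK32) - (carry & 1)
    return diff & MASK32, 1 if diff < 0 else 0


def sub64(a: int, b: int) -> tuple[int, int]:
    """Subtract two 64-bit values using two 32-bit steps linked by the borrow."""
    lo, borrow = subb(a & MASK32, b & MASK32, 0)             # low words, no borrow in
    hi, borrow = subb((a >> 32) & MASK32, (b >> 32) & MASK32, borrow)
    return (hi << 32) | lo, borrow


print(subb(0xFA, 0xC7, 1))            # operands of the SUBB example
print(hex(sub64(0x1_0000_0000, 1)[0]))
```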

Subtract Integer With Borrow, 3-Operand SUBB3

Syntax SUBB3 src2, src1, dst

Operation src1 – src2 – C → dst

Operands src1 three-operand addressing modes (T):


00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 1 1 0 0 T dst src1 src2

Description The difference of the src1 and src2 operands and the C flag is loaded into the
dst register. The src1, src2, and dst operands are assumed to be signed
integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.


Example SUBB3 R5,*AR5++(IR0),R0

Before Instruction:

AR5 = 809800h
IR0 = 4h
R5 = 0C7h = 199
R0 = 0h
Data at 809800h = 0FAh = 250
LUF LV UF N Z V C = 0 0 0 0 0 0 1

After Instruction:

AR5 = 809804h
IR0 = 4h
R5 = 0C7h = 199
R0 = 32h = 50
Data at 809800h = 0FAh = 250
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Subtract Integer Conditionally SUBC

Syntax SUBC src, dst

Operation If (dst – src ≥ 0):
((dst – src) << 1) OR 1 → dst
Else:
dst << 1 → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 1 1 1 0 G dst src

Description The src operand is subtracted from the dst operand. The dst operand is loaded
with a value dependent on the result of the subtraction. If (dst – src) is greater
than or equal to 0, then (dst – src) is left-shifted one bit, the least significant
bit is set to 1, and the result is loaded into the dst register. If (dst – src) is less
than 0, dst is left-shifted one bit and loaded into the dst register. The dst and
src operands are assumed to be unsigned integers.

You can use SUBC to perform a single step of a multibit integer division. See
subsection 11.3.4 on page 11-26 for a detailed description.

Cycles 1

Status Bits LUF Unaffected


LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example 1 SUBC @98C5h,R1

Before Instruction:

DP = 80h
R1 = 04F6h = 1270
Data at 8098C5h = 492h = 1170
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

DP = 80h
R1 = 0C9h = 201
Data at 8098C5h = 492h = 1170
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Example 2 SUBC 3000,R0 (3000 = 0BB8h)

Before Instruction:

R0 = 07D0h = 2000
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R0 = 0FA0h = 4000
LUF LV UF N Z V C = 0 0 0 0 0 0 0
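
How repeated SUBC steps build up a quotient can be sketched in Python. The model below is not the division routine from subsection 11.3.4; it is a simplified restoring division that assumes a 16-bit unsigned dividend and a nonzero 16-bit unsigned divisor, aligns the divisor by a fixed 15-bit left shift, and applies 16 SUBC steps. The names `subc` and `divide16` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def subc(dst: int, src: int) -> int:
    """One conditional-subtract step on unsigned 32-bit operands."""
    dst &= MASK32
    src &= MASK32
    if dst >= src:
        return (((dst - src) << 1) | 1) & MASK32   # subtract, shift, record a 1 bit
    return (dst << 1) & MASK32                     # shift only, record a 0 bit


def divide16(dividend: int, divisor: int) -> tuple[int, int]:
    """Return (quotient, remainder) using 16 SUBC steps (sketch)."""
    if not (0 <= dividend < 1 << 16 and 0 < divisor < 1 << 16):
        raise ValueError("operands must fit the 16-bit assumptions of this sketch")
    dst = dividend
    src = divisor << 15                            # align divisor with the top bit
    for _ in range(16):
        dst = subc(dst, src)
    return dst & 0xFFFF, dst >> 16                 # quotient low, remainder high


print(subc(1270, 1170), subc(2000, 3000))          # the two SUBC examples
print(divide16(1270, 7))
```

After the last step the quotient bits have accumulated in the low half of the register and the remainder sits in the high half.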

Subtract Floating Point SUBF

Syntax SUBF src, dst


Operation dst – src → dst
Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 7)


Encoding
31 24 23 16 15 8 7 0

0 0 0 1 0 1 1 1 1 G dst src

Description The difference of the dst operand minus the src operand is loaded into the
dst register. The dst and src operands are assumed to be floating-point
numbers.

Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example SUBF *AR0--(IR0),R5


Before Instruction:
AR0 = 809888h
IR0 = 80h
R5 = 0733C00000h = 1.79750000e + 02
Data at 809888h = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:
AR0 = 809808h
IR0 = 80h
R5 = 051D000000h = 3.9250e + 01
Data at 809888h = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0



SUBF3 Subtract Floating Point, 3-Operand

Syntax SUBF3 src2, src1, dst

Operation src1 – src2 → dst

Operands src1 three-operand addressing modes (T):


00 register (Rn1, 0 ≤ n1 ≤ 7)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 7)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


00 register (Rn2, 0 ≤ n2 ≤ 7)
01 register (Rn2, 0 ≤ n2 ≤ 7)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 1 1 0 1 T dst src1 src2

Description The difference of the src1 and src2 operands is loaded into the dst register.
The src1, src2, and dst operands are assumed to be floating-point numbers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example 1 SUBF3 *AR0--(IR0),*AR1,R4

Before Instruction:

AR0 = 809888h
IR0 = 80h
AR1 = 809851h
R4 = 0h
Data at 809888h = 70C8000h = 1.4050e + 02
Data at 809851h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR0 = 809808h
IR0 = 80h
AR1 = 809851h
R4 = 51D000000h = 3.9250e + 01
Data at 809888h = 70C8000h = 1.4050e + 02
Data at 809851h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Example 2 SUBF3 R7,R0,R6

Before Instruction:

R7 = 57B400000h = 6.281250e + 01
R0 = 34C200000h = 1.27578125e + 01
R6 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R7 = 57B400000h = 6.281250e + 01
R0 = 34C200000h = 1.27578125e + 01
R6 = 5B7C80000h = – 5.00546875e + 01
LUF LV UF N Z V C = 0 0 0 1 0 0 0
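
The register values in Example 2 can be cross-checked by decoding the 40-bit extended-precision words in Python. This sketch assumes the layout of an 8-bit two's-complement exponent, a sign bit, and a 31-bit fraction with an implied leading bit, with exponent -128 reserved for zero. The function name `decode_ext` is hypothetical.

```python
def decode_ext(reg40: int) -> float:
    """Decode a 40-bit extended-precision register value into a float (sketch)."""
    exp = (reg40 >> 32) & 0xFF
    if exp >= 0x80:
        exp -= 0x100                       # exponent is two's complement
    if exp == -128:
        return 0.0                         # assumed encoding of zero
    mantissa_field = reg40 & 0xFFFFFFFF
    sign = mantissa_field >> 31
    frac = (mantissa_field & 0x7FFFFFFF) / float(1 << 31)
    mantissa = frac - 2.0 if sign else frac + 1.0
    return mantissa * 2.0 ** exp


r7 = decode_ext(0x057B400000)
r0 = decode_ext(0x034C200000)
r6 = decode_ext(0x05B7C80000)
print(r7, r0, r6, r0 - r7 == r6)
```

The decoded difference is nonzero and negative, which is the condition reported by the N flag.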

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



SUBF3||STF Parallel SUBF3 and STF

Syntax SUBF3 src1, src2, dst1


|| STF src3, dst2

Operation src2 – src1 → dst1


|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)


src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 1 0 1 0 1 dst1 src1 src3 dst2 src2

Description A floating-point subtraction and a floating-point store are performed in parallel.


All registers are read at the beginning and loaded at the end of the execute
cycle. This means that if one of the parallel operations (STF) reads from a reg-
ister and the operation being performed in parallel (SUBF3) writes to the same
register, STF accepts as input the contents of the register before it is modified
by the SUBF3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.


Example SUBF3 R1,*–AR4(IR1),R0


|| STF R7,*+AR5(IR0)

Before Instruction:

R1 = 057B400000h = 6.28125e + 01
AR4 = 8098B8h
IR1 = 8h
R0 = 0h
R7 = 0733C00000h = 1.79750e + 02
AR5 = 809850h
IR0 = 10h
Data at 8098B0h = 70C8000h = 1.4050e + 02
Data at 809860h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R1 = 057B400000h = 6.28125e + 01
AR4 = 8098B8h
IR1 = 8h
R0 = 061B600000h = 7.768750e + 01
R7 = 0733C00000h = 1.79750e + 02
AR5 = 809850h
IR0 = 10h
Data at 8098B0h = 70C8000h = 1.4050e + 02
Data at 809860h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.



SUBI Subtract Integer

Syntax SUBI src, dst

Operation dst – src → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 1 0 0 0 0 G dst src

Description The difference of the dst operand minus the src operand is loaded into the dst
register. The dst and src operands are assumed to be signed integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.

Example SUBI 220,R7

Before Instruction:

R7 = 226h = 550
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R7 = 14Ah = 330
LUF LV UF N Z V C = 0 0 0 0 0 0 0
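
The flag rules listed under Status Bits can be sketched for 32-bit integer subtraction in Python. The model assumes OVM = 0 (no saturation) and a destination in R7–R0, so that the flags are updated. It computes only N, Z, V, and C, and the function name `subi` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def subi(dst: int, src: int) -> tuple[int, dict]:
    """Return (dst - src as a 32-bit word, flags) for integer subtraction."""
    dst &= MASK32
    src &= MASK32
    result = (dst - src) & MASK32
    flags = {
        "N": result >> 31,                                   # sign of the result
        "Z": int(result == 0),
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of dst.
        "V": ((dst ^ src) & (dst ^ result)) >> 31,
        "C": int(dst < src),                                 # unsigned borrow
    }
    return result, flags


print(subi(550, 220))
```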

Subtract Integer, 3-Operand SUBI3

Syntax SUBI3 src2, src1, dst

Operation src1 – src2 → dst

Operands src1 three-operand addressing modes (T):


00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 1 1 1 0 T dst src1 src2

Description The difference of the src1 operand minus the src2 operand is loaded into the
dst register. The src1, src2, and dst operands are assumed to be signed
integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.


Example 1 SUBI3 R7,R2,R0

Before Instruction:

R2 = 0866h = 2150
R7 = 0834h = 2100
R0 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R2 = 0866h = 2150
R7 = 0834h = 2100
R0 = 032h = 50
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Example 2 SUBI3 *–AR2(1),R4,R3

Before Instruction:

AR2 = 80985Eh
R4 = 0226h = 550
R3 = 0h
Data at 80985Dh = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR2 = 80985Eh
R4 = 0226h = 550
R3 = 014Ah = 330
Data at 80985Dh = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Parallel SUBI3 and STI SUBI3||STI

Syntax SUBI3 src1, src2, dst1


|| STI src3, dst2

Operation src2 – src1 → dst1


|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)


src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 1 0 1 1 0 dst1 src1 src3 dst2 src2

Description An integer subtraction and an integer store are performed in parallel. All regis-
ters are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (SUBI3) writes to the same register, STI
accepts as input the contents of the register before it is modified by the SUBI3.

If src3 and dst1 point to the same location, src3 is read before the write to dst1.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.


Example SUBI3 R7,*+AR2(IR0),R1


|| STI R3,*++AR7

Before Instruction:

R7 = 14h = 20
AR2 = 80982Fh
IR0 = 10h
R1 = 0h
R3 = 35h = 53
AR7 = 80983Bh
Data at 80983Fh = 0DCh = 220
Data at 80983Ch = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R7 = 14h = 20
AR2 = 80982Fh
IR0 = 10h
R1 = 0C8h = 200
R3 = 35h = 53
AR7 = 80983Ch
Data at 80983Fh = 0DCh = 220
Data at 80983Ch = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0
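
The read-before-write behavior stated in the description can be modeled in Python by separating a read phase from a write phase. This is a sketch under simplifying assumptions: registers and memory are dictionaries, effective addresses are passed in already computed, and the name `subi3_sti` is hypothetical.

```python
def subi3_sti(regs: dict, mem: dict, src1: str, src2_addr: int,
              dst1: str, src3: str, dst2_addr: int) -> None:
    """Model SUBI3 src1,src2,dst1 || STI src3,dst2 with reads before writes."""
    # Read phase: every source is sampled before anything is written.
    subtrahend = regs[src1]
    minuend = mem[src2_addr]
    stored = regs[src3]
    # Write phase: results are committed at the end of the cycle.
    regs[dst1] = (minuend - subtrahend) & 0xFFFFFFFF
    mem[dst2_addr] = stored & 0xFFFFFFFF


# Values from the example: *+AR2(IR0) = 80983Fh, *++AR7 = 80983Ch.
regs = {"R7": 0x14, "R1": 0x0, "R3": 0x35}
mem = {0x80983F: 0xDC, 0x80983C: 0x0}
subi3_sti(regs, mem, "R7", 0x80983F, "R1", "R3", 0x80983C)
print(hex(regs["R1"]), hex(mem[0x80983C]))
```

If the store source and the subtraction destination are the same register, the store in this model still sees the old register contents, as the description requires.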

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Subtract Reverse Integer With Borrow SUBRB

Syntax SUBRB src, dst

Operation src – dst – C → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 1 0 0 0 1 G dst src

Description The difference of the src, dst, and C operands is loaded into the dst register.
The dst and src operands are assumed to be signed integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.

Example SUBRB R4,R6

Before Instruction:

R4 = 03CBh = 971
R6 = 0258h = 600
LUF LV UF N Z V C = 0 0 0 0 0 0 1

After Instruction:

R4 = 03CBh = 971
R6 = 0172h = 370
LUF LV UF N Z V C = 0 0 0 0 0 0 0



SUBRF Subtract Reverse Floating Point

Syntax SUBRF src, dst

Operation src – dst → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 7)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 7)

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 1 0 0 1 0 G dst src

Description The difference of the src operand minus the dst operand is loaded into the dst
register. The dst and src operands are assumed to be floating-point numbers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example SUBRF @9905h,R5

Before Instruction:

DP = 80h
R5 = 057B400000h = 6.281250e + 01
Data at 809905h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

DP = 80h
R5 = 0669E00000h = 1.16937500e + 02
Data at 809905h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Subtract Reverse Integer SUBRI

Syntax SUBRI src, dst

Operation src – dst → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 1 0 0 1 1 G dst src

Description The difference of the src operand minus the dst operand is loaded into the dst
register. The dst and src operands are assumed to be signed integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.

Example SUBRI *AR5++(IR0),R3

Before Instruction:

AR5 = 809900h
IR0 = 8h
R3 = 0DCh = 220
Data at 809900h = 226h = 550
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR5 = 809908h
IR0 = 8h
R3 = 014Ah = 330
Data at 809900h = 226h = 550
LUF LV UF N Z V C = 0 0 0 0 0 0 0



SWI Software Interrupt

Syntax SWI

Operation Performs an emulation interrupt

Operands None

Encoding
31 24 23 16 15 8 7 0

0 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Description The SWI instruction performs an emulator interrupt. This is a reserved
instruction and should not be used in normal programming.

Cycles 4

Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Trap Conditionally TRAPcond

Syntax TRAPcond N

Operation 0 → ST(GIE)
If cond is true:
    Next PC → *++SP,
    Trap vector N → PC.
Else:
    Set ST(GIE) to original state.
    Continue.

Operands N (0 ≤ N ≤ 31)

Encoding
31 24 23 16 15 8 7 0

0 1 1 1 0 1 0 0 0 0 0 cond 0 0 0 0 0 0 0 0 0 0 1 N

Description Interrupts are disabled globally when 0 is written to ST(GIE). If the condition
is true, the contents of the PC are pushed onto the system stack, and the PC
is loaded with the contents of the specified trap vector (N). If the condition is
not true, ST(GIE) is set to its value before the TRAPcond instruction changes
it.

The TMS320C3x provides 20 condition codes that can be used with this in-
struction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction
only when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI,
CMPI3, TSTB, or TSTB3) is executed.

Cycles 5

Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example TRAPZ 16

Before Instruction:

PC = 123h
SP = 809870h
ST = 4h
Trap Vector 16 = 10h
LUF LV UF N Z V C = 0 0 0 0 1 0 0

After Instruction:

PC = 10h
SP = 809871h
Data at 809871h = 124h
ST = 4h
LUF LV UF N Z V C = 0 0 0 0 1 0 0
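The operation listed for TRAPcond can be sketched as a short Python model. The function name `trapcond`, the dictionary-based memory and vector table, and the status value used in the usage line are all hypothetical; the condition is passed in as a boolean rather than decoded from the flags.

```python
GIE = 0x2000  # global interrupt enable, bit 13 of ST


def trapcond(cond_true: bool, pc: int, sp: int, st: int,
             memory: dict, vectors: dict, n: int) -> tuple[int, int, int]:
    """Return (PC, SP, ST) after a TRAPcond N located at address pc."""
    next_pc = pc + 1
    if cond_true:
        sp += 1                   # *++SP: increment first, then store
        memory[sp] = next_pc      # return address goes on the stack
        pc = vectors[n]           # continue at the trap vector contents
        st &= ~GIE                # interrupts remain disabled in the handler
    else:
        pc = next_pc              # GIE is set back, so ST is unchanged
    return pc, sp, st


# Hypothetical status value with GIE and Z set; other values from the example.
memory = {}
pc, sp, st = trapcond(True, 0x123, 0x809870, 0x2004, memory, {16: 0x10}, 16)
```

In this run the model ends with PC = 10h, SP = 809871h, the word 124h on the stack, and GIE cleared in ST.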

Test Bit Fields TSTB

Syntax TSTB src, dst

Operation dst AND src


Operands src general addressing modes (G):
00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate
dst register (Rn, 0 ≤ n ≤ 27)
Encoding
31 24 23 16 15 8 7 0

0 0 0 1 1 0 1 0 0 G dst src

Description The bitwise logical-AND of the dst and src operands is formed, but the result
is not loaded in any register. This allows for nondestructive compares. The dst
and src operands are assumed to be unsigned integers.

Cycles 1

Status Bits These condition flags are modified for all destination registers (R27 – R0).
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example TSTB *–AR4(1),R5

Before Instruction:

AR4 = 8099C5h
R5 = 898h = 2200
Data at 8099C4h = 767h = 1895
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:

AR4 = 8099C5h
R5 = 898h = 2200
Data at 8099C4h = 767h = 1895
LUF LV UF N Z V C = 0 0 0 0 1 0 0
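A nondestructive bit test of this kind can be modeled in a few lines of Python. The function name `tstb` is hypothetical, and only the flags that the instruction is documented to modify are returned.

```python
MASK32 = 0xFFFFFFFF


def tstb(src: int, dst: int) -> dict:
    """Model TSTB: form dst AND src, keep only the condition flags."""
    out = (dst & src) & MASK32
    return {
        "UF": 0,
        "N": (out >> 31) & 1,   # MSB of the 32-bit output
        "Z": int(out == 0),
        "V": 0,
    }


# Operands from the example: 898h AND 767h share no set bits.
flags = tstb(src=0x767, dst=0x898)
```

Because 898h and 767h have no bits in common, the output is 0 and only Z is set, as in the flag row after the instruction. Neither operand is written back.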

TSTB3 Test Bit Fields, 3-Operand

Syntax TSTB3 src2, src1

Operation src1 AND src2

Operands src1 three-operand addressing modes (T):


00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 0 1 1 1 1 T 0 0 0 0 0 src1 src2

Description The bitwise logical-AND between the src1 and src2 operands is formed but is
not loaded into any register. This allows for nondestructive compares. The
src1 and src2 operands are assumed to be unsigned integers. Although this
instruction has only two operands, it is designated as a three-operand instruc-
tion because operands are specified in the three-operand format.

Cycles 1

Status Bits These condition flags are modified for all destination registers (R27 – R0).
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example 1 TSTB3 *AR5--(IR0),*+AR0(1)

Before Instruction:

AR5 = 809885h
IR0 = 80h
AR0 = 80992Ch
Data at 809885h = 898h = 2200
Data at 80992Dh = 767h = 1895
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR5 = 809805h
IR0 = 80h
AR0 = 80992Ch
Data at 809885h = 898h = 2200
Data at 80992Dh = 767h = 1895
LUF LV UF N Z V C = 0 0 0 0 1 0 0

Example 2 TSTB3 R4,*AR6--(IR0)

Before Instruction:

R4 = 0FBC4h
AR6 = 8099F8h
IR0 = 8h
Data at 8099F8h = 1568h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R4 = 0FBC4h
AR6 = 8099F0h
IR0 = 8h
Data at 8099F8h = 1568h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

XOR Bitwise Exclusive-OR

Syntax XOR src, dst

Operation dst XOR src → dst

Operands src general addressing modes (G):


00 register (Rn, 0 ≤ n ≤ 27)
01 direct
10 indirect
11 immediate

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 0 1 1 0 1 0 1 G dst src

Description The bitwise exclusive-OR of the src and dst operands is loaded into the dst
register. The dst and src operands are assumed to be unsigned integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example XOR R1,R2

Before Instruction:

R1 = 0FFA32h
R2 = 0FF5C1h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R1 = 0FFA32h
R2 = 000FF3h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
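As a quick check of the two-operand form, the sketch below models XOR in Python. The function name `xor` and its return shape are hypothetical; the point is that only the destination is written, while the source register is read and left unchanged.

```python
MASK32 = 0xFFFFFFFF


def xor(src: int, dst: int) -> tuple[int, dict]:
    """Model XOR src,dst: dst XOR src -> dst. The source is only read."""
    result = (dst ^ src) & MASK32
    flags = {
        "UF": 0,
        "N": (result >> 31) & 1,
        "Z": int(result == 0),
        "V": 0,
    }
    return result, flags


r1 = 0x0FFA32                      # source register from the example
r2, flags = xor(src=r1, dst=0x0FF5C1)
```

Here `r2` becomes 0FF3h and `r1` still holds 0FFA32h. Applying the same XOR a second time restores the original destination value.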

Bitwise Exclusive-OR, 3-Operand XOR3

Syntax XOR3 src2, src1, dst

Operation src1 XOR src2 → dst

Operands src1 three-operand addressing modes (T):


00 register (Rn1, 0 ≤ n1 ≤ 27)
01 indirect (disp = 0, 1, IR0, IR1)
10 register (Rn1, 0 ≤ n1 ≤ 27)
11 indirect (disp = 0, 1, IR0, IR1)

src2 three-operand addressing modes (T):


00 register (Rn2, 0 ≤ n2 ≤ 27)
01 register (Rn2, 0 ≤ n2 ≤ 27)
10 indirect (disp = 0, 1, IR0, IR1)
11 indirect (disp = 0, 1, IR0, IR1)

dst register (Rn, 0 ≤ n ≤ 27)

Encoding
31 24 23 16 15 8 7 0

0 0 1 0 1 0 0 0 0 T dst src1 src2

Description The bitwise exclusive-OR between the src1 and src2 operands is loaded into
the dst register. The src1, src2, and dst operands are assumed to be unsigned
integers.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example 1 XOR3 *AR3++(IR0),R7,R4

Before Instruction:

AR3 = 809800h
IR0 = 10h
R7 = 0FFFFh
R4 = 0h
Data at 809800h = 5AC3h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR3 = 809810h
IR0 = 10h
R7 = 0FFFFh
R4 = 0A53Ch
Data at 809800h = 5AC3h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Example 2 XOR3 R5,*–AR1(1),R1

Before Instruction:

R5 = 0FFA32h
AR1 = 809826h
R1 = 0h
Data at 809825h = 0FF5C1h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

R5 = 0FFA32h
AR1 = 809826h
R1 = 000FF3h
Data at 809825h = 0FF5C1h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.

Parallel XOR3 and STI XOR3||STI

Syntax XOR3 src2, src1, dst1
|| STI src3, dst2

Operation src1 XOR src2 → dst1
|| src3 → dst2

Operands src1 register (Rn1, 0 ≤ n1 ≤ 7)
src2 indirect (disp = 0, 1, IR0, IR1)
dst1 register (Rn2, 0 ≤ n2 ≤ 7)
src3 register (Rn3, 0 ≤ n3 ≤ 7)
dst2 indirect (disp = 0, 1, IR0, IR1)

Encoding
31 24 23 16 15 8 7 0

1 1 1 0 1 1 1 dst1 src1 src3 dst2 src2

Description A bitwise exclusive-OR and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle. This
means that, if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (XOR3) writes to the same register, STI
accepts as input the contents of the register before it is modified by the XOR3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Cycles 1

Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.

Example XOR3 *AR1++,R3,R3
|| STI R6,*–AR2(IR0)

Before Instruction:

AR1 = 80987Eh
R3 = 85h
R6 = 0DCh = 220
AR2 = 8098B4h
IR0 = 8h
Data at 80987Eh = 85h
Data at 8098ACh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0

After Instruction:

AR1 = 80987Fh
R3 = 0h
R6 = 0DCh = 220
AR2 = 8098B4h
IR0 = 8h
Data at 80987Eh = 85h
Data at 8098ACh = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 1 0 0

Note: Cycle Count


See subsection 9.5.2 on page 9-24 for operand ordering effects on cycle
count.
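The read-before-write rule for the parallel pair can be sketched as a two-phase Python model. The function name `xor3_sti` and the dictionary-based register file and memory are hypothetical, and effective addresses are passed in directly, so the address-register updates (such as `*AR1++`) are not modeled.

```python
MASK32 = 0xFFFFFFFF


def xor3_sti(regs: dict, mem: dict, src1: str, src2_addr: int,
             dst1: str, src3: str, dst2_addr: int) -> None:
    """Model XOR3 || STI: sample every source, then perform both writes."""
    # Read phase: all sources are sampled before anything is modified.
    a = regs[src1]
    b = mem[src2_addr]
    store_value = regs[src3]
    # Write phase: results are loaded at the end of the execute cycle.
    regs[dst1] = (a ^ b) & MASK32
    mem[dst2_addr] = store_value


# Values from the example: R3 = 85h, R6 = 0DCh, data at 80987Eh = 85h.
regs = {"R3": 0x85, "R6": 0xDC}
mem = {0x80987E: 0x85, 0x8098AC: 0x0}
xor3_sti(regs, mem, "R3", 0x80987E, "R3", "R6", 0x8098AC)
```

After this call `regs["R3"]` is 0 and the word at 8098ACh holds 0DCh. If `src3` named the same register as `dst1`, the store would still see the old register value, because the read phase completes before either write.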

Chapter 11

Software Applications

The TMS320C3x is a powerful digital signal processor with an architecture and
instruction set designed to find simple solutions to DSP problems. There are
instructions specifically designed for efficient implementation of DSP
algorithms as well as general-purpose instructions that make the device suitable
for more general tasks, like any microprocessor. The floating-point and integer
arithmetic supported by the device let you concentrate on the algorithm and
pay less attention to scaling, dynamic range, and overflows.

The purpose of this chapter is to explain how to use the instruction set, the ar-
chitecture, and the interface of the TMS320C3x processor. It presents coding
examples for frequently used applications and discusses more involved exam-
ples and applications. This chapter defines the principles involved in the ap-
plications and provides the corresponding assembly-language code for in-
structional purposes and for immediate use. Whenever the detailed explana-
tion of the underlying theory is too extensive to be included in this manual, ap-
propriate references are given for further information.

Major topics discussed in this chapter are listed below.

Topic Page

11.1 Processor Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2


11.2 Program Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-6
11.3 Logical and Arithmetic Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-23
11.4 Application-Oriented Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-53
11.5 Programming Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-131

11.1 Processor Initialization


Before you execute a digital signal processing algorithm, you must initialize
the processor. Initialization usually occurs any time the processor is reset.

You can reset the processor by applying a low level to the RESET input for sev-
eral cycles. At this time, the TMS320C3x terminates execution and puts the
reset vector (that is, the contents of memory location 0) in the program counter.
The reset vector normally contains the address of the system-initialization rou-
tine. The hardware reset also initializes various registers and status bits.

After reset, you can further initialize the processor by executing instructions
that set up operational modes, memory pointers, interrupts, and the remaining
functions needed to meet system requirements.

To configure the processor at reset, you should initialize the following internal
functions:
- Memory-mapped registers
- Interrupt structure

In addition to the initialization performed during the hardware reset (for condi-
tions after hardware reset, see Chapter 12), Example 11–1 shows coding for
initializing the TMS320C3x to the following machine state:
- All interrupts are enabled.
- The overflow mode is disabled.
- The data memory page pointer is set to 0.
- The internal memory is filled with 0s.

Note that all constants larger than 16 bits should be placed in memory and ac-
cessed through direct or indirect addressing.
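The 16-bit limit on short constants can be checked with a one-line test. The sketch below assumes the signed, sign-extended interpretation used for integer loads such as LDI; the helper name `fits_short_immediate` is hypothetical.

```python
def fits_short_immediate(value: int) -> bool:
    """True if value fits a 16-bit two's-complement immediate field."""
    return -(1 << 15) <= value <= (1 << 15) - 1


# 1800h can be loaded as an immediate; a RAM block address such as
# 0809800h cannot, so it is kept in memory as a .word constant.
small_ok = fits_short_immediate(0x1800)
address_ok = fits_short_immediate(0x0809800)
```

Here `small_ok` is true and `address_ok` is false, which is why addresses such as the RAM block and peripheral-bus pointers are stored in a data section and loaded with direct addressing.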

Example 11–1. TMS320C3x Processor Initialization


*
* TITLE PROCESSOR INITIALIZATION
*
.global RESET,INIT,BEGIN
.global INT0,INT1,INT2,INT3
.global ISR0,ISR1,ISR2,ISR3
.global DINT,DMA
.global TINT0,TINT1,XINT0,RINT0,XINT1,RINT1
.global TIME0,TIME1,XMT0,RCV0,XMT1,RCV1
.global TRAP0,TRAP1,TRAP2,TRP0,TRP1,TRP2
*
* PROCESSOR INITIALIZATION FOR THE TMS320C3x
*
* RESET AND INTERRUPT VECTOR SPECIFICATION. THIS
* ARRANGEMENT ASSUMES THAT DURING LINKING, THE FOLLOWING
* TEXT SEGMENT WILL BE PLACED TO START AT MEMORY
* LOCATION 0.
*
.sect "init" ; Named section
RESET .word INIT ; RS- loads address INIT to PC
INT0 .word ISR0 ; INT0- loads address ISR0 to PC
INT1 .word ISR1 ; INT1- loads address ISR1 to PC
INT2 .word ISR2 ; INT2- loads address ISR2 to PC
INT3 .word ISR3 ; INT3- loads address ISR3 to PC
*
XINT0 .word XMT0 ; Serial port 0 transmit interrupt processing
RINT0 .word RCV0 ; Serial port 0 receive interrupt processing
XINT1 .word XMT1 ; Serial port 1 transmit interrupt processing
RINT1 .word RCV1 ; Serial port 1 receive interrupt processing
TINT0 .word TIME0 ; Timer 0 interrupt processing
TINT1 .word TIME1 ; Timer 1 interrupt processing
DINT .word DMA ; DMA interrupt processing
.space 20 ; Reserved space
TRAP0 .word TRP0 ; Trap 0 vector processing begins
TRAP1 .word TRP1 ; Trap 1 vector processing begins
TRAP2 .word TRP2 ; Trap 2 vector processing begins
.space 29 ; Leave space for the other 29 traps
*
* IN THE FOLLOWING SECTION, CONSTANTS THAT CANNOT BE REPRESENTED
* IN THE SHORT FORMAT ARE INITIALIZED. THE NUMBERS IN PARENTHESES
* AT THE END OF THE COMMENTS REPRESENT THE OFFSET OF A
* PARTICULAR CONTROL REGISTER FROM
* CTRL (808000H)

.data
MASK .word 0FFFFFFFFH
BLK0 .word 0809800H ; Beginning address of RAM block 0
BLK1 .word 0809C00H ; Beginning address of RAM block 1
STCK .word 0809F00H ; Beginning of stack
CTRL .word 0808000H ; Pointer for peripheral-bus memory map
DMACTL .word 0000000H ; Init for DMA control (0)
TIM0CTL .word 0000000H ; Init of timer 0 control (32)
TIM1CTL .word 0000000H ; Init of timer 1 control (48)
SERGLOB0 .word 0000000H ; Init of serial 0 glbl control (64)
SERPRTX0 .word 0000000H ; Init of serial 0 xmt port control (66)
SERPRTR0 .word 0000000H ; Init of serial 0 rcv port control (67)
SERTIM0 .word 0000000H ; Init of serial 0 timer control (68)
SERGLOB1 .word 0000000H ; Init of serial 1 glbl control (80)
SERPRTX1 .word 0000000H ; Init of serial 1 xmt port control (82)
SERPRTR1 .word 0000000H ; Init of serial 1 rcv port control (83)
SERTIM1 .word 0000000H ; Init of serial 1 timer control (84)
PARINT .word 0000000H ; Init of parallel interface control (100)
IOINT .word 0000000H ; Init of I/O interface control (96)
*
.text

*
* THE ADDRESS AT MEMORY LOCATION 0 DIRECTS EXECUTION TO BEGIN HERE
* FOR RESET PROCESSING THAT INITIALIZES THE PROCESSOR. WHEN RESET
* IS APPLIED, THE FOLLOWING REGISTERS ARE INITIALIZED TO 0:
*
* ST  -- CPU STATUS REGISTER
* IE  -- CPU/DMA INTERRUPT ENABLE FLAGS
* IF  -- CPU INTERRUPT FLAGS
* IOF -- I/O FLAGS
*
* THE STATUS REGISTER HAS THE FOLLOWING ARRANGEMENT:
*
* BITS:     31-14 13  12 11 10 9     8  7   6   5  4  3 2 1 0
* FUNCTION: RESRV GIE CC CE CF RESRV RM OVM LUF LV UF N Z V C
*

INIT LDP 0,DP ; Point the DP register to page 0
LDI 1800H,ST ; Clear and enable cache, and disable OVM
LDI @MASK,IE ; Unmask all interrupts
*
* INTERNAL DATA MEMORY INITIALIZATION TO FLOATING POINT 0
*
LDI @BLK0,AR0 ; AR0 points to block 0
LDI @BLK1,AR1 ; AR1 points to block 1
LDF 0.0,R0 ; 0 register R0
RPTS 1023 ; Repeat 1024 times ...
STF R0,*AR0++(1) ; Zero out location in RAM block 0 and ...
|| STF R0,*AR1++(1) ; Zero out location in RAM block 1

*
* THE PROCESSOR IS INITIALIZED. THE REMAINING APPLICATION–
* DEPENDENT PART OF THE SYSTEM (BOTH ON– AND OFF–CHIP) SHOULD
* NOW BE INITIALIZED.
*
* FIRST, INITIALIZE THE CONTROL REGISTERS. IN THIS EXAMPLE,
* EVERYTHING IS INITIALIZED TO 0, SINCE THE ACTUAL INITIALIZATION IS
* APPLICATION-DEPENDENT.
*
LDI @CTRL,AR0 ; Load in AR0 the pointer to control
* ; registers
LDI @DMACTL,R0
STI R0,*+AR0(0) ; Init DMA control

LDI @TIM0CTL,R0
STI R0,*+AR0(32) ; Init timer 0 control
LDI @TIM1CTL,R0
STI R0,*+AR0(48) ; Init timer 1 control
LDI @SERGLOB0,R0
STI R0,*+AR0(64) ; Init serial 0 global control
LDI @SERPRTX0,R0
STI R0,*+AR0(66) ; Init serial 0 xmt control
LDI @SERPRTR0,R0
STI R0,*+AR0(67) ; Init serial 0 rcv control
LDI @SERTIM0,R0
STI R0,*+AR0(68) ; Init serial 0 timer control
LDI @SERGLOB1,R0
STI R0,*+AR0(80) ; Init serial 1 global control
LDI @SERPRTX1,R0
STI R0,*+AR0(82) ; Init serial 1 xmt control
LDI @SERPRTR1,R0
STI R0,*+AR0(83) ; Init serial 1 rcv control
LDI @SERTIM1,R0
STI R0,*+AR0(84) ; Init serial 1 timer control
LDI @PARINT,R0
STI R0,*+AR0(100) ; Init parallel interface control (C30 only)
LDI @IOINT,R0
STI R0,*+AR0(96) ; Init I/O interface control
*
LDI @STCK,SP ; Init the stack pointer
OR 2000H,ST ; Global interrupt enable
*
BR BEGIN ; Branch to the beginning of application

.end
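The two status-register constants used in the listing (1800H and 2000H) can be decoded against the bit arrangement given in its comments. The Python sketch below is an illustration only; the table name `ST_BITS` and the function `decode_st` are hypothetical.

```python
# Bit positions taken from the status-register layout in the listing comments.
ST_BITS = {
    "C": 0, "V": 1, "Z": 2, "N": 3, "UF": 4, "LV": 5, "LUF": 6,
    "OVM": 7, "RM": 8, "CF": 10, "CE": 11, "CC": 12, "GIE": 13,
}


def decode_st(value: int) -> set:
    """Return the names of the status-register bits set in value."""
    return {name for name, bit in ST_BITS.items() if value & (1 << bit)}


cache_setup = decode_st(0x1800)            # written by LDI 1800H,ST
after_enable = decode_st(0x1800 | 0x2000)  # effect of OR 2000H,ST on that value
```

`cache_setup` contains CC and CE with OVM clear, and `after_enable` adds GIE, which corresponds to the "clear and enable cache" and "global interrupt enable" comments.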

11.2 Program Control


One group of TMS320C3x instructions provides program control and facili-
tates all types of high-speed processing. These instructions directly handle:
- subroutine calls
- software stack
- interrupts
- zero-overhead branches
- single- and multiple-instruction loops without any overhead

11.2.1 Subroutines
The TMS320C3x has a 24-bit program counter (PC) and a practically unlimited
software stack. The CALL and CALLcond subroutine calls increment the stack
pointer and store the next value of the PC on the stack. At the end of the
subroutine, RETScond performs a conditional return.

Example 11–2 illustrates the use of a subroutine to determine the dot product
of two vectors. Given two vectors of length N, represented by the arrays
a[0], a[1], ..., a[N–1] and b[0], b[1], ..., b[N–1], the dot product is computed
from the expression

d = a[0] b[0] + a[1] b[1] + ... + a[N–1] b[N–1]

Processing proceeds in the main routine to the point where the dot product is
to be computed. It is assumed that the arguments of the subroutine have been
appropriately initialized. At this point, a CALL is made to the subroutine,
transferring control to that section of the program memory for execution, then
returning to the calling routine via the RETS instruction when execution has
completed. Note that for this particular example, it would suffice to save the
register R2. However, a larger number of registers are saved for demonstra-
tion purposes. The saved registers are stored on the system stack. This stack
should be large enough to accommodate the maximum anticipated storage re-
quirements. You could use other methods of saving registers equally well.

Example 11–2. Subroutine Call (Dot Product)


*
* TITLE SUBROUTINE CALL (DOT PRODUCT)
*
*
* MAIN ROUTINE THAT CALLS THE SUBROUTINE ‘DOT’ TO COMPUTE THE
* DOT PRODUCT OF TWO VECTORS
* .
* .
* .
* LDI @blk0,AR0 ; AR0 points to vector a
* LDI @blk1,AR1 ; AR1 points to vector b
* LDI N,RC ; RC contains the number of elements

* CALL DOT
* .
* .
* .
*
* SUBROUTINE DOT
*
*
* EQUATION: d = a(0) * b(0) + a(1) * b(1) + ... + a(N-1) * b(N-1)
*
* THE DOT PRODUCT OF a AND b IS PLACED IN REGISTER R0. N MUST
* BE GREATER THAN OR EQUAL TO 2.
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* AR0 | ADDRESS OF a(0)
* AR1 | ADDRESS OF b(0)
* RC | LENGTH OF VECTORS (N)
*
* REGISTERS USED AS INPUT: AR0, AR1, RC
* REGISTER MODIFIED: R0
* REGISTER CONTAINING RESULT: R0
*
*
*
.global DOT
*
DOT PUSH ST ; Save status register
PUSH R2 ; Use the stack to save R2’s
PUSHF R2 ; Lower 32 and upper 32 bits
PUSH AR0 ; Save AR0
PUSH AR1 ; Save AR1
PUSH RC ; Save RC

* ; Initialize R0:
MPYF3 *AR0,*AR1,R0 ; a(0) * b(0) -> R0
LDF 0.0,R2 ; Initialize R2
SUBI 2,RC ; Set RC = N-2
*
* DOT PRODUCT (1 <= i < N)
*
RPTS RC ; Setup the repeat single
MPYF3 *++AR0(1),*++AR1(1),R0 ; a(i) * b(i) -> R0
|| ADDF3 R0,R2,R2 ; a(i-1)*b(i-1) + R2 -> R2
*
ADDF3 R0,R2,R0 ; a(N-1)*b(N-1) + R2 -> R0
*
* RETURN SEQUENCE
*
POP RC ; Restore RC
POP AR1 ; Restore AR1
POP AR0 ; Restore AR0
POPF R2 ; Restore top 32 bits of R2
POP R2 ; Restore bottom 32 bits of R2
POP ST ; Restore ST
RETS ; Return

*
* end
*
.end
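The software-pipelined structure of the DOT routine (each product is accumulated one iteration after it is formed) can be followed in a Python model. This is a sketch, not the routine itself: the function name `dot` is hypothetical, Python floats stand in for the extended-precision registers, and the variables `r0`, `r2`, and `rc` only mirror the register roles in the listing.

```python
def dot(a: list, b: list) -> float:
    """Model the pipelined multiply/accumulate loop of the DOT subroutine."""
    n = len(a)
    if n < 2 or n != len(b):
        raise ValueError("vectors must have equal length N >= 2")
    r0 = a[0] * b[0]          # MPYF3 *AR0,*AR1,R0
    r2 = 0.0                  # LDF 0.0,R2
    rc = n - 2                # SUBI 2,RC
    i = 0
    for _ in range(rc + 1):   # RPTS RC repeats the next instruction RC + 1 times
        i += 1                # *++AR0(1), *++AR1(1): pre-increment
        # Parallel pair: both halves read the old R0 before either writes.
        new_r0 = a[i] * b[i]  # MPYF3: a(i) * b(i) -> R0
        r2 = r0 + r2          # ADDF3: a(i-1)*b(i-1) + R2 -> R2
        r0 = new_r0
    return r0 + r2            # final ADDF3 adds the last product


d = dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

For these inputs `d` is 32.0. The last product is still sitting in `r0` when the loop ends, which is why the listing needs one more ADDF3 after the repeat.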

11.2.2 Software Stack


The TMS320C3x has a software stack whose location is determined by the
contents of the stack pointer register (SP). The stack pointer increments from
low to high values, and provisions should be made to accommodate the
anticipated storage requirements. The stack can be used not only by the
subroutine CALL and RETS instructions, but also inside the subroutine as a place
of temporary storage for the registers, as shown in Example 11–2. SP always
points to the last value pushed on the stack.

The CALL and CALLcond instructions and the interrupt routines push the
value of the PC onto the stack. RETScond and RETIcond then pop the stack
and place the value in the program counter. You can also use the PUSH and
POP instructions to move the integer value of any register onto and off the
stack, respectively. There are two additional instructions, PUSHF and POPF,
for floating-point numbers. You can push floating-point numbers from, and pop
them to, registers R7–R0. This feature makes it easy to save all 40 bits of the
extended-precision registers (see Example 11–2). Using PUSH and PUSHF on the
same register saves all 40 bits: PUSH saves the lower 32 bits; PUSHF, the upper
32. POPF, followed by POP, recovers this extended-precision number. It is
important to perform the integer and floating-point PUSH and POP in the order
given above. POPF forces the least significant eight bits of the
extended-precision registers to 0 and therefore must be performed first.
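The ordering requirement can be seen in a small Python model of a 40-bit register. This is a sketch under assumptions: the helper names are hypothetical, the stack is a Python list, and POP is assumed to leave bits 39–32 of an extended-precision register unchanged, as an integer load would.

```python
MASK32 = 0xFFFFFFFF
MASK40 = (1 << 40) - 1


def push(stack: list, reg: int) -> None:
    stack.append(reg & MASK32)            # PUSH saves bits 31-0


def pushf(stack: list, reg: int) -> None:
    stack.append((reg >> 8) & MASK32)     # PUSHF saves bits 39-8


def popf(stack: list, reg: int) -> int:
    return (stack.pop() << 8) & MASK40    # loads bits 39-8, forces bits 7-0 to 0


def pop(stack: list, reg: int) -> int:
    # Assumed: bits 39-32 are kept, bits 31-0 come from the stack.
    return (reg & 0xFF00000000) | stack.pop()


value = 0x057B4000AB                      # hypothetical 40-bit register contents

# Order used in the manual: PUSH, PUSHF ... POPF, POP
stack = []
push(stack, value)
pushf(stack, value)
restored = pop(stack, popf(stack, 0))

# Mirrored order with POPF last: the low eight bits are lost.
stack = []
pushf(stack, value)
push(stack, value)
damaged = popf(stack, pop(stack, 0))
```

`restored` equals the original 40-bit value, while `damaged` comes back with its low eight bits cleared, because POPF ran after the integer part had already been reloaded.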

You can easily read and write to the SP to create multiple stacks for different
program segments. SP is not initialized by the hardware during reset. It is
therefore important to remember to initialize its value so that SP points to a pre-
determined memory location. This avoids the problem of SP attempting to
write into ROM or otherwise write over useful data.

11.2.3 Interrupt Service Routines


Interrupts on the TMS320C3x are prioritized and vectored. When an interrupt
occurs, the corresponding flag is set in the interrupt flag register (IF). If the
corresponding bit in the interrupt enable register (IE) is set, and interrupts are
enabled by having the GIE bit in the status register set to 1, interrupt processing
begins. You can also write to the interrupt flag register, allowing you to force
an interrupt by software or to clear interrupts without processing them.

Even when an interrupt is disabled, you can read the interrupt flag register (IF)
and take appropriate action, depending on whether the interrupt has occurred.
This can be useful when an interrupt-driven interface is not implemented.
Example 11–3 shows the case in which a subroutine is called when interrupt 1
has not occurred.

Example 11–3. Use of Interrupts for Software Polling


* TITLE INTERRUPT POLLING
.
.
.
TSTB 2,IF ; Test if interrupt 1 has occurred
CALLZ SUBROUTINE ; If not, call subroutine
.
.
.
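The polling test in Example 11–3 reduces to a single bit check. In the Python sketch below, the names `interrupt_pending` and `poll` are hypothetical, and the IF register is represented by a plain integer in which bit 1 stands for interrupt 1, matching the mask value 2 used with TSTB.

```python
def interrupt_pending(if_reg: int, number: int) -> bool:
    """True if the flag bit for the given interrupt number is set in IF."""
    return bool(if_reg & (1 << number))


def poll(if_reg: int, subroutine) -> bool:
    """Call subroutine when interrupt 1 has NOT occurred (TSTB 2,IF; CALLZ)."""
    if not interrupt_pending(if_reg, 1):
        subroutine()
        return True
    return False


calls = []
poll(0b0000, lambda: calls.append("idle work"))   # flag clear: subroutine runs
poll(0b0010, lambda: calls.append("idle work"))   # flag set: subroutine skipped
```

After both calls, `calls` holds a single entry, since the subroutine runs only while the interrupt 1 flag is clear.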

When interrupt processing begins, the PC is pushed onto the stack, and the
interrupt vector is loaded in the PC. Interrupts are then disabled by setting
GIE = 0, and the program continues from the address loaded in the PC. Since
all interrupts are disabled, interrupt processing can proceed without further
interruption, unless the interrupt service routine re-enables interrupts.

Except for very simple interrupt service routines, it is important to ensure that
the processor context is saved during execution of this routine. You must save
the context before you execute the routine itself and restore it after the routine
is finished. The procedure is called context switching. Context switching is also
useful for subroutine calls, especially during extensive use of the auxiliary and
the extended precision registers. This section contains code examples of con-
text switching and an interrupt service routine.

11.2.3.1 Context Switching

Context switching is commonly required during the processing of subroutine
calls or interrupts. It might be quite extensive or it might be simple, depending
on system requirements. On the TMS320C3x, the program counter is
automatically pushed onto the stack. Important information in other TMS320C3x
registers, such as the status, auxiliary, or extended-precision registers, must
be saved explicitly by your code. To preserve the state of the status register,
you should push it first and pop it last. This keeps the restoration of the
extended-precision registers from affecting the status register.

Example 11–4 and Example 11–5 show saving and restoring of the
TMS320C3x state. In both examples, the stack is used for saving the registers,
and it expands towards higher addresses. If you don’t want to use the stack
pointed at by SP, you can create a separate stack by using an auxiliary register
as the stack pointer. Registers saved in these examples are:
- Extended-precision registers R7 through R0
- Auxiliary registers AR7 through AR0
- Data-page pointer DP
- Index registers IR0 and IR1
- Block-size register BK
- Status register ST
- Interrupt-related registers IE and IF
- I/O flag IOF
- Repeat-related registers RS, RE, and RC
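The save and restore orders are mirror images of each other, which the following Python sketch generates from the register list above. The function names and the tuple representation of instructions are hypothetical; the sketch only produces mnemonic sequences and does not model register contents.

```python
EXTENDED = [f"R{i}" for i in range(8)]     # R0..R7, 40 bits each
AUXILIARY = [f"AR{i}" for i in range(8)]   # AR0..AR7
OTHERS = ["DP", "IR0", "IR1", "BK", "IE", "IF", "IOF", "RS", "RE", "RC"]
INVERSE = {"PUSH": "POP", "PUSHF": "POPF"}


def save_sequence() -> list:
    """Instruction order for a full context save (ST goes first)."""
    seq = [("PUSH", "ST")]
    for reg in EXTENDED:
        seq += [("PUSH", reg), ("PUSHF", reg)]   # lower 32 bits, then upper 32
    seq += [("PUSH", reg) for reg in AUXILIARY + OTHERS]
    return seq


def restore_sequence() -> list:
    """Exact reverse of the save, with each push replaced by its pop."""
    return [(INVERSE[op], reg) for op, reg in reversed(save_sequence())]


restore = restore_sequence()
```

Reversing the list automatically places POPF before POP for each extended-precision register and leaves `POP ST` as the final instruction, so restoring R7–R0 cannot disturb the restored status flags.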

Example 11–4. Context Save for the TMS320C3x


* TITLE CONTEXT SAVE FOR THE TMS320C3x
*
*
.global SAVE
*
* CONTEXT SAVE ON SUBROUTINE CALL OR INTERRUPT
*
SAVE:
PUSH ST ; Save status register
*
* SAVE THE EXTENDED PRECISION REGISTERS
*
PUSH R0 ; Save the lower 32 bits
PUSHF R0 ; and the upper 32 bits of R0
PUSH R1 ; Save the lower 32 bits
PUSHF R1 ; and the upper 32 bits of R1
PUSH R2 ; Save the lower 32 bits
PUSHF R2 ; and the upper 32 bits of R2
PUSH R3 ; Save the lower 32 bits
PUSHF R3 ; and the upper 32 bits of R3
PUSH R4 ; Save the lower 32 bits
PUSHF R4 ; and the upper 32 bits of R4
PUSH R5 ; Save the lower 32 bits
PUSHF R5 ; and the upper 32 bits of R5
PUSH R6 ; Save the lower 32 bits
PUSHF R6 ; and the upper 32 bits of R6
PUSH R7 ; Save the lower 32 bits
PUSHF R7 ; and the upper 32 bits of R7
*
* SAVE THE AUXILIARY REGISTERS
*
PUSH AR0 ; Save AR0
PUSH AR1 ; Save AR1
PUSH AR2 ; Save AR2
PUSH AR3 ; Save AR3
PUSH AR4 ; Save AR4
PUSH AR5 ; Save AR5
PUSH AR6 ; Save AR6
PUSH AR7 ; Save AR7
*

* SAVE THE REMAINING REGISTERS FROM THE REGISTER FILE
*
PUSH DP ; Save data page pointer
PUSH IR0 ; Save index register IR0
PUSH IR1 ; Save index register IR1
PUSH BK ; Save block-size register
PUSH IE ; Save interrupt enable register
PUSH IF ; Save interrupt flag register
PUSH IOF ; Save I/O flag register
PUSH RS ; Save repeat start address
PUSH RE ; Save repeat end address
PUSH RC ; Save repeat counter
*
* SAVE IS COMPLETE
*

Example 11–5. Context Restore for the TMS320C3x


*
* TITLE CONTEXT RESTORE FOR THE TMS320C3x
*
.global RESTR
*
* CONTEXT RESTORE AT THE END OF A SUBROUTINE CALL OR INTERRUPT
*
RESTR:
*
* RESTORE THE REMAINING REGISTERS FROM THE REGISTER FILE
*
POP RC ; Restore repeat counter
POP RE ; Restore repeat end address
POP RS ; Restore repeat start address
POP IOF ; Restore I/O flag register
POP IF ; Restore interrupt flag register
POP IE ; Restore interrupt enable register
POP BK ; Restore block-size register
POP IR1 ; Restore index register IR1
POP IR0 ; Restore index register IR0
POP DP ; Restore data page pointer
*
* RESTORE THE AUXILIARY REGISTERS
*
POP AR7 ; Restore AR7
POP AR6 ; Restore AR6
POP AR5 ; Restore AR5
POP AR4 ; Restore AR4
POP AR3 ; Restore AR3
POP AR2 ; Restore AR2
POP AR1 ; Restore AR1
POP AR0 ; Restore AR0
*
* RESTORE THE EXTENDED PRECISION REGISTERS
*

POPF R7 ; Restore the upper 32 bits and
POP R7 ; the lower 32 bits of R7
POPF R6 ; Restore the upper 32 bits and
POP R6 ; the lower 32 bits of R6
POPF R5 ; Restore the upper 32 bits and
POP R5 ; the lower 32 bits of R5
POPF R4 ; Restore the upper 32 bits and
POP R4 ; the lower 32 bits of R4
POPF R3 ; Restore the upper 32 bits and
POP R3 ; the lower 32 bits of R3
POPF R2 ; Restore the upper 32 bits and
POP R2 ; the lower 32 bits of R2
POPF R1 ; Restore the upper 32 bits and
POP R1 ; the lower 32 bits of R1
POPF R0 ; Restore the upper 32 bits and
POP R0 ; the lower 32 bits of R0
POP ST ; Restore status register
*
* RESTORE IS COMPLETE
*

11.2.3.2 Interrupt Priority

Interrupts on the TMS320C3x are automatically prioritized. This allows
interrupts that occur simultaneously to be serviced in a predefined order.
Infrequent but lengthy interrupt service routines might need to be interrupted by
more frequently occurring interrupts. In Example 11–6, the interrupt service
routine for INT2 temporarily modifies the IE to permit interrupt processing when
an interrupt to INT0 (but no other interrupt) occurs. When the routine has
finished processing, the IE register is restored to its original state. Notice that
the RETIcond instruction not only pops the next program counter address from the
stack, but also sets the GIE bit of the status register. This enables all interrupts
that have their interrupt enable bit set.

Example 11–6. Interrupt Service Routine


* TITLE INTERRUPT SERVICE ROUTINE
* .global ISR2
ENABLE .set 2000h
MASK .set 1
*
* INTERRUPT PROCESSING FOR EXTERNAL INTERRUPT INT2-
*
ISR2:
PUSH ST ; Save status register
PUSH DP ; Save data page pointer
PUSH IE ; Save interrupt enable register
PUSH R0 ; Save lower 32 bits and
PUSHF R0 ; upper 32 bits of R0
PUSH R1 ; Save lower 32 bits and
PUSHF R1 ; upper 32 bits of R1
LDI MASK,IE ; Unmask only INT0
OR ENABLE,ST ; Enable all interrupts
*
* MAIN PROCESSING SECTION FOR ISR2
.
.
.
XOR ENABLE,ST ; Disable all interrupts
POPF R1 ; Restore upper 32 bits and
POP R1 ; lower 32 bits of R1
POPF R0 ; Restore upper 32 bits and
POP R0 ; lower 32 bits of R0
POP IE ; Restore interrupt enable register
POP DP ; Restore data page register
POP ST ; Restore status register
*
RETI ; Return and enable interrupts
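The masking used inside ISR2 can be illustrated with a Python model of the enable logic. The function name `serviceable` is hypothetical, and the sketch assumes that a lower bit number in IE and IF means higher priority (INT0 in bit 0).

```python
GIE = 0x2000  # global interrupt enable bit in ST


def serviceable(st: int, ie: int, if_reg: int):
    """Return the number of the interrupt that would be taken, or None."""
    if not st & GIE:
        return None                   # interrupts globally disabled
    pending = ie & if_reg             # flagged and individually enabled
    if pending == 0:
        return None
    # Lowest set bit wins (assumed priority order).
    return (pending & -pending).bit_length() - 1


# Inside ISR2 after LDI MASK,IE and OR ENABLE,ST: IE = 1, GIE set.
taken = serviceable(st=GIE, ie=0x1, if_reg=0b0011)    # INT0 and INT1 flagged
ignored = serviceable(st=GIE, ie=0x1, if_reg=0b0010)  # only INT1 flagged
```

`taken` is 0 (INT0 is serviced) and `ignored` is `None`, so while ISR2 is running with this mask only INT0 can preempt it.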

11.2.4 Delayed Branches


The TMS320C3x uses delayed branches to create single-cycle branching.
The delayed branches operate like regular branches but do not flush the pipe-
line. Instead, the three instructions following a delayed branch are also ex-
ecuted. As discussed in Chapter 6, the only limitations are that none of the
three instructions following a delayed branch can be a:
- Branch (standard or delayed)
- Call to a subroutine
- Return from a subroutine
- Return from an interrupt
- Repeat instruction
- TRAP instruction
- IDLE instruction

Conditional delayed branches use the conditions that exist at the end of the
instruction immediately preceding the delayed branch. Sometimes a branch
is necessary in the flow of a program, but fewer than three instructions can be
placed after a delayed branch. For faster execution, it is still advantageous to
use a delayed branch. This is shown in Example 11–7, with NOPs taking the
place of the unused instructions. The trade-off is more instruction words for
less execution time.

Example 11–7. Delayed Branch Execution


* TITLE DELAYED BRANCH EXECUTION
.
.
.
.
LDF *+AR1(5),R2 ; Load contents of memory to R2
BGED SKIP ; If loaded number >=0, branch (delayed)
LDFN R2,R1 ; If loaded number <0, load it to R1
SUBF 3.0,R1 ; Subtract 3 from R1
NOP ; Dummy operation to complete delayed
* ; branch
MPYF 1.5,R1 ; Continue here if loaded number <0
.
.
.
SKIP LDF R1,R3 ; Continue here if loaded number >=0

Software Applications 11-17



11.2.5 Repeat Modes


The TMS320C3x supports looping without any overhead. For that purpose,
there are two instructions: RPTB repeats a block of code, and RPTS repeats
a single instruction. There are three control registers: repeat start address
(RS), repeat end address (RE), and repeat counter (RC). These contain the
parameters that specify loop execution (refer to Section 6.1 on page 6-2 for
a complete description of RPTB and RPTS). RS and RE are automatically set
from the code, while you must set RC, as shown in the examples below.

11.2.5.1 Block Repeat

Example 11–8 shows an application of the block repeat construct. In this
example, an array of 64 elements is reversed by exchanging the elements that
are equidistant from the two ends of the array. In other words, if the original array is

a(1), a(2),..., a(31), a(32),..., a(64);

the final array after the rearrangement will be

a(64), a(63),..., a(32), a(31),..., a(1).

Because the exchange operation is done on two elements at the same time,
it requires 32 operations. The repeat counter RC is initialized to 31. In general,
if RC contains the number N, the loop will be executed N + 1 times. The loop
is defined by the RPTB instruction and the EXCH label.
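The relationship between the repeat counter and the number of passes can be checked with a small behavioral model. The following Python sketch is not TMS320C3x code; the function name and the use of list indices in place of AR0 and AR1 are assumptions made for illustration only.

```python
def reverse_with_block_repeat(a):
    """Model of the RPTB loop: swap elements equidistant from the two ends.

    Assumes an even-length array of at least two elements, as in the
    64-element example. Returns the reversed copy and the pass count.
    """
    if len(a) < 2 or len(a) % 2:
        raise ValueError("model assumes an even length of at least 2")
    a = list(a)                  # work on a copy
    ar0, ar1 = 0, len(a) - 1     # AR0 -> first element, AR1 -> last element
    rc = len(a) // 2 - 1         # repeat counter; the block runs RC + 1 times
    passes = 0
    while True:
        a[ar0], a[ar1] = a[ar1], a[ar0]   # parallel loads, then parallel stores
        ar0 += 1                 # models *AR0++(1)
        ar1 -= 1                 # models *AR1--(1)
        passes += 1
        if rc == 0:              # counter exhausted: leave the block
            break
        rc -= 1
    return a, passes
```

For a 64-element array, the model starts with a counter of 31 and makes 32 passes.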


Example 11–8. Loop Using Block Repeat


* TITLE LOOP USING BLOCK REPEAT
*
* THIS CODE SEGMENT EXCHANGES THE VALUES OF ARRAY ELEMENTS THAT ARE
* SYMMETRIC AROUND THE MIDDLE OF THE ARRAY.
*
.
.
.
LDI @ADDR,AR0 ; AR0 points to the beginning of the array
LDI AR0,AR1
ADDI 63,AR1 ; AR1 points to the end of the
* ; 64-element array
LDI 31,RC ; Initialize repeat counter
*
RPTB EXCH ; Repeat RC+1 times between here and
* ; EXCH
LDI *AR0,R0 ; Load one memory element in R0,
|| LDI *AR1,R1 ; and the other in R1
EXCH STI R1, *AR0++(1) ; Then, exchange their locations
|| STI R0, *AR1--(1)
.
.
.

Subsection 6.1.2 on page 6-3 specifies restrictions in the block-repeat
construct. Because the program counter is modified at the end of the loop
according to the contents of the registers RS, RE, and RC, no operation should
attempt to modify the repeat counter or the program counter at the end of the
loop in a different way.

In principle, it is possible to nest repeat blocks. However, there is only one set
of control registers: RS, RE, and RC. It is therefore necessary to save these
registers before entering an inside loop. It might be more practical to imple-
ment a nested loop by the more traditional method of using a register as a
counter and then using a delayed branch rather than using the nested repeat
block approach.

Example 11–9 shows another example of using the block repeat to find a maxi-
mum of 147 numbers.


Example 11–9. Use of Block Repeat to Find a Maximum


*
*
* TITLE USE OF BLOCK REPEAT TO FIND A MAXIMUM
*
* THIS ROUTINE FINDS THE MAXIMUM OF N = 147 NUMBERS.
*
.
.
.
LDI 146,RC ; Initialize repeat counter to 147-1
LDI @ADDR,AR0 ; AR0 points to beginning of array
LDF *AR0,R0 ; Initialize MAX to the first value
*
RPTB LOOP
CMPF *AR0++(1),R0 ; Compare number to the maximum
LOOP LDFLT *-AR0(1),R0 ; If greater, this is a new maximum
.
.
.

11.2.5.2 Single-Instruction Repeat

The single-instruction repeat uses the control registers RS, RE, and RC in the
same way as the block repeat. The advantage over the block repeat is that the
instruction is fetched only once, and then the buses are available for moving
operands. Note that the single-instruction repeat construct is not interruptible,
while block repeat is interruptible.

Example 11–10 shows an application of the single-repeat construct. In this
example, the sum of the products of two arrays is computed. The arrays are not
necessarily different. If the arrays are a(i) and b(i), each of length N = 512,
register R0 will contain, after computation, this quantity:

a(1) b(1) + a(2) b(2) + ... + a(N) b(N).

Because the first product is computed before the repeated instruction and the
last accumulation is performed after it, the repeated instruction must execute
N – 1 = 511 times, so the repeat count specified in the RPTS instruction is 510.
In general, if RC contains the number n, the instruction is executed n + 1 times.

Example 11–10. Loop Using Single Repeat


* TITLE LOOP USING SINGLE REPEAT
*
* THIS CODE SEGMENT COMPUTES SUM[a(i)b(i)] FOR i = 1 to N.
*
*
.
.
.
LDI @ADDR1,AR0 ; AR0 points to array a(i)
LDI @ADDR2,AR1 ; AR1 points to array b(i)
*
LDF 0.0,R0 ; Initialize R0
*
MPYF3 *AR0++(1),*AR1++(1),R1
* ; Compute first product
RPTS 510 ; Repeat 511 times
*
MPYF3 *AR0++(1),*AR1++(1),R1 ; Compute next product
|| ADDF3 R1,R0,R0 ; and accumulate the
* ; previous one
*
ADDF R1,R0 ; One final addition
.
.
.
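The pipelined structure of this loop (one multiply before the loop, a parallel multiply and accumulate inside it, and one addition after it) can be sketched as a behavioral model. This Python sketch is an illustration only; the function name is hypothetical, and Python floats stand in for the extended-precision registers R0 and R1.

```python
def pipelined_dot(a, b):
    """Model of the software-pipelined sum of products.

    Returns the sum and the number of multiplies performed, to show that
    n products need only n - 1 passes of the repeated instruction.
    """
    n = len(a)
    r0 = 0.0                     # LDF 0.0,R0
    r1 = a[0] * b[0]             # first product, computed before the loop
    multiplies = 1
    for i in range(1, n):        # the repeated instruction runs n - 1 times
        # ADDF3 uses the old R1 while MPYF3 produces the new one
        r0, r1 = r0 + r1, a[i] * b[i]
        multiplies += 1
    r0 += r1                     # one final addition after the loop
    return r0, multiplies
```

For arrays of length n, the model performs exactly n multiplies, which corresponds to a repeat count of n – 2 on the RPTS instruction.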


11.2.6 Computed GOTOs


It is occasionally convenient to select during run time (and not during assem-
bly) the subroutine to be executed. The TMS320C3x’s computed GOTO sup-
ports this selection. The computed GOTO is implemented using the CALLcond
instruction in the register-addressing mode. This instruction uses the contents
of the register as the address of the call. Example 11–11 shows a computed
GOTO for a task controller.

Example 11–11. Computed GOTO


* TITLE COMPUTED GOTO
*
* TASK CONTROLLER
*
* THIS MAIN ROUTINE CONTROLS THE ORDER OF TASK EXECUTION (6 TASKS
* IN THE PRESENT EXAMPLE). TASK0 THROUGH TASK5 ARE THE NAMES OF
* SUBROUTINES TO BE CALLED. THEY ARE EXECUTED IN ORDER, TASK0,
* TASK1, . . .TASK5. WHEN AN INTERRUPT OCCURS, THE INTERRUPT
* SERVICE ROUTINE IS EXECUTED, AND THE PROCESSOR CONTINUES
* WITH THE INSTRUCTION FOLLOWING THE IDLE INSTRUCTION. THIS
* ROUTINE SELECTS THE TASK APPROPRIATE FOR THE CURRENT CYCLE,
* CALLS THE TASK AS A SUBROUTINE, AND BRANCHES BACK TO THE IDLE
* TO WAIT FOR THE NEXT SAMPLE INTERRUPT WHEN THE SCHEDULED TASK
* HAS COMPLETED EXECUTION. R0 HOLDS THE OFFSET FROM THE BASE
* ADDRESS OF THE TASK TO BE EXECUTED.
*
*
LDI 5,R0 ; Initialize R0
LDI @ADDR,AR1 ; AR1 holds base address of the table
WAIT IDLE ; Wait for the next interrupt
ADDI3 AR1,R0,AR2 ; Add the base address to the table
* ; entry number
SUBI 1,R0 ; Decrement R0
LDILT 5,R0 ; If R0<0, reinitialize it to 5
LDI *AR2,R1 ; Load the task address
CALLU R1 ; Execute appropriate task
BR WAIT
*
TSKSEQ .word TASK5 ; Address of TASK5
.word TASK4 ; Address of TASK4
.word TASK3 ; Address of TASK3
.word TASK2 ; Address of TASK2
.word TASK1 ; Address of TASK1
.word TASK0 ; Address of TASK0
ADDR .word TSKSEQ
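The table-driven dispatch can be sketched as a behavioral model: an offset selects an entry from a table of routine addresses, the offset is decremented and wrapped, and the selected routine is called. The Python below is an illustration only; make_controller and on_interrupt are hypothetical names, and Python callables stand in for the task addresses stored in TSKSEQ.

```python
def make_controller(tasks):
    """Build a dispatcher that models the task controller.

    tasks[k] plays the role of TASKk. The table is stored in reverse
    order (TASK5 first), as in the TSKSEQ table.
    """
    table = list(reversed(tasks))        # TSKSEQ: TASK5, TASK4, ..., TASK0
    state = {"r0": len(tasks) - 1}       # R0 = offset into the table

    def on_interrupt():
        entry = table[state["r0"]]       # load the task address at base + R0
        state["r0"] -= 1                 # SUBI 1,R0
        if state["r0"] < 0:              # LDILT: wrap back to the last offset
            state["r0"] = len(tasks) - 1
        return entry()                   # CALLU R1

    return on_interrupt
```

Each call to on_interrupt models one pass from the IDLE instruction through the CALLU instruction, so successive calls run TASK0, TASK1, and so on, and then wrap around.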


11.3 Logical and Arithmetic Operations


The TMS320C3x instruction set supports both integer and floating-point arith-
metic and logical operations. The basic functions of such instructions can be
combined to form more complex operations. This section examines examples
of these operations:
- Bit manipulation
- Block moves
- Bit-reversed addressing
- Integer and floating-point division
- Square root
- Extended-precision arithmetic
- Floating-point format conversion between IEEE and TMS320C3x formats

11.3.1 Bit Manipulation


Instructions for logical operations, such as AND, OR, NOT, ANDN, and XOR
can be used together with the shift instructions for bit manipulation. A special
instruction, TSTB, tests bits. TSTB performs the same operation as AND, but
the result of the logical AND is only used to set the condition flags and is not
written anywhere. Example 11–12 and Example 11–13 demonstrate the use
of several instructions for bit manipulation and testing.

Example 11–12. Use of TSTB for Software-Controlled Interrupt


* TITLE USE OF TSTB FOR SOFTWARE-CONTROLLED INTERRUPT
*
* IN THIS EXAMPLE, ALL INTERRUPTS HAVE BEEN DISABLED BY
* RESETTING THE GIE BIT OF THE STATUS REGISTER. WHEN AN
* INTERRUPT ARRIVES, IT IS STORED IN THE IF REGISTER. THE
* PRESENT EXAMPLE ACTIVATES THE INTERRUPT SERVICE ROUTINE INTR
* WHEN IT DETECTS THAT INT2- HAS OCCURRED.
.
.
.
TSTB 0100b,IF ; Check if bit 2 of IF is set,
CALLNZ INTR ; and, if so, call subroutine INTR
.
.
.


Example 11–13. Copy a Bit From One Location to Another


* TITLE COPY A BIT FROM ONE LOCATION TO ANOTHER
*
* BIT I OF R1 NEEDS TO BE COPIED TO BIT J OF R2.
* AR0 POINTS TO A LOCATION HOLDING I, AND IT IS ASSUMED THAT THE
* NEXT MEMORY LOCATION HOLDS THE VALUE J.
*
* R1: BIT I IS THE SOURCE BIT
* R2: BIT J IS THE DESTINATION BIT
*
* MEMORY: *AR0 HOLDS I
*         *+AR0(1) HOLDS J
*
.
.
.
LDI 1,R0
LSH *AR0,R0 ; Shift 1 to align it with bit I
TSTB R1,R0 ; Test the Ith bit of R1
BZD CONT ; If bit = 0, branch delayed
LDI 1,R0
LSH *+AR0(1),R0 ; Align 1 with Jth location
ANDN R0,R2 ; If bit = 0, reset Jth bit of R2
OR R0,R2 ; If bit = 1, set Jth bit of R2
CONT .
.
.
.
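The effect of this sequence can be sketched as a behavioral model. Because the ANDN instruction sits in a delay slot of the branch, bit J is always cleared first and is then set again only if bit I of the source was 1. The Python below is an illustration only; the function name and the 32-bit mask constant are assumptions.

```python
MASK32 = 0xFFFFFFFF  # registers are modeled as 32-bit unsigned values


def copy_bit(r1, r2, i, j):
    """Return r2 with bit j replaced by bit i of r1."""
    bit_is_set = (r1 & (1 << i)) != 0    # TSTB: test only, nothing is written
    mask = 1 << j                        # align a 1 with the Jth location
    r2 &= ~mask & MASK32                 # ANDN: always clears bit J
    if bit_is_set:
        r2 |= mask                       # OR: reached only when the bit is 1
    return r2 & MASK32
```

For example, copying a set bit 3 into bit 0 of an all-zero word gives 1, and copying a clear bit into bit 0 of 0FFh gives 0FEh.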


11.3.2 Block Moves

Since the TMS320C3x directly addresses a large amount of memory, blocks
of data or program code can be stored off-chip in slow memories and then
loaded on-chip for faster execution. Data can also be moved from on-chip to
off-chip memory for storage or for multiprocessor data transfers.

You can use direct memory access (DMA) in parallel with CPU operations to
accomplish such data transfers. The DMA operation is explained in detail in
subsection 8.3 on page 8-43. An alternative to DMA is to perform data trans-
fers under program control using load and store instructions in a repeat mode.
Example 11–14 shows the transfer of a block of 512 floating-point numbers
from external memory to block 1 of the on-chip RAM.

Example 11–14. Block Move Under Program Control


* TITLE BLOCK MOVE UNDER PROGRAM CONTROL
*
extern .word 01000H
block1 .word 0809C00H
.
.
.
LDI @extern,AR0 ; Source address
LDI @block1,AR1 ; Destination address
LDF *AR0++,R0 ; Load the first number
RPTS 510 ; Repeat following instruction 511 times
LDF *AR0++,R0 ; Load the next number, and...
|| STF R0,*AR1++ ; store the previous one
STF R0,*AR1 ; Store the last number
.
.
.

11.3.3 Bit-Reversed Addressing

The TMS320C3x can implement fast Fourier transforms (FFT) with bit-reversed
addressing. If the data to be transformed is in the correct order, the final
result of the FFT is scrambled in bit-reversed order. To recover the frequency-
domain data in the correct order, you must swap certain memory locations.
The bit-reversed addressing mode makes swapping unnecessary. The next
time data needs to be accessed, the access is performed in a bit-reversed
manner rather than sequentially. The base address of bit-reversed addressing
must be located on a boundary of the size of the table. For example, if IR0 =
2**(n–1), the n LSBs of the base address must be 0.


In bit-reversed addressing, IR0 holds a value equal to one-half the size of the
FFT, if real and imaginary data are stored in separate arrays. During access-
ing, the auxiliary register is indexed by IR0, but with reverse carry propagation.
Example 11–15 illustrates a 512-point complex FFT being moved from the
place of computation (pointed at by AR0) to a location pointed at by AR1. In
this example, real and imaginary parts XR(i) and XI(i) of the data are not stored
in separate arrays, but they are interleaved XR(0), XI(0), XR(1), XI(1), ...,
XR(N-1), XI(N-1). Because of this arrangement, the length of the array is 2N
instead of N, and IR0 is set to 512 instead of 256.
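Indexing with reverse carry propagation can be modeled by bit-reversing both operands, adding them normally, and bit-reversing the sum. The Python sketch below is an illustration of the address sequence only; the function names are hypothetical, and the model assumes that IR0 is a power of two and that the base address is aligned as described above.

```python
def bit_reverse(value, bits):
    """Reverse the order of the low `bits` bits of value."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(addr, ir0, bits):
    """Add ir0 to the low `bits` bits of addr, carrying toward the LSB."""
    low_mask = (1 << bits) - 1
    base = addr & ~low_mask              # upper address bits are unchanged
    offset = addr & low_mask
    total = (bit_reverse(offset, bits) + bit_reverse(ir0, bits)) & low_mask
    return base | bit_reverse(total, bits)


def bit_reversed_sequence(base, ir0, count):
    """List the addresses generated by `count` accesses of *ARn++(IR0)B."""
    bits = ir0.bit_length()              # IR0 = 2**(bits - 1)
    addr, sequence = base, []
    for _ in range(count):
        sequence.append(addr)
        addr = reverse_carry_add(addr, ir0, bits)
    return sequence
```

With IR0 = 512 and an interleaved array of 1024 words, the first 512 addresses produced by this model are all even, so they select the real parts in bit-reversed order.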

Example 11–15. Bit-Reversed Addressing


*
* TITLE BIT-REVERSED ADDRESSING
*
* THIS EXAMPLE MOVES THE RESULT OF THE 512-POINT FFT
* COMPUTATION POINTED AT BY AR0 TO A LOCATION POINTED AT
* BY AR1. REAL AND IMAGINARY POINTS ARE ALTERNATING.
.
.
.
LDI 512,IR0
LDI 2,IR1
LDI 511,RC ; Repeat 511+1 times
LDF *+AR0(1),R1 ; Load first imaginary point
RPTB LOOP
*
LDF *AR0++(IR0)B,R0 ; Load real value (and point
|| STF R1,*+AR1(1) ; to next location) and store
* ; the imaginary value
LOOP LDF *+AR0(1),R1 ; Load next imaginary point and store
|| STF R0,*AR1++(IR1) ; previous real value
.
.
.

11.3.4 Integer and Floating-Point Division


Although division is not implemented as a single instruction in the
TMS320C3x, the instruction set has the capacity to perform an efficient divi-
sion routine. Integer and floating-point division are examined separately be-
cause different algorithms are used.


11.3.4.1 Integer Division

Division is implemented on the TMS320C3x by repeated subtractions using
SUBC, a special conditional subtract instruction. Consider the case of a 32-bit
positive dividend with i significant bits (and 32 – i sign bits) and a 32-bit positive
divisor with j significant bits (and 32 – j sign bits). The repetition of the SUBC
command i – j + 1 times produces a 32-bit result in which the lower i – j +
1 bits are the quotient and the upper 31 – i + j bits are the remainder of the
division.

SUBC implements binary division in the same manner that long division
implements it. The divisor (which is assumed to be smaller than the dividend) is
shifted left by i – j positions to be aligned with the dividend. Then, using SUBC,
the shifted divisor is subtracted from the dividend. For each subtraction that does
not produce a negative answer, the dividend is replaced by the difference. It
is then shifted to the left, and a 1 is put in the LSB. If the difference is negative,
the dividend is simply shifted left by 1. This operation is repeated
i – j + 1 times.


As an example, consider the division of 33 by 5, using both long division and
the SUBC method. In this case, i = 6, j = 3, and the SUBC operation is repeated
6 – 3 + 1 = 4 times.

Long Division:

               110      Quotient
      101 ) 100001      Dividend
           - 101
             ---
              110
            - 101
              ---
                11      Remainder

SUBC Method:

  00000000000000000000000000100001    Dividend
  00000000000000000000000000101000    Divisor (Aligned)
  Negative Difference                 (First SUBC Command)

  00000000000000000000000001000010    New Dividend + Quotient
  00000000000000000000000000101000    Divisor
  00000000000000000000000000011010    Difference (> 0) (Second SUBC Command)

  00000000000000000000000000110101    New Dividend + Quotient
  00000000000000000000000000101000    Divisor
  00000000000000000000000000001101    Difference (> 0) (Third SUBC Command)

  00000000000000000000000000011011    New Dividend + Quotient
  00000000000000000000000000101000    Divisor
  Negative Difference                 (Fourth SUBC Command)

  00000000000000000000000000110110    Final Result

In the final result, the lower 4 bits (0110) are the quotient, and the
upper bits (11) are the remainder.
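The conditional-subtract step and the repeated division can be sketched as a behavioral model. The Python below is an illustration only, not TMS320C3x code; the function names are hypothetical, and the model assumes positive operands with the dividend at least as large as the divisor and fewer than 32 significant bits.

```python
MASK32 = 0xFFFFFFFF  # registers are modeled as 32-bit unsigned values


def subc(src, dst):
    """One conditional-subtract step: returns the new value of dst."""
    diff = dst - src
    if diff >= 0:
        return ((diff << 1) | 1) & MASK32   # keep difference, quotient bit 1
    return (dst << 1) & MASK32              # keep dividend, quotient bit 0


def divide_positive(dividend, divisor):
    """Divide by repeated conditional subtraction; returns (quotient, remainder)."""
    i = dividend.bit_length()               # significant bits of the dividend
    j = divisor.bit_length()                # significant bits of the divisor
    count = i - j                           # alignment shift
    aligned = divisor << count              # divisor aligned with the dividend
    acc = dividend
    for _ in range(count + 1):              # i - j + 1 repetitions
        acc = subc(aligned, acc)
    quotient = acc & ((1 << (count + 1)) - 1)   # lower i - j + 1 bits
    remainder = acc >> (count + 1)              # remaining upper bits
    return quotient, remainder
```

Dividing 33 by 5 with this model takes four steps and leaves 110110 in the accumulator, which splits into a quotient of 6 and a remainder of 3.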

When the SUBC command is used, both the dividend and the divisor must be
positive. Example 11–16 shows an implementation of integer division in which
the sign of the quotient is properly handled. The last instruction before
returning sets the condition flags from the quotient in case subsequent
operations depend on the sign of the result.


Example 11–16. Integer Division


*
* TITLE INTEGER DIVISION
*
* SUBROUTINE DIVI
*
*
* INPUTS: SIGNED INTEGER DIVIDEND IN R0,
* SIGNED INTEGER DIVISOR IN R1
*
* OUTPUT: R0/R1 into R0
*
* REGISTERS USED: R0-R3, IR0, IR1
*
* OPERATION: 1. NORMALIZE DIVISOR WITH DIVIDEND
* 2. REPEAT SUBC
* 3. QUOTIENT IS IN LSBs OF RESULT
*
* CYCLES: 31-62 (DEPENDS ON AMOUNT OF NORMALIZATION)
*
.globl DIVI
*
SIGN .set R2
TEMPF .set R3
TEMP .set IR0
COUNT .set IR1
*
* DIVI - SIGNED DIVISION
*
DIVI:
*
* DETERMINE SIGN OF RESULT. GET ABSOLUTE VALUE OF OPERANDS.
*
XOR R0,R1,SIGN ; Get the sign
ABSI R0
ABSI R1
CMPI R0,R1 ; Divisor > dividend ?
BGTD ZERO ; If so, return 0
*
* NORMALIZE OPERANDS. USE DIFFERENCE IN EXPONENTS AS SHIFT COUNT
* FOR DIVISOR AND AS REPEAT COUNT FOR 'SUBC'.
*
FLOAT R0,TEMPF ; Normalize dividend
PUSHF TEMPF ; PUSH as float
POP COUNT ; POP as int
LSH -24,COUNT ; Get dividend exponent
FLOAT R1,TEMPF ; Normalize divisor
PUSHF TEMPF ; PUSH as float
POP TEMP ; POP as int
LSH -24,TEMP ; Get divisor exponent
SUBI TEMP,COUNT ; Get difference in exponents
LSH COUNT,R1 ; Align divisor with dividend
*
* DO COUNT+1 SUBTRACT & SHIFTS.
*
RPTS COUNT
SUBC R1,R0
*
* MASK OFF THE LOWER COUNT+1 BITS OF R0.
*
SUBRI 31,COUNT ; Shift count is (32 - (COUNT+1))
LSH COUNT,R0 ; Shift left
NEGI COUNT
LSH COUNT,R0 ; Shift right to get result
*
* CHECK SIGN AND NEGATE RESULT IF NECESSARY.
*
NEGI R0,R1 ; Negate result
ASH -31,SIGN ; Check sign
LDINZ R1,R0 ; If set, use negative result
CMPI 0,R0 ; Set status from result
RETS
*
* RETURN 0.
*
ZERO:
LDI 0,R0
RETS
.end

If the dividend is less than the divisor and you want fractional division, you can
perform a division after you determine the desired accuracy of the quotient in
bits. If the desired accuracy is k bits, start by shifting the dividend left by k posi-
tions. Then apply the algorithm described above, with i replaced by i + k. It is
assumed that i + k is less than 32.
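The scaling step for fractional division can be sketched as a short model. The Python below is an illustration only; the function name is hypothetical, and Python integer division is used as a stand-in for the repeated SUBC loop, on the assumption that both produce the same quotient for positive operands.

```python
def fractional_divide(dividend, divisor, k):
    """Return the k-bit fractional quotient of dividend / divisor.

    Assumes 0 < dividend < divisor and that the scaled dividend still
    fits in fewer than 32 bits.
    """
    scaled = dividend << k           # shift the dividend left by k positions
    quotient = scaled // divisor     # stand-in for the SUBC division loop
    return quotient, quotient / (1 << k)   # raw bits and their fractional value
```

For example, 1/3 with k = 8 gives the quotient bits 01010101 (85), which represents 85/256, or about 0.332.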


11.3.4.2 Computation of Floating-Point Inverse and Division

This section presents a method of implementing floating-point division on the
TMS320C3x. Since the algorithm outlined here computes the inverse of a
number v, to perform y / v, multiply y by the inverse of v.

The computation of 1 / v is based on the following iterative algorithm. At the
ith iteration, the estimate x[i] of 1 / v is computed from v and the previous
estimate x[i – 1] according to the following formula:

x[i] = x[i – 1] * (2.0 – v * x[i – 1])

To start the operation, an initial estimate x[0] is needed. If v = a * 2**e, a good
initial estimate is

x[0] = 1.0 * 2**(–e – 1)

Example 11–17 shows the implementation of this algorithm on the
TMS320C3x, where the iteration has been applied five times. Both accuracy
and speed are affected by the number of iterations. The accuracy offered by
the single-precision floating-point format is 2**(–23) = 1.192E–7. If you want
more accuracy, use more iterations. If you want less accuracy, reduce the
number of iterations to increase the execution speed.

This algorithm properly treats the boundary conditions when the input number
either is 0 or has a very large value. When the input is 0, the exponent
e = –128. Then the calculation of x[0] yields an exponent equal to
–(–128) – 1 = 127, and the algorithm will overflow and saturate. On the other
hand, in the case of a very large number, e = 127, the exponent of x[0] will be
–127 – 1 = –128. This will cause the algorithm to yield 0, which is a reasonable
handling of that boundary condition.
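The iteration and its initial estimate can be tried out in a short model. The Python sketch below is an illustration only; the function name is hypothetical, double-precision Python floats replace the TMS320C3x format, and the zero input is rejected instead of saturating as the hardware routine does.

```python
import math


def reciprocal(v, iterations=5):
    """Approximate 1/v with the iteration x[i] = x[i-1] * (2 - v * x[i-1])."""
    if v == 0.0:
        raise ValueError("model does not reproduce the saturating zero case")
    a = abs(v)                           # the algorithm works on |v|
    _, exp = math.frexp(a)               # a = m * 2**exp with 0.5 <= m < 1
    e = exp - 1                          # rewrite as a = m' * 2**e, 1 <= m' < 2
    x = math.ldexp(1.0, -e - 1)          # x[0] = 1.0 * 2**(-e-1)
    for _ in range(iterations):
        x = x * (2.0 - a * x)            # the error is squared on each pass
    return -x if v < 0 else x            # restore the sign of v
```

With this starting value, the product v * x[0] lies between 0.5 and 1, so the relative error starts at no more than 0.5 and is squared on every pass.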


Example 11–17. Inverse of a Floating-Point Number


*
* TITLE INVERSE OF A FLOATING-POINT NUMBER
*
*
* SUBROUTINE INVF
*
*
* THE FLOATING-POINT NUMBER v IS STORED IN R0. AFTER THE
* COMPUTATION IS COMPLETED, 1/v IS ALSO STORED IN R0.
*
* TYPICAL CALLING SEQUENCE:
* LDF v, R0
* CALL INVF
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R0 | v = NUMBER TO FIND THE RECIPROCAL OF (UPON THE CALL)
* R0 | 1/v (UPON THE RETURN)
*
* REGISTER USED AS INPUT: R0
* REGISTERS MODIFIED: R0, R1, R2, R3
* REGISTER CONTAINING RESULT: R0
*
* CYCLES: 35 WORDS: 32
*
*
.global INVF
*
INVF: LDF R0,R3 ; v is saved for later
ABSF R0 ; The algorithm uses v = |v|
*
* EXTRACT THE EXPONENT OF v.
*
PUSHF R0
POP R1
ASH -24,R1 ; The 8 LSBs of R1 contain the exponent
* ; of v
*
* x[0] FORMATION IS GIVEN THE EXPONENT OF v.
*
NEGI R1
SUBI 1,R1 ; Now we have -e-1, the exponent of x[0]
ASH 24,R1
PUSH R1
POPF R1 ; Now R1 = x[0] = 1.0 * 2**(-e-1)
*
* NOW THE ITERATIONS BEGIN.
*
MPYF R1,R0,R2 ; R2 = v * x[0]
SUBRF 2.0,R2 ; R2 = 2.0 - v * x[0]
MPYF R2,R1 ; R1 = x[1] = x[0] * (2.0 - v * x[0])
*
MPYF R1,R0,R2 ; R2 = v * x[1]
SUBRF 2.0,R2 ; R2 = 2.0 - v * x[1]
MPYF R2,R1 ; R1 = x[2] = x[1] * (2.0 - v * x[1])
*
MPYF R1,R0,R2 ; R2 = v * x[2]
SUBRF 2.0,R2 ; R2 = 2.0 - v * x[2]
MPYF R2,R1 ; R1 = x[3] = x[2] * (2.0 - v * x[2])
*
MPYF R1,R0,R2 ; R2 = v * x[3]
SUBRF 2.0,R2 ; R2 = 2.0 - v * x[3]
MPYF R2,R1 ; R1 = x[4] = x[3] * (2.0 - v * x[3])
*
RND R1 ; This minimizes error in the LSBs
*
* FOR THE LAST ITERATION WE USE THE FORMULATION:
* x[5] = (x[4] * (1.0 - (v * x[4]))) + x[4]
*
MPYF R1,R0,R2 ; R2 = v * x[4] = 1.0..01.. => 1
SUBRF 1.0,R2 ; R2 = 1.0 - v * x[4] = 0.0..01... => 0
MPYF R1,R2 ; R2 = x[4] * (1.0 - v * x[4])
ADDF R2,R1 ; R1 = x[5] = (x[4]*(1.0-(v*x[4])))+x[4]
*
RND R1,R0 ; Round since this is followed by a MPYF
*
* NOW THE CASE OF v < 0 IS HANDLED.
*
NEGF R0,R2
LDF R3,R3 ; This sets condition flags
LDFN R2,R0 ; If v < 0, then R0 = -R0
*
RETS
*
* END
*
.end


11.3.5 Square Root


An iterative algorithm computes the square root on the TMS320C3x and is similar
to the one used for the computation of the inverse. This algorithm computes
the inverse of the square root of a number v, 1 / SQRT(v). To derive SQRT(v),
multiply this result by v. Because many applications need to divide by the
square root of a number, having 1 / SQRT(v) available directly saves a separate
inversion.

At the ith iteration, the estimate x[i] of 1 / SQRT(v) is computed from v and the
previous estimate x[i-1] according to this formula:

x [i] = x [i – 1] * (1.5 – (v / 2) * x [i – 1] * x [i – 1])

To start the operation, an initial estimate x[0] is needed. If v = a * 2**e, a good
initial estimate is

x[0] = 1.0 * 2**(–e/2)

Example 11–18 shows the implementation of this algorithm on the
TMS320C3x, where the iteration has been applied five times. Both accuracy
and speed are affected by the number of iterations. If you want more accuracy,
use more iterations. If you want less accuracy, reduce the number of iterations
to increase the execution speed.
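The inverse-square-root iteration can be tried out in a short model. The Python sketch below is an illustration only; the function name is hypothetical, double-precision Python floats replace the TMS320C3x format, and the halved exponent is rounded by adding 1 before the arithmetic shift, as in the listing that follows.

```python
import math


def sqrt_newton(v, iterations=5):
    """Approximate sqrt(v) by iterating on x ~ 1/sqrt(v) and returning v * x."""
    if v <= 0.0:
        return v                         # non-positive inputs are returned as is
    _, exp = math.frexp(v)               # v = m * 2**exp with 0.5 <= m < 1
    e = exp - 1                          # rewrite as v = a * 2**e, 1 <= a < 2
    half_e = (e + 1) >> 1                # add a rounding bit, then halve
    x = math.ldexp(1.0, -half_e)         # x[0] = 1.0 * 2**(-e/2)
    half_v = 0.5 * v                     # v/2 is computed once
    for _ in range(iterations):
        x = x * (1.5 - half_v * x * x)   # x[i] = x[i-1]*(1.5 - (v/2)*x[i-1]**2)
    return v * x                         # sqrt(v) = v * (1/sqrt(v))
```

With this starting value, v * x[0] * x[0] lies between 0.5 and 2, which keeps the iteration inside its region of convergence.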


Example 11–18. Square Root of a Floating-Point Number


*
* TITLE SQUARE ROOT OF A FLOATING-POINT NUMBER
*
*
* SUBROUTINE SQRT
*
* THE FLOATING POINT NUMBER v IS STORED IN R0. AFTER THE
* COMPUTATION IS COMPLETED, SQRT(v) IS ALSO STORED IN R0. NOTE
* THAT THE ALGORITHM ACTUALLY COMPUTES 1/SQRT(v).
*
*
* TYPICAL CALLING SEQUENCE:
*
* LDF v, R0
* CALL SQRT
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R0 | v = NUMBER TO FIND THE SQUARE ROOT OF
* | (UPON THE CALL)
* R0 | SQRT(v) (UPON THE RETURN)
*
* REGISTER USED AS INPUT: R0
* REGISTERS MODIFIED: R0, R1, R2, R3
* REGISTER CONTAINING RESULT: R0
*
* CYCLES: 50 WORDS: 39
*
.global SQRT
*
* EXTRACT THE EXPONENT OF V.
*
SQRT: LDF R0,R3 ; Save v
RETSLE ; Return if number is non-positive
PUSHF R0
POP R1
ASH -24,R1 ; The 8 LSBs of R1 contain exponent of v
ADDI 1,R1 ; Add a rounding bit in the exponent
ASH -1,R1 ; e/2
*
* X[0] FORMATION GIVEN THE EXPONENT OF V.
*
NEGI R1
ASH 24,R1
PUSH R1
POPF R1 ; Now R1 = x[0] = 1.0 * 2**(-e/2)
*
* GENERATE V/2.
*
MPYF 0.5,R0 ; V/2 and take rounding bit out
*
* NOW THE ITERATIONS BEGIN.
*
MPYF R1,R1,R2 ; R2 = x[0] * x[0]
MPYF R0,R2 ; R2 = (v/2) * x[0] * x[0]
SUBRF 1.5,R2 ; R2 = 1.5 - (v/2) * x[0] * x[0]
MPYF R2,R1 ; R1 = x[1] = x[0] *
* ; (1.5 - (v/2)*x[0]*x[0])
RND R1
MPYF R1,R1,R2 ; R2 = x[1] * x[1]
MPYF R0,R2 ; R2 = (v/2) * x[1] * x[1]
SUBRF 1.5,R2 ; R2 = 1.5 - (v/2) * x[1] * x[1]
MPYF R2,R1 ; R1 = x[2] = x[1] *
* ; (1.5 - (v/2)*x[1]*x[1])
RND R1
MPYF R1,R1,R2 ; R2 = x[2] * x[2]
MPYF R0,R2 ; R2 = (v/2) * x[2] * x[2]
SUBRF 1.5,R2 ; R2 = 1.5 - (v/2) * x[2] * x[2]
MPYF R2,R1 ; R1 = x[3] = x[2] *
* ; (1.5 - (v/2)*x[2]*x[2])
RND R1
*
MPYF R1,R1,R2 ; R2 = x[3] * x[3]
MPYF R0,R2 ; R2 = (v/2) * x[3] * x[3]
SUBRF 1.5,R2 ; R2 = 1.5 - (v/2) * x[3] * x[3]
MPYF R2,R1 ; R1 = x[4] = x[3] *
* ; (1.5 - (v/2)*x[3]*x[3])
RND R1
*
MPYF R1,R1,R2 ; R2 = x[4] * x[4]
MPYF R0,R2 ; R2 = (v/2) * x[4] * x[4]
SUBRF 1.5,R2 ; R2 = 1.5 - (v/2) * x[4] * x[4]
MPYF R2,R1 ; R1 = x[5] = x[4] *
* ; (1.5 - (v/2)*x[4]*x[4])
*
*
RND R1,R0 ; Round
*
MPYF R3,R0 ; Sqrt(v) = v * (1/sqrt(v))
*
RETS
*
* end
*
.end


11.3.6 Extended-Precision Arithmetic


The TMS320C3x offers 32 bits of precision for integer arithmetic and 24 bits
of precision in the mantissa for floating-point arithmetic. For higher precision
in floating-point operations, the eight extended-precision registers R7 to R0
contain eight additional bits of accuracy. Since no comparable extension is
available for fixed-point arithmetic, this section shows how you can achieve
fixed-point double precision by using the capabilities of the processor. The
technique consists of performing the arithmetic by parts (which is similar to
performing longhand arithmetic).

In the instruction set, operations ADDC (add with carry) and SUBB (subtract
with borrow) use the status carry bit for extended-precision arithmetic. The
carry bit is affected by the arithmetic operations of the ALU and by the rotate
and shift instructions. It can also be manipulated directly by setting the status
register to certain values. For proper operation, the overflow mode bit should
be reset (OVM = 0) so that the accumulator results are not loaded with the sat-
uration values. Example 11–19 and Example 11–20 show 64-bit addition and
64-bit subtraction. The first operand is stored in the registers R0 (low word) and
R1 (high word). The second operand is stored in R2 and R3. The result is
stored in R0 and R1.


Example 11–19. 64-Bit Addition


* TITLE 64-BIT ADDITION
*
* TWO 64-BIT NUMBERS ARE ADDED TO EACH OTHER, PRODUCING
* A 64-BIT RESULT. THE NUMBERS X (R1,R0) AND Y (R3,R2) ARE
* ADDED, RESULTING IN W (R1,R0).
*
* R1 R0
* + R3 R2
* –––––––––
* R1 R0
*
ADDI R2,R0
ADDC R3,R1

Example 11–20. 64-Bit Subtraction


* TITLE 64-BIT SUBTRACTION
*
* TWO 64-BIT NUMBERS ARE SUBTRACTED FROM EACH OTHER
* PRODUCING A 64-BIT RESULT. THE NUMBERS X (R1,R0) AND
* Y (R3,R2) ARE SUBTRACTED, RESULTING IN W (R1,R0).
*
* R1 R0
* – R3 R2
* –––––––––
* R1 R0
*
SUBI R2,R0
SUBB R3,R1
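The way the carry links the low-word and high-word operations can be sketched as a behavioral model. The Python below is an illustration only; the function names are hypothetical, and each 64-bit value is passed as a pair of unsigned 32-bit words, low word first.

```python
MASK32 = 0xFFFFFFFF  # registers are modeled as 32-bit unsigned values


def add64(x_lo, x_hi, y_lo, y_hi):
    """Add two 64-bit values held as (low word, high word) pairs."""
    lo = x_lo + y_lo                 # ADDI: low words, may carry out of bit 31
    carry = lo >> 32                 # the carry bit left in the status register
    hi = x_hi + y_hi + carry         # ADDC: high words plus carry
    return lo & MASK32, hi & MASK32


def sub64(x_lo, x_hi, y_lo, y_hi):
    """Subtract Y from X, both held as (low word, high word) pairs."""
    lo = x_lo - y_lo                 # SUBI: low words, may need a borrow
    borrow = 1 if lo < 0 else 0      # the borrow left in the carry bit
    hi = x_hi - y_hi - borrow        # SUBB: high words minus borrow
    return lo & MASK32, hi & MASK32
```

For example, adding 1 to 00000000 FFFFFFFFh carries into the high word and gives 00000001 00000000h.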

When two 32-bit numbers are multiplied, a 64-bit product results. The
procedure for multiplication is to split the 32-bit magnitude values of the
multiplicand X and the multiplier Y into two parts (X1,X0) and (Y1,Y0),
respectively, with 16 bits each. The operation is done on unsigned numbers, and
the product is adjusted for the sign bit. Example 11–21 shows the
implementation of a 32-bit by 32-bit multiplication.


Example 11–21. 32-Bit-by-32-Bit Multiplication


*
* TITLE 32 BIT X 32 BIT MULTIPLICATION
*
*
* SUBROUTINE EXTMPY
*
* FUNCTION: TWO 32-BIT NUMBERS ARE MULTIPLIED, PRODUCING A 64-BIT
* RESULT. THE TWO NUMBERS (X and Y) ARE EACH SEPARATED INTO TWO
* PARTS (X1 X0) AND (Y1 Y0), WHERE X0, X1, Y0, AND Y1 ARE 16 BITS.
* THE TOP BIT IN X1 AND Y1 IS THE SIGN BIT. THE PRODUCT IS
* IN TWO WORDS (W0 AND W1). THE MULTIPLICATION IS PERFORMED ON
* POSITIVE NUMBERS, AND THE SIGN IS DETERMINED AT THE END.
*
*
* X1 X0 BITS OF PRODUCTS
* X Y1 Y0 (NOT COUNTING SIGN) PRODUCT
* –––––––––––
* X0*Y0 16+16 P1
* X0*Y1 16+16 P2
* X1*Y0 16+16 P3
* X1*Y1 16+16 P4
* ––––––––––––––
* W1 W0
*
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R0 | MULTIPLIER AND LOW WORD OF THE PRODUCT
* R1 | MULTIPLICAND AND UPPER WORD OF THE PRODUCT
*
*
* REGISTERS USED AS INPUT: R0, R1
* REGISTERS MODIFIED: R0, R1, R2, R3, R4, AR0, AR1
* REGISTER CONTAINING RESULT: R0,R1
*
*

* CYCLES: 28 (WORST CASE) WORDS: 25
*
.global EXTMPY
*
EXTMPY XOR3 R0,R1,AR0 ; Store sign
ABSI R0 ; Absolute values of X
ABSI R1 ; and Y
*
* SEPARATE MULTIPLIER AND MULTIPLICAND INTO TWO PARTS
*
LDI -16,AR1
LSH3 AR1,R0,R2 ; R2 = X1 = upper 16 bits of X
AND 0FFFFH,R0 ; R0 = X0 = lower 16 bits of X
LSH3 AR1,R1,R3 ; R3 = Y1 = upper 16 bits of Y
AND 0FFFFH,R1 ; R1 = Y0 = lower 16 bits of Y
*
* CARRY OUT THE MULTIPLICATION
*
MPYI3 R0,R1,R4 ; X0*Y0 = P1
MPYI R3,R0 ; X0*Y1 = P2
MPYI R2,R1 ; X1*Y0 = P3
ADDI R0,R1 ; P2+P3
MPYI R2,R3 ; X1*Y1 = P4
*
LDI R1,R2
LSH 16,R2 ; Lower 16 bits of P2+P3
CMPI 0,AR0 ; Check the sign of the product
BGED DONE ; If >= 0, multiplication complete
* ; (delayed)
LSH -16,R1 ; Upper 16 bits of P2+P3
ADDI3 R4,R2,R0 ; W0 = R0 = lower word of the product
ADDC3 R1,R3,R1 ; W1 = R1 = upper word of the product
*
* NEGATE THE PRODUCT IF THE NUMBERS ARE OF OPPOSITE SIGNS
*
NOT R0
ADDI 1,R0
NOT R1
ADDC 0,R1
*
DONE RETS
.end
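The partial-product arithmetic can be checked with a behavioral model. The Python below is an illustration only; the function name and mask constants are assumptions, and Python's unbounded integers are masked to mimic 32-bit registers and the carry between the two result words.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def extmpy(x, y):
    """Signed 32 x 32 -> 64-bit multiply; returns (low word, high word)."""
    negative = (x < 0) != (y < 0)        # sign of the product
    x, y = abs(x), abs(y)                # multiply the magnitudes
    x1, x0 = x >> 16, x & MASK16         # upper and lower 16 bits of X
    y1, y0 = y >> 16, y & MASK16         # upper and lower 16 bits of Y
    p1 = x0 * y0                         # four 16 x 16 partial products
    p2 = x0 * y1
    p3 = x1 * y0
    p4 = x1 * y1
    middle = p2 + p3                     # weighted by 2**16
    low = p1 + ((middle << 16) & MASK32) # bit 32 of this sum is the carry
    high = p4 + (middle >> 16) + (low >> 32)
    low &= MASK32
    high &= MASK32
    if negative:                         # two's-complement negate, 64 bits
        low = (~low & MASK32) + 1
        high = ((~high & MASK32) + (low >> 32)) & MASK32
        low &= MASK32
    return low, high
```

The two returned words correspond to W0 and W1. Joining them as (high << 32) | low reproduces the 64-bit two's-complement product.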


11.3.7 IEEE/TMS320C3x Floating-Point Format Conversion


The fast version of the IEEE-to-TMS320C3x conversion routine was originally
developed by Keith Henry of Apollo Computer, Inc. The other routines were
based on this initial input.

In fixed-point arithmetic, the binary point that separates the integer from the
fractional part of the number is fixed at a certain location. For example, if a
32-bit number has the binary point after the most significant bit (which is also
the sign bit), only fractional numbers (numbers with absolute values less than
1) can be represented. Such a number is called a Q31 number, that is, a number
with 31 fractional bits. All operations assume that the binary point is fixed
at this location. The fixed-point system, although simple to implement in
hardware, imposes limitations in the dynamic range of the represented
number, which causes scaling problems in many applications. You can avoid
this difficulty by using floating-point numbers.

A floating-point number consists of a mantissa m multiplied by base b raised
to an exponent e:

m * b**e

In current hardware implementations, the mantissa is typically a normalized
number with an absolute value between 1 and 2, and the base is b = 2.
Although the mantissa is represented as a fixed-point number, the actual value
of the overall number floats the binary point because of the multiplication by
b**e. The exponent e is an integer whose value determines the position of the
binary point in the number. IEEE has established a standard format for the
representation of floating-point numbers.

To achieve higher efficiency in hardware implementation, the TMS320C3x
uses a floating-point format that differs from the IEEE standard. This section
briefly describes the two formats and presents software routines to convert
between them.

TMS320C3x floating-point format:

Field    e    s    f
Bits     8    1    23

In a 32-bit word representing a floating-point number, the first eight bits
correspond to the exponent expressed in two’s-complement format. There is one
bit for sign and 23 bits for the mantissa. The mantissa is expressed in
two’s-complement form, with the binary point after the most significant nonsign bit.
Since this bit is the complement of the sign bit s, it is suppressed. In other
words, the mantissa actually has 24 bits. A special case occurs when
e = –128. In this case, the number is interpreted as 0, independently of the
values of s and f (which are set to 0 by default). To summarize, the values of
the represented numbers in the TMS320C3x floating-point format are as
follows:
2**e * (01.f) if s = 0
2**e * (10.f) if s = 1
0 if e = –128

IEEE floating-point format:

Field    s    e    f
Bits     1    8    23

The IEEE floating-point format uses sign-magnitude notation for the mantissa,
and the exponent is biased by 127. In a 32-bit word representing a
floating-point number, the first bit is the sign bit. The next eight bits correspond
to the exponent, which is expressed in an offset-by-127 format (the actual ex-
ponent is e –127). The following 23 bits represent the absolute value of the
mantissa with the most significant 1 implied. The binary point is after this most
significant 1. In other words, the mantissa actually has 24 bits. There are sev-
eral special cases, summarized below.
These are the values of the represented numbers in the IEEE floating-point
format:
$(-1)^s \times 2^{e-127} \times (01.f)$   if 0 < e < 255
Special cases:
$(-1)^s \times 0.0$                       if e = 0 and f = 0 (zero)
$(-1)^s \times 2^{-126} \times (0.f)$     if e = 0 and f ≠ 0 (denormalized)
$(-1)^s \times \infty$                    if e = 255 and f = 0 (infinity)
NaN (not a number)                        if e = 255 and f ≠ 0
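
As a cross-check of the IEEE cases listed above, the following Python sketch evaluates the formulas field by field and compares the result with the host's own IEEE single-precision decoding. The names `ieee_fields_to_float` and `reference` are illustrative assumptions, not part of any TMS320C3x tool.

```python
import math
import struct


def ieee_fields_to_float(word):
    """Evaluate the IEEE single-precision formulas for a 32-bit word."""
    s = (word >> 31) & 1
    e = (word >> 23) & 0xFF
    f = word & 0x7FFFFF
    sign = -1.0 if s else 1.0
    if e == 255:                                   # infinity or NaN
        return sign * math.inf if f == 0 else math.nan
    if e == 0:                                     # zero or denormalized
        return sign * (f / float(1 << 23)) * 2.0 ** -126
    return sign * (1.0 + f / float(1 << 23)) * 2.0 ** (e - 127)


def reference(word):
    """Decode the same word with the host's IEEE 754 implementation."""
    return struct.unpack(">f", struct.pack(">I", word))[0]


for w in (0x3F800000, 0xC0200000, 0x00000001, 0x00400000, 0x7F7FFFFF):
    assert ieee_fields_to_float(w) == reference(w)
```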
Based on these definitions of the formats, two versions of the conversion rou-
tines were developed. One version handles the complete definition of the for-
mats. The other ignores some of the special cases (typically the ones that are
rarely used), but it has the benefit of executing faster than the complete con-
version. For this discussion, the two versions are referred to as the complete
version and the fast version, respectively.


11.3.7.1 IEEE-to-TMS320C3x Floating-Point Format Conversion

Example 11–22 shows the fast conversion from IEEE to TMS320C3x floating-
point format. It properly handles the general case when 0 < e < 255, and also
handles 0s (that is, e = 0 and f = 0). The other special cases (denormalized,
infinity, and NaN) are not treated and, if present, will give erroneous results.

Example 11–22. IEEE-to-TMS320C3x Conversion (Fast Version)


* TITLE IEEE TO TMS320C3x CONVERSION (FAST VERSION)
*
*
* SUBROUTINE FMIEEE
*
* FUNCTION: CONVERSION BETWEEN THE IEEE FORMAT AND THE
* TMS320C3x FLOATING-POINT FORMAT. THE NUMBER TO
* BE CONVERTED IS IN THE LOWER 32 BITS OF R0.
* THE RESULT IS STORED IN THE UPPER 32 BITS OF R0.
* UPON ENTERING THE ROUTINE, AR1 POINTS TO THE
* FOLLOWING TABLE:
*
* (0) 0xFF800000 <-- AR1
* (1) 0xFF000000
* (2) 0x7F000000
* (3) 0x80000000
* (4) 0x81000000
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* –––––––––––+–––––––––––––––––––––––––––––––––––––
* R0 | NUMBER TO BE CONVERTED
* AR1 | POINTER TO TABLE WITH CONSTANTS
*
* REGISTERS USED AS INPUT: R0, AR1
* REGISTERS MODIFIED: R0, R1
* REGISTER CONTAINING RESULT: R0
*
* NOTE: SINCE THE STACK POINTER SP IS USED, MAKE SURE TO
* INITIALIZE IT IN THE CALLING PROGRAM.
*
*
* CYCLES: 12 (WORST CASE) WORDS: 12
*
.global FMIEEE
*


FMIEEE  AND3    R0,*AR1,R1      ; Replace fraction with 0
        BND     NEG             ; Test sign
        ADDI    R0,R1           ; Shift sign and exponent, inserting 0
        LDIZ    *+AR1(1),R1     ; If all 0, generate C30 0
        SUBI    *+AR1(2),R1     ; Unbias exponent
        PUSH    R1
        POPF    R0              ; Load this as a flt. pt. number
        RETS
*
NEG     PUSH    R1
        POPF    R0              ; Load this as a flt. pt. number
        NEGF    R0,R0           ; Negate if orig. sign is negative
        RETS


Example 11–23 shows the complete conversion between the IEEE and
TMS320C3x formats. In addition to the general case and the 0s, it handles the
special cases as follows:

- If NaN (e = 255, f ≠ 0), the number is returned intact.

- If infinity (e = 255, f = 0), the output is saturated to the most positive or
  negative number, respectively.

- If denormalized (e = 0, f ≠ 0), two cases are considered. If the MSB of
  f is 1, the number is converted to TMS320C3x format. Otherwise, an
  underflow occurs, and the number is set to 0.
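
The case analysis above can be modelled on a host machine. The following Python sketch is not a transcription of the assembly routine; it implements the same mapping at the bit-field level. The names `ieee_to_c3x`, `pack_c3x` and `c3x_to_float` are illustrative assumptions, and returning 0 when negating the smallest converted denormal would need e = –128 is an assumed underflow behavior.

```python
C3X_ZERO = 0x80000000       # e = -128 encodes 0
C3X_MAX_POS = 0x7F7FFFFF    # most positive TMS320C3x number
C3X_MAX_NEG = 0x7F800000    # most negative TMS320C3x number


def pack_c3x(e, s, f):
    """Assemble exponent, sign and fraction into a TMS320C3x word."""
    return ((e & 0xFF) << 24) | (s << 23) | f


def ieee_to_c3x(word):
    s, e, f = (word >> 31) & 1, (word >> 23) & 0xFF, word & 0x7FFFFF
    if e == 255:
        if f != 0:
            return word                            # NaN: returned intact
        return C3X_MAX_NEG if s else C3X_MAX_POS   # infinity saturates
    if e == 0:
        if not f & 0x400000:
            return C3X_ZERO                        # zero, or denormal underflow
        # 2^-126 * 0.1xxx... equals 2^-127 * 1.xxx...
        e_c, f_c = -127, (f & 0x3FFFFF) << 1
    else:
        e_c, f_c = e - 127, f                      # general case
    if not s:
        return pack_c3x(e_c, 0, f_c)
    if f_c == 0:                                   # -(1.0) * 2^e = (10.0) * 2^(e-1)
        e_c -= 1
        return C3X_ZERO if e_c == -128 else pack_c3x(e_c, 1, 0)  # assumed underflow
    return pack_c3x(e_c, 1, (1 << 23) - f_c)       # -(1.f) = 10.(1 - .f)


def c3x_to_float(word):
    """Decode a TMS320C3x word, used here only to check the conversion."""
    e = (word >> 24) & 0xFF
    e = e - 0x100 if e >= 0x80 else e
    if e == -128:
        return 0.0
    m = (-2.0 if (word >> 23) & 1 else 1.0) + (word & 0x7FFFFF) / float(1 << 23)
    return m * 2.0 ** e


assert c3x_to_float(ieee_to_c3x(0xC0200000)) == -2.5   # IEEE -2.5 round trip
```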

Example 11–23. IEEE-to-TMS320C3x Conversion (Complete Version)


* TITLE IEEE TO TMS320C3x CONVERSION (COMPLETE VERSION)
*
*
* SUBROUTINE FMIEEE1
*
* FUNCTION: CONVERSION BETWEEN THE IEEE FORMAT AND THE TMS320C3x
* FLOATING-POINT FORMAT. THE NUMBER TO BE CONVERTED
* IS IN THE LOWER 32 BITS OF R0. THE RESULT IS STORED
* IN THE UPPER 32 BITS OF R0.
*
*
* UPON ENTERING THE ROUTINE, AR1 POINTS TO THE FOLLOWING TABLE:
*
* (0) 0xFF800000 <-- AR1
* (1) 0xFF000000
* (2) 0x7F000000
* (3) 0x80000000
* (4) 0x81000000
* (5) 0x7F800000
* (6) 0x00400000
* (7) 0x007FFFFF
* (8) 0x7F7FFFFF
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* –––––––––––+–––––––––––––––––––––––––––––––––––––
* R0 | NUMBER TO BE CONVERTED
* AR1 | POINTER TO TABLE WITH CONSTANTS
*
* REGISTERS USED AS INPUT: R0, AR1
* REGISTERS MODIFIED: R0, R1
* REGISTER CONTAINING RESULT: R0
*


* NOTE: SINCE THE STACK POINTER SP IS USED, MAKE SURE TO
* INITIALIZE IT IN THE CALLING PROGRAM.
*
*
* CYCLES: 23 (WORST CASE) WORDS: 34
*
.global FMIEEE1
*
FMIEEE1 LDI     R0,R1
        AND     *+AR1(5),R1
        BZ      UNNORM          ; If e = 0, number is either 0 or
*                               ; denormalized
        XOR     *+AR1(5),R1
        BNZ     NORMAL          ; If e < 255, use regular routine

* HANDLE NaN AND INFINITY

        TSTB    *+AR1(7),R0
        RETSNZ                  ; Return if NaN
        LDI     R0,R0
        LDFGT   *+AR1(8),R0     ; If positive, infinity =
                                ; most positive number
        LDFN    *+AR1(5),R0     ; If negative, infinity =
                                ; most negative number
        RETS

* HANDLE 0s AND UNNORMALIZED NUMBERS

UNNORM  TSTB    *+AR1(6),R0     ; Is the MSB of f equal to 1?
        LDFZ    *+AR1(3),R0     ; If not, force the number to 0
        RETSZ                   ; and return
        XOR     *+AR1(6),R0     ; If MSB of f = 1, make it 0
        BND     NEG1
        LSH     1,R0            ; Eliminate sign bit
                                ; & line up mantissa
        SUBI    *+AR1(2),R0     ; Make e = -127
        PUSH    R0
        POPF    R0              ; Put number in floating point format
        RETS
NEG1    POPF    R0
        NEGF    R0,R0           ; If negative, negate R0
        RETS


* HANDLE THE REGULAR CASES
*
NORMAL  AND3    R0,*AR1,R1      ; Replace fraction with 0
        BND     NEG             ; Test sign
        ADDI    R0,R1           ; Shift sign and exponent, inserting 0
        SUBI    *+AR1(2),R1     ; Unbias exponent
        PUSH    R1
        POPF    R0              ; Load this as a flt. pt. number
        RETS

NEG     POPF    R0              ; Load this as a flt. pt. number
        NEGF    R0,R0           ; Negate if original sign negative
        RETS


11.3.7.2 TMS320C3x-to-IEEE Floating-Point Format Conversion

The vast majority of the numbers represented by the TMS320C3x
floating-point format are covered by the general IEEE format and the
representation of 0s. The only special case is e = –127 in the TMS320C3x format;
this corresponds to a denormalized number in IEEE format. It is ignored in the
fast version, while it is treated properly in the complete version.

Example 11–24 shows the fast version, and Example 11–25 shows the
complete version of the TMS320C3x-to-IEEE conversion.
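
The reverse mapping can be sketched on a host machine in the same way. The Python function below is an illustration of the format relationship, not a line-by-line model of the assembly routines. The name `c3x_to_ieee` is hypothetical, and saturating the one value that has no IEEE counterpart (the most negative number, –2^128) to the largest finite IEEE magnitude is an assumption.

```python
def c3x_to_ieee(word):
    """Map a 32-bit TMS320C3x word to an IEEE single-precision word (sketch)."""
    e = (word >> 24) & 0xFF
    e = e - 0x100 if e >= 0x80 else e          # two's-complement exponent
    s = (word >> 23) & 1
    f = word & 0x7FFFFF
    if e == -128:
        return 0x00000000                      # TMS320C3x zero
    if s:                                      # take the magnitude of 10.f
        if f == 0:
            e += 1                             # |10.0| * 2^e = 1.0 * 2^(e+1)
        else:
            f = (1 << 23) - f                  # |10.f| = 1.(1 - .f)
    if e > 127:
        return (s << 31) | 0x7F7FFFFF          # assumption: saturate
    if e == -127:
        # 2^-127 * 1.f = 2^-126 * 0.1f (denormalized); the LSB of f is dropped
        return (s << 31) | 0x400000 | (f >> 1)
    return (s << 31) | ((e + 127) << 23) | f   # general case: add bias of 127


print(hex(c3x_to_ieee(0x00000000)))            # TMS320C3x 1.0
print(hex(c3x_to_ieee(0x81000000)))            # e = -127 becomes a denormal
```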

Example 11–24. TMS320C3x-to-IEEE Conversion (Fast Version)

*
* TITLE TMS320C3x TO IEEE CONVERSION (FAST VERSION)
*
*
* SUBROUTINE TOIEEE
*
* FUNCTION: CONVERSION BETWEEN THE TMS320C3x FORMAT AND THE IEEE
* FLOATING-POINT FORMAT. THE NUMBER TO BE CONVERTED
* IS IN THE UPPER 32 BITS OF R0. THE RESULT WILL BE IN
* THE LOWER 32 BITS OF R0.
*
* UPON ENTERING THE ROUTINE, AR1 POINTS TO THE FOLLOWING TABLE:
*
* (0) 0xFF800000 <-- AR1
* (1) 0xFF000000
* (2) 0x7F000000
* (3) 0x80000000
* (4) 0x81000000
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ----------+-------------------------------------
* R0 | NUMBER TO BE CONVERTED
* AR1 | POINTER TO TABLE WITH CONSTANTS
*
* REGISTERS USED AS INPUT: R0, AR1
* REGISTERS MODIFIED: R0
* REGISTER CONTAINING RESULT: R0
*
* NOTE: SINCE THE STACK POINTER 'SP' IS USED, MAKE SURE TO
* INITIALIZE IT IN THE CALLING PROGRAM.
*
*


* CYCLES: 14 (WORST CASE) WORDS: 15
*
.global TOIEEE
*
TOIEEE  LDF     R0,R0           ; Determine the sign of the number
        LDFZ    *+AR1(4),R0     ; If 0, load appropriate number
        BND     NEG             ; Branch to NEG if negative (delayed)
        ABSF    R0              ; Take the absolute value of the number
        LSH     1,R0            ; Eliminate the sign bit in R0
        PUSHF   R0
        POP     R0              ; Place number in lower 32 bits of R0
        ADDI    *+AR1(2),R0     ; Add exponent bias (127)
        LSH     -1,R0           ; Add the positive sign
        RETS

NEG     POP     R0              ; Place number in lower 32 bits of R0
        ADDI    *+AR1(2),R0     ; Add exponent bias (127)
        LSH     -1,R0           ; Make space for the sign
        ADDI    *+AR1(3),R0     ; Add the negative sign
        RETS

11-50
Logical and Arithmetic Operations

Example 11–25. TMS320C3x-to-IEEE Conversion (Complete Version)


*
* TITLE TMS320C3x TO IEEE CONVERSION (COMPLETE VERSION)
*
*
* SUBROUTINE TOIEEE1
*
*
* FUNCTION: CONVERSION BETWEEN THE TMS320C3x FORMAT AND THE IEEE
* FLOATING-POINT FORMAT. THE NUMBER TO BE CONVERTED
* IS IN THE UPPER 32 BITS OF R0. THE RESULT WILL BE
* IN THE LOWER 32 BITS OF R0.
*
*
* UPON ENTERING THE ROUTINE, AR1 POINTS TO THE FOLLOWING TABLE:
*
* (0) 0xFF800000 <-- AR1
* (1) 0xFF000000
* (2) 0x7F000000
* (3) 0x80000000
* (4) 0x81000000
* (5) 0x7F800000
* (6) 0x00400000
* (7) 0x007FFFFF
* (8) 0x7F7FFFFF
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R0 | NUMBER TO BE CONVERTED
* AR1 | POINTER TO TABLE WITH CONSTANTS
*
* REGISTERS USED AS INPUT: R0, AR1
* REGISTERS MODIFIED: R0
* REGISTER CONTAINING RESULT: R0
*
* NOTE: SINCE THE STACK POINTER ’SP’ IS USED, MAKE SURE TO
* INITIALIZE IT IN THE CALLING PROGRAM.
*
*
* CYCLES: 31 (WORST CASE) WORDS: 25
*
.global TOIEEE1


*
TOIEEE1 LDF     R0,R0           ; Determine the sign of the number
        LDFZ    *+AR1(4),R0     ; If 0, load appropriate number
        BND     NEG             ; Branch to NEG if negative (delayed)
        ABSF    R0              ; Take the absolute value
                                ; of the number
        LSH     1,R0            ; Eliminate the sign bit in R0
        PUSHF   R0
        POP     R0              ; Place number in lower 32 bits of R0
        ADDI    *+AR1(2),R0     ; Add exponent bias (127)
        LSH     -1,R0           ; Add the positive sign
CONT    TSTB    *+AR1(5),R0
        RETSNZ                  ; If e > 0, return
        TSTB    *+AR1(7),R0
        RETSZ                   ; If e = 0 & f = 0, return
        PUSH    R0
        POPF    R0
        LSH     -1,R0           ; Shift f right by one bit
        PUSHF   R0
        POP     R0
        ADDI    *+AR1(6),R0     ; Add 1 to the MSB of f
        RETS
NEG     POP     R0              ; Place number in lower 32 bits of R0
        BRD     CONT
        ADDI    *+AR1(2),R0     ; Add exponent bias (127)
        LSH     -1,R0           ; Make space for the sign
        ADDI    *+AR1(3),R0     ; Add the negative sign
        RETS


11.4 Application-Oriented Operations


Certain features of the TMS320C3x architecture and instruction set facilitate
the solution of numerically intensive problems. This section presents exam-
ples of applications using these features, such as companding, filtering, FFTs,
and matrix arithmetic.

11.4.1 Companding
In telecommunications, conserving channel bandwidth while preserving
speech quality is a primary concern. This is achieved by quantizing the
speech samples logarithmically. An 8-bit logarithmic quantizer produces
speech quality equivalent to that of a 13-bit uniform quantizer. The logarithmic
quantization is achieved by companding (COMpress/exPANDing). Two international
standards have been established for companding: the µ-law standard (used
in the United States and Japan), and the A-law standard (used in Europe).
Detailed descriptions of µ-law and A-law companding are presented in an
application report on companding routines included in the book Digital Signal
Processing Applications with the TMS320 Family (literature number SPRA012A).

During transmission, logarithmically compressed data in sign-magnitude form
is transmitted along the communications channel. If any processing is
necessary, you should expand this data to a 14-bit (for µ-law) or 13-bit (for A-law)
linear format. This operation is performed when the data is received at the
digital signal processor. After processing, the result is compressed back to the
8-bit format and sent on through the channel.

Example 11–26 and Example 11–27 show µ-law compression and expansion
(that is, linear to µ-law and µ-law to linear conversion), while Example 11–28
and Example 11–29 show A-law compression and expansion. For expansion,
using a look-up table is an alternative approach. A look-up table trades
memory space for speed of execution. Since the compressed data is eight bits
long, you can construct a table with 256 entries containing the expanded data.
If the compressed data is stored in the register AR0, the following two instruc-
tions will put the expanded data in register R0:
ADDI @TABL,AR0 ; @TABL = BASE ADDRESS OF TABLE
LDI *AR0,R0 ; PUT EXPANDED NUMBER IN R0

You could use the same look-up table approach for compression, but the re-
quired table length would then be 16,384 words for µ-law or 8,192 words for
A-law. If this memory size is not acceptable, use the subroutines presented in
Example 11–26 or Example 11–28.
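
To illustrate the table trade-off, the following Python sketch builds the 256-entry µ-law expansion table on a host machine. It follows the arithmetic used by the µ-law subroutines in this section (bias of 33, saturation at 0x1FDE, all bits complemented for transmission). The names `mu_expand`, `mu_compress` and `EXPAND_TABLE` are assumptions made for illustration.

```python
def mu_expand(byte):
    """Expand one 8-bit mu-law code to a linear sample."""
    c = ~byte & 0xFF                    # undo the bit inversion
    q = c & 0x0F                        # quantization bin
    seg = (c >> 4) & 0x07               # segment code
    mag = (((q << 1) + 33) << seg) - 33
    return -mag if c & 0x80 else mag


def mu_compress(sample):
    """Compress one linear sample to an 8-bit mu-law code."""
    mag = min(abs(sample), 0x1FDE) + 33     # saturate, then add bias
    seg = mag.bit_length() - 6              # position of the leading 1
    q = (mag >> (seg + 1)) & 0x0F           # four bits after the leading 1
    c = (seg << 4) | q | (0x80 if sample < 0 else 0)
    return ~c & 0xFF                        # invert all bits for transmission


# 256-entry table: one linear value per compressed code
EXPAND_TABLE = [mu_expand(code) for code in range(256)]

print(EXPAND_TABLE[0xFF], EXPAND_TABLE[0x80], EXPAND_TABLE[0x00])
```

A table for compression would need one entry per possible linear input, which is why the subroutine form is usually preferred in that direction.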


Example 11–26. µ-Law Compression


*
* TITLE U-LAW COMPRESSION
*
*
* SUBROUTINE MUCMPR
*
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R0 | NUMBER TO BE CONVERTED
*
* REGISTERS USED AS INPUT: R0
* REGISTERS MODIFIED: R0, R1, R2, SP
* REGISTER CONTAINING RESULT: R0
*
* NOTE: SINCE THE STACK POINTER ’SP’ IS USED IN THE COMPRESSION
* ROUTINE ‘MUCMPR’, MAKE SURE TO INITIALIZE IT IN THE
* CALLING PROGRAM.
*
*
* CYCLES: 20 WORDS: 17
*
*
.global MUCMPR
*
MUCMPR  LDI     R0,R1           ; Save sign of number
        ABSI    R0,R0
        CMPI    1FDEH,R0        ; If R0 > 0x1FDE,
        LDIGT   1FDEH,R0        ; saturate the result
        ADDI    33,R0           ; Add bias
        FLOAT   R0              ; Normalize: (seg+5)0WXYZx...x
        MPYF    0.03125,R0      ; Adjust segment number by 2**(-5)
        LSH     1,R0            ; (seg)WXYZx...x
        PUSHF   R0
        POP     R0              ; Treat number as integer
        LSH     -20,R0          ; Right-justify
        LDI     0,R2
        LDI     R1,R1           ; If number is negative,
        LDILT   80H,R2          ; set sign bit
        ADDI    R2,R0           ; R0 = compressed number
        NOT     R0              ; Reverse all bits for transmission
        RETS


Example 11–27. µ-Law Expansion


*
* TITLE U-LAW EXPANSION
*
*
* SUBROUTINE MUXPND
*
*
* ARGUMENT ASSIGNMENTS:
*
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R0 | NUMBER TO BE CONVERTED
*
* REGISTERS USED AS INPUT: R0
* REGISTERS MODIFIED: R0, R1, R2, SP
* REGISTER CONTAINING RESULT: R0
*
*
* CYCLES: 20 (WORST CASE) WORDS: 14
*
*
.global MUXPND
*
MUXPND  NOT     R0,R0           ; Complement bits
        LDI     R0,R1
        AND     0FH,R1          ; Isolate quantization bin
        LSH     1,R1
        ADDI    33,R1           ; Add bias to introduce 1xxxx1
        LDI     R0,R2           ; Store for sign bit
        LSH     -4,R0
        AND     7,R0            ; Isolate segment code
        LSH3    R0,R1,R0        ; Shift and put result in R0
        SUBI    33,R0           ; Subtract bias
        TSTB    80H,R2          ; Test sign bit
        RETSZ
        NEGI    R0              ; Negate if a negative number
        RETS


Example 11–28. A-Law Compression


*
* TITLE A-LAW COMPRESSION
*
*
* SUBROUTINE ACMPR
*
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R0 | NUMBER TO BE CONVERTED
*
* REGISTERS USED AS INPUT: R0
* REGISTERS MODIFIED: R0, R1, R2, SP
* REGISTER CONTAINING RESULT: R0
*
* NOTE: SINCE THE STACK POINTER ‘SP’ IS USED IN THE COMPRESSION
* ROUTINE ‘ACMPR’, MAKE SURE TO INITIALIZE IT IN THE
* CALLING PROGRAM.
*
*
* CYCLES: 22 WORDS: 19
*
.global ACMPR
*
ACMPR   LDI     R0,R1           ; Save sign of number
        ABSI    R0,R0
        CMPI    1FH,R0          ; If R0 < 0x20,
        BLED    END             ; do linear coding
        CMPI    0FFFH,R0        ; If R0 > 0xFFF,
        LDIGT   0FFFH,R0        ; saturate the result
        LSH     -1,R0           ; Eliminate rightmost bit
        FLOAT   R0              ; Normalize: (seg+3)0WXYZx...x
        MPYF    0.125,R0        ; Adjust segment number by 2**(-3)
        LSH     1,R0            ; (seg)WXYZx...x
        PUSHF   R0
        POP     R0              ; Treat number as integer
        LSH     -20,R0          ; Right-justify
END     LDI     0,R2
        LDI     R1,R1           ; If number is negative,
        LDILT   80H,R2          ; set sign bit
        ADDI    R2,R0           ; R0 = compressed number
        XOR     0D5H,R0         ; Invert even bits
                                ; for transmission
        RETS
*


Example 11–29. A-Law Expansion


*
* TITLE A-LAW EXPANSION
*
*
*
* SUBROUTINE AXPND
*
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R0 | NUMBER TO BE CONVERTED
*
* REGISTERS USED AS INPUT: R0
* REGISTERS MODIFIED: R0, R1, R2, SP
* REGISTER CONTAINING RESULT: R0
*
*
* CYCLES: 25 (WORST CASE) WORDS: 16
*
*
.global AXPND
*
AXPND   XOR     0D5H,R0         ; Invert even bits
        LDI     R0,R1
        AND     0FH,R1          ; Isolate quantization bin
        LSH     1,R1
        LDI     R0,R2           ; Store for sign bit
        LSH     -4,R0
        AND     7,R0            ; Isolate segment code
        BZ      SKIP1
        SUBI    1,R0
        ADDI    32,R1           ; Create 1xxxx1
SKIP1   ADDI    1,R1            ; or 0xxxx1
        LSH3    R0,R1,R0        ; Shift and put result in R0
        TSTB    80H,R2          ; Test sign bit
        RETSZ
        NEGI    R0              ; Negate if a negative number
        RETS


11.4.2 FIR, IIR, and Adaptive Filters

Digital filters are a common requirement for digital signal processing systems.
There are two types of digital filters: finite impulse response (FIR) and infinite
impulse response (IIR). Each of these types can have either fixed or adaptable
coefficients. This section presents the fixed-coefficient filters first, followed by
the adaptive filters.

11.4.2.1 FIR Filters

If the FIR filter has an impulse response h[0], h[1], ..., h[N – 1], and x[n]
represents the input of the filter at time n, the output y[n] at time n is given
by this equation:

y[n] = h[0] x[n] + h[1] x[n – 1] + ... + h[N – 1] x[n – (N – 1)]

Two features of the TMS320C3x that facilitate the implementation of the FIR
filters are parallel multiply/add operations and circular addressing. The former
permits the performance of a multiplication and an addition in a single machine
cycle, while the latter makes a finite buffer of length N sufficient for the data x.
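
The role of the circular buffer can be seen in a short host-side model. The Python sketch below is an illustration only: `fir_step` is a hypothetical name, and the modulo operation stands in for the circular addressing that the device performs in hardware. Each call overwrites the oldest sample with the new input and then forms the sum of products over the N stored samples.

```python
def fir_step(h, buf, newest, x_new):
    """Compute one FIR output sample using a circular buffer of len(h) samples.

    h      -- impulse response h[0..N-1]
    buf    -- circular buffer holding the last N inputs (modified in place)
    newest -- index in buf of the most recent sample
    x_new  -- new input sample x[n]
    Returns (y[n], index of x[n] in buf).
    """
    n = len(h)
    pos = (newest + 1) % n              # slot that holds the oldest sample
    buf[pos] = x_new                    # overwrite it with x[n]
    y = 0.0
    for k in range(n):                  # h[k] pairs with x[n-k]
        y += h[k] * buf[(pos - k) % n]
    return y, pos


h = [1.0, 2.0, 3.0]
buf, newest = [0.0] * len(h), len(h) - 1
for x in [1.0, 0.0, 0.0]:
    y, newest = fir_step(h, buf, newest, x)
    print(y)                            # impulse response: 1.0, 2.0, 3.0
```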

Figure 11–1 shows the arrangement of the memory locations necessary to im-
plement circular addressing, while Example 11–30 presents the TMS320C3x
assembly code for an FIR filter.

Figure 11–1. Data Memory Organization for an FIR Filter

                 Impulse      Initial                          Final
                 Response     Input Samples                    Input Samples
Low Address      h(N - 1)     x[n - (N - 1)] (oldest input)    x(n)
                 h(N - 2)     x[n - (N - 2)]                   x[n - (N - 1)]
                   •            •                                •
                   •            •                                •
                   •            •                                •
                 h(1)         x(n - 1)                         x(n - 2)
High Address     h(0)         x(n) (newest input)              x(n - 1)

The input samples form a circular queue.

To set up circular addressing, initialize the block-size register BK to block
length N. Also, the locations for signal x should start from a memory location
whose address is a multiple of the smallest power of 2 that is greater than N.
For instance, if N = 24, the first address for x should be a multiple of 32 (the
lowest five bits of the beginning address should be 0). See Section 5.3 on page
5-24 for more information.
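
The alignment rule can be checked with a small helper when laying out buffers. The Python sketch below is illustrative; the function names are assumptions, and the rule encoded is exactly the one stated above (the smallest power of 2 strictly greater than N).

```python
def circular_buffer_alignment(block_size):
    """Return the smallest power of 2 that is strictly greater than block_size."""
    align = 1
    while align <= block_size:
        align <<= 1
    return align


def is_valid_start(address, block_size):
    """True if address is a legal start for a circular buffer of block_size words."""
    return address % circular_buffer_alignment(block_size) == 0


print(circular_buffer_alignment(24))    # 32, as in the N = 24 example
print(is_valid_start(0x100, 24))        # True: the lowest five bits are 0
```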


In Example 11–30, the pointer to the input sequence x is incremented and is
assumed to be moving from an older input to a newer input. At the end of the
subroutine, AR1 will be pointing to the position for the next input sample.

Example 11–30. FIR Filter


*
* TITLE FIR FILTER
*
*
* SUBROUTINE FIR
*
* EQUATION: y(n) = h(0) * x(n) + h(1) * x(n-1) +
* ... + h(N-1) * x(n-(N-1))
*
* TYPICAL CALLING SEQUENCE:
*
* LOAD AR0
* LOAD AR1
* LOAD RC
* LOAD BK
* CALL FIR
*
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* AR0 | ADDRESS OF h(N-1)
* AR1 | ADDRESS OF x(n-(N-1))
* RC | LENGTH OF FILTER - 2 (N-2)
* BK | LENGTH OF FILTER (N)
*
* REGISTERS USED AS INPUT: AR0, AR1, RC, BK
* REGISTERS MODIFIED: R0, R2, AR0, AR1, RC
* REGISTER CONTAINING RESULT: R0
*
*
* CYCLES: 11 + (N-1) WORDS: 6
*
*
.global FIR
*                               ; Initialize R0:
FIR     MPYF3   *AR0++(1),*AR1++(1)%,R0
*                               ; h(N-1) * x(n-(N-1)) -> R0
        LDF     0.0,R2          ; Initialize R2
*
* FILTER (1 <= i < N)
*
        RPTS    RC              ; Set up the repeat cycle
        MPYF3   *AR0++(1),*AR1++(1)%,R0 ; h(N-1-i) * x(n-(N-1-i)) -> R0
||      ADDF3   R0,R2,R2        ; Multiply and add operation


*
ADDF R0,R2,R0 ; Add last product
*
* RETURN SEQUENCE
*
RETS ; Return
*
* end
*
.end

11.4.2.2 IIR Filters


The transfer function of an IIR filter has both poles and zeros. Its output depends
on both the input and the past output. As a rule, IIR filters need less
computation than an FIR filter with a similar frequency response, but they have
the drawback of being sensitive to coefficient quantization. Most often, IIR
filters are implemented as a cascade of second-order sections, called biquads.
Example 11–31 and Example 11–32 show the implementation for one biquad
and for any number of biquads, respectively.
This is the equation for a single biquad:
y [n] = a1 y [n – 1] + a2 y [n – 2] + b0 x [n ] + b1 x [n –1] + b2 x [n – 2]
However, the following two equations are more convenient and have smaller
storage requirements:
d [n] = a2 d [n – 2] + a1 d [n –1] + x [n]
y [n] = b2 d [n – 2] + b1 d [n – 1] + b0 d [n]
Figure 11–2 shows the memory organization for this two-equation approach,
and Example 11–31 is an implementation of a single biquad on the
TMS320C3x.
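
The two-equation form can be tried out on a host machine. The Python sketch below is a plain model of the equations, with the hypothetical name `biquad_step`; it keeps the two delay values in a short list and shifts them, rather than using the circular queue that the assembly routine relies on.

```python
def biquad_step(coef, d, x):
    """Process one sample through a single biquad.

    coef -- (a1, a2, b0, b1, b2)
    d    -- [d(n-1), d(n-2)], updated in place
    x    -- input sample x(n)
    """
    a1, a2, b0, b1, b2 = coef
    dn = a2 * d[1] + a1 * d[0] + x            # d(n)
    y = b2 * d[1] + b1 * d[0] + b0 * dn       # y(n)
    d[1], d[0] = d[0], dn                     # age the delay values
    return y


coef = (0.5, -0.25, 1.0, 0.5, 0.25)
state = [0.0, 0.0]
print([biquad_step(coef, state, x) for x in (1.0, 0.0, 0.0)])
```

Only two past values of d need to be stored, compared with two past inputs and two past outputs for the single-equation form.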

Figure 11–2. Data Memory Organization for a Single Biquad


               Filter          Newest Delay          Newest Delay
               Coefficients    Node Values           Node Values
Low Address    a2              d(n)   (newest)       d(n - 1)
               b2              d(n - 1)              d(n - 2)
               a1              d(n - 2) (oldest)     d(n)
               b1
High Address   b0

The three delay node values form a circular queue.

As in the case of FIR filters, the address for the start of the values d must be
a multiple of 4; that is, the last two bits of the beginning address must be 0. The
block-size register BK must be initialized to 3.


Example 11–31. IIR Filter (One Biquad)


*
* TITLE IIR FILTER
*
*
* SUBROUTINE IIR1
*
* IIR1 == IIR FILTER (ONE BIQUAD)
*
*
* EQUATIONS: d(n) = a2 * d(n-2) + a1 * d(n-1) + x(n)
* y(n) = b2 * d(n-2) + b1 * d(n-1) + b0 * d(n)
*
* OR y(n) = a1*y(n-1) + a2*y(n-2) + b0*x(n)
* + b1*x(n-1) + b2*x(n-2)
*
* TYPICAL CALLING SEQUENCE:
*
* load R2
* load AR0
* load AR1
* load BK
* CALL IIR1
*
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R2 | INPUT SAMPLE x(n)
* AR0 | ADDRESS OF FILTER COEFFICIENTS (a2)
* AR1 | ADDRESS OF DELAY NODE VALUES (d(n-2))
* BK | BK = 3
*
* REGISTERS USED AS INPUT: R2, AR0, AR1, BK
* REGISTERS MODIFIED: R0, R1, R2, AR0, AR1
* REGISTER CONTAINING RESULT: R0
*
* CYCLES: 11 WORDS: 8
*
*
* FILTER


*
.global IIR1
*
IIR1    MPYF3   *AR0,*AR1,R0
*                               ; a2 * d(n-2) -> R0
        MPYF3   *++AR0(1),*AR1--(1)%,R1
*                               ; b2 * d(n-2) -> R1
*
        MPYF3   *++AR0(1),*AR1,R0       ; a1 * d(n-1) -> R0
||      ADDF3   R0,R2,R2                ; a2*d(n-2)+x(n) -> R2
*
        MPYF3   *++AR0(1),*AR1--(1)%,R0 ; b1 * d(n-1) -> R0
||      ADDF3   R0,R2,R2                ; a1*d(n-1)+a2*d(n-2)+x(n) -> R2
*
        MPYF3   *++AR0(1),R2,R2         ; b0 * d(n) -> R2
||      STF     R2,*AR1++(1)%
*
*                               ; Store d(n) and point to d(n-1)
*
        ADDF    R0,R2                   ; b1*d(n-1)+b0*d(n) -> R2
        ADDF    R1,R2,R0                ; b2*d(n-2)+b1*d(n-1)
                                        ; +b0*d(n) -> R0
*
* RETURN SEQUENCE
*
RETS ; Return
*
* end
*
.end

In the more general case, the IIR filter contains N > 1 biquads. The equations
for its implementation are given by the following pseudo-C language code:

y [–1,n] = x [n]
for (i = 0; i < N; i++) {
    d [i,n] = a2 [i] d [i,n – 2] + a1 [i] d [i,n – 1] + y [i – 1,n]
    y [i,n] = b2 [i] d [i,n – 2] + b1 [i] d [i,n – 1] + b0 [i] d [i,n]
}
y [n] = y [N – 1,n]
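
The cascade can be modelled directly from the pseudo-code: the output of each biquad becomes the input of the next, and the first biquad takes the input sample. The Python sketch below is illustrative only; `cascade_step` and its argument layout are assumptions, not the register or memory conventions of the assembly routine.

```python
def cascade_step(sections, state, x):
    """Process one sample through a cascade of biquads.

    sections -- list of (a1, a2, b0, b1, b2), one tuple per biquad
    state    -- list of [d(i,n-1), d(i,n-2)], one list per biquad, updated in place
    x        -- input sample x(n)
    """
    y = x                                       # input of the first biquad
    for (a1, a2, b0, b1, b2), d in zip(sections, state):
        dn = a2 * d[1] + a1 * d[0] + y          # d(i,n)
        y = b2 * d[1] + b1 * d[0] + b0 * dn     # y(i,n), input of the next biquad
        d[1], d[0] = d[0], dn
    return y


# Two identical sections with response 1 + z^-1: the cascade gives 1 + 2z^-1 + z^-2
sections = [(0.0, 0.0, 1.0, 1.0, 0.0)] * 2
state = [[0.0, 0.0], [0.0, 0.0]]
print([cascade_step(sections, state, x) for x in (1.0, 0.0, 0.0, 0.0)])
```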

Figure 11–3 shows the corresponding memory organization, while
Example 11–32 shows the TMS320C3x assembly-language code.


Figure 11–3. Data Memory Organization for N Biquads


               Filter          Initial Delay              Final Delay
               Coefficients    Node Values                Node Values
Low Address    a2(0)           d(0, n)   (newest)         d(0, n - 1)
               b2(0)           d(0, n - 1)                d(0, n - 2)      Circular Queue
               a1(0)           d(0, n - 2) (oldest)       d(0, n)
               b1(0)           Empty                      Empty
               b0(0)             •                          •
                 •               •                          •
                 •               •                          •
                 •             d(N - 1, n)                d(N - 1, n - 1)
               a2(N - 1)       d(N - 1, n - 1)            d(N - 1, n - 2)  Circular Queue
               b2(N - 1)       d(N - 1, n - 2)            d(N - 1, n)
               a1(N - 1)       Empty                      Empty
               b1(N - 1)
High Address   b0(N - 1)

You should initialize the block register BK to 3; the beginning of each set of d
values (that is, d [i,n ], i = 0...N – 1) should be at an address that is a multiple
of 4 (where the last two bits are 0).


Example 11–32. IIR Filters (N > 1 Biquads)


*
* TITLE IIR FILTERS (N > 1 BIQUADS)
*
*
* SUBROUTINE IIR2
*
*
* EQUATIONS: y(-1,n) = x(n)
*
* FOR (i = 0; i < N; i++)
* {
* d(i,n) = a2(i) * d(i,n-2) + a1(i) * d(i,n-1) + y(i-1,n)
* y(i,n) = b2(i) * d(i,n-2) + b1(i) * d(i,n-1) + b0(i) * d(i,n)
* }
* y(n) = y(N-1,n)
*
* TYPICAL CALLING SEQUENCE:
*

* load R2
* load AR0
* load AR1
* load IR0
* load IR1
* load BK
* load RC
* CALL IIR2
*
*

* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ----------+-------------------------------------
* R2 | INPUT SAMPLE x(n)
* AR0 | ADDRESS OF FILTER COEFFICIENTS (a2(0))
* AR1 | ADDRESS OF DELAY NODE VALUES (d(0,n-2))
* BK | BK = 3
* IR0 | IR0 = 4
* IR1 | IR1 = 4*N-4
* RC | NUMBER OF BIQUADS (N) - 2
*
* REGISTERS USED AS INPUT: R2, AR0, AR1, IR0, IR1, BK, RC
* REGISTERS MODIFIED: R0, R1, R2, AR0, AR1, RC
* REGISTER CONTAINING RESULT: R0
*


* CYCLES: 17 + 6N WORDS: 17
*
*
*
*
.global IIR2
*
IIR2    MPYF3   *AR0,*AR1,R0
*                               ; a2(0) * d(0,n-2) -> R0
        MPYF3   *++AR0(1),*AR1--(1)%,R1
*                               ; b2(0) * d(0,n-2) -> R1
*
        MPYF3   *++AR0(1),*AR1,R0       ; a1(0) * d(0,n-1) -> R0
||      ADDF3   R0,R2,R2                ; First sum term of d(0,n)
*
        MPYF3   *++AR0(1),*AR1--(1)%,R0 ; b1(0) * d(0,n-1) -> R0
||      ADDF3   R0,R2,R2                ; Second sum term of d(0,n)
        MPYF3   *++AR0(1),R2,R2         ; b0(0) * d(0,n) -> R2
||      STF     R2,*AR1--(1)%
*
*                               ; Store d(0,n);
*                               ; point to d(0,n-2)
        RPTB    LOOP                    ; Loop for 1 <= i < N
*
        MPYF3   *++AR0(1),*++AR1(IR0),R0 ; a2(i) * d(i,n-2) -> R0
||      ADDF3   R0,R2,R2                ; First sum term of y(i-1,n)
*
        MPYF3   *++AR0(1),*AR1--(1)%,R1 ; b2(i) * d(i,n-2) -> R1
||      ADDF3   R1,R2,R2                ; Second sum term
                                        ; of y(i-1,n)
*
        MPYF3   *++AR0(1),*AR1,R0       ; a1(i) * d(i,n-1) -> R0
||      ADDF3   R0,R2,R2                ; First sum term of d(i,n)
*
        MPYF3   *++AR0(1),*AR1--(1)%,R0 ; b1(i) * d(i,n-1) -> R0
||      ADDF3   R0,R2,R2                ; Second sum term of d(i,n)
*
        STF     R2,*AR1--(1)%
*                               ; Store d(i,n);
*                               ; point to d(i,n-2)
LOOP    MPYF3   *++AR0(1),R2,R2
*                               ; b0(i) * d(i,n) -> R2
*


*
* FINAL SUMMATION
*
        ADDF    R0,R2           ; First sum term of y(N-1,n)
        ADDF3   R1,R2,R0        ; Second sum term
                                ; of y(N-1,n)
*
        NOP     *AR1--(IR1)     ; Return to first biquad
        NOP     *AR1--(1)%      ; Point to d(0,n-1)
*
* RETURN SEQUENCE
*
RETS ; Return
* end
*
.end


11.4.2.3 Adaptive Filters (LMS Algorithm)

In some applications in digital signal processing, you must adapt a filter over
time to keep track of changing conditions. The book Theory and Design of
Adaptive Filters by Treichler, Johnson, and Larimore (Wiley-Interscience,
1987) presents the theory of adaptive filters. Although in theory, both FIR and
IIR structures can be used as adaptive filters, the stability problems and the
local optimum points that the IIR filters exhibit make them less attractive for
such an application. Hence, until further research makes IIR filters a better
choice, only the FIR filters are used in adaptive algorithms of practical applica-
tions.

In an adaptive FIR filter, the filtering equation takes this form:

y [n] = h [n,0] x [n] + h [n,1] x [n – 1] + ... + h [n,N – 1] x [n – (N – 1)]

The filter coefficients are time-dependent. In a least-mean-squares (LMS)
algorithm, the coefficients are updated by an equation in this form:

h [n + 1,i] = h [n,i] + βx [n – i], i = 0,1,...,N – 1

β is a constant for the computation. You can interleave the updating of the filter
coefficients with the computation of the filter output so that it takes three cycles
per filter tap to do both. The updated coefficients are written over the old filter
coefficients. Example 11–33 shows the implementation of an adaptive FIR fil-
ter on the TMS320C3x. The memory organization and the positioning of the
data in memory should follow the same rules that apply to the FIR filter de-
scribed in subsection 11.4.2.1 on page 11-58.
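
The filtering and update equations can be modelled together in a few lines. The Python sketch below is an illustration under stated assumptions: `lms_step` is a hypothetical name, the scale factor β is passed in already computed (the listing that follows calls it tmuerr), and the sample history is given as a plain list rather than a circular buffer.

```python
def lms_step(h, x_hist, beta):
    """Compute y(n) with the current coefficients, then update them in place.

    h      -- coefficients h(n,0..N-1)
    x_hist -- x_hist[i] = x(n-i), i = 0..N-1
    beta   -- scale factor for the coefficient update
    """
    y = sum(hi * xi for hi, xi in zip(h, x_hist))   # filter with h(n,i)
    for i, xi in enumerate(x_hist):
        h[i] += beta * xi                           # h(n+1,i) = h(n,i) + beta*x(n-i)
    return y


h = [0.0, 0.0]
print(lms_step(h, [1.0, 2.0], 0.5), h)              # output uses the old coefficients
```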


Example 11–33. Adaptive FIR Filter (LMS Algorithm)


* TITLE ADAPTIVE FIR FILTER (LMS ALGORITHM)
*
* SUBROUTINE LMS
*
* LMS == LMS ADAPTIVE FILTER
*
* EQUATIONS: y(n) = h(n,0)*x(n) + h(n,1)*x(n-1) + ...
* + h(n,N-1)*x(n-(N-1))
*
* FOR (i = 0; i < N; i++)
* h(n+1,i) = h(n,i) + tmuerr * x(n-i)

*
* TYPICAL CALLING SEQUENCE:
*

* load R4
* load AR0
* load AR1
* load RC
* load BK
* CALL LMS
*
*

* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ----------+-------------------------------------
* R4 | SCALE FACTOR (2 * mu * err)
* AR0 | ADDRESS OF h(n,N-1)
* AR1 | ADDRESS OF x(n-(N-1))
* RC | LENGTH OF FILTER - 2 (N-2)
* BK | LENGTH OF FILTER (N)
*


* REGISTERS USED AS INPUT: R4, AR0, AR1, RC, BK
* REGISTERS MODIFIED: R0, R1, R2, AR0, AR1, RC
* REGISTER CONTAINING RESULT: R0
*
* PROGRAM SIZE: 10 words
*
* EXECUTION CYCLES: 14 + 3(N-1)
*
*
* SETUP (i = 0)
*

        .global LMS
*                               ; Initialize R0:
LMS     MPYF3   *AR0,*AR1,R0
*                               ; h(n,N-1) * x(n-(N-1)) -> R0
        LDF     0.0,R2          ; Initialize R2
*
*                               ; Initialize R1:
        MPYF3   *AR1++(1)%,R4,R1
*                               ; x(n-(N-1)) * tmuerr -> R1
        ADDF3   *AR0++(1),R1,R1
*                               ; h(n,N-1) + x(n-(N-1)) *
*                               ; tmuerr -> R1
*
* FILTER AND UPDATE (1 <= i < N)
*
        RPTB    LOOP            ; Set up the repeat block
*
*                               ; Filter:
        MPYF3   *AR0--(1),*AR1,R0       ; h(n,N-1-i)
                                        ; * x(n-(N-1-i)) -> R0
||      ADDF3   R0,R2,R2                ; Multiply and add operation
*
*                               ; Update:
        MPYF3   *AR1++(1)%,R4,R1        ; x(n-(N-1-i)) * tmuerr -> R1
||      STF     R1,*AR0++(1)            ; R1 -> h(n+1,N-1-(i-1))
*
LOOP    ADDF3   *AR0++(1),R1,R1
*                               ; h(n,N-1-i) + x(n-(N-1-i))
*                               ; * tmuerr -> R1
*
        ADDF3   R0,R2,R0        ; Add last product
        STF     R1,*-AR0(1)     ; h(n,0) + x(n)
                                ; * tmuerr -> h(n+1,0)
*

* RETURN SEQUENCE


*
RETS ; Return
*

* end
*
.end

11.4.3 Matrix-Vector Multiplication


In matrix-vector multiplication, a K x N matrix of elements m(i,j) having K rows
and N columns is multiplied by an N x 1 vector to produce a K x 1 result. The
multiplier vector has elements v(j), and the product vector has elements p(i).
Each of the product-vector elements is computed by the following expression:

p(i) = m(i,0) v(0) + m(i,1) v(1) + ... + m(i,N – 1) v(N – 1),   i = 0,1,...,K – 1

This is essentially a dot product, and the matrix-vector multiplication contains,
as a special case, the dot product presented in Example 11–2 on page 11-7.
In pseudo-C format, the computation of the matrix multiplication is expressed
by

for (i = 0; i < K; i++) {
    p(i) = 0
    for (j = 0; j < N; j++)
        p(i) = p(i) + m(i,j) * v(j)
}

Figure 11–4 shows the data memory organization for matrix-vector multiplica-
tion, and Example 11–34 shows the TMS320C3x assembly code that imple-
ments it. Note that in Example 11–34, K (number of rows) should be greater
than 0, and N (number of columns) should be greater than 1.
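
For reference, the pseudo-C loop can be run as written on a host machine. The Python sketch below assumes the matrix is stored row by row in one flat list, which mirrors the storage order m(0,0), m(0,1), ..., m(0,N – 1), m(1,0), ...; the function name `mat_vec` is hypothetical.

```python
def mat_vec(m, v, rows, cols):
    """Multiply a rows x cols matrix, stored row by row in a flat list, by v."""
    p = []
    idx = 0                             # walks through m in storage order
    for _ in range(rows):
        acc = 0.0
        for j in range(cols):
            acc += m[idx] * v[j]        # m(i,j) * v(j)
            idx += 1
        p.append(acc)                   # p(i)
    return p


m = [1.0, 2.0, 3.0,
     4.0, 5.0, 6.0]                     # K = 2 rows, N = 3 columns
print(mat_vec(m, [1.0, 0.0, -1.0], 2, 3))
```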


Figure 11–4. Data Memory Organization for Matrix-Vector Multiplication

               Matrix Storage   Input Vector Storage   Result Vector Storage
Low Address    m(0, 0)          v(0)                   p(0)
               m(0, 1)          v(1)                   p(1)
                 •                •                      •
                 •                •                      •
                 •                •                      •
               m(0, N - 1)      v(N - 1)               p(K - 1)
               m(1, 0)
High Address   m(1, 1)



Example 11–34. Matrix Times a Vector Multiplication


*
* TITLE MATRIX TIMES A VECTOR MULTIPLICATION
*
*
* SUBROUTINE MAT
*
* MAT == MATRIX TIMES A VECTOR OPERATION
*
*
* TYPICAL CALLING SEQUENCE:
*
* load AR0
* load AR1
* load AR2
* load AR3
* load R1
* CALL MAT
*
*

* ARGUMENT ASSIGNMENTS:

* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* AR0 | ADDRESS OF M(0,0)
* AR1 | ADDRESS OF V(0)
* AR2 | ADDRESS OF P(0)
* AR3 | NUMBER OF ROWS - 1 (K-1)
* R1 | NUMBER OF COLUMNS - 2 (N-2)
*

* REGISTERS USED AS INPUT: AR0, AR1, AR2, AR3, R1


* REGISTERS MODIFIED: R0, R2, AR0, AR1, AR2, AR3, IR0,
* RC, RSA, REA
*
*
* PROGRAM SIZE: 11
*
* EXECUTION CYCLES: 6 + 10 * K + K * (N - 1)
*
*
*

.global MAT

*
* SETUP

*
MAT     LDI     R1,IR0          ; Number of columns - 2 -> IR0
        ADDI    2,IR0           ; IR0 = N


*
* FOR (i = 0; i < K; i++) LOOP OVER THE ROWS
*
ROWS    LDF     0.0,R2          ; Initialize R2
        MPYF3   *AR0++(1),*AR1++(1),R0
*                               ; m(i,0) * v(0) -> R0
*
* FOR (j = 1; j < N; j++) DO DOT PRODUCT OVER COLUMNS
*
        RPTS    R1              ; Multiply a row by a column
*
        MPYF3   *AR0++(1),*AR1++(1),R0  ; m(i,j) * v(j) -> R0
||      ADDF3   R0,R2,R2                ; m(i,j-1) * v(j-1) + R2 -> R2
*
        DBD     AR3,ROWS        ; Counts the no. of rows left
*
        ADDF    R0,R2           ; Last accumulate
        STF     R2,*AR2++(1)    ; Result -> p(i)
        NOP     *--AR1(IR0)     ; Set AR1 to point to v(0)
* !!! DELAYED BRANCH HAPPENS HERE !!!


*
* RETURN SEQUENCE
*

RETS ; Return

* end
*
.end

11.4.4 Fast Fourier Transforms (FFT)

Fourier transforms are an important tool often used in digital signal processing
systems. The purpose of the transform is to convert information from the time
domain to the frequency domain. The inverse Fourier transform converts
information back to the time domain from the frequency domain. Implementations
of Fourier transforms that are computationally efficient are known as fast
Fourier transforms (FFTs). The theory of FFTs can be found in books such as
DFT/FFT and Convolution Algorithms by C.S. Burrus and T.W. Parks (John Wiley,
1985) and Digital Signal Processing Applications with the TMS320 Family by
Texas Instruments (literature number SPRA012A).


Fast Fourier transform is a label for a collection of algorithms that
efficiently convert data from the time domain to the frequency domain. There
are several types of FFTs:
- Radix-2 or radix-4 algorithms (depending on the size of the FFT butterfly)
- Decimation in time or decimation in frequency (DIT or DIF)
- Complex or real FFTs
- FFTs of different lengths

Several TMS320C3x features that support efficient implementation of
numerically intensive algorithms are particularly well suited to FFTs. The high
speed of the device (33-ns cycle time) makes real-time algorithms easier to
implement, while floating-point capability eliminates the problems associated
with dynamic range. The powerful indirect-addressing indexing scheme
facilitates access to FFT butterfly legs with different spans. The repeat block
implemented by the RPTB instruction reduces the looping overhead in algorithms
that depend heavily on loops (such as FFTs); this construct provides the
efficiency of in-line coding in loop form. The FFT produces its output in
bit-reversed order; therefore, the output must be reordered. This reordering
does not require extra cycles, because the device has a special mode of
indirect addressing (bit-reversed addressing) for accessing the FFT output in
the original order.
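
As a rough illustration of the reordering described above, the following Python sketch models bit-reversed indexing in software. It is not TMS320C3x code: the function names are hypothetical, and reverse_carry_add is an assumed software model of the address update performed by an operand such as *AR0++(IR0)B, in which the index register is added with the carry propagating in the reverse direction.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(n):
    """Bit-reversed permutation of 0..n-1 (n is assumed to be a power of 2)."""
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)]


def reverse_carry_add(addr, step, bits):
    """Assumed model of a reverse-carry add: reverse both values, add, reverse back."""
    mask = (1 << bits) - 1
    total = (bit_reverse(addr, bits) + bit_reverse(step, bits)) & mask
    return bit_reverse(total, bits)


# Stepping through 8 locations with a step of half the array size.
walk = []
addr = 0
for _ in range(8):
    walk.append(addr)
    addr = reverse_carry_add(addr, 4, 3)
print(bit_reversed_order(8), walk)
```

Both lists come out as 0, 4, 2, 6, 1, 5, 3, 7, which is why stepping a pointer with a reverse-carry add of half the array size visits a bit-reversed array in natural order.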

The examples in this subsection are based on programs contained in the
Burrus and Parks book and in the paper Real-Valued Fast Fourier Transform
Algorithms by H.V. Sorensen, et al. (IEEE Transactions on ASSP, June 1987).

Example 11–35 and Example 11–36 show the implementation of a complex,
radix-2, DIF FFT on the TMS320C3x. Example 11–35 contains the generic
FFT code, which can be used for a transform of any length that is a power
of 2. However, a complete implementation also needs a table of twiddle factors
(sines/cosines); the length of the table depends on the size of the transform.
To retain the generic form of Example 11–35, the table of twiddle factors
(containing 1 1/4 complete cycles of a sine) is presented separately in
Example 11–36 for the case of a 64-point FFT. A full cycle of the sine should
have a number of points equal to the FFT size. Example 11–36 defines two
variables: N, which is the FFT length, and M, which is the logarithm of N to a
base equal to the radix. In other words, M is the number of stages of the FFT.
For example, in a 64-point FFT, M = 6 when using a radix-2 algorithm, and
M = 3 when using a radix-4 algorithm. If the table of twiddle factors and the
FFT code are kept in separate files, they should be connected at link time.
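
A table like the one in Example 11–36 can be generated on a host computer rather than typed by hand. The Python sketch below is one possible way to do so under the layout described above (1 1/4 cycles of a sine, N points per full cycle); the function name twiddle_table and the .float output formatting are illustrative assumptions, not part of any TI tool.

```python
import math


def twiddle_table(n):
    """Return 1 1/4 cycles of a sine with n points per full cycle.

    The cosine values start n/4 entries into the table, because
    cos(x) = sin(x + pi/2).
    """
    return [math.sin(2.0 * math.pi * k / n) for k in range(n + n // 4)]


n = 64
table = twiddle_table(n)

# Emit the entries as assembler .float directives (assumed formatting).
lines = ["        .float  {:.6f}".format(value) for value in table]
print("\n".join(lines[:4]))
```

For N = 64 the table has 80 entries, and the cosine of entry k is found N/4 = 16 entries further on, which matches the way the FFT code indexes the table with IR1 = N/4.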


Example 11–35. Complex, Radix-2, DIF FFT


*
* TITLE COMPLEX, RADIX–2, DIF FFT
*
* GENERIC PROGRAM FOR LOOPED-CODE RADIX-2 FFT COMPUTATION IN TMS320C3x
*
* THE PROGRAM IS TAKEN FROM THE BURRUS AND PARKS BOOK, P. 111.
* THE (COMPLEX) DATA RESIDE IN INTERNAL MEMORY. THE COMPUTATION
* IS DONE IN PLACE, BUT THE RESULT IS MOVED TO ANOTHER MEMORY
* SECTION TO DEMONSTRATE THE BIT-REVERSED ADDRESSING.
*
* THE TWIDDLE FACTORS ARE SUPPLIED IN A TABLE THAT IS PUT IN A .DATA
* SECTION. THIS DATA IS INCLUDED IN A SEPARATE FILE TO PRESERVE THE
* GENERIC NATURE OF THE PROGRAM. FOR THE SAME PURPOSE, THE SIZE OF
* THE FFT N AND LOG2(N) ARE DEFINED IN A .GLOBL DIRECTIVE AND SPECIFIED
* DURING LINKING.
*
*

.globl FFT ; Entry point for execution


.globl N ; FFT size
.globl M ; LOG2(N)
.globl SINE ; Address of sine table

INP .usect "IN",1024 ; Memory with input data


.BSS OUTP,1024 ; Memory with output data

.text

* INITIALIZE

FFTSIZ .word N
LOGFFT .word M
SINTAB .word SINE
INPUT .word INP
OUTPUT .word OUTP

FFT: LDP FFTSIZ ; Command to load data page pointer

LDI @FFTSIZ,IR1
LSH -2,IR1 ; IR1 = N/4, pointer for SIN/COS table
LDI 0,AR6 ; AR6 holds the current stage number
LDI @FFTSIZ,IR0
LSH 1,IR0 ; IR0 = 2*N1 (because of real/imag)
LDI @FFTSIZ,R7 ; R7 = N2
LDI 1,AR7 ; Initialize repeat counter
; of first loop
LDI 1,AR5 ; Initialize IE index (AR5 = IE)


* OUTER LOOP

LOOP: NOP *++AR6(1) ; Current FFT stage


LDI @INPUT,AR0 ; AR0 points to X(I)
ADDI R7,AR0,AR2 ; AR2 points to X(L)
LDI AR7,RC
SUBI 1,RC ; RC should be one less than desired #

* FIRST LOOP

RPTB BLK1
ADDF *AR0,*AR2,R0 ; R0 = X(I)+X(L)
SUBF *AR2++,*AR0++,R1 ; R1 = X(I)-X(L)
ADDF *AR2,*AR0,R2 ; R2 = Y(I)+Y(L)
SUBF *AR2,*AR0,R3 ; R3 = Y(I)-Y(L)
STF R2,*AR0-- ; Y(I) = R2 and...
|| STF R3,*AR2-- ; Y(L) = R3
BLK1 STF R0,*AR0++(IR0) ; X(I) = R0 and...
|| STF R1,*AR2++(IR0) ; X(L) = R1 and AR0,2 = AR0,2 + 2*n

* IF THIS IS THE LAST STAGE, YOU ARE DONE

CMPI @LOGFFT,AR6
BZD END

* MAIN INNER LOOP

LDI 2,AR1 ; Init loop counter for


; inner loop
LDI @SINTAB,AR4 ; Initialize IA index (AR4 = IA)
INLOP: ADDI AR5,AR4 ; IA = IA+IE; AR4 points to
; cosine
LDI AR1,AR0
ADDI 2,AR1 ; Increment inner loop counter
ADDI @INPUT,AR0 ; (X(I),Y(I)) pointer
ADDI R7,AR0,AR2 ; (X(L),Y(L)) pointer
LDI AR7,RC
SUBI 1,RC ; RC should be 1 less than
; desired #
LDF *AR4,R6 ; R6 = SIN

* SECOND LOOP

RPTB BLK2
SUBF *AR2,*AR0,R2 ; R2 = X(I)-X(L)
SUBF *+AR2,*+AR0,R1
* ; R1 = Y(I)-Y(L)
MPYF R2,R6,R0 ; R0 = R2*SIN and...
|| ADDF *+AR2,*+AR0,R3
* ; R3 = Y(I)+Y(L)
MPYF R1,*+AR4(IR1),R3 ; R3 = R1 * COS and ...
|| STF R3,*+AR0 ; Y(I) = Y(I)+Y(L)


SUBF R0,R3,R4 ; R4 = R1 * COS-R2 * SIN


MPYF R1,R6,R0 ; R0 = R1 * SIN and...
|| ADDF *AR2,*AR0,R3 ; R3 = X(I) + X(L)
MPYF R2,*+AR4(IR1),R3 ; R3 = R2 * COS and...
|| STF R3,*AR0++(IR0)
*
* ; X(I) = X(I)+X(L) and AR0 = AR0+2*N1
ADDF R0,R3,R5 ; R5 = R2*COS+R1*SIN
BLK2 STF R5,*AR2++(IR0) ; X(L) = R2 * COS+R1 * SIN,
; incr AR2 and...
|| STF R4,*+AR2 ; Y(L) = R1*COS-R2*SIN

CMPI R7,AR1
BNE INLOP ; Loop back to the inner loop

LSH 1,AR7 ; Increment loop counter for next time


BRD LOOP ; Next FFT stage (delayed)
LSH 1,AR5 ; IE = 2*IE
LDI R7,IR0 ; N1 = N2
LSH -1,R7 ; N2 = N2/2

* STORE RESULT OUT USING BIT-REVERSED ADDRESSING

END: LDI @FFTSIZ,RC ; RC = N


SUBI 1,RC ; RC should be one less than desired #
LDI @FFTSIZ,IR0 ; IR0 = size of FFT = N
LDI 2,IR1
LDI @INPUT,AR0
LDI @OUTPUT,AR1

RPTB BITRV
LDF *+AR0(1),R0
|| LDF *AR0++(IR0)B,R1
BITRV STF R0,*+AR1(1)
|| STF R1,*AR1++(IR1)

SELF BR SELF ; Branch to itself at the end


.end
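
When debugging the assembly routine, it can help to compare its output with a high-level reference. The Python sketch below follows the same overall structure (in-place radix-2 DIF butterflies, then a bit-reversed copy to a separate output array), but it is only an illustrative model under those assumptions: it computes twiddle factors directly instead of reading a table, and the function names are hypothetical.

```python
import cmath


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def fft_dif_radix2(x):
    """Radix-2 decimation-in-frequency FFT; len(x) must be a power of 2."""
    a = [complex(v) for v in x]
    n = len(a)
    span = n // 2                      # distance between butterfly legs
    while span >= 1:
        for start in range(0, n, 2 * span):
            for k in range(span):
                w = cmath.exp(-2j * cmath.pi * k / (2 * span))
                top = a[start + k]
                bottom = a[start + k + span]
                a[start + k] = top + bottom
                a[start + k + span] = (top - bottom) * w
        span //= 2
    # The in-place result is in bit-reversed order; copy it out in natural order.
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        out[bit_reverse(i, bits)] = a[i]
    return out


def dft(x):
    """Direct O(N^2) DFT, used only as a check."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


x = [1, 2, 3, 4, 0, 0, 0, 0]
error = max(abs(a - b) for a, b in zip(fft_dif_radix2(x), dft(x)))
print(error)
```

The assembly listing treats the butterflies whose twiddle factor is 1 in a separate first loop to avoid needless multiplications; the sketch does not make that distinction.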


Example 11–36. Table With Twiddle Factors for a 64-Point FFT


*
* TITLE TABLE WITH TWIDDLE FACTORS FOR A 64-POINT FFT
*
* FILE TO BE LINKED WITH THE SOURCE CODE FOR A 64-POINT, RADIX-2 FFT

.globl SINE
.globl N
.globl M

N .set 64
M .set 6

.data

SINE
.float 0.000000
.float 0.098017
.float 0.195090
.float 0.290285
.float 0.382683
.float 0.471397
.float 0.555570
.float 0.634393
.float 0.707107
.float 0.773010
.float 0.831470
.float 0.881921
.float 0.923880
.float 0.956940
.float 0.980785
.float 0.995185
COSINE
.float 1.000000
.float 0.995185
.float 0.980785
.float 0.956940
.float 0.923880
.float 0.881921
.float 0.831470
.float 0.773010
.float 0.707107
.float 0.634393
.float 0.555570
.float 0.471397
.float 0.382683
.float 0.290285
.float 0.195090


.float 0.098017
.float 0.000000
.float -0.098017
.float -0.195090
.float -0.290285
.float -0.382683
.float -0.471397
.float -0.555570
.float -0.634393
.float -0.707107
.float -0.773010
.float -0.831470
.float -0.881921
.float -0.923880
.float -0.956940
.float -0.980785
.float -0.995185
.float -1.000000
.float -0.995185
.float -0.980785
.float -0.956940
.float -0.923880
.float -0.881921
.float -0.831470
.float -0.773010
.float -0.707107
.float -0.634393
.float -0.555570
.float -0.471397
.float -0.382683
.float -0.290285
.float -0.195090
.float -0.098017


.float 0.000000
.float 0.098017
.float 0.195090
.float 0.290285
.float 0.382683
.float 0.471397
.float 0.555570
.float 0.634393
.float 0.707107
.float 0.773010
.float 0.831470
.float 0.881921
.float 0.923880
.float 0.956940
.float 0.980785
.float 0.995185

The radix-2 algorithm has tutorial value, because the functioning of the FFT
algorithm is relatively easy to understand. However, a radix-4 implementation
can increase execution speed by reducing the amount of arithmetic required.
Example 11–37 shows the generic implementation of a complex, DIF FFT in
radix-4. A companion table, such as the one in Example 11–36, should set M
equal to log4(N), that is, the logarithm of N to base 4.
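
The value of M for a given N and radix can be computed and validated before it is placed in the companion table. The following Python sketch shows one way to do this; the function name fft_stages is a hypothetical helper and not part of the TI tools.

```python
def fft_stages(n, radix):
    """Return M = log_radix(n); raise ValueError if n is not a power of the radix."""
    if n < 1 or radix < 2:
        raise ValueError("n must be >= 1 and radix must be >= 2")
    m = 0
    while n > 1:
        if n % radix:
            raise ValueError("n must be a power of the radix")
        n //= radix
        m += 1
    return m


print(fft_stages(64, 2), fft_stages(64, 4))
```

A length such as 128 is a power of 2 but not of 4, so fft_stages(128, 4) raises an error; such a length would need a radix-2 routine.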


Example 11–37. Complex, Radix-4, DIF FFT


*
* TITLE COMPLEX, RADIX-4, DIF FFT
*
* GENERIC PROGRAM TO PERFORM A LOOPED-CODE RADIX-4 FFT COMPUTATION
* IN THE TMS320C3x
*
* THE PROGRAM IS TAKEN FROM THE BURRUS AND PARKS BOOK, P. 117.
* THE (COMPLEX) DATA RESIDE IN INTERNAL MEMORY, AND THE COMPUTATION
* IS DONE IN PLACE.
*
* THE TWIDDLE FACTORS ARE SUPPLIED IN A TABLE THAT IS PUT IN A .DATA
* SECTION. THIS DATA IS INCLUDED IN A SEPARATE FILE TO PRESERVE THE
* GENERIC NATURE OF THE PROGRAM. FOR THE SAME PURPOSE, THE SIZE OF
* THE FFT N AND LOG4(N) ARE DEFINED IN A .GLOBL DIRECTIVE AND
* SPECIFIED DURING LINKING.
*
* IN ORDER TO HAVE THE FINAL RESULT IN BIT-REVERSED ORDER, THE TWO
* MIDDLE BRANCHES OF THE RADIX-4 BUTTERFLY ARE INTERCHANGED DURING
* STORAGE. NOTE THIS DIFFERENCE WHEN COMPARING WITH THE PROGRAM IN
* P. 117 OF THE BURRUS AND PARKS BOOK.
*

*
.globl FFT ; Entry point for execution
.globl N ; FFT size
.globl M ; LOG4(N)
.globl SINE ; Address of sine table

INP .usect "IN",1024 ; Memory with input data

.text

* INITIALIZE

TEMP .word $+2


STORE .word FFTSIZ ; Beginning of temp storage area
.word N
.word M
.word SINE
.word INP

.BSS FFTSIZ,1 ; FFT size


.BSS LOGFFT,1 ; LOG4(FFTSIZ)
.BSS SINTAB,1 ; Sine/cosine table base
.BSS INPUT,1 ; Area with input data to process
.BSS STAGE,1 ; FFT stage #
.BSS RPTCNT,1 ; Repeat counter
.BSS IEINDX,1 ; IE index for sine/cosine


.BSS LPCNT,1 ; Second-loop count


.BSS JT,1 ; JT counter in program, P. 117
.BSS IA1,1 ; IA1 index in program, P. 117

FFT:

* INITIALIZE DATA LOCATIONS

LDP TEMP ; Command to load data page counter


LDI @TEMP,AR0
LDI @STORE,AR1
LDI *AR0++,R0 ; Xfer data from one memory to the other
STI R0,*AR1++
LDI *AR0++,R0
STI R0,*AR1++
LDI *AR0++,R0
STI R0,*AR1++
LDI *AR0,R0
STI R0,*AR1

LDP FFTSIZ ; Command to load data page pointer


LDI @FFTSIZ,R0
LDI @FFTSIZ,IR0
LDI @FFTSIZ,IR1
LDI 0,AR7
STI AR7,@STAGE ; @STAGE holds the current stage number
LSH 1,IR0 ; IR0 = 2*N1 (because of real/imag)
LSH -2,IR1 ; IR1 = N/4, pointer for SIN/COS table
LDI 1,AR7
STI AR7,@RPTCNT ; Init repeat counter of first loop
STI AR7,@IEINDX ; Init. IE index
LSH -2,R0 ; JT = R0/2+2
ADDI 2,R0
STI R0,@JT
SUBI 2,R0
LSH 1,R0 ; R0 = N2

* OUTER LOOP

LOOP:
LDI @INPUT,AR0 ; AR0 points to X(I)
ADDI R0,AR0,AR1 ; AR1 points to X(I1)
ADDI R0,AR1,AR2 ; AR2 points to X(I2)
ADDI R0,AR2,AR3 ; AR3 points to X(I3)
LDI @RPTCNT,RC
SUBI 1,RC ; RC should be one less than desired #

* FIRST LOOP

RPTB BLK1
ADDF *+AR0,*+AR2,R1


* ; R1 = Y(I)+Y(I2)
ADDF *+AR3,*+AR1,R3
* ; R3 = Y(I1)+Y(I3)
ADDF R3,R1,R6 ; R6 = R1+R3
SUBF *+AR2,*+AR0,R4
* ; R4 = Y(I)-Y(I2)
STF R6,*+AR0 ; Y(I) = R1+R3
SUBF R3,R1 ; R1 = R1-R3
LDF *AR2,R5 ; R5 = X(I2)
|| LDF *+AR1,R7 ; R7 = Y(I1)
ADDF *AR3,*AR1,R3 ; R3 = X(I1)+X(I3)
ADDF R5,*AR0,R1 ; R1 = X(I)+X(I2)
|| STF R1,*+AR1 ; Y(I1) = R1-R3
ADDF R3,R1,R6 ; R6 = R1+R3
SUBF R5,*AR0,R2 ; R2 = X(I)-X(I2)
|| STF R6,*AR0++(IR0) ; X(I) = R1+R3
SUBF R3,R1 ; R1 = R1-R3
SUBF *AR3,*AR1,R6 ; R6 = X(I1)-X(I3)
SUBF R7,*+AR3,R3 ; -R3 = Y(I1)-Y(I3)
|| STF R1,*AR1++(IR0) ; X(I1) = R1-R3
SUBF R6,R4,R5 ; R5 = R4-R6
ADDF R6,R4 ; R4 = R4+R6
STF R5,*+AR2 ; Y(I2) = R4-R6
|| STF R4,*+AR3 ; Y(I3) = R4+R6
SUBF R3,R2,R5 ; R5 = R2-R3
ADDF R3,R2 ; R2 = R2+R3
BLK1 STF R5,*AR2++(IR0) ; X(I2) = R2-R3
|| STF R2,*AR3++(IR0) ; X(I3) = R2+R3

* IF THIS IS THE LAST STAGE, YOU ARE DONE

LDI @STAGE,AR7
ADDI 1,AR7
CMPI @LOGFFT,AR7
BZD END
STI AR7,@STAGE ; Current FFT stage

* MAIN INNER LOOP

LDI 1,AR7
STI AR7,@IA1 ; Init IA1 index
LDI 2,AR7
STI AR7,@LPCNT ; Init loop counter for inner loop
INLOP:
LDI 2,AR6 ; Increment inner loop counter
ADDI @LPCNT,AR6
LDI @LPCNT,AR0
LDI @IA1,AR7
ADDI @IEINDX,AR7 ; IA1 = IA1+IE
ADDI @INPUT,AR0 ; (X(I),Y(I)) pointer
STI AR7,@IA1


ADDI R0,AR0,AR1 ; (X(I1),Y(I1)) pointer


STI AR6,@LPCNT
ADDI R0,AR1,AR2 ; (X(I2),Y(I2)) pointer
ADDI R0,AR2,AR3 ; (X(I3),Y(I3)) pointer
LDI @RPTCNT,RC
SUBI 1,RC ; RC should be one less than desired #
CMPI @JT,AR6 ; If LPCNT = JT, go to
BZD SPCL ; special butterfly
LDI @IA1,AR7
LDI @IA1,AR4
ADDI @SINTAB,AR4 ; Create cosine index AR4
SUBI 1,AR4 ; Adjust sine table pointer
ADDI AR4,AR7,AR5
SUBI 1,AR5 ; IA2 = IA1+IA1-1
ADDI AR7,AR5,AR6
SUBI 1,AR6 ; IA3 = IA2+IA1-1

* SECOND LOOP

RPTB BLK2
ADDF *+AR2,*+AR0,R3
* ; R3 = Y(I)+Y(I2)
ADDF *+AR3,*+AR1,R5
* ; R5 = Y(I1)+Y(I3)
ADDF R5,R3,R6 ; R6 = R3+R5
SUBF *+AR2,*+AR0,R4
* ; R4 = Y(I)-Y(I2)
SUBF R5,R3 ; R3 = R3-R5
ADDF *AR2,*AR0,R1 ; R1 = X(I)+X(I2)
ADDF *AR3,*AR1,R5 ; R5 = X(I1)+X(I3)
MPYF R3,*+AR5(IR1),R6 ; R6 = R3*CO2
|| STF R6,*+AR0 ; Y(I) = R3+R5
ADDF R5,R1,R7 ; R7 = R1+R5
SUBF *AR2,*AR0,R2 ; R2 = X(I)-X(I2)
SUBF R5,R1 ; R1 = R1-R5
MPYF R1,*AR5,R7 ; R7 = R1*SI2
|| STF R7,*AR0++(IR0) ; X(I) = R1+R5
SUBF R7,R6 ; R6 = R3*CO2-R1*SI2
SUBF *+AR3,*+AR1,R5
* ; R5 = Y(I1)-Y(I3)
MPYF R1,*+AR5(IR1),R7 ; R7 = R1*CO2
|| STF R6,*+AR1 ; Y(I1) = R3*CO2-R1*SI2
MPYF R3,*AR5,R6 ; R6 = R3*SI2
ADDF R7,R6 ; R6 = R1*CO2+R3*SI2
ADDF R5,R2,R1 ; R1 = R2+R5
SUBF R5,R2 ; R2 = R2-R5
SUBF *AR3,*AR1,R5 ; R5 = X(I1)-X(I3)
SUBF R5,R4,R3 ; R3 = R4-R5
ADDF R5,R4 ; R4 = R4+R5
MPYF R3,*+AR4(IR1),R6 ; R6 = R3*CO1
|| STF R6,*AR1++(IR0) ; X(I1) = R1*CO2+R3*SI2


MPYF R1,*AR4,R7 ; R7 = R1*SI1


SUBF R7,R6 ; R6 = R3*CO1-R1*SI1
MPYF R1,*+AR4(IR1),R6 ; R6 = R1*CO1
|| STF R6,*+AR2 ; Y(I2) = R3*CO1-R1*SI1
MPYF R3,*AR4,R7 ; R7 = R3*SI1
ADDF R7,R6 ; R6 = R1*CO1+R3*SI1
MPYF R4,*+AR6(IR1),R6 ; R6 = R4*CO3
|| STF R6,*AR2++(IR0) ; X(I2) = R1*CO1+R3*SI1
MPYF R2,*AR6,R7 ; R7 = R2*SI3
SUBF R7,R6 ; R6 = R4*CO3-R2*SI3
MPYF R2,*+AR6(IR1),R6 ; R6 = R2*CO3
|| STF R6,*+AR3 ; Y(I3) = R4*CO3-R2*SI3
MPYF R4,*AR6,R7 ; R7 = R4*SI3
ADDF R7,R6 ; R6 = R2*CO3+R4*SI3
BLK2 STF R6,*AR3++(IR0)
* ; X(I3) = R2*CO3+R4*SI3

CMPI @LPCNT,R0
BP INLOP ; Loop back to the inner loop
BR CONT

* SPECIAL BUTTERFLY FOR W = J

SPCL LDI IR1,AR4


LSH -1,AR4 ; Point to SIN(45)
ADDI @SINTAB,AR4 ; Create cosine index AR4 = CO21

RPTB BLK3
ADDF *AR2,*AR0,R1 ; R1 = X(I)+X(I2)
SUBF *AR2,*AR0,R2 ; R2 = X(I)-X(I2)
ADDF *+AR2,*+AR0,R3
* ; R3 = Y(I)+Y(I2)
SUBF *+AR2,*+AR0,R4
* ; R4 = Y(I)-Y(I2)
ADDF *AR3,*AR1,R5 ; R5 = X(I1)+X(I3)
SUBF R1,R5,R6 ; R6 = R5-R1
ADDF R5,R1 ; R1 = R1+R5
ADDF *+AR3,*+AR1,R5
* ; R5 = Y(I1)+Y(I3)
SUBF R5,R3,R7 ; R7 = R3-R5
ADDF R5,R3 ; R3 = R3+R5
STF R3,*+AR0 ; Y(I) = R3+R5
|| STF R1,*AR0++(IR0) ; X(I) = R1+R5
SUBF *AR3,*AR1,R1 ; R1 = X(I1)-X(I3)
SUBF *+AR3,*+AR1,R3
* ; R3 = Y(I1)-Y(I3)
STF R6,*+AR1 ; Y(I1) = R5-R1


|| STF R7,*AR1++(IR0) ; X(I1) = R3-R5
ADDF R3,R2,R5 ; R5 = R2+R3
SUBF R2,R3,R2 ; R2 = -R2+R3
SUBF R1,R4,R3 ; R3 = R4-R1
ADDF R1,R4 ; R4 = R4+R1
SUBF R5,R3,R1 ; R1 = R3-R5
MPYF *AR4,R1 ; R1 = R1*CO21
ADDF R5,R3 ; R3 = R3+R5
MPYF *AR4,R3 ; R3 = R3*CO21
|| STF R1,*+AR2 ; Y(I2) = (R3-R5)*CO21
SUBF R4,R2,R1 ; R1 = R2-R4
MPYF *AR4,R1 ; R1 = R1*CO21
|| STF R3,*AR2++(IR0) ; X(I2) = (R3+R5)*CO21
ADDF R4,R2 ; R2 = R2+R4
MPYF *AR4,R2 ; R2 = R2*CO21
BLK3 STF R1,*+AR3 ; Y(I3) = -(R4-R2)*CO21
|| STF R2,*AR3++(IR0) ; X(I3) = (R4+R2)*CO21

CMPI @LPCNT,R0
BPD INLOP ; Loop back to the inner loop

CONT LDI @RPTCNT,AR7


LDI @IEINDX,AR6
LSH 2,AR7 ; Increment repeat counter for
* ; next time
STI AR7,@RPTCNT
LSH 2,AR6 ; IE = 4*IE
STI AR6,@IEINDX
LDI R0,IR0 ; N1 = N2
LSH -3,R0
ADDI 2,R0
STI R0,@JT ; JT = N2/2+2
SUBI 2,R0
LSH 1,R0 ; N2 = N2/4
BR LOOP ; Next FFT stage
* STORE RESULT USING BIT-REVERSED ADDRESSING


END: LDI @FFTSIZ,RC ; RC = N


SUBI 1,RC ; RC should be one less than desired #
LDI @FFTSIZ,IR0 ; IR0 = size of FFT = N
LDI 2,IR1
LDI @INPUT,AR0
LDP STORE
LDI @STORE,AR1

RPTB BITRV
LDF *+AR0(1),R0
|| LDF *AR0++(IR0)B,R1
BITRV STF R0,*+AR1(1)
|| STF R1,*AR1++(IR1)

SELF BR SELF ; Branch to itself at the end


.end

The data to be transformed is usually a sequence of real numbers. In this case,
the FFT exhibits certain symmetries that reduce the computational load even
further. Example 11–38 shows the generic implementation of a real-valued,
radix-2 FFT. For such an FFT, the total storage required for a length-N
transform is only N locations; a complex FFT requires 2N. The remaining
points are recovered from the symmetry conditions.
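
The symmetry in question is that, for real input, X(N - k) is the complex conjugate of X(k), so N real values are enough to describe the whole transform. The Python sketch below illustrates this with a direct DFT and packs the result in the order R(0), ..., R(N/2), I(N/2 - 1), ..., I(1) that the header of Example 11–38 describes. The function names are hypothetical, and the sign of the imaginary parts assumes the usual exp(-j*2*pi*k*n/N) transform convention.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, used here only for illustration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def pack_real_spectrum(spectrum):
    """Keep N real numbers: R(0..N/2) followed by I(N/2-1) down to I(1)."""
    n = len(spectrum)
    reals = [spectrum[k].real for k in range(n // 2 + 1)]
    imags = [spectrum[k].imag for k in range(n // 2 - 1, 0, -1)]
    return reals + imags


def unpack_real_spectrum(packed):
    """Rebuild all N complex points from the packed form using the symmetry."""
    n = len(packed)
    half = n // 2
    spectrum = [0j] * n
    spectrum[0] = complex(packed[0], 0.0)
    spectrum[half] = complex(packed[half], 0.0)
    for k in range(1, half):
        value = complex(packed[k], packed[n - k])
        spectrum[k] = value
        spectrum[n - k] = value.conjugate()
    return spectrum


x = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.0]
X = dft(x)
packed = pack_real_spectrum(X)
rebuilt = unpack_real_spectrum(packed)
print(len(packed), max(abs(a - b) for a, b in zip(X, rebuilt)))
```

The packed array has exactly N entries, and unpacking it reproduces the full complex spectrum to within rounding error.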

Example 11–39 shows the implementation of a radix-2 real inverse FFT. The
inverse transformation assumes that the input data is given in the order
presented at the output of the forward transformation and produces a time
signal in the proper order (that is, bit reversing takes place at the end of
the program).


Example 11–38. Real, Radix-2 FFT


*****************************************************************************
* FILENAME : ffft_rl.asm
*
* WRITTEN BY : Alex Tessarolo
* Texas Instruments, Australia
*
* DATE : 23rd July 1991
*
* VERSION : 2.0
*
*****************************************************************************

*
* VER DATE COMMENTS
* ––– –––––––––––– –––––––––––––––––––––––––––––––––––––––––––––––––––
* 1.0 18th July 91 Original release.
* 2.0 23rd July 91 Most stages modified.
* Minimum FFT size increased from 32 to 64.
* Faster in place bit reversing algorithm.
* Program size increased by about 100 words.
* One extra data word required.

*****************************************************************************

* SYNOPSIS: int ffft_rl( FFT_SIZE, LOG_SIZE, SOURCE_ADDR, DEST_ADDR,


* SINE_TABLE, BIT_REVERSE );
*
* int FFT_SIZE ; 64, 128, 256, 512, 1024, ...
* int LOG_SIZE ; 6, 7, 8, 9, 10, ...
* float *SOURCE_ADDR ; Points to location of source data.
* float *DEST_ADDR ; Points to where data will be
* ; operated on and stored.
* float *SINE_TABLE ; Points to the SIN/COS table.
* int BIT_REVERSE ; = 0, bit reversing is disabled.
* ; <> 0, bit reversing is enabled.
*
* NOTE: 1) If SOURCE_ADDR = DEST_ADDR, then in-place bit
* reversing is performed, if enabled (more
* processor intensive).
* 2) FFT_SIZE must be >= 64 (this is not checked).
*


* DESCRIPTION: Generic function to do a radix–2 FFT computation on the C30.


* The data array is FFT_SIZE–long with only real data. The out-
* put is stored in the same locations with real and imaginary
* points R and I as follows:
*
* DEST_ADDR[0] R(0)
* R(1)
* R(2)
* R(3)
* .
* .
* R(FFT_SIZE/2)
* I(FFT_SIZE/2 – 1)
* .
* .
* I(2)
* DEST_ADDR[FFT_SIZE – 1] I(1)
*
* The program is based on the FORTRAN program in the
* paper by Sorensen et al., June 1987 issue of Trans.
* on ASSP.
*
* Bit reversal is optionally implemented at the beginning
* of the function.
*
* The sine/cosine table for the twiddle factors is expected
* to be supplied in the following format:
*
* SINE_TABLE[0] sin(0*2*pi/FFT_SIZE)
* sin(1*2*pi/FFT_SIZE)
* .
* .
* sin((FFT_SIZE/2-2)*2*pi/FFT_SIZE)
* SINE_TABLE[FFT_SIZE/2 - 1] sin((FFT_SIZE/2-1)*2*pi/FFT_SIZE)
*
* NOTE: The table is the first half period of a sine wave.
*
* Stack structure upon call:
*
*       -FP(7)   BIT_REVERSE
*       -FP(6)   SINE_TABLE
*       -FP(5)   DEST_ADDR
*       -FP(4)   SOURCE_ADDR
*       -FP(3)   LOG_SIZE
*       -FP(2)   FFT_SIZE
*       -FP(1)   return addr
*       -FP(0)   old FP
*
*
*****************************************************************************


* NOTE: Calling C program can be compiled using either large


* or small model.
*
* WARNING: DP initialized only once in the program. Be wary
* with interrupt service routines. Make sure interrupt
* service routines save the DP pointer.
*
* WARNING: The DEST_ADDR must be aligned such that the first
* LOG_SIZE bits are zero (this is not checked by the
* program).
*
*****************************************************************************
*

* REGISTERS USED: R0, R1, R2, R3, R4, R5, R6, R7


* AR0, AR1, AR2, AR3, AR4, AR5, AR6, AR7
* IR0, IR1
* RC, RS, RE
* DP
*
* MEMORY REQUIREMENTS: Program = 405 Words (approximately)
* Data = 7 Words
* Stack = 12 Words
*
*****************************************************************************
*

* BENCHMARKS: Assumptions – Program in RAM0


* – Reserved data in RAM0
* – Stack on primary/expansion bus RAM
* – Sine/cosine tables in RAM0
* – Processing and data destination in RAM1.
* – Primary/expansion bus RAM, 0 wait state.
*
* FFT Size Bit Reversing Data Source Cycles(C30)
* –––––––– ––––––––––––– ––––––––––– –––––––––––
* 1024 OFF RAM1 19816 approx.
* Note: This number does not include the C callable overheads.
* Add 57 cycles for these overheads.
*
*****************************************************************************

FP .set AR3

.global _ffft_rl ; Entry execution point.

FFT_SIZE: .usect ".fftdata",1 ; Reserve memory for arguments.
LOG_SIZE: .usect ".fftdata",1
SOURCE_ADDR: .usect ".fftdata",1
DEST_ADDR: .usect ".fftdata",1
SINE_TABLE: .usect ".fftdata",1
BIT_REVERSE: .usect ".fftdata",1
SEPARATION: .usect ".fftdata",1


;
; Initialize C function.
;
.sect ".ffttext"
_ffft_rl: PUSH FP ; Preserve C environment.
LDI SP,FP
PUSH R4
PUSH R5
PUSH R6
PUSHF R6
PUSH R7
PUSHF R7
PUSH AR4
PUSH AR5
PUSH AR6
PUSH AR7
PUSH DP
LDP FFT_SIZE ; Init. DP pointer.
LDI *-FP(2),R0 ; Move arguments from stack.
STI R0,@FFT_SIZE
LDI *-FP(3),R0
STI R0,@LOG_SIZE
LDI *-FP(4),R0
STI R0,@SOURCE_ADDR
LDI *-FP(5),R0
STI R0,@DEST_ADDR
LDI *-FP(6),R0
STI R0,@SINE_TABLE
LDI *-FP(7),R0
STI R0,@BIT_REVERSE
;
; Check bit reversing mode (on or off).
;
; BIT_REVERSE = 0, then OFF
; (no bit reversing).
; BIT_REVERSE <> 0, then ON.
;
LDI @BIT_REVERSE,R0
CMPI 0,R0
BZ MOVE_DATA
;
; Check bit reversing type.
;
; If SourceAddr = DestAddr, then in place
; bit reversing.
; If SourceAddr <> DestAddr, then
; standard bit reversing.
;


LDI @SOURCE_ADDR,R0
CMPI @DEST_ADDR,R0
BEQ IN_PLACE

;
; Bit reversing Type 1 (from source to
; destination).
;
; NOTE: abs(SOURCE_ADDR – DEST_ADDR)
; must be > FFT_SIZE, this is not
; checked.
;

LDI @FFT_SIZE,R0
SUBI 2,R0
LDI @FFT_SIZE,IR0
LSH -1,IR0 ; IR0 = half FFT size.
LDI @SOURCE_ADDR,AR0
LDI @DEST_ADDR,AR1

LDF *AR0++,R1

RPTS R0
LDF *AR0++,R1
|| STF R1,*AR1++(IR0)B

STF R1,*AR1++(IR0)B

BR START

;
; In-place bit reversing.
;

; Bit reversing on even locations,


; 1st half only.

IN_PLACE: LDI @FFT_SIZE,IR0


LSH -2,IR0 ; IR0 = quarter FFT size.
LDI 2,IR1

LDI @FFT_SIZE,RC
LSH –2,RC
SUBI 3,RC
LDI @DEST_ADDR,AR0
LDI AR0,AR1
LDI AR0,AR2

NOP *AR1++(IR0)B
NOP *AR2++(IR0)B
LDF *++AR0(IR1),R0
LDF *AR1,R1
CMPI AR1,AR0 ; Xchange locs only if AR0<AR1.
LDFGT R0,R1
LDFGT *AR1++(IR0)B,R1


RPTB BITRV1
LDF *++AR0(IR1),R0
|| STF R0,*AR0
LDF *AR1,R1
|| STF R1,*AR2++(IR0)B
CMPI AR1,AR0
LDFGT R0,R1
BITRV1: LDFGT *AR1++(IR0)B,R0

STF R0,*AR0
STF R1,*AR2

; Perform bit reversing on odd


; locations, 2nd half only.

LDI @FFT_SIZE,RC
LSH –1,RC
LDI @DEST_ADDR,AR0
ADDI RC,AR0
ADDI 1,AR0
LDI AR0,AR1
LDI AR0,AR2
LSH –1,RC
SUBI 3,RC

NOP *AR1++(IR0)B
NOP *AR2++(IR0)B
LDF *++AR0(IR1),R0
LDF *AR1,R1
CMPI AR1,AR0 ; Xchange locs only if AR0<AR1.
LDFGT R0,R1
LDFGT *AR1++(IR0)B,R1

RPTB BITRV2
LDF *++AR0(IR1),R0
|| STF R0,*AR0
LDF *AR1,R1
|| STF R1,*AR2++(IR0)B
CMPI AR1,AR0
LDFGT R0,R1
BITRV2: LDFGT *AR1++(IR0)B,R0

STF R0,*AR0
STF R1,*AR2

; Perform bit reversing on odd


; locations, 1st half only.

LDI @FFT_SIZE,RC
LSH –1,RC
LDI RC,IR0
LDI @DEST_ADDR,AR0
LDI AR0,AR1
ADDI 1,AR0


ADDI IR0,AR1
LSH –1,RC
LDI RC,IR0
SUBI 2,RC

LDF *AR0,R0
LDF *AR1,R1

RPTB BITRV3
LDF *++AR0(IR1),R0
|| STF R0,*AR1++(IR0)B
BITRV3: LDF *AR1,R1
|| STF R1,*–AR0(IR1)

STF R0,*AR1
STF R1,*AR0

BR START

;
; Check data source locations.
;
; If SourceAddr = DestAddr, then
; do nothing.
; If SourceAddr <> DestAddr, then move
; data.
;

MOVE_DATA: LDI @SOURCE_ADDR,R0


CMPI @DEST_ADDR,R0
BEQ START

LDI @FFT_SIZE,R0
SUBI 2,R0
LDI @SOURCE_ADDR,AR0
LDI @DEST_ADDR,AR1

LDF *AR0++,R1

RPTS R0
LDF *AR0++,R1
|| STF R1,*AR1++

STF R1,*AR1


;
; Perform first and second FFT loops.
;
;     AR1   I1   0   [X(I1) + X(I2)] + [X(I3) + X(I4)]
;     AR2   I2   1   [X(I1) - X(I2)]
;     AR3   I3   2   [X(I1) + X(I2)] - [X(I3) + X(I4)]
;     AR4   I4   3   -[X(I3) - X(I4)]
;     AR1        4
;

START: LDI @DEST_ADDR,AR1


LDI AR1,AR2
LDI AR1,AR3
LDI AR1,AR4
ADDI 1,AR2
ADDI 2,AR3
ADDI 3,AR4
LDI 4,IR0
LDI @FFT_SIZE,RC
LSH –2,RC
SUBI 2,RC
LDF *AR2,R0 ; R0 = X(I2)
|| LDF *AR3,R1 ; R1 = X(I3)
ADDF3 R1,*AR4,R4 ; R4 = X(I3) + X(I4)
SUBF3 R1,*AR4++(IR0),R5 ; R5 = –[X(I3) – X(I4)]
SUBF3 R0,*AR1,R6 ; R6 = X(I1) – X(I2)
ADDF3 R0,*AR1++(IR0),R7 ; R7 = X(I1) + X(I2)
ADDF3 R7,R4,R2 ; R2 = R7 + R4
SUBF3 R4,R7,R3 ; R3 = R7 – R4
;
RPTB LOOP1_2 ;
LDF *+AR2(IR0),R0 ;
|| LDF *+AR3(IR0),R1 ;
ADDF3 R1,*AR4,R4 ;
|| STF R3,*AR3++(IR0) ; X(I3)
SUBF3 R1,*AR4++(IR0),R5 ;
|| STF R5,*–AR4(IR0) ; X(I4)
SUBF3 R0,*AR1,R6 ;
|| STF R6,*AR2++(IR0) ; X(I2)
ADDF3 R0,*AR1++(IR0),R7 ;
|| STF R2,*–AR1(IR0) ; X(I1)
ADDF3 R7,R4,R2
LOOP1_2: SUBF3 R4,R7,R3
STF R3,*AR3
|| STF R5,*–AR4(IR0)
STF R6,*AR2
|| STF R2,*–AR1(IR0)


;
; Perform third FFT loop.
;
; Part A:
;
;     AR1   I1   0   X(I1) + X(I3)
;                1
;           I2   2
;                3
;     AR2   I3   4   X(I1) - X(I3)
;                5
;     AR3   I4   6   -X(I4)
;                7
;     AR1        8
;                9
;

LDI @DEST_ADDR,AR1
LDI AR1,AR2
LDI AR1,AR3
ADDI 4,AR2
ADDI 6,AR3
LDI 8,IR0
LDI @FFT_SIZE,RC
LSH –3,RC
SUBI 2,RC

SUBF3 *AR2,*AR1,R1
ADDF3 *AR2,*AR1,R2
NEGF *AR3,R3

RPTB LOOP3_A
LDF *+AR2(IR0),R0 ; R0 = X(I3)
|| STF R2,*AR1++(IR0)
SUBF3 R0,*AR1,R1 ; R1 = X(I1) – X(I3)
|| STF R1,*AR2++(IR0) ;
ADDF3 R0,*AR1,R2 ; R2 = X(I1) + X(I3)
|| STF R3,*AR3++(IR0) ;
LOOP3_A: NEGF *AR3,R3 ; R3 = –X(I4)
;
STF R2,*AR1 ; X(I1)
STF R1,*AR2 ; X(I3)
STF R3,*AR3 ; X(I4)


;
; Part B:
;
;
;
; 0
; AR0 I1 1 X[I1] + [X(I3)*COS+ X(I4)*COS]
; 2
; AR1 I2 3 X[I1] – [X(I3)*COS+ X(I4)*COS]
; 4
; AR2 I3 5 –X[I2] – [X(I3)*COS– X(I4)*COS]
; 6
; AR3 I4 7 X[I2] – [X(I3)*COS– X(I4)*COS]
; 8
; AR0 9 NOTE: COS(2*pi/8) = SIN(2*pi/8)

LDI @FFT_SIZE,RC
LSH –3,RC
LDI RC,IR1
SUBI 3,RC
LDI 8,IR0
LDI @DEST_ADDR,AR0
LDI AR0,AR1
LDI AR0,AR2
LDI AR0,AR3
ADDI 1,AR0
ADDI 3,AR1
ADDI 5,AR2
ADDI 7,AR3
LDI @SINE_TABLE,AR7 ; Initialize table pointers.
LDF *++AR7(IR1),R7 ; R7 = COS(2*pi/8)
; *AR7 = COS(2*pi/8)
MPYF3 *AR7,*AR2,R0 ; R0 = X(I3)*COS
MPYF3 *AR3,R7,R1 ; R1 = X(I4)*COS
ADDF3 R0,R1,R2 ; R2 = [X(I3)*COS + X(I4)*COS]
MPYF3 *AR7,*+AR2(IR0),R0
|| SUBF3 R0,R1,R3 ; R3 = –[X(I3)*COS – X(I4)*COS]
SUBF3 *AR1,R3,R4 ; R4 = –X(I2) + R3
ADDF3 *AR1,R3,R4 ; R4 = X(I2) + R3
|| STF R4,*AR2++(IR0) ; X(I3)
SUBF3 R2,*AR0,R4 ; R4 = X(I1) – R2
|| STF R4,*AR3++(IR0) ; X(I4)
ADDF3 *AR0,R2,R4 ; R4 = X(I1) + R2
|| STF R4,*AR1++(IR0) ; X(I2)
;
RPTB LOOP3_B ;
MPYF3 *AR3,R7,R1 ;
|| STF R4,*AR0++(IR0) ; X(I1)
ADDF3 R0,R1,R2
MPYF3 *AR7,*+AR2(IR0),R0


|| SUBF3 R0,R1,R3
SUBF3 *AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2++(IR0)
SUBF3 R2,*AR0,R4
|| STF R4,*AR3++(IR0)
LOOP3_B: ADDF3 *AR0,R2,R4
|| STF R4,*AR1++(IR0)
MPYF3 *AR3,R7,R1
|| STF R4,*AR0++(IR0)
ADDF3 R0,R1,R2
SUBF3 R0,R1,R3
SUBF3 *AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2
SUBF3 R2,*AR0,R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
|| STF R4,*AR1
STF R4,*AR0


;
; Perform fourth FFT loop.
;
; Part A:
;
; AR1 I1 0 X(I1) + X(I3)
; 1
; 2
; 3
; I2 4
; 5
; 6
; 7
; AR2 I3 8 X(I1) – X(I3)
; 9
; 10
; 11
; AR3 I4 12 –X(I4)
; 13
; 14
; 15
; AR1 I5 16
; 17
;
;
;
;
LDI @DEST_ADDR,AR1
LDI AR1,AR2
LDI AR1,AR3
ADDI 8,AR2
ADDI 12,AR3
LDI 16,IR0
LDI @FFT_SIZE,RC
LSH –4,RC
SUBI 2,RC
SUBF3 *AR2,*AR1,R1
ADDF3 *AR2,*AR1,R2
NEGF *AR3,R3
RPTB LOOP4_A
LDF *+AR2(IR0),R0 ; R0 = X(I3)
|| STF R2,*AR1++(IR0)
SUBF3 R0,*AR1,R1 ; R1 = X(I1) – X(I3)
|| STF R1,*AR2++(IR0) ;
ADDF3 R0,*AR1,R2 ; R2 = X(I1) + X(I3)
|| STF R3,*AR3++(IR0) ;
LOOP4_A: NEGF *AR3,R3 ; R3 = –X(I4)
;
STF R2,*AR1 ; X(I1)
|| STF R1,*AR2 ; X(I3)
STF R3,*AR3 ; X(I4)


;
; Part B:
;
; 0
; AR0 I1 (3rd) 1 X[I1] + [X(I3)*COS+ X(I4)*SIN]
; I1 (2nd) 2 .
; I1 (1st) 3 .
; 4
; I2 (1st) 5 .
; I2 (2nd) 6 .
; AR1 I2 (3rd) 7 X[I1] – [X(I3)*COS+ X(I4)*SIN]
; 8
; AR2 I3 (3rd) 9 -X[I2] - [X(I3)*SIN - X(I4)*COS]
; I3 (2nd) 10 .
; AR4 I3 (1st) 11 .
; 12
; I4 (1st) 13 .
; I4 (2nd) 14 .
; AR3 I4 (3rd) 15 X[I2] – [X(I3)*SIN– X(I4)*COS]
; 16
; AR0 17
;
;

LDI @FFT_SIZE,RC
LSH –4,RC
LDI RC,IR1
LDI 2,IR0
SUBI 3,RC
LDI @DEST_ADDR,AR0
LDI AR0,AR1
LDI AR0,AR2
LDI AR0,AR3
LDI AR0,AR4
ADDI 1,AR0
ADDI 7,AR1
ADDI 9,AR2
ADDI 15,AR3
ADDI 11,AR4

LDI @SINE_TABLE,AR7
LDF *++AR7(IR1),R7 ; R7 = SIN(1*[2*pi/16])
; *AR7 = COS(3*[2*pi/16])
LDI AR7,AR6
LDF *++AR6(IR1),R6 ; R6 = SIN(2*[2*pi/16])
; *AR6 = COS(2*[2*pi/16])
LDI AR6,AR5
LDF *++AR5(IR1),R5 ; R5 = SIN(3*[2*pi/16])
; *AR5 = COS(1*[2*pi/16])

LDI 16,IR1


MPYF3 *AR7,*AR4,R0 ; R0 = X(I3)*COS(3)


MPYF3 *++AR2(IR0),R5,R4 ; R4 = X(I3)*SIN(3)
MPYF3 *– –AR3(IR0),R5,R1 ; R1 = X(I4)*SIN(3)
MPYF3 *AR7,*AR3,R0 ; R0 = X(I4)*COS(3)
|| ADDF3 R0,R1,R2 ; R2 = [X(I3)*COS + X(I4)*SIN]
MPYF3 *AR6,*–AR4,R0
|| SUBF3 R4,R0,R3 ; R3 = –[X(I3)*SIN – X(I4)*COS]
SUBF3 *--AR1(IR0),R3,R4 ; R4 = -X(I2) + R3
ADDF3 *AR1,R3,R4 ; R4 = X(I2) + R3
|| STF R4,*AR2-- ; X(I3)
SUBF3 R2,*++AR0(IR0),R4 ; R4 = X(I1) - R2
|| STF R4,*AR3 ; X(I4)
ADDF3 *AR0,R2,R4 ; R4 = X(I1) + R2
|| STF R4,*AR1 ; X(I2)
;
MPYF3 *++AR3,R6,R1 ;
|| STF R4,*AR0 ; X(I1)
ADDF3 R0,R1,R2
MPYF3 *AR5,*–AR4(IR0),R0
|| SUBF3 R0,R1,R3
SUBF3 *++AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2
SUBF3 R2,*--AR0,R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
|| STF R4,*AR1
MPYF3 *--AR2,R7,R4
|| STF R4,*AR0
MPYF3 *++AR3,R7,R1
MPYF3 *AR5,*AR3,R0
|| ADDF3 R0,R1,R2
MPYF3 *AR7,*++AR4(IR1),R0
|| SUBF3 R4,R0,R3
SUBF3 *++AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2++(IR1)
SUBF3 R2,*--AR0,R4
|| STF R4,*AR3++(IR1)
ADDF3 *AR0,R2,R4
|| STF R4,*AR1++(IR1)
RPTB LOOP4_B
MPYF3 *++AR2(IR0),R5,R4
|| STF R4,*AR0++(IR1)
MPYF3 *– –AR3(IR0),R5,R1
MPYF3 *AR7,*AR3,R0
|| ADDF3 R0,R1,R2
MPYF3 *AR6,*–AR4,R0
|| SUBF3 R4,R0,R3
SUBF3 *– –AR1(IR0),R3,R4
ADDF3 *AR1,R3,R4


|| STF R4,*AR2––
SUBF3 R2,*++AR0(IR0),R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
|| STF R4,*AR1

MPYF3 *++AR3,R6,R1
|| STF R4,*AR0
ADDF3 R0,R1,R2
MPYF3 *AR5,*–AR4(IR0),R0
|| SUBF3 R0,R1,R3
SUBF3 *++AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2
SUBF3 R2,*– –AR0,R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
|| STF R4,*AR1

MPYF3 *– –AR2,R7,R4
|| STF R4,*AR0
MPYF3 *++AR3,R7,R1
MPYF3 *AR5,*AR3,R0
|| ADDF3 R0,R1,R2
MPYF3 *AR7,*++AR4(IR1),R0
|| SUBF3 R4,R0,R3
SUBF3 *++AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2++(IR1)
SUBF3 R2,*– –AR0,R4
|| STF R4,*AR3++(IR1)
LOOP4_B: ADDF3 *AR0,R2,R4
|| STF R4,*AR1++(IR1)

MPYF3 *++AR2(IR0),R5,R4
|| STF R4,*AR0++(IR1)
MPYF3 *– –AR3(IR0),R5,R1
MPYF3 *AR7,*AR3,R0
|| ADDF3 R0,R1,R2
MPYF3 *AR6,*–AR4,R0
|| SUBF3 R4,R0,R3
SUBF3 *– –AR1(IR0),R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2– –
SUBF3 R2,*++AR0(IR0),R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
|| STF R4,*AR1


MPYF3 *++AR3,R6,R1
|| STF R4,*AR0
ADDF3 R0,R1,R2
MPYF3 *AR5,*–AR4(IR0),R0
|| SUBF3 R0,R1,R3
SUBF3 *++AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2
SUBF3 R2,*– –AR0,R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
|| STF R4,*AR1

MPYF3 *– –AR2,R7,R4
|| STF R4,*AR0
MPYF3 *++AR3,R7,R1
MPYF3 *AR5,*AR3,R0
|| ADDF3 R0,R1,R2
SUBF3 R4,R0,R3
SUBF3 *++AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2
SUBF3 R2,*– –AR0,R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
|| STF R4,*AR1

STF R4,*AR0


;
; Perform remaining FFT loops (loop 4 onwards).
;
; LOOP
; 1st 2nd
;
; X’(I1) 0 0 X’(I1)+ X’(I3)
; AR1 X(I1) (1st) 1 1 X(I1) + [X(I3)*COS + X(I4)*SIN]
; X(I1) (2nd) 2 2 .
; X(I1) (3rd) 3 3 .
; .
; .
; A
; X’(I2) 8 16
; B .
; .
;
; X(I2) (3rd) 13 29 .
; X(I2) (2nd) 14 30 .
; AR2 X(I2) (1st) 15 31 X[I1] – [X(I3)*COS + X(I4)*SIN]
; X’(I3) 16 32 X’(I1)– X’(I3)
; AR3 X(I3) (1st) 17 33 –X[I2]– [X(I3)*SIN – X(I4)*COS]
; X(I3) (2nd) 18 34 .
; X(I3) (3rd) 19 35 .
; .
; .
; C
; X’(I4) 24 48 –X’(I4)
; D .
; .
;
; X(I4) (3rd) 29 61 .
; X(I4) (2nd) 30 62 .
; AR4 X(I4) (1st) 31 63 X[I2] – [X(I3)*SIN – X(I4)*COS]
; 32 64
; AR1 33 65
;

LDI @FFT_SIZE,IR0
LSH –2,IR0
STI IR0,@SEPARATION
LSH –2,IR0
LDI 5,R5
LDI 3,R7
LDI 16,R6
LDI @DEST_ADDR,AR5
LDI @DEST_ADDR,AR1
LSH –1,IR0
LSH 1,R7
LOOP: ADDI 1,R7
LSH 1,R6
LDI AR1,AR4


ADDI R7,AR1 ; AR1 points at A.


LDI AR1,AR2
ADDI 2,AR2 ; AR2 points at B.
ADDI R6,AR4
SUBI R7,AR4 ; AR4 points at D.
LDI AR4,AR3
SUBI 2,AR3 ; AR3 points at C.
LDI @SINE_TABLE,AR0 ; AR0 points at SIN/COS table.
LDI R7,IR1
LDI R7,RC
INLOP: ADDF3 *– –AR1(IR1),*++AR2(IR1),R0; R0 = X’(I1) + X’(I3)
SUBF3 *– –AR3(IR1),*AR1++,R1 ; R1 = X’(I1) – X’(I3)
NEGF *– –AR4,R2 ; R2 = –X’(I4)
|| STF R0,*–AR1 ; X’(I1)
STF R1,*AR2– – ; X’(I3)
|| STF R2,*AR4++(IR1) ; X’(I4)
LDI @SEPARATION,IR1 ; IR1 = separation between SIN/COS tables
SUBI 3,RC
MPYF3 *++AR0(IR0),*AR4,R4 ; R4 = X(I4)*SIN
MPYF3 *AR0,*++AR3,R1 ; R1 = X(I3)*SIN
MPYF3 *++AR0(IR1),*AR4,R0 ; R0 = X(I4)*COS
MPYF3 *AR0,*AR3,R0 ; R0 = X(I3)*COS
|| SUBF3 R1,R0,R3 ; R3 = –[X(I3)*SIN – X(I4)*COS]
MPYF3 *++AR0(IR0),*–AR4,R0
|| ADDF3 R0,R4,R2 ; R2 = X(I3)*COS + X(I4)*SIN
SUBF3 *AR2,R3,R4 ; R4 = R3 – X(I2)
ADDF3 *AR2,R3,R4 ; R4 = R3 + X(I2)
|| STF R4,*AR3++ ; X(I3)
SUBF3 R2,*AR1,R4 ; R4 = X(I1) – R2
|| STF R4,*AR4– – ; X(I4)
ADDF3 *AR1,R2,R4 ; R4 = X(I1) + R2
|| STF R4,*AR2– – ; X(I2)
;
RPTB IN_BLK ;
LDF *–AR0(IR1),R3 ;
MPYF3 *AR4,R3,R4 ;
|| STF R4,*AR1++ ; X(I1)
MPYF3 *AR3,R3,R1
MPYF3 *AR0,*AR3,R0
|| SUBF3 R1,R0,R3
MPYF3 *++AR0(IR0),*–AR4,R0
|| ADDF3 R0,R4,R2
SUBF3 *AR2,R3,R4
ADDF3 *AR2,R3,R4
|| STF R4,*AR3++
SUBF3 R2,*AR1,R4
|| STF R4,*AR4– –
IN_BLK: ADDF3 *AR1,R2,R4
|| STF R4,*AR2– –


LDF *–AR0(IR1),R3
MPYF3 *AR4,R3,R4
|| STF R4,*AR1++
MPYF3 *AR3,R3,R1
MPYF3 *AR0,*AR3,R0
|| SUBF3 R1,R0,R3
LDI R6,IR1
ADDF3 R0,R4,R2
SUBF3 *AR2,R3,R4
ADDF3 *AR2,R3,R4
|| STF R4,*AR3++(IR1)
SUBF3 R2,*AR1,R4
|| STF R4,*AR4++(IR1)
ADDF3 *AR1,R2,R4
|| STF R4,*AR2++(IR1)

STF R4,*AR1++(IR1)

SUBI3 AR5,AR1,R0
CMPI @FFT_SIZE,R0
BLTD INLOP ; Loop back to the inner loop
LDI @SINE_TABLE,AR0 ; AR0 points to SIN/COS table
LDI R7,IR1
LDI R7,RC

ADDI 1,R5
CMPI @LOG_SIZE,R5
BLED LOOP
LDI @DEST_ADDR,AR1
LSH –1,IR0
LSH 1,R7


;
; Return to C environment.
;

POP DP ; Restore C environment variables.
POP AR7
POP AR6
POP AR5
POP AR4
POPF R7
POP R7
POPF R6
POP R6
POP R5
POP R4
POP FP
RETS

.end

*
* No more.
*
*****************************************************************************


Example 11–39. Real Inverse, Radix-2 FFT


* Real Inverse FFT
*****************************************************************************
*
* FILENAME : ifft_rl.asm
*
* WRITTEN BY : Daniel Mazzocco
* Texas Instruments, Houston
*
* DATE : 18th Feb 1992
*
* VERSION : 1.0
*

*****************************************************************************
* VER DATE COMMENTS
* ––– –––––––––––– –––––––––––––––––––––––––––––––––––––––––––––––––––
* 1.0 18th Feb 92 Original release. Started from forward real FFT
* routine written by Alex Tessarolo, rev 2.0 .
*

*****************************************************************************
*
* SYNOPSIS: int ifft_rl( FFT_SIZE, LOG_SIZE, SOURCE_ADDR,
* DEST_ADDR, SINE_TABLE, BIT_REVERSE );
*
* int FFT_SIZE ; 64, 128, 256, 512, 1024, ...
* int LOG_SIZE ; 6, 7, 8, 9, 10, ...
* float *SOURCE_ADDR ; Points to where data is originated
* ; and operated on.
* float *DEST_ADDR ; Points to where data will be stored.
* float *SINE_TABLE ; Points to the SIN/COS table.
* int BIT_REVERSE ; = 0, bit reversing is disabled.
* ; <> 0, bit reversing is enabled.
*
* NOTE: 1) If SOURCE_ADDR = DEST_ADDR, then in place bit
* reversing is performed, if enabled (more
* processor intensive).
* 2) FFT_SIZE must be >= 64 (this is not checked).
*


* DESCRIPTION: Generic function to do an inverse radix–2 FFT computation
* on the C30.
* The data array is FFT_SIZE long with real and imaginary
* points R and I as follows:
*
* SOURCE_ADDR[0] R(0)
* R(1)
* R(2)
* R(3)
* .
* .
* R(FFT_SIZE/2)
* I(FFT_SIZE/2 – 1)
* .
* .
* I(2)
* SOURCE_ADDR[FFT_SIZE–1] I(1)
*
* The output data array will contain only real values.
* Bit reversal is optionally implemented at the end
* of the function.
*
* The sine/cosine table for the twiddle factors is expected
* to be supplied in the following format:
*
* SINE_TABLE[0] sin(0*2*pi/FFT_SIZE)
* sin(1*2*pi/FFT_SIZE)
* .
* .
* sin((FFT_SIZE/2–2)*2*pi/FFT_SIZE)
* SINE_TABLE[FFT_SIZE/2–1] sin((FFT_SIZE/2–1)*2*pi/FFT_SIZE)
*
* NOTE: The table is the first half period of a sine wave.
*
* Stack structure upon call:
*
*
* –FP(7) BIT_REVERSE
* –FP(6) SINE_TABLE
* –FP(5) DEST_ADDR
* –FP(4) SOURCE_ADDR
* –FP(3) LOG_SIZE
* –FP(2) FFT_SIZE
* –FP(1) return addr
* –FP(0) old FP
*
*****************************************************************************


* NOTE: Calling C program can be compiled using either large
* or small model.
*
* WARNING: DP initialized only once in the program. Be wary
* with interrupt service routines. Make sure interrupt
* service routines save the DP pointer.
*
* WARNING: The SOURCE_ADDR must be aligned such that the first
* LOG_SIZE bits are zero (this is not checked by the
* program).
*

*****************************************************************************
*
* REGISTERS USED: R0, R1, R2, R3, R4, R5, R6, R7
* AR0, AR1, AR2, AR3, AR4, AR5, AR6, AR7
* IR0, IR1
* RC, RS, RE
* DP
*
* MEMORY REQUIREMENTS: Program = 322 words (approximately)
* Data = 7 words
* Stack = 12 words
*
*****************************************************************************

*
* BENCHMARKS: Assumptions – Program in RAM0
* – Reserved data in RAM0
* – Stack on primary/expansion bus RAM
* – Sine/cosine tables in RAM0
* – Processing and data destination in RAM1
* – Primary/expansion bus RAM, 0 wait state
*
* FFT Size Bit Reversing Data Source Cycles(C30)
* –––––––– ––––––––––––– ––––––––––– –––––––––––
* 1024 OFF RAM1 25892 approx.
* Note: This number does not include the C callable overheads.
* Add 57 cycles for these overheads.
*****************************************************************************

FP .set AR3

.global _ifft_rl ; Entry execution point.

FFT_SIZE: .usect ".ifftdata",1 ; Reserve memory for arguments.
LOG_SIZE: .usect ".ifftdata",1
SOURCE_ADDR: .usect ".ifftdata",1
DEST_ADDR: .usect ".ifftdata",1
SINE_TABLE: .usect ".ifftdata",1
BIT_REVERSE: .usect ".ifftdata",1
SEPARATION: .usect ".ifftdata",1


;
; Initialize C Function.
;

.sect ".iffttext"

_ifft_rl: PUSH FP ; Preserve C environment.


LDI SP,FP
PUSH R4
PUSH R5
PUSH R6
PUSHF R6
PUSH R7
PUSHF R7
PUSH AR4
PUSH AR5
PUSH AR6
PUSH AR7
PUSH DP

LDP FFT_SIZE ; Initialize DP pointer.

LDI *–FP(2),R0 ; Move arguments from stack.


STI R0,@FFT_SIZE
LDI *–FP(3),R0
STI R0,@LOG_SIZE
LDI *–FP(4),R0
STI R0,@SOURCE_ADDR
LDI *–FP(5),R0
STI R0,@DEST_ADDR
LDI *–FP(6),R0
STI R0,@SINE_TABLE
LDI *–FP(7),R0
STI R0,@BIT_REVERSE


;
; Perform last FFT loops first (loop 2 onwards).
;
; LOOP
; 1st 2nd
;
; X’(I1) 0 0 X’(I1)+ X’(I3)
; AR1 X(I1) (1st) 1 1 X(I1) + [X(I2)
; X(I1) (2nd) 2 2 .
; X(I1) (3rd) 3 3 .
; .
; .
; A
; X’(I2) 8 16 X’(I2)* 2
; B .
; .
;
; X(I2) (3rd) 13 29 .
; X(I2) (2nd) 14 30 .
; AR2 X(I2) (1st) 15 31 X[I4] – [X(I3)
; X’(I3) 16 32 X’(I1)– X’(I3)
; AR3 X(I3) (1st) 17 33 [X(I1)–X(I2)]*COS–[X(I3)+X(I4)]*SIN
; X(I3) (2nd) 18 34 .
; X(I3) (3rd) 19 35 .
; .
; .
; C
; X’(I4) 24 48 –X’(I4)* 2
; D .
; .
;
; X(I4) (3rd) 29 61 .
; X(I4) (2nd) 30 62 .
; AR4 X(I4) (1st) 31 63 [X(I1)–X(I2)]*SIN+[X(I3)+X(I4)]*COS
; 32 64
; AR1 33 65
;
;

LDI 1,IR0 ; Step between two consecutive sines


LDI 4,R5 ; Stage number from 4 to M.
LDI @FFT_SIZE,R7
LSH –2,R7 ; R7 is FFT_SIZE/4–1 (ie 15 for 64 pts)
SUBI 1,R7 ; and will be used to point at A & D.
LDI @FFT_SIZE,R6 ; R6 will be used to point at D.
LSH 1,R6
LDI @SOURCE_ADDR,AR5
LDI @SOURCE_ADDR,AR1

LOOP: LSH –1,R6 ; R6 is FFT_SIZE at the 1st loop.


LDI AR1,AR4
ADDI R7,AR1 ; AR1 points at A.


LDI AR1,AR2
ADDI 2,AR2 ; AR2 points at B.
ADDI R6,AR4
SUBI R7,AR4 ; AR4 points at D.
LDI AR4,AR3
SUBI 2,AR3 ; AR3 points at C.

LDI R7,IR1
LDI R7,RC

INLOP: ADDF3 *– –AR1(IR1),*– –AR3(IR1),R0 ; R0 = X’(I1) + X’(I3)
SUBF3 *AR3,*AR1,R1 ; R1 = X’(I1) – X’(I3)
LDF *– –AR4,R2
|| STF R0,*AR1++ ; X’(I1)
MPYF –2.0,R2 ; R2 = –2*X’(I4)
LDF *– –AR2,R3
|| STF R1,*AR3++ ; X’(I3)
MPYF 2.0,R3 ; R3 = 2*X’(I2)
STF R3,*AR2++(IR1) ; X’(I2)
|| STF R2,*AR4++(IR1) ; X’(I4)

LDI @FFT_SIZE,IR1 ; IR1 = separation between SIN/COS tables
LDI @SINE_TABLE,AR0; AR0 points at SIN/COS table.
LSH –2,IR1
SUBI 3,RC

SUBF3 *AR2,*AR1,R3 ; R3 = X(I1)–X(I2)


ADDF3 *AR1,*AR2,R2 ; R2 = X(I1)+X(I2)
MPYF3 R3,*++AR0(IR0),R1; R1 = R3*SIN
LDF *AR4,R4 ; R4 = X(I4)
MPYF3 R3,*++AR0(IR1),R0; R0 = R3*COS
|| SUBF3 *AR3,R4,R3 ; R3 = X(I4)–X(I3)
ADDF3 R4,*AR3,R2 ; R2 = X(I3)+X(I4)
|| STF R2,*AR1++ ; X(I1)
MPYF3 R2,*AR0– –(IR1),R4; R4 = R2*COS
|| STF R3,*AR2–– ; X(I2)
ADDF3 R4,R1,R3 ; R3 = R3*SIN + R2*COS
MPYF3 R2,*AR0,R1 ; R1 = R2*SIN
|| STF R3,*AR4– – ; X(I4)
SUBF3 R1,R0,R4 ; R4 = R3*COS – R2*SIN

RPTB IN_BLK


SUBF3 *AR2,*AR1,R3 ; R3 = X(I1)–X(I2)


ADDF3 *AR1,*AR2,R2 ; R2 = X(I1)+X(I2)
MPYF3 R3,*++AR0(IR0),R1; R1 = R3*SIN
|| STF R4,*AR3++ ; X(I3)
LDF *AR4,R4 ; R4 = X(I4)
MPYF3 R3,*++AR0(IR1),R0; R0 = R3*COS
|| SUBF3 *AR3,R4,R3 ; R3 = X(I4)–X(I3)
ADDF3 R4,*AR3,R2 ; R2 = X(I3)+X(I4)
|| STF R2,*AR1++ ; X(I1)
MPYF3 R2,*AR0– –(IR1),R4; R4 = R2*COS
|| STF R3,*AR2– – ; X(I2)
ADDF3 R4,R1,R3 ; R3 = R3*SIN + R2*COS
MPYF3 R2,*AR0,R1 ; R1 = R2*SIN
|| STF R3,*AR4– – ; X(I4)
IN_BLK: SUBF3 R1,R0,R4 ; R4 = R3*COS – R2*SIN

SUBF3 *AR2,*AR1,R3 ; R3 = X(I1)–X(I2)


ADDF3 *AR1,*AR2,R2 ; R2 = X(I1)+X(I2)
MPYF3 R3,*++AR0(IR0),R1; R1 = R3*SIN
|| STF R4,*AR3++ ; X(I3)
LDF *AR4,R4 ; R4 = X(I4)
MPYF3 R3,*++AR0(IR1),R0; R0 = R3*COS
|| SUBF3 *AR3,R4,R3 ; R3 = X(I4)–X(I3)
ADDF3 R4,*AR3,R2 ; R2 = X(I3)+X(I4)
|| STF R2,*AR1 ; X(I1)
MPYF3 R2,*AR0– –(IR1),R4; R4 = R2*COS
|| STF R3,*AR2 ; X(I2)
LDI R6,IR1 ; Get prepared for the next
ADDF3 R4,R1,R3 ; R3 = R3*SIN + R2*COS
MPYF3 R2,*AR0,R1 ; R1 = R2*SIN
|| STF R3,*AR4++(IR1) ; X(I4)
SUBF3 R1,R0,R4 ; R4 = R3*COS – R2*SIN
NEGF *AR1++(IR1),R2 ; Dummy
|| STF R4,*AR3++(IR1) ; X(I3)

SUBI3 AR5,AR1,R0
CMPI @FFT_SIZE,R0
BLTD INLOP ; Loop back to the inner loop
NOP *AR2++(IR1) ; Dummy
LDI R7,IR1
LDI R7,RC

ADDI 1,R5
CMPI @LOG_SIZE,R5 ; Next stage if any left
BLED LOOP
LDI @SOURCE_ADDR,AR1
LSH 1,IR0 ; Double step in sine table
LSH –1,R7


;
; Perform third FFT loop.

;Part A:
; AR1 I1 0 X (I1) + X(I3)
;
; 1
; AR2 I2 2 2 * X(I2)
;
; 3
; AR3 I3 4 X (I1) – X(I3)
;
; 5
; AR4 I4 6 –2 * X(I4)
;
; 7
; AR1 8
;
; 9
;
;
;
;
;

LDI @SOURCE_ADDR,AR1
LDI AR1,AR2
LDI AR1,AR3
LDI AR1,AR4
ADDI 2,AR2
ADDI 4,AR3
ADDI 6,AR4
LDI 8,IR0
LDI @FFT_SIZE,RC
LSH –3,RC
SUBI 1,RC
LDI @SINE_TABLE,AR0 ; AR0 points at SIN/COS table.

RPTB LOOP3_A
LDF *AR3,R3
ADDF3 R3,*AR1,R0 ; R0 = X’(I1) + X’(I3)
SUBF3 R3,*AR1,R1 ; R1 = X’(I1) – X’(I3)
LDF *AR4,R2 ;
|| STF R0,*AR1++(IR0) ; X’(I1)
MPYF –2.0,R2 ; R2 = –2*X’(I4)
LDF *AR2,R3 ;
|| STF R1,*AR3++(IR0) ; X’(I3)
MPYF 2.0,R3 ; R3 = 2*X’(I2)
LOOP3_A: STF R3,*AR2++(IR0) ; X’(I2)
|| STF R2,*AR4++(IR0) ; X’(I4)


;
; Part B:
;
; 0
; AR1 I1 1 X(I1) + X(I2)
; 2
; AR2 I2 3 X(I1) – X(I3)
; 4
; AR3 I3 5 [X(I1)– X(I2)]*COS– [X(I3)+ X(I4)]*SIN
; 6
; AR4 I4 7 [X(I1)– X(I2)]*SIN+ [X(I3)+ X(I4)]*COS]
; 8
; AR1 9 NOTE: COS(2*pi/8) = SIN(2*pi/8)
;
;
;

LDI @SOURCE_ADDR,AR1
LDI AR1,AR2
LDI AR1,AR3
LDI AR1,AR4
ADDI 1,AR1
ADDI 3,AR2
ADDI 5,AR3
ADDI 7,AR4
LDI @SINE_TABLE,AR7 ; AR7 points at SIN/COS table.
LDI @FFT_SIZE,RC
LSH –3,RC
LDI RC,IR1
SUBI 2,RC


LDF *AR2,R6 ; R6 = X(I2)


LDF *AR3,R0 ; R0 = X(I3)
ADDF3 R6,*AR1,R5 ; R5 = X(I1)+X(I2)
SUBF3 R6,*AR1,R4 ; R4 = X(I1)–X(I2)
SUBF3 R0,R4,R3 ; R3 = X(I1)–X(I2)–X(I3)
ADDF3 R0,R4,R2 ; R2 = X(I1)–X(I2)+X(I3)
SUBF3 R0,*AR4,R1 ; R1 = X(I4)–X(I3)
|| STF R5,*AR1++(IR0) ; X(I1)
ADDF3 R2,*AR4,R5 ; R5 = X(I1)–X(I2)+X(I3)+X(I4)
|| STF R1,*AR2++(IR0) ; X(I2)
MPYF3 R5,*++AR7(IR1),R1 ; R1 = R5*SIN
|| SUBF3 *AR4,R3,R2 ; R2 = X(I1)–X(I2)–X(I3)–X(I4)
MPYF3 R2,*AR7,R0 ; R0 = R2*SIN
|| STF R1,*AR4++(IR0) ; X(I4)
;
RPTB LOOP3_B ;
;
LDF *AR2,R6 ; R6 = X(I2)
|| STF R0,*AR3++(IR0) ; X(I3)
ADDF3 R6,*AR1,R5 ; R5 = X(I1)+X(I2)
LDF *AR3,R0 ; R0 = X(I3)
SUBF3 R6,*AR1,R4 ; R4 = X(I1)–X(I2)
SUBF3 R0,R4,R3 ; R3 = X(I1)–X(I2)–X(I3)
ADDF3 R0,R4,R2 ; R2 = X(I1)–X(I2)+X(I3)
SUBF3 R0,*AR4,R1 ; R1 = X(I4)–X(I3)
|| STF R5,*AR1++(IR0) ; X(I1)
ADDF3 R2,*AR4,R5 ; R5 = X(I1)–X(I2)+X(I3)+X(I4)
|| STF R1,*AR2++(IR0) ; X(I2)
MPYF3 R5,*AR7,R1 ; R1 = R5*SIN
|| SUBF3 *AR4,R3,R2 ; R2 = X(I1)–X(I2)–X(I3)–X(I4)
LOOP3_B: MPYF3 R2,*AR7,R0 ; R0 = R2*SIN
|| STF R1,*AR4++(IR0) ; X(I4)

STF R0,*AR3 ; X(I3)


;
; Perform first and second FFT loops.
;
; AR1 I1 0 X(I1) + X(I3) + 2*X(I2)
; AR2 I2 1 X(I1) + X(I3) – 2*X(I2)
; AR3 I3 2 X(I1) – X(I3) – 2*X(I4)
; AR4 I4 3 X(I1) – X(I3) + 2*X(I4)
; AR1 4
;
;
;

LDI @SOURCE_ADDR,AR1
LDI AR1,AR2
LDI AR1,AR3
LDI AR1,AR4
ADDI 1,AR2
ADDI 2,AR3
ADDI 3,AR4
LDI 4,IR0
LDI @FFT_SIZE,RC
LSH –2,RC
SUBI 2,RC


LDF *AR4,R6 ; R6 = X(I4)


LDF *AR2,R7 ; R7 = X(I2)
|| LDF *AR1,R1 ; R1 = X(I1)
MPYF 2.0,R6 ; R6 = 2 * X(I4)
MPYF 2.0,R7 ; R7 = 2 * X(I2)
SUBF3 R6,*AR3,R5 ; R5 = X(I3) – 2*X(I4)
SUBF3 R5,R1,R4 ; R4 = X(I1)–X(I3)+2X(I4)
SUBF3 R7,*AR3,R5 ; R5 = X(I3) – 2*X(I2)
|| STF R4,*AR4++(IR0) ; X(I4)
ADDF3 R5,R1,R3 ; R3 = X(I1)+X(I3)–2X(I2)
ADDF3 R6,*AR3,R4 ; R4 = X(I3) + 2*X(I4)
|| STF R3,*AR2++(IR0) ; X(I2)
SUBF3 R4,R1,R4 ; R4 = X(I1)–X(I3)–2X(I4)
ADDF3 R7,*AR3,R0 ; R0 = X(I3) + 2*X(I2)
|| STF R4,*AR3++(IR0) ; X(I3)
ADDF3 R0,R1,R0 ; R0 = X(I1)+X(I3)+2X(I2)
;
RPTB LOOP1_2 ;
LDF *AR4,R6 ; R6 = X(I4)
|| STF R0,*AR1++(IR0) ; X(I1)
MPYF 2.0,R6 ; R6 = 2 * X(I4)
LDF *AR2,R7 ; R7 = X(I2)
|| LDF *AR1,R1 ; R1 = X(I1)
MPYF 2.0,R7 ; R7 = 2 * X(I2)
SUBF3 R6,*AR3,R5 ; R5 = X(I3) – 2*X(I4)
SUBF3 R5,R1,R4 ; R4 = X(I1)–X(I3)+2X(I4)
SUBF3 R7,*AR3,R5 ; R5 = X(I3) – 2*X(I2)
|| STF R4,*AR4++(IR0) ; X(I4)
ADDF3 R5,R1,R3 ; R3 = X(I1)+X(I3)–2X(I2)
ADDF3 R6,*AR3,R4 ; R4 = X(I3) + 2*X(I4)
|| STF R3,*AR2++(IR0) ; X(I2)
SUBF3 R4,R1,R4 ; R4 = X(I1)–X(I3)–2X(I4)
ADDF3 R7,*AR3,R0 ; R0 = X(I3) + 2*X(I2)
|| STF R4,*AR3++(IR0) ; X(I3)
LOOP1_2: ADDF3 R0,R1,R0 ; R0 = X(I1)+X(I3)+2X(I2)
;
STF R0,*AR1 ; LAST X(I1)


;
; Check bit reversing mode (on or off).
;
; BIT_REVERSE = 0, then OFF (no bit reversing).
; BIT_REVERSE <> 0, then ON.
;
LDI @BIT_REVERSE,R0
CMPI 0,R0
BZ MOVE_DATA

;
; Check bit reversing type.
;
; If SourceAddr = DestAddr, then in place bit reversing.
; If SourceAddr <> DestAddr, then standard bit reversing.
;

LDI @SOURCE_ADDR,R0
CMPI @DEST_ADDR,R0
BEQ IN_PLACE

;
; Bit reversing type 1 (from source to destination).
;
; NOTE: abs(SOURCE_ADDR – DEST_ADDR) must be > FFT_SIZE, this is not checked.
;

LDI @FFT_SIZE,R0
SUBI 2,R0
LDI @FFT_SIZE,IR0
LSH –1,IR0 ; IR0 = half FFT size.
LDI @SOURCE_ADDR,AR0
LDI @DEST_ADDR,AR1

LDF *AR0++,R1

RPTS R0
LDF *AR0++,R1
|| STF R1,*AR1++(IR0)B

STF R1,*AR1++(IR0)B

BR DIVISION


;
; In-place bit reversing.
;

; Bit reversing on even locations, 1st half only.

IN_PLACE: LDI @FFT_SIZE,IR0
LSH –2,IR0 ; IR0 = quarter FFT size.
LDI 2,IR1

LDI @FFT_SIZE,RC
LSH –2,RC
SUBI 3,RC
LDI @DEST_ADDR,AR0
LDI AR0,AR1
LDI AR0,AR2

NOP *AR1++(IR0)B
NOP *AR2++(IR0)B
LDF *++AR0(IR1),R0
LDF *AR1,R1
CMPI AR1,AR0 ; Xchange locations only if AR0<AR1.
LDFGT R0,R1
LDFGT *AR1++(IR0)B,R1

RPTB BITRV1
LDF *++AR0(IR1),R0
|| STF R0,*AR0
LDF *AR1,R1
|| STF R1,*AR2++(IR0)B
CMPI AR1,AR0
LDFGT R0,R1
BITRV1: LDFGT *AR1++(IR0)B,R0

STF R0,*AR0
STF R1,*AR2

; Perform bit reversing on odd locations, 2nd half only.

LDI @FFT_SIZE,RC
LSH –1,RC
LDI @DEST_ADDR,AR0
ADDI RC,AR0
ADDI 1,AR0
LDI AR0,AR1
LDI AR0,AR2
LSH –1,RC
SUBI 3,RC

NOP *AR1++(IR0)B
NOP *AR2++(IR0)B
LDF *++AR0(IR1),R0


LDF *AR1,R1
CMPI AR1,AR0 ; Xchange locations only if AR0<AR1.
LDFGT R0,R1
LDFGT *AR1++(IR0)B,R1

RPTB BITRV2
LDF *++AR0(IR1),R0
|| STF R0,*AR0
LDF *AR1,R1
|| STF R1,*AR2++(IR0)B
CMPI AR1,AR0
LDFGT R0,R1
BITRV2: LDFGT *AR1++(IR0)B,R0

STF R0,*AR0
STF R1,*AR2

; Perform bit reversing on odd locations, 1st half only.

LDI @FFT_SIZE,RC
LSH –1,RC
LDI RC,IR0
LDI @DEST_ADDR,AR0
LDI AR0,AR1
ADDI 1,AR0
ADDI IR0,AR1
LSH –1,RC
LDI RC,IR0
SUBI 2,RC

LDF *AR0,R0
LDF *AR1,R1

RPTB BITRV3
LDF *++AR0(IR1),R0
|| STF R0,*AR1++(IR0)B
BITRV3: LDF *AR1,R1
|| STF R1,*–AR0(IR1)

STF R0,*AR1
STF R1,*AR0

BR DIVISION


;
; Check data source locations.
;
; If SourceAddr = DestAddr, then do nothing.
; If SourceAddr <> DestAddr, then move data.
;

MOVE_DATA: LDI @SOURCE_ADDR,R0


CMPI @DEST_ADDR,R0
BEQ DIVISION

LDI @FFT_SIZE,R0
SUBI 2,R0
LDI @SOURCE_ADDR,AR0
LDI @DEST_ADDR,AR1

LDF *AR0++,R1

RPTS R0
LDF *AR0++,R1
|| STF R1,*AR1++

STF R1,*AR1

DIVISION: LDI 2,IR0


LDI @FFT_SIZE,R0
FLOAT R0 ; exp = LOG_SIZE
PUSHF R0 ; 32 MSB’S saved
POP R0
NEGI R0 ; Neg exponent
PUSH R0
POPF R0 ; R0 = 1/FFT_SIZE
LDI @DEST_ADDR,AR1
LDI @DEST_ADDR,AR2
NOP *AR2++
LDI @FFT_SIZE,RC
LSH –1,RC
SUBI 2,RC
MPYF3 R0,*AR1,R1 ; 1st location
RPTB LAST_LOOP
MPYF3 R0,*AR2,R2 ; 2nd,4th,6th,... location
|| STF R1,*AR1++(IR0)
LAST_LOOP: MPYF3 R0,*AR1,R1 ; 3rd,5th,7th,... location
|| STF R2,*AR2++(IR0)

MPYF3 R0,*AR2,R2 ; Last location


|| STF R1,*AR1
STF R2,*AR2


; Return to C environment.
;

POP DP ; Restore C environment variables.


POP AR7
POP AR6
POP AR5
POP AR4
POPF R7
POP R7
POPF R6
POP R6
POP R5
POP R4
POP FP
RETS

.end

*
* No more.
*
*****************************************************************************
*
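
The header of Example 11–39 fixes the input layout (R(0) through R(FFT_SIZE/2), followed by I(FFT_SIZE/2 – 1) down to I(1)), and the end of the listing performs an optional source-to-destination bit reversal and a scaling by 1/FFT_SIZE. The following host-side C sketch models those three steps so that test vectors can be prepared and checked off-target. It is only a reference model under stated assumptions: the function names are hypothetical and not part of the TI routine, the bit reversal and scaling are fused into one pass for brevity, and only the case SOURCE_ADDR <> DEST_ADDR is modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Pack bins 0..n/2 of a conjugate-symmetric spectrum into the
 * SOURCE_ADDR layout described in the listing header:
 *   src[0 .. n/2]     = R(0) .. R(n/2)
 *   src[n/2+1 .. n-1] = I(n/2-1) .. I(1)
 * I(0) and I(n/2) are not stored in this layout. */
void pack_real_spectrum(float *src, const float *re, const float *im, size_t n)
{
    size_t half = n / 2;
    size_t k;

    for (k = 0; k <= half; k++)
        src[k] = re[k];
    for (k = 1; k < half; k++)
        src[n - k] = im[k];
}

/* Reverse the low log_size bits of index i. */
static size_t bit_reverse(size_t i, unsigned log_size)
{
    size_t r = 0;
    unsigned b;

    for (b = 0; b < log_size; b++) {
        r = (r << 1) | (i & 1u);
        i >>= 1;
    }
    return r;
}

/* Model of "bit reversing type 1" plus the DIVISION stage: read the
 * source sequentially, write the destination in bit-reversed order,
 * and scale by 1/n. Assumes dest and src do not overlap. */
void bit_reverse_and_scale(float *dest, const float *src,
                           size_t n, unsigned log_size)
{
    float scale = 1.0f / (float)n;   /* exact when n is a power of two */
    size_t i;

    assert(dest != src);
    assert(n == ((size_t)1 << log_size));
    for (i = 0; i < n; i++)
        dest[bit_reverse(i, log_size)] = src[i] * scale;
}
```

For an 8-point example, pack_real_spectrum() places I(3), I(2), I(1) in locations 5, 6, and 7, and bit_reverse_and_scale() writes source element 1 to destination location 4.
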

The TMS320C3x quickly executes FFT lengths up to 1024 points (complex)
or 2048 points (real), covering most applications, because it can do so almost
entirely in on-chip memory. Table 11–1 and Table 11–2 summarize the number of CPU
clock cycles and the execution time required for FFT lengths between 64 and
1024 points for the four algorithms.


Table 11–1. TMS320C3x FFT Timing Benchmarks (Cycles)


                          FFT Timing in Cycles
Number of    RADIX-2      RADIX-4      RADIX-2    RADIX-2
Points       (Complex)    (Complex)    (Real)     (Real Inverse)
64            2 770        2 050          810      1 070
128           6 170          —          1 760      2 370
256          13 600       10 400        3 940      5 290
512          29 740          —          8 860     11 740
1 024        64 570       50 670       19 820     25 900
1 024†       39 500

† This benchmark is based on the Meyer and Schwarz program found in Digital Signal Processing Applications With the TMS320 Family, Volume 3.

Table 11–2. TMS320C3x FFT Timing Benchmarks (Milliseconds)


                       FFT Timing in Milliseconds
Number of    RADIX-2      RADIX-4      RADIX-2    RADIX-2
Points       (Complex)    (Complex)    (Real)     (Real Inverse)
64           0.139        0.103        0.041      0.054
128          0.309          —          0.088      0.119
256          0.680        0.520        0.197      0.265
512          1.487          —          0.443      0.587
1 024        3.229        2.534        0.991      1.295
1 024†       1.975

† This benchmark is based on the Meyer and Schwarz program found in Digital Signal Processing Applications With the TMS320 Family, Volume 3.
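
The millisecond figures follow from the cycle counts once an instruction cycle time is fixed. Comparing the two tables row by row (for example, 64 570 cycles against 3.229 ms) suggests a 50-ns cycle; that value is inferred from the tables and is not stated in this section, so treat it as an assumption. A minimal C sketch of the conversion, with a hypothetical function name:

```c
/* Assumed instruction cycle time in nanoseconds. The value 50 ns is
 * inferred from Table 11-1 and Table 11-2 (64 570 cycles <-> 3.229 ms);
 * change it to match the actual device clock. */
#define CYCLE_TIME_NS 50.0

/* Convert a benchmark cycle count to execution time in milliseconds. */
double fft_cycles_to_ms(unsigned long cycles)
{
    return (double)cycles * CYCLE_TIME_NS * 1.0e-6;
}
```

With this assumption, the 1024-point real inverse FFT of Example 11–39 called from C (25 892 cycles plus the 57 cycles of C-callable overhead noted in the listing header, 25 949 in total) takes about 1.297 ms.
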

11.4.5 Lattice Filters


The lattice form is an alternative way of implementing digital filters; it has found
applications in speech processing, spectral estimation, and other areas. In this
discussion, the notation and terminology from speech processing applications
are used.

If H(z) is the transfer function of a digital filter that has only poles, A(z) = 1/H(z)
will be a filter having only zeros, and it is called the inverse filter. The inverse
lattice filter is shown in Figure 11–5. These equations describe the filter in
mathematical terms:


f (i,n) = f (i – 1,n) + k (i ) b (i – 1,n – 1)


b (i,n) = b (i – 1,n – 1) + k (i ) f (i – 1,n)

Initial conditions:
f (0,n) = b (0,n) = x (n)

Final conditions:
y (n) = f ( p,n)

In the above equations, f (i,n) is the forward error, b (i,n) is the backward error,
k (i ) is the i-th reflection coefficient, x (n) is the input, and y (n) is the output
signal. The order of the filter (that is, the number of stages) is p. In the linear
predictive coding (LPC) method of speech processing, the inverse lattice filter
is used during analysis, and the (forward) lattice filter during speech synthesis.
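
As a cross-check for the assembly version, the recursion above can be written as a short C reference model. This is a sketch under stated assumptions, not TI's implementation: the function name and the array conventions are hypothetical, with k[i-1] holding k(i) and b_prev[i] holding b(i, n–1) for i = 0 to p–1, which mirrors the two columns of Figure 11–6.

```c
#include <assert.h>
#include <stddef.h>

/* One output sample of the inverse (analysis) lattice filter.
 *   k[i-1]    holds k(i),       i = 1..p
 *   b_prev[i] holds b(i, n-1),  i = 0..p-1; updated in place to b(i, n)
 * Returns y(n) = f(p, n). */
float lattice_inverse_step(float x, const float *k, float *b_prev, size_t p)
{
    float f = x;    /* f(0,n) = x(n) */
    float b = x;    /* b(0,n) = x(n) */
    size_t i;

    assert(k != NULL && b_prev != NULL && p >= 1);
    for (i = 1; i <= p; i++) {
        float b_old = b_prev[i - 1];          /* b(i-1, n-1)            */
        float f_new = f + k[i - 1] * b_old;   /* f(i, n)                */
        float b_new = b_old + k[i - 1] * f;   /* b(i, n)                */

        b_prev[i - 1] = b;                    /* b(i-1, n) for next n   */
        f = f_new;
        b = b_new;
    }
    return f;
}
```

With p = 2, k(1) = 0.5, k(2) = 0.25, and zero initial state, a unit impulse produces 1, 0.625, 0.25, which are the coefficients of A(z) = 1 + 0.625 z^–1 + 0.25 z^–2.
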

Figure 11–5. Structure of the Inverse Lattice Filter

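[Figure: signal-flow diagram. Forward path x(n) = f(0, n), f(1, n), ..., f(p–1, n), f(p, n) = y(n); backward path b(0, n), b(1, n), ..., b(p–1, n), each preceded by a z^–1 delay; stage i cross-couples the two paths through the gains K1, K2, ..., Kp.]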

Figure 11–6 shows the data memory organization of the inverse lattice-filter
on the TMS320C3x.

Figure 11–6. Data Memory Organization for Lattice Filters


               Reflection      Backward
               Coefficients    Propagation Terms
Low Address    k(1)            b(0, n–1)
               k(2)            b(1, n–1)
               •               •
               •               •
               •               •
High Address   k(p)            b(p–1, n–1)

Example 11–40 shows the implementation of an inverse lattice filter.


Example 11–40. Inverse Lattice Filter

* TITLE INVERSE LATTICE FILTER
*
*
* SUBROUTINE LATINV
*
* LATINV == LATTICE FILTER (LPC INVERSE FILTER - ANALYSIS)
*
*
* TYPICAL CALLING SEQUENCE:
*
* load R2
* load AR0
* load AR1
* load RC
* CALL LATINV
*
*
* ARGUMENT ASSIGNMENTS:
*
* ARGUMENT | FUNCTION
* ---------+-------------------------------------
* R2       | f(0,n) = x(n)
* AR0      | ADDRESS OF FILTER COEFFICIENTS (k(1))
* AR1      | ADDRESS OF BACKWARD PROPAGATION
*          | VALUES (b(0,n-1))
* RC       | RC = p - 2
*
* REGISTERS USED AS INPUT: R2, AR0, AR1, RC
* REGISTERS MODIFIED: R0, R1, R2, R3, RS, RE, RC, AR0, AR1
* REGISTER CONTAINING RESULT: R2 (f(p,n))
*
*
* PROGRAM SIZE: 10 WORDS
*
* EXECUTION CYCLES: 13 + 3 * (p-1)
*
*
        .global LATINV
*
* i = 1
*
LATINV  MPYF3 *AR0, *AR1, R0
*                                 ; k(1) * b(0,n-1) -> R0
*                                 ; Assume f(0,n) -> R2.
        LDF R2,R3                 ; Put b(0,n) = f(0,n) -> R3.
        MPYF3 *AR0++(1),R2,R1
*                                 ; k(1) * f(0,n) -> R1
*
* 2 <= i <= p
*
        RPTB LOOP
        MPYF3 *AR0,*++AR1(1),R0   ; k(i) * b(i-1,n-1) -> R0
||      ADDF3 R2,R0,R2            ; f(i-1-1,n)+k(i-1)
*                                 ; *b(i-1-1,n-1)
*                                 ; = f(i-1,n) -> R2
*
*                                 ; b(i-1-1,n-1)+k(i-1)*f(i-1-1,n)
        ADDF3 *-AR1(1), R1, R3    ; = b(i-1,n) -> R3
||      STF R3, *-AR1(1)          ; b(i-1-1,n) -> b(i-1-1,n-1)
*
LOOP    MPYF3 *AR0++(1),R2,R1
*                                 ; k(i) * f(i-1,n) -> R1
*
* I = P+1 (CLEANUP)
*
        ADDF3 R2,R0,R2            ; f(p-1,n)+k(p)*b(p-1,n-1)
*                                 ; = f(p,n) -> R2
*
*                                 ; b(p-1,n-1)+k(p)*f(p-1,n)
        ADDF3 *AR1, R1, R3        ; = b(p,n) -> R3
||      STF R3, *AR1              ; b(p-1,n) -> b(p-1,n-1)
*
* RETURN SEQUENCE
*
        RETS                      ; RETURN
*
* end
*
        .end

The forward lattice filter is similar in structure to the inverse filter, as shown in
Figure 11–7.

Figure 11–7. Structure of the (Forward) Lattice Filter


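[Figure: signal-flow diagram. Forward path from x(n) through f(p–1, n), ..., f(2, n), f(1, n) to y(n), with gains –Kp, ..., –K2, –K1 feeding the forward path; backward path b(p–1, n), ..., b(2, n), b(1, n), with gains Kp, ..., K2, K1 and a z^–1 delay in each stage.]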


These corresponding equations describe the lattice filter:

f (i – 1,n) = f (i,n) – k (i ) b (i – 1,n – 1)


b (i,n) = b (i – 1,n – 1) + k (i ) f (i – 1,n)

Initial conditions:
f (p,n) = x (n), b (i,n – 1) = 0 for i = 1,...,p

Final conditions:
y (n) = f (0,n)

The data memory organization is identical to that of the inverse filter, as shown
in Figure 11–6 on page 11-126. Example 11–41 shows the implementation of
the lattice filter on the TMS320C3x.
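
The forward recursion can be modeled in C in the same way as a reference for checking the assembly routine. Again this is a sketch, not TI's implementation: the function name and array conventions are hypothetical, with k[i-1] holding k(i) and b_prev[i] holding b(i, n–1) for i = 0 to p–1. The store of f(0,n) into the b(0) slot follows the last STF of Example 11–41.

```c
#include <assert.h>
#include <stddef.h>

/* One output sample of the forward (synthesis) lattice filter.
 *   k[i-1]    holds k(i),       i = 1..p
 *   b_prev[i] holds b(i, n-1),  i = 0..p-1; updated in place to b(i, n)
 * Returns y(n) = f(0, n). */
float lattice_forward_step(float x, const float *k, float *b_prev, size_t p)
{
    float f = x;    /* f(p,n) = x(n) */
    size_t i;

    assert(k != NULL && b_prev != NULL && p >= 1);
    for (i = p; i >= 1; i--) {
        float b_old = b_prev[i - 1];            /* b(i-1, n-1)          */

        f -= k[i - 1] * b_old;                  /* f(i-1, n)            */
        if (i < p)                              /* b(p, n) is not kept  */
            b_prev[i] = b_old + k[i - 1] * f;   /* b(i, n)              */
    }
    b_prev[0] = f;                              /* b(0,n) = f(0,n)      */
    return f;
}
```

Slot i is overwritten only after its old value was consumed in the previous iteration (stage i+1), so the update can be done in place. With p = 2, k(1) = 0.5, k(2) = 0.25, and zero initial state, a unit impulse produces 1, –0.625, 0.140625, the start of the impulse response of 1/(1 + 0.625 z^–1 + 0.25 z^–2).
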

Example 11–41. Lattice Filter

* TITLE LATTICE FILTER
*
*
* SUBROUTINE LATICE
*
* LOAD AR0
* LOAD AR1
* LOAD RC
* CALL LATICE
*
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ---------+-------------------------------------
* R2       | F(P,N) = E(N) = EXCITATION
* AR0      | ADDRESS OF FILTER COEFFICIENTS (K(P))
* AR1      | ADDRESS OF BACKWARD PROPAGATION VALUES (B(P-1,N-1))
* IR0      | 3
* RC       | RC = P - 3
*
* REGISTERS USED AS INPUT: R2, AR0, AR1, RC
* REGISTERS MODIFIED: R0, R1, R2, R3, RS, RE, RC, AR0, AR1
* REGISTER CONTAINING RESULT: R2 (f(0,n))
*
* STACK USAGE: NONE
*
* PROGRAM SIZE: 12 WORDS
*
* EXECUTION CYCLES: 15 + 3 * (P-2)
*
        .global LATICE
*
*
LATICE  MPYF3 *AR0,*AR1,R0
*                                 ; K(P) * B(P-1,N-1) -> R0
                                  ; Assume F(P,N) -> R2
        SUBF3 R0,R2,R2            ; F(P,N)-K(P)*B(P-1,N-1)
                                  ; = F(P-1,N) -> R2
||      MPYF3 *--AR0(1),*--AR1(1),R0
                                  ; K(P-1) * B(P-2,N-1) -> R0
        SUBF3 R0,R2,R2            ; F(P-1,N)-K(P-1)*B(P-2,N-1)
                                  ; = F(P-2,N) -> R2
||      MPYF3 *--AR0(1),*--AR1(1),R0
                                  ; K(P-2) * B(P-3,N-1) -> R0
        MPYF3 R2,*+AR0(1),R1      ; F(P-2,N) * K(P-1) -> R1
        ADDF3 R1,*+AR1(1),R3      ; F(P-2,N) * K(P-1) + B(P-2,N-1)
                                  ; = B(P-1,N) -> R3
*
* 1 <= I <= P-2
*
        RPTB LOOP
        SUBF3 R0,R2,R2            ; F(I,N) - K(I) * B(I-1,N-1)
                                  ; = F(I-1,N) -> R2
||      MPYF3 *--AR0(1),*--AR1(1),R0
                                  ; K(I-1) * B(I-2,N-1) -> R0
        STF R3,*+AR1(IR0)         ; B(I+1,N) -> B(I+1,N-1)
||      MPYF3 R2,*+AR0(1),R1      ; F(I-1,N) * K(I) -> R1
LOOP    ADDF3 R1,*+AR1(1),R3      ; F(I-1,N) * K(I) + B(I-1,N-1)
                                  ; = B(I,N) -> R3
        STF R3,*+AR1(2)           ; B(1,N) -> B(1,N-1)
        STF R2,*+AR1(1)           ; F(0,N) -> B(0,N-1)
*
* RETURN SEQUENCE
*
        RETS
*
* END
*
        .end


11.5 Programming Tips


Programming style reflects personal preference. The purpose of this section
is not to impose any particular style; rather, it is to highlight features of the
TMS320C3x that can help to produce faster and/or shorter programs. The tips
cover the C compiler, assembly language programming, and low-power-mode
wakeup.

11.5.1 C-Callable Routines


The TMS320C3x was designed with a large register file, software stack, and
large memory space to implement a high-level language (HLL) compiler easily.
The first such implementation supplied is a C compiler. Use of the C compiler
increases the transportability of applications that have been tested on large,
general-purpose computers, and it decreases their porting time.

For best use of the compiler, complete the following steps:

1) Write the application in the high-level language.

2) Debug the program.

3) Determine whether it runs in real-time.

4) If it doesn’t, identify the places where most of the execution time is spent.

5) Optimize these areas by writing assembly language routines that implement
the functions.

6) Call the routines from the C program as C functions.

When writing a C program, you can increase the execution speed by maximizing
the use of register variables. For more information, refer to the
TMS320C3x C Compiler Reference Guide.

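
As an illustration of the register-variable hint, the inner loop of a dot product might be written as follows. This is a generic C sketch with a hypothetical function name; how many register declarations the compiler honors, and which registers it assigns, is determined by the compiler and is described in the TMS320C3x C Compiler Reference Guide.

```c
/* Dot product with the loop pointers, accumulator, and counter declared
 * as register variables, so that the compiler can keep them out of
 * memory in the inner loop where it is able to. */
float dot_product(const float *a, const float *b, int n)
{
    register const float *pa = a;
    register const float *pb = b;
    register float sum = 0.0f;
    register int i;

    for (i = 0; i < n; i++)
        sum += *pa++ * *pb++;
    return sum;
}
```

If such a loop still dominates the execution time after this change, it is a candidate for step 5 above, that is, replacement by an assembly language routine.
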
You must observe certain conventions when writing a C-callable routine.
These conventions are outlined in the Runtime Environment chapter of the
TMS320C3x C Compiler Reference Guide. Certain registers are saved by the
calling function, and others need to be saved by the called function. The C
compiler manual helps achieve a clean interface. The end result is the
readability and natural flow of a high-level language combined with the efficiency
and special-feature use of assembly language.

11.5.2 Hints for Assembly Coding


Each program has particular requirements. Not all possible optimizations will
make sense in every case. You can use the suggestions presented in this section
as a checklist of available software tools.


- Use delayed branches. Delayed branches execute in a single cycle; regular
branches execute in four cycles. The three instructions that follow a delayed
branch are executed whether the branch is taken or not. If fewer than three
useful instructions are available, use the delayed branch anyway and pad with
NOPs; machine cycles (time) are still saved.
- Apply the repeat single/block construct. In this way, loops are achieved
with no overhead. Nesting such constructs will not normally increase
efficiency, so try to use the feature on the most often performed loop. Note
that RPTS is not interruptible, and the executed instruction is not refetched
for execution. This frees the buses for operands.
- Use parallel instructions. It is possible to have a multiply in parallel with
an add (or subtract) and to have stores in parallel with any multiply or ALU
operation. This increases the number of operations executed in a single
cycle. For maximum efficiency, observe the addressing modes used in
parallel instructions and arrange the data appropriately.
- Maximize the use of registers. The registers are an efficient way to access
scratch-pad memory. Extensive use of the register file facilitates the
use of parallel instructions and helps avoid pipeline conflicts when you use
the registers in addressing modes.
- Use the cache. This is especially important in conjunction with slow
external memory. The cache is transparent to the user, so make sure that it
is enabled.
- Use internal memory instead of external memory. The internal
memory (2K x 32 bits RAM and 4K x 32 bits ROM) is considerably faster
to access. In a single cycle, two operands can be brought from internal
memory. You can maximize performance if you use the DMA in parallel
with the CPU to transfer data to internal memory before you operate on it.
- Avoid pipeline conflicts. If there is no problem with program speed,
ignore this suggestion. For time-critical operations, make sure you do not
miss any cycles because of conflicts. To identify conflicts, run the trace
function on the development tools (simulator, emulators) with the program
tracing option enabled. The tracing immediately identifies the pipeline
conflicts. Consult the appropriate section of this user’s guide for an
explanation of the reason for the conflict. You can then take steps to correct the
problem.

The above checklist is not exhaustive, and it does not address the more
detailed features outlined in other sections of this manual. To learn how to exploit
the full power of the TMS320C3x, study the architecture, hardware
configuration, and instruction set of the device. These subjects are described in earlier
chapters.


11.5.3 Low-Power-Mode Wakeup Example


There are two instructions by which the TMS320C31 is placed in the low power
consumption mode:
- IDLE2
- LOPOWER

The LOPOWER instruction will slow down the H1/H3 clock by a factor of 16
during the read phase of the instruction. The MAXSPEED instruction will wake
the device from the low-power mode and return it to full frequency during
MAXSPEED’s read cycle. However, the H1/H3 clock may resume with the
phase opposite from before the clocks were shut down.

The IDLE2 instruction has the same functions that the IDLE instruction has,
except that the clock is stopped during the execute phase of the IDLE2
instruction. The clocks stop with H1 high and H3 low. The status of all of the
signals will remain the same as in the execute phase of the IDLE2 instruction.
In emulation mode, however, the clocks will continue to run, and IDLE2 will
operate identically to IDLE. The external interrupts INT(0–3) are the only signals
that can wake the processor from the IDLE2 power-down mode. Therefore, you
must enable the external interrupt before going to IDLE2 power-down mode.
(See Example 11–42.) If the proper external interrupt is not set up before
executing IDLE2 to power down, the only way to wake up the processor is with
a device RESET.

Example 11–42. Setup of IDLE2 Power-Down-Mode Wakeup


*
* TITLE IDLE2 POWER-DOWN MODE WAKEUP ROUTINE SETUP
*
* THIS EXAMPLE SETS UP EXTERNAL INTERRUPT 0 (INT0) BEFORE
* EXECUTING THE IDLE2 INSTRUCTION. WHEN THE INT0 SIGNAL IS RECEIVED
* LATER, THE PROCESSOR WILL RESUME FROM ITS PREVIOUS STATE.
* NOTE: THE "INTRPT" SECTION IS MAPPED AT ADDRESS 0 FOR THE
* RESET AND INTERRUPT VECTORS.
*

          .sect   "INTRPT"
RESET     .word   START          ; Reset vector
INT0      .word   INT0_ISR       ; INT0 interrupt vector
INT1      .word   INT1_ISR       ; INT1 interrupt vector
INT2      .word   INT2_ISR       ; INT2 interrupt vector
INT3      .word   INT3_ISR       ; INT3 interrupt vector
          :       :
          :       :
          .text
          :       :
          :       :
          LDP     @SP_ADR
          LDI     @SP_ADR,SP     ; Set up stack pointer
          OR      01h,IE         ; Enable INT0
          IDLE2                  ; Set GIE = 1 and stop clock
          :       :
          :       :
          :       :
          :       :
INT0_ISR  RETI                   ; Return to instruction after IDLE2

There will be one cycle of delay while waking up the processor from the IDLE2
power-down mode before the clocks start up. This adds one extra cycle from
the time the interrupt pad goes low until the interrupt is taken. The interrupt pad
needs to be low for at least two cycles. The clocks may start up in the phase
opposite from before the clocks were stopped.

Chapter 12

Hardware Applications

The TMS320C3x’s advanced interface design can implement many system
configurations. Its two external buses and DMA capability provide a parallel
32-bit interface to external devices, while the interrupt interface, dual serial
ports, and general-purpose digital I/O provide communication with many
peripherals.

This chapter describes how to use the TMS320C3x’s interfaces to connect to
various external devices. Specific discussions include implementation of
parallel interface to devices with and without wait states, use of general-purpose
I/O, and system control functions. All interfaces shown in this chapter have
been built and tested to verify proper operation and apply to the TMS320C30.
Comparable designs for the other TMS320C3x devices can be implemented
with appropriate logic.

Major topics discussed in this chapter are as follows:

12.1 System Configuration Options Overview
12.2 Primary Bus Interface
12.3 Expansion Bus Interface
12.4 System Control Functions
12.5 Serial-Port Interface
12.6 Low-Power-Mode Interrupt Interface
12.7 XDS Target Design Considerations


12.1 System Configuration Options Overview


The various TMS320C3x interfaces connect to many different device types.
Each of these interfaces is tailored to a particular family of devices.

12.1.1 Categories of Interfaces on the TMS320C3x


The TMS320C3x interface types fall into several categories, depending on the
devices to which they are intended to be connected. Each interface comprises
one or more signal lines that transfer information and control its operation.
Figure 12–1 shows the signal line groupings for each of these various inter-
faces.

Figure 12–1. External Interfaces on the TMS320C3x

[Figure: block diagram of the TMS320C3x external signal groups]

- Primary bus: data D31–D0 (32 lines), address A23–A0 (24 lines), control
R/W, STRB, RDY
- External DMA interface: HOLD, HOLDA
- External interrupt interface: INT3–0 (4 lines), IACK
- External flags: XF1–0
- Timer interface: TCLK0, TCLK1
- System control: RESET (system reset), X1 and X2/CLKIN (master clock),
H1 and H3 (system clock outputs), MC/MP (ROM enable, TMS320C30 only),
MCBL/MP (boot load enable, TMS320C31 only)
- Serial port 0: CLKX0, DX0, FSX0, CLKR0, DR0, FSR0
- Serial port 1 (TMS320C30 only): CLKX1, DX1, FSX1, CLKR1, DR1, FSR1
- Expansion bus (TMS320C30 only): data XD31–XD0 (32 lines), address
XA12–XA0 (13 lines), control XR/W, XRDY, IOSTRB, MSTRB

All of the interfaces are independent of one another, and you can perform dif-
ferent operations simultaneously on each interface.
The primary and expansion buses implement the memory-mapped interface
to the device. The external direct memory access (DMA) interface allows ex-
ternal devices to cause the processor to relinquish the primary bus and allow
direct memory access.


12.1.2 Typical System Block Diagram


The devices that can be interfaced to the TMS320C3x include memory, DMA
devices, and numerous parallel and serial peripherals and I/O devices.
Figure 12–2 illustrates a typical configuration of a TMS320C3x system with
different types of external devices and the interfaces to which they are con-
nected.

Figure 12–2. Possible System Configurations

[Figure: block diagram of a TMS320C3x surrounded by its external devices.
Memory and peripherals attach to the primary bus and to the expansion bus;
DMA devices attach to the external DMA interface; peripherals attach to the
interrupt interface; I/O devices attach to the timer interface; bit I/O
attaches to the external flags; clock and reset generators attach to system
control; a TCM29C13 codec and a TLC3204x AIC (analog I/O) attach to the
serial ports.]

This block diagram constitutes essentially a fully expanded system. In an actual
design, you can use any subset of the illustrated configuration as appropriate.


12.2 Primary Bus Interface


The TMS320C3x uses the primary bus to access the majority of its
memory-mapped locations. Therefore, typically, when a large amount of exter-
nal memory is required in a system, it is interfaced to the primary bus. The ex-
pansion bus (discussed in Section 12.3 on page 12-19) actually comprises two
mutually exclusive interfaces, controlled by the MSTRB and IOSTRB signals,
respectively. Cycles on the expansion bus controlled by the MSTRB signal are
essentially equivalent to cycles on the primary bus, except that bank switching
is not implemented on the expansion bus. Accordingly, the discussion of pri-
mary bus cycles in this section applies equally to MSTRB cycles on the expan-
sion bus.
Although you can use both the primary bus and the expansion bus to interface
to a wide variety of devices, the devices most commonly interfaced to these
buses are memories. Therefore, this section presents detailed examples of
memory interface.

12.2.1 Zero-Wait-State Interface to Static RAMs


Zero-wait-state read access time for the TMS320C3x is determined by the
difference between the cycle time (specification 10 in Table 13–12 on page
13-30) and the sum of the times for H1 low to address valid (specification 14.1
in Table 13–13 on page 13-33) and data setup before next H1 low (specification
15.1 in Table 13–13 on page 13-33):

     tc(H) – [td(H1L–A) + tsu(D)R]

For example, for full-speed, zero-wait-state interface to any device, the 60-ns
TMS320C3x requires a read access time of 30 ns from address stable to data
valid. Because for most memories access time from chip select is the same
as access time from address, it is theoretically possible to use 30-ns memories
at full speed with the TMS320C3x-33. This requires that there be no delays
between the processor and the memories. However, because of
interconnection delays and because some gating is normally required for chip-
select generation, this is usually not the case. Therefore, slightly faster memo-
ries are required in most systems.
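The access-time relationship above can be worked through numerically. The following is a minimal sketch, not a substitute for the timing tables in Chapter 13: the 60-ns cycle time comes from the text, the 14-ns H1-low-to-address-valid delay matches t1 in Table 12–1, and the 16-ns data setup time is an assumption chosen so that the result agrees with the 30-ns figure quoted above. The function names are hypothetical.

```python
# Sketch of the zero-wait-state read access budget:
#     tc(H) - [td(H1L-A) + tsu(D)R]
# The data setup value is an assumption; check it against Table 13-13.

T_C_H_NS = 60.0       # H1 cycle time (60-ns device, from the text)
T_D_H1L_A_NS = 14.0   # H1 low to address valid (t1 in Table 12-1)
T_SU_D_R_NS = 16.0    # data setup before next H1 low (assumed value)


def read_access_budget_ns(t_cycle, t_addr_valid, t_data_setup):
    """Time available from address valid to data valid with no wait states."""
    return t_cycle - (t_addr_valid + t_data_setup)


def memory_fits(budget_ns, memory_access_ns, decode_delay_ns=0.0):
    """True if memory access time plus chip-select gating fits in the budget."""
    return memory_access_ns + decode_delay_ns <= budget_ns


budget = read_access_budget_ns(T_C_H_NS, T_D_H1L_A_NS, T_SU_D_R_NS)

# A 30-ns memory only fits when nothing sits between processor and memory;
# a 25-ns memory leaves room for about 5 ns of chip-select gating.
fits_30ns_direct = memory_fits(budget, 30.0)
fits_30ns_gated = memory_fits(budget, 30.0, decode_delay_ns=5.0)
fits_25ns_gated = memory_fits(budget, 25.0, decode_delay_ns=5.0)
```

The last two cases show why slightly faster memories are needed as soon as any gating delay is present.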
Among currently available RAMs, there are two distinct categories of devices
with different interface characteristics:
- RAMs without output enable control lines (OE), which include the one-bit-
wide organized RAMs and most of the four-bit wide RAMs
- RAMs with OE controls, which include the byte-wide RAMs and a few of
the four-bit wide RAMs


Many of the fastest RAMs do not provide OE control; they use chip-select (CS)
controlled write cycles to ensure that data outputs do not turn on for write oper-
ations. In CS-controlled write cycles, the write control line (WE) goes low be-
fore CS goes low, and internal logic holds the outputs disabled until the cycle
is completed. Using CS-controlled write cycles is an efficient way to interface
fast RAMs without OE controls to the TMS320C30 at full speed.

In the case of RAMs with OE controls, using this signal can add flexibility to
many systems. Additionally, many of these devices can be interfaced by using
CS-controlled write cycles with OE tied low in the same manner as with RAMs
without OE controls. There are, however, two requirements for interfacing to
OE RAMs in this manner. First, the RAM’s OE input must be gated with chip
select and WE internally so that the device’s outputs do not turn on unless a
read is being performed. Second, the RAM must allow its address inputs to
change while WE is low; some RAMs specifically prohibit this.

Figure 12–3 shows the TMS320C3x interfaced to Cypress Semiconductor’s
CY7C186 25-ns 8K x 8-bit CMOS static RAM with the OE control input tied low
and using a CS-controlled write cycle.


Figure 12–3. TMS320C3x Interface to Cypress Semiconductor CY7C186 CMOS SRAM

[Figure: schematic of four CY7C186-25 RAMs on the primary bus. Address lines
A12–A0 drive the RAM address inputs A12–A0. The I/O7–I/O0 pins of the four
devices connect to D31–D24, D23–D16, D15–D8, and D7–D0 of the primary data
bus. STRB drives CS1, A23 drives CS2 through a 74AS04 inverter, R/W drives
WE, and OE is grounded.]

In this circuit, the two chip selects on the RAM are driven by STRB and A23,
which are ANDed together internally. A23 locates the RAM at addresses
00000h through 03FFFh in external memory, and STRB establishes the CS-
controlled write cycle. The WE control input is then driven by the TMS320C3x
R/W signal, and the OE input is not used and is therefore connected to ground.

The timing of read operations, shown in Figure 12–4, is very straightforward
because the two chip-select inputs are driven directly. The read access time
of the circuit is therefore the inverter propagation delay added to the RAM’s
chip-select access time, or t1 + t2 = 5 + 25 = 30 ns. This access time therefore
meets the TMS320C3x-33’s specified 30-ns read access time requirement.


Figure 12–4. Read Operations Timing

[Figure: timing diagram showing H1, A23–A0 (valid), CS1 = STRB, CS2, and
D31–D0 (valid), with intervals t1 and t2 marked.]

During write operations, as shown in Figure 12–5, the RAM’s outputs do not
turn on at all, because of the use of the chip-select controlled write cycles. The
chip-select controlled write cycles are generated because R/W goes active
(low) before the STRB term of the chip-select input. Because the RAM’s output
drivers are disabled whenever the WE input is low (regardless of the state of
the OE input), bus conflicts with the TMS320C3x are automatically avoided
with this interface. The circuit’s data setup and hold times (t1 and t2 in the timing
diagram) of approximately 50 and 20 ns, respectively, also easily meet the
RAM’s timing requirements of 10 and 0 ns.
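The read and write figures quoted for this circuit can be checked with a few lines of arithmetic. This sketch uses only the numbers given in the text (5-ns inverter delay, 25-ns chip-select access, 30-ns read requirement, and the approximate 50-ns and 20-ns write setup and hold times); the helper name is hypothetical.

```python
# Sketch: timing margins of the CY7C186 interface, using values from the text.

def margin_ns(provided_ns, required_ns):
    """Positive result means the circuit exceeds the requirement."""
    return provided_ns - required_ns


# Read path: inverter on the A23 chip select, then RAM chip-select access.
INVERTER_DELAY_NS = 5.0     # t1 in Figure 12-4
RAM_CS_ACCESS_NS = 25.0     # t2 in Figure 12-4
READ_REQUIREMENT_NS = 30.0  # zero-wait-state read access requirement

read_access = INVERTER_DELAY_NS + RAM_CS_ACCESS_NS
read_slack = READ_REQUIREMENT_NS - read_access

# Write path: data setup and hold provided by the circuit versus the RAM's needs.
write_margins = {
    "data setup": margin_ns(50.0, 10.0),
    "data hold": margin_ns(20.0, 0.0),
}

timing_ok = read_slack >= 0 and all(m >= 0 for m in write_margins.values())
```

The read path has no slack at all, which is why any additional chip-select decoding calls for wait states or bank switching, while the write path has ample margin.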


Figure 12–5. Write Operations Timing

[Figure: timing diagram showing H1, A23–A0, CS1 = STRB, WE = R/W, and
D31–D0, with data setup and hold intervals t1 and t2 marked.]

If you require more complex chip-select decode than can be accomplished in
time to meet zero-wait-state timing, you should use wait states (see subsection
12.2.2) or bank-switching techniques (see subsection 12.2.3).

Note that the CY7C186’s OE control is gated internally with CS; therefore, the
RAM’s outputs are not enabled unless the device is selected. This is critical
if there are any other devices connected to the same bus; if there are no other
devices connected to the bus, OE need not be gated internally with chip select.

You can easily interface RAMs without OE controls to the TMS320C3x by us-
ing an approach similar to that used with RAMs with OE controls. If only one
bank of memory is implemented and no other devices are present on the bus,
the memories’ CS input can usually be connected to STRB directly. If several
devices must be selected, however, a gate is generally required to AND the
device select and STRB to drive the CS input to generate the chip-select con-
trolled write cycles. In either case, the WE input is driven by the TMS320C3x
R/W signal. Provided sufficiently fast gating is used, 25-ns RAMs can still be
used.

As with the case of RAMs with OE control lines, this approach works well if only
a few banks of memory are implemented where the chip-select decode can
be accomplished with only one level of gating. If many banks are required to
implement very large memory spaces, bank switching can be used to provide
for multiple bank select generation while still maintaining full-speed accesses
within each bank. Bank switching is discussed in detail in subsection 12.2.3.


12.2.2 Ready Generation


The use of wait states can greatly increase system flexibility and reduce hard-
ware requirements over systems without wait-state capability. The
TMS320C3x has the capability of generating wait states on either the primary
bus or the expansion bus; both buses have independent sets of ready control
logic. This subsection discusses ready generation from the perspective of the
primary bus interface; however, wait-state operation on the expansion bus is
similar to that on the primary bus. Therefore, these discussions also pertain
to expansion bus operation. Accordingly, ready generation is not included in
the specific discussions of the expansion bus interface.

Wait states are generated on the basis of:

- the internal wait-state generator,
- the external ready input (RDY), or
- the logical AND or OR of the two.

When enabled, internally generated wait states affect all external cycles, re-
gardless of the address accessed. If different numbers of wait states are re-
quired for various external devices, the external RDY input may be used to tai-
lor wait-state generation to specific system requirements.

If the logical AND (electrical OR) of the wait count and external ready signals
is selected, the later of the two signals will control the internal ready signal, and
both signals must occur. Accordingly, external ready control must be imple-
mented for each wait-state device, and the wait count ready signal must be en-
abled.

If the logical OR (or electrical AND, since the signals are low true) of the exter-
nal and internal wait-count ready signals is selected, the earlier of the two sig-
nals will generate a ready condition and allow the cycle to be completed. Both
signals need not be present.
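The four ready-generation options can be summarized with a small behavioral model. This is only a sketch of the rules stated above, not a description of the bus control register encoding: the mode names and the function are hypothetical, and the model counts whole wait states rather than signal edges.

```python
# Behavioral sketch of how the internal wait count and external RDY combine.
# Mode names are illustrative labels, not register field values.

def wait_states(mode, internal_count, external_ready_at):
    """Return the number of wait states before the bus cycle completes.

    internal_count    -- wait states programmed in the internal generator
    external_ready_at -- wait state at which external RDY is first asserted,
                         or None if the external logic never responds
    Returns None when the cycle would never terminate.
    """
    if mode == "internal":
        return internal_count
    if mode == "external":
        return external_ready_at
    if mode == "or":
        # The earlier of the two signals completes the cycle.
        if external_ready_at is None:
            return internal_count
        return min(internal_count, external_ready_at)
    if mode == "and":
        # Both signals must occur; the later one completes the cycle.
        if external_ready_at is None:
            return None
        return max(internal_count, external_ready_at)
    raise ValueError("unknown mode: " + mode)


fast_device_or = wait_states("or", 7, 1)       # external logic answers first
silent_device_or = wait_states("or", 7, None)  # internal count ends the cycle
slow_device_and = wait_states("and", 2, 5)     # external device ends the cycle
```

In this model the OR mode always terminates, while the AND and external-only modes depend on the external logic responding.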

ORing of the Ready Signals


The OR of the two ready signals can implement wait states for devices that
require a greater number of wait states than are implemented with external
logic (up to seven). This feature is useful, for example, if a system contains
some fast and some slow devices. In this case, fast devices can generate a
ready signal externally with a minimum of logic, and slow devices can use the
internal wait counter for larger numbers of wait states. Thus, when fast devices
are accessed, the external hardware responds promptly with a ready signal
that terminates the cycle. When slow devices are accessed, the external hard-
ware does not respond, and the cycle is appropriately terminated after the in-
ternal wait count.


You can use the OR of the two ready signals if conditions occur that require
termination of bus cycles prior to the number of wait states implemented with
external logic. In this case, a shorter wait count is specified internally than the
number of wait states implemented with the external ready logic, and the bus
cycle is terminated after the wait count. This feature can also be a safeguard
against inadvertent accesses to nonexistent memory that would never re-
spond with ready and would therefore lock up the TMS320C3x.

If the OR of the two ready signals is used, however, and the internal wait-state
count is less than the number of wait states implemented externally, the exter-
nal ready generation logic must have the ability to reset its sequencing to allow
a new cycle to begin immediately following the end of the internal wait count.
This requires that, under these conditions, consecutive cycles be from inde-
pendently decoded areas of memory and that the external ready generation
logic be capable of restarting its sequence as soon as a new cycle begins.
Otherwise, the external ready generation logic might lose synchronization with
bus cycles and therefore generate improperly timed wait states.

ANDing of the Ready Signals


The AND of the two ready signals can be used to implement wait states for de-
vices that are equipped to provide a ready signal but cannot respond quickly
enough to meet the TMS320C3x’s timing requirements. In particular, if these
devices normally indicate a ready condition and, when accessed, respond with
a wait until they become ready, the logical AND of the two ready signals can
be used to save hardware in the system. In this case, the internal wait counter
can provide wait states initially and become ready after the external device has
had time to send a not ready indication. The internal wait counter then remains
ready until the external device also becomes ready, which terminates the
cycle.

Additionally, the AND of the two ready signals can extend the number of wait
states for devices that already have external ready logic implemented but re-
quire additional wait states under certain unique circumstances.

External Ready Generation


In the implementation of external ready generation hardware, the particular
technique employed depends heavily on the specific characteristics of the sys-
tem. The optimum approach to ready generation varies, depending on the rel-
ative number of wait-state and non-wait-state devices in the system and on the
maximum number of wait states required for any one device. The approaches
discussed here are intended to be general enough for most applications and
are easily modifiable to comprehend many different system configurations.


In general, ready generation involves the following three functions:

- Segmenting the address space in some fashion to distinguish fast and
slow devices
- Generating properly timed ready indications
- Logically ORing all of the separate ready timing signals together to
connect to the physical ready input

Segmentation of the address space is required to obtain a unique indication
of each particular area within the address space that requires wait states. This
segmentation is commonly implemented in a system in the form of chip-select
generation. In many cases, you can use chip-select signals to initiate wait
states; however, chip-select decoding considerations might occasionally
provide signals that will not allow ready input timing requirements to be met. In this
case, you could make coarse address space segmentation on the basis of a
small number of address lines, where simpler gating allows signals to be gen-
erated more quickly. In either case, the signal indicating that a particular area
of memory is being addressed is normally used to initiate a ready or wait-state
indication.

Once the region of address space being accessed has been established, a
timing circuit of some sort is normally used to provide a ready indication to the
processor at the appropriate point in the cycle to satisfy each device’s unique
requirements.

Finally, since indications of ready status from multiple devices are typically
present, the signals are logically ORed by using a single gate to drive the RDY
input.

Ready Control Logic


You can take one of two basic approaches in the implementation of ready con-
trol logic, depending on the state of the ready input between accesses. If RDY
is low between accesses, the processor is always ready unless a wait state is
required; if RDY is high between accesses, the processor will always enter a
wait state unless a ready indication is generated.

If RDY is low between accesses, control of full-speed devices is
straightforward; no action is necessary because ready is always active unless otherwise
required. Devices requiring wait states, however, must drive ready high fast
enough to meet the input timing requirements. Then, after an appropriate
delay, a ready indication must be generated. This can be quite difficult in many
circumstances because wait-state devices are inherently slow and often re-
quire complex select decoding.


If RDY is high between accesses, zero-wait-state devices, which tend to be
inherently fast, can usually respond immediately with a ready indication.
Wait-state devices might delay their select signals appropriately to generate a
ready. Typically, this approach results in the most efficient implementation of
ready control logic. Figure 12–6 shows a circuit of this type, which can be used
to generate zero, one, or two wait states for multiple devices in a system.

Figure 12–6. Circuit for Generation of Zero, One, or Two Wait States for Multiple Devices

[Figure: schematic of the ready generation circuit. A 74ALS138 decodes the
TMS320C30 address bus (inputs A, B, C) and STRB into device selects Y0–Y7.
A 74AS32 combines STRB and A23 for zero-wait-state devices. Two 74AS20 gates
collect the selects of the one-wait-state and two-wait-state devices and
feed two 74ACT112 J-K flip-flops clocked by H1 and cleared by RESET, with
unused inputs pulled up to +5 V through 4.7 kΩ. A 74AS21 combines the
resulting signals to drive RDY.]


Example Circuit

In this circuit, full-speed devices drive ready directly through the ’74AS21, and
the two flip-flops delay wait-state devices’ select signals one or two H1 cycles
to provide one or two wait states.

Considering the TMS320C3x-33’s ready delay time of 8 ns following address,
zero-wait-state devices must use ungated address lines directly to drive
the input of the ’74AS21, since this gate contributes a maximum propagation
delay of 6 ns to the RDY signal. Thus, zero-wait-state devices should be
grouped together within a coarse segmentation of address space if other
devices in the system require wait states.

With this circuit, devices requiring wait states can take up to 36 ns from a
valid address on the TMS320C3x to drive the ’74AS20 inputs. This
usually allows sufficient time for any decoding required in generating select
signals for slower devices in the system. For example, the 74ALS138, driven
by address and STRB, can generate select decodes in 22 ns, which easily
meets the TMS320C3x-33’s timing requirements.
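The two ready-path budgets discussed above can be tabulated as follows. This sketch uses only the delays quoted in the text (8-ns ready delay, 6-ns maximum for the ’74AS21, 36 ns available to wait-state devices, 22 ns for the 74ALS138); the function names and the 5-ns extra gate in the last case are hypothetical.

```python
# Sketch: slack remaining on the ready-generation paths, from the quoted delays.

def remaining_ns(budget_ns, *delays_ns):
    """Budget left after subtracting every delay in the path."""
    return budget_ns - sum(delays_ns)


def path_meets_budget(budget_ns, *delays_ns):
    """True if the summed path delay does not exceed the budget."""
    return remaining_ns(budget_ns, *delays_ns) >= 0


READY_DELAY_AFTER_ADDRESS_NS = 8.0  # RDY must follow address within this time
AS21_MAX_DELAY_NS = 6.0             # maximum delay of the gate driving RDY
WAIT_PATH_BUDGET_NS = 36.0          # address valid to the '74AS20 inputs
ALS138_DECODE_NS = 22.0             # select decode from address and STRB

# Zero-wait-state path: only 2 ns remain, so address lines must be ungated.
zero_wait_slack = remaining_ns(READY_DELAY_AFTER_ADDRESS_NS, AS21_MAX_DELAY_NS)

# Wait-state path: the decoder leaves 14 ns for any further gating.
decode_slack = remaining_ns(WAIT_PATH_BUDGET_NS, ALS138_DECODE_NS)

# A hypothetical extra 5-ns gate ahead of the '74AS21 breaks the fast path.
extra_gate_ok = path_meets_budget(
    READY_DELAY_AFTER_ADDRESS_NS, AS21_MAX_DELAY_NS, 5.0
)
```

The contrast between the two slack values is the reason for grouping zero-wait-state devices under a coarse address decode.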

With this circuit, unused inputs to either the 74AS20s or the 74AS21 should
be tied to a logic high level to prevent noise from generating spurious wait
states.

If more than two wait states are required by devices within a system, other ap-
proaches can be employed for ready generation. If between three and seven
wait states are required, additional flip-flops can be included in the same man-
ner shown in Figure 12–6, or internally generated wait states can be used in
conjunction with external hardware. If more than seven wait states are re-
quired, an external circuit using a counter may be used to supplement the ca-
pabilities of the internal wait-state generators.
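The choice among these approaches depends only on the largest wait-state count in the system, which can be expressed as a simple selection rule. The function name and the returned descriptions are illustrative, not terminology from this guide.

```python
# Sketch: choosing a ready-generation approach from the maximum number of
# wait states any device in the system requires (thresholds from the text).

def ready_generation_approach(max_wait_states):
    """Return a short description of a suitable approach."""
    if max_wait_states < 0:
        raise ValueError("wait-state count cannot be negative")
    if max_wait_states <= 2:
        return "flip-flop circuit of Figure 12-6"
    if max_wait_states <= 7:
        return ("additional flip-flops, or internal wait states "
                "combined with external hardware")
    return "external counter supplementing the internal wait-state generator"


approach_for_two = ready_generation_approach(2)
approach_for_five = ready_generation_approach(5)
approach_for_ten = ready_generation_approach(10)
```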

12.2.3 Bank Switching Techniques

The TMS320C3x’s programmable bank switching feature can greatly ease
system design when large amounts of memory are required. Because, in
general, devices take longer to release the bus than they take to drive the bus,
bank switching is used to provide a period of time for disabling all device se-
lects that would not be present otherwise (refer to Section 7.4 on page 7-30
for further information regarding bank switching). During this interval, slow de-
vices are allowed time to turn off before other devices have the opportunity to
drive the data bus, thus avoiding bus contention.


When bank switching is enabled, any time a portion of the high order address
lines changes, as defined by the contents of the BNKCMPR register, STRB
goes high for one full H1 cycle. Provided STRB is included in chip-select de-
codes, this causes all devices to be disabled during this period. The next bank
of devices is not enabled until STRB goes low again.
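The address comparison that triggers the extra cycle can be modeled as follows. This is a sketch under the assumption that BNKCMPR gives the number of most significant bits of the 24-bit address that are compared, which is consistent with the A23–A13 comparison and 8K-word banks described later in this section for a value of 0Bh. The function names are hypothetical, the permitted register range is defined in Section 7.4, and the model looks only at addresses, not at whether the access is a read or a write.

```python
# Sketch: detecting a bank switch from two consecutive primary bus addresses.
# Assumption: BNKCMPR = number of high-order address bits compared.

ADDRESS_BITS = 24


def bank_of(address, bnkcmpr):
    """Return the compared high-order bits of a 24-bit address."""
    if not 0 <= bnkcmpr <= ADDRESS_BITS:
        raise ValueError("BNKCMPR out of range for this model")
    return (address & 0xFFFFFF) >> (ADDRESS_BITS - bnkcmpr)


def bank_size_words(bnkcmpr):
    """Number of words addressed without changing the compared bits."""
    return 1 << (ADDRESS_BITS - bnkcmpr)


def crosses_bank(previous_address, next_address, bnkcmpr):
    """True if the compared bits change between two accesses."""
    return bank_of(previous_address, bnkcmpr) != bank_of(next_address, bnkcmpr)


BNKCMPR = 0x0B  # compare A23-A13, giving 8K-word banks

same_bank = crosses_bank(0x80A000, 0x80BFFF, BNKCMPR)
next_bank = crosses_bank(0x80BFFF, 0x80C000, BNKCMPR)
```

Sequential accesses inside one 8K-word block leave the compared bits unchanged, so STRB is held high for the extra H1 cycle only when a read crosses into another block.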

In general, bank switching is not required during writes, because these cycles
always exhibit an inherent one-half H1 cycle setup of address information be-
fore STRB goes low. Thus, when you use bank switching for read/write de-
vices, a minimum of half of one H1 cycle of address setup is provided for all
accesses. Therefore, large amounts of memory can be implemented without
wait states or extra hardware required for isolation between banks. Also, note
that access time for cycles during bank switching is the same as that for cycles
without bank switching, and, accordingly, full-speed accesses can still be ac-
complished within each bank.

When you use bank switching to implement large multiple-bank memory systems,
an important consideration is address line fanout. Besides the parametric
specifications, which must be taken into account, AC characteristics are also
crucial in memory system design. With large memory arrays, which commonly
require large numbers of address line inputs to be driven in parallel, capacitive
loading of address outputs is often quite large. Because all TMS320C3x timing
specifications are guaranteed up to a capacitive load of 80 pF, driving greater
loads will invalidate guaranteed AC characteristics. Therefore, it is often nec-
essary to provide buffering for address lines when driving large memory ar-
rays. AC timings for buffer performance can then be derated according to man-
ufacturer specifications to accommodate a wide variety of memory array sizes.

The circuit shown in Figure 12–7 illustrates the use of bank switching with Cy-
press Semiconductor’s CY7C185 25-ns 8K × 8 CMOS static RAM. This circuit
implements 32K 32-bit words of memory with one-wait-state accesses within
each bank.

A wait state is required with this implementation of bank memory because of
the added propagation delay presented by the address bus buffers used in the
circuit. The wait state is not a function of the memory organization of multiple
banks or the use of bank switching. When bank switching is used, memory ac-
cess speeds are the same as without bank switching, once bank boundaries
are crossed. Therefore, no speed penalty is paid when bank switching is used,
except for the occasional extra cycle inserted when bank boundaries are
crossed. Note, however, that if the extra cycle inserted when bank boundaries
are crossed does impact software performance significantly, you can often re-
structure code to minimize bank boundary crossings, thereby reducing the ef-
fect of these boundary crossings on software performance.


The wait state for this bank memory is generated by using the wait-state gener-
ator circuit presented in the previous section. Because A23 is the signal that
enables the entire bank memory system, the inverted version of this signal is
ANDed with STRB to derive a one-wait-state device select. This signal is then
connected in the circuit along with the other one-wait-state device selects.
Thus, any time a bank memory access is made, one wait state is generated.

Each of the four banks in this circuit is selected by using a decode of A15–A13
generated by the 74AS138 (see Figure 12–8). With the BNKCMPR register
set to 0Bh, the banks will be selected on even 8K-word boundaries starting at
location 080A000h in external memory space.

Figure 12–7. Bank Switching for Cypress Semiconductor’s CY7C185

[Figure: schematic of four memory banks (Bank 0 through Bank 3), each built
from four CY7C185 RAMs. Buffered address lines BA12–BA0 drive the A12–A0
inputs of every RAM, and each RAM supplies eight data lines (D0–D7) of the
32-bit data bus D31–D0. Each bank receives its own select signal
(BANKSEL0–BANKSEL3) together with BSTRB and BR/W, which drive the CS1, CS2,
and WE inputs; OE is grounded.]


Figure 12–8. Bank Memory Control Logic

[Figure: schematic of the bank memory control logic. Two 74ALS2541 buffers
drive BA0–BA12 from A0–A12 and BR/W from R/W. A 74AS138 decodes A15, A14,
and A13 (inputs C, B, and A), enabled by A23, into BANKSEL0–BANKSEL3. A
74AS04 derives BSTRB from STRB.]


The 74ALS2541 buffers used on the address lines are necessary in this design
because the total capacitive load presented to each address line is a maximum
of 16 × 10 pF or 160 pF (bank memory plus zero-wait-state static RAM), which
exceeds the TMS320C3x rated capacitive loading of 80 pF. Using the
manufacturer’s derating curves for these devices at a load of 80 pF (the load
presented by the bank memory) predicts propagation delays at the output of
the buffers of a maximum of 16 ns. The access time of a read cycle within a
bank of the memory is therefore the sum of the memory access time and the
maximum buffer propagation delay, or 25 + 16 = 41 ns, which, since it falls be-
tween 30 and 90 ns, requires one wait state on the TMS320C3x-33.
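The loading and wait-state arithmetic in this paragraph can be reproduced with a short calculation. This is a sketch that assumes each wait state extends the access window by one 60-ns H1 cycle beyond the 30-ns zero-wait-state requirement, which agrees with the 30-to-90-ns range quoted above; the function names are hypothetical.

```python
# Sketch: address line loading and required wait states for the bank memory.
import math

RATED_LOAD_PF = 80.0         # load at which timing specifications are guaranteed
ZERO_WAIT_ACCESS_NS = 30.0   # zero-wait-state read access requirement
H1_CYCLE_NS = 60.0           # each wait state adds one H1 cycle (assumption)


def address_line_load_pf(input_count, pf_per_input=10.0):
    """Total capacitive load from a number of parallel address inputs."""
    return input_count * pf_per_input


def needs_buffering(load_pf):
    """True if the load exceeds the rated capacitive loading."""
    return load_pf > RATED_LOAD_PF


def wait_states_needed(access_ns):
    """Smallest number of wait states that accommodates the access time."""
    if access_ns <= ZERO_WAIT_ACCESS_NS:
        return 0
    return math.ceil((access_ns - ZERO_WAIT_ACCESS_NS) / H1_CYCLE_NS)


total_load = address_line_load_pf(16)       # 16 inputs at 10 pF each
bank_access = 25.0 + 16.0                   # RAM access plus buffer delay
bank_wait_states = wait_states_needed(bank_access)
```

With these numbers the unbuffered load is twice the rated value, and the buffered 41-ns access needs exactly one wait state.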

The 74ALS2541 buffers offer one additional system-performance enhancement
in that they include 25-ohm resistors in series with each individual buffer
output. These resistors greatly improve the transient response characteristics
of the buffers, especially when driving CMOS loads such as the memories
used here. The effect of these resistors is to reduce overshoot and ringing,
which is common when driving predominantly capacitive loads such as
CMOS. The result is reduced noise and increased immunity to latch-up in the
circuit, which in turn results in a more reliable memory system. Having these
resistors included in the buffers eliminates the need to put discrete resistors
in the system, which is often required in high-speed memory systems.

This circuit cannot be implemented without bank switching because the data
output turn-on and turn-off delays would cause bus conflicts. Here, the propagation
delay of the 74AS138 is involved only during bank switches, when there is suf-
ficient time between cycles to allow new chip selects to be decoded.

The timing of this circuit for read operations using bank switching is shown in
Figure 12–9. With the BNKCMPR register set to 0Bh, when a bank switch oc-
curs, the bank address on address lines A23–A13 is updated during the extra
H1 cycle while STRB is high. Then, after chip-select decodes have stabilized
and the previously selected bank has disabled its outputs, STRB goes low for
the next read cycle. Further accesses occur at normal bus timings with one
wait state, as long as another bank switch is not necessary. Write cycles do
not require bank switching due to the inherent address setup provided in their
timings.


Figure 12–9. Timing for Read Operations Using Bank Switching

[Figure: timing diagram showing H1, A23–A13 (valid), A12–A0 (valid), STRB,
BANKSEL0, BANKSEL1, and D31–D0 (bank 0 on bus, then bank 1 on bus), with
intervals t1 through t6 marked.]

This timing is summarized in Table 12–1.

Table 12–1. Bank Switching Interface Timing


Time Interval   Event                                     Time Period†
t1              H1 falling to address valid/STRB rising   14 ns
t2              Address valid to select delay             10 ns
t3              Memory disable from STRB                  10 ns
t4              H1 falling to STRB                        10 ns
t5              STRB to select delay                      4.5 ns
t6              Memory output enable delay                3 ns
† Timing for the TMS320C3x-33


12.3 Expansion Bus Interface


The TMS320C30’s expansion bus interface provides a second complete par-
allel bus, which can be used to implement data transfers concurrently with (and
independently of) operations on the primary bus. The expansion bus com-
prises two mutually exclusive interfaces controlled by the MSTRB and
IOSTRB signals, respectively. This subsection discusses interface to the ex-
pansion bus using IOSTRB cycles; MSTRB cycles are essentially equivalent
in timing to primary bus cycles and are discussed in Section 12.2, beginning
on page 12-4. This section applies to TMS320C30 devices.

Unlike the primary bus, both read and write cycles on the I/O portion of the ex-
pansion bus are two H1 cycles in duration and exhibit the same timing. The
XR/W signal is high for reads and low for writes. Since I/O accesses take two
cycles, many peripherals that require wait states if interfaced either to the pri-
mary bus or by using MSTRB can be used in a system without the need for wait
states. Specifically, in cases where there is only one device on the expansion
bus, devices with address access times greater than the 30 ns required by the
primary bus, but less than 59 ns, can be interfaced to the I/O bus of the
TMS320C30-33 without wait states.

12.3.1 A/D Converter Interface


A/D and D/A converters are commonly required in DSP systems and interface
efficiently to the I/O expansion bus. These devices are available in many
speed ranges and with a variety of features. While some might require one or
more wait states on the I/O bus, others can be used at full speed.

Figure 12–10 illustrates a TMS320C30 interface to an Analog Devices
AD1678 analog-to-digital converter. The AD1678 is a 12-bit, 5-µs converter
that allows sample rates up to 200 kHz and has an input voltage range of 10
volts, bipolar or unipolar. The converter is connected according to manufactur-
er’s specifications to provide 0- to +10-volt operation. This interface illustrates
a common approach to connecting devices such as this to the TMS320C30.
Note that the interface requires only a minimum amount of control logic.


Figure 12–10. Interface to AD1678 A/D Converter

[Schematic: XA12 drives the AD1678 CS input; IOSTRB and XR/W are combined through 74AS32 and 74AS04 gates to form IOW (to SC) and IOR (to OE); AD1678 outputs D0–D11 connect through two 74LS244 buffers to XD0–XD11 on the XD bus; EOC connects to INT0; the analog input is applied at AIN; other AD1678 pins shown are REFOUT, REFIN, BIPOFF, SYNC, 12/8, EOCEN, VCC, VDD, VEE, PGND, and AGND, with 50 Ω, 200 Ω, and 20 kΩ resistors and +5 V, +12 V, and -12 V supplies.]

The AD1678 is a very flexible converter and is configurable in a number of
different operating modes. These operating modes include byte or word data
format, continuous or noncontinuous conversions, enabled or disabled
chip-select function, and programmable end-of-conversion indication. This interface
utilizes 12-bit word data format, rather than byte format, to be compatible with
the TMS320C3x. Noncontinuous conversions are selected so that variable
sample rates can be used; continuous conversions occur only at a rate of 200
kHz. With noncontinuous conversions, the host processor determines the con-
version rate by initiating conversions through write operations to the converter.


The chip-select function is enabled, so the chip-select input is required to be
active when accessing the device. Enabling the chip-select function is
necessary to allow a mechanism for the AD1678 to be isolated from other peripheral
devices connected to the expansion bus. To establish the desired operating
modes, the SYNC and 12/8 inputs to the converter are pulled high and EOCEN
is grounded, as specified in the AD1678 data sheet.

In this application, the converter’s chip select is driven by XA12, which maps
this device at 804000h in I/O address space. Conversions are initiated by writ-
ing any data value to the device, and the conversion results are obtained by
reading from the device after the conversion is completed. To generate the de-
vice’s start conversion (SC) and output enable (OE) inputs, IOSTRB is ANDed
with XR/W. Therefore, the converter is selected whenever XA12 is low; OE is
driven when reads are performed, while SC is driven when writes are per-
formed.
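
The decoding described above can be modeled in a few lines. This is an illustrative sketch, assuming XA12 is bit 12 of the I/O address and treating all signals as simple booleans (true meaning asserted); the function name and the tuple layout are hypothetical.

```python
def adc_strobes(address, is_write, iostrb_active=True):
    """Return (chip_select, output_enable, start_conversion) for one I/O cycle."""
    xa12 = (address >> 12) & 1
    chip_select = xa12 == 0                            # converter selected when XA12 is low
    output_enable = iostrb_active and not is_write     # IOR: IOSTRB active during a read
    start_conversion = iostrb_active and is_write      # IOW: IOSTRB active during a write
    return chip_select, output_enable, start_conversion

read_cycle = adc_strobes(0x804000, is_write=False)
write_cycle = adc_strobes(0x804000, is_write=True)
other_device = adc_strobes(0x805000, is_write=True)
```

A read of 804000h asserts chip select and OE, a write asserts chip select and SC, and an access with XA12 high leaves the converter deselected.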

As with many A/D converters, at the end of a read cycle the AD1678 data out-
put lines enter a high-impedance state. This occurs after the output enable
(OE) or read control line goes inactive. Also common with these types of de-
vices is that the data output buffers often require a substantial amount of time
to actually attain a full high-impedance state. When used with the
TMS320C30-33, devices must have their outputs fully disabled no later than
65 ns following the rising edge of IOSTRB because the TMS320C30 will begin
driving the data bus at this point if the next cycle is a write. If this timing is not
met, bus conflicts between the TMS320C30 and the AD1678 might occur, po-
tentially causing degraded system performance and even failure due to dam-
aged data bus drivers. The actual disable time for the AD1678 can be as long
as 80 ns; therefore, buffers are required to isolate the converter outputs from
the TMS320C30. The buffers used here are 74LS244s that are enabled when
the AD1678 is read and turned off 30.8 ns following IOSTRB going high.
Therefore, the TMS320C30-33 requirement of 65 ns is met.

When data is read following a conversion, the AD1678 takes 100 ns after its
OE control line is asserted to provide valid data at its outputs. Thus, including
the propagation delay of the 74LS244 buffers, the total access time for reading
the converter is 118 ns. This requires two wait states on the TMS320C30-33
expansion I/O bus.
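
The figures quoted in the two preceding paragraphs can be tabulated as follows. This is a small arithmetic sketch using only numbers stated in the text; the constant names are hypothetical.

```python
AD1678_OE_TO_DATA_NS = 100.0  # converter access time from OE
LS244_PROP_DELAY_NS = 18.0    # buffer propagation delay (118 ns total in the text)
AD1678_DISABLE_NS = 80.0      # worst-case converter output disable time
BUFFER_OFF_NS = 30.8          # buffer turn-off after IOSTRB goes high
BUS_RELEASE_LIMIT_NS = 65.0   # TMS320C30-33 limit after IOSTRB goes high

read_access_ns = AD1678_OE_TO_DATA_NS + LS244_PROP_DELAY_NS
converter_alone_ok = AD1678_DISABLE_NS <= BUS_RELEASE_LIMIT_NS
buffered_ok = BUFFER_OFF_NS <= BUS_RELEASE_LIMIT_NS
release_margin_ns = round(BUS_RELEASE_LIMIT_NS - BUFFER_OFF_NS, 1)
```

The converter alone would miss the 65-ns limit, while the buffered design meets it with about 34 ns to spare.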

The two wait states required in this case are implemented using software wait
states; however, depending on the overall system configuration, it might be
necessary to implement a separate wait-state generator for the expansion bus
(refer to subsection 12.2.2 on page 12-9). This would be the case if multiple
devices that required different numbers of wait states were connected to the
expansion bus.


Figure 12–11 shows the timing for read operations between the
TMS320C30-33 and the AD1678. At the beginning of the cycle, the address
and XR/W lines become valid t1 = 10 ns following the falling edge of H1. Then,
after t2 = 10 ns from the next rising edge of H1, IOSTRB goes low, beginning
the active portion of the read cycle. After t3 = 5.8 ns (the control logic propaga-
tion delay), the IOR signal goes low, asserting the OE input to the AD1678. The
’74LS244 buffers take t4 = 30 ns to enable their outputs, and then, following
the converter’s access delay and the buffer propagation delay (t5 = 100 + 18
= 118 ns), data is provided to the TMS320C30. This provides approximately
46 ns of data setup before the rising edge of IOSTRB. Therefore, this design
easily satisfies the TMS320C30-33’s requirement of 15 ns of data setup time
for reads.

Figure 12–11. Read Operations Timing Between the TMS320C30 and AD1678

[Timing diagram: H1, XA12–XA0, IOSTRB, IOR, and read data, with intervals t1 through t5 marked.]

Unlike the primary bus, read and write cycles on the I/O expansion bus are
timed the same with the exception that XR/W is high for reads and low for
writes and that the data bus is driven by the TMS320C30 during writes. When
writing to the AD1678, the ’74LS244 buffers do not turn on and no data is trans-
ferred. The purpose of writing to the converter is only to generate a pulse on
the converter’s SC input, which initiates a conversion cycle. When a conver-
sion cycle is completed, the AD1678’s EOC output is used to generate an inter-
rupt on the TMS320C30 to indicate that the converted data can be read.

For other applications, the TLC1225 or TLC1550 A/D converters from Texas
Instruments can be beneficial. The TLC1225 is a
self-calibrating 12-bit-plus-sign bipolar or unipolar converter, which features
10-µs conversion times. The TLC1550 is a 10-bit, 6-µs converter with a high-
speed DSP interface. Both converters are parallel-interface devices.


12.3.2 D/A Converter Interface


In many DSP systems, the requirement for generating an analog output signal
is a natural consequence of sampling an analog waveform with an A/D conver-
ter and then processing the signal digitally internally. Interfacing D/A conver-
ters to the TMS320C30 on the expansion I/O bus is also quite straightforward.

As with A/D converters, D/A converters are also available in a number of vari-
eties. One of the major distinctions between various types of D/A converters
is whether or not the converter includes both latches to store the digital value
to be converted to an analog quantity, and the interface to control those
latches. With latches and control logic included with the converter, interface
design is often simplified; however, internal latches are often included only in
slower D/A converters.

Because slower converters limit signal bandwidths, the converter used in this
design was selected to allow a reasonably wide range of signal frequencies
to be processed, and to illustrate the technique of interfacing to a converter that
uses external data latches.

Figure 12–12 shows an interface to an Analog Devices AD565A digital-to-analog
converter. This device is a 12-bit, 250-ns current output DAC with an on-chip
10-volt reference. Using an off-chip current-to-voltage conversion circuit
connected according to the manufacturer’s specifications, the converter exhibits
an output signal range of 0 to +10 volts, which is compatible with the
conversion range of the A/D converter discussed in the previous section.


Figure 12–12. Interface Between the TMS320C30 and the AD565A


[Schematic: XD0–XD11 from the XD bus feed two 74LS377 latches (U25 and U26), whose CLK inputs are driven by IOW and whose EN inputs are driven from XA12; the latch outputs drive the AD565A inputs from bit 12 (LSB) to bit 1 (MSB); the AD565A output (DACOUT) feeds an LM318 amplifier that provides the analog output; other items shown are REF. OUT, REF. IN, REF. GND, 20 V SPAN, 10 V SPAN, a 50 Ω resistor, a 2.4 K resistor, a 10 pF capacitor, AGND, power GND, and +12 V and -12 V supplies.]

Because this DAC essentially performs continuous conversions based on the
digital value provided at its inputs, periodic sampling is maintained by
periodically updating the value stored in the external latches. Therefore, between
sample updates, the digital value is stored and maintained at the latch outputs
that provide the input to the DAC. This results in the analog output remaining
stable until the next sample update is performed.


The external data latches used in this interface are ’74LS377 devices that have
both clock and enable inputs. These latches serve as a convenient interface
with the TMS320C30; the enable inputs provide a device select function, and
the clock inputs latch the data. Therefore, with the enable input driven by in-
verted XA12 and the clock input by IOW, which is the AND of IOSTRB and
XR/W, data will be stored in the latches when a write is performed to I/O ad-
dress 805000h. Reading this address has no effect on the circuit.

Figure 12–13 shows a timing diagram of a write operation to the D/A converter
latches.

Figure 12–13. Write Operation to the D/A Converter Timing Diagram

[Timing diagram: H1, XA12–XA0, inverted XA12, IOSTRB, IOW, and XD31–XD0, with intervals t1 through t6 marked.]

Because the write is actually being performed to the latches, the key timings
for this operation are the timing requirements for these devices. For proper op-
eration, these latches require simply a minimal setup and hold time of data and
control signals with respect to the rising edge of the clock input. Specifically,
the latches require a data setup time of 20 ns, enable setup of 25 ns, disable
setup of 10 ns, and data and enable hold times of 5 ns. This design provides
approximately 60 ns of enable setup, 30 ns of data setup, and 7.2 ns of data
hold time. Therefore, the setup and hold times provided by this design are well
in excess of those required by the latches. The key timing parameters for this
interface are summarized in Table 12–2.
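
The margins implied by the figures above can be computed directly. This is an arithmetic sketch using only the requirements and the provided values stated in the text; the dictionary names are hypothetical.

```python
# Latch requirements and the values this design provides, in nanoseconds
REQUIRED_NS = {"data_setup": 20.0, "enable_setup": 25.0, "data_hold": 5.0}
PROVIDED_NS = {"data_setup": 30.0, "enable_setup": 60.0, "data_hold": 7.2}

# Margin = provided - required; every entry must be positive
margins_ns = {name: round(PROVIDED_NS[name] - REQUIRED_NS[name], 1)
              for name in REQUIRED_NS}
all_met = all(margin > 0 for margin in margins_ns.values())
```

The smallest margin in this design is the data hold time, at about 2.2 ns.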


Table 12–2. Key Timing Parameter for D/A Converter Write Operation
Time Interval   Event                            Time Period†
t1              H1 falling to address valid      10 ns
t2              XA12 to inverted XA12 delay      5 ns
t3              H1 rising to IOSTRB falling      10 ns
t4              IOSTRB to IOW delay              5.8 ns
t5              Data setup to IOW                30 ns
t6              Data hold from IOW               7.2 ns
† Timing for the TMS320C30-33


12.4 System Control Functions


Several aspects of TMS320C3x system hardware design are critical to overall
system operation. These include such functions as clock and reset signal gen-
eration and interrupt control.

12.4.1 Clock Oscillator Circuitry


You can provide an input clock to the TMS320C3x either from an external clock
input or by using the onboard oscillator. Unless special clock requirements ex-
ist, the onboard oscillator is generally a convenient method for clock genera-
tion. This method requires few external components and can provide stable,
reliable clock generation for the device.
Figure 12–14 shows the external clock generator circuit designed to operate
the TMS320C3x at 33.33 MHz. Since crystals with fundamental oscillation fre-
quencies of 30 MHz and above are not readily available, a parallel-resonant
third-overtone crystal is used; its fundamental (about 11.1 MHz, one third of
33.33 MHz) lies below the 13-MHz pole frequency chosen later in this subsection.

Figure 12–14. Crystal Oscillator Circuit

[Schematic: crystal connected between the X2/CLKIN and X1 pins of the TMS320C3x, with two 15 pF capacitors and a 10 µH inductor; labeled 13 MHz.]

In a third-overtone oscillator, the crystal fundamental frequency must be
attenuated so that oscillation is at the third harmonic. This is achieved with an
LC circuit that filters out the fundamental, thus allowing oscillation at the third
harmonic. The impedance of the LC circuit must be inductive at the crystal
fundamental and capacitive at the third harmonic. The impedance of the parallel
LC circuit is represented by

$$z(\omega) = \frac{L/C}{j\left(\omega L - \frac{1}{\omega C}\right)} \qquad (3)$$

Therefore, the LC circuit has a pole at

$$\omega_P = \frac{1}{\sqrt{LC}} \qquad (4)$$

At frequencies significantly lower than ωP, the 1/(ωC) term in (3) becomes the
dominating term, while ωL can be neglected. This is expressed as

$$z(\omega) \approx j\omega L \quad \text{for } \omega \ll \omega_P \qquad (5)$$

In (5), the LC circuit appears inductive at frequencies lower than ωP. On the
other hand, at frequencies much higher than ωP, the ωL term is the dominant
term in (3), and 1/(ωC) can be neglected. This is expressed as

$$z(\omega) \approx \frac{1}{j\omega C} \quad \text{for } \omega \gg \omega_P \qquad (6)$$

The LC circuit in (6) appears increasingly capacitive as the frequency increases
above ωP. This is shown in Figure 12–15, which is a plot of the magnitude of
the impedance of the LC circuit of Figure 12–14 versus frequency.

Figure 12–15. Magnitude of the Impedance of the Oscillator LC Network

[Plot: |z(ω)| versus ω (rad/s), with ωP = 1/√(LC) marked on the frequency axis.]


Based on the discussion above, the design of the LC circuit proceeds as fol-
lows:

1) Choose the pole frequency ωP slightly above the crystal fundamental.


2) The circuit now appears inductive at the fundamental frequency and ca-
pacitive at the third harmonic.

In the oscillator of Figure 12–14 on page 12-27, choose fP = 13 MHz, which
is slightly above the fundamental frequency of the crystal. Choose C = 15 pF.
Then, using equation (4), L = 10 µH.
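
The inductor value can be reproduced from equation (4). This is a minimal sketch, assuming ωP = 2πfP and the component values quoted above; the function name is hypothetical, and the one-third relation between the crystal fundamental and the 33.33-MHz operating frequency is an approximation.

```python
import math

def inductance_for_pole(f_pole_hz, c_farads):
    """Solve omega_P = 1 / sqrt(L * C) for L, with omega_P = 2 * pi * f_pole."""
    omega_p = 2.0 * math.pi * f_pole_hz
    return 1.0 / (omega_p ** 2 * c_farads)

F_POLE_HZ = 13e6                 # pole frequency chosen in the text
C_FARADS = 15e-12                # capacitor chosen in the text
F_OPERATING_HZ = 33.33e6         # third-overtone operating frequency
F_FUNDAMENTAL_HZ = F_OPERATING_HZ / 3.0  # approximate crystal fundamental

inductance_h = inductance_for_pole(F_POLE_HZ, C_FARADS)
pole_between = F_FUNDAMENTAL_HZ < F_POLE_HZ < F_OPERATING_HZ
```

The result is just under 10 µH, and the pole lies between the fundamental and the third harmonic as the design steps require.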

12.4.2 Reset Signal Generation

The reset input controls initialization of internal TMS320C3x logic and also
causes execution of the system initialization software. For proper system ini-
tialization, the reset signal must be applied for at least ten H1 cycles, i.e., 600
ns for a TMS320C3x operating at 33.33 MHz. Upon power-up, however, it can
take 20 ms or more before the system oscillator reaches a stable operating
state. Therefore, the power-up reset circuit should generate a low pulse on the
reset line for 100 to 200 ms. Once a proper reset pulse has been applied, the
processor fetches the reset vector from location 0, which contains the address
of the system initialization routine. Figure 12–16 shows a circuit that generates
an appropriate power-up reset signal.

Figure 12–16. Reset Circuit

[Schematic: R1 = 100 KΩ and C1 = 4.7 µF connected between +5 V and DGND, with two 74LS14 inverters driving the RS input of the TMS320C3x.]


The voltage on the reset pin (RESET) is controlled by the R1C1 network. After
a reset, this voltage rises exponentially according to the time constant R1C1,
as shown in Figure 12–17.

Figure 12–17. Voltage on the TMS320C30 Reset Pin

[Plot: voltage versus time, showing V = VCC (1 – e^(–t/τ)) rising from 0 at t0 = 0 toward VCC and crossing V1 at time t1.]

The duration of the low pulse on the reset pin is approximately t1, which is the
time it takes for the capacitor C1 to be charged to 1.5 V. This is approximately
the voltage at which the reset input switches from a logic 0 to a logic 1. The
capacitor voltage is expressed as

$$V = V_{CC}\left[1 - e^{-t/\tau}\right] \qquad (7)$$

where τ = R1C1 is the reset circuit time constant. Solving equation (7) for t
results in

$$t = -R_1 C_1 \ln\left[1 - \frac{V}{V_{CC}}\right] \qquad (8)$$

Setting the following:

R1 = 100 KΩ

C1 = 4.7 µF

VCC = 5 V

V = V1 = 1.5 V

results in t = 167 ms. Therefore, the reset circuit of Figure 12–16 provides a
low pulse of long enough duration to ensure the stabilization of the system os-
cillator.
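
The reset pulse duration can be checked with a short calculation. This is a sketch of equation (8) using the component values listed above; the function name is hypothetical, and the 60-ns H1 period follows from the 600 ns quoted for ten H1 cycles.

```python
import math

def reset_low_time_s(r_ohms, c_farads, vcc, v_threshold):
    """Time for the RC node to charge from 0 V to v_threshold, per equation (8)."""
    return -r_ohms * c_farads * math.log(1.0 - v_threshold / vcc)

t_reset_s = reset_low_time_s(100e3, 4.7e-6, 5.0, 1.5)

MIN_RESET_S = 10 * 60e-9      # ten H1 cycles at 33.33 MHz
OSCILLATOR_SETTLE_S = 20e-3   # oscillator start-up time quoted in the text
long_enough = t_reset_s > OSCILLATOR_SETTLE_S > MIN_RESET_S
```

The computed pulse is roughly 0.17 s, inside the 100-ms to 200-ms window recommended earlier in this subsection.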


Note that if synchronization of multiple TMS320C3xs is required, all
processors should be provided with the same input clock and the same reset signal.
After power-up, when the clock has stabilized, all processors can be synchro-
nized by generating a falling edge on the common reset signal. Because it is
the falling edge of reset that establishes synchronization, reset must be high
for at least ten H1 cycles initially. Following the falling edge, reset should re-
main low for at least ten H1 cycles and then be driven high. This sequencing
of reset can be accomplished using additional circuitry based on either RC
time delays or counters.


12.5 Serial-Port Interface


For applications such as modems, speech, control, instrumentation, and ana-
log interface for DSPs, a complete analog-to-digital (A/D) and digital-to-analog
(D/A) input/output system on a single chip might be appropriate. The
TLC32044 analog interface circuit (AIC) integrates a bandpass, switched-ca-
pacitor, antialiasing input filter, 14-bit resolution A/D and D/A converters, and
a low-pass, switched-capacitor, output-reconstruction filter, all on a single
monolithic CMOS chip. The TLC32044 offers numerous combinations of mas-
ter clock input frequencies and conversion/sampling rates, which can be
changed via digital signal processor control.

Four serial port modes on the TLC32044 allow direct interface to TMS320C3x
processors. When the transmit and receive sections of the AIC are operating
synchronously, it can interface to two SN54299 or SN74299 serial-to-parallel
shift registers. These shift registers can then interface in parallel to the
TMS320C30, to other TMS320 digital processors, or to external FIFO circuitry.
Output data pulses inform the processor that data transmission is complete or
allow the DSP to differentiate between two transmitted bytes. A flexible control
scheme is provided so that the functions of the AIC can be selected and ad-
justed coincidentally with signal processing via software control. Refer to the
TLC32044 data sheet for detailed information.

When you interface the AIC to the TMS320C3x via one of the serial ports, no
additional logic is required. This interface is shown in Figure 12–18. The serial
data, control, and clock signals connect directly between the two devices, and
the AIC’s master clock input is driven from TCLK0, one of the TMS320C3x’s
internal timer outputs. The AIC’s WORD/BYTE input is pulled high, selecting
16-bit serial port transfers to optimize serial port data transfer rate. The
TMS320C3x’s XF0 pin, configured as an output, is connected to the AIC’s re-
set (RST) input to allow the AIC to be reset by the TMS320C3x under program
control. This allows the TMS320C3x timer and serial port to be initialized be-
fore beginning conversions on the AIC.


Figure 12–18. AIC to TMS320C30 Interface

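[Schematic: TMS320C30 to TLC32044 connections: FSX0 to FSX, DX0 to DX, FSR0 to FSR, DR0 to DR, CLKX0 and CLKR0 to SHIFT CLK, TCLK0 to MSTR CLK, and XF0 to RST; WORD/BYTE is tied to +5 V; the analog input is applied at IN+ and IN–, and the analog output is taken from OUT+ and OUT–; supply and ground pins shown are VDD, VCC+, VCC–, AGND, and DGND.]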

To provide the master clock input for the AIC, the TCLK0 timer is configured
to generate a clock signal with a 50% duty cycle at a frequency of f(H1)/4 or
4.167 MHz. To accomplish this, the global control register for timer 0 is set to
the value 3C1h, which establishes the desired operating modes. The period
register for timer 0 is set to 1, which sets the required division ratio for the H1
clock.
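
The master clock frequency can be verified with a short calculation. This is a sketch under two assumptions: that f(H1) is half the 33.33-MHz input clock (consistent with the 60-ns H1 cycle used elsewhere in this chapter), and that the f(H1)/4 result stated for a period value of 1 scales as f(H1)/(4 × period) for other values. The function name is hypothetical.

```python
CLKIN_HZ = 100e6 / 3.0        # 33.33-MHz input clock
F_H1_HZ = CLKIN_HZ / 2.0      # H1 runs at half the input clock (60-ns cycle)

def tclk_frequency_hz(f_h1_hz, period):
    """TCLK frequency for a 50% duty-cycle output; period = 1 gives f(H1)/4."""
    return f_h1_hz / (4.0 * period)

aic_master_clock_hz = tclk_frequency_hz(F_H1_HZ, period=1)
```

For a period register value of 1, the sketch yields approximately 4.167 MHz, matching the text.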

To properly communicate with the AIC, the TMS320C30 serial port must be
configured appropriately by initializing several TMS320C30 registers and
memory locations. First, reset the serial port by setting the serial port global
control register to 2170300h. (The AIC should also be reset at this time. See
description below of resetting the AIC via XF0.) This resets the serial port logic,
configures the serial port operating modes, including data transfer lengths,
and enables the serial port interrupts. This also configures another important
aspect of serial port operation: polarity of serial port signals. Because active
polarity of all serial port signals is programmable, it is critical to set appropriate-
ly the bits in the serial port global control register that control the polarity. In this
application, all polarities are set to positive except FSX and FSR, which are
driven by the AIC and are true low.

The serial port transmit and receive control registers must also be initialized
for proper serial port operation. In this application, both of these registers are
set to 111h, which configures all of the serial port pins in the serial port mode,
rather than the general-purpose digital I/O mode.


When the operations described above are completed, interrupts are enabled,
and, provided that the serial port interrupt vector(s) are properly loaded, serial
port transfers can begin after the serial port is taken out of reset. You can do
this by loading E170300h into the serial port global control register.
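
The two register values differ only in the bits that release the port from reset, which can be confirmed with a quick comparison. This is an illustrative sketch; the identification of bits 26 and 27 as the transmit and receive reset bits (XRESET and RRESET) is an assumption that should be checked against the serial-port register description.

```python
SPGC_HELD_IN_RESET = 0x2170300  # value loaded to reset the serial port
SPGC_RUNNING = 0xE170300        # value loaded to take the port out of reset

changed_mask = SPGC_HELD_IN_RESET ^ SPGC_RUNNING
changed_bits = [bit for bit in range(32) if (changed_mask >> bit) & 1]
mode_bits_unchanged = (SPGC_HELD_IN_RESET & ~changed_mask) == (SPGC_RUNNING & ~changed_mask)
```

All mode and polarity settings are identical in the two values; only bits 26 and 27 change from 0 to 1.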
To begin conversion operations on the AIC and subsequent transfers of data
on the serial port, first reset the AIC by setting XF0 to 0 at the beginning of the
TMS320C3x initialization routine. Set XF0 to 0 by setting the TMS320C3x IOF
register to 2. This sets the AIC to a default configuration and halts serial port
transfers and conversion operations until reset is set high. Once the
TMS320C3x serial port and timer have been initialized as described above,
set XF0 high by setting the IOF register to 6. This allows the AIC to begin oper-
ating in its default configuration, which in this application is the desired mode.
In this mode, all internal filtering is enabled, sample rate is set at approximately
6.4 kHz, and the transmit and receive sections of the device are configured to
operate synchronously. This mode of operation is appropriate for a variety of
applications; if a 5.184-MHz master clock input is used, the default configura-
tion results in an 8-kHz sample rate, which makes this device ideal for speech
and telecommunications applications.
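
The two sample rates quoted above are consistent with a fixed division ratio in the default configuration. This is a sketch only: the ratio of 648 is derived here from the 5.184-MHz and 8-kHz figures in the text, not taken from the TLC32044 data sheet, and the function name is hypothetical.

```python
# Ratio implied by the text: a 5.184-MHz master clock gives an 8-kHz sample rate
DEFAULT_DIVISOR = 5.184e6 / 8.0e3

def default_sample_rate_hz(master_clock_hz):
    """Sample rate in the AIC default configuration for a given master clock."""
    return master_clock_hz / DEFAULT_DIVISOR

MASTER_CLOCK_HZ = 100e6 / 3.0 / 8.0   # f(H1)/4 with a 33.33-MHz input clock
sample_rate_hz = default_sample_rate_hz(MASTER_CLOCK_HZ)
```

With the 4.167-MHz clock from TCLK0, the sketch gives about 6.43 kHz, in line with the approximately 6.4 kHz stated above.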
In addition to the benefit of a convenient default operating configuration, the
AIC can also be programmed for a wide variety of other operating configura-
tions. Sample rates and filter characteristics can be varied, and numerous con-
nections in the device can be configured to establish different internal architec-
tures by enabling or disabling various functional blocks.
To configure the AIC in a fashion different from the default state, you must first
send the device a serial data word with the two LSBs set to 1. The two LSBs
of a transmitted data word are not part of the transferred data information and
are not set to 1 during normal operation. This condition indicates that the next
serial transmission will contain secondary control information, not data. This
information is then used to load various internal registers and specify internal
configuration options. Four different types of secondary control words are dis-
tinguished by the state of the two LSBs of the transferred control information.
Note that each transferred secondary control word must be preceded by a data
word with the two LSBs set to 1.
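
The framing rule described above can be sketched as follows. This is an illustration under stated assumptions: the 14-bit sample is taken to occupy the upper 14 bits of the 16-bit word, and the two LSBs are taken to be 00 during normal operation. The function names are hypothetical, and the content of the secondary control word itself must come from the TLC32044 data sheet.

```python
def data_word(sample14):
    """16-bit word for normal operation: sample in bits 15-2, both LSBs clear."""
    return (sample14 & 0x3FFF) << 2

def data_word_requesting_secondary(sample14):
    """Same data word with both LSBs set, announcing a secondary control word."""
    return data_word(sample14) | 0b11

def framed_transfer(sample14, control_word):
    """Sequence of two serial words: the announcing data word, then the control word."""
    return [data_word_requesting_secondary(sample14), control_word & 0xFFFF]

example = framed_transfer(0x0001, 0x1234)  # 0x1234 is a placeholder control word
```

Each secondary control word needs its own preceding data word with the LSBs set, so the helper always returns the two words as a pair.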
The TMS320C3x can communicate with the AIC either synchronously or
asynchronously, depending on the information in the control register. The op-
erating sequence for synchronous communication with the TMS320C30
shown in Figure 12–19 is as follows:
1) The FSX or FSR pin is brought low.
2) One 16-bit word is transmitted, or one 16-bit word is received.
3) The FSX or FSR pin is brought high.
4) The EODX or EODR pin emits a low-going pulse.


Figure 12–19. Synchronous Timing of TLC32044 to TMS320C3x

[Timing diagram: SHIFT CLK; FSR and FSX; DR and DX, each carrying bits D15 through D0; EODR and EODX.]

For asynchronous communication, the operating sequence is similar, but FSX
and FSR do not occur at the same time (see Figure 12–20). After each receive
and transmit operation, the TMS320C30 asserts an internal receive (RINT)
and transmit (XINT) interrupt, which can be used to control program execution.

Figure 12–20. Asynchronous Timing of TLC32044 to TMS320C30

[Timing diagram: FSX and FSR pulses occurring at different times.]


12.6 Low-Power-Mode Interrupt Interface


This section explains how to generate interrupts when the IDLE2 power-down
mode is used.

The execution of the IDLE2 instruction causes the H1 and H3 processor clocks
to be held at a constant level until the occurrence of an external interrupt. To
use the TMS320C31 IDLE2 power management feature effectively, interrupts
must be generated with or without the presence of the H1 clock. For normal
(non-IDLE2) operation, however, the interrupt inputs must be synchronized
with the falling edge of the H1 clock. An interrupt must satisfy the following
conditions:

- It must meet the setup time on the falling edge of H1, and
- It must be at least one cycle and less than two cycles in duration.

For an interrupt to be recognized during IDLE2 operation and turn the clocks
back on, it must first be held low for one H1 cycle. The logic in Figure 12–21
can be used to generate an interrupt signal to the TMS320C31 with the correct
timing during non-IDLE2 and IDLE2 operation. Figure 12–21 shows the inter-
rupt circuit, which uses a 16R4 PLD to generate the appropriate interrupt sig-
nal.

Figure 12–21. Interrupt Generation Circuit for Use With IDLE2 Operation

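[Schematic: the interrupt source drives pin 2 of a TIBPAL16R4; pin 12 of the TIBPAL16R4 drives INTx of the TMS320C31; H1 of the TMS320C31 drives the CLK input of the TIBPAL16R4.]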

Example 12–1 shows the PLD equations for the 16R4 using the ABEL lan-
guage. This implementation makes the following assumptions regarding the
interrupt source:

- The interrupt source is at least one H1 cycle in duration. One H1 cycle is
  required to turn the H1 clock on again.

- The interrupt source is a low-going pulse or a falling edge. If the interrupt
  source stays active for more than one H1 cycle, it is regarded as the same
  interrupt request and not a new one.


Notice that the interrupt is driven active as soon as the interrupt source goes
active. It goes inactive again on detection of two H3 rising edges. These two
rising edges ensure that the interrupt is recognized during normal operation
and after the end of IDLE2 operation (when the clocks turn on again). The interrupt
goes inactive after the two H3 clocks are counted and does not go active
again until after the interrupt source again goes inactive and returns to active.

Example 12–1. State Machine and Equations for the Interrupt Generation 16R4 PLD
MODULE INTERRUPT_GENERATION
TITLE 'INTERRUPT_GENERATION FOR IDLE2 AND NON-IDLE2 TMS320C31'
c3xu5 device 'P16R4';
"inputs
h3        Pin 1;
intsrc_   Pin 2;   "Interrupt source
"output
intx_     Pin 12;  "Interrupt input signal to the TMS320C31
sync_src_ Pin 14;  "Internal signal used to synchronize the
                   "input to the H1 clock
same_     Pin 15;  "Keeps track if the new interrupt source
                   "has occurred. If active, no new interrupt
                   "has occurred.
"This logic makes the following assumptions:
"The duration of the interrupt source is at least one H1
"cycle in duration. It takes one H1 cycle to turn the H1
"clock on again.
"The interrupt source is pulse- or level-triggered. If the
"source stays active after being asserted, it is regarded
"as the same interrupt request and not a new one.

"Name Substitutions for Test Vectors and Equations

c,H,L,X  = .C.,1,0,.X.;
source   = !intsrc_;
sync     = !sync_src_;
samesrc  = !same_;
c3xint   = !intx_;
"state bits
outstate = [samesrc,sync];
idle     = ^b00;
sync_st  = ^b01;   "synchronize state
wait     = ^b10;   "wait for interrupt source to go inactive


state_diagram outstate
state idle:
if (source) then sync_st
else idle;

state sync_st:
if (source) then wait
else idle;

state wait:
if (source) then wait
else idle;

equations
!intx_ = (source # sync) & !samesrc;
@page
”Test interrupt generation logic
test_vectors
([h3, source] -> [outstate,c3xint])
[ c, L ] -> [idle, L ]; "check start from idle
[ L, H ] -> [idle, H ]; "test normal interrupt operation
[ c, H ] -> [sync_st, H ];
[ c, L ] -> [idle, L ];
[ c, L ] -> [idle, L ];
[ L, H ] -> [idle, H ]; "test coming out of idle2 operation
[ L, H ] -> [idle, H ];
[ c, H ] -> [sync_st, H ];
[ c, L ] -> [idle, L ];
[ c, H ] -> [sync_st, H ]; "test same source
[ c, H ] -> [wait, L ];
[ c, H ] -> [wait, L ];
[ c, L ] -> [idle, L ];
[ L, H ] -> [idle, H ]; "test idle2 operation
[ L, H ] -> [idle, H ];
[ L, H ] -> [idle, H ];
end interrupt_generation
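
The behavior of the state machine in Example 12–1 can be explored with a small software model. This is a sketch in Python rather than ABEL, assuming that the state registers update only on a clocked vector and that the interrupt output is purely combinational; the function names are hypothetical and the model is not a substitute for simulating the PLD itself.

```python
# State encoding follows the listing: outstate = [samesrc, sync]
IDLE, SYNC_ST, WAIT = 0b00, 0b01, 0b10

def next_state(state, source):
    """State reached on a clock edge, per the state_diagram section."""
    if not source:
        return IDLE
    return SYNC_ST if state == IDLE else WAIT

def c3xint(state, source):
    """Combinational output: !intx_ = (source # sync) & !samesrc."""
    samesrc = bool(state & 0b10)
    sync = bool(state & 0b01)
    return bool((source or sync) and not samesrc)

def simulate(vectors):
    """vectors: (clocked, source) pairs; returns (state, c3xint) after each one."""
    state, trace = IDLE, []
    for clocked, source in vectors:
        if clocked:
            state = next_state(state, source)
        trace.append((state, c3xint(state, source)))
    return trace

# (clocked, source) inputs of the first five test vectors in the listing
normal_trace = simulate([(1, 0), (0, 1), (1, 1), (1, 0), (1, 0)])
# Inputs of the "test same source" group, starting from idle
same_source_trace = simulate([(1, 1), (1, 1), (1, 1), (1, 0)])
```

In this model the interrupt is asserted as soon as the source goes active, even without a clock, and is removed once the machine reaches the wait state, which matches the expected values in the test vectors.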


12.7 XDS Target Design Considerations

12.7.1 Designing Your MPSD Emulator Connector (12-Pin Header)


The ’C3x uses a modular port scan device (MPSD) technology to allow com-
plete emulation via a serial scan path of the ’C3x. To communicate with the
emulator, your target system must have a 12-pin header (2 rows of 6 pins) with
the connections that are shown in Figure 12–22. To use the target cable,
supply the signals shown in Table 12–3 to a 12-pin header with pin 8 cut out to
provide keying. For the latest information, refer to the JTAG/MPSD Emulation
Technical Reference (literature number SPDU079).

Figure 12–22. 12-Pin Header Signals and Header Dimensions


Pin   Signal       Pin   Signal
 1    EMU1†         2    GND
 3    EMU0†         4    GND
 5    EMU2†         6    GND
 7    PD(VCC)       8    no pin (key)‡
 9    EMU3         10    GND
11    H3           12    GND

Header Dimensions:
Pin-to-pin spacing: 0.100 in. (X,Y)
Pin width: 0.025-in. square post
Pin length: 0.235-in. nominal
† These signals should always be pulled up with separate 20-kΩ resistors to VCC.
‡ While the corresponding female position on the cable connector is plugged to prevent improper
connection, the cable lead for pin 8 is present in the cable and is grounded as shown in the
schematics and wiring diagrams in this document.

Table 12–3. 12-Pin Header Signal Descriptions and Pin Numbers

XDS510 Signal   Description       ’C30 Pin Number   ’C31 Pin Number
EMU0            Emulation pin 0   F14               124
EMU1            Emulation pin 1   E15               125
EMU2            Emulation pin 2   F13               126
EMU3            Emulation pin 3   E14               123
H3              ’C3x H3           A1                82
PD              Presence detect. Indicates that the emulation cable is connected
                and that the target is powered up. PD should be tied to VCC in
                the target system.

Although you can use other headers, recommended parts include:

Straight header, unshrouded: DuPont Connector Systems part numbers
65610–112, 65611–112, 67996–112, and 67997–112


Figure 12–23 shows a portion of the logic in the emulator pod. Note that 33-Ω
resistors have been added to the EMU0, EMU1, and EMU2 lines to minimize
cable reflections.

Figure 12–23. Emulator Cable Pod Interface

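[Schematic not reproduced. Signals shown: EMU1 (pin 1), EMU0 (pin 3), and
EMU2 (pin 5), each with a 33-Ω series resistor; EMU3 (pin 9); H3 (pin 11);
PD (VCC, pin 7); and GND (pins 2, 4, 6, 8, 10, 12). Components shown:
74LVT240, 74F175, 74AS1004, TL7705A (RESIN input), jumpers JP1 and JP2,
two 180-Ω/270-Ω resistor pairs referenced to +5 V, and a 100-Ω resistor.]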

12.7.2 MPSD Emulator Cable Signal Timing


Figure 12–24 shows the signal timings for the emulator pod. Table 12–4 defines
the timing parameters. The timing parameters are calculated from values
specified in the standard data sheets for the emulator and cable pod and are
for reference only. Texas Instruments does not test or guarantee these timings.


Figure 12–24. Emulator Cable Pod Timings


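[Timing diagram not reproduced. It shows the H3 waveform (parameters 1, 2,
and 3), the EMU0, EMU1, and EMU2 waveforms (parameter 4), and the EMU3
waveform (parameters 5 and 6). The parameters are defined in Table 12–4.]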

Table 12–4. Emulator Cable Pod Timing Parameters

No.    Reference            Description                      Min    Max    Unit
1      tH3 min, tH3 max     H3 period                        35     200    ns
2      tH3 high min         H3 high pulse duration           15            ns
3      tH3 low min          H3 low pulse duration            15            ns
4      td (EMU0, 1, 2)      EMU0, 1, 2 valid from H3 low     7      23     ns
5      tsu (EMU3)           EMU3 setup time to H3 high       3             ns
6      thd (EMU3)           EMU3 hold time from H3 high      11            ns
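
The H3 limits in Table 12–4 can be expressed as a simple range check on a measured clock cycle. The following Python sketch is illustrative only: the function name `check_h3_timing` and the measured values are hypothetical, the single values listed for parameters 2 and 3 are treated as minimums, and the limits themselves are reference values that are not tested or guaranteed.

```python
# Reference limits for H3 taken from Table 12-4 (all values in nanoseconds).
H3_PERIOD_MIN_NS = 35.0
H3_PERIOD_MAX_NS = 200.0
H3_HIGH_MIN_NS = 15.0
H3_LOW_MIN_NS = 15.0


def check_h3_timing(high_ns, low_ns):
    """Return a list of Table 12-4 violations for one measured H3 cycle.

    high_ns and low_ns are the measured high and low pulse durations.
    The period is taken as their sum; edge transition times are ignored,
    which is a simplifying assumption of this sketch.
    """
    problems = []
    period_ns = high_ns + low_ns
    if period_ns < H3_PERIOD_MIN_NS:
        problems.append("period below 35 ns")
    if period_ns > H3_PERIOD_MAX_NS:
        problems.append("period above 200 ns")
    if high_ns < H3_HIGH_MIN_NS:
        problems.append("high pulse below 15 ns")
    if low_ns < H3_LOW_MIN_NS:
        problems.append("low pulse below 15 ns")
    return problems


if __name__ == "__main__":
    print(check_h3_timing(30.0, 30.0))  # hypothetical 60-ns cycle
    print(check_h3_timing(10.0, 20.0))  # hypothetical out-of-range cycle
```

An empty list means the measured cycle falls inside the reference window; each string names one limit that was exceeded.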

12.7.3 Connections Between the Emulator and the Target System


It is extremely important to provide high-quality signals between the emulator
and the ’C3x on the target system. In many cases, the signals must be buffered
to maintain signal quality. The need for signal buffering falls into three
categories, depending on the placement of the emulation header:

- No signals buffered. In this situation, the distance between the emulation
  header and the ’C3x should be no more than two inches. (See
  Figure 12–25.)


Figure 12–25. Signals Between the Emulator and the ’C3x With No Signals Buffered
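[Connection diagram not reproduced. Distance between the TMS320C3x and the
emulator header: 2 inches or less. EMU0 (header pin 3), EMU1 (pin 1),
EMU2 (pin 5), EMU3 (pin 9), and H3 (pin 11) connect directly to the
same-named ’C3x pins. PD (pin 7) connects to VCC. Pins 2, 4, 6, 8, 10,
and 12 connect to GND.]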

- Transmission signals buffered. In this situation, the distance between
  the emulation header and the ’C3x is greater than two inches but less than
  six inches. The transmission signals, H3 and EMU3, are buffered through
  the same package. (See Figure 12–26.)

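Figure 12–26. Signals Between the Emulator and the ’C3x With Transmission Signals Buffered

[Connection diagram not reproduced. Distance between the TMS320C3x and the
emulator header: 2 to 6 inches. EMU0 (header pin 3), EMU1 (pin 1),
EMU2 (pin 5), EMU3 (pin 9), and H3 (pin 11) connect to the same-named ’C3x
pins, with H3 and EMU3 buffered. PD (pin 7) connects to VCC. Pins 2, 4, 6,
8, 10, and 12 connect to GND.]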


- All signals buffered. The distance between the emulation header and the
’C3x is greater than 6 inches but less than 12 inches. All ’C3x emulation
signals, EMU0, EMU1, EMU2, EMU3, and H3, are buffered through the
same package. (See Figure 12–27.)

Figure 12–27. All Signals Buffered

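[Connection diagram not reproduced. Distance between the TMS320C3x and the
emulator header: 6 to 12 inches. EMU0 (header pin 3), EMU1 (pin 1),
EMU2 (pin 5), EMU3 (pin 9), and H3 (pin 11) connect to the same-named ’C3x
pins, all buffered. PD (pin 7) connects to VCC. Pins 2, 4, 6, 8, 10, and 12
connect to GND.]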

H3 Buffer Restrictions

Don’t connect any devices between the buffered H3 output and the header!
Otherwise, you will degrade the quality of the signal.
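
The three buffering categories above amount to a lookup on the header-to-’C3x distance. The Python sketch below is a minimal illustration, not part of the emulator tools; the function name `buffering_category` is hypothetical. The text does not say how a distance of exactly 6 inches should be treated, so the sketch assumes the more conservative choice (all signals buffered), and it rejects distances of 12 inches or more because the text does not cover them.

```python
# Signals to buffer for each category described in Section 12.7.3.
BUFFERED_SIGNALS = {
    "none": [],
    "transmission": ["H3", "EMU3"],
    "all": ["EMU0", "EMU1", "EMU2", "EMU3", "H3"],
}


def buffering_category(distance_in):
    """Classify the header-to-'C3x distance (in inches)."""
    if distance_in < 0:
        raise ValueError("distance must be non-negative")
    if distance_in <= 2:
        return "none"          # no more than two inches
    if distance_in < 6:
        return "transmission"  # greater than 2, less than 6 inches
    if distance_in < 12:
        return "all"           # 6 up to (not including) 12 inches; 6 is an assumption
    raise ValueError("distances of 12 inches or more are not covered")


if __name__ == "__main__":
    for d in (1.5, 4.0, 9.0):  # hypothetical board distances
        cat = buffering_category(d)
        print(d, cat, BUFFERED_SIGNALS[cat])
```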

12.7.4 Mechanical Dimensions for the 12-Pin Emulator Connector

The ’C3x emulator target cable consists of a three-foot section of jacketed
cable, an active cable pod, and a short section of jacketed cable that connects
to the target system. The overall cable length is approximately three feet, ten
inches. Figure 12–28 and Figure 12–29 show the mechanical dimensions for
the target cable pod and short cable. Note that the pin-to-pin spacing on the
connector is 0.100 inches in both the X and Y planes. The cable pod box is
nonconductive plastic with four recessed metal screws.


Figure 12–28. Pod/Connector Dimensions

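[Mechanical drawing not reproduced. It shows the emulator cable pod, the
short jacketed cable, and the connector (refer to Figure 12–29), with
dimension callouts of 2.70, 4.50, 9.50, and 0.90.]

Note: All dimensions are in inches and are nominal unless otherwise specified.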


Figure 12–29. 12-Pin Connector Dimensions

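[Mechanical drawing not reproduced. Connector, side view: cable and
connector with callouts of 0.20 and 0.38. Connector, front view: one row
with pins 1, 3, 5, 7, 9, 11 and one row with pins 2, 4, 6, 8, 10, 12, with
the key at pin 8 and callouts of 0.100, 0.100, and 0.70.]

Note: All dimensions are in inches and are nominal unless otherwise specified.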

12.7.5 Diagnostic Applications

For system diagnostics applications, or to embed emulation compatibility on
your target system, you can connect a ’C3x device directly to a TI ACT8990
test bus controller (TBC) as shown in Figure 12–30. The TBC is described in
the Texas Instruments Advanced Logic and Bus Interface Logic Data Book
(literature number SCYD001). A TBC can connect to only one ’C3x device.


Figure 12–30. TBC Emulation Connections for ’C3x Scan Paths


[Connection diagram not reproduced. It shows three 22-kΩ resistors to VCC
and the following TBC-to-’C3x connections.]

TBC Pin        ’C3x Pin
TMS0           EMU0
TMS1           EMU1
TDO            EMU2
TCKO           EMU4
TCKI           H1 (clock)
TDI0           EMU3
TDI1           EMU5
TMS2/EVNT0     EMU6
TMS3/EVNT1     (no connection shown)
TMS4/EVNT2     (no connection shown)
TMS5/EVNT3     (no connection shown)

Notes: 1) In a ’C3x design, the TBC can connect to only one ’C3x device.
       2) The ’C3x device’s H1 clock drives TCKI on the TBC. This is different from the
          emulation header connections, where H3 is used.

Chapter 13

TMS320C3x Signal Descriptions and Electrical Characteristics

This chapter covers the TMS320C3x pinouts, signal descriptions, and
electrical characteristics.

Major topics discussed in this chapter are as follows:

Topic

13.1 Pinout and Pin Assignments
13.2 Signal Descriptions
13.3 Electrical Specifications
13.4 Signal Transition Levels
13.5 Timing


13.1 Pinout and Pin Assignments

13.1.1 TMS320C30 Pinouts and Pin Assignments


The TMS320C30 digital signal processor is available in a 181-pin grid array
(PGA) package. Figure 13–1 and Figure 13–2 show the pinout for this package.
Figure 13–3 shows the mechanical layout. Table 13–1 shows the associated pin
assignments alphabetically; Table 13–2 shows the pin assignments numerically.


Figure 13–1. TMS320C30 Pinout (Top View)

Columns, left to right: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

A:  H3 D2 D3 D7 D10 D13 D16 D17 D19 D22 D25 D28 XA0 XA1 XA5
B:  X2/CLKIN CVSS H1 D4 D8 D11 D15 D18 D20 D24 D27 D31 XA4 IVSS XA6
C:  EMU5 X1 DVSS D0 D5 D9 D14 VSS D21 D26 D30 XA3 DVSS XA7 XA10
D:  XR/W XRDY VBBP DDVDD D1 D6 D12 VDD D23 D29 XA2 ADVDD XA9 XA11 MC/MP
E:  RDY HOLDA MSTRB VSUBS LOCATOR DDVDD XA8 XA12 EMU3 EMU1
F:  RESET STRB HOLD IOSTRB EMU4/SHZ EMU2 EMU0 A0
G:  IACK XF0 XF1 R/W A1 A2 A3 A4
H:  INT1 INT0 VSS VDD MDVDD ADVDD VDD VSS A6 A5
J:  INT2 INT3 RSV0 RSV1 A11 A9 A8 A7
K:  RSV2 RSV3 RSV5 RSV7 A17 A14 A12 A10
L:  RSV4 RSV6 RSV9 CLKR1 IODVDD A22 A18 A15 A13
M:  RSV8 RSV10 FSR1 PDVDD CLKX0 EMU6 XD5 VDD XD16 XD22 XD27 IODVDD A21 A19 A16
N:  DR1 CLKX1 DVSS CLKR0 TCLK1 XD2 XD7 VSS XD14 XD19 XD23 XD28 DVSS A23 A20
P:  FSX1 DX1 FSR0 TCLK0 XD1 XD4 XD8 XD10 XD13 XD17 XD20 XD24 XD29 CVSS XD31
R:  DR0 FSX0 DX0 XD0 XD3 XD6 XD9 XD11 XD12 XD15 XD18 XD21 XD25 XD26 XD30

Rows E through L are populated only at their outer positions, so the entries
in those rows do not line up with the column numbers; Table 13–2 gives the
row and column of every pin.


Figure 13–2. TMS320C30 Pinout (Bottom View)


Columns, left to right: 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1

A:  XA5 XA1 XA0 D28 D25 D22 D19 D17 D16 D13 D10 D7 D3 D2 H3
B:  XA6 IVSS XA4 D31 D27 D24 D20 D18 D15 D11 D8 D4 H1 CVSS X2/CLKIN
C:  XA10 XA7 DVSS XA3 D30 D26 D21 VSS D14 D9 D5 D0 DVSS X1 EMU5
D:  MC/MP XA11 XA9 ADVDD XA2 D29 D23 VDD D12 D6 D1 DDVDD VBBP XRDY XR/W
E:  EMU1 EMU3 XA12 XA8 DDVDD LOCATOR VSUBS MSTRB HOLDA RDY
F:  A0 EMU0 EMU2 EMU4/SHZ IOSTRB HOLD STRB RESET
G:  A4 A3 A2 A1 R/W XF1 XF0 IACK
H:  A5 A6 VSS VDD ADVDD MDVDD VDD VSS INT0 INT1
J:  A7 A8 A9 A11 RSV1 RSV0 INT3 INT2
K:  A10 A12 A14 A17 RSV7 RSV5 RSV3 RSV2
L:  A13 A15 A18 A22 IODVDD CLKR1 RSV9 RSV6 RSV4
M:  A16 A19 A21 IODVDD XD27 XD22 XD16 VDD XD5 EMU6 CLKX0 PDVDD FSR1 RSV10 RSV8
N:  A20 A23 DVSS XD28 XD23 XD19 XD14 VSS XD7 XD2 TCLK1 CLKR0 DVSS CLKX1 DR1
P:  XD31 CVSS XD29 XD24 XD20 XD17 XD13 XD10 XD8 XD4 XD1 TCLK0 FSR0 DX1 FSX1
R:  XD30 XD26 XD25 XD21 XD18 XD15 XD12 XD11 XD9 XD6 XD3 XD0 DX0 FSX0 DR0

Rows E through L are populated only at their outer positions, so the entries
in those rows do not line up with the column numbers; Table 13–2 gives the
row and column of every pin.


Figure 13–3. TMS320C30 181-Pin PGA Dimensions—GEL Package

Thermal Resistance Characteristics

Parameter    °C/W    Air Flow (LFPM)
RΘJC         2.0     N/A
RΘJA         21.8    0
RΘJA         N/A     200
RΘJA         N/A     400
RΘJA         N/A     600
RΘJA         N/A     800
RΘJA         N/A     1000

[Mechanical drawing not reproduced. It shows a bottom view of the pin grid
(rows A through R, columns 1 through 15) with the locator pin. Dimension
callouts in the drawing: 40.38 (1.590) / 39.62 (1.560), shown twice;
35.86 (1.412) / 35.26 (1.388); 5.02 (0.198) / 3.88 (0.152);
1.52 (0.060) / 1.02 (0.040); 1,27 (0.050) Nom; 3.68 (.145) / 2.92 (.115);
.510 (.020) / .410 (.016) Dia; 2,54 (0.100) T.P.; 2,54 (0.100) TYP; and
the annotations (4 Places) and (181 Places).]

All linear dimensions are in millimeters and parenthetically in inches.


Table 13–1. TMS320C30–PGA Pin Assignments (Alphabetical)†


Signal Pin Signal Pin Signal Pin Signal Pin Signal Pin
A0 F15 D8 B5 EMU0 F14 VBBP D3 XD15 R10
A1 G12 D9 C6 FSR0 P3 VDD D8 XD16 M9
A2 G13 D10 A5 FSR1 M3 VDD H4 XD17 P10
A3 G14 D11 B6 FSX0 R2 VDD H12 XD18 R11
A4 G15 D12 D7 FSX1 P1 VDD M8 XD19 N10
A5 H15 D13 A6 H1 B3 VSS C8 XD20 P11
A6 H14 D14 C7 H3 A1 VSS H3 XD21 R12
A7 J15 D15 B7 HOLD F3 VSS H13 XD22 M10
A8 J14 D16 A7 HOLDA E2 VSS N8 XD23 N11
A9 J13 D17 A8 IACK G1 VSUBS E4 XD24 P12
A10 K15 D18 B8 INT0 H2 X1 C2 XD25 R13
A11 J12 D19 A9 INT1 H1 X2/CLKIN B1 XD26 R14
A12 K14 D20 B9 INT2 J1 XA0 A13 XD27 M11
A13 L15 D21 C9 INT3 J2 XA1 A14 XD28 N12
A14 K13 D22 A10 IODVDD L8 XA2 D11 XD29 P13
A15 L14 D23 D9 IODVDD M12 XA3 C12 XD30 R15
A16 M15 D24 B10 IOSTRB F4 XA4 B13 XD31 P15
A17 K12 D25 A11 IVSS B14 XA5 A15 XF0 G2
A18 L13 D26 C10 LOCATOR E5 XA6 B15 XF1 G3
A19 M14 D27 B11 MC/MP D15 XA7 C14 XRDY D2
A20 M13 D28 A12 MDVDD H5 XA8 E12 XR/W D1
A21 N15 D29 D10 MSTRB E3 XA9 D13
A22 L12 D30 C11 PDVDD M4 XA10 C15
A23 N14 D31 B12 RDY E1 XA11 D14
ADVDD D12 DDVDD D4 RESET F1 XA12 E13
ADVDD H11 DDVDD E8 RSV0 J3 XD0 R4
CLKR0 N4 DR0 R1 RSV1 J4 XD1 P5
CLKR1 L4 DR1 N1 RSV2 K1 XD2 N6
CLKX0 M5 DVSS C3 RSV3 K2 XD3 R5
CLKX1 N2 DVSS C13 RSV4 L1 XD4 P6
CVSS B2 DVSS N3 RSV5 K3 XD5 M7
CVSS P14 DVSS N13 RSV6 L2 XD6 R6
D0 C4 DX0 R3 RSV7 K4 XD7 N7
D1 D5 DX1 P2 RSV8 M1 XD8 P7
D2 A2 EMU1 E15 RSV9 L3 XD9 R7
D3 A3 EMU2 F13 RSV10 M2 XD10 P8
D4 B4 EMU3 E14 R/W G4 XD11 R8
D5 C5 EMU4/SHZ F12 STRB F2 XD12 R9
D6 D6 EMU5 C1 TCLK0 P4 XD13 P9
D7 A4 EMU6 M6 TCLK1 N5 XD14 N9
† ADVDD, CVSS, DDVDD, DVSS, IODVDD, IVSS, MDVDD, PDVDD, VDD, and VSS pins are on a common plane internal to the
device.


Table 13–2. TMS320C30–PGA Pin Assignments (Numerical)†


Signal Pin Signal Pin Signal Pin Signal Pin Signal Pin
H3 A1 D30 C11 XF1 G3 A13 L15 XD17 P10
D2 A2 XA3 C12 R/W G4 RSV8 M1 XD20 P11
D3 A3 DVSS C13 A1 G12 RSV10 M2 XD24 P12
D7 A4 XA7 C14 A2 G13 FSR1 M3 XD29 P13
D10 A5 XA10 C15 A3 G14 PDVDD M4 CVSS P14
D13 A6 XR/W D1 A4 G15 CLKX0 M5 XD31 P15
D16 A7 XRDY D2 INT1 H1 EMU6 M6 DR0 R1
D17 A8 VBBP D3 INT0 H2 XD5 M7 FSX0 R2
D19 A9 DDVDD D4 VSS H3 VDD M8 DX0 R3
D22 A10 D1 D5 VDD H4 XD16 M9 XD0 R4
D25 A11 D6 D6 MDVDD H5 XD22 M10 XD3 R5
D28 A12 D12 D7 ADVDD H11 XD27 M11 XD6 R6
XA0 A13 VDD D8 VDD H12 IODVDD M12 XD9 R7
XA1 A14 D23 D9 VSS H13 A20 M13 XD11 R8
XA5 A15 D29 D10 A6 H14 A19 M14 XD12 R9
X2/CLKIN B1 XA2 D11 A5 H15 A16 M15 XD15 R10
CVSS B2 ADVDD D12 INT2 J1 DR1 N1 XD18 R11
H1 B3 XA9 D13 INT3 J2 CLKX1 N2 XD21 R12
D4 B4 XA11 D14 RSV0 J3 DVSS N3 XD25 R13
D8 B5 MC/MP D15 RSV1 J4 CLKR0 N4 XD26 R14
D11 B6 RDY E1 A11 J12 TCLK1 N5 XD30 R15
D15 B7 HOLDA E2 A9 J13 XD2 N6
D18 B8 MSTRB E3 A8 J14 XD7 N7
D20 B9 VSUBS E4 A7 J15 VSS N8
D24 B10 LOCATOR E5 RSV2 K1 XD14 N9
D27 B11 DDVDD E8 RSV3 K2 XD19 N10
D31 B12 XA8 E12 RSV5 K3 XD23 N11
XA4 B13 XA12 E13 RSV7 K4 XD28 N12
IVSS B14 EMU3 E14 A17 K12 DVSS N13
XA6 B15 EMU1 E15 A14 K13 A23 N14
EMU5 C1 RESET F1 A12 K14 A21 N15
X1 C2 STRB F2 A10 K15 FSX1 P1
DVSS C3 HOLD F3 RSV4 L1 DX1 P2
D0 C4 IOSTRB F4 RSV6 L2 FSR0 P3
D5 C5 EMU4/SHZ F12 RSV9 L3 TCLK0 P4
D9 C6 EMU2 F13 CLKR1 L4 XD1 P5
D14 C7 EMU0 F14 IODVDD L8 XD4 P6
VSS C8 A0 F15 A22 L12 XD8 P7
D21 C9 IACK G1 A18 L13 XD10 P8
D26 C10 XF0 G2 A15 L14 XD13 P9
† ADVDD, CVSS, DDVDD, DVSS, IODVDD, IVSS, MDVDD, PDVDD, VDD, and VSS pins are on a common plane internal to the
device.


13.1.2 TMS320C30 PPM Pinouts and Pin Assignments


The TMS320C30 PPM device is packaged in a 208-pin plastic quad flat pack
(PQFP) JEDEC standard package. Figure 13–4 shows the pinouts for this
package, and Figure 13–5 shows the mechanical layout. Table 13–3 shows the
associated pin assignments alphabetically; Table 13–4 shows the assignments
numerically.

Figure 13–4. TMS320C30 PPM Pinout (Top View)


[Pinout diagram not reproduced. Pins are numbered 1 through 208 around the
package, with package corners between pins 208 and 1, 52 and 53, 104 and 105,
and 156 and 157. The signal on each pin is listed in Table 13–4.]


Figure 13–5. TMS320C30 PPM 208-Pin Plastic Quad Flat Pack—PQL Package

[Mechanical drawing 4040016/A–10/93 not reproduced. Pins are numbered 1–52,
53–104, 105–156, and 157–208 on the four sides. Dimension callouts in the
drawing:]

    30,7 (1.209) / 30,5 (1.201) SQ
    28,1 (1.106) / 27,9 (1.098) SQ
    0,28 (0.01102) / 0,18 (0.00709)
    0,50 (0.01968) TYP
    0,20 (0.008) / 0,12 (0.005)
    3,6 (0.142) / 3,4 (0.134)
    0,25 (0.010) MIN
    0,60 (0.024) / 0,40 (0.016)
    4,20 (0.165) MAX
    Seating plane: 0°–5°

Notes: 1) All linear dimensions are in millimeters and parenthetically in inches.
       2) This drawing is subject to change without notice.
       3) Contact a field sales office to determine if a tighter coplanarity requirement is available for this package.


Table 13–3. TMS320C30–PPM Pin Assignments (Alphabetical)†

Signal Pin Signal Pin Signal Pin Signal Pin Signal Pin
A0 139 D6 197 EMU0 140 RSV3 37 XA10 148
A1 138 D7 196 EMU1 141 RSV4 38 XA11 147
A2 137 D8 195 EMU2 142 RSV5 39 XA12 146
A3 136 D9 194 EMU3 143 RSV6 40 XD0 64
A4 135 D10 193 EMU4/SHZ 144 RSV7 41 XD1 65
A5 134 D11 192 EMU5 9 RSV8 42 XD2 66
A6 129 D12 191 EMU6 63 RSV9 43 XD3 69
A7 128 D13 190 FSR0 56 RSV10 44 XD4 70
A8 127 D14 189 FSR1 46 R/W 20 XD5 71
A9 126 D15 188 FSX0 59 STRB 19 XD6 72
A10 125 D16 187 FSX1 49 TCLK0 61 XD7 73
A11 124 D17 186 H1 204 TCLK1 62 XD8 74
A12 123 D18 180 H3 205 VBBP 8 XD9 75
A13 122 D19 179 HOLD 15 VDD 26 XD10 76
A14 119 D20 178 HOLDA 14 VDD 27 XD11 82
A15 118 D21 177 IACK 24 VDD 77 XD12 83
A16 117 D22 176 INT0 25 VDD 78 XD13 84
A17 116 D23 175 INT1 31 VDD 130 XD14 85
A18 115 D24 174 INT2 32 VDD 131 XD15 86
A19 114 D25 173 INT3 33 VDD 181 XD16 87
A20 113 D26 170 IODVDD 67 VDD 182 XD17 88
A21 112 D27 169 IODVDD 68 VSS 29 XD18 89
A22 111 D28 168 IODVDD 102 VSS 30 XD19 90
A23 110 D29 167 IODVDD 103 VSS 80 XD20 91
ADVDD 120 D30 166 IVSS 153 VSS 81 XD21 92
ADVDD 121 D31 165 IVSS 154 VSS 132 XD22 93
ADVDD 157 DDVDD 171 MC/MP 145 VSS 133 XD23 94
ADVDD 158 DDVDD 172 MDVDD 16 VSS 184 XD24 95
CLKR0 57 DDVDD 206 MDVDD 17 VSS 185 XD25 96
CLKR1 47 DDVDD 207 MSTRB 11 VSUBS 7 XD26 97
CLKX0 58 DR0 55 NC 28 X1 6 XD27 98
CLKX1 48 DR1 45 NC 79 X2/CLKIN 5 XD28 99
CVSS 3 DVSS 1 NC 104 XA0 164 XD29 100
CVSS 4 DVSS 2 NC 183 XA1 163 XD30 101
CVSS 107 DVSS 51 NC 208 XA2 162 XD31 109
CVSS 108 DVSS 52 PDVDD 53 XA3 161 XF0 23
D0 203 DVSS 105 PDVDD 54 XA4 160 XF1 22
D1 202 DVSS 106 RDY 18 XA5 159 XRDY 10
D2 201 DVSS 155 RESET 21 XA6 152 XR/W 13
D3 200 DVSS 156 RSV0 34 XA7 151 XSTRB 12
D4 199 DX0 60 RSV1 35 XA8 150
D5 198 DX1 50 RSV2 36 XA9 149
† ADVDD, CVSS, DDVDD, DVSS, IODVDD, IVSS, MDVDD, PDVDD, VDD, and VSS pins are on a common plane internal to the
device.


Table 13–4. TMS320C30–PPM Pin Assignments (Numerical)†


Pin Signal Pin Signal Pin Signal Pin Signal Pin Signal
1 DVSS 43 RSV9 85 XD14 127 A8 169 D27
2 DVSS 44 RSV10 86 XD15 128 A7 170 D26
3 CVSS 45 DR1 87 XD16 129 A6 171 DDVDD
4 CVSS 46 FSR1 88 XD17 130 VDD 172 DDVDD
5 X2 47 CLKR1 89 XD18 131 VDD 173 D25
6 X1 48 CLKX1 90 XD19 132 VSS 174 D24
7 VSUBS 49 FSX1 91 XD20 133 VSS 175 D23
8 VBBP 50 DX1 92 XD21 134 A5 176 D22
9 EMU5 51 DVSS 93 XD22 135 A4 177 D21
10 XRDY 52 DVSS 94 XD23 136 A3 178 D20
11 MSTRB 53 PDVDD 95 XD24 137 A2 179 D19
12 XSTRB 54 PDVDD 96 XD25 138 A1 180 D18
13 XR/W 55 DR0 97 XD26 139 A0 181 VDD
14 HOLDA 56 FSR0 98 XD27 140 EMU0 182 VDD
15 HOLD 57 CLKR0 99 XD28 141 EMU1 183 NC
16 MDVDD 58 CLKX0 100 XD29 142 EMU2 184 VSS
17 MDVDD 59 FSX0 101 XD30 143 EMU3 185 VSS
18 RDY 60 DX0 102 IODVDD 144 EMU4/SHZ 186 D17
19 STRB 61 TCLK0 103 IODVDD 145 MC/MP 187 D16
20 R/W 62 TCLK1 104 NC 146 XA12 188 D15
21 RESET 63 EMU6 105 DVSS 147 XA11 189 D14
22 XF1 64 XD0 106 DVSS 148 XA10 190 D13
23 XF0 65 XD1 107 CVSS 149 XA9 191 D12
24 IACK 66 XD2 108 CVSS 150 XA8 192 D11
25 INT0 67 IODVDD 109 XD31 151 XA7 193 D10
26 VDD 68 IODVDD 110 A23 152 XA6 194 D9
27 VDD 69 XD3 111 A22 153 IVSS 195 D8
28 NC 70 XD4 112 A21 154 IVSS 196 D7
29 VSS 71 XD5 113 A20 155 DVSS 197 D6
30 VSS 72 XD6 114 A19 156 DVSS 198 D5
31 INT1 73 XD7 115 A18 157 ADVDD 199 D4
32 INT2 74 XD8 116 A17 158 ADVDD 200 D3
33 INT3 75 XD9 117 A16 159 XA5 201 D2
34 RSV0 76 XD10 118 A15 160 XA4 202 D1
35 RSV1 77 VDD 119 A14 161 XA3 203 D0
36 RSV2 78 VDD 120 ADVDD 162 XA2 204 H1
37 RSV3 79 NC 121 ADVDD 163 XA1 205 H3
38 RSV4 80 VSS 122 A13 164 XA0 206 DDVDD
39 RSV5 81 VSS 123 A12 165 D31 207 DDVDD
40 RSV6 82 XD11 124 A11 166 D30 208 NC
41 RSV7 83 XD12 125 A10 167 D29
42 RSV8 84 XD13 126 A9 168 D28
† ADVDD, CVSS, DDVDD, DVSS, IODVDD, IVSS, MDVDD, PDVDD, VDD, and VSS pins are on a common plane internal to the
device.


13.1.3 TMS320C31 Pinouts and Pin Assignments

The TMS320C31 device is packaged in a 132-pin plastic quad flat pack
(PQFP) JEDEC standard package. Figure 13–6 shows the pinouts for this
package, and Figure 13–7 shows the mechanical layout. Table 13–5 shows the
associated pin assignments alphabetically; Table 13–6 shows the pin
assignments numerically.

Figure 13–6. TMS320C31 Pinout (Top View)

[Pinout diagram not reproduced. Pins are numbered 1 through 132 around the
package; the diagram groups them as 117–132 and 1–17 on one side, 18–50 on
the next, 51–83 on the third, and 84–116 on the fourth. The signal on each
pin is listed in Table 13–6.]


Figure 13–7. TMS320C31 132-Pin Plastic Quad Flat Pack—PQL Package

[Mechanical drawing not reproduced. Dimension callouts in the drawing:]

    4,45 (0.175) / 4,19 (0.165)
    0,254 (0.010) Nom
    0,635 (0.025) Nom
    0,76 (0.030) Nom
    24,18 (0.952) / 24,08 (0.948), shown twice
    27,56 (1.085) / 27,31 (1.075), shown twice

Thermal Resistance Characteristics


Air Flow
Parameter °C/W LFPM
RΘJC 11.0 N/A
RΘJA 49.0 0
RΘJA 35.5 200
RΘJA 28.0 400
RΘJA 23.5 600
RΘJA 21.6 800
RΘJA 20.0 1000

All linear dimensions are in millimeters and parenthetically in inches.
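
The junction-to-ambient thermal resistances listed in Figure 13–7 can be used with the standard relation T_J = T_A + P × RΘJA to estimate junction temperature. The Python sketch below is a worked example under stated assumptions: the RΘJA values come from the TMS320C31 table above, while the function name `junction_temp_c`, the 25 °C ambient temperature, and the 1.0-W power dissipation are hypothetical values chosen only for illustration.

```python
# R-theta-JA for the TMS320C31 PQFP from Figure 13-7,
# keyed by air flow in linear feet per minute (LFPM), in degrees C per watt.
R_THETA_JA_C_PER_W = {
    0: 49.0,
    200: 35.5,
    400: 28.0,
    600: 23.5,
    800: 21.6,
    1000: 20.0,
}


def junction_temp_c(ambient_c, power_w, airflow_lfpm):
    """Estimate junction temperature: T_J = T_A + P * R-theta-JA."""
    return ambient_c + power_w * R_THETA_JA_C_PER_W[airflow_lfpm]


if __name__ == "__main__":
    # Hypothetical operating point: 25 degrees C ambient, 1.0 W dissipated.
    for airflow in sorted(R_THETA_JA_C_PER_W):
        print(airflow, "LFPM:", junction_temp_c(25.0, 1.0, airflow))
```

Only the tabulated air-flow values are accepted; interpolating between them would be a further assumption that the table does not support.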


Table 13–5. TMS320C31 Pin Assignments (Alphabetical)†

Signal Pin Signal Pin Signal Pin Signal Pin Signal Pin
A0 29 D4 76 EMU0 124 VDD 40 VSS 84
A1 28 D5 75 EMU1 125 VDD 49 VSS 85
A2 27 D6 73 EMU2 126 VDD 59 VSS 86
A3 26 D7 72 EMU3 123 VDD 65 VSS 101
A4 25 D8 68 FSR0 110 VDD 66 VSS 102
A5 23 D9 67 FSX0 114 VDD 74 VSS 109
A6 22 D10 64 H1 81 VDD 83 VSS 113
A7 21 D11 63 H3 82 VDD 91 VSS 117
A8 20 D12 62 HOLD 90 VDD 97 VSS 119
A9 18 D13 60 HOLDA 89 VDD 104 VSS 128
A10 16 D14 58 IACK 99 VDD 105 X1 88
A11 14 D15 56 INT0 100 VDD 115 X2/CLKIN 87
A12 13 D16 55 INT1 103 VDD 121 XF0 96
A13 12 D17 54 INT2 106 VDD 131 XF1 98
A14 11 D18 53 INT3 107 VDD 132
A15 10 D19 52 MCBL/MP 127 VSS 3
A16 9 D20 50 RDY 92 VSS 4
A17 8 D21 48 RESET 95 VSS 17
A18 7 D22 47 R/W 94 VSS 19
A19 5 D23 46 SHZ 118 VSS 30
A20 2 D24 45 STRB 93 VSS 35
A21 1 D25 44 TCLK0 120 VSS 36
A22 130 D26 43 TCLK1 122 VSS 37
A23 129 D27 41 VSS 42
CLKR0 111 D28 39 VSS 51
CLKX0 112 D29 38 VDD 6 VSS 57
D0 80 D30 34 VDD 15 VSS 61
D1 79 D31 31 VDD 24 VSS 69
D2 78 DR0 108 VDD 32 VSS 70
D3 77 DX0 116 VDD 33 VSS 71
† VDD and VSS pins are on a common plane internal to the device.


Table 13–6. TMS320C31 Pin Assignments (Numerical)†

Pin Signal Pin Signal Pin Signal Pin Signal Pin Signal
1 A21 31 D31 61 VSS 91 VDD 121 VDD
2 A20 32 VDD 62 D12 92 RDY 122 TCLK1
3 VSS 33 VDD 63 D11 93 STRB 123 EMU3
4 VSS 34 D30 64 D10 94 R/W 124 EMU0
5 A19 35 VSS 65 VDD 95 RESET 125 EMU1
6 VDD 36 VSS 66 VDD 96 XF0 126 EMU2
7 A18 37 VSS 67 D9 97 VDD 127 MCBL/MP
8 A17 38 D29 68 D8 98 XF1 128 VSS
9 A16 39 D28 69 VSS 99 IACK 129 A23
10 A15 40 VDD 70 VSS 100 INT0 130 A22
11 A14 41 D27 71 VSS 101 VSS 131 VDD
12 A13 42 VSS 72 D7 102 VSS 132 VDD
13 A12 43 D26 73 D6 103 INT1
14 A11 44 D25 74 VDD 104 VDD
15 VDD 45 D24 75 D5 105 VDD
16 A10 46 D23 76 D4 106 INT2
17 VSS 47 D22 77 D3 107 INT3
18 A9 48 D21 78 D2 108 DR0
19 VSS 49 VDD 79 D1 109 VSS
20 A8 50 D20 80 D0 110 FSR0
21 A7 51 VSS 81 H1 111 CLKR0
22 A6 52 D19 82 H3 112 CLKX0
23 A5 53 D18 83 VDD 113 VSS
24 VDD 54 D17 84 VSS 114 FSX0
25 A4 55 D16 85 VSS 115 VDD
26 A3 56 D15 86 VSS 116 DX0
27 A2 57 VSS 87 X2/CLKIN 117 VSS
28 A1 58 D14 88 X1 118 SHZ
29 A0 59 VDD 89 HOLDA 119 VSS
30 VSS 60 D13 90 HOLD 120 TCLK0
† VDD and VSS pins are on a common plane internal to the device.


13.2 Signal Descriptions


13.2.1 TMS320C30 Signal Descriptions
Table 13–7 describes the signals that the TMS320C30 device uses in the
microprocessor mode. It lists the signal/port/bit name; the number of pins
allocated; the input (I), output (O), or high-impedance state (Z) operating
modes; a brief description of the signal’s function; and the condition that
places an output pin in high impedance. A line over a signal name (for
example, RESET) indicates that the signal is active low (true at a logic 0
level). Pins labeled NC are not to be connected by the user. The signals are
grouped according to function.
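
The columns of Table 13–7 map naturally onto a small lookup structure. The Python sketch below is illustrative only: it copies a handful of rows from the table, uses ASCII hyphens in the signal names, and the names `SIGNALS`, `high_z_signals`, and `total_pins` are hypothetical. The condition letters follow the table footnote (S = SHZ active, H = HOLD active, R = RESET active).

```python
# Selected rows of Table 13-7:
# signal -> (pin count, operating modes, high-impedance conditions).
SIGNALS = {
    "D31-D0": (32, "I/O/Z", "SHR"),
    "A23-A0": (24, "O/Z", "SHR"),
    "R/W": (1, "O/Z", "SHR"),
    "STRB": (1, "O/Z", "SH"),
    "XD31-XD0": (32, "I/O/Z", "SR"),
    "XA12-XA0": (13, "O/Z", "SR"),
    "MSTRB": (1, "O/Z", "S"),
}


def high_z_signals(condition):
    """Return the listed signals placed in high impedance by a condition."""
    return [name for name, (_, _, conds) in SIGNALS.items() if condition in conds]


def total_pins(names):
    """Sum the pin counts of the named signal groups."""
    return sum(SIGNALS[name][0] for name in names)


if __name__ == "__main__":
    held = high_z_signals("H")
    print(held, total_pins(held))
```

Querying for "H" returns only primary bus signals, which matches the HOLD description in the table: HOLD releases the primary bus and leaves the expansion bus driven.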


Table 13–7. TMS320C30 Signal Descriptions

Signal/Port    # Pins    I/O/Z†    Description    Condition When Signal Is in High Z‡

Primary Bus Interface (61 Pins)

D31–D0    32    I/O/Z    32-bit data port of the primary bus interface    S H R
A23–A0    24    O/Z    24-bit address port of the primary bus interface    S H R
R/W    1    O/Z    Read/write signal for primary bus interface. This pin is high when a read is performed and low when a write is performed over the parallel interface.    S H R
STRB    1    O/Z    External access strobe for the primary bus interface    S H
RDY    1    I    Ready signal. This pin indicates that the external device is prepared for a primary bus interface transaction to complete.    S
HOLD    1    I    Hold signal for primary bus interface. When HOLD is a logic low, any ongoing transaction is completed. The A23–A0, D31–D0, STRB, and R/W signals are placed in a high-impedance state, and all transactions over the primary bus interface are held until HOLD becomes a logic high or the NOHOLD bit of the primary bus control register is set.
HOLDA    1    O/Z    Hold acknowledge signal for primary bus interface. This signal is generated in response to a logic low on HOLD. It signals that A23–A0, D31–D0, STRB, and R/W are placed in a high-impedance state and that all transactions over the bus will be held. HOLDA will be high in response to a logic high of HOLD or when the NOHOLD bit of the primary bus control register is set.    S

Expansion Bus Interface (49 Pins)

XD31–XD0    32    I/O/Z    32-bit data port of the expansion bus interface    S R
XA12–XA0    13    O/Z    13-bit address port of the expansion bus interface    S R
XR/W    1    O/Z    Read/write signal for expansion bus interface. When a read is performed, this pin is held high; when a write is performed, this pin is low.    S R
MSTRB    1    O/Z    External memory access strobe for the expansion bus interface    S
IOSTRB    1    O/Z    External I/O access strobe for expansion bus interface    S
XRDY    1    I    Ready signal. This pin indicates that the external device is prepared for an expansion bus interface transaction to complete.

Control Signals (9 Pins)

RESET    1    I    Reset. When this pin is a logic low, the device is placed in the reset condition. After reset becomes a logic high, execution begins from the location specified by the reset vector.
INT3–INT0    4    I    External interrupts
IACK    1    O/Z    Interrupt acknowledge signal. IACK is set to 1 (logic high) by the IACK instruction. This can be used to indicate the beginning or end of an interrupt service routine.    S
MC/MP    1    I    Microcomputer/microprocessor mode pin
XF1, XF0    2    I/O/Z    External flag pins. They are used as general-purpose I/O pins or to support interlocked processor instructions.    S R

Serial Port 0 Signals (6 Pins)

CLKX0    1    I/O/Z    Serial port 0 transmit clock. Serves as the serial shift clock for the serial port 0 transmitter.    S R
DX0    1    I/O/Z    Data transmit output. Serial port 0 transmits serial data on this pin.    S R
FSX0    1    I/O/Z    Frame synchronization pulse for transmit. The FSX0 pulse initiates the transmit data process over pin DX0.    S R
CLKR0    1    I/O/Z    Serial port 0 receive clock. Serves as the serial shift clock for the serial port 0 receiver.    S R
DR0    1    I/O/Z    Data receive. Serial port 0 receives serial data via the DR0 pin.    S R
FSR0    1    I/O/Z    Frame synchronization pulse for receive. The FSR0 pulse initiates the receive data process over DR0.    S R

Serial Port 1 Signals (6 Pins)

CLKX1    1    I/O/Z    Serial port 1 transmit clock. Serves as the serial shift clock for the serial port 1 transmitter.    S R
DX1    1    I/O/Z    Data transmit output. Serial port 1 transmits serial data on this pin.    S R
FSX1    1    I/O/Z    Frame synchronization pulse for transmit. The FSX1 pulse initiates the transmit data process over pin DX1.    S R
CLKR1    1    I/O/Z    Serial port 1 receive clock. Serves as the serial shift clock for the serial port 1 receiver.    S R
DR1    1    I/O/Z    Data receive. Serial port 1 receives serial data via the DR1 pin.    S R
FSR1    1    I/O/Z    Frame synchronization pulse for receive. The FSR1 pulse initiates the receive data process over DR1.    S R

Timer 0 Signals (1 Pin)

TCLK0    1    I/O/Z    Timer clock. As an input, TCLK0 is used by timer 0 to count external pulses. As an output, TCLK0 outputs pulses generated by timer 0.    S R

Timer 1 Signals (1 Pin)

TCLK1    1    I/O/Z    Timer clock. As an input, TCLK1 is used by timer 1 to count external pulses. As an output, TCLK1 outputs pulses generated by timer 1.    S R

Supply and Oscillator Signals (29 Pins)

VDD3–VDD0    4    I    Four +5-V supply pins §
IODVDD1, IODVDD0    2    I    Two +5-V supply pins §
ADVDD1, ADVDD0    2    I    Two +5-V supply pins §
PDVDD    1    I    One +5-V supply pin §


§ The recommended decoupling capacitor is 0.1 µF.


DDVDD1, DDVDD0    2    I    Two +5-V supply pins §
MDVDD    1    I    One +5-V supply pin §
VSS3–VSS0    4    I    Four ground pins
DVSS3–DVSS0    4    I    Four ground pins
CVSS1, CVSS0    2    I    Two ground pins
IVSS    1    I    One ground pin
VBBP    1    NC    VBB pump oscillator output
VSUBS    1    I    Substrate pin. Tie to ground.
X1    1    O    Output pin from the internal oscillator for the crystal. If a crystal is not used, this pin should be left unconnected.
X2/CLKIN    1    I    Input pin to the internal oscillator from a crystal or a clock
H1    1    O/Z    External H1 clock. Its period is equal to twice the CLKIN period.    S
H3    1    O/Z    External H3 clock. Its period is equal to twice the CLKIN period.    S
Reserved (18 Pins) §

EMU2–EMU0    3    I    Reserved. Use pull-ups to +5 volts. See Section 12.7 on page 12-39.
EMU3    1    O    Reserved. See Section 12.7 on page 12-39.
EMU4/SHZ    1    I    Shutdown high impedance. An active low shuts down the TMS320C30 and places all pins in a high-impedance state. This signal is used for board-level testing to ensure that no dual-drive conditions occur. CAUTION: An active low on the SHZ pin corrupts TMS320C30 memory and register contents. Reset the device with SHZ = 1 to restore it to a known operating condition.
EMU6, EMU5    2    NC    Reserved.
RSV10–RSV5    6    I/O    Reserved. Use pull-ups on each pin to +5 volts.
RSV4–RSV0    5    I    Reserved. Tie pins directly to +5 volts.

Locator (1 Pin)

Locator    1    NC    Reserved. See Figure 13–1 on page 13-3 and Table 13–1 on page 13-6.
† Input (I), output (O), high-impedance state (Z)
‡ S = SHZ active, H = HOLD active, R = RESET active
§ Follow the connections specified for the reserved pins. 18- to 22-kΩ pull-up resistors are recommended. All +5-volt supply pins
must be connected to a common supply plane, and all ground pins must be connected to a common ground plane.


13.2.2 TMS320C31 Signal Descriptions


Table 13–8 describes the signals that the TMS320C31 device uses in the
microprocessor mode. For each signal, the table lists the signal name; the
number of pins allocated; the input (I), output (O), or high-impedance state (Z)
operating modes; a brief description of the signal’s function; and the condition
that places an output pin in high impedance. A line over a signal name (for
example, RESET) indicates that the signal is active low (true at a logic 0 level).
Table 13–8.TMS320C31 Signal Descriptions
Condition When
Signal/Port # Pins I/O/Z† Description Signal Is in High Z‡
Primary Bus Interface (61 Pins)

D31–D0 32 I/O/Z 32-bit data port S H R

A23–A0 24 O/Z 24-bit address port S H R

HOLD 1 I Hold signal. When HOLD is a logic low, any on-


going transaction is completed. The A23–A0,
D31–D0, STRB, and R/W signals are placed in
a high-impedance state, and all transactions
over the primary bus interface are held until
HOLD becomes a logic high or until the NO-
HOLD bit of the primary bus control register is
set.

HOLDA 1 O/Z Hold acknowledge signal. This signal is gener- S


ated in response to a logic low on HOLD. It sig-
nals that A23–A0, D31–D0, STRB, and R/W are
placed in a high-impedance state and that all
transactions over the bus will be held. HOLDA
will be high in response to a logic high of HOLD
or until the NOHOLD bit of the primary bus con-
trol register is set.

R/W 1 O/Z Read/write signal. This pin is high when a read S H R


is performed and low when a write is performed
over the parallel interface.

RDY 1 I Ready signal. This pin indicates that the exter-


nal device is prepared for a transaction comple-
tion.

STRB 1 O/Z External access strobe S H


† Input (I), output (O), high-impedance (Z) state
‡ S = SHZ active, H = HOLD active, R = RESET active


Table 13–8.TMS320C31 Signal Descriptions (Continued)


Condition When
Signal/Port # Pins I/O/Z† Description Signal Is in High Z‡
Control Signals (10 Pins)

INT3–INT0 4 I External interrupts

IACK 1 O/Z Interrupt acknowledge signal. IACK is set to 1 S


by the IACK instruction. This can be used to in-
dicate the beginning or end of an interrupt ser-
vice routine.

MCBL/MP 1 I Microcomputer boot loader/microprocessor


mode pin

RESET 1 I Reset. When this pin is a logic low, the device is


placed in the reset condition. When reset be-
comes a logic 1, execution begins from the loca-
tion specified by the reset vector.

SHZ 1 I Shutdown high impedance. An active low shuts down
the TMS320C31 and places all pins in a high-impedance
state. This signal is used for board-level testing to
ensure that no dual drive conditions occur. CAUTION: An
active low on the SHZ pin corrupts TMS320C31 memory and
register contents. Reset the device with SHZ = 1
to restore it to a known operating condition.

XF1, XF0 2 I/O/Z External flag pins. These are used as general- S R
purpose I/O pins or to support interlocked pro-
cessor instructions.

Serial Port 0 Signals (6 Pins)

CLKR0 1 I/O/Z Serial port 0 receive clock. This pin serves as S R


the serial shift clock for the serial port 0 receiver.

CLKX0 1 I/O/Z Serial port 0 transmit clock. Serves as the serial S R


shift clock for the serial port 0 transmitter.

DR0 1 I/O/Z Data receive. Serial port 0 receives serial data S R


via the DR0 pin.

DX0 1 I/O/Z Data transmit output. Serial port 0 transmits se- S R


rial data on this pin.

FSR0 1 I/O/Z Frame synchronization pulse for receive. The S R


FSR0 pulse initiates the receive data process
over DR0.
† Input (I), output (O), high-impedance state (Z)
‡ S = SHZ active, H = HOLD active, R = RESET active


Table 13–8.TMS320C31 Signal Descriptions (Continued)


Condition When
Signal/Port # Pins I/O/Z† Description Signal Is in High Z‡
Serial Port 0 Signals (6 Pins) (Continued)

FSX0 1 I/O/Z Frame synchronization pulse for transmit. The S R


FSX0 pulse initiates the transmit data process
over pin DX0.

Timer Signals (2 Pins)

TCLK0 1 I/O/Z Timer clock 0. As an input, TCLK0 is used by S


timer 0 to count external pulses. As an output
pin, TCLK0 outputs pulses generated by timer
0.

TCLK1 1 I/O/Z Timer clock 1. As an input, TCLK1 is used by S


timer 1 to count external pulses. As an output
pin, TCLK1 outputs pulses generated by timer
1.

Supply and Oscillator Signals (49 Pins)

H1 1 O/Z External H1 clock. This clock has a period S


equal to twice CLKIN.

H3 1 O/Z External H3 clock. This clock has a period S


equal to twice CLKIN.

VDD 20 I +5-VDC supply pins. All pins must be con-


nected to a common supply plane. §

VSS 25 I Ground pins. All ground pins must be con-


nected to a common ground plane.

X1 1 O/Z Output pin from the internal crystal oscillator. If


a crystal is not used, this pin should be left un-
connected.

X2/CLKIN 1 I The internal oscillator input pin from a crystal or


a clock.

Reserved (4 Pins) ¶

EMU2–EMU0 3 I Reserved. Use 20-kΩ pull-up resistors to +5


volts.

EMU3 1 O Reserved.
† Input (I), output (O), high-impedance state (Z)
‡ S = SHZ active, H = HOLD active, R = RESET active
§ The recommended decoupling capacitor value is 0.1 µF.
¶ Follow the connections specified for the reserved pins. 18- to 22-kΩ pull-up resistors are recommended. All +5-volt supply pins
must be connected to a common supply plane, and all ground pins must be connected to a common ground plane.


13.3 Electrical Specifications


Table 13–9, Table 13–10, Table 13–11, and Figure 13–8 show the electrical
specifications for the TMS320C3x.

Table 13–9.Absolute Maximum Ratings Over Specified Temperature Range


Condition/Characteristic ’C30/’C31 Range ’LC31 Range
Supply voltage range, VDD – 0.3 V to 7 V – 0.3 V to 5 V

Input voltage range – 0.3 V to 7 V – 0.3 V to 5 V

Output voltage range – 0.3 V to 7 V – 0.3 V to 5 V

Continuous power dissipation (worst case) 3.15 W for TMS320C30–33 1.1 W


1.7 W for TMS320C31–33 (See Note 3)
(See Note 3)

Operating case temperature range TMS320C30GEL 0 ° C to 85 °C 0 ° C to 85 °C


TMS320C31PQL 0 ° C to 85 °C
TMS320C31PQA –40 ° C to +125 °C

Storage temperature range – 55 °C to 150°C – 55 °C to 150°C


Notes: 1) All voltage values are with respect to VSS.
2) Stresses beyond those listed above may cause permanent damage to the device. This is a stress rating only;
functional operation of the device at these or any other conditions beyond those indicated in Table 13–10 is not im-
plied. Exposure to absolute-maximum-rated conditions for extended periods may affect device reliability.
3) Actual operating power will be less than stated. These values were obtained under specially produced worst-case
test conditions, which are not sustained during normal device operation. These conditions consist of continuous
parallel writes of a checkerboard pattern to both primary and expansion buses at the maximum rate possible. See
nominal (IDD) current specification in Table 13–11.


Table 13–10. Recommended Operating Conditions


’C30/’C31 ’LC31–33
Parameter Min Nom Max Min Nom Max Unit
VDD Supply voltages (DDVDD, etc.) 4.75 5 5.25 3.13 3.3 3.47 V

VSS Supply voltages (CVSS, etc.) 0 0 V

VIH High-level input voltage 2 VDD + 0.3† 1.8 VDD + 0.3† V

VIL Low-level input voltage –0.3 0.8 –0.3† 0.6 V

IOH High-level output current –300 –300 µA

IOL Low-level output current 2 2 mA

T Operating case temperature range 0 85 0 85 °C

VTH High-level input voltage for CLKIN 2.6 VDD + 0.3† 2.5 VDD + 0.3† V
† Guaranteed from characterization but not tested
Note: All voltage values are with respect to VSS. All input and output voltages except those for CLKIN are TTL compatible.
CLKIN can be driven by a CMOS clock.
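
As an illustration only, the supply limits in Table 13–10 can be written as a simple range check. The sketch below is hypothetical helper code, not part of any TI tool; the names VDD_LIMITS and vdd_in_range are assumptions, and only the VDD rows of the table are encoded.

```python
# VDD limits (min, nominal, max) in volts, copied from Table 13-10.
# The dictionary keys are labels chosen for this sketch.
VDD_LIMITS = {
    "C30/C31": (4.75, 5.0, 5.25),
    "LC31": (3.13, 3.3, 3.47),
}


def vdd_in_range(family, vdd):
    """Return True if vdd lies within the recommended range for family."""
    vmin, _nom, vmax = VDD_LIMITS[family]
    return vmin <= vdd <= vmax
```

A 5-V rail passes for the 'C30/'C31 entry and fails for the 'LC31 entry, which is the distinction the two column groups of the table express.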


Table 13–11. Electrical Characteristics Over Specified Free-Air Temperature Range†

’C30/’C31 ’LC31-33
Electrical Characteristic Min Nom‡ Max Min Nom‡ Max Unit
VOH High-level output voltage (VDD = Min, IOH = Max) 2.4 3 2.0 V

VOL§ Low-level output voltage (VDD = Min, IOL = Max) 0.3 0.6 0.4 V

IZ Three-state current ( VDD = Max) –20 20 – 20 20 µA

II Input current ( VI = VSS to VDD) –10 10 – 10 10 µA

IIP Input current ( Inputs with internal pull-ups) ¶ –400 20 – 400 10 µA

ICC Supply current ( TA = ’C30-33 200 600 120 300 mA


25 ° C, VDD = Max, fx ’C30-27 175 500
= Max) #|| ’C30-40 170 600
’C31-27 120 260
’C31-33 150 325
’C31-33 (ext. temp) 150 325
’C31-40 160 390
’C31-50 200 425
’C30 PPM 170 600

IDD Supply current, standby; IDLE2, clocks shut 50 mA


21
off

Ci Input capacitance All inputs except 15k 15k pF


CLKIN

CLKIN 25 25

Co Output capacitance 20k 20k pF

† All input and output voltage levels are TTL compatible.


‡ All nominal values are at VDD = 5 V, TA = 25°C.
§ For ’C30 PPM: VOL(max)=0.6 V, except for the following:
VOL(max)=1 V for A(0–31)
VOL(max)=0.9 V for XA(0–12), D(0–31)
VOL(max)=0.7 V for STRB, XSTRB, MSTRB, FSX0/1, CLKX0/1,
CLKR0/1, DX0/1, R/W, XR/W
¶ Pins with internal pull-up devices: INT3 –INT0, MC/MP, RSV10 –RSV0. Although RSV10–RSV0 have internal pullup devices,
external pullups should be used on each pin as described in Table 13–7 beginning on page 13-17.
# Actual operating current will be less than this maximum value. This value was obtained under specially produced worst-case
test conditions, which are not sustained during normal device operation. These conditions consist of continuous parallel writes
of a checkerboard pattern to both primary and expansion buses at the maximum rate possible. See Calculation of TMS320C30
Power Dissipation, Appendix D.
|| fx is the input clock frequency. The maximum value is 40 MHz.
k Guaranteed by design but not tested


Figure 13–8. Test Load Circuit

IOL

Output
Tester Pin VLoad Under
Electronics Test
CT

IOH

Where: IOL = 2.0 mA (all outputs)


IOH = 300 µA (all outputs)
VLoad = 2.15 V
CT = 80 pF typical load circuit capacitance


13.4 Signal Transition Levels

13.4.1 TTL-Level Outputs


TTL-compatible output levels are driven to a minimum logic-high level of 2.4
volts and to a maximum logic-low level of 0.6 volt. Figure 13–9 shows the TTL-
level outputs.

Figure 13–9. TTL-Level Outputs


2.4 V
2.0 V

1.0 V
0.6 V

TTL-output transition times are specified as follows:

- For a high-to-low transition, the level at which the output is said to be no


longer high is 2.0 volts, and the level at which the output is said to be low
is 1.0 volt.

- For a low-to-high transition, the level at which the output is said to be no


longer low is 1.0 volt, and the level at which the output is said to be high
is 2.0 volts.

13.4.2 TTL-Level Inputs


Figure 13–10 shows the TTL-level inputs.

Figure 13–10. TTL-Level Inputs


2.0 V
90%

10%
0.8 V

TTL-compatible input transition times are specified as follows:

- For a high-to-low transition on an input signal, the level at which the input
is said to be no longer high is 2.0 volts, and the level at which the input is
said to be low is 0.8 volt.

- For a low-to-high transition on an input signal, the level at which the input
is said to be no longer low is 0.8 volt, and the level at which the input is said
to be high is 2.0 volts.
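
The threshold definitions above can be applied to a sampled waveform. The following is a minimal sketch, assuming the waveform is available as time-ordered (time in ns, voltage in V) pairs and that linear interpolation between samples is acceptable; the function names are hypothetical.

```python
def crossing_time(samples, level):
    """Interpolate the first time the waveform crosses the given level.

    samples is a list of (time_ns, volts) pairs in time order.
    """
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v0 != v1 and min(v0, v1) <= level <= max(v0, v1):
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def input_fall_time(samples, v_high=2.0, v_low=0.8):
    """High-to-low input transition time between the TTL input levels."""
    return crossing_time(samples, v_low) - crossing_time(samples, v_high)
```

For a low-to-high transition the same crossing_time helper can be used with the two levels taken in the opposite order.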


13.5 Timing

Timing specifications apply to the TMS320C30 and TMS320C31.

13.5.1 X2/CLKIN, H1, and H3 Timing

Table 13–12 defines the timing parameters for the X2/CLKIN, H1, and H3 in-
terface signals. The numbers shown in parentheses in Figure 13–11 and
Figure 13–12 correspond with those in the No. column of Table 13–12. Refer
to the RESET timing in Figure 13–23 on page 13-48 for CLKIN to H1/H3 delay
specification.

Table 13–12. Timing Parameters for X2/CLKIN, H1, and H3§


’C30-33/
’C30-27/ ’C31-33/ ’C30-40/
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) tf(CI) CLKIN fall time 6‡ 5‡ 5‡ 5‡ ns

(2) tw(CIL) CLKIN low pulse


duration 14 10 9 7 ns
tc(CI) = min

(3) tw(CIH) CLKIN high pulse


duration 14 10 9 7 ns
tc(CI) = min

(4) tr(CI) CLKIN rise time 6‡ 5‡ 5‡ 5‡ ns

(5) tc(CI) CLKIN cycle time 37 303 30 303 25 303 20 303 ns

(6) tf(H) H1/H3 fall time 4 3 3 3 ns

(7) tw(HL) H1/H3 low pulse


P–6 P–6 P–5 P–5 ns
duration

(8) tw(HH) H1/H3 high pulse


P–7 P–7 P–6 P–6 ns
duration

(9) tr(H) H1/H3 rise time 5 4 3 3 ns

(9.1) td(HL–HH) Delay from H1(H3)


low to H3(H1) high 0† 6 0† 5 0† 4 0† 4 ns

(10) tc(H) H1/H3 cycle time 74 606 60 606 50 606 40 606 ns


† Guaranteed from characterization but not tested
‡ Guaranteed by design but not tested
§ P = tc(CI)
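
The relationship between the CLKIN and H1/H3 rows of Table 13–12 can be sketched as follows. This is an illustrative calculation only; the function name and the returned dictionary keys are assumptions, and the margins passed in are the ones printed in the table (for example, P–6 and P–7 for the ’C30-33 column).

```python
def h1_timing_from_clkin(tc_ci_ns, low_margin_ns, high_margin_ns):
    """Derive H1/H3 figures from the CLKIN cycle time P = tc(CI), in ns."""
    p = tc_ci_ns
    return {
        "tc(H)": 2 * p,                    # H1/H3 period is twice CLKIN
        "tw(HL) min": p - low_margin_ns,   # parameter (7), P - margin
        "tw(HH) min": p - high_margin_ns,  # parameter (8), P - margin
    }


# 'C30-33 column: tc(CI) min = 30 ns, tw(HL) = P-6, tw(HH) = P-7
c30_33 = h1_timing_from_clkin(30, 6, 7)
```

With the minimum CLKIN cycle time of 30 ns, the derived H1/H3 cycle time is 60 ns, which agrees with the tc(H) minimum listed in row (10) for the same column.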


Figure 13–11.Timing for X2/CLKIN


(5)
(4)
(1)

X2/CLKIN

(3)
(2)

Figure 13–12. Timing for H1/H3

(10)

(9) (6)

H1
(8)
(7)
(9.1) (9.1)

H3

(9) (6)
(7)
(8)
(10)


13.5.2 Memory Read/Write Timing


Table 13–13 defines memory read/write timing parameters for (M)STRB. The
numbers shown in parentheses in Figure 13–13 and Figure 13–14 corre-
spond with those in the No. column of Table 13–13.


Table 13–13. Timing Parameters for a Memory ( (M)STRB) = 0) Read/Write


’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit

(11) td(H1L–(M)SL) H1 low to (M)STRB 0‡ 13 0‡ 10 0‡ 6§ 0‡ 4 ns


low delay

(12) td(H1L–(M)SH) H1 low to (M)STRB 0‡ 13 0‡ 10 0‡ 6 0‡ 4 ns


high delay

(13.1) td(H1H–RWL) H1 high to R/W low 0‡ 13 0‡ 10 0‡ 9 0‡ 7 ns


delay

(13.2) td(H1H–XRWL) H1 high to XR/W 0‡ 19 0‡ 15 0‡ 13 ns


low delay

(14.1) td(H1L–A) H1 low to A valid 0‡ 16 0‡ 14 0‡ 11 0‡ 9 ns


delay

(14.2) td(H1L–XA) H1 low to XA valid 0‡ 12 0‡ 10 0‡ 9 ns


delay

(15.1) tsu(D)R D setup before H1 18 16 14 10 ns


low (read)

(15.2) tsu(XD)R XD setup before H1 21 18 16 ns


low (read)

(16) th((X)D)R (X)D hold time after 0 0 0 0 ns


H1 low (read)

(17.1) tsu(RDY) RDY setup before 10 8 8 6 ns


H1 high

(17.2) tsu(XRDY) XRDY setup before 11 9 9 ns


H1 high

(18) th((X)RDY) (X)RDY hold time 0 0 0 0 ns


after H1 high

(19) td(H1H–(X)RWH) H1 high to (X)R/W 13 10 9 7 ns


high (write) delay

(20) tv((X)D)W (X)D valid after H1 25 20 17 14 ns


low (write)

(21) th((X)D)W (X)D hold time after 0‡ 0‡ 0‡ 0‡ ns


H1 high (write)

‡ Guaranteed by design but not tested


§ For ’C30 PPM, td(H1L–(M)SL) (max)=7ns


Table 13–13. Timing Parameters for a Memory ( (M)STRB) = 0) Read/Write (Continued)


’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit

(22.1) td(H1H–A) H1 high to A valid 23 18 15 12 ns


on back-to-back
write cycles (write)
delay

(22.2) td(H1H–XA) H1 high to XA valid 32 25 21 ns


on back-to-back
write cycles (write)
delay

(26) td(A–(X)RDY) (X)RDY delay from 10† 8† 7† 6 ns


A valid delay
† Guaranteed from characterization but not tested
‡ Guaranteed by design but not tested
§ For ’C30 PPM, td(H1L–(M)SL) (max)=7ns

Figure 13–13. Timing for Memory ( (M)STRB = 0) Read

H3

H1

(11) (12)

(M)STRB

(X)R/W

(14.1/14.2) (13.1/13.2)

(X)A
(15.1/15.2)
(26) (16)

(X)D
(17.1/17.2)
(18)

(X)RDY

Note: (M)STRB will remain low during back-to-back read operations.


Figure 13–14. Timing for Memory ( (M)STRB = 0) Write

H3

H1

(12)
(11)
(M)STRB
(19)
(13.1/13.2)
(X)R/W
(14.1/14.2)
(22.1/22.2)
(X)A

(20) (21)

(X)D
(18)
(17.1/17.2)

(X)RDY

Table 13–14 defines memory read timing parameters for IOSTRB. The num-
bers shown in parentheses in Figure 13–15 and Figure 13–16 correspond
with those in the No. column of Table 13–14 and Table 13–15.

Table 13–14. Timing Parameters for a Memory ( IOSTRB = 0) Read


’C30-27 ’C30-33 ’C30-40
No. Name Description Min Max Min Max Min Max Unit
(11.1) td(H1H–IOSL) H1 high to IOSTRB low delay 0† 13 0† 10 0† 9 ns

(12.1) td(H1H–IOSH) H1 high to IOSTRB high delay 0† 13 0† 10 0† 9 ns

(13.1) td(H1L–XRWH) H1 low to XR/W high delay 0† 13 0† 10 0† 9 ns

(14.3) td(H1L–XA) H1 low to XA valid delay 0† 13 0† 10 0† 9 ns

(15.3) tsu(XD)R XD setup before H1 high 19 15 13 ns

(16.1) th(XD)R XD hold time after H1 high 0 0 0 ns

(17.3) tsu(XRDY) XRDY setup before H1 high 11 9 9 ns

(18.1) th(XRDY) XRDY hold time after H1 high 0 0 0 ns


† Guaranteed by design but not tested


Figure 13–15. Timing for Memory ( IOSTRB = 0) Read


H3

H1

(11.1) (12.1)

IOSTRB

(13.1) (23)

XR/W

(14.3)

XA

(15.3)
(16.1)

XD

(17.3)
(18.1)
(X)RDY


Figure 13–16. Timing for Memory ( IOSTRB = 0) Write

H3

H1

(11.1) (12.1)

IOSTRB

(13.1)
(23)

(X)R/W

(14.3)

(X)A

(24) (25)

(X)D

(17.3)
(18.1)

(X)RDY

Table 13–15 defines memory write timing parameters for IOSTRB. The num-
bers shown in parentheses in Figure 13–15 and Figure 13–16 correspond
with those in the No. column of Table 13–14 and Table 13–15.

Table 13–15. Timing Parameters for a Memory ( IOSTRB = 0) Write


’C30-27 ’C30-33 ’C30-40
No. Name Description Min Max Min Max Min Max Unit
(23) td(H1L–XRWL) H1 low to XR/W low delay 0† 19 0† 15 0† 13 ns

(24) tv(XD)W XD valid after H1 high 38 30 25 ns

(25) th(XD)W XD hold time after H1 low 0 0 0 ns


† Guaranteed by design but not tested


13.5.3 XF0 and XF1 Timing When Executing LDFI or LDII


Table 13–16 defines the timing parameters for XF0 and XF1 during execution
of LDFI or LDII. The numbers shown in parentheses in Figure 13–17 corre-
spond with those in the No. column of Table 13–16.


Table 13–16. Timing Parameters for XF0 and XF1 When Executing LDFI or LDII
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) td(H3H–XF0L) H3 high to XF0 low delay 19 15 13 12 ns

(2) tsu(XF1) XF1 setup before H1 low 13 10 9 9 ns

(3) th(XF1) XF1 hold time after H1 low 0 0 0 0 ns

Figure 13–17. Timing for XF0 and XF1 When Executing LDFI or LDII

Fetch
LDFI or LDII Decode Read Execute

H3

H1

(M)STRB

(X)R/W

(X)A

(X)D

(X)RDY

(1)

XF0 Pin (2)

(3)

XF1 Pin


13.5.4 XF0 Timing When Executing STFI and STII

Table 13–17 defines the timing parameter for the XF0 pin during
execution of STFI or STII. The number shown in parentheses in Figure 13–18
corresponds with the number in the No. column of Table 13–17.

Table 13–17. Timing Parameters for XF0 When Executing STFI or STII
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) td(H3H–XF0H) H3 high to XF0 high delay 19 15 13 12 ns

XF0 is always set high at the beginning of the execute phase of the interlock
store instruction. When no pipeline conflicts occur, the address of the store is
also driven at the beginning of the execute phase of the interlock store instruc-
tion. However, if a pipeline conflict prevents the store from executing, the ad-
dress of the store will not be driven until the store can execute.

Figure 13–18. Timing for XF0 When Executing an STFI or STII

Fetch
STFI or STII Decode Read Execute
H3

H1

(M)STRB

(X)R/W

(X)A

(X)D

(X)RDY (1)

XF0 Pin


13.5.5 XF0 and XF1 Timing When Executing SIGI


Table 13–18 defines the timing parameters for the XF0 and XF1 pins during
execution of SIGI. The numbers shown in parentheses in Figure 13–19 corre-
spond with those in the No. column of Table 13–18.

Table 13–18. Timing Parameters for XF0 and XF1 When Executing SIGI
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) td(H3H–XF0L) H3 high to XF0 low delay 19 15 13 12 ns

(2) td(H3H–XF0H) H3 high to XF0 high delay 19 15 13 12 ns

(3) tsu(XF1) XF1 setup before H1 low 13 10 9 9 ns

(4) th(XF1) XF1 hold time after H1 low 0 0 0 0 ns

Figure 13–19. Timing for XF0 and XF1 When Executing SIGI
Fetch
SIGI Decode Read Execute

H3

H1
(1)
(3) (2)

XF0

(4)

XF1


13.5.6 Loading When the XF Pin Is Configured as an Output


Table 13–19 defines the timing parameter for loading the XF register when the
XF pin is configured as an output. The number shown in parentheses in
Figure 13–20 corresponds with the number in the No. column of Table 13–19.

Table 13–19. Timing Parameters for Loading the XF Register When Configured as an Output
Pin
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) tv(H3H–XF) H3 high to XF valid 19 15 13 12 ns

Figure 13–20. Timing for Loading XF Register When Configured as an Output Pin
Fetch Load
Instruction Decode Read Execute

H3

H1

OUTXF
1 or 0
Bit
(1)

XF Pin


13.5.7 Changing the XF Pin From an Output to an Input


Table 13–20 defines the timing parameters for changing the XF pin from an
output pin to an input pin. The numbers shown in parentheses in Figure 13–21
correspond with those in the No. column of Table 13–20.

Table 13–20. Timing Parameters of XF Changing From Output to Input Mode


’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) th(H3H–XF01) XF hold after H3 high 19 15 13† 12 ns

(2) tsu(XF) XF setup before H1 low 13 10 9 9 ns

(3) th(XF) XF hold after H1 low 0 0 0 0 ns


† For ’C30 PPM, th(H3H–XF01) (max)=14ns

Figure 13–21. Timing for Change of XF From Output to Input Mode


Execute Load of IOF | Buffers Go From Output to Input | Synchronizer Delay | Value on Pin Seen in IOF
H3

H1

(2)
IOXF
Bit (3)
(1)

XF Pin Output

INXF Bit Data


Sampled
Data
Seen


13.5.8 Changing the XF Pin From an Input to an Output


Table 13–21 defines the timing parameter for changing the XF pin from an in-
put pin to an output pin. The number shown in parentheses in Figure 13–22
corresponds with the number in the No. column of Table 13–21.

Table 13–21. Timing Parameters of XF Changing From Input to Output Mode


’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) td(H3H–XFIO) H3 high to XF switching 25 20 17 17 ns
from input to output delay

Figure 13–22. Timing for Change of XF From Input to Output Mode


Execution of
Load of IOF

H3

H1

IOXF
Bit

(1)

XF Pin


13.5.9 Reset Timing


RESET is an asynchronous input that can be asserted at any time during a
clock cycle. If the specified timings are met, the exact sequence shown in
Figure 13–23 on page 13-48 will occur; otherwise, an additional delay of one
clock cycle is possible.
The asynchronously reset signals include XF0/1, CLKX0/1, DX0/1, FSX0/1,
CLKR0/1, DR0/1, FSR0/1, and TCLK0/1.
Table 13–22 (’C30) and Table 13–23 (’C31) define the timing parameters for
the RESET signal. The numbers shown in parentheses in Figure 13–23 corre-
spond with those in the No. column of Table 13–22 or Table 13–23.
Resetting the device initializes the primary and expansion bus control regis-
ters to seven software wait states and therefore results in slow external ac-
cesses until these registers are initialized.
Note also that HOLD is an asynchronous input and can be asserted during
reset.


Table 13–22. Timing Parameters for RESET for the TMS320C30


’C30-27 ’C30-33 ’C30-40
No. Name Description Min Max Min Max Min Max Unit
(1) tsu(RESET) Setup for RESET before 28 P†§ 10 P† 10 P†§ ns
CLKIN low

(2.1) td(CLKINH–H1H) CLKIN high to H1 high delay‡ 6 20 4 14 2 12 ns

(2.2) td(CLKINH–H1L) CLKIN high to H1 low delay‡ 6 20 4 14 2 12 ns

(3) tsu(RESETH–H1L) Setup for RESET high 13 10 9 ns


before H1 low and after 10 H1
clock cycles

(5.1) td(CLKINH–H3L) CLKIN high to H3 low delay‡ 6 20 4 14 2 12 ns

(5.2) td(CLKINH–H3H) CLKIN high to H3 high delay‡ 6 20 4 14 2 12 ns

(8) tdis(H1H–(X)D) H1 high to (X)D disabled (high 19† 15† 13† ns


impedance)

(9) tdis(H3H–(X)A) H3 high to (X)A disabled (high 13† 10† 9† ns


impedance)

(10) td(H3H–CONTROLH) H3 high to control signals high 13† 10† 9† ns


delay

(11) td(H1H–RWH) H1 high to R/W high delay 13† 10† 9† ns

(13) td(H1H–IACKH) H1 high to IACK high delay 13† 10† 9† ns

(14) tdis(RESETL–ASYNCH) RESET low to asynchronous- 31† 25† 21† ns


ly reset signals disabled (high
impedance)
† Characterized but not tested
‡ See Figure 13–24 for temperature dependence for the 33-MHz TMS320C30. See Figure 13–25 for temperature dependence
for the 40-MHz TMS320C30.
§ P = tc(CI)


Table 13–23. Timing Parameters for RESET for the TMS320C31


’C31-33
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) tsu(RESET) Setup for RESET 28 P†¶ 10 P†¶ 10 P†¶ 10 P†¶ ns
before CLKIN low
(2.1) td(CLKINH–H1H) CLKIN high to H1 2 12 2 12‡ 2 12 2 10 ns
high delay §#
(2.2) td(CLKINH–H1L) CLKIN high to H1 2 12 2 12‡ 2 12 2 10 ns
low delay §#
(3) tsu(RESETH–H1L) Setup for RESET 13 10 9 7 ns
high before H1
low and after 10
H1 clock cycles
(5.1) td(CLKINH–H3L) CLKIN high to H3 2 12 2 12‡ 2 12 2 10 ns
low delay §#
(5.2) td(CLKINH–H3H) CLKIN high to H3 2 12 2 12‡ 2 12 2 10 ns
high delay §#
(8) tdis(H1H–(X)D) H1 high to D 19† 15† 13† 12† ns
disabled (high
impedance)
(9) tdis(H3H–(X)A) H3 high to A 13† 10† 9† 8† ns
disabled (high
impedance)
(10) td(H3H–CONTROLH) H3 high to 13† 10† 9† 8† ns
control signals
high delay
(12) td(H1H–RWH) H1 high to R/W 13† 10† 9† 8† ns
high delay
(13) td(H1H–IACKH) H1 high to IACK 13† 10† 9† 8† ns
high delay
(14) tdis(RESETL–ASYNCH) RESET low to 31† 25† 21† 17† ns
asynchronously
reset signals dis-
abled (high im-
pedance)
† Characterized but not tested
‡ 14 ns for the extended temperature ’C31-33
§ See Figure 13–25 for temperature dependence for the TMS320C31-27, TMS320C31-33, and the extended-temperature
TMS320C31-33.
¶ P = tc(CI)
# See Figure 13–26 for temperature dependence for the TMS320C31-50.


Figure 13–23. Timing for RESET


CLKIN
(1)
RESET
(Notes 5, 6)
(2.1) (2.2) (3)
H1
(5.1)
H3
10 H1 Clock Cycles
(8)
(X)D
(Notes1,7)
(5.2) (9)
(X)A
(Notes 2,7)
(10)
Control
Signals
(Note 3)
(11)
TMS320C30
(X) R / W
(12)
TMS320C31
R/W
(13)
IACK
Asynchronous (14)
Reset Signals
(Note 4)

Notes: 1) (X)D includes D31–D0 and XD31–XD0.


2) (X)A includes A23–A0 and XA12–XA0.
3) Control signals include STRB, MSTRB, and IOSTRB.
4) Asynchronously reset signals include XF0/1, CLKX0/1, DX0/1, FSX0/1, CLKR0/1, DR0/1, FSR0/1, and TCLK0/1.
5) RESET is an asynchronous input and can be asserted at any point during a clock cycle. If the specified timings are
met, the exact sequence shown will occur; otherwise, an additional delay of one clock cycle is possible.
6) Note that the R/W and XR/W outputs are placed in a high-impedance state during reset and can be provided with
a resistive pull-up, nominally 18–22 kΩ, if undesirable spurious writes could be caused when these outputs go low.
7) In microprocessor mode, the reset vector is fetched twice, with seven software wait states each time. In microcom-
puter mode, the reset vector is fetched twice, with no software wait states.


Figure 13–24. CLKIN to H1/H3 as a Function of Temperature

[Graph not reproduced: CLKIN to H1/H3 delay (ns, axis 0 to 22) versus case temperature (°C, axis 0 to 100) for the TMS320C30-33, 4.75 V ≤ VDD ≤ 5.25 V.]

Figure 13–25. CLKIN to H1/H3 as a Function of Temperature

[Graph not reproduced: CLKIN to H1/H3 delay (ns, axis 0 to 22) versus case temperature (°C, axis 0 to 125, with the extended temperature range marked) for the TMS320C31-27, TMS320C31-33, TMS320C31-33 (extended temperature), and TMS320C30-40, 4.75 V ≤ VDD ≤ 5.25 V.]


Figure 13–26. CLKIN to H1/H3 as a Function of Temperature

[Graph not reproduced: CLKIN to H1/H3 delay (ns, axis 0 to 20) versus case temperature (°C, axis 0 to 90) for the TMS320C31-50, 4.75 V ≤ VDD ≤ 5.25 V.]


13.5.10 SHZ Pin Timing


Table 13–24 defines the timing parameters for the SHZ pin. The numbers
shown in parentheses in Figure 13–27 correspond with those in the No. col-
umn of Table 13–24.

Table 13–24. Timing Parameters for the SHZ Pin


’C30
’C31
’LC31
No. Name Description Min Max Unit
(1) tdis(SHZ) SHZ low to all O, I/O pins disabled 0† 2P†‡ ns
(high impedance)

(2) ten(SHZ) SHZ high to all O, I/O pins enabled 0† 2P†‡ ns


(active)
† Characterized but not tested
‡ P = tc(CI)

Figure 13–27. Timing for SHZ Pin

H3

H1

SHZ

(1) (2)
All I/O Pins

Note: Enabling SHZ destroys TMS320C3x register and memory contents. Assert SHZ = 1 and reset the TMS320C3x to restore
it to a known condition.


13.5.11 Interrupt Response Timing


Table 13–25 defines the timing parameters for the INT signals. The numbers
shown in parentheses in Figure 13–28 correspond with those in the No. col-
umn of Table 13–25.
Table 13–25. Timing Parameters for INT3–INT0
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) tsu(INT) INT3–INT0 setup before H1 19 15 13 10 ns
low

(2) tw(INT) Interrupt pulse duration to P 2P†‡ P 2P†‡ P 2P†‡ P 2P†‡ ns


guarantee only one interrupt
† Characterized but not tested
‡ P = tc(H)

The interrupt (INT) pins are asynchronous inputs that can be asserted at any
time during a clock cycle. The TMS320C3x interrupts are level-sensitive, not
edge-sensitive. Interrupts are detected on the falling edge of H1. Therefore,
interrupts must be set up and held to the falling edge of H1 for proper detection.
The CPU and DMA respond to detected interrupts on instruction fetch bound-
aries only.

For the processor to recognize only one interrupt on a given input, an interrupt
pulse must be set up and held to:

- A minimum of one H1 falling edge, and


- No more than two H1 falling edges.

The TMS320C3x can accept an interrupt from the same source every two H1
clock cycles.

If the specified timings are met, the exact sequence shown in Figure 13–28 will
occur; otherwise, an additional delay of one clock cycle is possible.
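
The single-interrupt pulse-width window from Table 13–25 can be checked with a short calculation. This is a hedged sketch: the function name is hypothetical, and the bounds are taken as inclusive because the table lists P as the minimum and 2P as the maximum, with P = tc(H).

```python
def interrupt_pulse_ok(pulse_ns, tc_h_ns):
    """Return True if the pulse width lies within P to 2P, P = tc(H)."""
    p = tc_h_ns
    return p <= pulse_ns <= 2 * p
```

For a device running with tc(H) = 60 ns, a 90-ns pulse falls inside the window, while a 130-ns pulse exceeds 2P and could be recognized as more than one interrupt.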


Figure 13–28. Timing for INT3–INT0 Response


Reset or Fetch First
Interrupt Instruction of
Vector Read Service Routine

H3

H1

(1)
INT3 – INT0
Pin
(2)
INT3 – INT0
Flag

Vector First
ADDR Address Instruction
Address

Data


13.5.12 Interrupt Acknowledge Timing


The IACK output goes active on the first half-cycle (H1 rising) of the decode
phase of the IACK instruction and goes inactive at the first half-cycle (H1 rising)
of the read phase of the IACK instruction.
Table 13–26 defines the timing parameters for the IACK signal. The numbers
shown in parentheses in Figure 13–29 correspond with those in the No. col-
umn of Table 13–26.

Table 13–26. Timing Parameters for IACK


’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) td(H1H–IACKL) H1 high to IACK low delay 13 10 9 7 ns

(2) td(H1H–IACKH) H1 high to IACK high delay 13 10 9 7 ns


Note: The IACK output is active for the entire duration of the bus cycle and is therefore extended if the bus cycle utilizes wait
states.

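Figure 13–29. Timing for IACK

[Timing diagram not reproduced. Signals shown: H3, H1, IACK, ADDR, and Data,
across the fetch of the IACK instruction, the decode of the IACK instruction,
and the IACK data read. Parameters (1) and (2) are marked on the IACK
waveform.]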


13.5.13 Data Rate Timing Modes


Unless otherwise indicated, the data rate timings shown in Figure 13–30 and
Figure 13–31 are valid for all serial port modes, including handshake. For a
functional description of serial port operation, refer to subsection 8.2.12 on
page 8-30.

Table 13–27 defines the serial port timing parameters for eight ’C3x devices.
The numbers shown in parentheses in Figure 13–30 and Figure 13–31 corre-
spond with those in the No. column of Table 13–27.

Figure 13–30. Timing for Fixed Data Rate Mode

[Timing diagram not reproduced. Signals shown: H1, CLKX/R, DX (bit n-1,
bit n-2, ..., bit 0), DR (bit n-1, bit n-2), FSR, FSX(INT), and FSX(EXT).
The numbered markers on the diagram refer to the No. column of Table 13–27.]

Notes: 1) Timing diagrams show operations with CLKXP = CLKRP = FSXP = FSRP = 0.
2) Timing diagrams depend on the length of the serial port word, where n = 8, 16, 24, or 32 bits, respectively.


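Figure 13–31. Timing for Variable Data Rate Mode

[Timing diagram not reproduced. Signals shown: CLKX/R, FSX(INT), FSX(EXT),
DX (bit n-1, bit n-2, bit n-3, ..., bit 0), FSR, and DR (bit n-1, bit n-2,
bit n-3). The numbered markers on the diagram refer to the No. column of
Table 13–27.]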

Notes: 1) Timing diagrams show operation with CLKXP = CLKRP = FSXP = FSRP = 0.
2) Timing diagrams depend on the length of the serial port word, where n = 8, 16, 24, or 32 bits, respectively.
3) The timings that are not specified expressly for the variable data rate mode are the same as those that are specified
for the fixed data rate mode.


Table 13–27. Serial-Port Timing Parameters


TMS320C30-27/TMS320C31-27
No. Name Description Min Max Unit
(1) td(H1–SCK) H1 high to internal CLKX/R delay 19 ns
(2) tc(SCK) CLKX/R cycle time CLKX/R ext tc(H)×2.6† ns
CLKX/R int tc(H)×2 tc(H)×2^32‡
(3) tw(SCK) CLKX/R high/low pulse duration CLKX/R ext tc(H)+12† ns
CLKX/R int [tc(SCK)/2]–15 [tc(SCK)/2]+5
(4) tr(SCK) CLKX/R rise time 10† ns
(5) tf(SCK) CLKX/R fall time 10† ns
(6) td(DX) CLKX to DX valid delay CLKX ext 44 ns
CLKX int 25
(7) tsu(DR) DR setup before CLKR low CLKR ext 13 ns
CLKR int 31
(8) th(DR) DR hold from CLKR low CLKR ext 13 ns
CLKR int 0
(9) td(FSX) CLKX to internal FSX high/low delay CLKX ext 40 ns
CLKX int 21
(10) tsu(FSR) FSR setup before CLKR low CLKR ext 13 ns
CLKR int 13
(11) th(FS) FSX/R input hold from CLKX/R low CLKX/R ext 13 ns
CLKX/R int 0
(12) tsu(FSX) External FSX setup before CLKX CLKX ext –[tc(H)–8] [tc(SCK)/2]–10‡ ns
CLKX int –[tc(H)–21] tc(SCK)/2‡
(13) td(CH–DX)V CLKX to first DX bit, FSX precedes CLKX high delay CLKX ext 45 ns
CLKX int 26
(14) td(FSX–DX)V FSX to first DX bit, CLKX precedes FSX delay 45 ns
(15) td(DXZ) CLKX high to DX high impedance following last data bit delay 25† ns
† Guaranteed by design but not tested
‡ Not tested


Table 13–27. Serial-Port Timing Parameters (Continued)


TMS320C30-33/TMS320C31-33/TMS320LC31
No. Name Description Min Max Unit
(1) td(H1–SCK) H1 high to internal CLKX/R delay 15 ns
(2) tc(SCK) CLKX/R cycle time CLKX/R ext tc(H)×2.6† ns
CLKX/R int tc(H)×2 tc(H)×2^32‡
(3) tw(SCK) CLKX/R high/low pulse duration CLKX/R ext tc(H)+12† ns
CLKX/R int [tc(SCK)/2]–15 [tc(SCK)/2]+5
(4) tr(SCK) CLKX/R rise time 8† ns
(5) tf(SCK) CLKX/R fall time 8† ns
(6) td(DX) CLKX to DX valid delay CLKX ext 35 ns
CLKX int 20
(7) tsu(DR) DR setup before CLKR low CLKR ext 10 ns
CLKR int 25
(8) th(DR) DR hold from CLKR low CLKR ext 10 ns
CLKR int 0
(9) td(FSX) CLKX to internal FSX high/low delay CLKX ext 32 ns
CLKX int 17
(10) tsu(FSR) FSR setup before CLKR low CLKR ext 10 ns
CLKR int 10
(11) th(FS) FSX/R input hold from CLKX/R low CLKX/R ext 10 ns
CLKX/R int 0
(12) tsu(FSX) External FSX setup before CLKX CLKX ext –[tc(H)–8] [tc(SCK)/2]–10‡ ns
CLKX int –[tc(H)–21] tc(SCK)/2‡
(13) td(CH–DX)V CLKX to first DX bit, FSX precedes CLKX high delay CLKX ext 36 ns
CLKX int 21
(14) td(FSX–DX)V FSX to first DX bit, CLKX precedes FSX delay 36 ns
(15) td(DXZ) CLKX high to DX high impedance following last data bit delay 20† ns
† Guaranteed by design but not tested
‡ Not tested


Table 13–27. Serial-Port Timing Parameters (Continued)


TMS320C30-40/TMS320C31-40
No. Name Description Min Max Unit
(1) td(H1–SCK) H1 high to internal CLKX/R delay 13 ns
(2) tc(SCK) CLKX/R cycle time CLKX/R ext tc(H)×2.6† ns
CLKX/R int tc(H)×2 tc(H)×2^32‡
(3) tw(SCK) CLKX/R high/low pulse duration CLKX/R ext tc(H)+10† ns
CLKX/R int [tc(SCK)/2]–5 [tc(SCK)/2]+5
(4) tr(SCK) CLKX/R rise time 7† ns
(5) tf(SCK) CLKX/R fall time 7† ns
(6) td(DX) CLKX to DX valid delay CLKX ext 30 ns
CLKX int 17
(7) tsu(DR) DR setup before CLKR low CLKR ext 9 ns
CLKR int 21
(8) th(DR) DR hold from CLKR low CLKR ext 9 ns
CLKR int 0
(9) td(FSX) CLKX to internal FSX high/low delay CLKX ext 27 ns
CLKX int 15
(10) tsu(FSR) FSR setup before CLKR low CLKR ext 9 ns
CLKR int 9
(11) th(FS) FSX/R input hold from CLKX/R low CLKX/R ext 9 ns
CLKX/R int 0
(12) tsu(FSX) External FSX setup before CLKX CLKX ext –[tc(H)–8] [tc(SCK)/2]–10‡ ns
CLKX int –[tc(H)–21] tc(SCK)/2‡
(13) td(CH–DX)V CLKX to first DX bit, FSX precedes CLKX high delay CLKX ext 30 ns
CLKX int 18
(14) td(FSX–DX)V FSX to first DX bit, CLKX precedes FSX delay 30 ns
(15) td(DXZ) CLKX high to DX high impedance following last data bit delay 17† ns
† Guaranteed by design but not tested
‡ Not tested


Table 13–27. Serial-Port Timing Parameters (Continued)


TMS320C31-50
No. Name Description Min Max Unit
(1) td(H1-SCK) H1 high to internal CLKX/R delay 10 ns
(2) tc(SCK) CLKX/R cycle time CLKX/R ext tc(H) × 2.6† ns
CLKX/R int tc(H) × 2 tc(H) × 2^32‡
(3) tw(SCK) CLKX/R high/low pulse duration CLKX/R ext tc(H)+10† ns
CLKX/R int [tc(SCK)/2] – 5 [tc(SCK)/2] + 5
(4) tr(SCK) CLKX/R rise time 6† ns
(5) tf(SCK) CLKX/R fall time 6† ns
(6) td(DX) CLKX to DX valid delay CLKX ext 24 ns
CLKX int 16
(7) tsu(DR) DR setup before CLKR low CLKR ext 9 ns
CLKR int 17
(8) th(DR) DR hold from CLKR low CLKR ext 7 ns
CLKR int 0
(9) td(FSX) CLKX to internal FSX high/low delay CLKX ext 22 ns
CLKX int 15
(10) tsu(FSR) FSR setup before CLKR low CLKR ext 7 ns
CLKR int 7
(11) th(FS) FSX/R input hold from CLKX/R low CLKX/R ext 7 ns
CLKX/R int 0
(12) tsu(FSX) External FSX setup before CLKX CLKX ext – [tc(H) – 8] [tc(SCK)/2] – 10‡ ns
CLKX int – [tc(H) – 21] tc(SCK)/2‡
(13) td(CH-DX)V CLKX to first DX bit, FSX precedes CLKX high delay CLKX ext 24 ns
CLKX int 14
(14) td(FSX-DX)V FSX to first DX bit, CLKX precedes FSX delay 24 ns
(15) td(DXZ) CLKX high to DX high impedance following last data bit delay 14† ns
† Assured by design but not tested
‡ Not tested
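
Parameter (2) of Table 13–27 expresses the CLKX/R cycle time as multiples of
tc(H): at least 2.6 × tc(H) for an external clock, and between 2 × tc(H) and
2^32 × tc(H) for an internal clock. The Python sketch below evaluates those
limits. The function names and the 40-ns value of tc(H) are assumptions chosen
for illustration; substitute the H1 cycle time of the actual system.

```python
def sck_cycle_limits(tc_h_ns):
    """Return the tc(SCK) limits of Table 13-27, parameter (2), in ns."""
    return {
        "ext_min_ns": 2.6 * tc_h_ns,        # external CLKX/R, minimum
        "int_min_ns": 2 * tc_h_ns,          # internal CLKX/R, minimum
        "int_max_ns": (2 ** 32) * tc_h_ns,  # internal CLKX/R, maximum
    }


def ext_clock_ok(tc_sck_ns, tc_h_ns):
    """True if an external CLKX/R period meets the 2.6 x tc(H) minimum."""
    return tc_sck_ns >= 2.6 * tc_h_ns


# Hypothetical example value: tc(H) = 40 ns.
limits = sck_cycle_limits(40)
print(limits["int_min_ns"], limits["int_max_ns"])
print(ext_clock_ok(120, 40), ext_clock_ok(100, 40))
```

The same arithmetic applies to the other speed grades because parameter (2)
is identical in all four parts of the table.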


13.5.14 HOLD Timing


HOLD is an asynchronous input that can be asserted at any time during a clock
cycle. If the specified timings are met, the exact sequence shown in
Figure 13–32 will occur; otherwise, an additional delay of one clock cycle is
possible.

Table 13–28 defines the timing parameters for the HOLD and HOLDA signals.
The numbers shown in parentheses in Figure 13–32 correspond with those in
the No. column of Table 13–28.

The NOHOLD bit of the primary bus control register (see subsection 7.1.1 on
page 7-3) overrides the HOLD signal. When this bit is set, the device comes
out of hold and prevents future hold cycles.

Asserting HOLD prevents the processor from accessing the primary bus. Pro-
gram execution continues until a read from or a write to the primary bus is re-
quested. In certain circumstances, the first write will be pending, thus allowing
the processor to continue until a second write is encountered.

Figure 13–32. Timing for HOLD/HOLDA

[Timing diagram not reproduced. Signals shown: H3, H1, HOLD, HOLDA, STRB,
R/W, A, and D (write data). The numbered markers on the diagram refer to the
No. column of Table 13–28.]

Note: HOLDA will go low in response to HOLD going low and will continue to remain low until one H1 cycle after HOLD goes
back high, as shown in Figure 13–32.


Table 13–28. Timing Parameters for HOLD/HOLDA


’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) tsu(HOLD) HOLD setup before H1 low 19 15 13 10 ns
(3) tv(HOLDA) HOLDA valid after H1 low 0‡ 14 0‡ 10 0‡ 9 0‡ 7 ns
(4) tw(HOLD)§ HOLD low duration 2tc(H) 2tc(H) 2tc(H) 2tc(H) ns
(6) tw(HOLDA) HOLDA low duration tc(H)–5† tc(H)–5† tc(H)–5† tc(H)–5† ns
(7) td(H1L–SH)H H1 low to STRB high for a HOLD delay 0‡ 13 0‡ 10 0‡ 9 0‡ 7 ns
(8) tdis(H1L–S) H1 low to STRB disabled (high-impedance state) 0‡ 13† 0‡ 10† 0‡ 9† 0‡ 8† ns
(9) ten(H1L–S) H1 low to STRB enabled (active) 0‡ 13 0‡ 10 0‡ 9 0‡ 7 ns
(10) tdis(H1L–RW) H1 low to R/W disabled (high-impedance state) 0‡ 13† 0‡ 10† 0‡ 9† 0‡ 8† ns
(11) ten(H1L–RW) H1 low to R/W enabled (active) 0‡ 13 0‡ 10 0‡ 9 0‡ 7 ns
(12) tdis(H1L–A) H1 low to address disabled (high-impedance state) 0‡ 13† 0‡ 10† 0‡ 0‡ 8† ns
(13) ten(H1L–A) H1 low to address enabled (valid) 0‡ 19 0‡ 15 0‡ 13 0‡ 12 ns
(16) tdis(H1H–D) H1 high to data disabled (high-impedance state) 0‡ 13† 0‡ 10† 0‡ 9† 0‡ 8† ns
† Characterized but not tested
‡ Not tested
§ HOLD is an asynchronous input and can be asserted at any point during a clock cycle. If the specified timings are met, the exact
sequence shown will occur; otherwise, an additional delay of one clock cycle is possible.
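
As a rough illustration of how parameters (1) and (4) of Table 13–28 might be
applied, the Python sketch below checks a HOLD request against the setup time
and the minimum low duration. The function name, the speed-grade keys ("27",
"33", "40", "50"), and the example numbers are assumptions made for this
example; only the setup values and the 2 × tc(H) limit come from the table.

```python
# tsu(HOLD), parameter (1) of Table 13-28, in ns, per column group.
# The dictionary keys are hypothetical labels for the four column groups.
TSU_HOLD_NS = {"27": 19, "33": 15, "40": 13, "50": 10}


def check_hold_request(setup_ns, low_ns, tc_h_ns, grade):
    """Return a list of timing concerns for a HOLD request (empty if none).

    setup_ns -- time from HOLD asserted to the next H1 falling edge
    low_ns   -- total time HOLD is held low
    tc_h_ns  -- tc(H), the H1 cycle time
    grade    -- one of the keys of TSU_HOLD_NS
    """
    concerns = []
    if setup_ns < TSU_HOLD_NS[grade]:
        # HOLD is asynchronous, so this is not an error, only a later response.
        concerns.append("setup not met: one additional clock cycle of delay is possible")
    if low_ns < 2 * tc_h_ns:
        concerns.append("HOLD low duration is shorter than 2 x tc(H)")
    return concerns


# Hypothetical example: a 'C31-50 column-group device with tc(H) = 40 ns.
print(check_hold_request(setup_ns=12, low_ns=100, tc_h_ns=40, grade="50"))
```

A missed setup time only delays the response, whereas a HOLD pulse shorter
than parameter (4) falls outside the specified operating conditions.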


13.5.15 General-Purpose I/O Timing

Peripheral pins include CLKX0/1, CLKR0/1, DX0/1, DR0/1, FSX0/1, FSR0/1,
and TCLK0/1. The contents of the internal control registers associated with
each peripheral define the modes for these pins.

13.5.15.1 Peripheral Pin I/O Timing

Table 13–29 defines peripheral pin general-purpose I/O timing parameters.
The numbers shown in parentheses in Figure 13–33 correspond with those in
the No. column of Table 13–29.

Table 13–29. Timing Parameters for Peripheral Pin General-Purpose I/O


’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) tsu(GPIOH1L) General-purpose input setup before H1 low 15 12 10 9 ns
(2) th(GPIOH1L) General-purpose input hold time after H1 low 0 0 0 0 ns
(3) td(GPIOH1H) General-purpose output delay after H1 high 19 15 13 10 ns
Note: Peripheral pins include CLKX0/1, CLKR0/1, DX0/1, DR0/1, FSX0/1, FSR0/1, and TCLK0/1. The modes of these pins are
defined by the contents of internal control registers associated with each peripheral.

Figure 13–33. Timing for Peripheral Pin General-Purpose I/O

[Timing diagram not reproduced. Signals shown: H3, H1, and the peripheral
pin. Parameters (1), (2), and (3) are marked on the diagram.]

13.5.15.2 Changing the Peripheral Pin I/O Modes

Table 13–30 and Table 13–31 show the timing parameters for changing the
peripheral pin from a general-purpose output pin to a general-purpose input
pin and vice versa. The numbers shown in parentheses in Figure 13–34 and
Figure 13–35 correspond to those shown in the No. column of Table 13–30
and Table 13–31, respectively.


Table 13–30. Timing Parameters for Peripheral Pin Changing From General-Purpose Output
to Input Mode
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) th(H3H) Hold after H1 high 19 15 13 10 ns
(2) tsu(GPIOH1L) Peripheral pin setup before H1 low 13 10 9 9 ns
(3) th(GPIOH1L) Peripheral pin hold after H1 low 0 0 0 0 ns

Table 13–31. Timing Parameters for Peripheral Pin Changing From General-Purpose Input to
Output Mode
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) td(GPIOH1H) H1 high to peripheral pin switching from input to output delay 19 15 13 10 ns

Figure 13–34. Timing for Change of Peripheral Pin From General-Purpose Output to
Input Mode

[Timing diagram not reproduced. Phases shown: execution of the store of the
peripheral control register, buffers go from output to input, synchronizer
delay, and value on pin seen in the peripheral control register. Signals
shown: H3, H1, I/O control bit, peripheral pin (output, then data sampled),
and data bit (data seen). Parameters (1), (2), and (3) are marked on the
diagram.]


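Figure 13–35. Timing for Change of Peripheral Pin From General-Purpose Input to
Output Mode

[Timing diagram not reproduced. Signals shown: H3, H1, I/O control bit, and
the peripheral pin during execution of the store of the peripheral control
register. Parameter (1) is marked on the diagram.]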


13.5.16 Timer Pin Timing


Valid logic-level periods and polarity are specified by the contents of the inter-
nal control registers.

Table 13–32 and Table 13–33 define the timing parameters for the timer pin.
The numbers shown in parentheses in Figure 13–36 correspond with those in
the No. column of Table 13–32 and Table 13–33.

Table 13–32. Timing Parameters for Timer Pin


’C30-27/’C31-27 ’C30-33/’C31-33
No. Name Description‡ Min Max Min Max Unit
(1) tsu(TCLKH1L) TCLK ext setup before H1 low TCLK ext 15 12 ns
(2) th(TCLKH1L) TCLK ext hold after H1 low TCLK ext 0 0 ns
(3) td(TCLKH1H) H1 high to TCLK int valid delay TCLK int 13 10 ns
(4) tc(TCLK) TCLK cycle time TCLK ext tc(H)×2.6† tc(H)×2.6† ns
TCLK int tc(H)×2 tc(H)×2^32† tc(H)×2 tc(H)×2^32† ns
(5) tw(TCLK) TCLK high/low pulse duration TCLK ext tc(H)+12† tc(H)+12† ns
TCLK int [tc(TCLK)/2]–15 [tc(TCLK)/2]+5 [tc(TCLK)/2]–15 [tc(TCLK)/2]+5 ns
† Guaranteed by design but not tested
‡ Timing parameters 1 and 2 are applicable for a synchronous input clock. Timing parameters 4 and 5 are applicable for an
asynchronous input clock.


Table 13–33. Timing Parameters for Timer Pin


’C30-40/’C31-40 ’C31-50
No. Name Description‡ Min Max Min Max Unit
(1) tsu(TCLKH1L) TCLK ext setup before H1 low TCLK ext 10 8 ns
(2) th(TCLKH1L) TCLK ext hold after H1 low TCLK ext 0 0 ns
(3) td(TCLKH1H) H1 high to TCLK int valid delay TCLK int 9 9 ns
(4) tc(TCLK) TCLK cycle time TCLK ext tc(H)×2.6† tc(H)×2.6† ns
TCLK int tc(H)×2 tc(H)×2^32† tc(H)×2 tc(H)×2^32† ns
(5) tw(TCLK) TCLK high/low pulse duration TCLK ext tc(H)+10† tc(H)+10† ns
TCLK int [tc(TCLK)/2]–5 [tc(TCLK)/2]+5 [tc(TCLK)/2]–5 [tc(TCLK)/2]+5 ns
† Guaranteed by design but not tested
‡ Timing parameters 1 and 2 are applicable for a synchronous input clock. Timing parameters 4 and 5 are applicable for an
asynchronous input clock.

Figure 13–36. Timing for Timer Pin

[Timing diagram not reproduced. Signals shown: H3, H1, and the peripheral
pin. Parameters (1) through (5) are marked on the diagram.]

Appendix A

Instruction Opcodes

The opcode fields for all TMS320C3x instructions are shown in Table A–1. Bits
in the table marked with a hyphen are defined in the individual instruction de-
scriptions (see Chapter 10). Table A–1, along with the instruction descriptions,
fully defines the instruction words. The opcodes are listed in numerical order.
Note that an undefined operation may occur if an illegal opcode is executed.
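
Because Table A–1 lists only bits 31–23 of each instruction word, a decoder
can isolate that 9-bit field with a shift and a mask. The Python sketch below
shows the idea for a handful of entries copied from the table. The function
names and the sample words are assumptions for illustration, and the lookup
table is deliberately incomplete; several mnemonics in Table A–1 share the
same bits 31–23 and need the remaining bits (see Chapter 10) to be told apart.

```python
# Bits 31-23 for a few instructions, copied from Table A-1.
# This is a partial table for illustration, not a full decoder.
OPCODES = {
    0b000000000: "ABSF",
    0b000000001: "ABSI",
    0b000000010: "ADDC",
    0b000000011: "ADDF",
    0b000000100: "ADDI",
    0b000011001: "NOP",
}


def opcode_field(word):
    """Return bits 31-23 of a 32-bit instruction word."""
    return (word >> 23) & 0x1FF


def mnemonic(word):
    """Look up the mnemonic for bits 31-23, or None if it is not in OPCODES."""
    return OPCODES.get(opcode_field(word))


# Hypothetical words: only bits 31-23 are meaningful in this example.
print(mnemonic(0x02000000))
print(mnemonic(0x0C800000))
```

Returning None for an unknown field, rather than guessing, mirrors the
warning above that executing an illegal opcode may cause an undefined
operation.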


Table A–1. TMS320C3x Instruction Opcodes


INSTRUCTION 31 30 29 28 27 26 25 24 23
ABSF 0 0 0 0 0 0 0 0 0
ABSI 0 0 0 0 0 0 0 0 1
ADDC 0 0 0 0 0 0 0 1 0
ADDF 0 0 0 0 0 0 0 1 1
ADDI 0 0 0 0 0 0 1 0 0
AND 0 0 0 0 0 0 1 0 1
ANDN 0 0 0 0 0 0 1 1 0
ASH 0 0 0 0 0 0 1 1 1
CMPF 0 0 0 0 0 1 0 0 0
CMPI 0 0 0 0 0 1 0 0 1
FIX 0 0 0 0 0 1 0 1 0
FLOAT 0 0 0 0 0 1 0 1 1
IDLE 0 0 0 0 0 1 1 0 0
IDLE2 0 0 0 0 0 1 1 0 0
LDE 0 0 0 0 0 1 1 0 1
LDF 0 0 0 0 0 1 1 1 0
LDFI 0 0 0 0 0 1 1 1 1
LDI 0 0 0 0 1 0 0 0 0
LDII 0 0 0 0 1 0 0 0 1
LDM 0 0 0 0 1 0 0 1 0
LDP 0 0 0 0 1 0 0 0 0
LSH 0 0 0 0 1 0 0 1 1
LOPOWER 0 0 0 1 0 0 0 0 1
MAXSPEED 0 0 0 1 0 0 0 0 1
MPYF 0 0 0 0 1 0 1 0 0
MPYI 0 0 0 0 1 0 1 0 1
NEGB 0 0 0 0 1 0 1 1 0
NEGF 0 0 0 0 1 0 1 1 1
NEGI 0 0 0 0 1 1 0 0 0

NOP 0 0 0 0 1 1 0 0 1
NORM 0 0 0 0 1 1 0 1 0
NOT 0 0 0 0 1 1 0 1 1
POP 0 0 0 0 1 1 1 0 0
POPF 0 0 0 0 1 1 1 0 1
PUSH 0 0 0 0 1 1 1 1 0
PUSHF 0 0 0 0 1 1 1 1 1
OR 0 0 0 1 0 0 0 0 0
RND 0 0 0 1 0 0 0 1 0
ROL 0 0 0 1 0 0 0 1 1
ROLC 0 0 0 1 0 0 1 0 0
ROR 0 0 0 1 0 0 1 0 1
RORC 0 0 0 1 0 0 1 1 0
RPTS 0 0 0 1 0 0 1 1 1
STF 0 0 0 1 0 1 0 0 0
STFI 0 0 0 1 0 1 0 0 1
STI 0 0 0 1 0 1 0 1 0
STII 0 0 0 1 0 1 0 1 1
SIGI 0 0 0 1 0 1 1 0 0
SUBB 0 0 0 1 0 1 1 0 1
SUBC 0 0 0 1 0 1 1 1 0
SUBF 0 0 0 1 0 1 1 1 1
SUBI 0 0 0 1 1 0 0 0 0
SUBRB 0 0 0 1 1 0 0 0 1
SUBRF 0 0 0 1 1 0 0 1 0
SUBRI 0 0 0 1 1 0 0 1 1
TSTB 0 0 0 1 1 0 1 0 0
XOR 0 0 0 1 1 0 1 0 1
IACK 0 0 0 1 1 0 1 1 0
ADDC3 0 0 1 0 0 0 0 0 0
ADDF3 0 0 1 0 0 0 0 0 1
ADDI3 0 0 1 0 0 0 0 1 0
AND3 0 0 1 0 0 0 0 1 1
ANDN3 0 0 1 0 0 0 1 0 0
ASH3 0 0 1 0 0 0 1 0 1
CMPF3 0 0 1 0 0 0 1 1 0
CMPI3 0 0 1 0 0 0 1 1 1

LSH3 0 0 1 0 0 1 0 0 0
MPYF3 0 0 1 0 0 1 0 0 1
MPYI3 0 0 1 0 0 1 0 1 0
OR3 0 0 1 0 0 1 0 1 1
SUBB3 0 0 1 0 0 1 1 0 0
SUBF3 0 0 1 0 0 1 1 0 1
SUBI3 0 0 1 0 0 1 1 1 0
TSTB3 0 0 1 0 0 1 1 1 1
XOR3 0 0 1 0 1 0 0 0 0
LDFcond 0 1 0 0 – – – – –
LDIcond 0 1 0 1 – – – – –
BR(D)† 0 1 1 0 0 0 0 – –
CALL 0 1 1 0 0 0 1 – –
RPTB 0 1 1 0 0 1 0 – –
SWI 0 1 1 0 0 1 1 – –
Bcond(D)† 0 1 1 0 1 0 – – –
DBcond(D)† 0 1 1 0 1 1 – – –
CALLcond 0 1 1 1 0 0 – – –
TRAPcond 0 1 1 1 0 1 0 – –
RETIcond 0 1 1 1 1 0 0 0 0
RETScond 0 1 1 1 1 0 0 0 1
MPYF3||ADDF3 1 0 0 0 0 0 0 0 –
1 0 0 0 0 0 0 1 –
1 0 0 0 0 0 1 0 –
1 0 0 0 0 0 1 1 –
MPYF3||SUBF3 1 0 0 0 0 1 0 0 –
1 0 0 0 0 1 0 1 –
1 0 0 0 0 1 1 0 –
1 0 0 0 0 1 1 1 –
MPYI3||ADDI3 1 0 0 0 1 0 0 0 –
1 0 0 0 1 0 0 1 –
1 0 0 0 1 0 1 0 –
1 0 0 0 1 0 1 1 –

† Opcode same for standard and delayed instructions.

MPYI3||SUBI3 1 0 0 0 1 1 0 0 –
1 0 0 0 1 1 0 1 –
1 0 0 0 1 1 1 0 –
1 0 0 0 1 1 1 1 –
STF||STF 1 1 0 0 0 0 0 – –
STI||STI 1 1 0 0 0 0 1 – –
LDF||LDF 1 1 0 0 0 1 0 – –
LDI||LDI 1 1 0 0 0 1 1 – –
ABSF||STF 1 1 0 0 1 0 0 – –
ABSI||STI 1 1 0 0 1 0 1 – –
ADDF3||STF 1 1 0 0 1 1 0 – –
ADDI3||STI 1 1 0 0 1 1 1 – –
AND3||STI 1 1 0 1 0 0 0 – –
ASH3||STI 1 1 0 1 0 0 1 – –
FIX||STI 1 1 0 1 0 1 0 – –
FLOAT||STF 1 1 0 1 0 1 1 – –
LDF||STF 1 1 0 1 1 0 0 – –
LDI||STI 1 1 0 1 1 0 1 – –
LSH3||STI 1 1 0 1 1 1 0 – –
MPYF3||STF 1 1 0 1 1 1 1 – –
MPYI3||STI 1 1 1 0 0 0 0 – –
NEGF||STF 1 1 1 0 0 0 1 – –
NEGI||STI 1 1 1 0 0 1 0 – –
NOT||STI 1 1 1 0 0 1 1 – –
OR3||STI 1 1 1 0 1 0 0 – –
SUBF3||STF 1 1 1 0 1 0 1 – –
SUBI3||STI 1 1 1 0 1 1 0 – –
XOR3||STI 1 1 1 0 1 1 1 – –
Reserved for reset, traps, and interrupts 0 1 1 1 1 1 1 1 1

Appendix B

Development Support/Part Ordering Information

This appendix provides development support information, device part numbers,
and support tool ordering information for the TMS320C3x generation.

Each TMS320C3x support product is described in the TMS320 Family Development
Support Reference Guide (literature number SPRU011). In addition,
more than 100 third-party developers offer products that support the TI
TMS320 family. For more information, refer to the TMS320 Third-Party Reference
Guide (literature number SPRU052).

For information on pricing and availability, contact the nearest TI field sales of-
fice or authorized distributor.

This appendix discusses the following major topics:

B.1 TMS320C3x Development Support Tools
B.2 TMS320C3x Part Ordering Information


B.1 TMS320C3x Development Support Tools

Texas Instruments offers an extensive line of development tools for the
TMS320C3x generation of DSPs, including tools to evaluate the performance
of the processors, generate code, develop algorithm implementations, and
fully integrate and debug software and hardware modules.

The following products support development of ’C3x applications:

Code Generation Tools

- Optimizing ANSI C compiler. Translates ANSI C language directly into
highly optimized assembly code. You can then assemble and link this code
with the TI assembler/linker, which is shipped with the compiler. It supports
both ’C3x and ’C4x assembly code. This product is currently available for
PC (DOS, DOS extended memory, and OS/2), VAX/VMS, and SPARC
workstations. Refer to the TMS320 Floating-Point DSP Optimizing C
Compiler User’s Guide (SPRU034) for detailed information.

- Assembler/linker. Converts source mnemonics to executable object code.
It supports both ’C3x and ’C4x assembly code. This product is currently
available for PC (DOS, DOS extended memory, and OS/2). The ’C3x/’C4x
assembler for the VAX/VMS and SPARC workstations is only available as
part of the optimizing ’C3x/’C4x compiler. Refer to the TMS320 Floating-Point
DSP Assembly Language Tools User’s Guide (SPRU035) for detailed
information.

System Integration and Debug Tools

- Simulator. Simulates via software the operation of the ’C3x and can be
used in C and assembly software development. This product is currently
available for PC (DOS and Windows) and SPARC workstations. Refer to
the TMS320C3x C Source Debugger User’s Guide (SPRU054) for de-
tailed information.

- XDS510 emulator. Performs full-speed in-circuit emulation with the ’C3x,
providing access to all registers as well as to internal and external memory.
It can be used in C and assembly software development and has the capability
of debugging multiple processors. This product is currently available
for PC (DOS, Windows, and OS/2) and SPARC workstations. This product
includes the emulator board (emulator box, power supply, and SCSI connector
cables in the SPARC version), the ’C3x C source debugger software,
and the JTAG cable.


Because ’C3x and ’C5x XDS510 emulators also come with the same emu-
lator board (or box), you can buy the ’C3x C source debugger software as a
separate product called ’C3x C Source Debugger Conversion Software.
This enables you to debug ’C3x/’C4x/’C5x applications with the same
emulator board. The emulator cable that comes with the ’C5x XDS510
emulator is not compatible with the ’C3x. You need a JTAG emulation con-
version cable. Refer to the TMS320C3x C Source Debugger User’s Guide
(SPRU053) for detailed information on the ’C3x emulator.

- Evaluation module (EVM). Each EVM comes complete with a PC halfcard
and software package. The EVM board contains the following:
  - A TMS320C30, a 33-MFLOPS, 32-bit floating-point DSP
  - A 16K-word, zero-wait-state SRAM, allowing coding of most algorithms
    directly on the board
  - A speaker/microphone-ready analog interface for multimedia,
    speech, and audio applications development
  - A multiprocessor serial port interface for connecting to multiple EVMs
  - A host port for PC communications
The system also comes with all the software required to begin applications
development on a PC host. Equipped with a C and assembly language
source-level debugger for the DSP, the EVM has a window-oriented,
mouse-driven interface that enables the downloading, executing, and
debugging of assembly code or C code.
The TMS320C3x assembler/linker is also included with the EVM. For users
who prefer programming in a high-level language, an optimizing ANSI
C compiler and an Ada compiler are offered separately.


- Emulation porting kit (EPK). Enables you to integrate emulation technology
directly into your system without the need of an XDS510 board. This
product is intended to be used by third parties and high-volume board
manufacturers and requires a licensing agreement with Texas Instruments.
The kit contains host (or PC) source and object code, which lets
you tailor ’C30 EVM-like capabilities to your TMS320C3x system via the
SN74ACT8990 test bus controller (TBC). The EPK can be used in such
applications as program download for system self-test and initialization or
system emulation and debug to feature resident emulation support. EPK
software includes the TI high-level language (HLL) debugger in object as
well as source code for the TBC communication interface. The HLL code
is the windowed debugger found with many TI DSP simulators, evaluation
modules (EVMs), and emulators. With the EPK, the HLL user interface
can be ported directly to the system board. The source code for the TBC
communication interface consists of such commands as read/write
memory, run, stop, and reset that communicate with the TMS320C3x device.
Using the EPK reduces system and development cost and speeds
time to market. For more information on the kit, call the DSP hotline at
(713) 274-2320.

B.1.1 TMS320 Third Parties


The TMS320 family is supported by product and service offerings from more
than 100 independent vendors and consultants, known as third parties. These
support products take various forms (both software and hardware) from cross-
assemblers, simulators, and DSP utility packages to logic analyzers and emu-
lators. Additionally, TI third parties offer more than 150 algorithms that are
available for license through the TMS320 software cooperative. These algo-
rithms can greatly reduce development time and decrease time to market. The
expertise of those involved in support services ranges from speech encoding
and vector quantization to software/hardware design and system analysis.

For a more detailed description of services and products offered by third par-
ties, refer to the TMS320 Third Party Support Reference Guide (literature
number SPRU052) and the TMS320 Software Cooperative Data Sheet Pack-
et (literature number SPRT111). Call the Literature Response Center at (800)
477–8924 to request a copy.


B.1.2 TMS320 Literature


Extensive DSP documentation is available; this includes data sheets, user’s
guides, and application reports. In addition, DSP textbooks that aid research
and education have been published by Prentice-Hall, John Wiley and Sons,
and Computer Science Press. To order literature or to subscribe to the DSP
newsletter Details on Signal Processing (for up-to-date information on new
products and services), call the Literature Response Center at (800)
477–8924.

B.1.3 DSP Hotline


For answers to TMS320 technical questions on device problems, develop-
ment tools, documentation, upgrades, and new products, you can contact the
DSP hotline via:
- Phone at (713) 274–2320 Monday through Friday from 8:30 a.m. to 5:00
p.m. central time

- Fax at (713) 274–2324

- Electronic mail at [email protected].

- European fax at 33–1–3070–1032

- Semiconductor Product Information Center (PIC) at (214) 644–5580

To ask about third-party applications and algorithm development packages,
contact the third party directly. Refer to the TMS320 Third-Party Support
Reference Guide (literature number SPRU052) for addresses and phone
numbers.
Extensive DSP documentation is available; this includes data sheets, user’s
guides, and application reports. Call the hotline at (800) 477–8924 for informa-
tion on literature that you can request from the Literature Response Center.
The DSP hotline does not provide pricing information. Contact the nearest TI
field sales office or the TI PIC for prices and availability of TMS320 devices and
support tools.

B.1.4 Bulletin Board Service (BBS)


The TMS320 DSP Bulletin Board Service (BBS) is a telephone-line computer
service that provides information on TMS320 devices, specification updates
for current or new devices and development tools, silicon and development
tool revisions and enhancements, new DSP application software as it be-
comes available, and source code for programs from any TMS320 user’s
guide.


You can access the BBS via the following:

- Modem: (300-, 1200-, or 2400-bps) dial (713) 274–2323. Set your modem
to 8 data bits, 1 stop bit, no parity.

- Internet: Use anonymous ftp to ti.com (Internet port address 192.94.94.1).
The BBS content is located in the subdirectory called mirrors.

To find out more about the BBS, refer to the TMS320 Family Development
Support Reference Guide (literature number SPRU011).

B.1.5 Technical Training Organization (TTO) TMS320 Workshop


The TMS320C3x DSP design workshop is tailored for hardware and software
design engineers and decision-makers who will be designing and utilizing the
TMS320C3x generation of DSP devices. Hands-on exercises throughout the
course give participants a rapid start in utilizing TMS320C3x design skills. Mi-
croprocessor/assembly language experience is required. Experience with dig-
ital design techniques and C language programming experience is desirable.
The following topics are covered in the TMS320C3x workshop:
- TMS320C3x architecture/instruction set
- Use of the PC-based TMS320C3x software simulator and EVM
- Floating-point and parallel operations
- Use of the TMS320C3x assembler/linker
- C programming environment
- System architecture considerations
- Memory and I/O interfacing
- TMS320C3x development support

For registration, pricing, or enrollment information on this and other TTO
TMS320 workshops, call (800) 336–5236, ext. 3904.


B.2 TMS320C3x Part Ordering Information


This section provides the device and support tool part numbers. Table B–1
lists the part numbers for the TMS320C30 and TMS320C31; Table B–2 gives
ordering information for TMS320C3x hardware and software support tools. An
explanation of the TMS320 family device and development support tool prefix
and suffix designators follows the two tables to assist in understanding the
TMS320 product numbering system.

Table B–1. TMS320C3x Digital Signal Processor Part Numbers

Device Technology Operating Frequency Package Type Typical Power Dissipation
TMS320C30GEL 0.8-µm CMOS 33 MHz Ceramic 181-pin PGA 1.00 W

TMS320C30GEL27 0.8-µm CMOS 27 MHz Ceramic 181-pin PGA 0.875 W

TMS320C30GEL40 0.8-µm CMOS 40 MHz Ceramic 181-pin PGA 1.25 W

TMS320C30PPM40 0.8-µm CMOS 40 MHz Plastic 208-pin QFP 0.85 W

TMS320C31PQL/PQA 0.8-µm CMOS 33 MHz Plastic 132-pin QFP 0.75 W

TMS320C31PQL27 0.8-µm CMOS 27 MHz Plastic 132-pin QFP 0.60 W

TMS320C31PQL40 0.8-µm CMOS 40 MHz Plastic 132-pin QFP 0.90 W

TMS320LC31PQL 0.8-µm CMOS 33 MHz Plastic 132-pin QFP 0.50 W

TMS320C31PQL50 0.8-µm CMOS 50 MHz Plastic 132-pin QFP 1.00 W

SMJ320C316FA27 0.8-µm CMOS 28 MHz Ceramic 141-pin PGA 0.60 W


SMJ320C31HF627 Ceramic 132-pin QFP 0.60 W
SMJ320C316FA33 Ceramic 141-pin PGA 0.75 W
SMJ320C316HF633 Ceramic 132-pin PGA 0.75 W

SMJ320C306BM33 0.8-µm CMOS 33 MHz Ceramic 181-pin PGA 1.10 W


SMJ320C30HF633 Ceramic 196-pin QFP

SMJ320C30GBM28 0.8-µm CMOS 28 MHz Ceramic 181-pin PGA 1.00 W


SMJ320C30HF628 Ceramic 196-pin QFP 1.00 W
SMJ320C30HTM28

SMJ320C30GBM25 0.8-µm CMOS 25 MHz Ceramic 181-pin PGA 1.00 W


SMJ320C30HF625 Ceramic 196-pin QFP 1.00 W
SMJ320C30HTM25


Table B–2. TMS320C3x Support Tool Part Numbers


Tool Description Operating System Part Number
(a) Software
C Compiler & Macro Assembler/ Linker VAX/VMS TMDS3243255-08
PC-DOS/MS-DOS TMDS3243855-02
SPARC (Sun OS)† TMDS3243555-08
Assembler/Linker PC-DOS/MS-DOS; OS/2 TMDS3243850-02
Simulator VAX/VMS TMDS3243251-08
PC-DOS/MS-DOS TMDS3243851-02
SPARC (Sun OS)† TMDS3243551-09
Tartan Floating-Point Library PC-DOS 320 FLO-PC C30
SPARC (Sun OS) 320 FLO-Sun-C30
Digital Filter Design Package PC-DOS DFDP
Tartan C++ Compiler/Debugger PC-DOS; OS/2, Wiredown TAR-CCM-PC-C3x
SPARC (Sun OS) TAR-CCM-SP-C3x
Tartan C++ Compiler PC-DOS; OS/2, Wiredown TAR-SIM-PC-C3x
SPARC (Sun OS) TAR-SIM-SP-C3x
TMS320C3x Emulation Porting Kit TMSX3240030
(b) Hardware
XDS510 Emulator PC/MS-DOS TMDS3260130
Evaluation Module (EVM) PC-DOS/MS-DOS TMDS3260030
† Note that SUN UNIX supports TMS320C3x software tools on the 68000 family-based SUN-3 series workstations and on the
SUN-4 series machines that use the SPARC processor, but not on the SUN-386i series of workstations.

B.2.1 Device and Development Support Tool Prefix Designators


Prefixes to TI part numbers designate phases in the product’s development
stage for both devices and support tools, as shown in the following definitions:

Device Development Evolutionary Flow


- TMX: Experimental device that is not necessarily representative of the fi-
nal device’s electrical specifications
- TMP: Final silicon die that conforms to the device’s electrical specifica-
tions but has not completed quality and reliability verification
- TMS: Fully qualified production device

Support Tool Development Evolutionary Flow


- TMDX: Development support product that has not yet completed TI’s in-
ternal qualification testing for development systems
- TMDS: Fully qualified development support product


TMX and TMP devices and TMDX development support tools are shipped with
the following disclaimer:

“Developmental product is intended for internal evaluation purposes.”

Note: Prototype Devices


TI recommends that prototype devices (TMX or TMP) not be used in produc-
tion systems because their expected end-use failure rate is undefined but
predicted to be greater than standard qualified production devices.

TMS devices and TMDS development support tools have been fully character-
ized, and their quality and reliability have been fully demonstrated. TI’s stan-
dard warranty applies to TMS devices and TMDS development support tools.

TMDX development support products are intended for internal evaluation pur-
poses only. They are covered by TI’s Warranty and Update Policy for Micropro-
cessor Development Systems products; however, they should be used by cus-
tomers only with the understanding that they are developmental in nature.

B.2.2 Device Suffixes


The suffix indicates the package type (for example, N, FN, or GE) and temper-
ature range (for example, L).

Figure B–1 presents a legend for reading the complete device name for any
TMS320 family member.


Figure B–1. TMS320 Device Nomenclature

Example: TMS 320 C 30 GE L
(prefix, device family, technology, device, package type, temperature range)

Prefix
  TMX = Experimental Device
  TMP = Prototype Device
  TMS = Qualified Device
  SMJ = MIL-STD-883C

Device Family
  320 = TMS320 Family

Technology
  C = CMOS
  E = CMOS EPROM
  P = OTP EPROM
  No Letter = NMOS

Device
  1st-generation DSP: 10, 14, 15, 16, 17
  2nd-generation DSP: 20, 25, 26
  3rd-generation DSP: 30, 31
  4th-generation DSP: 40
  5th-generation DSP: 50, 51

Package Type
  N  = Plastic DIP
  JD = Ceramic DIP Side-Brazed
  FN = Plastic Leaded CC
  GB = Ceramic PGA
  FJ = Ceramic Leaded CC
  FD = Leadless Ceramic CC
  FZ = Ceramic Leaded CC
  GE = Ceramic PGA, Glass Seal
  HU = Ceramic Quad Flatpack
  HT = Ceramic Quad Flatpack (gull wing)
  PQ = Plastic Quad Flatpack

Temperature Range
  H  = 0 to 50°C
  L  = 0 to 70°C
  S  = -55 to 100°C
  M  = -55 to 125°C
  A† = -40 to 85°C

† See electrical specifications for TMS320C31 PQA case temperature ratings.
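
The nomenclature in Figure B–1 can be applied mechanically. The following Python sketch splits a device name into the fields named in the figure. It is an illustration only: the function name decode_part_number is hypothetical, the lookup tables transcribe only a subset of the legend, and treating trailing digits as a speed suffix is an assumption inferred from the part numbers in Table B–1 (for example, TMS320C30GEL27), not something the figure states.

```python
import re

# Lookup tables transcribed from a subset of the Figure B-1 legend.
PREFIX = {
    "TMX": "Experimental Device",
    "TMP": "Prototype Device",
    "TMS": "Qualified Device",
    "SMJ": "MIL-STD-883C",
}
TECHNOLOGY = {"C": "CMOS", "E": "CMOS EPROM", "P": "OTP EPROM", "": "NMOS"}
PACKAGE = {
    "N": "Plastic DIP",
    "JD": "Ceramic DIP Side-Brazed",
    "FN": "Plastic Leaded CC",
    "GE": "Ceramic PGA, Glass Seal",
    "HU": "Ceramic Quad Flatpack",
    "HT": "Ceramic Quad Flatpack (gull wing)",
    "PQ": "Plastic Quad Flatpack",
}
TEMPERATURE = {
    "H": "0 to 50°C",
    "L": "0 to 70°C",
    "S": "-55 to 100°C",
    "M": "-55 to 125°C",
    "A": "-40 to 85°C",
}

# Two-letter package codes are tried before the one-letter code.
_PACKAGE_ALTERNATIVES = "|".join(sorted(PACKAGE, key=len, reverse=True))
_PATTERN = re.compile(
    r"^(?P<prefix>TMX|TMP|TMS|SMJ)"
    r"(?P<family>320)"
    r"(?P<technology>[CEP]?)"
    r"(?P<device>\d{2})"
    r"(?P<package>" + _PACKAGE_ALTERNATIVES + r")"
    r"(?P<temperature>[HLSMA])"
    r"(?P<speed>\d*)$"  # assumption: optional trailing speed suffix
)


def decode_part_number(part):
    """Split a TMS320 device name into the fields shown in Figure B-1."""
    match = _PATTERN.match(part.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized part number: {part!r}")
    return {
        "prefix": PREFIX[match["prefix"]],
        "family": "TMS320 Family",
        "technology": TECHNOLOGY[match["technology"]],
        "device": match["device"],
        "package": PACKAGE[match["package"]],
        "temperature": TEMPERATURE[match["temperature"]],
        "speed_suffix": match["speed"],
    }


if __name__ == "__main__":
    for name in ("TMS320C30GEL", "TMS320C31PQL40"):
        print(name, decode_part_number(name))
```

Names that use designators outside these tables (for example, TMS320LC31PQL) raise ValueError rather than being guessed at.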

Appendix C

Quality and Reliability

The quality and reliability of Texas Instruments (TI) microprocessor and
microcontroller products, which include TMS320 digital signal processors,
rely on feedback from the following:

- Our customers,

- Our total manufacturing operation from front-end wafer fabrication to final
shipping inspection, and

- Product quality and reliability monitoring.

Our customers’ perception of quality is the governing criterion for judging
performance. This concept is the basis for the TI Corporate Quality Policy,
which is as follows:

“For every product or service we offer, we shall define the requirements that
solve the customer’s problems, and we shall conform to those requirements
without exception.”

Texas Instruments has developed a leadership reliability qualification system,
based on years of experience with leading-edge memory technology and on
years of research into customer requirements. To achieve constant
improvement, programs that support that system respond to customer input and
internal information.

This appendix presents the following major topics:

Topic

C.1 Reliability Stress Tests
C.2 TMS320C31 PQFP Reflow Soldering Precautions

C.1 Reliability Stress Tests


Accelerated stress tests are performed on new semiconductor products and
process changes to qualify them and ensure excellence in product reliability.
The following test environments are typical:
- High-temperature operating life
- Storage life
- Temperature cycling
- Biased humidity
- Autoclave
- Electrostatic discharge
- Package integrity
- Electromigration
- Channel-hot electrons (performed on geometries less than 2.0 µm)

Typical events or changes that require internal requalification of a product
include the following:

- New die design, shrink, or layout

- Wafer process (baseline/control systems, flow, mask, chemicals, gases,
dopants, passivation, or metal systems)

- Packaging assembly (baseline control systems or critical assembly
equipment)

- Piece parts (such as lead frame, mold compound, mount material, bond
wire, or lead finish)

- Manufacturing site

TI reliability control systems extend beyond qualification. Total reliability
controls and management include product reliability monitoring as well as final
product release controls. MOS memories, utilizing high-density active
elements, serve as the leading indicator in wafer-process integrity at TI MOS
fabrication sites, enhancing all MOS logic device yields and reliability. TI places
more than several thousand MOS devices per month on reliability tests to
ensure and sustain built-in product excellence.

Table C–1 lists the microprocessor and microcontroller reliability tests, the du-
ration of the test, and sample size. Table C–2 contains definitions and descrip-
tions of terms used in those tests.


Table C–1. Microprocessor and Microcontroller Tests

                                                          Sample Size
Test                                          Duration    Plastic   Ceramic
Operating life, 125°C, 5.0 V                  1000 hrs    129       129
Storage life, 150°C                           1000 hrs    45†       45
Biased humidity, 85°C/85 percent RH, 5.0 V    1000 hrs    77        –
Autoclave, 121°C, 1 ATM                       240 hrs     45        –
Temperature cycle, –65 to 150°C               1000 cyc‡   77        77
Temperature cycle, 0 to 125°C                 3000 cyc    77        77
Thermal shock, –65 to 150°C                   200 cyc     77        77
Electrostatic discharge, ±2 kV                            15        15
Latch-up (CMOS devices only)                              5         5
Mechanical sequence                                       –         22
Thermal sequence                                          –         22
Thermal/mechanical sequence                               –         22
PIND                                                      –         45
Internal water vapor                                      –         3
Solderability                                             22        22
Solder heat                                               22        22
Resistance to solvents                                    15        15
Lead integrity                                            15        15
Lead pull                                                 22        –
Lead finish adhesion                                      15        15
Salt atmosphere                                           15        15
Flammability (UL94-V0)                                    3         –
Thermal impedance                                         5         5

† If junction temperature does not exceed plasticity of package
‡ For severe environments; reduced cycles for office environments


Table C–2. Definitions of Microprocessor Testing Terms

Term                              Definition/Description                    References
Average Outgoing Quality (AOQ)    Amount of defective product in a
                                  population, usually expressed in terms
                                  of parts per million (PPM).
Failure in Time (FIT)             Estimated field failure rate in number
                                  of failures per billion power-on device
                                  hours; 1000 FITs equal 0.1 percent
                                  failure per 1000 device hours.
Operating Life                    Device dynamically exercised at a high
                                  ambient temperature (usually 125°C) to
                                  simulate field usage that would expose
                                  the device to a much lower ambient
                                  temperature (such as 55°C). Using a
                                  derived high temperature, a 55°C
                                  ambient failure rate can be calculated.
Storage Life                      Device exposed to 150°C unbiased
                                  condition. Bond integrity is stressed
                                  in this environment.
Biased Humidity                   Moisture and bias used to accelerate
                                  corrosion-type failures in plastic
                                  packages. Conditions include 85°C
                                  ambient temperature with 85% relative
                                  humidity (RH). Typical bias voltage is
                                  +5 V and is grounded on alternating
                                  pins.
Autoclave (Pressure Cooker)       Plastic-packaged devices exposed to
                                  moisture at 121°C using a pressure of
                                  one atmosphere above normal pressure.
                                  The pressure forces moisture
                                  permeation of the package and
                                  accelerates corrosion mechanisms (if
                                  present) on the device. External
                                  package contaminants can also be
                                  activated and caused to generate
                                  inter-pin current leakage paths.
Temperature Cycle                 Device exposed to severe temperature
                                  extremes in an alternating fashion
                                  (–65°C for 15 minutes and 150°C for 15
                                  minutes per cycle) for at least 1000
                                  cycles. Package strength, bond
                                  quality, and consistency of assembly
                                  process are tested in this environment.
Electrostatic Discharge           Device exposed to electrostatic
                                  discharge pulses. Calibration is
                                  according to MIL-STD-883C, Method
                                  3015.6. Devices are stressed to
                                  determine failure threshold of the
                                  design.

Thermal Shock                     Test similar to the temperature cycle     MIL-STD-883C, Method 1011
                                  test, but involving a liquid-to-liquid
                                  transfer.
Particle Impact Noise Detection   A nondestructive test to detect loose
(PIND)                            particles inside a device cavity.
Mechanical Sequence               Fine and gross leak                       MIL-STD-883C, Method 1014
                                  Mechanical shock                          MIL-STD-883C, Method 2002,
                                                                            1500 g, 0.5 ms, Condition B
                                  PIND (optional)                           MIL-STD-883C, Method 2020
                                  Vibration, variable frequency             MIL-STD-883C, Method 2007,
                                                                            20 g, Condition A
                                  Constant acceleration                     MIL-STD-883C, Method 2001
                                  Fine and gross leak                       MIL-STD-883C, Method 1014
                                  Electrical test                           To data sheet limits
Thermal Sequence                  Fine and gross leak                       MIL-STD-883C, Method 1014
                                  Solder heat (optional)                    MIL-STD-750C, Method 1014
                                  Temperature cycle                         MIL-STD-883C, Method 1010,
                                  (10 cycles minimum)                       –65 to +150°C, Condition C
                                  Thermal shock                             MIL-STD-883C, Method 1011,
                                  (10 cycles minimum)                       –55 to +125°C, Condition B
                                  Moisture resistance                       MIL-STD-883C, Method 1004
                                  Fine and gross leak                       MIL-STD-883C, Method 1014
                                  Electrical test                           To data sheet limits
Thermal/Mechanical Sequence       Fine and gross leak                       MIL-STD-883C, Method 1014
                                  Temperature cycle                         MIL-STD-883C, Method 1010,
                                  (10 cycles minimum)                       –65 to +150°C, Condition C
                                  Constant acceleration                     MIL-STD-883C, Method 2001,
                                                                            30 kg, Y1 Plane
                                  Fine and gross leak                       MIL-STD-883C, Method 1014
                                  Electrical test                           To data sheet limits
Electrostatic discharge                                                     MIL-STD-883C, Method 3015
Solderability                                                               MIL-STD-883C, Method 2033
Solder heat                                                                 MIL-STD-750C, Method 2031,
                                                                            10 sec
Salt atmosphere                                                             MIL-STD-883C, Method 1009,
                                                                            Condition A, 24 hrs min
Lead pull                                                                   MIL-STD-883C, Method 2004,
                                                                            Condition A
Lead integrity                                                              MIL-STD-883C, Method 2004,
                                                                            Condition B1
Electromigration                  Accelerated stress testing of
                                  conductor patterns to ensure
                                  acceptable lifetime of power-on
                                  operation
Resistance to solvents                                                      MIL-STD-883C, Method 2015
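
The Failure in Time (FIT) definition in Table C–2 states that 1000 FITs equal 0.1 percent failure per 1000 device hours. The following Python sketch works through that conversion. The function names are hypothetical, and the fleet size and operating hours in the usage example are made-up values chosen only to illustrate the arithmetic.

```python
HOURS_PER_FIT_BASIS = 1e9  # FIT counts failures per billion device hours


def fit_to_percent_per_1000_hours(fit):
    """Convert a FIT rate to percent failures per 1000 device hours."""
    return fit * 1000.0 * 100.0 / HOURS_PER_FIT_BASIS


def expected_failures(fit, devices, hours_each):
    """Expected number of failures for a population of devices."""
    return fit * devices * hours_each / HOURS_PER_FIT_BASIS


if __name__ == "__main__":
    # The figure quoted in Table C-2: 1000 FITs.
    print(fit_to_percent_per_1000_hours(1000))
    # Hypothetical fleet: 10,000 devices running 5,000 hours each at 100 FITs.
    print(expected_failures(100, 10_000, 5_000))
```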


Table C–3 lists the TMS320C3x devices, the approximate number of transis-
tors, and the equivalent gates. The numbers have been determined from de-
sign verification runs.

Table C–3. TMS320C3x Transistors


Device # Transistors # Gates
CMOS: TMS320C30 600K–700K 200K

CMOS: TMS320C31 500K–600K 100K

Note: MOS Semiconductors


Texas Instruments reserves the right to make changes in MOS semiconduc-
tor test limits, procedures, or processing without notice. Unless prior ar-
rangements for notification have been made, TI advises all customers to re-
verify current test and manufacturing conditions prior to relying on published
data.


C.2 TMS320C31 PQFP Reflow Soldering Precautions


Recent tests have identified an industry-wide problem experienced by sur-
face-mounted devices exposed to reflow soldering temperatures. This prob-
lem involves a package-cracking phenomenon sometimes experienced by
large (for example, 132-pin) plastic quad flat pack (PQFP) packages during
surface-mount manufacturing. This phenomenon occurs if the TMS320C31
PQA or PQL is exposed to uncontrolled levels of humidity prior to reflow solder.
This moisture can flash to steam during solder reflow and cause sufficient
stress to crack the package and compromise device integrity. Once the device
is soldered or socketed into the board, no special handling precautions are re-
quired.

To minimize moisture absorption, TI ships the TMS320C31 PQA or PQL in dry
pack shipping bags with a relative humidity (RH) indicator card and
moisture-absorbing desiccant. These moisture-barrier shipping bags will
adequately block moisture transmission to allow shelf storage for 12 months
from date of seal when stored at less than 60% RH and less than 30°C. Devices
may be stored outside the sealed bags indefinitely if stored at less than 25% RH
and less than 30°C.

Once the bag seal is broken, the devices should be stored at less than 60% RH
and less than 30°C and should be reflow soldered within two days of removal.
If these conditions are not met, TI recommends baking the devices in a clean
oven at 125°C and 10% maximum RH for 25 hours. This procedure restores the
devices to their dry-packed moisture level.
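
The handling rules above can be summarized as a simple check. The following Python sketch is one reading of those rules, not a TI procedure: the function name needs_bake and its parameters are hypothetical, "within two days" is taken to include exactly two days, and the 12-month sealed-bag shelf life is not modeled.

```python
MAX_DAYS_OUT_OF_BAG = 2.0     # reflow within two days of removal
MAX_OPEN_RH_PERCENT = 60.0    # storage limit once the bag is opened
INDEFINITE_RH_PERCENT = 25.0  # below this, storage outside the bag is unlimited
MAX_STORAGE_TEMP_C = 30.0     # all storage conditions require less than 30 C

# Recommended bake if the conditions are not met.
BAKE_TEMP_C = 125
BAKE_MAX_RH_PERCENT = 10
BAKE_HOURS = 25


def needs_bake(days_out_of_bag, rh_percent, temp_c):
    """Return True if devices should be baked before reflow soldering."""
    if temp_c >= MAX_STORAGE_TEMP_C:
        return True
    if rh_percent < INDEFINITE_RH_PERCENT:
        return False  # indefinite storage outside the bag is allowed
    if rh_percent < MAX_OPEN_RH_PERCENT and days_out_of_bag <= MAX_DAYS_OUT_OF_BAG:
        return False
    return True


if __name__ == "__main__":
    if needs_bake(days_out_of_bag=3, rh_percent=50, temp_c=25):
        print(f"Bake at {BAKE_TEMP_C} C, {BAKE_MAX_RH_PERCENT}% max RH, "
              f"for {BAKE_HOURS} hours")
```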

Note: ESD Precautions


Shipping tubes will not withstand the 125° C baking process. Before baking,
transfer the devices to a metal tray or tube. Follow standard ESD precau-
tions.

TI recommends that the reflow process not exceed two solder cycles and that
the temperature not exceed 220° C.

If you have questions or concerns, please contact your local TI representative.

Appendix D

Calculation of TMS320C30 Power Dissipation

The TMS320C30 is a state-of-the-art, high-performance, 32-bit floating-point
digital signal processing (DSP) microprocessor fabricated in CMOS
technology. This device is the first member of the third generation of TMS320
family single-chip DSP microprocessors. Since 1982, when the first-generation
TMS32010 was introduced, the TMS320 family has established itself as
the industry standard for DSP. The TMS320C30’s innovative architecture and
specialized instruction set provide high speed and increased flexibility for DSP
applications. This combination makes it possible to execute up to 40 million
floating-point operations per second (MFLOPS).

As device sophistication and levels of integration increase with evolving
semiconductor technologies, actual levels of power dissipation vary widely and
depend heavily on the particular application in which the device is used and the
nature of the program being executed. In addition, due to the inherent
characteristics of CMOS technology, power requirements vary according to clock
rates and data values being processed.

This appendix presents the information necessary to determine TMS320C30
power supply current requirements under different operating conditions. With
this information, you can determine the device’s power dissipation, which, in
turn, you can use to calculate thermal management requirements.

This appendix discusses the following major topics:

Topic
D.1 Fundamental Power Dissipation Characteristics
D.2 Current Requirement for Internal Circuitry
D.3 Current Requirement for Output Driver Circuitry
D.4 Calculation of Total Supply Current
D.5 Example Supply Current Calculations
D.6 Summary
D.7 Photo of IDD for FFT
D.8 FFT Assembly Code

D.1 Fundamental Power Dissipation Characteristics


Typically, the power specification of an integrated circuit (IC) is expressed as a
function of operating frequency, supply voltage, operating temperature, and
output load. As devices become more complex, the specification must also be
based on device functionality. CMOS devices inherently draw current only dur-
ing switching through the linear region. Therefore, the power supply current
is related to the rate of switching. Furthermore, since the output drivers of the
TMS320C30 are specified to drive direct current (DC) loads, the power supply
current resulting from external writes depends not only on switching rate but
also on the value of data written.

D.1.1 Components of Power Supply Current Requirements


There are four basic components of the power supply current:
- Quiescent,
- Internal Operations,
- Internal Bus Operations, and
- External Bus Operations

D.1.2 Dependencies
The power supply current consumption depends on many factors. Four are
system-related:
- Operating frequency,
- Supply voltage,
- Operating temperature, and
- Output load

Several others are also related to TMS320C30 operation, including:


- Duty cycle of operations,
- Number of buses used,
- Wait states,
- Cache usage, and
- Data value


The total power supply current for the device is described in this equation,
which applies the four basic power supply current components and the
dependencies described above:

$$I = (I_q + I_{iops} + I_{ibus} + I_{xbus}) \times FV \times T$$

where

$I_q$ is the quiescent current component,

$I_{iops}$ is the current component due to internal operations,

$I_{ibus}$ is the current component due to internal bus usage, including data value
and cycle time dependencies,

$I_{xbus}$ is the current component due to external bus usage, including data
value, wait state, cycle time, and capacitive load dependencies,

$FV$ is a scale factor for frequency and supply voltage, and

$T$ is a scale factor for operating temperature.

Application of this equation and determination of all of the dependencies are
described in detail in this appendix.
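
As an illustration, the total-current equation can be written as a small Python function, reading it as the sum of the four current components multiplied by the two scale factors. The function and parameter names are hypothetical. Defaulting FV and T to 1.0 is an assumption that corresponds to evaluating the components at the conditions under which they were measured; the usage values (110 mA, 55 mA, and 44 mA) are figures quoted elsewhere in this appendix.

```python
def total_supply_current(iq_ma, iiops_ma, iibus_ma, ixbus_ma, fv=1.0, t=1.0):
    """Total supply current in mA: (Iq + Iiops + Iibus + Ixbus) * FV * T.

    fv and t are the frequency/supply-voltage and temperature scale factors;
    the default of 1.0 (no scaling) is an assumption of this sketch.
    """
    return (iq_ma + iiops_ma + iibus_ma + ixbus_ma) * fv * t


if __name__ == "__main__":
    # Quiescent, internal operations, nominal internal bus usage,
    # and no significant external writes.
    print(total_supply_current(110, 55, 44, 0))
```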

This appendix explains, in detail, how to determine the power supply current
requirement for the TMS320C30. If a less detailed analysis is sufficient, the
minimum, typical, and maximum values can be used to determine a rough
estimate of the power supply current requirements. The minimum power supply
current requirement is 110 mA. The typical (average) current consumption
is 200 mA, as described in the TMS320C30 data sheet, and is associated
with most algorithms running on the device unless data output is excessive.

Maximum Current Requirement


The maximum current requirement is 600 mA and occurs only
under worst case conditions: namely, writing alternating data
(AAAAAAAAh to 55555555h) out of both external buses
simultaneously, every cycle, with 80 pF loads and running at 33
MHz.

If an extremely conservative approach is desired, the maximum value can be
used.


D.1.3 Determining Algorithm Partitioning

Each part of an algorithm behaves differently, depending on its internal and ex-
ternal bus usage. To analyze the power supply current requirement, you must
partition an algorithm into segments with distinct concentrations of internal or
external bus usage. The analysis that follows is applied to each distinct pro-
gram segment to determine the power supply current requirement for that sec-
tion. The average power supply current requirement can then be calculated
from the requirements of each segment of the algorithm.
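
One way to combine the per-segment results is a duration-weighted average, sketched below in Python. Weighting each segment by its execution time is an assumption of this sketch, and the function name, segment currents, and cycle counts in the usage example are hypothetical.

```python
def average_supply_current(segments):
    """Duration-weighted average current.

    segments: iterable of (current_ma, duration) pairs. The duration may be
    in any consistent unit, such as H1 cycles.
    """
    segments = list(segments)
    total_duration = sum(duration for _, duration in segments)
    if total_duration <= 0:
        raise ValueError("total duration must be positive")
    weighted = sum(current * duration for current, duration in segments)
    return weighted / total_duration


if __name__ == "__main__":
    # Hypothetical program: 600 cycles at 165 mA, then 400 cycles at 250 mA.
    print(average_supply_current([(165, 600), (250, 400)]))
```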

D.1.4 Test Setup Description

All TMS320C30 supply current measurements were performed on the test set-
up shown in Figure D–1. The test setup consists of a TMS320C30, 8K words
of zero-wait-state Cypress Semiconductor SRAMs (CY7C186–25PC), and
RC loads on all data and address lines. A Tektronix Current Probe (P6042)
measures the power supply current in all VDD lines of the device. The supply
voltage on the output load is 2.15 V. Unless otherwise specified, all measure-
ments are made at a supply voltage of 5.0 V, an input clock frequency of 33
MHz, a capacitive load of 80 pF, and an operating temperature of 25°C.

Figure D–1. Current Measurement Test Setup

[Figure: a Tektronix current probe (P6042) on the VDD supply of the TMS320C30;
CY7C186-25PC SRAM; primary bus (32 data lines, 24 address lines) and expansion
bus (32 data lines, 13 address lines), each with RC loads (R = 825 Ω tied to
2.15 V).]


D.2 Current Requirement for Internal Circuitry


The power supply current requirement for internal circuitry consists of three
components: quiescent, internal operations, and internal bus operations.
Quiescent and internal operations are constants, but the internal bus opera-
tions component varies with the rate of internal bus usage and the data values
being transferred.

D.2.1 Quiescent
Quiescent refers to the baseline supply current drawn by the TMS320C30 dur-
ing minimal internal activity, such as executing the IDLE instruction or branch-
ing to self. It includes the current required to fetch an instruction from on- or
off-chip memory. The quiescent requirement for the TMS320C30 is 110 mA.
Examples of quiescent current include:
- Maintaining timers and serial ports
- Executing the IDLE instruction
- TMS320C30 in HOLD mode pending external bus access
- TMS320C30 in reset
- Branching to self

D.2.2 Internal Operations


Internal operations are those that require more current than quiescent activity
but do not include external bus usage or significant internal bus usage. Internal
operations include register-to-register multiplication, ALU operations, and
branches. They add a constant 55 mA above the quiescent so that the total
contribution of quiescent and internal operations is 165 mA. Note, however,
that internal and/or external bus operations executed via an RPTS instruction
do not contribute an internal operations power supply current component and
hence do not add 55 mA to quiescent current. During an instruction in RPTS,
activity other than the instruction being repeated is suspended; therefore,
power supply current is related only to the operation performed by the instruc-
tion being executed. The next contributing factor to the power supply current
requirement is internal bus operations.


D.2.3 Internal Bus Operations

The internal bus operations include all operations that utilize the internal buses
extensively, such as accessing internal RAM every cycle. No distinction is
made between internal reads (such as instruction or operand fetches from in-
ternal ROM or internal RAM banks) and internal writes (such as operand
stores to internal RAM banks), because internally they are equal. Significant
use of internal buses adds a term to the power supply current requirement that
is data-dependent. Since switching requires more current, moving changing
data at high rates requires higher power supply current.

Pipeline conflicts, use of cache, fetches from external wait-state memory, and
writes to external wait-state memory all affect the internal and external bus
cycles of an algorithm executing on the TMS320C30. Therefore, the internal
bus usage of the algorithm must be determined to accurately calculate power
supply current requirements. The TMS320C30 software simulator and XDS
emulator both provide benchmarking and timing capabilities that allow bus
usage to be determined.

The current resulting from internal bus usage varies roughly exponentially with
transfer rates. Figure D–2 shows internal bus current requirements for
transferring alternating data (AAAAAAAAh to 55555555h) at several transfer rates
(expressed as the transfer cycle time). A transfer cycle time less than 1 implies
multiple accesses per single H1 cycle (for example, when direct memory access
(DMA) is used). Transfer cycle times greater than 1 refer to single-cycle transfers
with one or more cycles between them. The minimum transfer cycle time is
one-third, which corresponds to three accesses in a single H1 cycle.

The data set AAAAAAAAh to 55555555h exhibits the maximum current for
these types of operations. Less current is required for transferring other data
patterns, and current values can be derated accordingly as described later in
this subsection.

As the transfer rate decreases (that is, transfer cycle time increases), the
incremental IDD approaches 0 mA. Transfer cycle times of more than
seven H1 cycles do not add any current and are considered insignificant.
Figure D–2 represents the incremental IDD due to internal bus operations, which
is added to quiescent and internal operations current values.

For example, the maximum transfer rate corresponds to three accesses every
cycle or one-third H1 transfer cycle time. At this rate, 85 mA is added to the
quiescent (110 mA) and internal operation (55 mA) current values for a total
of 250 mA.
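
The arithmetic in this example can be sketched in Python as follows. The lookup table holds only the incremental values quoted in the text of this appendix (85 mA at one-third H1 cycle and 55 mA at one-half H1 cycle, both for alternating As/5s data); any other transfer cycle time must be read from Figure D–2, so the sketch raises an error instead of interpolating. The function names are hypothetical.

```python
from fractions import Fraction

QUIESCENT_MA = 110
INTERNAL_OPS_MA = 55

# Incremental internal bus current (mA) for alternating As/5s data, keyed by
# transfer cycle time in H1 cycles. Only values quoted in the text appear here.
QUOTED_INCREMENT_MA = {Fraction(1, 3): 85, Fraction(1, 2): 55}


def internal_bus_increment(transfer_cycle_time):
    """Incremental IDD (mA) due to internal bus operations."""
    t = Fraction(transfer_cycle_time)
    if t > 7:
        return 0  # more than seven H1 cycles adds no current
    if t in QUOTED_INCREMENT_MA:
        return QUOTED_INCREMENT_MA[t]
    raise LookupError("value not quoted in the text; read it from Figure D-2")


def internal_circuitry_current(transfer_cycle_time):
    """Quiescent + internal operations + internal bus current (mA)."""
    return (QUIESCENT_MA + INTERNAL_OPS_MA
            + internal_bus_increment(transfer_cycle_time))


if __name__ == "__main__":
    print(internal_circuitry_current(Fraction(1, 3)))  # three accesses per cycle
```

Pass the transfer cycle time as a Fraction (or an integer) so that one-third compares exactly with the table key.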


Figure D–2 shows the internal bus current requirement when transferring As,
followed by 5s, for various transfer rates. Figure D–3 shows the data
dependence of the internal bus current requirement when the data is other
than As followed by 5s. The trapezoidal region bounds all possible data
values transferred. The lower line represents the scale factor for transferring
the same data. The upper line represents the scale factor for transferring
alternating data (all 0s to all Fs or all As to all 5s, etc.).

Figure D–2. Internal Bus Current Versus Transfer Rate

[Plot: Internal Bus Rate of Transfer Analysis (As/5s). Vertical axis:
incremental IDD (mA), –20 to 100. Horizontal axis: transfer cycle time
(H1 cycles), 0 to 14.]

Figure D–3. Internal Bus Current Versus Data Complexity Derating Curve

[Plot: Internal Bus Data Dependency. Vertical axis: normalized IDD.
Horizontal axis: relative data complexity, 0 to 1. The upper line
(alternating data) is labeled 0s–Fs and As–5s; the lower line (same data)
is labeled 0s–0s and Fs–Fs.]


Since the number of possible permutations of data values is quite large, the
extent to which data varies is referred to as relative data complexity. This term
represents a relative measure of the extent to which data values are changing and
the extent to which the number of bits are changing state. Therefore, relative
data complexity ranges from 0, signifying minimal variation of data, to a
normalized value of 1, signifying greatest data variation.

If a statistical knowledge of the data exists, Figure D–3 can be used to deter-
mine the exact power supply requirement according to internal bus usage. For
example, Figure D–3 indicates a 63% scale factor when all Fs are moved inter-
nally every cycle with two accesses per cycle. This scale factor is multiplied
by 55 mA (from Figure D–2, at one-half H1 cycle transfer time), yielding 34.65
mA because of internal bus usage. Therefore, an algorithm running under
these conditions requires about 200 mA of power supply current (110 + 55 +
34.65).

Since a statistical knowledge of the data might not be readily available, a
nominal scale factor will suffice. The median between the minimum and maximum
values at 50% relative data complexity yields a value of 0.80. This value will
serve as an estimate of a nominal scale factor. Therefore, you can use this
nominal data scale factor of 80% for internal bus data dependency, adding 44
mA (80% of 55 mA) to 110 mA (quiescent) and 55 mA (internal operations) to
yield about 210 mA. As an upper bound, assume worst case conditions of three
accesses of alternating data every cycle, adding 85 mA to 110 mA (quiescent)
and 55 mA (internal operations) to yield 250 mA.
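
The three estimates in this subsection follow the same pattern: scale the incremental internal bus current by the data-dependency factor, then add the quiescent and internal operations components. A minimal Python sketch of that arithmetic, with hypothetical function names and using only values quoted above, is:

```python
QUIESCENT_MA = 110.0
INTERNAL_OPS_MA = 55.0


def scaled_internal_bus_current(increment_ma, scale_factor):
    """Internal bus current (mA) after applying the data-dependency factor."""
    return increment_ma * scale_factor


def internal_circuitry_current(increment_ma, scale_factor):
    """Quiescent + internal operations + scaled internal bus current (mA)."""
    return (QUIESCENT_MA + INTERNAL_OPS_MA
            + scaled_internal_bus_current(increment_ma, scale_factor))


if __name__ == "__main__":
    cases = {
        "all Fs, two accesses per cycle": (55.0, 0.63),
        "nominal scale factor": (55.0, 0.80),
        "worst case, three accesses per cycle": (85.0, 1.00),
    }
    for label, (increment, scale) in cases.items():
        total = internal_circuitry_current(increment, scale)
        print(f"{label}: {total:.2f} mA")
```

The first two cases come to 199.65 mA and 209 mA, which the text rounds to about 200 mA and 210 mA.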


D.3 Current Requirement for Output Driver Circuitry


The output driver circuits on the TMS320C30 are required to drive significantly
higher DC and capacitive loads than internal device logic. Therefore, they are
designed to drive larger currents than internal devices. Because of this, output
drivers impose higher supply current requirements than other sections of cir-
cuitry on the device.

Accordingly, the highest values of supply current are exhibited when external
writes are being performed at high speed. During reads, or when the external
buses are not being used, the TMS320C30 is not driving the data bus; this
eliminates the most significant component of output buffer current. Further-
more, in typical cases, only a few address lines are changing, or the whole ad-
dress bus is static. Under these conditions, an insignificant amount of supply
current is consumed. Therefore, when no external writes are being performed
or when writes are performed infrequently, current due to output buffer circuitry
can be ignored.

When external writes are being performed, the current required to supply the
output buffers depends on several considerations. As with internal bus opera-
tions, current required for output drivers depends on the data being transferred
and the rate at which transfers are being made. Additionally, output driver cur-
rent requirements depend on the number of wait states implemented, because
wait states affect rates at which bus signals switch. Finally, current values are
also dependent upon external bus DC and capacitive loading.

External operations involve writes external to the device and constitute the
major power supply current component. The power supply current for the ex-
ternal buses is made up of three components and is summarized in the follow-
ing equation:

$$I_{base} + I_{prim} + I_{exp}$$

where

$I_{base}$ is the 60-mA baseline current component

$I_{prim}$ is the primary bus current component

$I_{exp}$ is the expansion bus current component

The remainder of this section describes in detail the calculation of external bus
current components.


D.3.1 Primary Bus


The current due to primary bus writes varies roughly exponentially with both
wait states and write cycle time. Also, current components due to output driver
circuitry are represented as offsets from the baseline value. Since the baseline
value is related to internal current components, negative values for current off-
set are obtained under some circumstances. Note, however, that actual nega-
tive current does not occur.

As previously mentioned, to obtain accurate current values, you must first es-
tablish timing of write cycles on the buses. To determine the rate and timings
at which write cycles to the external buses occur, you must analyze program
activity, including any pipeline conflicts that may exist. Information from this
manual and the TMS320C30 emulator or simulator is useful in making these
determinations. Note that effects from the use of cache must also be ac-
counted for in these analyses because use of cache can affect whether in-
structions are fetched from external memory.

When evaluating external write activity in a given program segment, you must
consider whether a particular level of external write activity constitutes signifi-
cant activity. If writes are being performed at a slow enough rate, they do not
significantly impact supply current requirements; therefore, current due to ex-
ternal writes can be ignored. This is the case, however, only if writes are being
performed at very slow rates on both the primary and the expansion buses. If
writes are being performed at high speed on only one of the two external
buses, you should still use the approach described in this section to calculate
current requirements.

Note that, although you obtain negative incremental current values under
some circumstances, the total contribution for external buses, including base-
line current, must always be positive. The reason is that, when external buses
are used minimally, total current requirements always approach the current
contribution due to internal components, which is solely a function of internal
activity. This places a lower limit on current contributions resulting from the pri-
mary and expansion buses, because the total current due to external buses
is the sum of the 60-mA baseline value and the primary and expansion bus
components. This effect is discussed in further detail in the rest of this subsec-
tion.


When you have established bus-write cycle timing, you can use Figure D–4
to determine the contribution to supply current due to this bus activity.
Figure D–4 shows values of current contribution from the primary bus for vari-
ous numbers of wait states and H1 cycles between writes. These characteris-
tics are exhibited when writes of alternating 55555555h and AAAAAAAAh are
being performed at a capacitive load of 80 pF per output signal line. The condi-
tions exhibit the highest current values on the device. The values presented
in the figure represent incremental or additional current contributed by the pri-
mary bus output driver circuitry under the given conditions. Current values ob-
tained from this graph are later scaled and added to several other current
terms to calculate the total current for the device. As indicated in the figure, the
lower curve represents the current contribution for 18 or more cycles between
writes.

Figure D–4. Primary Bus Current Versus Transfer Rate and Wait States
[Graph: Primary Bus Analysis (80 pF, As/5s). Incremental IDD (mA, –50 to 200)
versus wait states (0 to 7), with one curve for each number of cycles between
writes: q = 1, q = 2, q = 4, and q ≥ 18.]

Note that number of cycles between writes refers to the number of H1 cycles
between the active portion of the write cycles as defined in Chapter 13—that
is, between H1 cycles when STRB, MSTRB, or IOSTRB and R/W (or XR/W,
as the case may be) are low. As shown in Figure D–4, the minimum number
of cycles between writes is 1 because with back-to-back writes there is one H1
cycle between active portions of the writes.
To further illustrate the relationship of current and write cycle time, Figure D–5
shows the characteristics of current for various numbers of cycles between
writes for zero wait states. The information on this curve can be used to obtain
more precise values of current if zero wait states are being used and the num-
ber of cycles between writes does not fall on one of the curves in Figure D–4.


Figure D–5. Primary Bus Current Versus Transfer Rate at Zero Wait States
[Graph: Primary Bus Duty Cycle Analysis (80 pF, As/5s). Incremental IDD (mA,
–50 to 200) versus H1 cycles between writes (0 to 20).]

Note that, although these graphs contain negative current values, negative
current does not actually occur. The negative values exist because the graphs
represent a current offset from a common baseline current value, which is not
necessarily the lowest current exhibited. Depicting the current contributions
of different components in this way simplifies current calculations because it
allows the calculations to be made independently. Independent calculations are
possible because information about relationships between different sections of
the device is included implicitly in the information for each section.

Figure D–4 and Figure D–5 show that the contribution of writes for external
bus activities becomes insignificant if writes are being performed at intervals
of more than 18 cycles. Under these conditions, you should use the incremen-
tal value of –30-mA current contribution due to the primary bus. Note, however,
that you should use a value of –30 mA only if the expansion bus is being used
extensively. This is because the total contribution for external buses, including
baseline current, must always be positive. If the expansion bus is not being
used and the primary bus is being used minimally, the current contribution due
to the primary bus must always be greater than or equal to 20 mA. This ensures
that the correct total current value is obtained when summing external bus
components. Once a current value has been obtained from Figure D–4 or
Figure D–5, this value can, if necessary, be scaled by a data dependency fac-
tor, as described at the end of this section. This scaled value is then summed
along with several other current terms to determine the total supply current.
Calculation of total supply current is described in detail in Section D.4 on page
D-18.
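
The requirement that the external bus total never goes negative can be sketched in a few lines of Python. This is an illustration only, not part of the original report: the function name and the clamping with max() are assumptions, and the example offsets (170 mA for back-to-back zero-wait-state primary writes, –30 mA for an idle primary bus, and –80 mA for an idle expansion bus) are the values quoted elsewhere in this appendix.

```python
I_BASE_MA = 60.0  # external bus baseline current stated in this section


def external_bus_current_ma(i_prim_ma, i_exp_ma):
    """Baseline plus the two bus offsets, clamped so the total is never negative.

    The clamp is an assumed way of expressing the rule that the total
    external bus contribution, including baseline, cannot be negative.
    """
    return max(I_BASE_MA + i_prim_ma + i_exp_ma, 0.0)


# Both buses essentially idle: 60 - 30 - 80 would be -50 mA, so clamp to zero.
print(external_bus_current_ma(-30.0, -80.0))   # 0.0
# Back-to-back primary writes with the expansion bus idle.
print(external_bus_current_ma(170.0, -80.0))   # 150.0
```

With the expansion bus idle at –80 mA, the clamp becomes active whenever the primary offset is below 20 mA, which matches the 20-mA lower limit given above.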


D.3.2 Expansion Bus


Currents due to the primary and expansion buses are similar in characteristics
but differ slightly because of several factors, including the fact that the expan-
sion bus has 11 fewer address outputs than the primary bus (13 rather than
24). This difference is exhibited in an overall current contribution that is slightly
lower from the expansion bus than from the primary bus.

Accordingly, determination of expansion bus current follows the same basic
premises as determination of the primary bus current. Figure D–6 and
Figure D–7 show the same current relationships for the expansion bus as
Figure D–4 and Figure D–5 show for the primary bus. Also, since the total ex-
ternal buses’ current contributions must be positive, if the primary bus is not
being used and the expansion bus is being used minimally, then the minimum
current contribution due to the expansion bus is –30 mA. Finally, as with the
primary bus, current values obtained from these figures may require scaling
by a data dependency factor, as described in subsection D.3.3 on page D-14.

Figure D–6. Expansion Bus Current Versus Transfer Rate and Wait States
[Graph: Expansion Bus Analysis (80 pF, As/5s). Incremental IDD (mA, –100 to
100) versus wait states (0 to 7), with one curve for each number of cycles
between writes: q = 1, q = 2, q = 4, and q ≥ 18.]


Figure D–7. Expansion Bus Current Versus Transfer Rate at Zero Wait States
[Graph: Expansion Bus Duty Cycle Analysis (80 pF, As/5s). Incremental IDD (mA,
–150 to 200) versus H1 cycles between writes (0 to 20).]

D.3.3 Data Dependency


Data dependency of current for the primary and expansion buses is expressed
as a scale factor that is a percentage of the maximum current exhibited by ei-
ther of the two buses. Data dependencies for the primary and expansion buses
are shown in Figure D–8 and Figure D–9, respectively.
These two figures show normalized weighting factors that you can use to scale
current requirements on the basis of patterns in data being written on the exter-
nal buses. The range of possible weighting factors forms a trapezoidal pattern
bounded by extremes of data values. As can be seen from Figure D–8 and
Figure D–9, the minimum current is exhibited by writing all 0s, while the maxi-
mum current occurs when writing alternating 55555555h and AAAAAAAAh.
This condition results in a weighting factor of 1, which corresponds to using the
values from Figure D–4 and/or Figure D–5 directly.
As with internal bus operations, data dependencies for the external buses are
well defined, but accurate prediction of data patterns is often either impossible
or impractical. Therefore, unless you have precise knowledge of data patterns,
you should use an estimate of a median or average value for scale factor. If
you assume that data will be neither 5s and As nor all 0s and will be varying
randomly, a value of 0.85 is appropriate. Otherwise, if you prefer a conserva-
tive approach, you can use a value of 1.0 as an upper bound.


Regardless of the approach you take for scaling, once you determine the scale
factors for primary and expansion buses, apply these factors to scale the cur-
rent values found by using the graphs in the previous two subsections. For ex-
ample, if a nominal scale factor of 0.85 is used and the system uses zero wait
states with two cycles between accesses on both the primary and expansion
buses, the current contribution from the two buses is as follows:

Primary: 0.85 × 80 mA = 68 mA
Expansion: 0.85 × 40 mA = 34 mA
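
The scaling step above is a single multiplication per bus. The short Python sketch below reproduces it; the function name is hypothetical, and the default factor of 0.85 is the value this section suggests for randomly varying data.

```python
def scale_for_data(graph_current_ma, data_factor=0.85):
    """Scale a current read from the bus graphs by a data dependency factor.

    A factor of 1.0 is the conservative upper bound (alternating 5s and As).
    """
    return graph_current_ma * data_factor


# Zero wait states, two cycles between accesses on both buses.
print(f"Primary:   {scale_for_data(80.0):.0f} mA")   # Primary:   68 mA
print(f"Expansion: {scale_for_data(40.0):.0f} mA")   # Expansion: 34 mA
```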

Figure D–8. Primary Bus Current Versus Data Complexity Derating Curve
[Graph: Primary Bus Data Dependency Analysis (80 pF). Normalized IDD (0.6 to 1)
versus data complexity (0 to 1). The trapezoidal region is bounded by an
"Alternating Data" edge and a "Same Data" edge, with labeled points 0s–0s,
0s–Fs, Fs–Fs, and As–5s (normalized IDD = 1).]


Figure D–9. Expansion Bus Current Versus Data Complexity Derating Curve
[Graph: Expansion Bus Data Dependency (80 pF). Normalized IDD (0.55 to 1)
versus data complexity (0 to 1). The trapezoidal region is bounded by an
"Alternating Data" edge and a "Same Data" edge, with labeled points 0s–0s,
0s–Fs, Fs–Fs, and As–5s (normalized IDD = 1).]

D.3.4 Capacitive Load Dependence

Once you account for cycle timing and data dependencies, you should include
capacitive loading effects in a manner similar to that of data dependency.
Figure D–10 shows the scale factor to be applied to the current values
obtained above as a function of actual load capacitance if the load capacitance
presented to the buses is less than 80 pF.

In the previous example, if the load capacitance is 20 pF instead of 80 pF, a
scale factor of 0.84 is used, yielding:

Primary: 0.84 × 68 mA = 57.12 mA
Expansion: 0.84 × 34 mA = 28.56 mA

The slope of the load capacitance line in Figure D–10 is 0.26% normalized IDD
per pF. While this slope may be used to interpolate scale factors for loads
greater than 80 pF, the TMS320C30 is specified to drive output loads of less
than 80 pF, and interface timings cannot be guaranteed at higher loads. With
data dependency and capacitive load scale factors applied to the current val-
ues for primary and expansion buses, the total supply current required for the
device for a particular application can be calculated, as described in the next
section.
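
When Figure D–10 is not at hand, the scale factor can be estimated from the stated slope. The Python sketch below assumes that the line is straight and passes through a factor of 1.0 at 80 pF, which is inferred from the text rather than stated in it; the function and constant names are hypothetical.

```python
SLOPE_PER_PF = 0.0026       # 0.26% normalized IDD per pF, from the text
REFERENCE_LOAD_PF = 80.0    # load at which the bus graphs were taken


def cap_load_factor(load_pf):
    """Estimate the capacitive load scale factor for loads of 0 to 80 pF.

    Assumes a straight line through 1.0 at 80 pF. Loads above 80 pF are
    rejected because interface timings are not guaranteed there.
    """
    if not 0.0 <= load_pf <= REFERENCE_LOAD_PF:
        raise ValueError("load must be between 0 and 80 pF")
    return 1.0 - SLOPE_PER_PF * (REFERENCE_LOAD_PF - load_pf)


print(f"{cap_load_factor(20.0):.3f}")   # 0.844
print(f"{cap_load_factor(80.0):.3f}")   # 1.000
```

The estimate of 0.844 at 20 pF agrees with the factor of 0.84 used in the example above once it is rounded to two places.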


Figure D–10. Current Versus Output Load Capacitance
[Graph: IDD Versus Output Load Capacitance. Normalized IDD (0.75 to 1) versus
output load capacitance (0 to 80 pF).]


D.4 Calculation of Total Supply Current


The previous sections have discussed currents contributed by several
sources on the TMS320C30. Because determinations of actual current values
are unique and independent for each source, each current source was dis-
cussed separately. In an actual application, however, the sum of the indepen-
dent contributions from each current determines the total current requirement
for the device. This total current value is exhibited as the total current supplied
to the device through all of the VDD inputs and returned through the VSS con-
nections.

Note that numerous VDD and VSS pins on the device are routed to a variety of
internal connections, not all of which are common. Externally, however, all of
these pins should be connected in parallel to 5 V and ground planes, respec-
tively, with as low impedance as possible.

As mentioned previously, because different program segments inherently perform
different operations that are often quite distinct from each other, it is
typically appropriate to consider current for each of the different segments
independently. Once this is done, peak current requirements are readily obtained.
Further, you can use average current calculations to determine heating effects
of power dissipation. In turn, you can use these effects to determine thermal
management considerations.

D.4.1 Combining Supply Current Due to All Components


To determine the total supply current requirements for any given program ac-
tivity, calculate each of the appropriate components and combine them in the
following sequence:

1) Start with 110-mA quiescent current requirement.

2) Add 55 mA for internal operations unless the device is dormant, as during
execution of IDLE, NOPs, or branches-to-self, or performance of internal
and/or external bus operations using an RPTS instruction (see subsection
D.2.2 on page D-5). Internal or external bus operations executed via
RPTS do not contribute an internal operations power supply current
component and hence do not add 55 mA to quiescent current. Therefore,
current components in the next two steps might still be required, even
though the 55 mA is omitted.


3) If significant internal bus operations are being performed (see subsection
D.2.2 on page D-5), add the calculated current value.

4) If external writes are being performed at high speed (see section D.3 on
page D-9), add 60 mA and then add the values calculated for primary and
expansion bus current components. If only one external bus is being used,
the appropriate incremental current for the unused bus should still be in-
cluded because the current offsets include components required for oper-
ating both buses. Note, however, that, as discussed previously, the total
current contribution for external buses, including baseline, must always be
positive.

The current value resulting from summing these components is the total de-
vice current requirement for a given program activity.
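
The four steps above can be sketched as a small Python function. The function and parameter names are hypothetical, and clamping the external bus term at zero is an assumed way of expressing the rule that the total external contribution must not be negative.

```python
def total_supply_current_ma(internal_ops_active, i_ibus_ma=0.0,
                            external_writes=False,
                            i_prim_ma=0.0, i_exp_ma=0.0):
    """Combine the supply current components in the order of steps 1 to 4."""
    total = 110.0                       # step 1: quiescent current
    if internal_ops_active:             # step 2: omitted when dormant or in RPTS
        total += 55.0
    total += i_ibus_ma                  # step 3: scaled internal bus current
    if external_writes:                 # step 4: baseline plus both bus offsets
        total += max(60.0 + i_prim_ma + i_exp_ma, 0.0)
    return total


# Dormant device, for example executing IDLE: quiescent current only.
print(total_supply_current_ma(False))                   # 110.0
# Internal processing with 44 mA of scaled internal bus current.
print(total_supply_current_ma(True, i_ibus_ma=44.0))    # 209.0
```

Both bus offsets are passed even when only one external bus is active, because the unused bus still contributes its (negative) incremental value.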

D.4.2 Supply Voltage, Operating Frequency, and Temperature Dependencies


Current dependencies specific to each supply current component (such as in-
ternal or external bus operations) are discussed in subsection D.1.2 on page
D-2. Supply voltage level, operating temperature, and operating frequency
affect requirements for the total supply current and must be maintained within
required device specifications.

Once the total current for a particular program segment has been determined,
the dependencies that affect total current requirements are applied as a scale
factor in the same manner as data dependencies discussed in other sections.
Figure D–11 shows the relative scale factors to be applied to the supply current
values as a function of both VDD and operating frequency.

Power supply current consumption does not vary significantly with operating
temperature. However, if desired, a scale factor of 2% normalized IDD per 50°C
change in operating temperature may be used to derate current within the spe-
cified range noted in the TMS320C30 data sheet. This temperature depen-
dence is shown graphically in Figure D–12. Note that a temperature scale fac-
tor of 1.0 corresponds to current values at 25°C, which is the temperature at
which all other references in the document are made.


Figure D–11. Current Versus Frequency and Supply Voltage
[Graph: IDD Versus f(CLKIN) and Supply Voltage. Normalized IDD (0.2 to 1.2)
versus f(CLKIN) (0 to 30 MHz), with one curve for each supply voltage from
VDD = 4.5 V to VDD = 5.5 V in 0.25-V increments.]

Figure D–12. Current Versus Operating Temperature Change
[Graph: Operating Temperature Effects. Normalized IDD (0.97 to 1.03) versus
change in operating temperature (–80 to 80 °C).]


D.4.3 Design Equation


The procedure for determining the power supply current requirement can be
summarized in the following equation:

$I = (I_q + I_{iops} + I_{ibus} + I_{xbus}) \times FV \times T$

where

$I_q = 110\ \text{mA}$
$I_{iops} = 55\ \text{mA}$
$I_{ibus} = D_1 \times f_1$ (see Table D–1)
$I_{xbus} = I_{base} + I_{prim} + I_{exp}$

with

$I_{base} = 60\ \text{mA}$
$I_{prim} = D_2 \times C_2 \times f_2$ (see Table D–1)
$I_{exp} = D_3 \times C_3 \times f_3$ (see Table D–1)

FV is the scale factor for frequency and supply voltage, and

T is the scale factor for operating temperature.

Table D–1 describes the symbols used in the power supply current equation.
The table displays figure numbers from which the value can be obtained.


Table D–1. Current Equation Symbols

Symbol   Description                              Graph/Value
Iq       Quiescent Current                        110 mA
Iiops    Internal Operations Current              55 mA
Iibus    Internal Bus Operations Current          †
D1       Internal Bus Data Scale Factor           Figure D–3
f1       Internal Bus Current Requirement         Figure D–2
Ixbus    External Bus Operations Current          †
Ibase    External Bus Base Current                60 mA
Iprim    Primary Bus Operations Current           †
D2       Primary Bus Data Scale Factor            Figure D–8
C2       Primary Bus Cap Load Scale Factor        Figure D–10
f2       Primary Bus Current Requirement          Figure D–4 or Figure D–5
Iexp     Expansion Bus Operations Current         †
D3       Expansion Bus Data Scale Factor          Figure D–9
C3       Expansion Bus Cap Load Scale Factor      Figure D–10
f3       Expansion Bus Current Requirement        Figure D–6 or Figure D–7
FV       Freq/Supply Voltage Scale Factor         Figure D–11
T        Temperature Scale Factor                 Figure D–12

† See equation in subsection D.4.3 on page D-21.
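
As a rough illustration of how the symbols in Table D–1 fit together, the Python sketch below evaluates the design equation. The class and function names are hypothetical, and the example inputs are the graph readings and scale factors quoted for the data output portion of the FFT in Section D.5.2, with FV and T left at 1.0.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

I_Q_MA = 110.0      # Iq, quiescent current
I_BASE_MA = 60.0    # Ibase, external bus baseline current


@dataclass
class BusTerm:
    f_ma: float     # current read from the relevant graph (f1, f2, or f3)
    d: float = 1.0  # data scale factor (D1, D2, or D3)
    c: float = 1.0  # cap load scale factor (C2 or C3); 1.0 for the internal bus

    def current_ma(self) -> float:
        return self.d * self.c * self.f_ma


def supply_current_ma(i_iops_ma: float, ibus: BusTerm,
                      xbus: Optional[Tuple[BusTerm, BusTerm]] = None,
                      fv: float = 1.0, t: float = 1.0) -> float:
    """Evaluate I = (Iq + Iiops + Iibus + Ixbus) x FV x T.

    xbus is a (primary, expansion) pair, or None when external writes are
    too slow to matter and the external bus term is left out.
    """
    i_xbus = 0.0
    if xbus is not None:
        prim, exp = xbus
        i_xbus = I_BASE_MA + prim.current_ma() + exp.current_ma()
    return (I_Q_MA + i_iops_ma + ibus.current_ma() + i_xbus) * fv * t


# RPTS data dump: no internal operations term, primary bus writing every cycle,
# expansion bus idle.
dump = supply_current_ma(0.0, BusTerm(55.0, d=0.8),
                         xbus=(BusTerm(170.0, d=0.85), BusTerm(-80.0)))
print(f"{dump:.1f} mA")   # 278.5 mA
```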

D.4.4 Peak Versus Average Current


If current is observed over the course of an entire program, some segments
will usually exhibit significantly different levels of current required for different
durations of time. For example, a program may spend 80% of its time perform-
ing internal operations, drawing a current of 250 mA, and spend the remaining
20% of its time performing writes at full speed to the expansion bus, drawing
300 mA.
While knowledge of peak current levels is important in order to establish power
supply requirements, some applications require information about average
current. This is particularly significant if periods of high peak current are short
in duration. Average current can be obtained by performing a weighted sum
of the currents due to the various independent program segments over time.
In the example above, the average current can be calculated as follows:

$I = 0.8 \times 250\ \text{mA} + 0.2 \times 300\ \text{mA} = 260\ \text{mA}$

Using this approach, average current for any number of program segments
can be calculated.
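
For more than two segments, the weighted sum is convenient to express as a short Python helper. The function name and the check that the time fractions add up to 1 are assumptions made for this sketch.

```python
def average_current_ma(segments):
    """Time-weighted average of (time_fraction, current_ma) pairs.

    segments must be a list or tuple, and the fractions must sum to 1.
    """
    total_fraction = sum(fraction for fraction, _ in segments)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(fraction * current for fraction, current in segments)


# 80% internal operations at 250 mA, 20% expansion bus writes at 300 mA.
avg = average_current_ma([(0.8, 250.0), (0.2, 300.0)])
print(f"{avg:.1f} mA")   # 260.0 mA
```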


D.4.5 Thermal Management Considerations

Heating characteristics of the TMS320C30 depend on power dissipation,
which in turn depends on power supply current. When you make thermal
management calculations, you must consider the manner in which power supply
current contributes to power dissipation and to the time constant of the
TMS320C30 package thermal characteristics.

Depending on sources and destinations of current on the device, some current
contributions to IDD do not constitute a component of power dissipation at 5
volts. Accordingly, if you use the total current flowing into VDD to calculate
power dissipation at 5 volts, you will obtain erroneously large values for
power dissipation. Power dissipation is defined as:

$P = I \times V$

(where P is power, I is current, and V is voltage). If device outputs are driving
any DC load to a logic high level, only a minor contribution is made to power
dissipation because CMOS outputs typically drive to a level within a few tenths
of a volt of the power supply rails. If this is the case, subtract these current
components out of the total supply current value; then calculate their
contribution to power dissipation separately and add it to the total power
dissipation (see Figure D–13). If this is not done, these currents resulting
from driving a logic high level into a DC load will cause unrealistically high
power dissipation values. The error occurs because the currents resulting from
driving a logic high level into a DC load will appear as a portion of the
current used to calculate power dissipation due to VDD at 5 volts.

Figure D–13. Load Currents
[Diagram: current paths for a TMS320C30 device output driven high and for a
device output driven low, showing VDD, IDD, IOUT, and ISS in each case.]


Furthermore, external loads draw supply current only when outputs are being
driven high, because, when outputs are in the logic 0 state, the device is
sinking current that is supplied from an external source. Therefore, the power
dissipation due to this current component will not have a contribution through
IDD but will contribute to power dissipation with a magnitude of:

$P = V_{OL} \times I_{OL}$
where VOL is the low-level output voltage and IOL is the current being sunk by
the output as shown in Figure D–13. The power dissipation component due
to outputs being driven low should be calculated and added to the total power
dissipation.

When outputs with DC loads are being switched, the power dissipation compo-
nents from outputs being driven high and outputs being driven low are aver-
aged and added to the total device power dissipation. You should calculate
power components due to DC loading of the outputs separately for each pro-
gram segment before you calculate average power.
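
The adjustments for DC loads described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the report's own procedure: the function name is hypothetical, the example load values are invented, and the high-level term (VDD – VOH) × IOH is an inference from the statement that outputs drive to within a few tenths of a volt of the supply rail. The text itself gives only the low-level term, VOL × IOL.

```python
def power_dissipation_w(i_dd_a, v_dd, i_oh_a=0.0, v_oh=0.0,
                        i_ol_a=0.0, v_ol=0.0):
    """Estimate device power dissipation for one program segment, in watts.

    i_dd_a : total supply current, including current delivered to DC loads
    i_oh_a : portion of i_dd_a sourced into DC loads by outputs driven high
    i_ol_a : current sunk from external sources by outputs driven low
    """
    p_internal = (i_dd_a - i_oh_a) * v_dd   # load current removed first
    p_high = (v_dd - v_oh) * i_oh_a         # assumed drop across high-side drivers
    p_low = v_ol * i_ol_a                   # sunk current, not part of IDD
    return p_internal + p_high + p_low


# Hypothetical segment: 250 mA total, 10 mA sourced at 4.7 V, 10 mA sunk at 0.3 V.
p = power_dissipation_w(0.25, 5.0, i_oh_a=0.01, v_oh=4.7, i_ol_a=0.01, v_ol=0.3)
print(f"{p:.3f} W")   # 1.206 W
```

Multiplying the full 250 mA by 5 V would give 1.25 W, so the correction in this invented case is small but not negligible.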

Note that any unused inputs that are left disconnected may float to a voltage
level that will cause input buffer circuits to remain in the linear region and there-
fore contribute a significant component to power supply current. Accordingly,
any unused inputs should be made inactive by being either grounded or pulled
high if absolute minimum power dissipation is desired. If several unused inputs
must be pulled high, they may be pulled high together through one resistor to
minimize component count and board space.

When you use power dissipation values to determine thermal management
considerations, you should use the average power unless the time duration of
individual program segments is long. The thermal characteristics of the
TMS320C30 in the 181-pin grid array (PGA) package are exponential in nature,
with a time constant τ = 4.5 minutes. Therefore, when subjected to a
change in power, the temperature of the device package will, after 4.5 minutes,
reach approximately 63% of the total temperature change. Accordingly, if the
time duration of program segments exhibiting high power dissipation values
is short (on the order of a few seconds), you can use average power, calculated
in the same manner as average current (as described in subsection D.4.4 on
page D-22).

Otherwise, you should calculate maximum device temperature on the basis
of the actual time duration of the program segments involved. For example,
if a particular program segment lasts for seven minutes, then, using the expo-
nential function, you can calculate that a device will reach approximately 80%
of the temperature due to the total power dissipation during the program seg-
ment.
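
The percentages quoted above follow from a first-order exponential response, 1 – exp(–t/τ). The Python sketch below evaluates it; the function name is hypothetical, and treating the package as a single time constant is the same simplification the text makes.

```python
import math

TAU_MINUTES = 4.5   # package thermal time constant given in the text


def fraction_of_temperature_change(minutes, tau=TAU_MINUTES):
    """Fraction of the final temperature change reached after a step in power."""
    return 1.0 - math.exp(-minutes / tau)


print(f"{fraction_of_temperature_change(4.5):.2f}")   # 0.63
print(f"{fraction_of_temperature_change(7.0):.2f}")   # 0.79
```

The second value is the "approximately 80%" quoted for a seven-minute program segment.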


Note that the average power should be determined by calculating the power
for each program segment (including considerations described above) and
performing a time average of these values, rather than simply multiplying the
average current as determined in the previous subsection by VDD.

Specific device temperature calculations are made by using the TMS320C30
thermal impedance characteristics included in Chapter 13.


D.5 Example Supply Current Calculations


A Fast Fourier Transform (FFT) represents a typical DSP algorithm. The FFT
code in Section D.8 on page D-30 processes data in the RAM blocks and
writes the result out to zero-wait-state external SRAM on the primary bus. The
program executes out of zero-wait-state external SRAM on the primary bus,
and the TMS320C30’s cache is enabled. The entire algorithm consists mainly
of internal bus operations and so includes quiescent and internal operations
in general. At the end of processing, the 1024 results are written out on the pri-
mary bus. Therefore, the algorithm exhibits a higher current requirement dur-
ing the write portion, where the external bus is being used significantly.

D.5.1 Processing
The processing portion of the algorithm is 95% of the total algorithm. During
this portion, the power supply current is required only for the internal circuitry.
Data is processed in several loops that compose a majority of the algorithm.
During these loops, two operands are transferred on every cycle. The current
required for internal bus usage, then, is 55 mA, taken from Figure D–2 on page
D-7. The data is assumed to be random. A data value scale factor of 0.8 is
used from Figure D–3 on page D-7. This value scales 55 mA, yielding 44 mA
for internal bus operations. Adding 44 mA to the quiescent current requirement
and internal operations current requirement yields a current requirement of
209 mA for the major portion of the algorithm.

$I = I_q + I_{iops} + I_{ibus}$
$I = 110\ \text{mA} + 55\ \text{mA} + (55\ \text{mA})(0.8) = 209\ \text{mA}$

D.5.2 Data Output


The portion of the algorithm corresponding to writing out data is approximately
5% of the total algorithm. Again, the data that is being written is assumed to
be random. From Figure D–3 on page D-7 and Figure D–8 on page D-15,
scale factors of 0.80 and 0.85 are used for derating due to data value depen-
dency for internal and primary buses, respectively. During the data dump por-
tion of the code, a load and store are performed every cycle; however, the par-
allel load/store instruction is in an RPTS loop, so there is no contribution due
to internal operations, because the instruction is fetched only once. The only
internal contributions are due to quiescent and internal bus operations.
Figure D–4 on page D-11 indicates a 170-mA current contribution due to back-
to-back zero-wait-state writes, and Figure D–6 on page D-13 indicates a
–80-mA contribution due to the expansion bus being idle (that is, with more
than 18 H1 cycles between writes). Therefore, the total contribution due to this
portion of the code is:


$I = I_q + I_{ibus} + I_{xbus}$

or,

$I = 110\ \text{mA} + (55\ \text{mA})(0.8) + 60\ \text{mA} - 80\ \text{mA} + (170\ \text{mA})(0.85) = 278.5\ \text{mA}$

D.5.3 Average Current
The average current is derived from the two portions of the algorithm. The pro-
cessing portion took 95% of the time and required about 210 mA, and the data
dump portion took the other 5% and required about 280 mA. The average is
calculated as:

$I_{avg} = (0.95)(210\ \text{mA}) + (0.05)(280\ \text{mA}) = 213.5\ \text{mA}$


From the thermal characteristics specified in Chapter 13, it can be shown that
this current level corresponds to a case temperature of 43°C. This temperature
meets the maximum device specification of 85°C and hence requires no
forced air cooling.

D.5.4 Experimental Results


A photograph of the power supply current for the FFT is shown in Section D.7 on
page D-29. During the FFT processing, the measured current varied between
180 and 220 mA. The peak of the current during external writes was 270 mA,
and the average current requirement, as measured on a digital multimeter,
was 200 mA. The calculated values (209 mA during processing, 278.5 mA during
external writes, and 213.5 mA average) are within about 7% of the measured
power supply current.


D.6 Summary
An accurate power supply current requirement for the TMS320C30 cannot be
expressed simply in terms of operating frequency, supply voltage, and output
load capacitance. The specification must be more complete and depends on
device functionality and system parameters. The current components related
to device functionality are due to quiescent current, internal operations, inter-
nal bus operations, and external bus operations. Those related to system pa-
rameters are due to operating frequency, supply voltage, output load capaci-
tance, and operating temperature. The typical power supply current require-
ment is 200 mA, and the minimum, or quiescent, is 110 mA.

This application report presents information required to determine power
supply specifications. Specifications are based on an algorithm’s use of internal
and external buses on the TMS320C30. As devices become more complex,
the calculation of power dissipation becomes more critical.

The maximum current requirement is 600 mA and occurs only
under worst-case conditions: writing alternating data
(AAAAAAAAh to 55555555h) out of both external buses
simultaneously every cycle, with 80 pF loads and running at 33
MHz.


D.7 Photo of IDD for FFT

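[Photograph: oscilloscope trace of IDD during the FFT. Vertical axis marked at
100, 200, 300, and 400 mA; horizontal scale 500 µs/div.
Input clock frequency = 33 MHz; voltage level = 5.0 V (VDD).]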


D.8 FFT Assembly Code


        .GLOBL  FFT
        .GLOBL  N
        .GLOBL  M
        .GLOBL  SINE

SINTAB:                         ; setup
        .WORD   SINE
RAM0:
        .WORD   809800h
OUTBUF:
        .WORD   800h

        .TEXT

FFT:    LDP     SINTAB          ; processing portion:
                                ; quiescent, internal and
                                ; bus operations

        LDI     N,IR0
        LSH     -1,IR0

; LENGTH-TWO BUTTERFLIES

        LDI     @RAM0,AR0
        LDI     IR0,RC
        SUBI    1,RC

        RPTB    BLK1
        ADDF    *+AR0,*AR0++,R0
        SUBF    *AR0,*-AR0,R1
BLK1    STF     R0,*-AR0
||      STF     R1,*AR0++

; FIRST PASS OF THE DO-20 LOOP (STAGE K=2 IN DO-10 LOOP)

        LDI     @RAM0,AR0
        LDI     2,IR0
        LDI     N,RC
        LSH     -2,RC
        SUBI    1,RC

        RPTB    BLK2
        ADDF    *+AR0(IR0),*AR0++(IR0),R0
        SUBF    *AR0,*-AR0(IR0),R1
        NEGF    *+AR0,R0
||      STF     R0,*-AR0(IR0)
BLK2    STF     R1,*AR0++(IR0)
||      STF     R0,*+AR0

; MAIN LOOP (FFT STAGES)


        LDI     N,IR0
        LSH     -2,IR0
        LDI     3,R5
        LDI     1,R4
        LDI     2,R3
LOOP    LSH     -1,IR0
        LSH     1,R4
        LSH     1,R3

; INNER LOOP (DO-20 LOOP IN THE PROGRAM)

        LDI     @RAM0,AR5
INLOP:
        LDI     IR0,AR0
        ADDI    @SINTAB,AR0
        LDI     R4,IR1
        LDI     AR5,AR1
        ADDI    1,AR1
        LDI     AR1,AR3
        ADDI    R3,AR3
        LDI     AR3,AR2
        SUBI    2,AR2
        ADDI    R3,AR2,AR4
        LDF     *AR5++(IR1),R0
        ADDF    *+AR5(IR1),R0,R1
        SUBF    R0,*++AR5(IR1),R0
||      STF     R1,*-AR5(IR1)
        NEGF    R0
        NEGF    *++AR5(IR1),R1
||      STF     R0,*AR5
        STF     R1,*AR5

; INNERMOST LOOP

        LDI     N,IR1
        LSH     -2,IR1
        LDI     R4,RC
        SUBI    2,RC

        RPTB    BLK3
        MPYF    *AR3,*+AR0(IR1),R0
        MPYF    *AR4,*AR0,R1
        MPYF    *AR4,*+AR0(IR1),R1
||      ADDF    R0,R1,R2
        MPYF    *AR3,*AR0++(IR0),R0
        SUBF    R0,R1,R0
        SUBF    *AR2,R0,R1
        ADDF    *AR2,R0,R1
||      STF     R1,*AR3++
        ADDF    *AR1,R2,R1
||      STF     R1,*AR4--
        SUBF    R2,*AR1,R1


||      STF     R1,*AR1++
BLK3    STF     R1,*AR2--

        SUBI    @RAM0,AR5
        ADDI    R4,AR5
        CMPI    N,AR5
        BLTD    INLOP
        ADDI    @RAM0,AR5
        NOP
        NOP

        ADDI    1,R5
        CMPI    M,R5
        BLE     LOOP

DUMP    LDI     @RAM0,AR0       ; data dump portion
        LDI     @OUTBUF,AR1     ; quiescent, internal bus
        LDF     *AR0++,R0       ; ops and primary bus ops

        RPTS    N-2
        LDF     *AR0++,R0
||      STF     R0,*AR1++
        STF     R0,*AR1++

        LDI     RAM0,AR1
        LDI     @RAM0,AR0       ; swap RAM banks
        XOR     400h,AR0
        STI     AR0,*AR1

        B       FFT
        .END

Appendix E

SMJ320C3x Digital Signal Processor Data Sheet

This appendix contains the standalone data sheet for the military version of the
’C3x digital signal processor, the SMJ320C3x Digital Signal Processor.

Appendix F

Analog Interface Peripherals and Applications

Texas Instruments (TI) offers many products for total system solutions, includ-
ing memory options, data acquisition, and analog input/output devices. This
appendix describes a variety of devices that interface directly to the TMS320
DSPs in rapidly expanding applications.

Major topics discussed in this appendix are listed below.

Topic

F.1 Multimedia Applications
F.2 Telecommunications Applications
F.3 Dedicated Speech Synthesis Applications
F.4 Servo Control/Disk Drive Applications
F.5 Modem Applications
F.6 Advanced Digital Electronics Applications for Consumers

F-1
Multimedia Applications

F.1 Multimedia Applications


Multimedia integrates different media through a centralized computer. These
media can be visual or audio and can be input to or output from the central
computer via a number of technologies. The technologies can be digital-based
or analog-based (such as audio or video tape recorders). The integration and
interaction of media enhance the transfer of information and can accommo-
date both analysis of problems and synthesis of solutions.

Figure F–1 shows both the central role of the multimedia computer and the
multimedia system’s ability to integrate the various media to optimize informa-
tion flow and processing.

Figure F–1. System Block Diagram

[Block diagram: a central multimedia computer connected to CD ROM, operator
input, modem, video input, video monitor, image sensor, microphone,
facsimile/modem, music input (MIDI), slides and printing, and speakers.]

F.1.1 System Design Considerations


Multimedia systems can include various grades of audio and video quality. The
most popular video standard currently used (VGA) covers 640 × 480 pixels
with 1-, 2-, 4-, and 8-bit memory-mapped color. Also, 24-bit true color is
supported, and 1024 × 768 (beyond VGA) resolution has emerged. There are two
grades of audio. The lower grade accommodates 11.025-kHz sampling for 8-bit
monaural systems, while the higher grade accommodates 44.1-kHz sampling
for 16-bit stereo.
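
The storage and bandwidth implications of these grades follow from simple arithmetic. The sketch below is illustrative only; it assumes uncompressed PCM audio and a packed frame buffer, and the function names are hypothetical.

```python
def audio_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


def frame_buffer_bytes(width, height, bits_per_pixel):
    """Memory needed for one packed video frame, in bytes."""
    return width * height * bits_per_pixel // 8


if __name__ == "__main__":
    # Higher audio grade: 44.1-kHz, 16-bit stereo.
    print(audio_bytes_per_second(44_100, 16, 2), "bytes/s")
    # VGA at 8-bit color, VGA at 24-bit true color, and 1024 x 768 at 8 bits.
    print(frame_buffer_bytes(640, 480, 8), "bytes")
    print(frame_buffer_bytes(640, 480, 24), "bytes")
    print(frame_buffer_bytes(1024, 768, 8), "bytes")
```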

Audio specifications include a musical instrument digital interface (MIDI) with
compression capability, which is based on keystroke encoding, and an
input/output port with a three-disc voice synthesizer. In the media control
area, video disc, CD audio, and CD ROM player interfaces are included.
Figure F–2 shows a multimedia subsystem.

The TLC32047 wide-band analog interface circuit (AIC) is well suited for multi-
media applications because it features wide-band audio and up to 25-kHz
sampling rates. The TLC32047 is a complete analog-to-digital and digital-to-
analog interface system for the TMS320 DSPs. The nominal bandwidths of the
filters accommodate 11.4 kHz, and this bandwidth is programmable. The
application circuit shown in Figure F–2 handles both speech encoding and
modem communication functions, which are associated with multimedia appli-
cations.
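
The relationship between the 25-kHz maximum sampling rate and the 11.4-kHz filter bandwidth can be checked against the Nyquist limit. This is a minimal sketch with hypothetical function names; it treats the quoted bandwidth as the highest signal frequency of interest and does not model the filter's transition band.

```python
def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency representable without aliasing."""
    return sample_rate_hz / 2


def fits_below_nyquist(bandwidth_hz, sample_rate_hz):
    """True if the signal bandwidth lies below half the sampling rate."""
    return bandwidth_hz < nyquist_limit_hz(sample_rate_hz)


if __name__ == "__main__":
    print(nyquist_limit_hz(25_000))            # limit for 25-kHz sampling
    print(fits_below_nyquist(11_400, 25_000))  # 11.4-kHz band at 25 kHz
    print(fits_below_nyquist(11_400, 19_200))  # same band at 19.2 kHz
```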

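Figure F–2. Multimedia Speech Encoding and Modem Communication

[Block diagram: a handset connects through a TLC32047 AIC to a TMS320 DSP
(vocoder, speech analysis), followed by a TMS320 DSP (encrypt/decrypt) and a
TMS320 DSP with a TLC32047 AIC (9600-bps modem, V.32 bis) that reaches the
phone line through a DAA and hybrid (HYB). A controller and memory attach to
the TMS320 DSP/TLC32047 interface.]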

Figure F–3 shows the interfacing of the TMS320C25 DSP to the TLC32047
AIC, which constitutes a building block of the 9600-bps V.32 bis modem shown
in Figure F–2.

Figure F–3. TMS320C25 to TLC32047 Interface

TMS320C25      TLC32047
CLKOUT         MSTR CLK
FSX            FSX
DX             DX
FSR            FSR
DR             DR
CLKR, CLKX     SHIFT CLK

[Schematic also shows the TLC32047 supply pins (VCC+ = 5 V, VCC– = –5 V,
VDD = +5 V), the REF pin, ANLG GND and DGTL GND, a BAT 42 diode, and
0.2-µF ceramic and 0.1-µF decoupling capacitors.]

F.1.2 Multimedia-Related Devices


As shown in Table F–1 and Table F–2, TI provides a complete array of analog
and graphics interface devices. These devices support the TMS320 DSPs for
complete multimedia solutions.

Table F–1. Data Converter ICs

Device       Description                           I/O       Resolution  Conversion               Application
                                                             (Bits)      CLK Rate
TLC320AC01   Analog interface (5 V only)           Serial    14          43.2 kHz                 Portable modem and speech, multimedia
TLC32047     Analog interface (11.4 kHz BW) (AIC)  Serial    14          25 kHz                   Speech, modem, and multimedia
TLC32046     Analog interface (AIC)                Serial    14          25 kHz                   Speech and modems
TLC32044     Analog interface (AIC)                Serial    14          19.2 kHz                 Speech and modems
TLC32040     Analog interface (AIC)                Serial    14          19.2 kHz                 Speech and modems
TLC34075/6   Video palette                         Parallel  Triple 8    135 MHz                  Graphics
TLC34058     Video palette                         Parallel  Triple 8    135 MHz                  Graphics
TLC5502/3    Flash ADC                             Parallel  8           20 MHz                   Video
TLC5602      Video DAC                             Parallel  8           20 MHz                   Video
TLC5501      Flash ADC                             Parallel  6           20 MHz                   Video
TLC5601      Video DAC                             Parallel  6           20 MHz                   Video
TLC1550/1    ADC                                   Parallel  10          150 kHz                  Servo ctrl / speech
TLC32071     Analog interface (AIC)                Parallel  8           1 MHz                    Servo ctrl / disk drive
TMS57013/4   Dual audio DAC + digital filter       Serial    16/18       32, 37.8, 44.1, 48 kHz   Digital audio

Table F–2. Switched-Capacitor Filter ICs

Device     Function                            Order  Roll-Off              Power Out  Power Down
TLC2470    Differential audio filter amplifier 4      5 kHz                 500 mW     Yes
TLC2471    Differential audio filter amplifier 4      3.5 kHz               500 mW     Yes
TLC10/20   General-purpose dual filter         2      CLK ÷ 50, CLK ÷ 100   N/A        No
TLC04/14   Low pass, Butterworth filter        4      CLK ÷ 50, CLK ÷ 100   N/A        No

For application assistance or additional information, please call TI Linear
Applications at (214) 997–3772.

F.2 Telecommunications Applications


The TI linear product line focuses on three primary telecommunications
application areas:

- Subscriber instruments (telephones, modems, etc.)
  Includes the TCM508x DTMF tone encoder family, the TCM150x tone
  ringer family, the TCM1520 ring detector, and the TCM3105 FSK modem.

- Central office line card products
  Includes the TCM29Cxx combo (combined PCM filter plus codec) family,
  the TCM420x subscriber line control circuit family, and the TCM1030/60
  line card transient protector.

- Personal communications products
  Includes the TCM320AC3x family of 5-volt voice-band audio processors
  (VBAP).

TI continues to develop new telecom integrated circuits, such as a
high-performance three-volt combo family for personal communications
applications and an RF power amplifier family for hand-held and mobile
cellular phones.

System Design Considerations. The size, network complexity, and
compatibility requirements of telecommunications central office systems create
demanding performance requirements. Combo voice-band filter performance
is typically ± 0.15 dB in the passband. Idle channel noise must be on the order
of 15 dBrnC0. Gain tracking (S/Q) and distortion must also meet stringent
requirements. The key parameters for a SLIC device are gain, longitudinal
balance, and return loss.
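
To put the passband and idle-channel-noise figures in more familiar terms, the sketch below converts them using standard definitions: an amplitude ratio of 10^(dB/20), and 0 dBrn equal to –90 dBm (1 pW). The function names are hypothetical, and the C-message weighting itself is not modeled.

```python
def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to a voltage (amplitude) ratio."""
    return 10 ** (db / 20)


def dbrn_to_dbm(dbrn):
    """Convert dB above reference noise (1 pW) to dBm."""
    return dbrn - 90.0


if __name__ == "__main__":
    # A +/-0.15-dB passband ripple is roughly a +/-1.7 percent amplitude change.
    print(f"{db_to_amplitude_ratio(0.15):.4f}")
    # 15 dBrnC0 expressed on the dBm0 scale (with C-message weighting implied).
    print(dbrn_to_dbm(15.0), "dBm0")
```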


Figure F–4. Typical DSP/Combo Interface

[Schematic: the TMS320C25 serial-port pins (DR, DX, CLKX, CLKR, FSX, FSR)
connect to the TCM320AC36 combo (DOUT, DIN, CLK, FSX, FSR). Two 74S161
four-bit counters (U1, U2), a 74S04 hex inverter (U3), and a 2.048-MHz
crystal (Y1) with 390-Ω resistors generate the clock and frame sync. Two
20-kΩ resistors at ANLGIN and MIC_GS set the codec input gain, EAR_A and
EAR_B form the codec output, DCLKR and VDD are tied to 5 V, and a reset
input is provided.]

The TCM320AC36 combo interfaces directly to the TMS320C25 serial port
with a minimum of external components, as shown in Figure F–4. Half of hex
inverter U3 and crystal Y1 form an oscillator that provides clock timing to the
TCM320AC36. The synchronous four-bit counters U1 and U2 generate an
8-kHz frame sync signal. DCLKR on the TCM320AC36 is connected to VDD,
placing the combo in fixed data-rate mode. Two 20-kΩ resistors connected to
ANLGIN and MIC_GS set the gain of the analog input amplifier to 1. The timing
is shown in Figure F–5.
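
The 8-kHz frame sync rate is consistent with dividing the 2.048-MHz clock by 256. The sketch below assumes the two four-bit counters are cascaded and run through their full count (the schematic's LOAD connections could select a different modulus), and that the input amplifier gain is set by the ratio of the two resistors; all names are hypothetical.

```python
def counter_chain_divisor(bits_per_counter, counters):
    """Division ratio of cascaded binary counters using their full count."""
    return (2 ** bits_per_counter) ** counters


def frame_sync_hz(clock_hz, divisor):
    """Frame sync rate obtained by dividing the bit clock."""
    return clock_hz / divisor


def resistor_ratio_gain(r_feedback_ohms, r_input_ohms):
    """Gain magnitude of an amplifier set by a feedback/input resistor ratio."""
    return r_feedback_ohms / r_input_ohms


if __name__ == "__main__":
    divisor = counter_chain_divisor(4, 2)
    print(divisor, frame_sync_hz(2_048_000, divisor))
    print(resistor_ratio_gain(20_000, 20_000))
```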


Figure F–5. DSP/Combo Interface Timing

[Timing diagram: CLKR/CLKX and FSX/FSR waveforms, with receive timing on
DR/DOUT and transmit timing on DX/PCMIN. In both directions, bits 1 (MSB)
through 8 (LSB) are transferred in sequence.]

Telecommunications-Related Devices. Data sheets for the devices in
Table F–3 are contained in the 1991 Telecommunications Circuits Databook
(literature number SCTD001B). To request your copy, contact your nearest TI
field sales office or call the Literature Response Center at (800) 477–8924.

Table F–3. Telecom Devices

Device Number   Coding Law     Clock Rates (MHz†)         # of Bits   Comments

Codec/Filter
TCM29C13        A and µ        1.544, 1.536, 2.048        8           C.O. and PBX line cards
TCM29C14        A and µ        1.544, 1.536, 2.048        8           Includes 8th-bit signal
TCM29C16        µ              2.048                      8           16-pin package
TCM29C17        A              2.048                      8           16-pin package
TCM29C18        µ              2.048                      8           Low-cost DSP interface
TCM29C19        µ              1.536                      8           Low-cost DSP interface
TCM29C23        A and µ        Up to 4.096                8           Extended frequency range
TCM29C26        A and µ        Up to 4.096                8           Low-power TCM29C23
TCM320AC36      µ and Linear   Up to 4.096                8 and 13    Single voltage (+5) VBAP
TCM320AC37      A and Linear   Up to 4.096                8 and 13    Single voltage (+5) VBAP
TCM320AC38      µ and Linear   Up to 4.096                8 and 13    Single voltage (+5) GSM
TCM320AC39      A and Linear   Up to 4.096                8 and 13    Single voltage (+5) GSM
TP3054/64       µ              1.544, 1.536, 2.048        8           National Semiconductor second source
TP3054/67       A              1.544, 1.536, 2.048        8           National Semiconductor second source
TLC320AC01      Linear         43.2 kHz                   14          5-volt-only analog interface
TLC32040/1      Linear         Up to 19.2-kHz sampling    14          For high-dynamic linearity
TLC32044/5      Linear         Up to 19.2-kHz sampling    14          For high-dynamic linearity
TLC32046        Linear         Up to 25-kHz sampling      14          For high-dynamic linearity
TLC32047        Linear         Up to 25-kHz sampling      14          For high-dynamic linearity

Transient Suppressor
TCM1030         Transient suppressor for SLIC-based line card (30 A max)
TCM1060         Transient suppressor for SLIC-based line card (60 A max)

† Unless otherwise noted

Table F–4 is a list of switched-capacitor filter ICs.

Table F–4. Switched-Capacitor Filter ICs

Device     Function                            Order  Roll-Off              Power Out  Power Down
TLC2470    Differential audio filter amplifier 4      5 kHz                 500 mW     Yes
TLC2471    Differential audio filter amplifier 4      3.5 kHz               500 mW     Yes
TLC10/20   General-purpose dual filter         2      CLK ÷ 50, CLK ÷ 100   N/A        No
TLC04/14   Low pass, Butterworth filter        4      CLK ÷ 50, CLK ÷ 100   N/A        No

For further information on these telecommunications products, please call
(214) 997–3772.

Figure F–6 and Figure F–7 show telecom applications.

Figure F–6. General Telecom Applications

[Network diagram: analog phones, an answering machine, a low-speed modem,
a neighborhood concentrator, a central office, a toll office, a PBX, a cell
base station, and a cellular phone, annotated with TI devices. Devices shown
include the TCM5087/5089/5092/5094 tone encoders, TCM153x ringer, TCM1520
detector, TSP50C1x speech synthesis, TCM3105 modem, TCM29C13, TCM29C23,
TCM29Cxx, TCM291x, and TP3054/TP305x combos, TCM320AC3X VBAP combo,
TCM9050/51 HVLI/HCombo, TCM1060/1030 transient suppressors, TGAP90x and
TGAP901, and TMS320xx DSPs.]

Figure F–7. Generic Telecom Applications

[Block diagram: two TMS320C25 DSPs, one acting as echo canceler and
transmitter and one as receiver, each paired with a TLC320AC01 (ADC and DAC)
and connected to the telephone line, with fine-tune echo cancel summing
nodes and a serial RS-232 interface with I/O control.]

F.3 Dedicated Speech Synthesis Applications


For dedicated speech synthesis applications, TI offers a family of dedicated
speech synthesizer chips. This speech technology has been used in a wide
range of products, including games, toys, burglar alarms, fire alarms, automo-
biles, airplanes, answering machines, voice mail, industrial control machines,
office machines, advertisements, novelty items, exercise machines, and
learning aids.

Dedicated speech synthesis chips are a good alternative for low-cost applica-
tions. The speech synthesis technology provided by the dedicated chips is ei-
ther linear-predictive coding (LPC) or continuously variable slope delta modu-
lation (CVSD). Table F–5 shows the characteristics of the TI voice synthesiz-
ers.

Table F–5. TI Voice Synthesizers

Device     Microprocessor  Synthesis  I/O Pins  On-Chip Memory  External      Data Rate
                           Method               (Bits)          Memory        (Bits/Sec)
TSP50C4x   8-bit           LPC–10     20/32     64K/128K        VROM          1200–2400
TSP50C1x   8-bit           LPC–12     10        64K/128K        VROM          1200–2400
TSP53C30   8-bit           LPC–10     20        N/A             From host µP  1200–2400
TSP50C20   8-bit           LPC–10     32        N/A             EPROM         1200–2400
TMS3477    N/A             CVSD       2         None            DRAM          16K–32K

In addition to the speech synthesizers, TI has low-cost memories that are ideal
for use with these chips. TI can also be of assistance in developing and pro-
cessing the speech data that is used in these speech synthesis systems.
Table F–6 shows speech memory devices of different capabilities. Additional-
ly, audio filters are outlined in Table F–7.


Table F–6. Speech Memories

TSP60Cxx Family of Speech ROMs

Family     Size   No. of Pins  Interface              For use with:
TSP60C18   256K   16           Parallel 4-bit         TSP50C1x
TSP60C19   256K   16           Serial                 TSP50C4x
TSP60C20   256K   28           Parallel/serial 8-bit  TSP50C4x
TSP60C80   1M     28           Serial                 TSP50C4x
TSP60C81   1M     28           Parallel 4-bit         TSP50C1x

Table F–7. Switched-Capacitor Filter ICs

Device     Function                            Order  Roll-Off              Power Out  Power Down
TLC2470    Differential audio filter amplifier 4      5 kHz                 500 mW     Yes
TLC2471    Differential audio filter amplifier 4      3.5 kHz               500 mW     Yes
TLC10/20   General-purpose dual filter         2      CLK ÷ 50, CLK ÷ 100   N/A        No
TLC04/14   Low pass, Butterworth filter        4      CLK ÷ 50, CLK ÷ 100   N/A        No

Table F–8 lists some of TI’s speech synthesis development tools.

Table F–8. Speech Synthesis Development Tools

Name       Definition

(a) Software
EVM        Code development tool

(b) Speech
SAB        Speech audition board
SD85000    PC-based speech analysis system

(c) System
SEB        System emulator board
SEB60Cxx   System emulator boards for speech memories

For further information, call Linear Applications at (214) 997–3772.


F.4 Servo Control/Disk Drive Applications


In the past, most servo control systems used only analog circuitry. However,
the growth of digital signal processing (DSP) has made digital control theory
a reality. Figure F–8 is a block diagram of a generic digital control system using
a DSP, along with an analog-to-digital converter (ADC) and a digital-to-analog
converter (DAC).

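Figure F–8. Generic Servo Control Loop

[Block diagram: the reference r(n) and the feedback y(n) enter a summing
junction that produces e(n). The TMS320-based digital controller outputs
u(n), which the DAC converts to u(t) to drive the plant. The plant output
y(t) is measured by a sensor and converted by the ADC to y(n).]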

In a DSP-based control system, the control algorithm is implemented via
software. No component aging or temperature drift is associated with digital
control systems. Additionally, sophisticated algorithms can be implemented and
easily modified to upgrade system performance.
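
The loop of Figure F–8 can be sketched in a few lines of software. The example below is illustrative only: it assumes a proportional-integral controller and a first-order discrete plant, neither of which is specified in this appendix, and the gains and names are hypothetical.

```python
def simulate_servo(setpoint, steps, kp=0.5, ki=0.2,
                   plant_pole=0.9, plant_gain=0.1):
    """Simulate a discrete PI controller driving a first-order plant.

    Returns the plant output y(n) at each step.
    """
    y = 0.0          # plant output, as seen through the sensor and ADC
    integral = 0.0   # accumulated error for the integral term
    history = []
    for _ in range(steps):
        e = setpoint - y                        # e(n) = r(n) - y(n)
        integral += e
        u = kp * e + ki * integral              # controller output u(n)
        y = plant_pole * y + plant_gain * u     # assumed plant model
        history.append(y)
    return history


if __name__ == "__main__":
    response = simulate_servo(setpoint=1.0, steps=200)
    print(f"final output: {response[-1]:.6f}")
```

Changing the control law means editing the line that computes u(n), which is the flexibility the text attributes to a software-based controller.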

System Design Considerations


TMS320 DSPs have facilitated the development of high-speed digital servo
control for disk drive and industrial control applications. In recent years, disk
drives have increased storage capacity from 5 megabytes to over 1 gigabyte.
This equates to a 23,900 percent growth in capacity. To accommodate these
increasingly higher densities, the data on the servo platters, whether servo-po-
sitioning or actual storage information, must be converted to digital electronic
signals at increasingly closer points in relation to the platter pick-off point. The
ADC must have increasingly higher conversion rates and greater resolution
to accommodate the increasing bandwidth requirements of higher storage
densities. In addition, the ADC conversion rates must increase to accommo-
date the shorter data retrieval access time.
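
The growth figure quoted above can be reproduced with a short calculation. The sketch assumes decimal megabytes and that "over 1 gigabyte" refers to a 1,200-megabyte drive, which is the capacity that yields exactly 23,900 percent; that reading is an assumption, not something stated in the text.

```python
def percent_growth(old, new):
    """Percentage increase from old to new."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    print(percent_growth(5, 1200))  # 5 MB to 1,200 MB (assumed capacity)
    print(percent_growth(5, 1000))  # 5 MB to exactly 1,000 MB, for comparison
```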


Figure F–9 is a block diagram of a disk drive control system.

Figure F–9. Disk Drive Control System Block Diagram

[Block diagram: a SCSI interface to the host with a RAM buffer, buffer
control, data sequencer, and data separator on the SCSI data bus. A
TMS320C14 disk drive motor controller with a TMS2764 EPROM, address decode,
servo demodulator, TLC32071, SN74LS393, and disk head select control
connects to the spindle motor and the disk heads.]

Table F–9 lists analog/digital interface devices used for servo control.


Table F–9. Control-Related Devices

Function  Device      Bits  Speed          Channels   Interface
ADC       TLC1550     10    3–5 µs         1          Parallel
          TLC1551     10    3–5 µs         1          Parallel
          TLC5502/3   8     50 ns (flash)  1          Parallel
          TLC0820     8     1.5 µs         1          Parallel
          TLC1225     13    12 µs          1 (Diff.)  Parallel
          TLC1558     10    3–5 µs         8          Parallel
          TLC1543     10    21 µs          11         Serial
          TLC1549     10    21 µs          1          Serial
DAC       TLC7524     8     9 MHz          1          Parallel
          TLC7628     8     9 MHz          (Dual)     Parallel
          TLC5602     8     30 MHz         1          Parallel
AIC       TLC32071    8     1 µs (ADC)     8          Parallel
                            9 MHz          1

Figure F–10 shows the interfacing of the TMS320C14 and the TLC32071.

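Figure F–10. TMS320C14–TLC32071 Interface

[Interface diagram: the TMS320C14 data bus D0–D7 and the WE, DEN, and RESET
signals connect to the TLC32071. Address lines A2–A0 pass through address
decode logic to generate CSCNTRL and CSAN.]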

For further information on these servo control products, please call TI Linear
Applications at (214) 997–3772.


F.5 Modem Applications


High-speed modems (9,600 bps and above) require a great deal of analog sig-
nal processing in addition to digital signal processing. Designing both high-
speed capabilities and slower fall-back modes poses significant engineering
challenges. TI offers a number of analog front-end (AFE) circuits to support
various high-speed modem standards.

The TLC32040, TLC32044, TLC32046, TLC32047, and TLC320AC01 AICs
are especially well suited to modem applications because they integrate an
input multiplexer, switched-capacitor filters, a high-resolution 14-bit ADC and
DAC, a four-mode serial port, and control and timing logic. These converters
feature adjustable parameters, such as filtering characteristics, sampling
rates, gain selection, (sin x)/x correction (TLC32044, TLC32046, and TLC32047
only), and phase adjustment. All of these parameters are software-programmable,
making the AIC suitable for a variety of applications. Table F–10 lists the
description and characteristics of these devices.
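
The (sin x)/x correction compensates for the amplitude droop introduced when a DAC holds each output sample for a full sample period. The sketch below evaluates that droop for an ideal zero-order hold; it is an assumption-based illustration with a hypothetical function name and does not model the AIC's actual correction filter.

```python
import math


def zoh_droop_db(f_hz, fs_hz):
    """Gain of an ideal zero-order hold at f_hz, in dB, for sample rate fs_hz."""
    x = math.pi * f_hz / fs_hz
    if x == 0:
        return 0.0
    return 20 * math.log10(abs(math.sin(x) / x))


if __name__ == "__main__":
    fs = 25_000
    for f in (1_000, 5_000, 11_400, 12_500):
        print(f"{f:6d} Hz: {zoh_droop_db(f, fs):6.2f} dB")
```

The droop grows toward half the sampling rate, which is why the correction matters most for wide-band settings.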

Table F–10. Modem AFE Data Converters

Device       Description                        I/O             Resolution  Conversion
                                                                (Bits)      Rate
TLC32040     Analog interface chip (AIC)        Serial          14          19.2 kHz
TLC32041     AIC without on-board VREF          Serial          14          19.2 kHz
TLC32044     Telephone speed/modem AIC          Serial          14          19.2 kHz
TLC32045     Low-cost version of the TLC32044   Serial          14          19.2 kHz
TLC32046     Wide-band AIC                      Serial          14          25 kHz
TLC32047     AIC with 11.4-kHz BW               Serial          14          25 kHz
TLC320AC01   5-volt-only AIC                    Serial          14          43.2 kHz
TCM29C18     Companding codec/filter            PCM             8           8 kHz
TCM29C23     Companding codec/filter            PCM             8           16 kHz
TCM29C26     Low-power codec/filter             PCM             8           16 kHz
TCM320AC36   Single-supply codec/filter         PCM and Linear  8           25 kHz


The AIC interfaces directly with serial-input TMS320 DSPs, which execute the
modem’s high-speed encoding and decoding algorithms. The TLC3204x
family performs level-shifting, filtering, and A/D and D/A data conversion. The
DSP’s software-programmable features provide the flexibility required for
modem operations and make it possible to modify and upgrade systems easily.
Under DSP control, the AIC’s sampling rates permit designers to include
fall-back modes without additional analog hardware in most cases. Phase
adjustments can be made in real time so that the A/D and D/A conversions can be
synchronized with the incoming signal. In addition, the chip has a built-in
loopback feature to support modem self-test requirements.

For further information or application assistance, please call TI Linear
Applications at (214) 997–3772.

Figure F–11 shows a V.32 bis modem implementation using the TMS320C25
and a TLC320AC01. The upper TMS320C25 performs echo cancellation and
transmit data functions, while the lower TMS320C25 performs receive data
and timing recovery functions. The echo canceler simulates the telephone
channel and generates an estimated echo of the transmit data signal.
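
One common way to build such an echo estimate is an adaptive FIR filter trained with the normalized LMS (NLMS) algorithm. The sketch below is a generic illustration under that assumption; this appendix does not state which adaptation algorithm the modem uses, and the echo path, step size, and function names are hypothetical.

```python
import random


def simulate_echo(tx, path):
    """Echo returned by an assumed FIR model of the telephone channel."""
    buf = [0.0] * len(path)
    out = []
    for x in tx:
        buf = [x] + buf[:-1]
        out.append(sum(h * b for h, b in zip(path, buf)))
    return out


def nlms_echo_canceller(tx, rx, taps, mu=0.5, eps=1e-9):
    """Adapt FIR weights so the estimated echo of tx cancels rx.

    Returns the final weights and the residual echo sequence.
    """
    w = [0.0] * taps
    buf = [0.0] * taps
    residual = []
    for x, d in zip(tx, rx):
        buf = [x] + buf[:-1]
        estimate = sum(wi * bi for wi, bi in zip(w, buf))  # estimated echo
        e = d - estimate                                   # residual echo
        norm = sum(b * b for b in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        residual.append(e)
    return w, residual


if __name__ == "__main__":
    random.seed(1)
    tx = [random.uniform(-1, 1) for _ in range(2000)]
    path = [0.5, -0.3, 0.1]  # hypothetical echo path
    weights, residual = nlms_echo_canceller(tx, simulate_echo(tx, path), taps=3)
    print([round(v, 6) for v in weights], abs(residual[-1]))
```

The residual sequence plays the role of the signal that the text describes as being monitored to keep training the canceler.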

Figure F–11. High-Speed V.32 Bis and Multistandard Modem With the TLC320AC01 AIC

[Block diagram: an upper TMS320C25/C5X (echo canceler and transmitter) and a
lower TMS320C25/C5X (receiver), each paired with a TLC320AC01 (ADC and DAC)
and connected to the telephone line through fine-tune echo cancel summing
nodes, with a serial RS–232 interface and I/O control.]


The TLC320AC01 performs the following functions:

- Upper TLC320AC01 D/A Path
  Converts the estimated echo, as computed by the upper TMS320C25, into
  an analog signal, which is subtracted from the receive signal

- Upper TLC320AC01 A/D Path
  Converts the residual echo to a digital signal for purposes of monitoring
  the residual echo and continuously training the echo canceler for optimum
  performance. The converted signal is sent to the upper TMS320C25.

- Lower TLC320AC01 D/A Path
  Converts the upper TMS320C25 transmit output to an analog signal,
  performs a smoothing filter function, and drives the DAC

- Lower TLC320AC01 A/D Path
  Converts the echo-free receive signal to a digital signal, which is sent to
  the lower TMS320C25 to be decoded

Note: Modem Functions

Figure F–11 is for illustration only. In practice, a single TMS320C5x DSP can
implement high-speed modem functions.

F.6 Advanced Digital Electronics Applications for Consumers

With the extensive use of the TMS320 DSPs in consumer electronics, much
electromechanical control and signal processing can be done in the digital do-
main. Digital systems generally require some form of analog interface, usually
in the form of high-performance ADCs and DACs. Figure F–12 shows the gen-
eral performance requirements for a variety of applications.

Figure F–12. Applications Performance Requirements

[Chart: sampling frequency (MSPS, 10 to 300) versus resolution (bits, 4 to
10) for Fax/PC, DVTR, ADTV, broadcasting, HDTV, and instrumentation
applications.]

Advanced Television System Design Considerations. Advanced
Digital Television (ADTV) is a technology that uses DSP to enhance video and
audio presentations and to reduce noise and ghosting. Because of these DSP
techniques, a variety of features can be implemented, including frame store,
picture-in-picture, improved sound quality, and zoom. The bandwidth
requirements remain at the existing 6-MHz television allocation. From the
intermediate frequency (IF) output, the video signal is converted by an 8-bit
video ADC. The digital output can be processed in the digital domain to provide
noise reduction, interpolation or averaging for digitally increased sharpness,
and higher quality audio. The DSP digital output is converted back to analog by
a video DAC, as shown in Figure F–13.

Figure F–13. Video Signal Processing Basic System

[Block diagram: TV IF amplifier, ADC, TMS320 DSP, DAC, and buffer in series,
producing the CRT video signal, supported by a field memory, a system
controller, and a clock generator.]

Video cassette recorders (VCRs), compact disc (CD) and digital audio tape
(DAT) players, and personal computers (PCs) are a few of the products that
have taken a major position in the marketplace in recent years. The audio
channels for compact disc and DAT require 16-bit A/D resolution to meet the
distortion and noise standards. See Figure F–14 for a block diagram of a typi-
cal digital audio system.
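
The link between converter resolution and achievable noise performance can be estimated with the standard ideal-quantizer formula for a full-scale sine wave, SNR = 6.02N + 1.76 dB. The sketch below applies it; real converters fall short of this bound, and the function name is hypothetical.

```python
def ideal_adc_snr_db(bits):
    """Ideal quantization SNR in dB for a full-scale sine wave input."""
    return 6.02 * bits + 1.76


if __name__ == "__main__":
    for n in (8, 10, 14, 16):
        print(f"{n:2d} bits: {ideal_adc_snr_db(n):.2f} dB")
```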

Figure F–14. Typical Digital Audio Implementation

[Block diagram: a TMS57001 digital audio sound processor sends digital audio
data to the TMS57013/4 dual 16/18-bit DAC + digital filter, whose PWM output
feeds an analog power amplifier with left (L) and right (R) analog outputs.
A third-overtone oscillator circuit supplies the 384fs, 128fs, 1024fs, and
1fs clocks.]

The motion and motor control systems usually use 8- to 10-bit ADCs for the
lower frequency servo loop. Tape or disk systems use motor or motion control
for proper positioning of the record or playback heads. With the storage me-
dium compressing data into an increasingly smaller physical size, the position-
ing systems require more precision.

The audio processing becomes more demanding as higher fidelity is required.
Better fidelity translates into lower noise and distortion in the output signal.

The TMS57013DW/57014DW one-bit DACs include an eight-times oversampling
digital filter designed for digital audio systems, such as compact disc
players (CDPs), DATs, compact disc interactive (CDI) players, laser disc
players (LDPs), digital amplifiers, and car stereos. They are also suitable for
all systems that include digital sound processing, such as TVs, VCRs, musical
instruments, and multimedia.

The converters have dual channels so that the right and left stereo signals can
be transformed into analog signals with only one chip. There are some func-
tions that allow the customers to select the conditions according to their appli-
cations, such as muting, attenuation, de-emphasis, and zero data detection.
These functions are controlled by external 16-bit serial data from a controller
like a microcomputer.

The TMS57013DW/57014DW use a 129-tap finite impulse response (FIR) filter
and third-order ΔΣ modulation to achieve –75-dB stopband attenuation and a
96-dB signal-to-noise ratio (SNR). The output is a pulse-width-modulated (PWM)
wave, which can be converted to an analog signal with a low-pass filter.
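
The principle behind a one-bit converter can be shown with a first-order ΔΣ modulator. This is a deliberately simplified sketch: the devices described here use third-order modulation, which is not reproduced, and the function name is hypothetical. The average of the one-bit output tracks the input, which is why a low-pass filter recovers the analog signal.

```python
def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator; inputs must lie within [-1, 1].

    Returns a list of +1/-1 output bits.
    """
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback      # accumulate the quantization error
        bit = 1 if integrator >= 0 else -1
        feedback = float(bit)           # one-bit DAC in the feedback path
        bits.append(bit)
    return bits


if __name__ == "__main__":
    bits = delta_sigma_1bit([0.3] * 1000)
    print(sum(bits) / len(bits))  # close to the 0.3 input level
```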

Table F–11 lists TI products for analog interfacing to digital systems.


Table F–11. Audio/Video Analog/Digital Interface Devices

Function                         Device       Bits   Speed                    Channels  Interface
Dual audio DAC + digital filter  TMS57013/4   16/18  32, 37.8, 44.1, 48 kHz   2         Serial
Analog interface                 TLC32071
  A/D                                         8      2 µs                     8         Parallel
  D/A                                         8      15 µs                    1         Parallel
A/D                              TLC1225      12     12 µs                    1         Parallel
A/D                              TLC1550      10     6 µs                     1         Parallel
Video D/A                        TLC5602      8      50 ns                    1         Parallel
Video D/A                        TL5602       8      50 ns                    1         Parallel
Triple video D/A                 TL5632       8      16 ns                    3         Parallel
Triple flash A/D                 TLC5703      8      70 ns                    3         Parallel
Flash A/D                        TLC5503      8      100 ns                   1         Parallel
Flash A/D                        TLC5502      8      50 ns                    1         Parallel

For further information or application assistance, please call TI Linear
Applications at (214) 997–3772.

Appendix G

Boot Loader Source Code

This appendix contains the source code for the TMS320C3x boot loader.

************************************************************************
* C31BOOT – TMS320C31 BOOT LOADER PROGRAM
* (C) COPYRIGHT TEXAS INSTRUMENTS INC., 1990
*
* NOTE: 1. AFTER DEVICE RESET, THE PROGRAM IS SET TO WAIT FOR
* THE EXTERNAL INTERRUPTS. THE FUNCTION SELECTION OF
* THE EXTERNAL INTERRUPTS IS AS FOLLOWS:
* –––––––––––––––––––––––––––––––––––––––––––––––––––
* INTERRUPT PIN | FUNCTION
* –––––––––––––––|–––––––––––––––––––––––––––––––––––
* 0 | EPROM boot loader from 1000H
* –––––––––––––––|–––––––––––––––––––––––––––––––––––
* 1 | EPROM boot loader from 400000H
* –––––––––––––––|–––––––––––––––––––––––––––––––––––
* 2 | EPROM boot loader from FFF000H
* –––––––––––––––|–––––––––––––––––––––––––––––––––––
* 3 | Serial port 0 boot loader
* –––––––––––––––––––––––––––––––––––––––––––––––––––
*
* 2. THE EPROM BOOT LOADER LOADS WORD, HALFWORD, OR BYTE-
* WIDE PROGRAMS TO SPECIFIED LOCATIONS. THE
* 8 LSBs OF FIRST MEMORY SPECIFY THE MEMORY WIDTH OF
* THE EPROM. IF THE HALFWORD OR BYTE-WIDE PROGRAM IS
* SELECTED, THE LSBs ARE LOADED FIRST, FOLLOWED BY THE MSBs.
* THE FOLLOWING WORD CONTAINS THE CONTROL WORD FOR
* THE LOCAL MEMORY REGISTER. THE PROGRAM BLOCKS FOLLOW.
* THE FIRST TWO WORDS OF EACH PROGRAM BLOCK CONTAIN
* THE BLOCK SIZE AND MEMORY ADDRESS TO BE LOADED INTO.
* WHEN THE ZERO BLOCK SIZE IS READ, THE PROGRAM BLOCK
* LOADING IS TERMINATED. THE PC WILL BRANCH TO THE
* STARTING ADDRESS OF THE FIRST PROGRAM BLOCK.
*
* 3. IF SERIAL PORT 0 IS SELECTED FOR BOOT LOADING, THE
* PROCESSOR WILL WAIT FOR THE INTERRUPT FROM THE
* RECEIVE SERIAL PORT 0 AND PERFORM THE DOWNLOAD.
* AS WITH THE EPROM LOADER, PROGRAMS CAN BE LOADED
* INTO DIFFERENT MEMORY BLOCKS. THE FIRST TWO WORDS OF EACH
* PROGRAM BLOCK CONTAIN THE BLOCK SIZE AND MEMORY ADDRESS
* TO BE LOADED INTO. WHEN THE ZERO BLOCK SIZE IS READ,
* PROGRAM BLOCK LOADING IS TERMINATED. IN OTHER WORDS,
* IN ORDER TO TERMINATE THE PROGRAM BLOCK LOADING,
* A ZERO HAS TO BE ADDED AT THE END OF THE PROGRAM BLOCK.
* AFTER THE BOOT LOADING IS COMPLETED, THE PC WILL BRANCH
* TO THE STARTING ADDRESS OF THE FIRST PROGRAM BLOCK.
*
************************************************************************

            .global check
            .sect   "vectors"
reset       .word   check
int0        .word   809FC1h
int1        .word   809FC2h
int2        .word   809FC3h
int3        .word   809FC4h
xint0       .word   809FC5h
rint0       .word   809FC6h
            .word   809FC7h
            .word   809FC8h
tint0       .word   809FC9h
tint1       .word   809FCAh
dint        .word   809FCBh
            .word   809FCCh
            .word   809FCDh
            .word   809FCEh
            .word   809FCFh
            .word   809FD0h
            .word   809FD1h
            .word   809FD2h
            .word   809FD3h
            .word   809FD4h
            .word   809FD5h
            .word   809FD6h
            .word   809FD7h
            .word   809FD8h
            .word   809FD9h
            .word   809FDAh
            .word   809FDBh
            .word   809FDCh
            .word   809FDDh
            .word   809FDEh
            .word   809FDFh

***************************************************************************

trap0       .word   809FE0h
trap1       .word   809FE1h
trap2       .word   809FE2h
trap3       .word   809FE3h
trap4       .word   809FE4h
trap5       .word   809FE5h
trap6       .word   809FE6h
trap7       .word   809FE7h
trap8       .word   809FE8h
trap9       .word   809FE9h
trap10      .word   809FEAh
trap11      .word   809FEBh
trap12      .word   809FECh
trap13      .word   809FEDh
trap14      .word   809FEEh
trap15      .word   809FEFh
trap16      .word   809FF0h
trap17      .word   809FF1h
trap18      .word   809FF2h
trap19      .word   809FF3h
trap20      .word   809FF4h
trap21      .word   809FF5h
trap22      .word   809FF6h
trap23      .word   809FF7h
trap24      .word   809FF8h
trap25      .word   809FF9h
trap26      .word   809FFAh
trap27      .word   809FFBh
            .word   809FFCh
            .word   809FFDh
            .word   809FFEh
            .word   809FFFh

***************************************************************************
            .space  5

check:      LDI     4040h,AR0       ; load peripheral mem. map
            LSH     9,AR0           ; start addr. 808000h
            LDI     404Ch,SP        ; initialize stack pointer to
            LSH     9,SP            ; ram0 addr. 809800h
            LDI     0,R0            ; set start address flag off
intloop     TSTB    8,IF            ; test for ext int3
            BNZ     serial          ; on int3 go to serial
            LDI     8,AR1           ; load 001000h / 2^9 -> AR1
            TSTB    1,IF            ; test for int0
            BNZ     eprom_load      ; branch to eprom_load if int0 = 1
            LDI     2000h,AR1       ; load 400000h / 2^9 -> AR1
            TSTB    2,IF            ; test for int1
            BNZ     eprom_load      ; branch to eprom_load if int1 = 1
            LDI     7FF8h,AR1       ; load FFF000h / 2^9 -> AR1
            TSTB    4,IF            ; test for int2
            BZ      intloop         ; if no intX go to intloop
eprom_load  LSH     9,AR1           ; eprom address = AR1 * 2^9
            LDI     *AR1++(1),R1    ; load eprom mem. width
            LDI     sub_w,AR3       ; full-word size subroutine
                                    ; address -> AR3
            LSH     26,R1           ; test bit 5 of mem. width word
            BN      load0           ; if '1' start PGM loading
                                    ; (32 bits width)
            NOP     *AR1++(1)       ; jump last half word from mem. word
            LDI     sub_h,AR3       ; half word size subroutine
                                    ; address -> AR3
            LSH     1,R1            ; test bit 4 of mem. width word
            BN      load0           ; if '1' start PGM loading
                                    ; (16 bits width)
            LDI     sub_b,AR3       ; byte size subroutine address -> AR3
            ADDI    2,AR1           ; jump last 2 bytes from mem. word

load0       CALLU   AR3             ; load new word
                                    ; according to mem. width
            STI     R1,*+AR0(64h)   ; set primary bus control

load2       CALLU   AR3             ; load new word according to
                                    ; mem. width
            LDI     R1,RC           ; set block size for repeat loop
            CMPI    0,RC            ; if 0 block size start PGM
            BZ      AR2
            SUBI    1,RC            ; block size -1

            CALLU   AR3             ; load new word according to
                                    ; mem. width
            LDI     R1,AR4          ; set destination address
            LDI     R0,R0           ; test start address loaded flag
            LDIZ    R1,AR2          ; load start address if flag off
            LDI     -1,R0           ; set start & dest. address flag on
            SUBI    1,AR3           ; sub address with loop

            CALLU   AR3             ; load new word according to
                                    ; mem. width
            LDI     1,R0            ; set dest. address flag off
            ADDI    1,AR3           ; sub address without loop
            BR      load2           ; jump to load a new block
                                    ; when loop completed

            .space  1

serial      LDI     sub_s,AR3       ; serial words subroutine
                                    ; address -> AR3
            LDI     111h,R1         ; R1 = 0000111h
            STI     R1,*+AR0(43h)   ; set CLKR,DR,FSR as serial port pins
            LDI     0A30h,R2
            LSH     16,R2           ; R2 = A300000h
            STI     R2,*+AR0(40h)   ; set serial port global
                                    ; ctrl. register
            BR      load2           ; jump to load 1st block

            .space  29

loop_s RPTB load_s ; PGM load loop


sub_s TSTB 20h,IF
BZ sub_s ; wait for receive buffer full
AND 0FDFh,IF ; reset interrupt flag


LDI *+AR0(4Ch),R1
LDI R0,R0 ; test load address flag
BNN end_s
load_s STI R1,*AR4++(1) ; store new word to dest. address
end_s RETSU ; return from subroutine

.space 22

loop_h RPTB load_h ; PGM load loop


sub_h LDI *AR1++(1),R1 ; load LSB half word
AND 0FFFFh,R1
LDI *AR1++(1),R2 ; load MSB half word
LSH 16,R2
OR R2,R1 ; R1 = a new 32-bit word
LDI R0,R0 ; test load address flag
BNN end_h
load_h STI R1,*AR4++(1) ; store new word to dest. address
end_h RETSU ; return from subroutine

.space 26

loop_w RPTB load_w ; PGM load loop


sub_w LDI *AR1++(1),R1 ; read a new 32-bit word
LDI R0,R0 ; test load address flag
BNN end_w
load_w STI R1,*AR4++(1) ; store new word to dest. address
end_w RETSU ; return from subroutine

.space 14

loop_b RPTB load_b ; PGM load loop


sub_b LDI *AR1++(1),R1
AND 0FFh,R1 ; load 1st byte ( LSB )
LDI *AR1++(1),R2
AND 0FFh,R2
LSH 8,R2
OR R2,R1 ; load 2nd byte
LDI *AR1++(1),R2
AND 0FFh,R2
LSH 16,R2
OR R2,R1 ; load 3rd byte
LDI *AR1++(1),R2 ; load 4th byte ( MSB )
LSH 24,R2
OR R2,R1 ; R1 = a new 32-bit word
LDI R0,R0 ; test load address flag
BNN end_b
load_b STI R1,*AR4++(1) ; store new word to dest. address
end_b RETSU ; return from subroutine

.space 1

.end
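
The width-dependent load subroutines in the listing above (sub_w, sub_h, sub_b) each build one 32-bit word in R1 from successive memory reads, least significant part first, using AND, LSH, and OR. The following is a minimal sketch of that assembly step in Python, not the boot loader itself. The function name `assemble_word` and the list-based model of memory reads are assumptions introduced for illustration only.

```python
def assemble_word(reads, width):
    """Combine successive memory reads into one 32-bit word, LSB part first.

    reads -- list of integers, one per memory access (a hypothetical model
             of the values returned by each LDI *AR1++(1))
    width -- memory width in bits: 8, 16, or 32
    """
    if width not in (8, 16, 32):
        raise ValueError("width must be 8, 16, or 32")
    count = 32 // width              # reads needed per 32-bit word
    if len(reads) < count:
        raise ValueError("not enough reads for one word")
    mask = (1 << width) - 1          # keep only the valid data lines
    word = 0
    for i in range(count):
        # The i-th read supplies bits [i*width, (i+1)*width),
        # mirroring the AND / LSH / OR sequence in the listing.
        word |= (reads[i] & mask) << (i * width)
    return word & 0xFFFFFFFF


if __name__ == "__main__":
    # Byte-wide memory: the upper bits of each read are masked off.
    print(hex(assemble_word([0xFFFFFF78, 0x56, 0x34, 0x12], 8)))
```

The listing does not mask the fourth byte in sub_b because LSH 24 already shifts the unused upper bits out of the 32-bit register; the sketch masks every read and truncates to 32 bits, which gives the same result.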


Index

12-pin emulator connector, dimensions 12-45
12-pin header, MPSD 12-39 to 12-40

A

A-law
  compression 11-56
  expansion 11-57
A/D converter interface 12-19 to 12-22
A/D input/output system 12-32 to 12-35
abbreviations 10-14 to 10-15
ABSF and STF instructions (parallel) 10-23 to 10-24
ABSF instruction 10-22
ABSI and STI instructions (parallel) 10-27 to 10-28
ABSI instruction 10-25 to 10-26
absolute value of floating-point instruction 10-22
absolute value of integer instruction 10-25
adaptive filters 11-67
ADC F-23
add floating-point instruction 10-32
  3-operand instruction 10-33
add integer instruction 10-37
  3-operand instruction 10-38
add integer with carry instruction 10-29
  3-operand instruction 10-30
ADDC instruction 10-29
ADDC3 instruction 10-30 to 10-31
ADDF instruction 10-32
ADDF3 and MPYF3 instructions (parallel) 10-119 to 10-121
ADDF3 and STF instructions (parallel) 10-35 to 10-36
ADDF3 instruction 10-33 to 10-34
ADDI instruction 10-37
ADDI3 and MPYI3 instructions (parallel) 10-130 to 10-132
ADDI3 and STI instructions (parallel) 10-40 to 10-41
ADDI3 instruction 10-38 to 10-39
addition example 11-39
address space segmentation 12-11
addressing 5-1 to 5-34
  bit-reversed 5-29 to 5-30
    FFT algorithms 5-29 to 5-30
  circular 5-24 to 5-28
    algorithm 5-26
    buffer 5-24 to 5-28
    operation 5-27
  modes
    conditional branch 2-16, 5-23
    general 5-19 to 5-20
    groups 5-19 to 5-23
    long-immediate 2-16
    parallel 2-16, 5-21 to 5-22
    three-operand 2-16, 5-20 to 5-21
  types 5-2 to 5-18
    direct 5-4
    indirect 5-5 to 5-16
    long-immediate 5-17
    PC-relative 5-17 to 5-18
    register 5-3
    short-immediate 5-16 to 5-17
    used in addressing modes 5-2 to 5-18
ADTV F-20
advanced interface design 12-1
algorithm partitioning D-4
analog interface circuit (AIC) 12-32 to 12-35
analog interface peripherals and applications F-1 to F-24
  dedicated speech synthesis F-11 to F-13
  digital electronics for consumers F-20 to F-24


analog interface peripherals and applications (continued)
  modem F-17 to F-19
  multimedia F-2 to F-4
    multimedia-related devices F-4
    system design considerations F-2 to F-3
  servo control/disk drive F-14 to F-16
  telecommunications F-5 to F-10
AND instruction 10-42
AND3 and STI instructions (parallel) 10-45 to 10-46
AND3 instruction 10-43 to 10-44
ANDing of the ready signals 12-10
ANDN instruction 10-47
ANDN3 instruction 10-48 to 10-49
application-oriented operations 11-53 to 11-130
  adaptive filters 11-67
  companding 11-53 to 11-57
  fast Fourier transforms (FFT) 11-73 to 11-125
  FIR filters 11-58 to 11-60
  IIR filters 11-60 to 11-66
  lattice filters 11-125 to 11-131
  matrix-vector multiplication 11-70 to 11-73
applications, general listing 1-10
architecture 2-2
  block diagram 2-3
  introduction 2-2
  overview 2-1
arithmetic
  logic unit (ALU) 2-6
  operations 11-23 to 11-52
    bit manipulation 11-23 to 11-24
    bit-reversed addressing 11-25 to 11-26
    block moves 11-25
    extended-precision arithmetic 11-38 to 11-41
    floating-point format conversion 11-42 to 11-52
    integer and floating-point division 11-26 to 11-33
    square root 11-34
arithmetic shift instruction 10-50
  3-operand instruction 10-52
ASH instruction 10-50 to 10-51
ASH3 and STI instructions (parallel) 10-54 to 10-55
ASH3 instruction 10-52 to 10-53
assembler syntax expression, example 10-19
assembler syntax, optional 10-16 to 10-18
assembler/linker B-2
assembly language
  condition codes and flags 10-10 to 10-13
  individual instructions 10-14 to 10-210
    example 10-19 to 10-21
    general information 10-14 to 10-18
    optional assembler syntaxes 10-16 to 10-18
    symbols and abbreviations 10-14 to 10-15
  instruction set 10-2 to 10-9
    illegal instructions 10-9
    interlocked operations instructions 10-6
    load-and-store instructions 10-2
    low-power control instructions 10-5
    parallel operations instructions 10-7 to 10-8
    program control instructions 10-5
    three-operand instructions 10-4
    two-operand instructions 10-3
assembly language instructions 10-1 to 10-18
  ABSF and STF instructions (parallel) 10-23 to 10-24
  ABSF instruction 10-22
  ABSI and STI instructions (parallel) 10-27 to 10-28
  ABSI instruction 10-25 to 10-26
  absolute value of floating-point 10-22
  absolute value of integer 10-25 to 10-26
  add floating-point 10-32
    3-operand instruction 10-33 to 10-34
  add integer 10-37
    3-operand instruction 10-38 to 10-39
  add integer with carry 10-29
    3-operand instruction 10-30 to 10-31
  ADDC instruction 10-29
  ADDC3 instruction 10-30 to 10-31
  ADDF instruction 10-32
  ADDF3 and MPYF3 instructions (parallel) 10-119 to 10-121
  ADDF3 and STF instructions (parallel) 10-35 to 10-36
  ADDF3 instruction 10-33 to 10-34
  ADDI instruction 10-37
  ADDI3 and MPYI3 instructions (parallel) 10-130 to 10-132
  ADDI3 and STI instructions (parallel) 10-40 to 10-41
  ADDI3 instruction 10-38 to 10-39
  AND instruction 10-42
  AND3 and STI instructions (parallel) 10-45 to 10-46
  AND3 instruction 10-43 to 10-44


assembly language instructions (continued)
  ANDN instruction 10-47
  ANDN3 instruction 10-48 to 10-49
  arithmetic shift 10-50 to 10-51
    3-operand instruction 10-52 to 10-53
  ASH instruction 10-50 to 10-51
  ASH3 and STI instructions (parallel) 10-54 to 10-55
  ASH3 instruction 10-52 to 10-53
  Bcond instruction 10-56 to 10-57
  BcondD instruction 10-58 to 10-59
  bitwise exclusive-OR 10-206
    3-operand instruction 10-207 to 10-208
  bitwise logical-AND 10-42
    3-operand instruction 10-43 to 10-44
  bitwise logical-AND with complement 10-47
    3-operand instruction 10-48 to 10-49
  bitwise logical-complement 10-148
  bitwise logical-OR 10-151
    3-operand instruction 10-152 to 10-153
  BR instruction 10-60
  branch conditionally (delayed) 10-58 to 10-59
  branch conditionally (standard) 10-56 to 10-57
  branch unconditionally (delayed) 10-61
  branch unconditionally (standard) 10-60
  BRD instruction 10-61
  CALL instruction 10-62
  call subroutine 10-62
  call subroutine conditionally 10-63 to 10-64
  CALLcond instruction 10-63 to 10-64
  categories
    illegal 10-9
    interlocked operation 10-6
    load and store 10-2
    low-power control 10-5
    parallel operation 10-7 to 10-8
    program control 10-5
    three-operand 10-4
    two-operand 10-3
  CMPF instruction 10-65
  CMPF3 instruction 10-66 to 10-67
  CMPI instruction 10-68
  CMPI3 instruction 10-69 to 10-70
  compare floating-point 10-65
    3-operand instruction 10-66 to 10-67
  compare integer 10-68
    3-operand instruction 10-69 to 10-70
  condition codes 10-10 to 10-13
  condition for execution 10-10 to 10-13
  DBcond instruction 10-71 to 10-72
  DBcondD instruction 10-73 to 10-74
  decrement and branch conditionally
    delayed 10-73 to 10-74
    standard 10-71 to 10-72
  example instruction 10-19 to 10-21
  FIX and STI instructions (parallel) 10-77 to 10-78
  FIX instruction 10-75 to 10-76
  FLOAT and STF instructions (parallel) 10-80 to 10-81
  FLOAT instruction 10-79
  floating-point-to-integer conversion 10-75 to 10-76
  IACK instruction 10-82
  IDLE instruction 10-83
  idle until interrupt 10-83
  IDLE2 instruction 10-84 to 10-85
  individual instructions 10-14 to 10-210
  integer to floating-point conversion 10-79
  interrupt acknowledge 10-82
  LDE instruction 10-86
  LDF and LDF instructions (parallel) 10-91 to 10-92
  LDF and STF instructions (parallel) 10-93 to 10-94
  LDF instruction 10-87
  LDFcond instruction 10-88 to 10-89
  LDFI instruction 10-90
  LDI and LDI instructions (parallel) 10-100 to 10-101
  LDI and STI instructions (parallel) 10-102 to 10-103
  LDI instruction 10-95 to 10-96
  LDIcond instruction 10-97 to 10-98
  LDII instruction 10-99
  LDM instruction 10-104
  LDP instruction 10-105
  load data page pointer 10-105
  load floating-point 10-87
    interlocked 10-90
  load floating-point conditionally 10-88 to 10-89
  load floating-point exponent 10-86
  load floating-point mantissa 10-104
  load integer 10-95 to 10-96
    interlocked 10-99
  load integer conditionally 10-97 to 10-98
  logical shift 10-107 to 10-108
    3-operand instruction 10-109 to 10-111
  LOPOWER instruction 10-106


assembly language instructions (continued)
  low-power idle 10-84 to 10-85
  LSH instruction 10-107 to 10-108
  LSH3 and STI instructions (parallel) 10-112 to 10-114
  LSH3 instruction 10-109 to 10-111
  MAXSPEED instruction 10-115
  MPYF instruction 10-116
  MPYF3 and ADDF3 instructions (parallel) 10-119 to 10-121
  MPYF3 and STF instructions (parallel) 10-122 to 10-123
  MPYF3 and SUBF3 instructions (parallel) 10-124 to 10-126
  MPYF3 instruction 10-117 to 10-118
  MPYI instruction 10-127
  MPYI3 and ADDI3 instructions (parallel) 10-130 to 10-132
  MPYI3 and STI instructions (parallel) 10-133 to 10-134
  MPYI3 and SUBI3 instructions (parallel) 10-135 to 10-137
  MPYI3 instruction 10-128 to 10-129
  multiply floating-point 10-116
    3-operand instruction 10-117 to 10-118
  multiply integer 3-operand instruction 10-128 to 10-129
  multiply integer instruction 10-127
  negative floating-point 10-139
  negative integer 10-142
  negative integer with borrow 10-138
  NEGB instruction 10-138
  NEGF and STF instructions (parallel) 10-140 to 10-141
  NEGF instruction 10-139
  NEGI and STI instructions (parallel) 10-143 to 10-144
  NEGI instruction 10-142
  no operation 10-145
  NOP instruction 10-145
  NORM instruction 10-146 to 10-147
  normalize 10-146 to 10-147
  NOT and STI instructions (parallel) 10-149 to 10-150
  NOT instruction 10-148
  OR instruction 10-151
  OR3 and STI instructions (parallel) 10-154 to 10-155
  OR3 instruction 10-152 to 10-153
  parallel ABSF and STF instructions 10-23 to 10-24
  parallel ABSI and STI instructions 10-27 to 10-28
  parallel ADDF3 and MPYF3 instructions 10-119 to 10-121
  parallel ADDF3 and STF instructions 10-35 to 10-36
  parallel ADDI3 and MPYI3 instructions 10-130 to 10-132
  parallel ADDI3 and STI instructions 10-40 to 10-41
  parallel AND3 and STI instructions 10-45 to 10-46
  parallel ASH3 and STI instructions 10-54 to 10-55
  parallel FIX and STI instructions 10-77 to 10-78
  parallel FLOAT and STF instructions 10-80 to 10-81
  parallel instructions advantages 11-132
  parallel LDF and LDF instructions 10-91 to 10-92
  parallel LDF and STF instructions 10-93 to 10-94
  parallel LDI and LDI instructions 10-100 to 10-101
  parallel LDI and STI instructions 10-102 to 10-103
  parallel LSH3 and STI instructions 10-112 to 10-114
  parallel MPYF3 and ADDF3 instructions 10-119 to 10-121
  parallel MPYF3 and STF instructions 10-122 to 10-123
  parallel MPYF3 and SUBF3 instructions 10-124 to 10-126
  parallel MPYI3 and ADDI3 instructions 10-130 to 10-132
  parallel MPYI3 and STI instructions 10-133 to 10-134
  parallel MPYI3 and SUBI3 instructions 10-135 to 10-137
  parallel NEGF and STF instructions 10-140 to 10-141
  parallel NEGI and STI instructions 10-143 to 10-144
  parallel NOT and STI instructions 10-149 to 10-150


assembly language instructions (continued)
  parallel OR3 and STI instructions 10-154 to 10-155
  parallel STF and ABSF instructions 10-23 to 10-24
  parallel STF and ADDF3 instructions 10-35 to 10-36
  parallel STF and FLOAT instructions 10-80 to 10-81
  parallel STF and LDF instructions 10-93 to 10-94
  parallel STF and MPYF3 instructions 10-122 to 10-123
  parallel STF and NEGF instructions 10-140 to 10-141
  parallel STF and STF instructions 10-176 to 10-177, 10-180 to 10-181
  parallel STF and SUBF3 instructions 10-190 to 10-191
  parallel STI and ABSI instructions 10-27 to 10-28
  parallel STI and ADDI3 instructions 10-40 to 10-41
  parallel STI and AND3 instructions 10-45 to 10-46
  parallel STI and ASH3 instructions 10-54 to 10-55
  parallel STI and FIX instructions 10-77 to 10-78
  parallel STI and LDI instructions 10-102 to 10-103
  parallel STI and LSH3 instructions 10-112 to 10-114
  parallel STI and MPYI3 instructions 10-133 to 10-134
  parallel STI and NEGI instructions 10-143 to 10-144
  parallel STI and NOT instructions 10-149 to 10-150
  parallel STI and OR3 instructions 10-154 to 10-155
  parallel STI and SUBI3 instructions 10-195 to 10-196
  parallel STI and XOR3 instructions 10-209 to 10-210
  parallel SUBF3 and MPYF3 instructions 10-124 to 10-126
  parallel SUBF3 and STF instructions 10-190 to 10-191
  parallel SUBI3 and MPYI3 instructions 10-135 to 10-137
  parallel SUBI3 and STI instructions 10-195 to 10-196
  parallel XOR3 and STI instructions 10-209 to 10-210
  POP floating-point 10-157
  POP integer instruction 10-156
  POPF instruction 10-157
  PUSH floating-point 10-159
  PUSH integer instruction 10-158
  PUSHF instruction 10-159
  register syntax 10-18
  repeat block 10-170
  repeat single 10-171 to 10-172
  restore clock to regular speed 10-115
  RETIcond instruction 10-160 to 10-161
  return from subroutine conditionally 10-162
  RETScond instruction 10-162
  return from interrupt conditionally 10-160 to 10-161
  RND instruction 10-163 to 10-164
  ROL instruction 10-165
  ROLC instruction 10-166 to 10-167
  ROR instruction 10-168
  RORC instruction 10-169
  rotate
    left 10-165
    left through carry 10-166 to 10-167
    right 10-168
    right through carry 10-169
  round floating-point 10-163 to 10-164
  RPTB instruction 10-170
  RPTS instruction 10-171 to 10-172
  SIGI instruction 10-173
  signal, interlocked 10-173
  software interrupt 10-200
  STF and ABSF instructions (parallel) 10-23 to 10-24
  STF and ADDF3 instructions (parallel) 10-35 to 10-36
  STF and FLOAT instructions (parallel) 10-80 to 10-81
  STF and LDF instructions (parallel) 10-93 to 10-94
  STF and MPYF3 instructions (parallel) 10-122 to 10-123
  STF and NEGF instructions (parallel) 10-140 to 10-141
  STF and STF instructions (parallel) 10-176 to 10-177


assembly language instructions (continued)
  STF and SUBF3 instructions (parallel) 10-190 to 10-191
  STF instruction 10-174
  STFI instruction 10-175
  STI and ABSI instructions (parallel) 10-27 to 10-28
  STI and ADDI3 instructions (parallel) 10-40 to 10-41
  STI and AND3 instructions (parallel) 10-45 to 10-46
  STI and ASH3 instructions (parallel) 10-54 to 10-55
  STI and FIX instructions (parallel) 10-77 to 10-78
  STI and LDI instructions (parallel) 10-102 to 10-103
  STI and LSH3 instructions (parallel) 10-112 to 10-114
  STI and MPYI3 instructions (parallel) 10-133 to 10-134
  STI and NEGI instructions (parallel) 10-143 to 10-144
  STI and NOT instructions (parallel) 10-149 to 10-150
  STI and OR3 instructions (parallel) 10-154 to 10-155
  STI and STI instructions (parallel) 10-180 to 10-181
  STI and SUBI3 instructions (parallel) 10-195 to 10-196
  STI and XOR3 instructions (parallel) 10-209 to 10-210
  STI instruction 10-178
  STII instruction 10-179
  store floating-point 10-174
  store floating-point, interlocked 10-175
  store integer 10-178
  store integer, interlocked 10-179
  SUBB instruction 10-182
  SUBB3 instruction 10-183 to 10-184
  SUBC instruction 10-185 to 10-186
    integer division 11-27 to 11-30
  SUBF instruction 10-187
  SUBF3 and MPYF3 instructions (parallel) 10-124 to 10-126
  SUBF3 and STF instructions (parallel) 10-190 to 10-191
  SUBF3 instruction 10-188 to 10-189
  SUBI instruction 10-192
  SUBI3 and MPYI3 instructions (parallel) 10-135 to 10-137
  SUBI3 and STI instructions (parallel) 10-195 to 10-196
  SUBI3 instruction 10-193 to 10-194
  SUBRB instruction 10-197
  SUBRF instruction 10-198
  SUBRI instruction 10-199
  subtract floating-point 10-187
    3-operand instruction 10-188 to 10-189
  subtract integer 10-192
    3-operand instruction 10-193 to 10-194
  subtract integer conditionally 10-185 to 10-186
  subtract integer with borrow 10-182
    3-operand instruction 10-183 to 10-184
  subtract reverse floating-point 10-198
  subtract reverse integer 10-199
  subtract reverse integer with borrow 10-197
  SWI instruction 10-200
  symbols used to define 10-15 to 10-18
  syntax options 10-16 to 10-18
  test bit fields 10-203
    3-operand instruction 10-204 to 10-205
  trap conditionally 10-201 to 10-202
  TRAPcond instruction 10-201 to 10-202
  TSTB instruction 10-203
  TSTB3 instruction 10-204 to 10-205
  XOR instruction 10-206
  XOR3 and STI instructions (parallel) 10-209 to 10-210
  XOR3 instruction 10-207 to 10-208
auxiliary (AR0–AR7) registers 3-3
auxiliary register ALUs 2-6
auxiliary register arithmetic units (ARAUs) 5-5

B

bank switching
  external bus 12-13 to 12-18
  programmable 7-30 to 7-32
bank switching techniques 12-13 to 12-19
Bcond instruction 10-56 to 10-57
BcondD instruction 10-58 to 10-59
biquad 11-60
bit manipulation 11-23 to 11-24
bit-reversed addressing 5-29 to 5-30, 11-25
  FFT algorithms 5-29 to 5-30


bitwise exclusive-OR instruction 10-206
  3-operand instruction 10-207
bitwise logical-complement instruction 10-148
bitwise logical-AND instruction 10-42
  3-operand instruction 10-43
bitwise logical-ANDN instruction 10-47
  3-operand instruction 10-48
bitwise logical-OR instruction 10-151
  3-operand instruction 10-152
block
  moves 11-25
  repeat 11-18
  repeat modes 6-2 to 6-7
    control bits 6-3
    nested block repeats 6-7
    operation 6-3 to 6-4
    RC register value 6-6 to 6-7
    restrictions 6-6
    RPTB instruction 6-4 to 6-5
    RPTS instruction 6-5
  repeat registers (RC, RE, RS) 3-11, 6-2
  size (BK) register 3-4
block diagram
  architectural 2-3
  functional 1-5
boot loader 3-26
  external memory loading 3-30
  interrupt and trap vector mapping 3-33
  invoking 3-26
  mode selection 3-29
  operations 3-26
  precautions 3-35
  serial-port loading 3-33
boot loader source code G-1 to G-6
BR instruction 10-60
branch conflicts 9-4 to 9-6
branch unconditionally (delayed) instruction 10-58, 10-61
branch unconditionally (standard) instruction 10-56, 10-60
branches 6-8
  delayed 6-8 to 6-9, 11-17
BRD instruction 10-61
breakdown of numbers B-9 to B-10
buffered signals 12-43
  MPSD 12-42
buffering 12-41
bulletin board service (BBS) B-5 to B-6
bus operation 7-1 to 7-32
  external 2-26
  internal 2-22
buses
  DMA 2-22
  program 2-22
busy-waiting example 6-14
byte-wide configured memory 3-31

C

C (HLL) routines 11-131 to 11-134
C compiler B-2
’C30, memory maps 2-14
’C30 power dissipation D-1 to D-32
  FFT assembly code D-30 to D-32
  photo of IDD for FFT D-29
  summary D-28
’C31
  memory maps 2-15
  interrupt and trap memory maps 3-34
  reserved memory locations 2-31
’C3x DSPs 1-2
cache
  architecture 3-21 to 3-23
  control bits 3-24
    cache clear bit (CC) 3-24
    cache enable bit (CE) 3-24
    cache freeze bit (CF) 3-25
  hit 3-23
  instruction 2-12
  memory 2-11, 3-21
    algorithm 3-23 to 3-24
    architecture 3-21
    instruction 3-21
  miss 3-23
  segment 3-24
  word 3-23
CALL instruction 6-10, 10-62
call subroutine conditionally instruction 10-63
call subroutine instruction 10-62
CALLcond instruction 6-10, 10-63 to 10-64
calls 6-10 to 6-11
carry flag 10-12
cautions x
C-callable routines 11-131


central processing unit 2-4
  block diagram 2-5
  registers 2-8
circular addressing 5-24 to 5-28
  algorithm 5-26
  circular buffer 5-24
  FIR filters 5-28, 11-58
  operation 5-27
clkout 8-21, 8-22
CLKR pins 8-20
CLKX pins 8-19
clock mode
  timer interrupt 8-11
  timer pulse generator 8-8 to 8-9
clock oscillator circuitry 12-27 to 12-29
clocking of memory accesses 9-23 to 9-30
  data loads and stores 9-24 to 9-30
  program fetches 9-23
CMPF instruction 10-65
CMPF3 instruction 10-66 to 10-67
CMPI instruction 10-68
CMPI3 instruction 10-69 to 10-70
COMBO F-6
companding 11-53 to 11-57
compare floating-point instruction 10-65
  3-operand instruction 10-66
compare integer instruction 10-68
  3-operand instruction 10-69
compiler B-2
compression
  A-law 11-56
  U-law 11-54
computed GOTO 11-22
condition codes and flags 10-10 to 10-13
condition flags 10-10 to 10-13
  floating-point underflow 10-11
  latched floating-point underflow 10-11
  latched overflow 10-11
  negative 10-11
  overflow 10-12
  zero 10-11
conditional-branch addressing modes 2-16, 5-23
conditional delayed branches 6-8
  compare instructions 6-8
  extended-precision registers 6-8
connector
  dimensions, mechanical 12-43 to 12-45
  12-pin header 12-39
consumer electronics F-20 to F-24
context switching 11-11 to 11-15
  context restore for ’C3x 11-14 to 11-16
  context save for ’C3x 11-12 to 11-13
control registers, external interface 7-2 to 7-5
  expansion bus 7-5 to 7-6
  primary bus 7-3 to 7-4
conversion
  floating-point to integer 4-22 to 4-23
  integer to floating-point 4-24
  time to frequency domain (FFTs) 11-73 to 11-125
counter
  example 6-14
  register (timer) 8-3, 8-8
CPU 2-4 to 2-10
  block diagram 2-5
  general 2-4
  interrupt
    DMA interaction 6-30
    latency 6-30
    processing cycle 6-29
  interrupt flag register (IF) 3-9
  register file 2-7, 3-2 to 3-12
  registers 2-7 to 2-10, 3-2 to 3-12
    auxiliary (AR0–AR7) 2-8, 3-3
    block repeat (RS, RE) 3-11
    block size (BK) 2-9, 3-4
    CPU/DMA interrupt enable (IE) 3-7
    data-page pointer (DP) 2-9, 3-4
    extended precision (R0–R7) 2-8, 3-3
    I/O flag (IOF) 2-9, 3-10
    index (IR1, IR0) 2-9, 3-4
    interrupt enable (IE) 2-9, 3-7
    interrupt flag (IF) 2-9, 3-9
    list of 3-2
    program counter (PC) 2-10, 2-22, 3-11
    repeat count (RC) 2-10, 3-11, 6-2
    repeat end address (RE) 2-10, 3-11, 6-2
    repeat start address (RS) 2-10, 3-11, 6-2
    reserved bits 3-12
    status register (ST) 2-9, 3-4, 10-11
    system stack pointer (SP) 2-9, 3-4
  transfer, with serial-port transmit polling 8-38 to 8-39


current calculations D-26 to D-27
  average D-27
  data output D-26 to D-27
  processing D-26

D

D/A converter interface 12-23 to 12-26
D/A input/output system 12-32 to 12-35
DAC F-23
data
  converters F-17
  loads and stores 9-24 to 9-29
    operations with parallel stores 9-27 to 9-29
    parallel multiplies and adds 9-29
    three-operand instructions 9-24 to 9-27
    two-operand instructions 9-24
data formats 4-1 to 4-24
  floating-point formats 4-4 to 4-9
    conversion between formats 4-8 to 4-9
    extended-precision 4-6 to 4-7
    short 4-4 to 4-5
    single-precision 4-6
  floating-point to integer conversion 4-22 to 4-23
  floating-point addition and subtraction 4-14 to 4-17
  floating-point multiplication 4-10 to 4-13
  integer formats 4-2
    short 4-2
    single-precision 4-2 to 4-3
  integer to floating-point conversion 4-24
  normalization using NORM 4-18 to 4-19
  rounding with RND 4-20 to 4-21
  unsigned-integer formats 4-3
    short 4-3
    single-precision 4-3 to 4-4
data-page pointer (DP) register 2-9, 3-4
data-rate timing operation
  fixed 8-30
    burst mode 8-30
    continuous mode 8-30
  variable 8-34
    burst mode 8-34
    continuous mode 8-35
data-receive register 8-24
data-transmit register 8-23, 8-27, 8-30, 8-32
DBcond instruction 10-71 to 10-72
DBcondD instruction 10-73 to 10-74
debugger B-3
decode unit 9-2
decrement and branch conditionally (delayed) instruction 10-73
decrement and branch conditionally (standard) instruction 10-71
delayed branches 6-8 to 6-9, 11-17
  advantages 11-132
  conditional 6-8
  incorrectly placed 6-6
dependencies D-2 to D-3
dequeue (stacks) 5-31, 5-33
development support B-1 to B-10
  tools B-2 to B-6
    bulletin board service B-5 to B-6
    code generation tools B-2
      assembler/linker B-2
      C compiler B-2
      compiler B-2
      linker B-2
    digital filter design package B-2
    documentation B-5
    hotline B-5
    literature B-5
    seminars B-6
    system integration and debug tools B-3 to B-4
      debugger B-3
      emulation porting kit (EPK) B-4 to B-5
      emulator B-3
      evaluation module (EVM) B-3
      simulator B-3
      XDS510 emulator B-3
    technical training organization (TTO) workshop B-6
    third parties B-4
    workshops B-6
device suffixes B-9 to B-10
diagnostic applications 12-45 to 12-46
digital audio F-21
digital electronics F-20 to F-24
digital filter design package B-2
dimensions, 12-pin emulator connector 12-43 to 12-45
direct
  addressing 5-4
  memory access 2-29
disabled interrupts by branch 6-8
displacements 5-5


dissipation, power D-1 to D-32
  algorithm partitioning D-4
  dependencies D-2 to D-3
  FFT assembly code D-30 to D-32
  photo of IDD for FFT D-29
  power requirements D-2
  power supply current requirements D-2
  test setup description D-4 to D-5
  summary D-28
divide clock by 16 instruction 10-106
division 11-26 to 11-33
  floating-point 11-31 to 11-33
DMA
  architecture 2-29
  block moves 8-43, 11-25
  buses 2-22
  channel 9-2
  channel synchronization 8-54 to 8-56
  controller 2-22, 8-43 to 8-64
    block diagram 2-29
  destination register 8-49 to 8-53
  destination/source address register 8-47
  general 2-29
  initialization reconfiguration 8-57
  interrupt 8-56
    CPU interaction 6-30
    processing cycle 6-29
  interrupt-enable register 8-47 to 8-49
  maximum transfer rates 8-53
  memory transfer 8-49 to 8-53
  memory-mapped registers 8-43
  programming hints 8-57 to 8-58
  setup and use examples 8-58 to 8-64
  source register 8-49 to 8-53
  synchronization of channels 8-54 to 8-56
  timing
    expansion bus destination 8-52
    on-chip destination 8-50
    primary bus destination 8-51
  transfer-counter register 8-47
documentation v, vii, B-5
DR pins 8-20
dry pack C-7
dummy fetch 9-4
DX pins 8-19

E

electrical
  characteristics
    pinout and pin assignments 13-2 to 13-15
    signal descriptions 13-16 to 13-24
    signal transition levels 13-29
  specifications 13-25 to 13-28
emulation porting kit (EPK) B-4 to B-5
emulator B-3
  connection to target system 12-41 to 12-43
    MPSD mechanical dimensions 12-43 to 12-45
  connector, mechanical dimensions 12-43 to 12-45
  MPSD connector, 12-pin header 12-39 to 12-40
  pod interface 12-40
  signal buffering 12-41
emulator cable, signal timing, MPSD 12-40 to 12-41
emulator pod
  MPSD timings 12-41
  parameters 12-41
evaluation module (EVM) B-3
event counters 8-2
example circuit 12-13 to 12-46
example instruction 10-19 to 10-21
execute unit 9-2
expansion
  A-law 11-57
  bus. See expansion buses and external buses
  U-law 11-55
expansion buses 7-2
  functional timing of operations 7-6
  I/O cycles 7-11 to 7-32
  programmable wait states 7-28 to 7-29
expansion bus control register 7-5 to 7-6
expansion bus interface 12-19 to 12-26
  A/D converter 12-19
  D/A converter 12-23
  ready generation 12-9 to 12-13
    functions 12-11


extended-precision
  arithmetic 11-38 to 11-41
  floating-point format 4-6 to 4-7
    addition example 11-39
    multiplication example 11-40
    subtract example 11-39
extended-precision (R7–R0) registers 3-3
external
  buses (expansion, primary) 2-26, 7-1
    bank switching 12-13 to 12-18
    expansion bus interface 12-19 to 12-26
    external interrupts 2-26
    interlocked instructions 2-26
    primary bus interface 12-4 to 12-18
    ready generation 12-9 to 12-13
    wait states 12-9 to 12-13
  devices 12-3
  interfaces 12-2
external bus operation 2-26, 7-1 to 7-32
  external interface control registers 7-2 to 7-5
    expansion bus 7-5 to 7-6
    primary bus 7-3 to 7-4
  external interface timing
    expansion bus 7-6 to 7-27
    expansion-bus I/O cycles 7-11 to 7-32
    primary-bus cycles 7-6 to 7-10
  programmable bank switching 7-30 to 7-32
  programmable wait states 7-28 to 7-29
external interface, control registers 7-2 to 7-5
external interface timing 7-6 to 7-27
  expansion bus I/O cycles 7-11 to 7-32
  primary bus cycles 7-6 to 7-10
external interrupts 6-23
external memory loader header 3-30
external ready generation 12-10 to 12-11
external reset signal 6-18

F

fast Fourier transforms (FFT) 11-25, 11-73 to 11-125, D-26
fetch unit 9-2
FFT 11-73 to 11-125
FFT algorithms 5-29
  bit-reversed addressing 5-29
filters 11-58 to 11-67
  adaptive 11-67
  FIR 11-58 to 11-60
  IIR 11-60 to 11-66
  lattice 11-125 to 11-130
  LMS algorithm 11-67
FIR filters 5-28, 11-58 to 11-60
  circular addressing 5-28, 11-58
FIX and STI instructions (parallel) 10-77 to 10-78
FIX instruction 10-75 to 10-76
fixed data-rate timing operation, timing 8-30
  burst mode 8-30
  continuous mode 8-30
fixed point 1-4
flag
  carry 10-12
  condition
    floating-point underflow 10-11
    latched floating-point underflow 10-11
    latched overflow 10-11
    negative 10-11
    overflow 10-12
    zero 10-11
FLOAT and STF instructions (parallel) 10-80 to 10-81
FLOAT instruction 4-24, 10-79
floating point 1-4
  addition 4-14 to 4-17
    examples 4-16 to 4-18
  conversion to integer 4-22 to 4-23
  division 11-26, 11-31 to 11-33
  format 4-4 to 4-9
    conversion 4-8 to 4-9, 11-44 to 11-48, 11-49 to 11-52
    extended-precision 4-6 to 4-7
    IEEE definition 11-43
    short 4-4 to 4-5
    single-precision 4-6
    TMS320C3x definition 11-42 to 11-44
  IEEE to TMS320, 11-42 to 11-52
  inverse 11-31 to 11-33
  multiplication 4-10 to 4-13
    examples 4-12 to 4-14
    flowchart 4-11
  normalization 4-18 to 4-19
  normalized 4-14
  operation 4-1 to 4-24
  rounding value 4-20 to 4-21
  square root 11-34


floating point (continued)
  subtraction 4-14 to 4-17
    examples 4-16 to 4-18
  TMS320 to IEEE 11-42 to 11-52
  underflow 4-15
floating-point-to-integer conversion instruction 10-75
floating-point underflow condition flag 10-11
frame sync 8-32, 8-33
FSR pins 8-20
FSX pins 8-19
functional block diagram 1-5

G

general addressing modes 2-16, 5-19 to 5-20
general-purpose applications 1-4
generation, TMS320C3x DSPs 1-2
global memory 6-12, 6-15
global-control register 8-2
  DMA 8-47
    register bits 8-45 to 8-47
  serial port 8-13, 8-15 to 8-18
    bits summary 8-15 to 8-18
  timer 8-3 to 8-8
    register bits summary 8-4 to 8-6
GOTO 11-22

H

hardware applications 12-1 to 12-46
  expansion bus interface 12-19 to 12-26
    A/D converter 12-19 to 12-22
    D/A converter 12-23 to 12-27
  low-power mode interrupt interface 12-36 to 12-38
  primary bus interface 12-4 to 12-18
    bank switching techniques 12-13 to 12-19
    ready generation 12-9 to 12-13
    zero-wait-state to static-RAMs 12-4 to 12-8
  serial-port interface 12-32 to 12-35
  system configuration options 12-2 to 12-3
    categories of interfaces 12-2
    typical block diagram 12-3 to 12-4
  system control functions 12-27 to 12-31
    clock oscillator circuitry 12-27 to 12-29
    reset signal generation 12-29 to 12-39
  XDS target design considerations 12-39 to 12-46
    connections between emulator and target system 12-41 to 12-43
    diagnostic applications 12-45 to 12-46
    mechanical dimensions for emulator connector 12-43 to 12-45
    MPSD emulator cable signal timing 12-40 to 12-41
    MPSD emulator connector 12-39 to 12-40
hardware control 6-1
hardware reset 11-2
HDTV F-20
header
  12-pin 12-39
  dimensions
    mechanical 12-43 to 12-45
    12-pin header 12-39
  signal descriptions, 12-pin header 12-39
  straight, unshrouded 12-39
hints for assembly coding 11-131 to 11-132
hotline B-5

I

I/O flags register (IOF) 3-10
IACK instruction 6-29, 10-82
IDLE instruction 10-83
IDLE2 power management mode 6-36 to 6-37
IDLE2 instruction 10-84 to 10-85, 12-36 to 12-38
IE register bits summary, CPU register file 3-8
IF register bits summary, CPU register file 3-9
I/O flag register (IOF), CPU register file 3-10
IIR filters 11-60 to 11-66
illegal instructions 10-9
index (IR0,IR1) register 3-4
indirect addressing 5-5 to 5-16
  ARAUs 5-5
  auxiliary register 5-5
  parallel addressing mode 5-22
  three-operand addressing mode 5-21
  with postdisplacement 5-10
  with postindex 5-14 to 5-17
  with predisplacement 5-8 to 5-10
  with preindex 5-12 to 5-14


individual instructions 10-14 to 10-210
  example 10-19 to 10-21
  symbols and abbreviations 10-14 to 10-15
initialization
  DMA 8-57
  processor 11-2 to 11-5
input clock 12-27
instruction
  cache 3-21
  memory
    three-operand reads 9-24 to 9-27
    two-operand accesses 9-24
  opcodes A-1 to A-6
  register (IR) 2-22
instruction cache 2-12
instruction set 10-22 to 10-210
  categories 10-2
  example instruction 10-19 to 10-21
  summary
    alphabetical 2-17 to 2-21
    function listing 10-2 to 10-9
    table 2-17 to 2-21
instructions
  assembly language 10-1 to 10-18
  illegal 10-9
  interlocked operations 10-6
  load-and-store 10-2
  low-power control operations 10-5
  parallel operations 10-7 to 10-8
  program control 10-5
  three-operand 10-4
  two-operand 10-3
INT0–INT3 signals 3-18, 3-19, 6-24
integer
  division 11-26, 11-27 to 11-30
  format 4-2
    short integer 4-2
    signed 4-2
    single-precision integer 4-2
    unsigned 4-3
integer-to-floating-point conversion 4-24
  instruction 10-79
interfaces
  expansion bus 2-26, 12-19 to 12-26
    A/D converter interface 12-19 to 12-22
    D/A converter 12-23 to 12-26
  low-power-mode interrupt 12-36 to 12-38
  primary bus 2-26, 12-4 to 12-18
    See also primary bus interface
    bank switching techniques 12-13 to 12-19
    ready generation 12-9 to 12-13
    zero-wait-state to static RAMs 12-4 to 12-8
  serial port 12-32 to 12-35
  system control, clock circuitry 12-27 to 12-29
  types 12-2
interlocked operations 6-12 to 6-17
  busy-waiting loop 6-14
  external flag pins (XF0, XF1) 6-12
  instructions 6-13
  loads and stores 6-12
  multiprocessor counter 6-14
interlocked operations instructions 10-6
internal
  bus operation 2-22
  clock 8-10
internal circuitry current requirement D-5 to D-8
  internal bus operations D-6 to D-9
  internal operations D-5
  quiescent D-5
internal interrupts 6-23
interrupt 6-23 to 6-35
  acknowledge instruction 10-82
  enable (IE) register 3-7
    bits summary 3-8
  flag (IF) register 3-9
    bits summary 3-9
interrupts 2-26
  considerations (’C3x) 6-31 to 6-34
  context switching 11-11 to 11-15
    context restore for ’C3x 11-14 to 11-16
    context save for ’C3x 11-12 to 11-13
  control bits 6-26 to 6-27
    global control register 6-27
    interrupt enable register (IE) 6-26
    interrupt flag register (IF) 6-26
    status register (ST) 6-26
  CPU/DMA interaction 6-30
  DMA 8-56
  flag register behavior 6-27
  latency (CPU) 6-29 to 6-30
  prioritization and control 6-25 to 6-26, 6-34 to 6-35, 11-16
  processing 6-27 to 6-30


interrupts (continued) load floating-point conditional instruction 10-88


serial port 8-29 load floating-point exponent instruction 10-86
receive timer 8-29 load floating-point mantissa instruction 10-104
receiver 8-29
load floating-point interlocked instruction 10-87
transmit timer 8-29
load integer conditionally instruction 10-97
transmitter 8-29
service routines 11-9 load integer instruction 10-95
example 11-16 load integer, interlocked instruction 10-99
timer 8-2, 8-11 load-and-store instructions 10-2
vectors 3-18, 3-19, 6-35 loader mode selection 3-30
table 6-23 to 6-25 logical operations 11-23 to 11-34
inverse 11-31 to 11-33 bit manipulation 11-23 to 11-24
inverse lattice filter 11-126 bit-reversed addressing 11-25 to 11-26
IOF register bits summary, CPU register file 3-11 block moves 11-25
IOSTRB signal 7-2, 7-6 extended-precision arithmetic 11-38 to 11-41
floating-point format conversion 11-42 to 11-52
integer and floating-point division
K 11-26 to 11-33
square root 11-34
key features logical shift instruction 10-107
’C30 1-6 3-operand instruction 10-109
’C31 1-8 long-immediate addressing 2-16, 5-17
looping 11-18 to 11-21
L block repeat 11-18 to 11-20
single-instruction repeat 11-20 to 11-26
latched floating-point overflow and underflow LOPOWER instruction 10-106
condition flags 10-11 LOPOWER mode 6-38
lattice filters 11-125 to 11-130 low-power control instructions 10-5
LDE instruction 10-86 low-power idle instruction 10-84
LDF and LDF instructions (parallel) 10-91 to 10-92 low-power-mode interrupt interface 12-36 to 12-38
LDF and STF instructions (parallel) 10-93 to 10-94 low-power-mode wakeup example
LDF instruction 10-87 11-133 to 11-134
LDFcond instruction 10-88 to 10-89 LRU cache update 3-21
LDFI instruction 10-90 LSH instruction 10-107 to 10-108
LDI and LDI instructions (parallel) LSH3 and STI instructions (parallel)
10-100 to 10-101 10-112 to 10-114
LDI and STI instructions (parallel) LSH3 instruction 10-109 to 10-111
10-102 to 10-103
LDI instruction 10-95 to 10-96 M
LDIcond instruction 10-97 to 10-98
LDII instruction 10-99 matrix-vector multiplication 11-70
LDM instruction 10-104 MAXSPEED instruction 10-115
LDP instruction 10-105 memory 2-11, 3-13, 3-21
accesses (pipeline) clocking 9-23 to 9-29
linker B-2
addressing modes 2-16
literature v to viii, B-5 cache 2-11, 3-21, 11-132
LMS algorithm filters 11-67 See also cache
load data page pointer instruction 10-105 DMA memory transfer 8-49 to 8-53


memory (continued) MPYI3 instruction 10-128 to 10-129


general organization 2-11 MSTRB signal 7-2, 7-6
global 6-12, 6-15 multimedia applications F-2 to F-4
maps 2-13, 3-13, 3-17 multimedia-related devices F-4
’C30 2-14, 3-15 system design considerations F-2 to F-3
’C31 2-15, 3-16
multiple processors 6-12
microcomputer mode 3-13
microprocessor mode 3-13 multiplication
pipeline conflicts 9-10 to 9-17 floating-point 4-10
execute only 9-13 to 9-15 examples 4-12 to 4-14
hold everything 9-15 to 9-17 flowchart 4-11
program fetch incomplete 9-12 matrix-vector 11-70 to 11-73
program wait 9-10 to 9-13 multiplier 2-6
resolving 9-21 to 9-22 multiply floating-point instruction 10-116
quick access 11-132 3-operand instruction 10-117
memory addressing multiply integer instruction 10-127
modes 2-16 3-operand instruction 10-128
parallel multiplies and adds 9-29 multiprocessor support 6-12
three-operand instructions 9-24
two-operand instructions 9-24
memory maps N
’C30 2-14, 3-15 negative condition flag 10-11
’C31 2-15, 3-16
negative floating-point instruction 10-139
memory organization, block diagram 2-12
negative integer instruction 10-142
microcomputer mode 2-13, 3-14, 3-17
negative integer with borrow instruction 10-138
microcomputer/boot loader mode 3-17
NEGB instruction 10-138
microprocessor mode 2-13, 3-13, 3-17
NEGF and STF instructions (parallel)
modem applications F-17 to F-19 10-140 to 10-141
MPSD emulator NEGF instruction 10-139
buffered transmission signals 12-42
NEGI and STI instructions (parallel)
cable signal timing 12-40 to 12-41
10-143 to 10-144
connector 12-39 to 12-40
no signal buffering 12-41 NEGI instruction 10-142
MPYF instruction 9-4, 10-116 nested block repeats 6-7
MPYF3 and ADDF3 instructions (parallel) no operation instruction 10-145
10-119 to 10-121 NOP instruction 10-145
MPYF3 and STF instructions (parallel) NORM instruction 4-18 to 4-19, 10-146 to 10-147
10-122 to 10-123 normalization, floating-point value 4-14,
MPYF3 and SUBF3 instructions (parallel) 4-18 to 4-19
10-124 to 10-126 normalize instruction 10-146
MPYF3 instruction 10-117 to 10-118 NOT and STI instructions (parallel)
MPYI instruction 10-127 10-149 to 10-150
MPYI3 and ADDI3 instructions (parallel) NOT instruction 10-148
10-130 to 10-132
MPYI3 and STI instructions (parallel)
10-133 to 10-134
O
MPYI3 and SUBI3 instructions (parallel) operations with parallel stores 9-27 to 9-29
10-135 to 10-137 optional assembler syntax 10-16 to 10-18


options overview (system configuration) 12-2 parallel MPYF3 and ADDF3 instructions
OR instruction 10-151 10-119 to 10-121
OR3 and STI instructions (parallel) parallel MPYF3 and STF instructions
10-154 to 10-155 10-122 to 10-123
parallel MPYF3 and SUBF3 instructions
OR3 instruction 10-152 to 10-153
10-124 to 10-126
ordering information B-7 to B-10
parallel MPYI3 and ADDI3 instructions
ORing of the ready signals 12-9 to 12-10 10-130 to 10-132
output driver circuitry current parallel MPYI3 and STI instructions
requirement D-9 to D-17 10-133 to 10-134
capacitive load dependence D-16 to D-18 parallel MPYI3 and SUBI3 instructions
data dependency D-14 to D-16 10-135 to 10-137
expansion bus D-13 to D-14 parallel multiplies and adds 9-29
primary bus D-10 to D-12 parallel NEGF and STF instructions
output value formats 10-10 10-140 to 10-141
overflow 4-15, 4-22 parallel NEGI and STI instructions
overflow condition flag 10-12 10-143 to 10-144
parallel NOT and STI instructions
10-149 to 10-150
P parallel operations instructions 10-7 to 10-8
parallel OR3 and STI instructions
parallel ABSF and STF instructions 10-23 to 10-24 10-154 to 10-155
parallel ABSI and STI instructions 10-27 to 10-28 parallel STF and ABSF instructions 10-23 to 10-24
parallel ADDF3 and MPYF3 instructions parallel STF and ADDF3 instructions
10-119 to 10-121 10-35 to 10-36
parallel ADDF3 and STF instructions parallel STF and FLOAT instructions
10-35 to 10-36 10-80 to 10-81
parallel ADDI3 and MPYI3 instructions parallel STF and LDF instructions 10-93 to 10-94
10-130 to 10-132 parallel STF and MPYF3 instructions
parallel ADDI3 and STI instructions 10-40 to 10-41 10-122 to 10-123
parallel STF and NEGF instructions
parallel addressing modes 2-16, 5-21 to 5-22
10-140 to 10-141
parallel AND3 and STI instructions 10-45 to 10-46 parallel STF and STF instructions
parallel ASH3 and STI instructions 10-54 to 10-55 10-176 to 10-177
parallel bus 12-19 parallel STF and SUBF3 instructions
See also expansion bus interface 10-190 to 10-191
parallel FIX and STI instructions 10-77 to 10-78 parallel STI and ABSI instructions 10-27 to 10-28
parallel FLOAT and STF instructions parallel STI and ADDI3 instructions 10-40 to 10-41
10-80 to 10-81 parallel STI and AND3 instructions 10-45 to 10-46
parallel instruction set summary 2-23 to 2-24 parallel STI and ASH3 instructions 10-54 to 10-55
parallel instructions advantages 11-132 parallel STI and FIX instructions 10-77 to 10-78
parallel STI and LDI instructions 10-102 to 10-103
parallel LDF and LDF instructions 10-91 to 10-92
parallel STI and LSH3 instructions
parallel LDF and STF instructions 10-93 to 10-94
10-112 to 10-114
parallel LDI and LDI instructions 10-100 to 10-101 parallel STI and MPYI3 instructions
parallel LDI and STI instructions 10-102 to 10-103 10-133 to 10-134
parallel LSH3 and STI instructions parallel STI and NEGI instructions
10-112 to 10-114 10-143 to 10-144


parallel STI and NOT instructions peripherals, DMA controller (continued)


10-149 to 10-150 synchronization of DMA channels
parallel STI and OR3 instructions 8-54 to 8-56
10-154 to 10-155 transfer-counter register 8-47
serial ports 8-13 to 8-42
parallel STI and STI instructions 10-180 to 10-181
data-transmit register 8-23
parallel STI and SUBI3 instructions data-receive register 8-24
10-195 to 10-196 FSR/DR/CLKR port control register 8-20
parallel STI and XOR3 instructions FSX/DX/CLKX port control register
10-209 to 10-210 8-18 to 8-19
parallel SUBF3 and MPYF3 instructions functional operation 8-30 to 8-36
10-124 to 10-126 global-control register 8-15 to 8-18
initialization/reconfiguration 8-36
parallel SUBF3 and STF instructions
interrupt sources 8-29
10-190 to 10-191
operation configurations 8-24 to 8-26
parallel SUBI3 and MPYI3 instructions receive/transmit timer control register
10-135 to 10-137 8-21 to 8-22
parallel SUBI3 and STI instructions receive/transmit timer counter register 8-22
10-195 to 10-196 receive/transmit timer period register 8-23
parallel XOR3 and STI instructions timing 8-26 to 8-29
10-209 to 10-210 TMS320C3x interface examples
part numbers B-7 to B-10 8-36 to 8-46
breakdown of numbers B-9 to B-10 timers 8-2 to 8-12
device suffixes B-9 to B-10 global-control register 8-3 to 8-8
prefix designators B-8 to B-9 initialization/reconfiguration 8-12 to 8-15
interrupts 8-11
part ordering B-1 to B-10 operation modes 8-10 to 8-11
PC-relative addressing 5-17 to 5-18 period and counter registers 8-8
period register (timer) 8-2, 8-8 pulse generation 8-8 to 8-9
peripheral bus 2-27 pin
general architecture 2-27 assignments 13-6, 13-7
map 3-20 states at reset 6-19
peripherals on pinout and pin assignments 13-2 to 13-15
DMA controller 8-43 to 8-64 PGA 13-2 to 13-7
serial port 2-28, 8-13 to 8-42 PQFP
timers 2-28, 8-2 ’C30 13-8 to 13-11
register diagram 2-27 ’C31 13-12 to 13-15
peripheral modules, block diagram 2-27 pipeline
peripherals 2-27, 8-1 to 8-64 conflicts 9-4 to 9-17
DMA controller 8-43 to 8-64 avoiding 11-132
CPU/DMA interrupt enable register delayed branches 9-6
8-47 to 8-49 registers 9-7 to 9-9
destination- and source-address registers standard branches 9-4 to 9-6
8-47 memory accesses clocking 9-23 to 9-30
global-control register 8-47 memory conflicts 9-10 to 9-17
hints for programming 8-57 to 8-58 execute only 9-13 to 9-15
initialization/reconfiguration 8-57 hold everything 9-15 to 9-17
interrupts 8-56 program fetch incomplete 9-12
memory transfer operation 8-49 to 8-53 program wait 9-10 to 9-13
programming examples 8-58 to 8-64 resolving 9-21 to 9-22


pipeline (continued) program


operation 9-1 to 9-30 buses 2-22
clocking of memory accesses 9-23 to 9-30 counter (PC) 2-22, 3-11
data loads and stores 9-24 to 9-30 fetches 9-23
program fetches 9-23 flow 6-1
branch conflicts 9-4 to 9-6
memory conflicts 9-10 to 9-23 program control 11-6
register conflicts 9-7 to 9-9 computed GOTOs 11-22 to 11-23
resolving memory conflicts 9-21 to 9-22 delayed branches 11-17
resolving register conflicts 9-18 to 9-20 instructions 10-5
structure 9-2 to 9-3 interrupt service routines 11-9 to 11-16
pod interface, emulator 12-40 context switching 11-11 to 11-16
example 11-16
POP floating-point instruction 10-157 priority 11-16
POP integer instruction 10-156 repeat modes 11-18 to 11-21
POPF instruction 10-157 block repeat 11-18 to 11-20
single-instruction repeat 11-20 to 11-26
power dissipation D-1 to D-32 software stack 11-8 to 11-9
algorithm partitioning D-4 subroutines 11-6 to 11-8
characteristics D-2 to D-4
dependencies D-2 to D-3 program fetch incomplete 9-12
FFT assembly code D-30 to D-32 program flow control 6-1 to 6-38
photo of IDD for FFT D-29 calls, traps, and returns 6-10 to 6-11
power requirements D-2 delayed branches 6-8 to 6-9
power supply current requirements D-2 interlocked operations 6-12 to 6-17
summary D-28 interrupts 6-23 to 6-35
test setup description D-4 to D-5 control bits 6-26 to 6-27
power supply current requirements D-2 CPU interrupt latency 6-30
PQFP reflow soldering precautions C-7 to C-8 CPU/DMA interaction 6-30
prioritization 6-25 to 6-26
prefix designators B-8 to B-9 prioritization and control 6-34 to 6-36
primary bus 7-2 processing 6-27 to 6-30
See also external buses TMS320C30 considerations 6-32 to 6-34
bus cycles 7-6 to 7-10 TMS320C3x considerations 6-31 to 6-32
control register 7-3 to 7-4 vector table 6-23 to 6-25
functional timing of operations 7-6 repeat modes 6-2 to 6-7
programmable bank switching 7-31 nested block repeats 6-7 to 6-23
programmable wait states 7-28 to 7-29 RC register value after repeat mode
ready generation, segmentation of address 6-6 to 6-7
space 12-11 repeat-mode control bits 6-3
primary bus interface 2-26, 12-4 to 12-18 repeat-mode operation 6-3 to 6-4
bank switching techniques 12-13 to 12-19 restrictions 6-6
ready generation 12-9 to 12-13 RPTB instruction 6-4 to 6-5
ANDing of the ready signals 12-10 RPTS instruction 6-5
example circuit 12-13 to 12-46 reset operation 6-18 to 6-22
external ready generation 12-10 to 12-11 TMS320LC31 power management
ORing of the ready signals 12-9 to 12-10 mode 6-36 to 6-38
ready control logic 12-11 to 12-12 IDLE2 6-36 to 6-37
zero-wait-state to static-RAMs 12-4 to 12-8 LOPOWER 6-38
processor initialization 11-2 to 11-5 program wait 9-10 to 9-13


programmable registers (continued)


bank switching 7-30 to 7-32 conflicts (resolving) 9-18 to 9-20
wait states 7-28 to 7-29 counter (timer) 8-8
programming tips 11-131 to 11-134 CPU interrupt flag (IF) 3-9
C-callable routines 11-131 CPU/DMA interrupt-enable (IE) 3-7,
hints for assembly coding 11-131 to 11-132 8-47 to 8-49
low-power mode wakeup example data-page pointer (DP) 3-4
11-133 to 11-134 destination, extended-precision registers
pulse mode (R0–R7) 6-8
timer interrupt 8-11 destination register (R7–R0)
timer pulse generator 8-8 to 8-9 condition flags 10-20
PUSH floating-point instruction 10-159 DMA
destination and source address 8-47
PUSH integer instruction 10-158
global-control register 8-47
PUSHF instruction 10-159 transfer-counter register 8-47
extended precision (R0–R7) 2-8, 3-3
FSR/DR/CLKR serial port control 8-20
Q FSX/DX/CLKX serial port control 8-18
functional groups 9-7
quality C-1 to C-8
I/O flag (IOF) 2-9, 3-10
queue (stacks) 5-31, 5-33
index (IR0, IR1) 2-9, 3-4
interrupt enable (IE) 2-9
R interrupt flag (IF) 2-9, 6-33
maximum use 11-132
RAM. See memory memory-mapped peripheral 3-20
period (timer) 8-8
RC register value 6-6 to 6-7
program counter (PC) 2-10, 2-22, 3-11
read unit 9-2 receive/transmit timer control 8-21
ready control logic 12-11 to 12-12 repeat
ready generation 12-9 to 12-13 count (RC) 2-10
ANDing of the ready signals 12-10 count address (RC) 6-2
example circuit 12-13 to 12-46 end address (RE) 2-10, 6-2
external ready generation 12-10 to 12-11 start address (RS) 2-10, 6-2
functions 12-11 repeat mode operation 6-3 to 6-4
ORing of the ready signals 12-9 to 12-10 reserved bits 3-12
ready control logic 12-11 to 12-12 serial port 8-13 to 8-42
receive shift register (RSR) 8-24 serial port global-control 8-15 to 8-18
receive/transmit timer bits summary 8-15 to 8-18
control register (serial port) 8-21 to 8-22 status (ST) 3-4
counter register (serial port) 8-22 status register (ST) 2-9, 10-11
period register (serial port) 8-23 system stack pointer (SP) 2-9, 3-4, 5-31
reflow soldering precautions C-7 to C-8 timer global-control 8-3
register addressing 5-3 reliability C-1 to C-8
register conflicts 9-7 to 9-9 stress testing C-2 to C-6
register file, CPU 2-7 repeat
registers count register (RC) 3-11, 6-2
auxiliary (AR7–AR0) 3-3 end address register (RE) 3-11, 6-2
block size (BK) 2-9, 3-4, 5-24 mode 6-2 to 6-7, 11-18 to 11-21
buses 2-22 block repeat 11-18 to 11-20


repeat, mode (continued)


control bits 6-3 S
maximum number of repeats 6-3
nested block repeats 6-7 scan paths, TBC emulation connections for ’C3x
operation 6-3 to 6-4 12-46
RC register value 6-6 to 6-7 segment start address (SSA) 3-21
restrictions 6-6 segmentation of address space 12-11
RPTB instruction 6-4 to 6-5 semaphores 6-15
RPTS instruction 6-5
seminars B-6
single-instruction repeat 11-20 to 11-26
start address register (RS) 3-11, 6-2 serial port 8-13 to 8-42
repeat block instruction 10-170 clock 8-13, 8-27
timer 8-37
reserved area, unpredictable results 2-13 timing 8-26 to 8-29
reserved memory locations clock configurations 8-24 to 8-26
TMS320C31, 2-31 continuous transmit and receive mode 8-28
reset 3-17 CPU transfer with transmit polling 8-38 to 8-39
operation 6-18 to 6-22 data-receive register 8-24
pin states 6-19 data-transmit register 8-23
vectors 3-18, 3-19, 6-35 fixed data-rate timing 8-30
RESET signal, generation 12-29 to 12-31 burst mode 8-30
continuous mode 8-30
resolving register conflicts 9-18 to 9-20
frame sync 8-32, 8-33
restore clock to regular speed instruction 10-115 functional operation 8-30 to 8-36
RETIcond instruction 6-10, 10-160 to 10-161 global-control register 8-13, 8-15 to 8-18
RETScond instruction 6-10, 10-162 bits summary 8-15 to 8-18
return from interrupt conditionally instruction handshake mode 8-16, 8-28 to 8-30, 8-37, 8-38
10-160 direct connect 8-29
initialization reconfiguration 8-36 to 8-42
return from subroutine 6-10
interface 12-32 to 12-35
return from subroutine conditionally instruction handshake mode example 8-37 to 8-38
10-162 serial AIC interface example 8-40
returns 6-10 to 6-11 serial A/D and D/A interface example
RINT0, RINT1 signals 3-18, 3-19, 6-24 8-40 to 8-46
RND instruction 10-163 to 10-164 interrupt sources 8-29
receive timer 8-29
ROL instruction 10-165
receiver 8-29
ROLC instruction 10-166 to 10-167 transmit timer 8-29
ROM. See memory transmitter 8-29
ROR instruction 10-168 operation configurations 8-24 to 8-26
RORC instruction 10-169 port control register
FSR/DR/CLKR 8-20
rotate left instruction 10-165
FSR/DR/CLKR bits summary 8-20
rotate left through carry instruction 10-166 FSX/DX/CLKX 8-18 to 8-19
rotate right instruction 10-168 FSX/DX/CLKX bits summary 8-19
rotate right through carry instruction 10-169 receive/transmit timer
round floating-point instruction 10-163 control register 8-21 to 8-22
counter register 8-22
rounding of floating-point value 4-20 to 4-21
period register 8-23
RPTB instruction 6-4 to 6-5, 10-170 registers 8-13, 8-42
RPTS instruction 6-5, 10-171 to 10-172 timing 8-26 to 8-29


serial-port loading 3-33 software applications, logical and arithmetic


servo control/disk drive applications F-14 to F-16 operations (continued)
integer and floating-point division
servo control-related devices F-16
11-26 to 11-33
short-immediate addressing 5-16 to 5-17 square root 11-34
SIGI instruction 10-173 processor initialization 11-2
signal program control 11-6 to 11-22
descriptions 13-16 to 13-24 computed GOTOs 11-22 to 11-23
’C30 13-16 to 13-21 delayed branches 11-17
’C31 13-22 to 13-29 interrupt service routines 11-9 to 11-16
transition levels 13-29 repeat modes 11-18 to 11-21
TTL-level inputs 13-29 to 13-30 software stack 11-8 to 11-9
TTL-level outputs 13-29 subroutines 11-6 to 11-8
programming tips 11-131 to 11-134
signal buffering for emulator connections 12-41
C-callable routines 11-131
signal descriptions 13-1, 13-16 to 13-24 hints for assembly coding 11-131 to 11-132
pinout and pin assignments 13-2 to 13-15 low-power-mode wakeup example
signal, interlocked instruction 10-173 11-133 to 11-134
signals software control 6-1
12-pin header 12-39 software development tools B-2 to B-6
buffered 12-39, 12-43 bulletin board service (BBS) B-5 to B-6
buffering for emulator connections code generation tools B-2
12-41 to 12-43 assembler/linker B-2
no buffering 12-41 C compiler B-2
timing 12-40 to 12-41 compiler B-2
signed-precision, unsigned integer format 4-3 linker B-2
simulator B-3 digital filter design package B-2
documentation B-5
single-instruction repeat 11-20 to 11-21
hotline B-5
single-precision literature B-5
floating-point format 4-6 seminars B-6
integer format 4-2 system integration and debug tools B-3 to B-4
16-bit-wide configured memory 3-32 debugger B-3
software applications 11-1 to 11-34 emulation porting kit (EPK) B-4 to B-5
application-oriented operations 11-53 to 11-67 emulator B-3
adaptive filters 11-67 evaluation module (EVM) B-3
companding 11-53 to 11-57 simulator B-3
fast Fourier transforms (FFT) XDS510 emulator B-3
11-73 to 11-125 technical training organization (TTO) workshop B-6
FIR filters 11-58 to 11-60
IIR filters 11-60 to 11-66 third parties B-4
lattice filters 11-125 to 11-131 workshops B-6
matrix-vector multiplication 11-70 to 11-73 software interrupt instruction 10-200
logical and arithmetic operations 11-23 to 11-34 software stack 11-8 to 11-9
bit manipulation 11-23 to 11-24
bit-reversed addressing 11-25 to 11-26 soldering precautions C-7 to C-8
block moves 11-25 speech
extended-precision arithmetic 11-38 to 11-41 encoding F-3
floating-point format conversion memories F-12
11-42 to 11-53 synthesis applications F-11 to F-13


square root 11-34 STI and MPYI3 instructions (parallel)


stack, software 11-8 to 11-9 10-133 to 10-134
pointer (SP) register 3-4, 5-31, 11-8 to 11-9 STI and NEGI instructions (parallel)
stack management 5-31 to 5-34 10-143 to 10-144
STI and NOT instructions (parallel)
stack queues 5-33
10-149 to 10-150
stacks 5-32 to 5-33
STI and OR3 instructions (parallel)
growth 5-32
10-154 to 10-155
implementation of high-to-low 5-32
implementation of low-to-high 5-33 STI and STI instructions (parallel)
10-180 to 10-181
standard branches 6-8
STI and SUBI3 instructions (parallel)
status register (ST) 3-4, 10-11 10-195 to 10-196
bits summary 3-6
STI and XOR3 instructions (parallel)
CPU register file 3-5
10-209 to 10-210
global interrupt enable (GIE) bit
’C30 interrupt considerations 6-32 STI instruction 10-178
’C3x interrupt considerations 6-31 STII instruction 10-179
STF and ABSF instructions (parallel) store floating-point instruction 10-174
10-23 to 10-24 store floating-point, interlocked instruction 10-175
STF and ADDF3 instructions (parallel) store integer instruction 10-178
10-35 to 10-36 store integer, interlocked instruction 10-179
STF and FLOAT instructions (parallel) STRB signal 7-2, 7-6
10-80 to 10-81 stress testing C-2 to C-6
STF and LDF instructions (parallel) 10-93 to 10-94 style (manual) viii
STF and MPYF3 instructions (parallel) SUBB instruction 10-182
10-122 to 10-123 SUBB3 instruction 10-183 to 10-184
STF and NEGF instructions (parallel) SUBC instruction 10-185 to 10-186
10-140 to 10-141 SUBF instruction 10-187
STF and STF instructions (parallel) SUBF3 and MPYF3 instructions (parallel)
10-176 to 10-177 10-124 to 10-126
STF and SUBF3 instructions (parallel) SUBF3 and STF instructions (parallel)
10-190 to 10-191 10-190 to 10-191
STF instruction 10-174 SUBF3 instruction 10-188 to 10-189
STFI instruction 10-175 SUBI instruction 10-192
SUBI3 and MPYI3 instructions (parallel)
STI and ABSI instructions (parallel) 10-27 to 10-28
10-135 to 10-137
STI and ADDI3 instructions (parallel)
SUBI3 and STI instructions (parallel)
10-40 to 10-41
10-195 to 10-196
STI and AND3 instructions (parallel) SUBI3 instruction 10-193 to 10-194
10-45 to 10-46
SUBRB instruction 10-197
STI and ASH3 instructions (parallel)
SUBRF instruction 10-198
10-54 to 10-55
SUBRI instruction 10-199
STI and FIX instructions (parallel) 10-77 to 10-78
subroutines
STI and LDI instructions (parallel) computed GOTO 11-22
10-102 to 10-103 context switching 11-11 to 11-15
STI and LSH3 instructions (parallel) context restore for ’C3x 11-14 to 11-16
10-112 to 10-114 context save for ’C3x 11-12 to 11-13


subroutines (continued) technical training organization (TTO) workshop


interrupt priority 11-16 to 11-18 B-6
program control 11-6 to 11-8 telecommunications applications F-5 to F-10
runtime select 11-20 to 11-21
telecommunications-related devices F-7
subtract example 11-39
test bit fields instruction 10-203
subtract floating-point instruction 10-187 3-operand instruction 10-204
3-operand instruction 10-188
test bus controller 12-45
subtract integer conditionally instruction 10-185
test load circuit 13-28
subtract integer instruction 10-192
test setup description D-4 to D-5
3-operand instruction 10-193
third parties B-4
subtract integer with borrow instruction 10-182
3-operand instruction 10-183 32-bit-wide configured memory 3-32
subtract reverse floating-point instruction 10-198 three-operand addressing modes 2-16,
5-20 to 5-21
subtract reverse integer instruction 10-199
three-operand instructions 10-4
subtract reverse integer with borrow instruction
10-197 timer 2-28
control register 8-11
supply current calculations D-26 to D-27
receive/transmit 8-21 to 8-22
average D-27
counter register 8-8
data output D-26 to D-27
receive/transmit 8-22
experimental results D-27
global-control register 8-3 to 8-8
processing D-26
bits summary 8-4 to 8-6
SWI instruction 10-200 I/O port configurations 8-10
symbols (used in manual) viii initialization/reconfiguration 8-12 to 8-15
symbols and abbreviations 10-14 to 10-15 interrupts 8-11
synchronize two processors example 6-17 operation modes 8-10 to 8-11
output generation examples 8-9
syntaxes, assembler 10-16 to 10-18 period register 8-2, 8-8
system receive/transmit 8-23
control functions 12-27 to 12-31 pulse generation 8-8 to 8-9
clock oscillator circuitry 12-27 to 12-29 registers 8-42
reset signal generation 12-29 to 12-31 timing figure 8-7
integration 2-32 timers 8-2 to 8-12
system configuration counter 8-2
categories of interfaces 12-2 timing
options overview 12-2 to 12-3 external interface 7-6 to 7-27
typical system block diagram 12-3 to 12-4 expansion bus I/O cycles 7-11 to 7-32
system management 5-31 to 5-34 primary bus cycles 7-6 to 7-10
system stack pointer 5-31 parameters 13-30 to 13-67
changing the XF pin from an input to an
output 13-44
T changing the XF pin from an output to an
input 13-43
target, system, connection 12-39 to 12-46 data rate timing modes 13-55 to 13-60
target cable 12-39, 12-43 general-purpose I/O timing 13-63 to 13-65
peripheral pin I/O modes 13-63 to 13-65
target system, connection to emulator peripheral pin I/O timing 13-63
12-41 to 12-43 interrupt acknowledge timing 13-54
technical assistance x interrupt response timing 13-52 to 13-53


timing, parameters (continued) TMS320LC31 power management


loading when the XF pin is configured as an modes 6-36 to 6-38
output 13-42 IDLE2 6-36 to 6-37
memory read/write timing 13-32 to 13-37 LOPOWER 6-38
reset timing 13-45 to 13-50 total supply current calculation D-18 to D-25
SHZ pin timing 13-51 average current D-22
timer pin timing 13-66 to 13-67 average current versus peak current D-22
X2/CLKIN, H1, and H3 13-30 to 13-31 combining D-18 to D-19
XF0 and XF1 timing when executing LDFI or dependencies D-19 to D-20
LDII 13-38 to 13-39 design equation D-21 to D-22
XF0 and XF1 timing when executing SIGI peak current D-22
13-41 thermal management considerations
XF0 and XF1 timing when executing STFI or D-23 to D-25
STII 13-40
trap conditionally instruction 10-201
TINT0, TINT1 signals 3-18, 3-19, 6-24 trap vectors 3-18, 3-19
TLC32046, F-3 TRAPcond instruction 6-10, 10-201 to 10-202
TLC32070, F-16 traps 3-17, 6-10 to 6-11
interrupt considerations
TMS320 ’C30 6-32 to 6-34
DSP evolution 1-3 ’C3x 6-31
family, general description 1-2
TSTB instruction 10-203
TMS320C30 TSTB3 instruction 10-204 to 10-205
FFT assembly code D-30 to D-32
two-operand instructions 10-3
memory maps 2-14
photo of IDD for FFT D-29
power dissipation D-1 to D-32
summary D-28 U
TMS320C30 and TMS320C31 differences 2-30 U-law compression 11-54
data/program bus differences 2-30 U-law expansion 11-55
development considerations 2-31
underflow 4-14
effects on the IF and IE interrupt registers 2-31
reserved memory locations 2-30 unsigned-integer format 4-3
serial-port differences 2-30 short 4-3
user program/data ROM 2-31 single-precision 4-3
user state management 5-31
TMS320C31
interrupt and trap memory maps 3-34
memory maps 2-15
reserved memory locations 2-31 V
TMS320C3x block diagram variable data-rate timing operation 8-34
architectural 2-3 burst mode 8-34
functional 1-5 continuous mode 8-35
TMS320C3x DSPs 1-1 to 1-2 vectors
interrupts 3-17, 6-35
TMS320C3x family, general description 1-2
reset 3-17, 6-35
TMS320C3x interfaces 12-1 trap 3-17
TMS320C3x video signal processing F-21
serial-port interface examples 8-36 to 8-42 voice synthesizers F-11


W
wait states
external bus 12-9 to 12-13
programmable 7-28 to 7-29
zero 12-4 to 12-8
workshops B-6

X
XDS, target design considerations 12-39 to 12-46
connections between emulator and target system 12-41 to 12-43
designing MPSD emulator connector 12-39 to 12-40
diagnostic applications 12-45 to 12-46
mechanical dimensions of emulator connector 12-43 to 12-45
MPSD emulator cable signal timing 12-40 to 12-41
XDS510 emulator B-3
XF0, XF1 signals 2-26
XINT0, XINT1 signals 3-18, 3-19, 6-24
XOR instruction 10-206
XOR3 and STI instructions (parallel) 10-209 to 10-210
XOR3 instruction 10-207 to 10-208

Z
zero condition flag 10-11
zero-logic interconnect of ’C3x 6-16
zero-overhead looping 6-2
zero-wait-states 12-4 to 12-8
