TMS320C3x
User’s Guide
2558539-9721 revision J
October 1994
IMPORTANT NOTICE
Texas Instruments (TI) reserves the right to make changes to its products or to discontinue any
semiconductor product or service without notice, and advises its customers to obtain the latest
version of relevant information to verify, before placing orders, that the information being relied
on is current.
TI warrants performance of its semiconductor products and related software to the specifications
applicable at the time of sale in accordance with TI’s standard warranty. Testing and other quality
control techniques are utilized to the extent TI deems necessary to support this warranty.
Specific testing of all parameters of each device is not necessarily performed, except those
mandated by government requirements.
Certain applications using semiconductor products may involve potential risks of death,
personal injury, or severe property or environmental damage (“Critical Applications”).
Inclusion of TI products in such applications is understood to be fully at the risk of the customer.
Use of TI products in such applications requires the written approval of an appropriate TI officer.
Questions concerning potential risk applications should be directed to TI through a local SC
sales office.
In order to minimize risks associated with the customer’s applications, adequate design and
operating safeguards should be provided by the customer to minimize inherent or procedural
hazards.
Preface
Notational Conventions
This document uses the following conventions:
- Braces ( { and } ) indicate a list. The symbol | (read as or) separates items
within the list. Here’s an example of a list:
{ * | *+ | *– }
This provides three choices: *, *+, or *–.
Unless the list is enclosed in square brackets, you must choose one item
from the list.
Information About Cautions
The information in a caution is provided for your information. Please read each
caution carefully.
Trademarks
ABEL is a registered trademark of Data I/O Corporation.
CodeView, MS, MS-DOS, MS-Windows, and Presentation Manager are trademarks of
Microsoft Corp.
DEC, Digital DX, Ultrix, VAX, and VMS are trademarks of Digital Equipment Corp.
HPGL is a registered trademark of Hewlett-Packard Co.
Macintosh and MPW are trademarks of Apple Computer Corp.
Micro Channel, OS/2, PC-DOS, and PGA are trademarks of IBM Corp.
SPARC, Sun 3, Sun 4, Sun Workstation, SunView, and SunWindows are trademarks
of Sun Microsystems, Inc.
UNIX is a registered trademark of UNIX Systems Laboratories, Inc.
Contents
1 Introduction
A general description of the TMS320C30 and TMS320C31, their key features, and typical
applications.
1.1 General Description
1.2 TMS320C30 Key Features
1.3 TMS320C31 Key Features
1.4 Typical Applications
5 Addressing
Operation, encoding, and implementation of addressing modes. Format descriptions. System
stack management.
5.1 Types of Addressing
5.1.1 Register Addressing
5.1.2 Direct Addressing
5.1.3 Indirect Addressing
5.1.4 Short-Immediate Addressing
5.1.5 Long-Immediate Addressing
5.1.6 PC-Relative Addressing
5.2 Groups of Addressing Modes
5.2.1 General Addressing Modes
5.2.2 Three-Operand Addressing Modes
5.2.3 Parallel Addressing Modes
5.2.4 Conditional-Branch Addressing Modes
5.3 Circular Addressing
5.4 Bit-Reversed Addressing
5.5 System and User Stack Management
5.5.1 System Stack Pointer
5.5.2 Stacks
5.5.3 Queues
Figures
1–1 TMS320 Device Evolution
1–2 TMS320C3x Block Diagram
2–1 TMS320C3x Block Diagram
2–2 Central Processing Unit (CPU)
2–3 Memory Organization
2–4 TMS320C30 Memory Maps
2–5 TMS320C31 Memory Maps
2–6 Peripheral Modules
2–7 DMA Controller
3–1 Extended-Precision Register Floating-Point Format
3–2 Extended-Precision Register Integer Format
3–3 Status Register
3–4 CPU/DMA Interrupt Enable Register (IE)
3–5 CPU Interrupt-Flag Register (IF)
3–6 I/O-Flag Register (IOF)
3–7 TMS320C30 Memory Maps
3–8 TMS320C31 Memory Maps
3–9 Reset, Interrupt, and Trap-Vector Locations for the TMS320C30/TMS320C31 Microprocessor Mode
3–10 Interrupt and Trap Branch Instructions for the TMS320C31 Microcomputer Mode
3–11 Peripheral Bus Memory Map
3–12 Instruction Cache Architecture
3–13 Address Partitioning for Cache Control Algorithm
3–14 Boot-Loader-Mode Selection Flowchart
3–15 Boot-Loader Memory-Load Flowchart
3–16 Boot-Loader Serial-Port Load-Mode Flowchart
4–1 Short-Integer Format and Sign Extension of Short Integers
4–2 Single-Precision Integer Format
4–3 Short Unsigned-Integer Format and Zero Fill
4–4 Single-Precision Unsigned-Integer Format
4–5 Generic Floating-Point Format
4–6 Short Floating-Point Format
4–7 Single-Precision Floating-Point Format
4–8 Extended-Precision Floating-Point Format
4–9 Converting From Short Floating-Point Format to Single-Precision Floating-Point Format
4–10 Converting From Short Floating-Point Format to Extended-Precision Floating-Point Format
7–13 Memory Write and I/O Write for Expansion Bus
7–14 Memory Write and I/O Read for Expansion Bus
7–15 I/O Write and Memory Write for Expansion Bus
7–16 I/O Write and Memory Read for Expansion Bus
7–17 I/O Read and Memory Write for Expansion Bus
7–18 I/O Read and Memory Read for Expansion Bus
7–19 I/O Write and I/O Read for Expansion Bus
7–20 I/O Write and I/O Write for Expansion Bus
7–21 I/O Read and I/O Read for Expansion Bus
7–22 Inactive Bus States for IOSTRB
7–23 Inactive Bus States for STRB and MSTRB
7–24 HOLD and HOLDA Timing
7–25 BNKCMP Example
7–26 Bank-Switching Example
8–1 Timer Block Diagram
8–2 Memory-Mapped Timer Locations
8–3 Timer Global-Control Register
8–4 Timer Modes as Defined by CLKSRC and FUNC
8–5 Timer Timing
8–6 Timer Output Generation Examples
8–7 Timer I/O Port Configurations
8–8 Serial-Port Block Diagram
8–9 Memory-Mapped Locations for the Serial Ports
8–10 Serial-Port Global-Control Register
8–11 FSX/DX/CLKX Port-Control Register
8–12 FSR/DR/CLKR Port-Control Register
8–13 Receive/Transmit Timer-Control Register
8–14 Receive/Transmit Timer-Counter Register
8–15 Receive/Transmit Timer-Period Register
8–16 Transmit Buffer Shift Operation
8–17 Receive Buffer Shift Operation
8–18 Serial-Port Clocking in I/O Mode
8–19 Serial-Port Clocking in Serial-Port Mode
8–20 Data Word Format in Handshake Mode
8–21 Single Zero Sent as an Acknowledge Bit
8–22 Direct Connection Using Handshake Mode
8–23 Fixed Burst Mode
8–24 Fixed Continuous Mode With Frame Sync
8–25 Fixed Continuous Mode Without Frame Sync
8–26 Exiting Fixed Continuous Mode Without Frame Sync, FSX Internal
8–27 Variable Burst Mode
8–28 Variable Continuous Mode With Frame Sync
8–29 Variable Continuous Mode Without Frame Sync
8–30 TMS320C3x Zero-Glue-Logic Interface to TLC3204x Example
12–25 Signals Between the Emulator and the ’C3x With No Signals Buffered
12–26 Signals Between the Emulator and the ’C3x With Transmission Signals Buffered
12–27 All Signals Buffered
12–28 Pod/Connector Dimensions
12–29 12-Pin Connector Dimensions
12–30 TBC Emulation Connections for ’C3x Scan Paths
13–1 TMS320C30 Pinout (Top View)
13–2 TMS320C30 Pinout (Bottom View)
13–3 TMS320C30 181-Pin PGA Dimensions—GEL Package
13–4 TMS320C30 PPM Pinout (Top View)
13–5 TMS320C30 PPM 208-Pin Plastic Quad Flat Pack—PQL Package
13–6 TMS320C31 Pinout (Top View)
13–7 TMS320C31 132-Pin Plastic Quad Flat Pack—PQL Package
13–8 Test Load Circuit
13–9 TTL-Level Outputs
13–10 TTL-Level Inputs
13–11 Timing for X2/CLKIN
13–12 Timing for H1/H3
13–13 Timing for Memory ((M)STRB = 0) Read
13–14 Timing for Memory ((M)STRB = 0) Write
13–15 Timing for Memory (IOSTRB = 0) Read
13–16 Timing for Memory (IOSTRB = 0) Write
13–17 Timing for XF0 and XF1 When Executing LDFI or LDII
13–18 Timing for XF0 When Executing an STFI or STII
13–19 Timing for XF0 and XF1 When Executing SIGI
13–20 Timing for Loading XF Register When Configured as an Output Pin
13–21 Timing for Change of XF From Output to Input Mode
13–22 Timing for Change of XF From Input to Output Mode
13–23 Timing for RESET
13–24 CLKIN to H1/H3 as a Function of Temperature
13–25 CLKIN to H1/H3 as a Function of Temperature
13–26 CLKIN to H1/H3 as a Function of Temperature
13–27 Timing for SHZ Pin
13–28 Timing for INT3–INT0 Response
13–29 Timing for IACK
13–30 Timing for Fixed Data Rate Mode
13–31 Timing for Variable Data Rate Mode
13–32 Timing for HOLD/HOLDA
13–33 Timing for Peripheral Pin General-Purpose I/O
13–34 Timing for Change of Peripheral Pin From General-Purpose Output to Input Mode
13–35 Timing for Change of Peripheral Pin From General-Purpose Input to Output Mode
13–36 Timing for Timer Pin
B–1 TMS320 Device Nomenclature
Tables
1–1 Typical Applications of the TMS320 Family
2–1 CPU Registers
2–2 Instruction Set Summary
2–3 Parallel Instruction Set Summary
2–4 Feature Set Comparison
2–5 TMS320C31 Reserved Memory Locations
3–1 CPU Registers
3–2 Status Register Bits Summary
3–3 IE Register Bits Summary
3–4 IF Register Bits Summary
3–5 IOF Register Bits Summary
3–6 Combined Effect of the CE and CF Bits
3–7 Loader Mode Selection
3–8 External Memory Loader Header
3–9 TMS320C31 Interrupt and Trap Memory Maps
5–1 CPU Register Address/Assembler Syntax and Function
5–2 Indirect Addressing
5–3 Index Steps and Bit-Reversed Addressing
6–1 Repeat-Mode Registers
6–2 Interlocked Operations
6–3 Pin Operation at Reset
6–4 Reset, Interrupt, and Trap-Vector Locations for the TMS320C30/TMS320C31 Microprocessor Mode
6–5 Reset, Interrupt, and Trap-Vector Locations for the TMS320C31 Microcomputer Boot Mode
6–6 Reset and Interrupt Vector Priorities
6–7 Interrupt Latency
6–8 Reset and Interrupt Vector Locations
7–1 Primary-Bus Control Register Bits Summary
7–2 Expansion-Bus Control Register Bits Summary
7–3 Wait-State Generation When SWW = 0 0
7–4 Wait-State Generation When SWW = 0 1
7–5 Wait-State Generation When SWW = 1 0
7–6 Wait-State Generation When SWW = 1 1
7–7 BNKCMP and Bank Size
8–1 Timer Global-Control Register Bits Summary
8–2 Result of a Write of Specified Values of GO and HLD
Examples
3–1 Byte-Wide Configured Memory
3–2 16-Bit-Wide Configured Memory
3–3 32-Bit-Wide Configured Memory
4–1 Floating-Point Multiply (Both Mantissas = –2.0)
4–2 Floating-Point Multiply (Both Mantissas = 1.5)
4–3 Floating-Point Multiply (Both Mantissas = 1.0)
4–4 Floating-Point Multiply Between Positive and Negative Numbers
4–5 Floating-Point Multiply by 0
4–6 Floating-Point Addition
4–7 Floating-Point Subtraction
4–8 Floating-Point Addition With a 32-Bit Shift
4–9 Floating-Point Addition/Subtraction With Floating-Point 0
4–10 NORM Instruction
5–1 Direct Addressing
5–2 Auxiliary Register Indirect
5–3 Indirect With Predisplacement Add
5–4 Indirect With Predisplacement Subtract
5–5 Indirect With Predisplacement Add and Modify
5–6 Indirect With Predisplacement Subtract and Modify
5–7 Indirect With Postdisplacement Add and Modify
5–8 Indirect With Postdisplacement Subtract and Modify
5–9 Indirect With Postdisplacement Add and Circular Modify
5–10 Indirect With Postdisplacement Subtract and Circular Modify
5–11 Indirect With Preindex Add
5–12 Indirect With Preindex Subtract
5–13 Indirect With Preindex Add and Modify
5–14 Indirect With Preindex Subtract and Modify
5–15 Indirect With Postindex Add and Modify
5–16 Indirect With Postindex Subtract and Modify
5–17 Indirect With Postindex Add and Circular Modify
5–18 Indirect With Postindex Subtract and Circular Modify
5–19 Indirect With Postindex Add and Bit-Reversed Modify
5–20 Short-Immediate Addressing
5–21 Long-Immediate Addressing
5–22 PC-Relative Addressing
5–23 Circular Addressing
Chapter 1
Introduction
General Description
The TMS320’s internal busing and special DSP instruction set have the speed
and flexibility to execute at up to 50 MFLOPS. The TMS320 family optimizes
speed by implementing functions in hardware that other processors implement
through software or microcode. This hardware-intensive approach provides
power previously unavailable on a single chip.
The emphasis on total system cost has resulted in a less expensive processor
that can be designed into systems currently using costly bit-slice processors.
Also, cost/performance selection is provided by the different processors in the
TMS320C3x generation:
All of these processors are described in this user’s guide. Essentially, their
functionality is the same. However, electrical and timing characteristics vary
(as described in Chapter 13); part numbering information is found in Section
B.2 on page B-7. Throughout this book, TMS320C3x is used to refer to the
TMS320C30 and TMS320C31 and all speed variations. TMS320C30 and
TMS320C31 are used to refer to all speed variants of those processors where
appropriate. Special references, such as TMS320C30-40, are used to note
specific exceptions.
[Figure: TMS320 family devices plotted by performance (MIPS/MFLOPS) against
generation. Generations and devices shown:
TMS320C1x: TMS320C10, TMS320C10-14/-25, TMS320C14, TMS320E14/P14,
TMS320C15/LC15, TMS320E15/P15, TMS320C15-25, TMS320E15-25, TMS320C16,
TMS320C17/LC17, TMS320E17/P17
TMS320C2x: TMS320C25, TMS320E25, TMS320C25-33, TMS320C25-50, TMS320C26
TMS320C5x: TMS320C50, TMS320C51, TMS320C52, TMS320C53
TMS320C3x: TMS320C30, TMS320C30-27, TMS320C30-40, TMS320C31, TMS320C31-27,
TMS320C31-40, TMS320C31PQA, TMS320C31-50, TMS320LC31
TMS320C4x: TMS320C40, TMS320C40-40]
The TMS320C30 and TMS320C31 can perform parallel multiply and arithmetic
logic unit (ALU) operations on integer or floating-point data in a single cycle.
The processor also possesses a general-purpose register file, a program
cache, dedicated auxiliary register arithmetic units (ARAU), internal dual-access
memories, one DMA channel supporting concurrent I/O, and a short
machine-cycle time. High performance and ease of use are products of those
features.
Figure 1–2 is a functional block diagram that shows the interrelationships
between the various TMS320C3x key components.
[Figure 1–2: functional block diagram. Blocks shown: multiplier, ALU,
8 extended-precision registers, 8 auxiliary registers, 12 control registers,
address generators 0 and 1, controller, peripheral bus, serial port 1, timer 0,
and timer 1. Pins shown: IACK, XF1–0, MCBL/MP, X1, X2/CLKIN, VDD, VSS, SHZ.
A legend marks some blocks as available on TMS320C30, TMS320C30-27, and
TMS320C30-40.]
TMS320C30 Key Features
- Performance
  - TMS320C30 (33 MHz)
    - 60-ns, single-cycle instruction execution time
    - 33.3 MFLOPS
    - 16.7 MIPS
  - TMS320C30-27
    - 74-ns, single-cycle instruction execution time
    - 27 MFLOPS
    - 13.5 MIPS
  - TMS320C30-40
    - 50-ns, single-cycle instruction execution time
    - 40 MFLOPS
    - 20 MIPS
- 24-bit addresses
- Two address generators with eight auxiliary registers and two auxiliary
register arithmetic units
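The performance figures listed above follow from the single-cycle instruction
time. The sketch below is illustrative only: it assumes that the peak MFLOPS
rating counts two floating-point operations per cycle (one multiply and one ALU
operation issued in parallel), and the function names are hypothetical.

```python
# Sketch: relate the single-cycle instruction time to the MIPS and MFLOPS ratings.
# Assumption: peak MFLOPS = 2 floating-point operations per instruction cycle
# (a multiply and an ALU operation executed in parallel).

def mips_from_cycle_ns(cycle_ns):
    """Millions of instructions per second for a given cycle time in ns."""
    return 1000.0 / cycle_ns

def peak_mflops(cycle_ns, flops_per_cycle=2):
    """Peak millions of floating-point operations per second."""
    return flops_per_cycle * mips_from_cycle_ns(cycle_ns)

if __name__ == "__main__":
    for name, cycle_ns in [("TMS320C30", 60), ("TMS320C30-27", 74), ("TMS320C30-40", 50)]:
        print(f"{name}: {mips_from_cycle_ns(cycle_ns):.1f} MIPS, "
              f"{peak_mflops(cycle_ns):.1f} MFLOPS")
```

The listed 74-ns part is rated at 27 MFLOPS and 13.5 MIPS; the sketch gives the
same values after rounding to one decimal place.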
TMS320C31 Key Features
  - TMS320C31-27
    - 74-ns, single-cycle instruction execution time
    - 27 MFLOPS
    - 13.5 MIPS
  - TMS320C31-40
    - 50-ns, single-cycle instruction execution time
    - 40 MFLOPS
    - 20 MIPS
  - TMS320C31-50
    - 40-ns, single-cycle instruction execution time
    - 50 MFLOPS
    - 25 MIPS
  - TMS320LC31
    - 60-ns, single-cycle instruction execution time
    - 33.3 MFLOPS
    - 16.7 MIPS
    - Low-power, 3.3-volt operation
    - Two power-down modes: 2-MHz operation and idle
Typical Applications
Telecommunications Automotive
Chapter 2
TMS320C3x Architecture
Architectural Overview
Figure 2–1. TMS320C3x Block Diagram
[Block diagram. Memory: cache (64 × 32), RAM block 0 (1K × 32), RAM block 1
(1K × 32), ROM block (4K × 32). Buses: PDATA, PADDR, DDATA, DADDR1, DADDR2,
DMADATA, DMAADDR, CPU1, CPU2, REG1, REG2. CPU: multiplier, 32-bit barrel
shifter, ALU, extended-precision registers (R7–R0), ARAU0 and ARAU1 with
DISP0, IR0, IR1, and BK, auxiliary registers (AR0–AR7), other registers (12),
IR, PC, controller. DMA controller: global control, source address,
destination address, and transfer counter registers. Peripherals: serial
port 0 and serial port 1 (port control, R/X timer, data transmit, and data
receive registers), timer 0 and timer 1 (global control, timer period, and
timer counter registers), primary and expansion port control. External
signals include RDY, HOLD, HOLDA, STRB, R/W, D31–D0, A23–A0, XRDY, MSTRB,
IOSTRB, XR/W, XD31–XD0, XA12–XA0, RESET, INT3–0, IACK, MC/MP, XF1–0, X1,
X2/CLKIN, H1, H3, EMU6–0, RSV10–0, TCLK0, TCLK1, the serial port pins, and
the power and ground pins.]
- Floating-point/integer multiplier
- Arithmetic logic unit (ALU) for performing floating-point, integer, and
logical operations
Figure 2–2 shows the various CPU components that are discussed in the
succeeding subsections.
Central Processing Unit (CPU)
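[Figure 2–2: CPU block diagram. The DADDR1, DADDR2, and DDATA buses connect
through a multiplexer to the CPU1 and CPU2 buses; the CPU1, CPU2, REG1, and
REG2 buses feed the multiplier (32-bit inputs, 40-bit result) and the 32-bit
barrel shifter and ALU (40-bit paths), which exchange data with the
extended-precision registers (R0–R7). ARAU0 and ARAU1, together with BK,
operate on the auxiliary registers (AR0–AR7) over 24-bit and 32-bit paths.
Other registers (12) are attached over 32-bit paths.]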
2.2.1 Multiplier
The multiplier performs single-cycle multiplications on 24-bit integer and 32-bit
floating-point values. The TMS320C3x implementation of floating-point arithmetic
allows for floating-point operations at fixed-point speeds via a 50-ns
instruction cycle and a high degree of parallelism. To gain even higher
throughput, you can use parallel instructions to perform a multiply and ALU
operation in a single cycle.
When the multiplier performs floating-point multiplication, the inputs are 32-bit
floating-point numbers, and the result is a 40-bit floating-point number. When
the multiplier performs integer multiplication, the input data is 24 bits and yields
a 32-bit result. Refer to Chapter 4 for detailed information on data formats and
floating-point operation.
The register names and assigned functions are listed in Table 2–1. Following
the table, the function of each register or group of registers is briefly described.
Refer to Chapter 3 for detailed information on each of the CPU registers.
R0 Extended-precision register 0
R1 Extended-precision register 1
R2 Extended-precision register 2
R3 Extended-precision register 3
R4 Extended-precision register 4
R5 Extended-precision register 5
R6 Extended-precision register 6
R7 Extended-precision register 7
DP Data-page pointer
IR0 Index register 0
IR1 Index register 1
BK Block size
SP System stack pointer
ST Status register
IE CPU/DMA interrupt enable
IF CPU interrupt flags
IOF I/O flags
The 32-bit auxiliary registers (AR7–AR0) can be accessed by the CPU and
modified by the two ARAUs. The primary function of the auxiliary registers is
the generation of 24-bit addresses. They can also be used as loop counters
or as 32-bit general-purpose registers that can be modified by the multiplier
and ALU. Refer to Chapter 5 for detailed information and examples of the use
of auxiliary registers in addressing.
The data page pointer (DP) is a 32-bit register. The eight LSBs of the data
page pointer are used by the direct addressing mode as a pointer to the page
of data being addressed. Data pages are 64K words long, with a total of 256
pages.
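As an illustration of the page arithmetic described above, the following sketch
forms a 24-bit direct address from the eight LSBs of DP and an offset within a
64K-word page. The 16-bit offset width is inferred from the page size, and the
function name is hypothetical.

```python
# Sketch: form a 24-bit direct address from DP and a page offset.
# Assumption: the offset within a page is 16 bits wide (64K words per page).

PAGE_COUNT = 256          # eight LSBs of DP select one of 256 pages
PAGE_SIZE = 64 * 1024     # each page is 64K words long

def direct_address(dp, offset):
    """Concatenate the eight LSBs of DP with a 16-bit page offset."""
    page = dp & 0xFF                  # only the eight LSBs of DP are used
    return (page << 16) | (offset & 0xFFFF)

if __name__ == "__main__":
    # Page 80h, offset 9800h addresses the first word of RAM block 0.
    print(hex(direct_address(0x80, 0x9800)))
```

The 256 pages of 64K words together cover the full 16M-word address space.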
The 32-bit index registers (IR0, IR1) contain the value used by the ARAU to
compute an indexed address. Refer to Chapter 5 for examples of the use of
index registers in addressing.
The ARAU uses the 32-bit block size register (BK) in circular addressing to
specify the data block size.
The system stack pointer (SP) is a 32-bit register that contains the address
of the top of the system stack. The SP always points to the last element pushed
onto the stack. A push performs a preincrement of the system stack pointer;
a pop performs a postdecrement. The SP is manipulated by interrupts, traps,
calls, returns, and the PUSH and POP instructions. Refer to Section 5.5 for
information about system stack management.
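The push and pop behavior described above can be modeled in a few lines. This
is a minimal sketch, not a description of the hardware implementation; the
class name and the dictionary used as memory are hypothetical.

```python
# Sketch: system stack with preincrement push and postdecrement pop.
# The SP always points to the last element pushed onto the stack.

class SystemStack:
    def __init__(self, sp):
        self.sp = sp        # address of the current top of stack
        self.memory = {}    # hypothetical word-addressed memory

    def push(self, value):
        self.sp += 1                  # preincrement, then store
        self.memory[self.sp] = value

    def pop(self):
        value = self.memory[self.sp]  # read, then postdecrement
        self.sp -= 1
        return value

if __name__ == "__main__":
    stack = SystemStack(sp=0x809800)
    stack.push(0x1234)
    print(hex(stack.sp))       # SP now addresses the pushed word
    print(hex(stack.pop()))
```

Because the push preincrements, the SP should be initialized to the address just
below the first stack location to be used.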
The status register (ST) contains global information relating to the state of the
CPU. Operations usually set the condition flags of the status register according
to whether the result is 0, negative, etc. This includes register load and
store operations as well as arithmetic and logical functions. When the status
register is loaded, however, a bit-for-bit replacement is performed with the
contents of the source operand, regardless of the state of any bits in the source
operand. Therefore, following a load, the contents of the status register are
identical to the contents of the source operand. This allows the status register
to be easily saved and restored. See Table 3–2 for a list and definitions of the
status register bits.
The CPU/DMA interrupt enable register (IE) is a 32-bit register. The CPU
interrupt enable bits are in locations 10–0. The DMA interrupt enable bits are
in locations 26–16. A 1 in a CPU/DMA interrupt enable register bit enables the
corresponding interrupt. A 0 disables the corresponding interrupt. Refer to
subsection 3.1.8 for bit definitions.
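The following sketch builds an IE register value from interrupt names. The bit
names and positions are taken from the IE register layout in Chapter 3 (CPU
enables in bits 10–0, the same names repeated for the DMA in bits 26–16); the
helper function itself is hypothetical.

```python
# Sketch: compose a value for the CPU/DMA interrupt enable register (IE).
# CPU enable bits occupy locations 10-0; the DMA enable bits use the same
# order, offset by 16 (locations 26-16).

CPU_BIT = {
    "EINT0": 0, "EINT1": 1, "EINT2": 2, "EINT3": 3,
    "EXINT0": 4, "ERINT0": 5, "EXINT1": 6, "ERINT1": 7,
    "ETINT0": 8, "ETINT1": 9, "EDINT": 10,
}
DMA_OFFSET = 16

def ie_value(cpu=(), dma=()):
    """Return the IE value that enables the named CPU and DMA interrupts."""
    value = 0
    for name in cpu:
        value |= 1 << CPU_BIT[name]
    for name in dma:
        value |= 1 << (CPU_BIT[name] + DMA_OFFSET)
    return value

if __name__ == "__main__":
    # Enable external interrupt 0 and timer 0 for the CPU, serial-port 0
    # receive for the DMA.
    print(hex(ie_value(cpu=["EINT0", "ETINT0"], dma=["ERINT0"])))
```

A 1 in a bit position enables the corresponding interrupt, so a value of 0
disables every CPU and DMA interrupt.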
The CPU interrupt flag register (IF) is also a 32-bit register (see subsection
3.1.9). A 1 in a CPU interrupt flag register bit indicates that the corresponding
interrupt is set. A 0 indicates that the corresponding interrupt is not set.
The I/O flags register (IOF) controls the function of the dedicated external
pins, XF0 and XF1. These pins may be configured for input or output and may
also be read from and written to. See subsection 3.1.10 for detailed
information.
The repeat counter (RC) is a 32-bit register used to specify the number of
times a block of code is to be repeated when performing a block repeat. When
the processor is operating in the repeat mode, the 32-bit repeat start address
register (RS) contains the starting address of the block of program memory
to be repeated, and the 32-bit repeat end address register (RE) contains the
ending address of the block to be repeated.
The program counter (PC) is a 32-bit register containing the address of the
next instruction to be fetched. Although the PC is not part of the CPU register
file, it is a register that can be modified by instructions that modify the program
flow.
Memory Organization
Figure 2–3. Memory Organization
[Block diagram. The cache (64 × 32), RAM block 0 (1K × 32), RAM block 1
(1K × 32), and ROM block (4K × 32) connect to the PDATA, PADDR, DDATA,
DADDR1, DADDR2, DMADATA, and DMAADDR buses. Multiplexers link these buses to
the primary external interface (RDY, HOLD, HOLDA, STRB, R/W, D31–D0, A23–A0)
and the expansion interface (XRDY, MSTRB, IOSTRB, XR/W, XD31–XD0, XA12–XA0),
and to the peripheral bus. The program counter/instruction register, the CPU,
and the DMA controller are the bus masters.]
Refer to Chapter 3 for detailed information about the memory and instruction
cache.
Section 3.2 on page 3-13 describes the memory maps in greater detail and
provides the peripheral bus map and vector locations for reset, interrupts, and
traps.
TMS320C30 memory maps

Microprocessor mode
  0h–03Fh          Reset, interrupt, trap vectors, and reserved locations (64)
                   (external STRB active)
  040h–7FFFFFh     External, STRB active
  800000h–801FFFh  Expansion bus, MSTRB active (8K words)
  802000h–803FFFh  Reserved (8K words)
  804000h–805FFFh  Expansion bus, IOSTRB active (8K words)
  806000h–807FFFh  Reserved (8K words)

Microcomputer mode
  0h–0BFh          Reset, interrupt, trap vectors, and reserved locations (192)
  0C0h–0FFFh       ROM (internal)
  1000h–7FFFFFh    External, STRB active
  800000h–801FFFh  Expansion bus, MSTRB active (8K words)
  802000h–803FFFh  Reserved (8K words)
  804000h–805FFFh  Expansion bus, IOSTRB active (8K words)
  806000h–807FFFh  Reserved (8K words)
TMS320C31 memory maps

Microprocessor mode
  0h–03Fh          Reset, interrupt, trap vectors, and reserved locations (64)
                   (external STRB active)
  040h–7FFFFFh     External, STRB active
  800000h–807FFFh  Reserved (32K words)
  808000h–8097FFh  Peripheral bus memory-mapped registers (6K words internal)
  809800h–809BFFh  RAM block 0 (1K word internal)
  809C00h–809FFFh  RAM block 1 (1K word internal)
  80A000h–FFFFFFh  External, STRB active

Microcomputer/boot-loader mode
  0h–FFFh          Reserved for boot-loader operations (see Section 3.4)
  1000h–7FFFFFh    External, STRB active (Boot 1 starts at 1000h, Boot 2 at
                   400000h)
  800000h–807FFFh  Reserved (32K words)
  808000h–8097FFh  Peripheral bus memory-mapped registers (6K words internal)
  809800h–809BFFh  RAM block 0 (1K word internal)
  809C00h–809FC0h  RAM block 1 (1K word minus 63, internal)
  809FC1h–809FFFh  User program interrupt and trap branches (63 words internal)
  80A000h–FFFFFFh  External, STRB active (Boot 3 starts at FFF000h)
Five groups of addressing modes are provided on the TMS320C3x. Six types
of addressing can be used within the groups, as shown in the following list:
Instruction Set Summary
Table 2–2 lists the TMS320C3x instruction set in alphabetical order. Each
table entry shows the instruction mnemonic, description, and operation. Refer
to Chapter 10 for a functional listing of the instructions and individual
instruction descriptions.
Table 2–2. Instruction Set Summary
ROLC Rotate left through carry Dreg rotated left 1 bit through carry → Dreg
RORC Rotate right through carry Dreg rotated right 1 bit through carry → Dreg
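The two rotate-through-carry operations in the rows above can be sketched as
follows. This is an illustrative model only: it assumes a 32-bit register
value and treats the carry as a single bit passed in and returned explicitly;
the function names mirror the mnemonics but are not an API of any tool.

```python
# Sketch: rotate a 32-bit value one bit through the carry flag.
# The carry acts as a 33rd bit: the bit shifted out becomes the new carry
# and the old carry is shifted in at the opposite end.

MASK32 = 0xFFFFFFFF

def rolc(value, carry):
    """Rotate left through carry; returns (new_value, new_carry)."""
    new_carry = (value >> 31) & 1
    new_value = ((value << 1) & MASK32) | (carry & 1)
    return new_value, new_carry

def rorc(value, carry):
    """Rotate right through carry; returns (new_value, new_carry)."""
    new_carry = value & 1
    new_value = ((value & MASK32) >> 1) | ((carry & 1) << 31)
    return new_value, new_carry

if __name__ == "__main__":
    value, carry = rolc(0x80000001, 0)
    print(hex(value), carry)
    print([hex(x) for x in rorc(value, carry)])
```

Applying rorc to the result of rolc restores the original value and carry,
which is a convenient check on the model.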
The PC is connected to the 24-bit program address bus (PADDR). The instruction
register (IR) is connected to the 32-bit program data bus (PDATA). These
buses can fetch a single instruction word every machine cycle.
The 24-bit data address buses (DADDR1 and DADDR2) and the 32-bit data
bus (DDATA) support two data memory accesses every machine cycle.
The DDATA bus carries data to the CPU over the CPU1 and CPU2 buses. The
CPU1 and CPU2 buses can carry two data memory operands to the multiplier,
ALU, and register file every machine cycle. Also internal to the CPU are
register buses REG1 and REG2, which can carry two data values from the register
file to the multiplier and ALU every machine cycle. Figure 2–2 shows the buses
internal to the CPU section of the processor.
The DMA controller is supported with a 24-bit address bus (DMAADDR) and
a 32-bit data bus (DMADATA). These buses allow the DMA to perform memory
accesses in parallel with the memory accesses occurring from the data and
program buses.
Parallel Instruction Set Summary
2.8 Peripherals
All TMS320C3x peripherals are controlled through memory-mapped registers
on a dedicated peripheral bus. This peripheral bus is composed of a 32-bit data
bus and a 24-bit address bus. This peripheral bus permits straightforward
communication to the peripherals. The TMS320C3x peripherals include two
timers and two serial ports (only one serial port is available on the
TMS320C31). Figure 2–6 shows the peripherals with associated buses and
signals. Refer to Chapter 8 for detailed information on the peripherals.
[Figure 2–6: peripheral modules attached to the peripheral data bus and the
peripheral address bus. Serial port registers: port control register, R/X timer
register, data transmit register, data receive register; pins FSX1, DX1, CLKX1,
FSR1, DR1, CLKR1, and CLKR0. Timer 0 and timer 1: global control register,
timer period register, timer counter register; pins TCLK0 and TCLK1. A legend
marks blocks that are available on the TMS320C30.]
2.8.1 Timers
The two timer modules are general-purpose 32-bit timer/event counters with
two signaling modes and internal or external clocking. Each timer has an I/O
pin that can be used as an input clock to the timer or as an output signal driven
by the timer. The pin can also be configured as a general-purpose I/O pin.
Direct Memory Access (DMA)
Table 2–4 shows these differences, which are detailed in the following
subsections.
Serial I/O ports 1 serial port (SP0) 2 serial ports (SP0, SP1)
TMS320C30 and TMS320C31 Differences
Chapter 3
The central processing unit (CPU) register file contains 28 registers that can
be operated on by the multiplier and arithmetic logic unit (ALU). Included in the
register file are the auxiliary registers, extended-precision registers, and index
registers. The registers in the CPU register file support addressing,
floating-point/integer operations, stack management, processor status, block
repeats, and interrupts.
The TMS320C3x provides a total memory space of 16M (million) 32-bit words
containing program, data, and I/O space. Two RAM blocks of 1K x 32 bits each
and a ROM block of 4K x 32 bits (available only on the TMS320C30) permit
two CPU accesses in a single cycle. The memory maps for the microcomputer
and microprocessor modes are similar, except that the on-chip ROM is not
used in the microprocessor mode.
This chapter describes in detail each of the CPU registers, the memory maps,
and the instruction cache. Major topics are as follows:
CPU Register File
Bits 39–32: exponent (e)
Bits 31–0: mantissa, consisting of the sign (s) followed by the fraction (f)
Figure 3–3 shows the format of the status register. Table 3–2 defines the
status register bits, their names, and their functions.
Bit     15  14  13   12   11   10   9   8    7    6    5    4    3    2    1    0
Field   xx  xx  GIE  CC   CE   CF   xx  RM   OVM  LUF  LV   UF   N    Z    V    C
Access          R/W  R/W  R/W  R/W      R/W  R/W  R/W  R/W  R/W  R/W  R/W  R/W  R/W
1† V 0 Overflow flag
2† Z 0 Zero flag
3† N 0 Negative flag
7 OVM 0 Overflow mode flag. This flag affects only the integer operations. If OVM
= 0, the overflow mode is turned off; integer results that overflow are
treated in no special way. If OVM = 1,
a) integer results overflowing in the positive direction are set to the
most positive 32-bit twos-complement number (7FFFFFFFh), and
b) integer results overflowing in the negative direction are set to the
most negative 32-bit twos-complement number (80000000h).
Note that the function of V and LV is independent of the setting of OVM.
9 Reserved 0 Read as 0
12 CC 0 Cache clear. CC = 1 invalidates all entries in the cache. This bit is always
cleared after it is written to and thus always read as 0. At reset, 0 is
written to this bit.
13 GIE 0 Global interrupt enable. If GIE = 1, the CPU responds to an enabled
interrupt. If GIE = 0, the CPU does not respond to an enabled interrupt.
† The seven condition flags (ST bits 6–0) are defined in Section 10.2 on page -10.
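The effect of the OVM bit on integer results can be sketched as follows. The
saturation values come from the table entry for OVM; modeling the OVM = 0 case
as plain truncation to 32 bits is an assumption, since the table says only that
overflowing results are treated in no special way. The function name is
hypothetical.

```python
# Sketch: 32-bit pattern stored for an integer result under the overflow mode.
# Assumption: with OVM = 0 the exact result is simply truncated to 32 bits.

MOST_POSITIVE = 0x7FFFFFFF    # most positive 32-bit twos-complement number
MOST_NEGATIVE = 0x80000000    # most negative 32-bit twos-complement number

def stored_integer_result(exact, ovm):
    """Return the 32-bit pattern written for the exact (unbounded) result."""
    if ovm and exact > 0x7FFFFFFF:
        return MOST_POSITIVE              # overflow in the positive direction
    if ovm and exact < -0x80000000:
        return MOST_NEGATIVE              # overflow in the negative direction
    return exact & 0xFFFFFFFF             # in range, or OVM = 0

if __name__ == "__main__":
    print(hex(stored_integer_result(0x7FFFFFFF + 1, ovm=True)))
    print(hex(stored_integer_result(0x7FFFFFFF + 1, ovm=False)))
```

The V and LV flags are not modeled here; as the table notes, their function is
independent of the setting of OVM.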
Bit (CPU)  Bit (DMA)  Field
0          16         EINT0
1          17         EINT1
2          18         EINT2
3          19         EINT3
4          20         EXINT0
5          21         ERINT0
6          22         EXINT1
7          23         ERINT1
8          24         ETINT0
9          25         ETINT1
10         26         EDINT
15–11      31–27      xx
All CPU and DMA enable bits are R/W.
Bit     15–11  10    9      8      7      6      5      4      3     2     1     0
Field   xx     DINT  TINT1  TINT0  RINT1  XINT1  RINT0  XINT0  INT3  INT2  INT1  INT0
Access         R/W   R/W    R/W    R/W    R/W    R/W    R/W    R/W   R/W   R/W   R/W
† Reserved on TMS320C31
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
xx xx xx xx xx xx xx xx INXF1 OUTXF1 I/OXF1 xx INXF0 OUTXF0 I/OXF0 xx
R R/W R/W R R/W R/W
4 Reserved 0 Read as 0
The 32-bit repeat start address register (RS) contains the starting address of
the block of program memory to be repeated when the CPU is operating in the
repeat mode.
The 32-bit repeat end address register (RE) contains the ending address of
the block of program memory to be repeated when the CPU is operating in the
repeat mode.
Note: RE < RS
If RE < RS, the block of program memory will not be repeated, and the code
will not loop backwards. However, the ST(RM) bit remains set to 1.
The repeat-count register (RC) is a 32-bit register used to specify the number
of times a block of code is to be repeated when a block repeat is performed.
If RC contains the number n, the loop is executed n + 1 times.
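Because the loop executes n + 1 times when RC contains n, RC must be loaded
with one less than the desired iteration count. The sketch below illustrates
only that counting rule; it does not model RS, RE, or the pipeline, and the
function names are hypothetical.

```python
# Sketch: relationship between the repeat counter (RC) and loop executions.

def executions_for_rc(rc):
    """A block repeat with RC = n executes the block n + 1 times."""
    return rc + 1

def rc_for_executions(count):
    """Value to load into RC so that the block executes `count` times."""
    if count < 1:
        raise ValueError("a block repeat executes the block at least once")
    return count - 1

def run_block_repeat(rc, body):
    """Call body() once per pass of a block repeat with the given RC."""
    for _ in range(executions_for_rc(rc)):
        body()

if __name__ == "__main__":
    passes = []
    run_block_repeat(rc_for_executions(8), lambda: passes.append(1))
    print(len(passes))
```

Note that RC = 0 still executes the block once.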
3.2 Memory
The TMS320C3x’s total memory space of 16M (million) 32-bit words contains
program, data, and I/O space, allowing tables, coefficients, program code, or
data to be stored in either RAM or ROM. In this way, you can maximize memory
usage and allocate memory space as desired.
RAM blocks 0 and 1 are each 1K x 32 bits. The ROM block is 4K x 32 bits. Each
on-chip RAM and ROM block is capable of supporting two CPU accesses in
a single cycle. The separate program buses, data buses, and DMA buses allow
for parallel program fetches, data reads/writes, and DMA operations.
Chapter 9 covers this in detail.
Reserved Spaces
Do not read from or write to reserved portions of the TMS320C3x
memory space or reserved peripheral bus addresses. Doing so
might cause the TMS320C3x to halt operation and require a system
reset to restart.
TMS320C30 memory maps

Microprocessor mode
  0h–03Fh           Reset, interrupt, trap vector, and reserved locations (64)
                    (external STRB active)
  040h–7FFFFFh      External, STRB active
  800000h–801FFFh   Expansion bus, MSTRB active (8K words)
  802000h–803FFFh   Reserved (8K words)
  804000h–805FFFh   Expansion bus, IOSTRB active (8K words)
  806000h–807FFFh   Reserved (8K words)
  808000h–8097FFh   Peripheral bus memory-mapped registers (6K words internal)
  809800h–809BFFh   RAM block 0 (1K word internal)
  809C00h–809FFFh   RAM block 1 (1K word internal)
  80A000h–0FFFFFFh  External, STRB active

Microcomputer mode
  0h–0BFh           Reset, interrupt, trap vector, and reserved locations (192)
  0C0h–0FFFh        ROM (internal)
  1000h–7FFFFFh     External, STRB active
  800000h–801FFFh   Expansion bus, MSTRB active (8K words)
  802000h–803FFFh   Reserved (8K words)
  804000h–805FFFh   Expansion bus, IOSTRB active (8K words)
  806000h–807FFFh   Reserved (8K words)
  808000h–8097FFh   Peripheral bus memory-mapped registers (6K words internal)
  809800h–809BFFh   RAM block 0 (1K word internal)
  809C00h–809FFFh   RAM block 1 (1K word internal)
  80A000h–0FFFFFFh  External, STRB active
TMS320C31 memory maps

Microprocessor mode
  0h–03Fh          Reset, interrupt, trap vector, and reserved locations (64)
                   (external STRB active)
  040h–7FFFFFh     External, STRB active
  800000h–807FFFh  Reserved (32K words)
  808000h–8097FFh  Peripheral bus memory-mapped registers (6K words internal)
  809800h–809BFFh  RAM block 0 (1K word internal)
  809C00h–809FFFh  RAM block 1 (1K word internal)
  80A000h–FFFFFFh  External, STRB active

Microcomputer/boot-loader mode
  0h–FFFh          Reserved for boot-loader operations (see Section 3.4)
  1000h–7FFFFFh    External, STRB active (Boot 1 starts at 1000h, Boot 2 at
                   400000h)
  800000h–807FFFh  Reserved (32K words)
  808000h–8097FFh  Peripheral bus memory-mapped registers (6K words internal)
  809800h–809BFFh  RAM block 0 (1K word internal)
  809C00h–809FC0h  RAM block 1 (1K word minus 63, internal)
  809FC1h–809FFFh  User program interrupt and trap branches (63 words internal)
  80A000h–FFFFFFh  External, STRB active (Boot 3 starts at FFF000h)
Boot 1–3 locations are used by the boot-loader function. See Section 3.4 for
a complete description. All reserved memory locations are described in
Table 2–5 on page 2-31.
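A memory map of this kind translates naturally into a lookup table. The sketch
below encodes the TMS320C31 microprocessor-mode map as address ranges and
classifies a 24-bit address; the data structure, region strings, and function
name are hypothetical and serve only as an illustration.

```python
# Sketch: classify a 24-bit address against the TMS320C31
# microprocessor-mode memory map. Ranges are inclusive.

C31_MICROPROCESSOR_MAP = [
    (0x000000, 0x00003F, "reset, interrupt, trap vectors (external STRB)"),
    (0x000040, 0x7FFFFF, "external, STRB active"),
    (0x800000, 0x807FFF, "reserved"),
    (0x808000, 0x8097FF, "peripheral bus memory-mapped registers"),
    (0x809800, 0x809BFF, "RAM block 0"),
    (0x809C00, 0x809FFF, "RAM block 1"),
    (0x80A000, 0xFFFFFF, "external, STRB active"),
]

def region(address):
    """Return the name of the region that contains the given address."""
    for first, last, name in C31_MICROPROCESSOR_MAP:
        if first <= address <= last:
            return name
    raise ValueError("address outside the 16M-word memory space")

if __name__ == "__main__":
    for address in (0x000000, 0x808020, 0x809800, 0x809FFF, 0xFFF000):
        print(f"{address:06X}h: {region(address)}")
```

A lookup like this can be used in a simulator or linker-script check to flag
accesses that fall in the reserved range before they reach hardware.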
The major difference between these two modes is their memory maps (see
Figure 3–8). The program boot load feature is enabled when the MCBL/MP pin
is driven high during reset.
Figure 3–8 shows the memory locations (internal and external) used by the
boot loader to load the source program.
Figure 3–9. Reset, Interrupt, and Trap-Vector Locations for the TMS320C30/TMS320C31
Microprocessor Mode
00h RESET
01h INT0
02h INT1
03h INT2
04h INT3
05h XINT0
06h RINT0
07h XINT1†
08h RINT1†
09h TINT0
0Ah TINT1
0Bh DINT
0Ch–1Fh RESERVED
20h TRAP 0
•
•
•
3Bh TRAP 27
3Ch TRAP 28 (Reserved)
3Dh TRAP 29 (Reserved)
3Eh TRAP 30 (Reserved)
3Fh TRAP 31 (Reserved)
† Reserved on TMS320C31
Figure 3–10. Interrupt and Trap Branch Instructions for the TMS320C31 Microcomputer
Mode
809FC1h INT0
809FC2h INT1
809FC3h INT2
809FC4h INT3
809FC5h XINT0
809FC6h RINT0
809FC7h XINT1
809FC8h RINT1
809FC9h TINT0
809FCAh TINT1
809FCBh DINT
809FCCh–809FDFh RESERVED
809FE0h TRAP0
809FE1h TRAP1
•
•
•
809FFBh TRAP27
809FFCh TRAP28 (Reserved)
809FFDh TRAP29 (Reserved)
809FFEh TRAP30 (Reserved)
809FFFh TRAP31 (Reserved)
80800Fh (16)
808010h–80801Fh  Reserved (16)
808020h–80802Fh  Timer 0 Registers (16)
808030h–80803Fh  Timer 1 Registers (16)
808040h–80804Fh  Serial-Port 0 Registers (16)
808050h–80805Fh  Serial-Port 1 Registers† (16)
808060h–80806Fh  Primary and Expansion Port Registers (16)
808070h–8097FFh  Reserved
† Reserved on TMS320C31
Instruction Cache
[Figure: instruction-cache segments. Each segment holds 32-bit segment words,
numbered up to segment word 31.]
When the CPU requests an instruction word from external memory, the cache
algorithm checks to determine whether the word is already contained in the
instruction cache. Figure 3–13 shows the partitioning of an instruction address
as used by the cache control algorithm. The algorithm uses the 19 most
significant bits (MSBs) of the instruction address to select the segment; the
five least significant bits (LSBs) define the address of the instruction word
within the pertinent segment. The algorithm compares the 19 MSBs of the
instruction address with the two SSA registers. If there is a match, the
algorithm checks the relevant P flag. The P flag indicates whether a word within
a particular segment is already present in cache memory.
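The address partitioning and lookup described above can be sketched as
follows. This is a simplified model, not the cache hardware: the `Segment`
class, the outcome strings, and the use of `None` for an unused SSA register
are all hypothetical.

```python
# Sketch: split a 24-bit instruction address for the cache lookup and
# classify the access. 19 MSBs select the segment; 5 LSBs select the word.

def split_instruction_address(address):
    """Return (segment start address bits, word index within the segment)."""
    address &= 0xFFFFFF
    return address >> 5, address & 0x1F

class Segment:
    def __init__(self):
        self.ssa = None                # segment start address (19 bits)
        self.present = [False] * 32    # one P flag per segment word

def classify(segments, address):
    """Return 'hit', 'word miss', or 'no segment match' for an access."""
    ssa, word = split_instruction_address(address)
    for segment in segments:
        if segment.ssa == ssa:
            return "hit" if segment.present[word] else "word miss"
    return "no segment match"

if __name__ == "__main__":
    segments = [Segment(), Segment()]
    segments[0].ssa, word = split_instruction_address(0x001234)
    segments[0].present[word] = True
    print(classify(segments, 0x001234))
    print(classify(segments, 0x001235))
    print(classify(segments, 0x400000))
```

The three outcomes correspond to a matching SSA register with the P flag set, a
matching SSA register with the P flag clear, and no matching SSA register.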
If there is no match, one of the segments must be replaced by the new data.
The segment replaced in this circumstance is determined by the LRU
algorithm. The LRU stack (see Figure 3–12) is maintained for this purpose.
The LRU stack determines which of the two segments qualifies as the least
recently used after each access to the cache; therefore, the stack contains
either 0,1 or 1,0. Each time a segment is accessed, its segment number is
removed from the LRU stack and pushed onto the top of the LRU stack. Therefore,
the number at the top of the stack is the most recently used segment number,
and the number at the bottom of the stack is the least recently used segment
number.
At system reset, the LRU stack is initialized with 0 at the top and 1 at the
bottom. All P flags in the instruction cache are cleared.
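The LRU stack behavior described above can be modeled with a two-element list.
The sketch is illustrative; the class and method names are hypothetical.

```python
# Sketch: two-entry LRU stack for the instruction-cache segments.
# Index 0 is the top of the stack (most recently used segment).

class LruStack:
    def __init__(self):
        self.order = [0, 1]    # reset state: 0 at the top, 1 at the bottom

    def access(self, segment):
        """Remove the segment number and push it onto the top of the stack."""
        self.order.remove(segment)
        self.order.insert(0, segment)

    @property
    def least_recently_used(self):
        return self.order[-1]  # bottom of the stack

if __name__ == "__main__":
    lru = LruStack()
    print(lru.least_recently_used)   # segment 1 is replaced first after reset
    lru.access(1)
    print(lru.order, lru.least_recently_used)
```

After reset, segment 1 sits at the bottom of the stack and is therefore the
first candidate for replacement.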
- Cache Hit. The cache contains the requested instruction, and the following
  actions occur:
  2) The number of the segment containing the word is removed from the
     LRU stack and pushed to the top of the LRU stack, thus moving the
     other segment number to the bottom of the stack.
- Cache Miss. The cache does not contain the instruction. Following are
  the types of cache miss:
  - Word miss. The segment address register matches the instruction address,
    but the relevant P flag is not set. The following actions occur in
    parallel:
    - The instruction word is read from memory and copied into the
      cache.
- Cache Freeze Bit (CF). When CF = 1, the cache is frozen. If, in addition,
  the cache is enabled, fetches from the cache are allowed, but no modification
  of the state of the cache is performed. Specifically, no SSA register
  updates are performed, no P flags are modified (unless CC = 1), and the
  LRU stack is not modified. You can use this function to keep frequently
  used code resident in the cache. Writing a 1 to CC when the cache is frozen
  clears the cache, and, thus, the P flags. At reset, 0 is written to this bit.
Table 3–6 defines the effect of the CE and CF bits used in combination.
Using the TMS320C31 Boot Loader
[Figure 3–14: boot-loader flowchart. After reset with MCBL/MP = 1, the loader
tests the IF register bits in this order: INT3 set selects a serial-port load;
INT0 set selects a memory load from 1000h; INT1 set selects a memory load from
400000h; INT2 set selects a memory load from FFF000h. A memory load first
determines the memory width (8, 16, or 32). For each block, the loader loads
the block size and the destination address and transfers data from source to
destination, decrementing the block size after each transfer. When a block
size of 0 is read, the loader branches to the destination address of the
first block loaded.]
After reset, the loader mode is determined by polling the status of the
INT3–INT0 bits of the IF register. The bits are polled in the order described in
the flowchart in Figure 3–14 on page 3-27. Table 3–7 lists the mode options
and the interrupt that you can use to set the particular mode. The interrupt can
be driven any time after the RESET pin has been deasserted. Unless only one
interrupt flag bit is set (INT0, INT1, INT2, or INT3), the boot mode cannot be
guaranteed.
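The polling order can be expressed as a small priority table. The sketch below
assumes the INT0–INT3 flags occupy bits 0–3 of the IF register, as in the IF
register layout in this chapter; the function names and mode strings are
hypothetical.

```python
# Sketch: select the boot-loader mode from the INT3-INT0 bits of IF.
# The bits are polled in the order INT3, INT0, INT1, INT2.

IF_BIT = {"INT0": 0, "INT1": 1, "INT2": 2, "INT3": 3}

POLL_ORDER = [
    ("INT3", "serial-port load"),
    ("INT0", "memory load from 1000h (Boot 1)"),
    ("INT1", "memory load from 400000h (Boot 2)"),
    ("INT2", "memory load from FFF000h (Boot 3)"),
]

def boot_mode(if_register):
    """Return the first mode whose flag is set, or None if none is set yet."""
    for name, mode in POLL_ORDER:
        if if_register & (1 << IF_BIT[name]):
            return mode
    return None

def mode_is_guaranteed(if_register):
    """The boot mode is guaranteed only if exactly one of INT3-INT0 is set."""
    return bin(if_register & 0xF).count("1") == 1

if __name__ == "__main__":
    print(boot_mode(0b0001), mode_is_guaranteed(0b0001))
    print(boot_mode(0b1010), mode_is_guaranteed(0b1010))
```

The second call shows why only one interrupt should be driven: with more than
one flag set, the model still returns a mode, but the text states that the
actual boot mode cannot be guaranteed.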
This information must be specified in the first four locations of the Boot 1, Boot
2, or Boot 3 areas. The header is followed by the data or program code that
is the block size in length.
1 Boot memory configuration See Chapter 7 for valid bus-control register entries.
(defined # of wait states, etc.)
2 Program block size (blk) Any value 0 < blk < 2^24
4 Program code starts here Any 32-bit data value or valid TMS320C3x instruction
The loader fetches 32 bits of data for each specified location, regardless of
what memory configuration width is specified. The data values must reside
within or be written to memory, beginning with the value of least significance
for each 32 bits of information.
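The least-significant-first ordering can be illustrated by assembling 32-bit
words from 8-bit or 16-bit memory values. This is a sketch of the ordering rule
only, with hypothetical names; the sample values are illustrative and encode
the address 809C00h, the start of RAM block 1.

```python
# Sketch: combine 8-, 16-, or 32-bit-wide boot memory values into 32-bit
# words, taking the value of least significance first.

def words_from_memory(values, width):
    """Return the 32-bit words encoded by consecutive memory values."""
    if width not in (8, 16, 32):
        raise ValueError("memory width must be 8, 16, or 32 bits")
    per_word = 32 // width
    if len(values) % per_word:
        raise ValueError("incomplete 32-bit word")
    mask = (1 << width) - 1
    words = []
    for start in range(0, len(values), per_word):
        word = 0
        for position, value in enumerate(values[start:start + per_word]):
            word |= (value & mask) << (position * width)   # LSB part first
        words.append(word)
    return words

if __name__ == "__main__":
    print([hex(w) for w in words_from_memory([0x00, 0x9C, 0x80, 0x00], 8)])
    print([hex(w) for w in words_from_memory([0x9C00, 0x0080], 16)])
```

Both calls describe the same 32-bit word, stored in byte-wide and
halfword-wide memory respectively.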
- An INT0 signal was detected after reset was deasserted (signifying an
  external memory load from Boot 1).
- The loader header resides at memory location 0x1000 and defines the
  following:
  - Boot memory type EPROMs that require two wait states and SWW = 11,
  - A loader destination address at the beginning of the TMS320C31’s
    internal RAM Block 1, and
  - A single block of memory that is 0x1FF in length.
0x1001 0x00
0x1002 0x00
0x1003 0x00
0x1005 0x10
0x1006 0x00
0x1007 0x00
0x1009 0x01
0x100A 0x00
0x100B 0x00
0x100D 0x9C
0x100E 0x80
0x100F 0x00
0x1001 0x0000
0x1003 0x0000
0x1005 0x0000
0x1007 0x0080
After reading the header, the loader transfers blk 32-bit words, beginning at
the specified destination address. Code blocks require the same byte and
half-word ordering conventions. The loader can also load multiple code blocks
at different address destinations.
After loading all code blocks, the boot loader branches to the destination ad-
dress of the first block loaded and begins program execution. Consequently,
the first code block loaded should be a start-up routine to access the other
loaded programs.
End the loader function and begin execution of the first code block by append-
ing the value of 0x00000000 to the last block.
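The block structure described above can be sketched in Python as a host-side check of a boot image. This is a minimal illustration, not the loader's actual code. It assumes that word 0 of the image holds the memory width, word 1 the bus-control value, and that each block is stored as (block size, destination address, data words); the function names are hypothetical.

```python
def words_from_bytes(data):
    """Assemble 32-bit words from a byte-wide image, least significant byte first."""
    if len(data) % 4:
        raise ValueError("image length must be a multiple of 4 bytes")
    return [int.from_bytes(data[i:i + 4], "little") for i in range(0, len(data), 4)]


def parse_boot_image(words):
    """Return (memory_width, bus_control, blocks) for a list of 32-bit words.

    Assumed layout (see lead-in): width, bus-control value, then repeated
    (block size, destination address, data...) groups ended by a 0 block size.
    """
    memory_width, bus_control = words[0], words[1]
    pos, blocks = 2, []
    while words[pos] != 0:              # a block size of 0 ends the load
        size, dest = words[pos], words[pos + 1]
        blocks.append((dest, words[pos + 2:pos + 2 + size]))
        pos += 2 + size
    if not blocks:
        # The guide warns that an initial block size of 0 is unpredictable.
        raise ValueError("first block size must not be 0")
    return memory_width, bus_control, blocks


def entry_point(blocks):
    """Execution starts at the destination address of the first block loaded."""
    return blocks[0][0]
```

Because execution begins at the first block's destination, a tool built on this sketch would place the start-up routine in the first block.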
It is assumed that at least one block of code will be loaded when the
loader is invoked. Initial loader invocation with a block size of
0x00000000 produces unpredictable results.
The transferred data-bit order must begin with the MSB and end with the LSB.
Table 3–9 shows the MCBL/MP mode interrupt and trap instruction memory
maps.
Address          Description
809FC2           INT1
809FC3           INT2
809FC4           INT3
809FC5           XINT0
809FC6           RINT0
809FC7           Reserved
809FC8           Reserved
809FC9           TINT0
809FCA           TINT1
809FCB           DINT0
809FCC–809FDF    Reserved
809FE0           TRAP0
809FE1           TRAP1
  .                .
  .                .
  .                .
809FFB           TRAP27
809FFC–809FFF    Reserved
3.4.8 Precautions
The boot loader builds a one-word-deep stack, starting at location 809801h.
The interrupt flags are not reset by the boot-loader function. If pending inter-
rupts are to be avoided when interrupts are enabled, clear the IF register be-
fore enabling interrupts.
The MCBL/MP pin should remain high during the entire boot-loader execution,
but it can be changed subsequently at any time. The TMS320C31 does not
need to be reset after the MCBL/MP pin is changed. During the change, the
TMS320C31 should not access addresses 0h–FFFh.
This chapter discusses in detail the data formats and floating-point operations
supported in the TMS320C3x.
Integer Formats
The short integer format is a 16-bit two’s complement integer format for
immediate integer operands. For those instructions that assume integer
operands, this format is sign-extended to 32 bits (see Figure 4–1). The range
of an integer si, represented in the short integer format, is
$-2^{15} \le si \le 2^{15} - 1$. In Figure 4–1, s = sign bit.
Figure 4–1. Short Integer Format and Sign Extension of Short Integers
[Figure: the short integer occupies bits 15–0; in the sign-extended 32-bit
result, bits 31–16 are filled with the sign bit s.]
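The sign extension shown in Figure 4–1 can be sketched in Python as follows. This is an illustrative model only; the function names are hypothetical.

```python
def sign_extend_short(value16):
    """Interpret a 16-bit field as a two's complement short integer."""
    value16 &= 0xFFFF
    # Bit 15 is the sign bit s; when set, the value is negative.
    return value16 - 0x10000 if value16 & 0x8000 else value16


def to_word32(value):
    """Show a signed integer as the 32-bit pattern a register would hold."""
    return value & 0xFFFFFFFF
```

For a negative operand, bits 31–16 of the 32-bit pattern are all copies of the sign bit, as in the figure.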
Unsigned-Integer Formats
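[Figure: unsigned short-integer format (bits 15–0) and its zero extension to
32 bits (bits 31–16 filled with 0s).]
[Figure: generic floating-point format with exponent field e, sign bit s, and
fraction field f; s and f together form the mantissa (man).]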
Floating-Point Formats
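[Figure: short floating-point format. Exponent e in bits 15–12, sign s in
bit 11, fraction f in bits 10–0; s and f form the mantissa.]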
Operations are performed with an implied binary point between bits 11 and 10.
When the implied most significant nonsign bit is made explicit, it is located to
the immediate left of the binary point. The floating-point two’s complement
number x in the short floating-point format is given by the following:
\[
x = \begin{cases}
01.f \times 2^{e} & \text{if } s = 0 \\
10.f \times 2^{e} & \text{if } s = 1 \\
0 & \text{if } e = -8
\end{cases}
\]
You must use the following reserved values to represent 0 in the short float-
ing-point format:
e=–8
s=0
f=0
The following examples illustrate the range and precision of the short float-
ing-point format:
Most Positive: $x = (2 - 2^{-11}) \times 2^{7} = 2.5594 \times 10^{2}$
Least Positive: $x = 1 \times 2^{-7} = 7.8125 \times 10^{-3}$
Least Negative: $x = (-1 - 2^{-11}) \times 2^{-7} = -7.8163 \times 10^{-3}$
Most Negative: $x = -2 \times 2^{7} = -2.5600 \times 10^{2}$
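The short floating-point definition can be checked with a small Python decoder. This is a sketch of the format as described above (4-bit exponent, sign bit, 11-bit fraction); the function name is hypothetical, and exact fractions are used so that the four extreme values can be compared without rounding.

```python
from fractions import Fraction


def decode_short_float(word16):
    """Decode a 16-bit short floating-point word into an exact Fraction."""
    e = (word16 >> 12) & 0xF
    if e & 0x8:                 # exponent is a 4-bit two's complement field
        e -= 16
    s = (word16 >> 11) & 1
    f = word16 & 0x7FF
    if e == -8:                 # reserved exponent value represents 0
        return Fraction(0)
    # 01.f (s = 0) is 1 + f/2**11; 10.f (s = 1) is -2 + f/2**11.
    mantissa = Fraction(f, 1 << 11) + (-2 if s else 1)
    return mantissa * Fraction(2) ** e
```

With this decoder, the words 0x77FF, 0x9000, 0x9FFF, and 0x7800 give the most positive, least positive, least negative, and most negative values listed above.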
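[Figure: single-precision floating-point format. Exponent e in bits 31–24,
sign s in bit 23, fraction f in bits 22–0; s and f form the mantissa.]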
Operations are performed with an implied binary point between bits 23 and 22.
When the implied most significant nonsign bit is made explicit, it is located to
the immediate left of the binary point. The floating-point number x is given by
the following:
\[
x = \begin{cases}
01.f \times 2^{e} & \text{if } s = 0 \\
10.f \times 2^{e} & \text{if } s = 1 \\
0 & \text{if } e = -128
\end{cases}
\]
You must use the following reserved values to represent 0 in the single-preci-
sion floating-point format:
e = – 128
s=0
f=0
The following examples illustrate the range and precision of the single-preci-
sion floating-point format.
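[Figure: extended-precision floating-point format. Exponent e in bits 39–32,
sign s in bit 31, fraction f in bits 30–0; s and f form the mantissa.]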
Operations are performed with an implied binary point between bits 31 and 30.
When the implied most significant nonsign bit is made explicit, it is located to
the immediate left of the binary point. The floating-point number x is given by
the following:
\[
x = \begin{cases}
01.f \times 2^{e} & \text{if } s = 0 \\
10.f \times 2^{e} & \text{if } s = 1 \\
0 & \text{if } e = -128
\end{cases}
\]
You must use the following reserved values to represent 0 in the extended-pre-
cision floating-point format:
e = –128
s=0
f=0
The following examples illustrate the range and precision of the extended-pre-
cision floating-point format:
Most Positive: $x = (2 - 2^{-31}) \times 2^{127} = 3.4028236684 \times 10^{38}$
Least Positive: $x = 1 \times 2^{-127} = 5.8774717541 \times 10^{-39}$
Least Negative: $x = (-1 - 2^{-31}) \times 2^{-127} = -5.8774717569 \times 10^{-39}$
Most Negative: $x = -2 \times 2^{127} = -3.4028236691 \times 10^{38}$
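[Figure: converting from short floating-point format to single-precision
format. The 4-bit exponent (bits 15–12) is sign-extended into bits 31–24, the
mantissa (bits 11–0) moves to bits 23–12, and bits 11–0 are filled with 0s.]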
In this format, the exponent field is sign-extended, and the fraction field is filled
with 0s.
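[Figure: converting from short floating-point format to extended-precision
format. The 4-bit exponent (bits 15–12) is sign-extended into bits 39–32, the
mantissa (bits 11–0) moves to bits 31–20, and bits 19–0 are filled with 0s.]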
The exponent field in this format is sign-extended, and the fraction field is filled
with 0s.
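The two conversions from the short format can be sketched as one Python function. This is an illustration of the bit movement described above, not a model of the hardware data path. The function name is hypothetical, and mapping a short-format zero (e = –8) to the e = –128 encoding is an assumption made so that the result still represents 0.

```python
def short_to_wide(word16, total_bits):
    """Widen a short floating-point word to 32 (single) or 40 (extended) bits."""
    if total_bits not in (32, 40):
        raise ValueError("total_bits must be 32 or 40")
    e = (word16 >> 12) & 0xF
    if e & 0x8:                       # sign-extend the 4-bit exponent
        e -= 16
    mantissa12 = word16 & 0xFFF       # sign bit plus 11-bit fraction
    if e == -8:                       # assumption: keep zero a zero
        e, mantissa12 = -128, 0
    mantissa_bits = total_bits - 8    # 24 for single, 32 for extended
    # Left-justify the mantissa; the low-order fraction bits become 0s.
    return ((e & 0xFF) << mantissa_bits) | (mantissa12 << (mantissa_bits - 12))
```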
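[Figure: converting from single-precision format to extended-precision format.
The exponent (bits 31–24) moves to bits 39–32, the mantissa (bits 23–0) moves
to bits 31–8, and bits 7–0 are filled with 0s.]
[Figure: converting from extended-precision format to single-precision format.
The exponent (bits 39–32) moves to bits 31–24, mantissa bits 31–8 move to
bits 23–0, and bits 7–0 (z) do not appear in the result.]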
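\[ \alpha = \alpha(\mathrm{man}) \times 2^{\alpha(\mathrm{exp})} \]
where:
α(man) is the mantissa and α(exp) is the exponent.
The product of α and b is c:
\[ c = \alpha \times b = c(\mathrm{man}) \times 2^{c(\mathrm{exp})} \]
where:
\[ c(\mathrm{man}) = \alpha(\mathrm{man}) \times b(\mathrm{man}), \qquad
   c(\mathrm{exp}) = \alpha(\mathrm{exp}) + b(\mathrm{exp}) \]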
Steps 4 and 5 normalize the result. If a right shift of 1 is necessary, then in step
8, c(man) is right-shifted 1 bit, thus adding 1 to c(exp). If a right shift of 2 is nec-
essary, then in step 9, c(man) is right-shifted 2 bits, thus adding 2 to c(exp).
Step 6 occurs when the result is normalized.
Floating-Point Multiplication
[Figure: flowchart for floating-point multiplication, c = α × b. Recoverable
box labels: (1) multiply mantissas; (2) add exponents; (14) if c(man) > 0,
set c(exp) to most positive value; if c(man) < 0, set c(exp) to most negative
value; (15) c(exp) = –128, c(man) = 0.]
Example 4–1, Example 4–2, Example 4–3, Example 4–4, and Example 4–5
illustrate how floating-point multiplication is performed on the TMS320C3x.
For these examples, the implied most significant nonsign bit is made explicit.
where:
α and b are both represented in binary form according to the normalized
single-precision floating-point format.
Then:
    10.00000000000000000000000 × 2^α(exp)
  × 10.00000000000000000000000 × 2^b(exp)
  = 0100.0000000000000000000000000000000000000000000000 × 2^(α(exp) + b(exp))
Normalizing the mantissa requires a right shift of 2, which adds 2 to the
exponent:
    10.00000000000000000000000 × 2^α(exp)
  × 10.00000000000000000000000 × 2^b(exp)
  = 01.0000000000000000000000000000000000000000000000 × 2^(α(exp) + b(exp) + 2)
In floating-point multiplication, the exponent of the result may overflow. This
can occur when the exponents are initially added or when the exponent is
modified during normalization.
    01.10000000000000000000000 × 2^α(exp)
  × 01.10000000000000000000000 × 2^b(exp)
  = 01.00100000000000000000000000000000000000000000000 × 2^(α(exp) + b(exp) + 1)
    01.00000000000000000000000 × 2^α(exp)
  × 01.00000000000000000000000 × 2^b(exp)
  = 0001.0000000000000000000000000000000000000000000000 × 2^(α(exp) + b(exp))
This number is in the proper normalized format. Therefore, no shift of the man-
tissa or modification of the exponent is necessary.
These examples have shown cases where the product of two normalized num-
bers can be normalized with a shift of 0, 1, or 2. For all normalized inputs with
the floating-point format used by the TMS320C3x, a normalized result can be
produced by a shift of 0, 1, or 2.
    01.00000000000000000000000 × 2^α(exp)
  × 10.00000000000000000000000 × 2^b(exp)
  = 1110.0000000000000000000000000000000000000000000000 × 2^(α(exp) + b(exp))
The result is c = –2.0 × 2^(α(exp) + b(exp))
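The normalization shifts in these examples can be reproduced with a short Python sketch. It works on exact mantissa values rather than on bit fields, and it omits exponent overflow, underflow, and truncation of the product to the destination width; those omissions and the function names are assumptions of this illustration.

```python
from fractions import Fraction


def is_normalized(man):
    """A normalized mantissa is 01.f (1 <= man < 2) or 10.f (-2 <= man < -1)."""
    return 1 <= man < 2 or -2 <= man < -1


def multiply(a_man, a_exp, b_man, b_exp):
    """Return (c_man, c_exp, right_shift) for c = a x b with normalized inputs."""
    a_man, b_man = Fraction(a_man), Fraction(b_man)
    if not (is_normalized(a_man) and is_normalized(b_man)):
        raise ValueError("inputs must be normalized")
    c_man = a_man * b_man        # multiply mantissas
    c_exp = a_exp + b_exp        # add exponents
    shift = 0
    while not is_normalized(c_man):
        c_man /= 2               # right-shift the mantissa one bit
        c_exp += 1               # and add 1 to the exponent
        shift += 1
    return c_man, c_exp, shift
```

For normalized inputs the loop runs at most twice, which matches the statement that a shift of 0, 1, or 2 is always sufficient.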
The flowchart for floating-point addition is shown in Figure 4–14. Since this
flowchart assumes signed data, it is also appropriate for floating-point
subtraction. In this figure, it is assumed that α(exp) ≤ b(exp). In step 1,
the source exponents are compared, and c(exp) is set equal to the larger of
the two source exponents. In step 2, d is set to the difference of the two
exponents. In step 3, the mantissa with the smaller exponent, in this case
α(man), is right-shifted d bits to align the mantissas. After the mantissas
have been aligned, they are added (step 4).
Steps 5 through 7 check for a special case of c(man). If c(man) is 0 (step 5),
then c(exp) is set to its most negative value (step 8) to yield the correct
representation of 0. If c(man) has overflowed c (step 6), then c(man) is
right-shifted one bit, and 1 is added to c(exp). Otherwise, step 10 normalizes
c by left-shifting c(man) by the number of leading nonsignificant sign bits
(step 7) and subtracting that number from c(exp). Steps 11 through 13 check
for special cases of c(exp). If c(exp) has overflowed (step 11) in the
positive direction, then step 14 sets c(exp) to the most positive
extended-precision format value. If c(exp) has overflowed (step 11) in the
negative direction, then step 14 sets c(exp) to the most negative
extended-precision format value. If c(exp) has underflowed (step 12), then
step 15 sets c to 0; that is, c(man) = 0 and c(exp) = –128.
Floating-Point Addition and Subtraction
[Figure 4–14: flowchart for floating-point addition, c = α + b. Recoverable
box label: (16) set c to final result.]
Example 4–6, Example 4–7, Example 4–8, and Example 4–9 describe the
floating-point addition and subtraction operations. It is assumed that the data
is in the extended-precision floating-point format.
It is necessary to shift b to the right by 1 so that α and b have the same expo-
nent. This yields:
b = 0.5 = 00.1000000000000000000000000000000 × 2^0
Then:
    01.10000000000000000000000000000000 × 2^0
  + 00.10000000000000000000000000000000 × 2^0
  = 010.00000000000000000000000000000000 × 2^0
As in the case of multiplication, it is necessary to shift the binary point one
place to the left and add 1 to the exponent. This yields:
    01.1000000000000000000000000000000 × 2^0
  + 00.1000000000000000000000000000000 × 2^0
  = 01.0000000000000000000000000000000 × 2^1
    01.0000000000000000000000000000001 × 2^0
  – 01.0000000000000000000000000000000 × 2^0
  = 00.0000000000000000000000000000001 × 2^0
    01.0000000000000000000000000000001 × 2^0
  – 01.0000000000000000000000000000000 × 2^0
  = 01.0000000000000000000000000000000 × 2^–31
α = 01.1111111111111111111111111111111 × 2^127
b = 10.0000000000000000000000000000000 × 2^127
    01.1111111111111111111111111111111 × 2^127
  + 10.0000000000000000000000000000000 × 2^127
  = 11.1111111111111111111111111111111 × 2^127
    01.1111111111111111111111111111111 × 2^127
  + 10.0000000000000000000000000000000 × 2^127
  = 10.0000000000000000000000000000000 × 2^95
α ± 0 = α (α ≠ 0)
0 ± 0 = 0
0 – α = –α (α ≠ 0)
This number is then sign-extended one bit so that the mantissa contains 33
bits.
man = 00.0000000000000000001000000000001, exp = 0
The intermediate result after the most significant nonsign bit is located and the
shift performed is:
man = 01.0000000000010000000000000000000, exp = –19
The final 32-bit value output after removing the redundant bit is:
man = 00000000000010000000000000000000, exp = –19
The NORM instruction is useful for counting the number of leading 0s or lead-
ing 1s in a 32-bit field. If the exponent is initially 0, the absolute value of the final
value of the exponent is the number of leading 1s or 0s. This instruction is also
useful for manipulating unnormalized floating-point numbers.
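The normalization just described can be sketched in Python on a 32-bit mantissa field. This is an illustrative model based on the worked example above; it ignores exponent underflow, and the function name is hypothetical.

```python
def norm(man32, exp=0):
    """Normalize an unnormalized 32-bit mantissa field; return (man32, exp)."""
    man32 &= 0xFFFFFFFF
    if man32 == 0:
        return 0, -128                 # zero: c(exp) = -128, mantissa unchanged
    sign = man32 >> 31
    k = 0                              # leading bits equal to the sign bit
    while k < 32 and ((man32 >> (31 - k)) & 1) == sign:
        k += 1
    # Shift the most significant nonsign bit out of the field (it becomes the
    # implied bit), then restore the sign bit in bit 31.
    fraction = (man32 << k) & 0x7FFFFFFF
    return (sign << 31) | fraction, exp - k
```

Calling `norm(0x00001001, 0)` reproduces the example above: the exponent becomes –19, so the field had 19 leading 0s.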
Normalization Using the NORM Instruction
[Figure: flowchart for the NORM instruction, c = norm(α). Recoverable box
label: (8) c(exp) = –128, no change to c(man).]
Rounding: The RND Instruction
[Figure: flowchart for the RND instruction, c = rnd(α). Recoverable box
labels: if c(man) > 0, set c to most positive single-precision value; if
c(man) < 0, set c to most negative single-precision value.]
α(exp) ≤ 30
If these bounds are not met, an overflow occurs. If an overflow occurs in the
positive direction, the output is the most positive integer. If an overflow occurs
in the negative direction, the output is the most negative integer. If α(exp) is
within the valid range, then α(man), with implied bit included, is sign-extended
and right-shifted (rs) by the amount
rs = 31 – α(exp)
This right-shift (rs) shifts out those bits corresponding to the fractional part of
the mantissa. For example:
If 0 ≤ x < 1, then fix(x) = 0.
If –1 ≤ x < 0, then fix(x) = –1.
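The conversion rule above can be sketched in Python for an extended-precision operand given as its three fields. This is a minimal model under the assumption that the mantissa is held as 01.f or 10.f scaled by 2^31; the function name and argument layout are hypothetical.

```python
def fix(e, s, f):
    """Convert a float with exponent e, sign bit s, 31-bit fraction f to an integer."""
    if e == -128:
        return 0                                  # reserved encoding of 0
    if e > 30:                                    # bound exceeded: overflow
        return -(1 << 31) if s else (1 << 31) - 1
    # Mantissa with the implied bit included, scaled by 2**31.
    man = f + (-(1 << 32) if s else (1 << 31))
    # Arithmetic right shift by rs = 31 - e drops the fractional bits.
    return man >> (31 - e)
```

Because the shift is arithmetic, negative values round toward minus infinity, which is why any x in the range –1 ≤ x < 0 converts to –1.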
Floating-Point-to-Integer Conversion
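[Figure: flowchart for floating-point-to-integer conversion, c = fix(α).
Recoverable box labels: α(exp) in range; α(exp) > 30 (overflow); shift,
rs = 31 – α(exp).]
[Figure: flowchart for integer-to-floating-point conversion, c = float(α).
Recoverable box labels: c(man) = α; c(exp) = 30; k = number of leading
nonsignificant sign bits; c(man) = 0.]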
Chapter 5
Addressing
Types of Addressing
Some types of addressing are appropriate for some instructions but not others.
For this reason, the types of addressing are used in the five groups of address-
ing modes as follows:
The six types of addressing are discussed first, followed by the five groups of
addressing modes.
ABSF R1 ; R1 = |R1|
The syntax for the CPU registers, the assembler syntax, and the assigned
function for those registers are listed in Table 5–1.
Syntax: @expr
Figure 5–1 shows the formation of the data address. Example 5–1 is an
instruction example with data before and after instruction execution.
[Figure 5–1: direct addressing. Bits 15–0 of the instruction word (expr) and
bits 7–0 of the data-page pointer (DP) form the 24-bit address (bits 23–0;
bits 31–24 are 0) of the 32-bit operand.]
Before instruction:  DP = 8Ah   R7 = 0h
After instruction:   DP = 8Ah   R7 = 12345678h
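The address formation for direct addressing can be sketched in Python. This is an illustration of the concatenation only; the function name is hypothetical, and the expr value 0BCDEh used in the note below is an assumed example, not taken from the text.

```python
def direct_address(dp, expr):
    """Concatenate the 8 LSBs of DP with the 16-bit expr field."""
    # Page number forms address bits 23-16; expr forms address bits 15-0.
    return ((dp & 0xFF) << 16) | (expr & 0xFFFF)
```

With DP = 8Ah as in the example above and an assumed expr of 0BCDEh, the operand address is 8ABCDEh.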
[Figure: indirect addressing. Bits 23–0 of auxiliary register ARn hold the
address of the 32-bit operand.]
Table 5–2 lists the various kinds of indirect addressing, along with the value
of the modification (mod) field, assembler syntax, operation, and function for
each. The succeeding 17 examples show the operation for each kind of indi-
rect addressing. Figure 5–2 shows the format in the instruction encoding.
Example 5–3 through Example 5–19 illustrate the kinds of indirect addressing
listed in Table 5–2.
[Figures for the indirect-addressing examples: in each, bits 23–0 of ARn
supply the address, which is modified by adding (+) or subtracting (–) a
displacement (disp, an integer in bits 7–0) or an index register (IRm,
bits 23–0), optionally with circular (%) or bit-reversed (B) modification, to
locate the 32-bit operand.]
Syntax: expr
R0 = 0h R0 = 0FFFFFFFFh
Syntax: expr
PC = 0h PC = 8000h
The displacement is stored as a 16-bit or 24-bit signed integer in the least sig-
nificant bits of the instruction word. The displacement is added to the PC during
the pipeline decode phase. Notice that because the PC is incremented by 1
in the fetch phase, the displacement is added to this incremented PC value.
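The PC-relative calculation can be sketched in Python as follows. It is a simplified model: `pc` is taken to be the address of the branch instruction, the result is masked to 24 bits as an assumption about the address width, and the function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def pc_relative_target(pc, disp_field, bits=16):
    """Branch target: displacement added to the already incremented PC."""
    if bits not in (16, 24):
        raise ValueError("displacement is a 16-bit or 24-bit field")
    # The PC was incremented by 1 in the fetch phase before the add.
    return (pc + 1 + sign_extend(disp_field, bits)) & 0xFFFFFF
```

A displacement of –1 therefore branches back to the branch instruction itself.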
PC = 1002h PC = 1005h
The 24-bit addressing mode encodes the program-control instructions (for
example, BR, BRD, CALL, RPTB, and RPTBD). Depending on the instruction,
the new PC value is derived by adding a 24-bit signed value in the instruction
word to the present PC value. Bit 24 determines the type of branch (D = 0
for a standard branch or D = 1 for a delayed branch). Some of the instructions
are encoded in Figure 5–3.
31 25 24 23 0
0 1 1 0 0 0 0 0 displacement
31 24 23 0
0 1 1 0 0 0 1 0 displacement
31 25 24 23 0
0 1 1 0 0 1 0 0 displacement
Groups of Addressing Modes
where the destination operand is signified by dst and the source operand by
src; operation defines an operation to be performed on the operands using the
general addressing modes. Bits 31 –29 are 0, indicating general addressing
mode instructions. Bits 22 and 21 specify the general addressing mode (G)
field, which defines how bits 15–0 are to be interpreted for addressing the src
operand.
If the src and dst fields contain register specifications, the value in these fields
contains the CPU register addresses as defined by Table 5–1 on page 5-3.
For the general addressing modes, the following values of ARn are valid:
ARn, 0 ≤ n ≤ 7
Figure 5–4 shows the encoding for the general addressing modes. The nota-
tion mod indicates the modification field that goes with the ARn field. Refer to
Table 5–2 on page 5-6 for further information.
where the destination operand is signified by dst and the source operands by
SRC1 and SRC2; operation defines an operation to be performed. Note that
the 3 can be omitted from three-operand instructions.
Bits 31–29 are set to the value of 001, indicating three-operand addressing
mode instructions. Bits 22 and 21 specify the three-operand addressing mode
(T) field, which defines how bits 15–0 are to be interpreted for addressing the
SRC operands. Bits 15–8 define the SRC1 address; bits 7–0 define the SRC2
address. Options for bits 22 and 21 (T) are as follows:
T      SRC1       SRC2
0 0    register   register
0 1    indirect   register
1 0    register   indirect
1 1    indirect   indirect
Figure 5–5 shows the encoding for three-operand addressing. If the SRC1
and SRC2 fields use the same auxiliary register, both addresses are correctly
generated. However, only the value created by the SRC1 field is saved in the
auxiliary register specified. The assembler issues a warning if you specify this
condition.
ARn,0 ≤ n ≤ 7
ARm,0 ≤ m ≤ 7
The notation modm or modn indicates that the modification field goes with the
ARm or ARn field, respectively. Refer to Table 5–2 on page 5-6 for further
information.
[Figure 5–5: encoding for three-operand addressing modes (T field in
bits 22–21, SRC1 in bits 15–8, SRC2 in bits 7–0).]
[Figure 5–6: encoding for parallel addressing modes (src3 in bits 15–8, src4
in bits 7–0).]
The parallel addressing mode (P) field specifies how the operands are to be
used, that is, whether they are source or destination. The specific relationship
between the P field and the operands is detailed in the description of the indi-
vidual parallel instructions (see Chapter 10). However, the operands are al-
ways encoded in the same way. Bits 31 and 30 are set to the value of 10, indi-
cating parallel addressing mode instructions. Bits 25 and 24 specify the paral-
lel addressing mode (P) field, which defines how bits 21–0 are to be interpreted
for addressing the src operands. Bits 21–19 define the src1 address, bits
18–16 define the src2 address, bits 15–8 define the src3 address, and bits 7–0
define the src4 address. The notations modn and modm indicate which
modification field goes with which ARn or ARm (auxiliary register) field,
respectively. Following is a list of the parallel addressing operands:
- src1 0 ≤ src1 ≤ 7 (extended-precision registers R0 – R7)
- src2 0 ≤ src2 ≤ 7 (extended-precision registers R0–R7)
- d1 If 0, dst1 is R0. If 1, dst1 is R1.
- d2 If 0, dst2 is R2. If 1, dst2 is R3.
- P 0≤ P≤3
- src3 indirect (disp = 0, 1, IR0, IR1)
- src4 indirect (disp = 0, 1, IR0, IR1)
In the encoding shown for this mode in Figure 5–6 on page 5-21, if the src3
and src4 fields use the same auxiliary register, both addresses are correctly
generated, but only the value created by the src3 field is saved in the auxiliary
register specified. The assembler issues a warning if you specify this condi-
tion.
[Figure: encodings for the conditional-branch addressing modes, including
Bcond(D) and CALLcond; bit-ruler positions 31, 27, 26, 25, 24, 22, 21, 20, 16,
15, 5, 4, and 0.]
Circular Addressing
The block size register (BK) specifies the size of the circular buffer. By
labeling the most significant 1 of the BK register as bit N, with N ≤ 15, you
can find the address immediately following the bottom of the circular buffer by
concatenating bits 31 through N + 1 of a user-selected register (ARn) with
bits N through 0 of the BK register. The address of the top of the buffer is
referred to as the effective base (EB) and can be found by concatenating
bits 31 through N + 1 of ARn with 0s in bits N through 0.
Figure 5–8 illustrates the relationships between the block size register (BK),
the auxiliary registers (ARn), the bottom of the circular buffer, the top of the cir-
cular buffer, and the index into the circular buffer.
A circular buffer of size R must start on a K-bit boundary (that is, the K LSBs
of the starting address of the circular buffer must be 0), where K is an
integer that satisfies $2^{K} > R$. Since the value R must be loaded into the
BK register, $K \ge N + 1$. For example, a 31-word circular buffer must start
at an address whose five LSBs are 0 (that is,
XXXXXXXXXXXXXXXXXXXXXXXXXXX00000 in binary), and the value 31 must be loaded
into the BK register.
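The alignment rule and the effective-base calculation can be sketched in Python. This is an illustrative model of the relationships described above; the function names and the sample address 809C25h are hypothetical.

```python
def alignment_bits(block_size):
    """Smallest K with 2**K > block_size (the required K-bit boundary)."""
    return block_size.bit_length()


def effective_base(arn, bk):
    """Top of buffer: ARn with bits N through 0 cleared."""
    if not 0 < bk <= 0xFFFF:
        raise ValueError("BK must be a nonzero 16-bit value (N <= 15)")
    n = bk.bit_length() - 1            # bit N is the most significant 1 of BK
    return arn & ~((1 << (n + 1)) - 1) & 0xFFFFFFFF


def end_of_buffer(arn, bk):
    """Address immediately following the bottom of the buffer."""
    # Bits 31..N+1 come from ARn; bits N..0 come from BK.
    return effective_base(arn, bk) | bk
```

For a 31-word buffer, `alignment_bits(31)` is 5, so the buffer must start at an address whose five LSBs are 0.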
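[Figure 5–8: relationships among the block size register (BK), ARn, the
effective base (EB), the top and bottom of the circular buffer, and the index.
EB keeps bits 31 through N + 1 of ARn (H...H) with 0s in bits N through 0; the
circular addressing algorithm logic produces a new index (L′...L′), and the
new ARn combines H...H with L′...L′.]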
In circular addressing, index refers to the N LSBs of the auxiliary register se-
lected, and step is the quantity being added to or subtracted from the auxiliary
register. Follow these two rules when you use circular addressing:
- The step used must be less than or equal to the block size. The step size
is treated as an unsigned integer.
- The first time the circular queue is addressed, the auxiliary register must
be pointing to an element in the circular queue.
Figure 5–9 shows how the circular buffer is implemented and illustrates the re-
lationship of the quantities generated and the elements in the circular buffer.
[Figure 5–9: circular buffer implementation. The MSBs of ARn (bits 31 through
N + 1, H...H) locate the buffer; the LSBs of ARn (bits N through 0, L...L)
select the element.]
Example 5–23 shows circular addressing operation. Assuming that all ARs
are four bits, let AR0 = 0000, and BK = 0110 (block size of 6). Example 5–23
shows a sequence of modifications and the resulting value of AR0.
Example 5–23 also shows how the pointer steps through the circular queue
with a variety of step sizes (both incrementing and decrementing).
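The pointer update for circular addressing can be sketched in Python. The wrap rule used here (subtract BK when the new index reaches BK, add BK when it goes below 0) is an assumption of this illustration, as are the function name and the step sequence in the note below.

```python
def circular_update(arn, step, bk):
    """Return ARn after a circular modification by a signed step."""
    if not 0 < bk <= 0xFFFF:
        raise ValueError("BK must be a nonzero 16-bit value")
    if abs(step) > bk:
        raise ValueError("step must be less than or equal to the block size")
    mask = (1 << bk.bit_length()) - 1  # bits N through 0 hold the index
    index = arn & mask
    if index >= bk:
        raise ValueError("ARn must point to an element in the circular queue")
    new_index = index + step
    if new_index >= bk:                # ran past the bottom: wrap to the top
        new_index -= bk
    elif new_index < 0:                # ran past the top: wrap to the bottom
        new_index += bk
    return (arn & ~mask) | new_index   # upper bits of ARn are unchanged
```

Starting from AR0 = 0 with BK = 6 and applying the assumed steps +5, +2, –3, +6, and –1 gives AR0 values of 5, 1, 4, 4, and 3.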
0th → Element 0 0
2nd → Element 1 1
Element 2 2
5th → Element 3 3
Last Element + 1 6
h(1) x(1)
h(0) x(0) ← AR1
Bit-Reversed Addressing
Table 5–3 shows the relationship of the index steps and the four LSBs of AR2.
You can find the four LSBs by reversing the bit pattern of the steps.
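Step   Bit Pattern   Bit-Reversed Pattern   Bit-Reversed Step
0      0000          0000                   0
1      0001          1000                   8
2      0010          0100                   4
3      0011          1100                   12
4      0100          0010                   2
5      0101          1010                   10
6      0110          0110                   6
7      0111          1110                   14
8      1000          0001                   1
9      1001          1001                   9
10     1010          0101                   5
11     1011          1101                   13
12     1100          0011                   3
13     1101          1011                   11
14     1110          0111                   7
15     1111          1111                   15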
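The bit-reversed steps in Table 5–3 can be generated with a few lines of Python. This sketch only reverses a bit pattern; it does not model the reverse-carry addition performed by the address hardware, and the function name is hypothetical.

```python
def bit_reverse(value, width=4):
    """Reverse the low `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)   # move the current LSB to the top
        value >>= 1
    return result
```

Applying `bit_reverse` to steps 0 through 15 gives the right-hand column of the table.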
System and User Stack Management
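[Figure: system stack configuration. The stack extends from the bottom of the
stack toward high memory; SP points to the top of the stack, and the free
locations follow it.]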
5.5.2 Stacks
Stacks can be built from low to high memory or high to low memory. Two cases
for each type of stack are shown. Stacks can be built using the preincrement/
decrement and postincrement/decrement modes of modifying the auxiliary
registers (AR). Stack growth from high-to-low memory can be implemented in
two ways:
CASE 1: Stores to memory using *--ARn to push data onto the stack and
reads from memory using *ARn++ to pop data off the stack.
CASE 2: Stores to memory using *ARn-- to push data onto the stack and
reads from memory using *++ARn to pop data off the stack.
Figure 5–12 illustrates these two cases. The only difference is that in case 1,
the AR always points to the top of the stack, and in case 2, the AR always points
to the next free location on the stack.
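[Figure 5–12: high-to-low memory stacks. In case 1, ARn points to the top of
the stack, with the free locations toward low memory; in case 2, ARn points to
the free location next to the top of the stack.]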
Stack growth from low-to-high memory can be implemented in two ways:
CASE 3: Stores to memory using *++ARn to push data onto the stack and
reads from memory using *ARn-- to pop data off the stack.
CASE 4: Stores to memory using *ARn++ to push data onto the stack and
reads from memory using *--ARn to pop data off the stack.
Figure 5–13 shows these two cases. In case 3, the AR always points to the top
of the stack. In case 4, the AR always points to the next free location on the
stack.
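The four stack cases differ only in the direction of growth and in whether the auxiliary register is modified before or after the access. The Python model below is a sketch of that idea; the class name is hypothetical, and a dictionary stands in for memory.

```python
class ArStack:
    """Model of a user stack addressed through one auxiliary register (ar)."""

    # case number -> (modify ar before the push store?, growth direction)
    _CASES = {1: (True, -1), 2: (False, -1), 3: (True, +1), 4: (False, +1)}

    def __init__(self, case, ar):
        self.pre_modify, self.step = self._CASES[case]
        self.ar = ar
        self.memory = {}

    def push(self, value):
        if self.pre_modify:            # cases 1 and 3: *--ARn or *++ARn
            self.ar += self.step
            self.memory[self.ar] = value
        else:                          # cases 2 and 4: *ARn-- or *ARn++
            self.memory[self.ar] = value
            self.ar += self.step

    def pop(self):
        if self.pre_modify:            # read first, then step back
            value = self.memory[self.ar]
            self.ar -= self.step
        else:                          # step back first, then read
            self.ar -= self.step
            value = self.memory[self.ar]
        return value
```

In cases 1 and 3, `ar` addresses the most recently pushed value; in cases 2 and 4, it addresses the next free location.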
[Figure 5–13: low-to-high memory stacks, case 3 and case 4.]
5.5.3 Queues
A queue is like a FIFO. The implementation of queues is based on the manipu-
lation of auxiliary registers. Two auxiliary registers are used: one to mark the
front of the queue from which data is popped (or dequeued) and the other to
mark the rear of the queue where data is pushed. With proper management
of the auxiliary registers, the queue can also be circular. (A queue is circular
when the rear pointer is allowed to point to the beginning of the queue memory
after it has pointed to the end of the queue memory.)
Chapter 6
Repeat Modes
RPTB and RPTS are four-cycle instructions. These four cycles of overhead
occur during the initial execution of the loop. All subsequent executions of the
loop have no overhead (zero cycle).
Three registers (RS, RE, and RC) control how the program counter (PC) is
updated in a repeat mode. Table 6–1 describes these registers.
Register   Function
RS         Repeat Start Address Register. Holds the address of the first
           instruction of the block of code to be repeated.
RE         Repeat End Address Register. Holds the address of the last
           instruction of the block of code to be repeated.
RC         Repeat Count Register. Contains one less than the number of times
           the block remains to be repeated. For example, to execute a block
           N times, load N–1 into RC.
For correct operation of the repeat modes, you must correctly initialize all of
the above-mentioned registers.
- RM bit. The repeat-mode flag (RM) bit in the status register specifies
  whether the processor is running in the repeat mode.
  - RM = 0 indicates standard instruction fetching mode.
  - RM = 1 indicates repeat-mode instruction fetches.
The repeat counter should be loaded with a value one less than the number
of times to execute the block; for example, an RC value of 4 would execute the
block five times. The detailed algorithm for the update of the PC is shown in
Example 6–1.
The number of times to repeat the block is the RC (repeat count) register value
plus one. Because the execution of RPTB does not load the RC, you must load
this register yourself. The RC register must be loaded before the RPTB instruc-
tion is executed. A typical setup of the block repeat operation is shown in
Example 6–2.
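The relationship between RC and the number of passes through the block can be illustrated with a simplified Python model of the PC update. It is a sketch under stated assumptions, not the algorithm of Example 6–1: status-register handling is omitted, and the function name is hypothetical.

```python
def run_block_repeat(rs, re, rc):
    """Return the sequence of PC values fetched for a repeated block."""
    if rs > re or rc < 0:
        raise ValueError("need RS <= RE and a nonnegative repeat count")
    trace, pc = [], rs
    while True:
        trace.append(pc)
        if pc == re:           # last instruction of the block was fetched
            if rc == 0:
                break          # no repeats remain; leave the repeat mode
            rc -= 1
            pc = rs            # jump back to the start of the block
        else:
            pc += 1
    return trace
```

With RC = 4, the trace contains the start address five times, which matches the rule that the block executes RC + 1 times.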
The RPTS instruction loads all registers and mode bits necessary for the oper-
ation of the single-instruction repeat mode. Step 1 loads the start address of
the block into RS. Step 2 loads the end address into the RE (end address of
the block). Since this is a repeat of a single instruction, the start address and
the end address are the same. Step 3 sets the status register to indicate the
repeat mode of operation. Step 4 indicates that this is the repeat single-instruc-
tion mode of operation. Step 5 loads src into RC.
- The last instruction in the block (or the only instruction in a block of
size 1) cannot be a Bcond, BR, DBcond, CALL, CALLcond, TRAPcond,
RETIcond, RETScond, IDLE, RPTB, or RPTS. Example 6–3 shows an in-
correctly placed standard branch.
- None of the last four instructions from the bottom of the block (or the only
instruction in a block of size 1) can be a BcondD, BRD, or DBcondD.
Example 6–4 shows an incorrectly placed delayed branch.
Standard branches empty the pipeline before performing the branch; this
guarantees correct management of the program counter and results in a
TMS320C3x branch taking four cycles. Included in this class are repeats,
calls, returns, and traps.
Delayed branches on the TMS320C3x do not empty the pipeline, but rather
guarantee that the next three instructions will execute before the program
counter is modified by the branch. The result is a branch that requires only a
single cycle, thus making the speed of the delayed branch very close to that
of the optimal block repeat modes of the TMS320C3x. However, unlike block
repeat modes, delayed branches may be used in situations other than looping.
Every delayed branch has a standard branch counterpart that is used when
a delayed branch cannot be used. The delayed branches of the TMS320C3x
are Bcond D, BRD, and DBcond D.
Conditional delayed branches use the conditions that exist at the end of the
instruction immediately preceding the delayed branch. They do not depend on
the instructions following the delayed branch. The condition flags are set by
a previous instruction only when the destination register is one of the exten-
ded-precision registers (R0–R7) or when one of the compare instructions
(CMPF, CMPF3, CMPI, CMPI3, TSTB, or TSTB3) is executed. Delayed
branches guarantee that the next three instructions will execute, regardless
of other pipeline conflicts.
When a delayed branch is fetched, it remains pending until the three subse-
quent instructions are executed. None of the three instructions that follow a
delayed branch can be any of the following (see Example 6–6):
Bcond DBcond D
Bcond D IDLE
BR RETIcond
BRD RETScond
CALL RPTB
CALLcond RPTS
DBcond TRAPcond
Delayed branches disable interrupts until the three instructions following the
delayed branch are completed. This is independent of whether the branch is
taken.
Delayed Branches
The CALL, CALLcond, and TRAPcond instructions store the value of the PC
on the stack before changing the PC’s contents. The stack thus provides a re-
turn using either the RETScond or RETIcond instruction.
- The CALL instruction places the next PC value on the stack and places
the src (source) operand into the PC. The src is a 24-bit immediate value.
Figure 6–1 shows CALL response timing.
- RETIcond returns from traps or calls like the RETScond (above) with the
addition that RETIcond also sets the GIE bit of the status register, which
enables all interrupts whose enabling bit is set to 1. Conditions are the
same as for the CALLcond instruction.
Calls, Traps, and Returns
Calls and traps accomplish the same functional task (that is, a subfunction is
called and executed, and control is then returned to the calling function). Traps
offer several advantages. Among them are the following:
- You can use traps to indirectly call functions. This is particularly beneficial
when a kernel of code contains the basic subfunctions to be used by appli-
cations. In this case, the functions in the kernel can be modified and relo-
cated without the need to recompile each application.
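[Figure 6–1: CALL response timing. Pipeline stages shown against the H1 and H3
clocks: fetch CALL, decode CALL, read CALL, execute CALL (store PC on stack),
fetch first subroutine instruction; the data bus carries the PC and then
Inst 1.]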
The interlocked operations use the two external flag pins, XF0 and XF1. XF0
must be configured as an output pin; XF1 is an input pin. When configured in
this manner, XF0 signals an interlock operation request, and XF1 acts as an
acknowledge signal for the requested interlocked operation. In this mode, XF0
and XF1 are treated as active-low signals.
The external timing for the interlocked loads and stores is the same as for stan-
dard loads and stores. The interlocked loads and stores may be extended like
standard accesses by using the appropriate ready signal (RDYint or XRDYint).
(RDYint and XRDYint are a combination of external ready input and software
wait states. Refer to Chapter 7, External Bus Operation, for more information
on ready generation.)
Interlocked Operations
The read/write operation is identical to any other read/write cycle except for
the special use of XF0 and XF1. The src operand for LDFI and LDII is always
a direct or indirect memory address. XF0 is set to 0 only if the src is located
off-chip; that is, STRB, MSTRB, or IOSTRB is active, or the src is one of the
on-chip peripherals. If on-chip memory is accessed, then XF0 is not asserted,
and the operation is as an LDF or LDI from internal memory.
As in the case for LDFI and LDII, the dst of STFI and STII affects XF0. If dst
is located off-chip (STRB, MSTRB, or IOSTRB is active) or the dst is one of
the on-chip peripherals, XF0 is set to 1. If on-chip memory is accessed, then
XF0 is not asserted and the operations are as an STF or STI to internal
memory.
While the LDFI, LDII, and SIGI instructions are waiting for XF1 to be set to 0,
you can interrupt them. LDFI and LDII require a ready signal (RDYint or
XRDYint) in order to be interrupted. Because interrupts are taken on bus cycle
boundaries (see Section 6.6), an interrupt may be taken any time after a valid
ready. This allows you to implement protection mechanisms against deadlock
conditions by interrupting an interlocked load that has taken too long. Upon re-
turn from the interrupt, the next instruction is executed. The STFI and STII
instructions are not interruptible. Since the STFI and STII instructions com-
plete when ready is signaled, the delay until an interrupt can occur is the same
as for any other instruction.
Example 6–8 shows how a location COUNT may contain a count of the num-
ber of times a particular operation needs to be performed. This operation may
be performed by any processor in the system. If the count is 0, the processor
waits until it is nonzero before beginning processing. The example also shows
the algorithm for modifying COUNT correctly.
Figure 6–2 illustrates multiple TMS320C3xs sharing global memory and using
the interlocked instructions as in Example 6–9, Example 6–10, and
Example 6–11.
[Figure 6–2. Multiple TMS320C3xs, each with its own local memory, connected
through arbitration logic and ADDR, CTRL, and DATA lines to a global memory
that holds Lock, Count, or S.]
Indivisibility of V(S) and P(S) means that when these processes access and
modify the semaphore S, they are the only processes accessing and modify-
ing S.
To enter a critical section, a P operation is performed on a common sema-
phore, say S (S is initialized to 1). The first processor performing P(S) will be
able to enter its critical section. All other processors are blocked because S
has become 0. After leaving its critical section, the processor performs a V(S),
thus allowing another processor to execute P(S) successfully.
The TMS320C3x code for V(S) is shown in Example 6–9; code for P(S) is
shown in Example 6–10. Compare the code in Example 6–10 to the code in
Example 6–8.
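The indivisibility requirement can be sketched in software. The Python model below is not the TMS320C3x code of Example 6–9 or Example 6–10; it assumes that a threading.Lock can stand in for the XF0/XF1 interlock handshake, and the names SharedSemaphore, try_p, v, and run_workers are hypothetical.

```python
import threading
import time


class SharedSemaphore:
    """Software model of a semaphore S held in shared global memory."""

    def __init__(self, initial=1):
        self._value = initial
        # The lock stands in for the interlock handshake: it makes each
        # read-modify-write of S indivisible.
        self._interlock = threading.Lock()

    def try_p(self):
        """Indivisible test-and-decrement; True if the caller may enter."""
        with self._interlock:
            if self._value > 0:
                self._value -= 1
                return True
            return False

    def v(self):
        """Indivisible increment; lets another processor pass P(S)."""
        with self._interlock:
            self._value += 1


def run_workers(n_workers, n_iterations):
    """Each worker enters the critical section n_iterations times."""
    sem = SharedSemaphore(initial=1)
    total = [0]

    def worker():
        for _ in range(n_iterations):
            while not sem.try_p():
                time.sleep(0)  # blocked: S is 0, so yield and retry
            total[0] += 1      # critical section
            sem.v()

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

With S initialized to 1, only one worker at a time holds the critical section, so no increment of the shared total is lost.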
Processor #1 runs until it executes the SIGI. It then waits until processor #2
executes a SIGI. At this point, the two processors have synchronized and con-
tinue execution.
At powerup, the state of the TMS320C3x processor is undefined. You can use
the RESET signal to place the processor in a known state. This signal must
be asserted low for ten or more H1 clock cycles to guarantee a system reset.
H1 is an output clock signal generated by the TMS320C3x (see Chapter 13
for more information).
Reset affects the other pins on the device in either a synchronous or asynchro-
nous manner. The synchronous reset is gated by the TMS320C3x’s internal
clocks. The asynchronous reset directly affects the pins and is faster than the
synchronous reset. Table 6–3 shows the state of the TMS320C3x’s pins after
RESET = 0. Each pin is described according to whether the pin is reset syn-
chronously or asynchronously.
Reset Operation
- The external bus control registers are reset. The reset values of the control
registers are described in Chapter 7.
- The reset vector is read from memory location 0h and loaded into the PC.
This vector contains the start address of the system reset routine.
Multiple TMS320C3xs driven by the same system clock may be reset and syn-
chronized. When the 1 to 0 transition of RESET occurs, the processor is placed
on a well-defined internal phase, and all of the TMS320C3xs will come up on
the same internal phase.
6.6 Interrupts
The TMS320C3x supports multiple internal and external interrupts, which can
be used for a variety of applications. This section discusses the operation of
these interrupts.
[Figure: interrupt logic. Signals shown: INTn, a chain of D flip-flops clocked
on H1, H3, and H1, the internal interrupt set signal, the internal interrupt
clear/acknowledge signal, RESET, interrupt flag (n), EINTn(CPU), GIE(CPU),
EINTn(DMA), GIE(DMA), and the output to the interrupt processor control
section.]
External interrupts are latched internally on the falling edge of H1 (see Chapter
13 for timing information). An external interrupt must be held low for at least
one H1/H3 cycle to be recognized by the TMS320C3x. Interrupts should be
held low for only one or two H1 falling edges. If the interrupt is held low for three
or more H1 falling edges, multiple interrupts may be recognized.
Table 6–4. Reset, Interrupt, and Trap-Vector Locations for the TMS320C30/TMS320C31
Microprocessor Mode
Address      Routine
00h          RESET
01h          INT0
02h          INT1
03h          INT2
04h          INT3
05h          XINT0
06h          RINT0
07h          XINT1†
08h          RINT1†
09h          TINT0
0Ah          TINT1
0Bh          DINT
0Ch–1Fh      Reserved
20h          TRAP 0
•            •
3Bh          TRAP 27
Table 6–5. Reset, Interrupt, and Trap-Vector Locations for the TMS320C31 Microcomputer
Boot Mode
Address Description
809FC1 INT0
809FC2 INT1
809FC3 INT2
809FC4 INT3
809FC5 XINT0
809FC6 RINT0
809FC7 Reserved
809FC8 Reserved
809FC9 TINT0
809FCA TINT1
809FCB DINT0
809FCC–809FDF Reserved
809FE0 TRAP0
809FE1 TRAP1
• •
• •
• •
809FFB TRAP27
809FFC–809FFF Reserved
When two interrupts occur in the same clock cycle or when two previously
received interrupts are waiting to be serviced, one interrupt will be serviced
before the other. The CPU handles this prioritization by servicing the interrupt
with the highest priority first. Table 6–6 shows the priorities assigned to the
reset and interrupt vectors.
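The selection step can be sketched as follows. Table 6–6 is not reproduced in this section, so this Python sketch assumes that priority follows the vector order of Table 6–4 (INT0 first, DINT last); that ordering, and the names PRIORITY_ORDER and next_interrupt, are assumptions made for illustration.

```python
# Assumed priority order, highest first, following the vector order of
# Table 6-4. Table 6-6 is the authoritative source for the real order.
PRIORITY_ORDER = [
    "INT0", "INT1", "INT2", "INT3", "XINT0", "RINT0",
    "XINT1", "RINT1", "TINT0", "TINT1", "DINT",
]


def next_interrupt(pending):
    """Return the pending interrupt to service first, or None if none."""
    for name in PRIORITY_ORDER:
        if name in pending:
            return name
    return None
```

If TINT0 and INT2 are pending together, the sketch selects INT2 and leaves TINT0 waiting.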
The CPU controls all prioritization of interrupts (see Table 6–6 for reset and in-
terrupt vector locations and priorities).
The CPU global interrupt enable bit (GIE) located in the CPU status regis-
ter (ST) controls all maskable CPU interrupts. When this bit is set to 1, the
CPU responds to an enabled interrupt. When this bit is cleared to 0, all
CPU interrupts are disabled. Refer to subsection 3.1.7 on page 3-4 for
more information.
The interrupt flag register bits may be read and written under software control.
Writing a 1 to an IF register bit sets the associated interrupt flag to 1. Similarly,
writing a 0 resets the corresponding interrupt flag to 0. In this way, all interrupts
may be triggered and/or cleared through software. Since the interrupt flags
may be read, the interrupt pins may be polled in software when an interrupt-dri-
ven interface is not required.
Internal interrupts operate in a similar manner. In the IF register, the bit corre-
sponding to an internal interrupt may be read and written through software.
Writing a 1 sets the interrupt latch; writing a 0 clears it. All internal interrupts
are one H1/H3 cycle in length.
The CPU global interrupt enable bit (GIE), located in the CPU status register
(ST), controls all CPU interrupts. All DMA interrupts are controlled by the DMA
global interrupt enable bit, which is not dependent on ST(GIE) and is local to
the DMA. The DMA global interrupt enable bit is dependent, in part, on the
state of the DMA SYNC bits. It is not directly accessible through software (see
Chapter 8). The AND of the interrupt flag bit and the interrupt enables is then
connected to the interrupt processor.
The ’C3x allows the CPU and DMA coprocessor to respond to and process in-
terrupts in parallel. Figure 6–5 on page 6-28 shows interrupt processing flow;
for exact sequence, refer to Table 6–7 on page 6-29.
[Figure 6–5. Interrupt processing flow: when an enabled interrupt is set, it is
handled either as a CPU interrupt (disable interrupts, clear interrupt flag,
GIE ← 0, PC ← interrupt vector) or as a DMA interrupt.]
Cycle  Description                                              Fetch  Decode     Read    Execute
2      Temporarily disable interrupt until GIE is cleared.      —      interrupt  prog a  prog a–1
4      Clear interrupt flag; clear GIE bit; store return
       address to stack.                                        —      —          —       interrupt
8      Execute first instruction of interrupt service routine.  isr4   isr3       isr2    isr1
In the CPU interrupt processing cycle (left side of Figure 6–5), the correspond-
ing interrupt flag in the IF register is cleared, and interrupts are globally dis-
abled (GIE = 0). The CPU completes all fetched instructions. The current PC
is pushed to the top of the stack. The interrupt vector is fetched and loaded into
the PC, and the CPU starts executing the first instruction in the interrupt ser-
vice routine (ISR).
If you wish to make the interrupt service routine interruptible, you can set the
GIE bit to 1 after entering the ISR.
The DMA interrupt processing cycle (right side of Figure 6–5) is similar to that
of the CPU. After the pertinent interrupt flag is cleared, the DMA coprocessor
proceeds according to the status of the SYNC bits in the DMA coprocessor
global control register.
- Interrupts are disabled during an RPTS and during a delayed branch (until
the three instructions following a delayed branch are completed). Inter-
rupts are held until after the branch.
  - If the interrupt occurs in the first cycle of the fetch of an instruction, the
    fetched instruction is discarded (not executed), and the address of
    that instruction is pushed to the top of the system stack.
  - If the interrupt occurs after the first cycle of the fetch (in the case of a
    multicycle fetch due to wait states), that instruction is executed, and the
    address of the next instruction to be fetched is pushed to the top of the
    system stack.
CPU interrupt latency, defined as the time from the acknowledgement of the
interrupt to the execution of the first interrupt service routine (ISR) instruction,
is at least eight cycles. This is explained in Table 6–7 on page 6-29, where the
interrupt is treated as an instruction. It is assumed that all of the instructions
are single-cycle instructions.
If the DMA is not using interrupts for synchronization of transfers, it will not be
affected by the processing of the CPU interrupts. Detected interrupts are re-
sponded to by the CPU and DMA on instruction fetch boundaries only. Since
instruction fetches are halted due to pipeline conflicts or when executing
instructions in an RPTS loop, interrupts will not be responded to until instruc-
tion fetching continues. It is therefore possible to interrupt the CPU and DMA
simultaneously with the same or different interrupts and, in effect, synchronize
their activities. For example, it may be necessary to cause a high-priority DMA
transfer that avoids bus conflicts with the CPU (that is, that makes the DMA
higher priority than the CPU). This may be accomplished by using an interrupt
that causes the CPU to trap to an interrupt routine that contains an IDLE
instruction. Then if the same interrupt is used to synchronize DMA transfers,
the DMA transfer counter can be used to generate an interrupt and thus return
control to the CPU following the DMA transfer.
Since the DMA and CPU share the same set of interrupt flags, the DMA may
clear an interrupt flag before the CPU can respond to it. For example, if the
CPU interrupts are disabled, the DMA can respond to interrupts and thus clear
the associated interrupt flags.
The GIE bit is set to 0 by an interrupt. This can cause a processing error if any
code following within two cycles of the interrupt recognition attempts to read
or modify the status register. For example, if the status register is being pushed
onto the stack, it will be stored incorrectly if an interrupt was acknowledged two
cycles before the store instruction.
In this case, the PUSH ST instruction will save the ST contents in memory, which includes
GIE = 0. Since the device is expected to have GIE = 1, the POP ST instruction
will put the wrong status register value into the ST.
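The failure can be traced step by step with a small model. This Python sketch is a simplified illustration rather than a pipeline-accurate simulation; it assumes GIE occupies bit 13 of ST, and the names GIE and push_pop_st are hypothetical.

```python
GIE = 1 << 13  # assumed position of the GIE bit in the status register


def push_pop_st(st, interrupt_before_push):
    """Model a PUSH ST ... POP ST pair.

    If an interrupt is recognized just before the PUSH, GIE is already
    cleared when ST is stored. Returns (saved_st, st_after_pop).
    """
    if interrupt_before_push:
        st &= ~GIE      # interrupt recognition clears GIE
    saved = st          # PUSH ST stores whatever ST holds at this point
    if interrupt_before_push:
        st |= GIE       # the ISR runs and RETI sets GIE again
    st = saved          # POP ST restores the stored value
    return saved, st
```

With the interrupt recognized before the push, the value restored by POP ST has GIE = 0 even though the program started with GIE = 1.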
A similar situation may occur if the GIE bit = 1 and an instruction executes that
is intended to modify the other status bits and leave the GIE bit set. In the
above example, this erroneous setting would occur if the interrupt were recog-
nized two cycles before the POP ST instruction. In that case, the interrupt
would clear the GIE bit, but the execution of the POP instruction would set the
GIE bit. Since the interrupt has been recognized, the interrupt service routine
will be entered with interrupts enabled, rather than disabled as expected.
One solution is to use traps. For example, you can use TRAP 0 to reset GIE
and use TRAP 1 to set GIE. This is accomplished by making TRAP 0 and
TRAP 1 be the instructions RETS and RETI, respectively.
- The status register global interrupt enable (GIE) bit may be erroneously
reset to 0 (disabled setting) if all of the following conditions are true:
  - A conditional trap instruction (TRAPcond) has been fetched,
  - The condition for the trap is false, and
  - A pipeline conflict has occurred, resulting in a delay in the decode or
    read phases of the instruction.
During the decode phase of a conditional trap, interrupts are temporarily
disabled to ensure that the trap will execute before a subsequent interrupt.
If a pipeline conflict occurs and causes a delay in execution of the condi-
tional trap, the interrupt disabled condition may become the last known
condition of the GIE bit. In the case that the trap condition is false, inter-
rupts will be permanently disabled until the GIE bit is intentionally set. The
condition does not present itself when the trap condition is true, because
normal operation of the instruction causes the GIE to be reset, and stan-
dard coding practice will set the GIE to 1 before the trap routine is exited.
Several instruction sequences that can cause pipeline conflicts have been
found:
  - LDI mem,SP
    TRAPcond n
  - LDI mem,SP
    NOP
    TRAPcond n
  - STI SP,mem
    TRAPcond n
  - STI Rx,*ARy
    LDI *ARx,Ry
    ||LDI *ARz,Rw
    TRAPcond n
Other similar conditions may also cause a delay in the execution. There-
fore, the following solution is recommended to avoid or rectify the problem.
Insert two NOP instructions immediately prior to the TRAPcond instruc-
tion. One NOP is insufficient in some cases, as illustrated in the second
bulleted item, above. This eliminates the opportunity for any pipeline con-
flicts in the immediately preceding instructions and enables the conditional
trap instruction to execute without delays.
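The recommended workaround can be applied mechanically to a source listing. The Python sketch below is a hypothetical preprocessing helper, not part of any TI tool; it assumes one instruction per line with no label field, and it treats every mnemonic beginning with TRAP as a conditional trap. The name pad_conditional_traps is an assumption.

```python
def pad_conditional_traps(lines):
    """Insert two NOPs immediately before every TRAPcond instruction.

    Assumes each line starts with the mnemonic (no label field).
    """
    padded = []
    for line in lines:
        fields = line.split()
        if fields and fields[0].upper().startswith("TRAP"):
            padded.extend(["NOP", "NOP"])
        padded.append(line)
    return padded
```

Applied to the first sequence listed above, the helper places two NOPs between the LDI to SP and the trap.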
- Asynchronous accesses to the interrupt flag register (IF) can cause the
TMS320C3x to fail to recognize and service an interrupt. This may occur
when an interrupt is generated and is ready to be latched into the IF regis-
ter on the same cycle that the IF is being written to by the CPU. Note that
logic operations (AND, OR, XOR) may write to the IF register.
The logic currently gives the CPU write priority; consequently, the as-
serted interrupt might be lost. This is particularly true if the asserted inter-
rupt has been generated internally (for example, a direct memory access
(DMA) interrupt). This situation can arise as a result of a decision to poll
certain interrupts or a desire to clear pending interrupts due to a long pulse
width. In the case of a long pulse width, the interrupt may be generated
after the CPU responds to the interrupt and attempts to automatically clear
it by the interrupt vector process.
The recommended solution is not to use the interrupt polling technique but
to design the external interrupt inputs to have pulse widths of between 1
and 2 instruction cycles. The alternative to strict polling is to periodically
enable and disable the interrupts that would be polled, thereby allowing
the normal interrupt vectoring to take place; that automatically clears the
interrupt flag without affecting other interrupts. If you need to clear a pend-
ing interrupt, it is recommended that you use a memory location to indicate
that the interrupt is invalid. Then the interrupt service routine can read that
location, clear it (if the pending interrupt is invalid), and return immediately.
The following code fragments show how a dummy interrupt due to a long
interrupt pulse might be handled:
ISR_n:        PUSH  ST              ;
              PUSH  DP              ; Save registers
              PUSH  R0              ;
              LDI   0, DP           ; Clear Data Page Pointer
ISR_n_START:  .
              .                     ; Normal interrupt service routine
              .                     ; Code goes here
              LDI   INT_Fn, R0      ;
              AND   IF, R0          ; If no IF bits selected by
              BZ    ISR_n_END       ; INT_Fn are set, exit ISR
              LDI   0, DP           ; Otherwise clear
              LDI   0FFFFh, R0      ; DP and set
              STI   R0, @DUMMY_INT  ; DUMMY_INT negative & exit
ISR_n_END:
              POP   R0              ;
              POP   DP              ; Exit ISR
              POP   ST              ;
              RETI                  ;
The CPU controls all prioritization of interrupts (see Table 6–8 for reset and in-
terrupt vector locations and priorities). If the DMA is not using interrupts for
synchronization of transfers, it will not be affected by the processing of the
CPU interrupts. Detected interrupts are responded to by the CPU and DMA
on instruction fetch boundaries only. If instruction fetches are halted due to
pipeline conflicts or when executing instructions in an RPTS loop, interrupts
will not be responded to until instruction fetching continues. It is therefore pos-
sible to interrupt the CPU and DMA simultaneously with the same or different
interrupts and, in effect, synchronize their activities. For example, it may be
necessary to cause a high-priority DMA transfer that avoids bus conflicts with
the CPU, that is, make the DMA higher priority than the CPU. This may be ac-
complished by using an interrupt that causes the CPU to trap to an interrupt
routine that contains an IDLE instruction. Then if the same interrupt is used to
synchronize DMA transfers, the DMA transfer counter can be used to generate
an interrupt, thereby returning control to the CPU following the DMA transfer.
Since the DMA and CPU share the same set of interrupt flags, the DMA can
clear an interrupt flag before the CPU can respond to it. For example, if the
CPU interrupts are disabled, the DMA can respond to interrupts and thus clear
the associated interrupt flags.
The TMS320LC31 CPU has been enhanced by the addition of two power man-
agement modes:
- IDLE2, and
- LOPOWER.
6.7.1 IDLE2
The H1 instruction clock is held high until one of the four external interrupts is
asserted. In IDLE2 mode, the TMS320C31 behaves as follows:
- The CPU, peripherals, and internal memory retain their previous states.
- When the device is in the functional (non-emulation) mode, the clocks stop
with H1 high and H3 low (see Figure 6–6).
- The ’C31 will remain in IDLE2 until one of the four external interrupts
(INT3–INT0) is asserted for at least one H1 cycle. When one of the four
interrupts is asserted, the clocks start after a delay of one H1 cycle. When
the clocks restart, they may be in the opposite phase (that is, H1 may be
high if H3 was high before the clocks were stopped; H3 may be high if H1
was previously high). The H1 and H3 clocks will remain 180° out of phase
with each other (see Figure 6–7).
- The instruction following the IDLE2 instruction will not be executed until
after the return from interrupt instruction (RETI) is executed.
- When the device is in emulation mode, the H1 and H3 clocks will continue
to run normally and the CPU will operate as if an IDLE instruction had been
executed. The clocks continue to run for correct operation of the emulator.
TMS320C31 Power Management Modes
Delayed Branch
For correct device operation, the three instructions after a delayed
branch should not be IDLE or IDLE2 instructions.
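[Figure 6–6. IDLE2 execution timing: CLKIN, H3, H1, ADDR, and Data.]
[Figure 6–7. Clock restart after an interrupt: CLKIN, H3, H1, INT3 to INT0,
the INT3 to INT0 flag, and Data, from the point where the clocks are driven,
through the interrupt vector read, to the fetch of the first instruction of
the service routine.]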
6.7.2 LOPOWER
In the LOPOWER (low power) mode, the CPU continues to execute instructions,
and the DMA can continue to perform transfers, but at a reduced clock rate of
(CLKIN frequency)/16.
A TMS320C31 with a CLKIN frequency of 32 MHz will perform identically to
a 2 MHz TMS320C31 with an instruction cycle time of 1,000 ns.
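The arithmetic behind this statement can be checked with a few lines of Python. The sketch assumes, consistent with the 2 MHz and 1,000 ns figures above, that one instruction cycle spans two CLKIN periods; the function name lopower_timing is hypothetical.

```python
def lopower_timing(clkin_hz):
    """Return (effective CLKIN in Hz, instruction cycle time in ns).

    Assumes one instruction cycle equals two periods of the effective
    clock, and that LOPOWER divides CLKIN by 16.
    """
    effective_clkin_hz = clkin_hz / 16
    cycle_ns = 2 * 16 * 1_000_000_000 / clkin_hz
    return effective_clkin_hz, cycle_ns
```

For a 32 MHz CLKIN, the sketch yields an effective clock of 2 MHz and a 1,000 ns instruction cycle, matching the text.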
[Figures: LOPOWER timing and MAXSPEED timing, each showing CLKIN, H3, and H1
with a 32 CLKIN interval marked.]
Chapter 7
Memories and external peripheral devices are accessible through two external
interfaces on the TMS320C30:
- the primary bus, and
- the expansion bus.
On the TMS320C31, one bus, the primary bus, is available to access external
memories and peripheral devices. You can control wait-state generation, per-
mitting access to slower memories and peripherals, by manipulating
memory-mapped control registers associated with the interfaces and by using
an external input signal.
External Interface Control Registers
Bits     Field     Access
31–13    xx
12–8     BNKCMP    R/W
7–5      WTCNT     R/W
4–3      SWW       R/W
2        HIZ       R/W
1        NOHOLD    R/W
0        HOLDST    R
1 NOHOLD 0 Port hold signal. NOHOLD allows or disallows the port to be held by an
external HOLD signal. When NOHOLD = 1, the TMS320C3x takes over
the external bus and controls it, regardless of serviced or pending re-
quests by external devices. No hold acknowledge (HOLDA) is asserted
when a HOLD is received. However, it is asserted if an internal hold is
generated (HIZ = 1). NOHOLD is set to 0 at reset.
2 HIZ 0 Internal hold. When set (HIZ = 1), the port is put in hold mode. This is
equivalent to the external HOLD signal. By forcing a high-impedance
condition, the TMS320C3x can relinquish the external memory port
through software. HOLDA goes low when the port is placed in the
high-impedance state. HIZ is set to 0 at reset.
4–3 SWW 11 Software wait mode. In conjunction with WTCNT, this two-bit field de-
fines the mode of wait-state generation. It is set to 1 1 at reset.
7–5 WTCNT 111 Software wait mode. This three-bit field specifies the number of cycles
to use when in software wait mode for the generation of internal wait
states. The range is 0 (WTCNT = 0 0 0) to 7 (WTCNT = 1 1 1) H1/H3
cycles. It is set to 1 1 1 at reset.
12–8 BNKCMP 10000 Bank compare. This five-bit field specifies the number of MSBs of the
address to be used to define the bank size. It is set to 1 0 0 0 0 at reset.
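The field positions listed above can be exercised with a small decoder. This Python sketch is illustrative only; the function name decode_primary_bus_control is hypothetical, and the example value 10F8h is simply the reset values of the fields packed together, with HOLDST taken as 0.

```python
def decode_primary_bus_control(value):
    """Split a primary bus control register value into its fields."""
    return {
        "HOLDST": value & 0x1,           # bit 0 (read only)
        "NOHOLD": (value >> 1) & 0x1,    # bit 1
        "HIZ": (value >> 2) & 0x1,       # bit 2
        "SWW": (value >> 3) & 0x3,       # bits 4-3
        "WTCNT": (value >> 5) & 0x7,     # bits 7-5
        "BNKCMP": (value >> 8) & 0x1F,   # bits 12-8
    }
```

Decoding 10F8h gives BNKCMP = 16, WTCNT = 7, and SWW = 3, with HIZ and NOHOLD both 0, which matches the reset values in the field descriptions.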
Bits     Field     Access
31–8     xx
7–5      WTCNT     R/W
4–3      SWW       R/W
2–0      xx
4–3 SWW 11 Software wait-state generation. In conjunction with the WTCNT, this
two-bit field defines the mode of wait-state generation. It is set to 1 1
at reset.
7–5 WTCNT 111 Software wait mode. This three-bit field specifies the number of cycles
to use when in software wait mode for the generation of internal wait
states. The range is 0 (WTCNT = 0 0 0) to 7 (WTCNT = 1 1 1) H1/H3
clock cycles. It is set to 1 1 1 at reset.
The parallel buses implement three mutually exclusive address spaces distin-
guished through the use of three separate control signals: STRB, MSTRB, and
IOSTRB. The STRB signal controls accesses on the primary bus, and the
MSTRB and IOSTRB control accesses on the expansion bus. Since the two
buses are independent, you can make two accesses in parallel.
With the exception of bank switching and the external HOLD function (dis-
cussed later in this section), timing of primary bus cycles and MSTRB expan-
sion bus cycles are identical and are discussed collectively. The acronym
(M)STRB is used in references that pertain equally to STRB and MSTRB. Sim-
ilarly, (X)R/W, (X)A, (X)D, and (X)RDY are used to symbolize the equivalent
primary and expansion bus signals. The IOSTRB expansion bus cycles are
timed differently and are discussed independently.
The (M)STRB signal is low for the active portion of both reads and writes. The
active portion lasts one H1 cycle. Additionally, before and after the active por-
tion ((M)STRB low) of writes only, there is a transition cycle of H1. This transi-
tion cycle consists of the following sequence:
1) (M)STRB is high.
External Interface Timing
[Two timing diagrams, each showing H3, H1, (M)STRB, (X)R/W, (X)A, and (X)RDY.]
Figure 7–6 illustrates a read cycle with one wait state. Since (X)RDY = 1, the
read cycle is extended. (M)STRB, (X)R/W, and (X)A are also extended one
cycle. The next time (X)RDY is sampled, it is 0.
[Figure 7–6. Read cycle with one wait state: H3, H1, (M)STRB, (X)R/W, (X)A, and
(X)RDY, with the extra cycle marked.]
Figure 7–7 illustrates a write cycle with one wait state. Since initially (X)RDY =
1, the write cycle is extended. (M)STRB, (X)R/W, and (X)A are extended one
cycle. The next time (X)RDY is sampled, it is 0.
[Figure 7–7. Write cycle with one wait state: H3, H1, (M)STRB, (X)R/W, (X)A, and
(X)RDY, with the extra cycle marked.]
Figure 7–8 illustrates read and write cycles when IOSTRB is active and there
are no wait states. For IOSTRB accesses, reads and writes require a minimum
of two cycles. Some off-chip peripherals might change their status bits when
read or written to. Therefore, it is important to maintain valid addresses when
communicating with these peripherals. For reads and writes when IOSTRB is
active, IOSTRB is completely framed by the address.
[Figure 7–8. IOSTRB read and write cycles with no wait states: H3, H1, IOSTRB,
XR/W, XA, and XRDY.]
Figure 7–9 illustrates a read with one wait state when IOSTRB is active, and
Figure 7–10 illustrates a write with one wait state when IOSTRB is active. For
each wait state added, IOSTRB, XR/W, and XA are extended one clock cycle.
Writes hold the data on the bus one additional cycle. The sampling of XRDY
is repeated each cycle.
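[Figure 7–9. IOSTRB read with one wait state: H3, H1, IOSTRB, XR/W, XA,
XD (read), and XRDY, with the extra cycle marked.]
[Figure 7–10. IOSTRB write with one wait state: H3, H1, IOSTRB, XR/W, XA,
XD (write data), and XRDY, with the extra cycle marked.]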
Figure 7–11 through Figure 7–21 illustrate the various transitions between
memory reads and writes and I/O reads and writes over the expansion bus.
Figure 7–11. Memory Read and I/O Write for Expansion Bus
[Timing diagram: H3, H1, MSTRB, IOSTRB, XR/W, XRDY.]
Figure 7–12. Memory Read and I/O Read for Expansion Bus
[Timing diagram: H3, H1, MSTRB, IOSTRB, XR/W, XA (memory address, I/O address),
XD (read, read), XRDY.]
Figure 7–13. Memory Write and I/O Write for Expansion Bus
[Timing diagram: H3, H1, MSTRB, IOSTRB, XR/W, XRDY.]
Figure 7–14. Memory Write and I/O Read for Expansion Bus
[Timing diagram: H3, H1, MSTRB, IOSTRB, XR/W, XRDY.]
Figure 7–15. I/O Write and Memory Write for Expansion Bus
[Timing diagram: H3, H1, MSTRB, IOSTRB, XR/W, XRDY.]
Figure 7–16. I/O Write and Memory Read for Expansion Bus
[Timing diagram: H3, H1, MSTRB, IOSTRB, XR/W, XRDY.]
Figure 7–17. I/O Read and Memory Write for Expansion Bus
[Timing diagram: H3, H1, MSTRB, IOSTRB, XR/W, XRDY.]
Figure 7–18. I/O Read and Memory Read for Expansion Bus
[Timing diagram: H3, H1, MSTRB, IOSTRB, XR/W, XD (read, read), XRDY.]
Figure 7–19. I/O Write and I/O Read for Expansion Bus
[Timing diagram: H3, H1, MSTRB, IOSTRB, XR/W, XA, XRDY.]
Figure 7–20. I/O Write and I/O Write for Expansion Bus
[Timing diagram: H3, H1, MSTRB, IOSTRB, XR/W, XA, XRDY.]
Figure 7–21. I/O Read and I/O Read for Expansion Bus
[Timing diagram: H3, H1, MSTRB, IOSTRB, XR/W, XA, XD (read, read), XRDY.]
Figure 7–22 and Figure 7–23 illustrate the signal states when a bus is inactive
(after an IOSTRB or (M)STRB access, respectively). The strobes (STRB,
MSTRB, and IOSTRB) and (X)R/W go to 1. The address is undefined, and the
ready signal (XRDY or RDY) is ignored.
[Figure 7–22. Bus inactive after an IOSTRB access: H3, H1, IOSTRB, XR/W, XA,
and XD (write data).]
[Figure 7–23. Bus inactive after an (M)STRB access: H3, H1, (M)STRB, (X)R/W,
and (X)A.]
Figure 7–24 illustrates the timing for HOLD and HOLDA. HOLD is an external
asynchronous input. There is a minimum of one cycle delay from the time when
the processor recognizes HOLD = 0 until HOLDA = 0. When HOLDA = 0, the
address, data buses, and associated strobes are placed in a high-impedance
state. All accesses occurring over an interface are complete before a hold is
acknowledged.
[Figure 7–24. HOLD and HOLDA timing: H3, H1, HOLD, HOLDA, STRB, R/W, and
D (write data), with the bus-inactive interval marked.]
The four modes are used to generate the internal ready signal, RDYint, that
controls accesses. As long as RDYint = 1, the current external access is
delayed. When RDYint = 0, the current access completes. Since the use of
programmable wait states for both external interfaces is identical, only the pri-
mary bus interface is described in the following paragraphs.
Programmable Wait States
[Figure: 24-bit address divided into bits 23–8 and bits 7–0.]
Programmable Bank Switching
The TMS320C3x has an internal register that contains the MSBs (as defined
by the BNKCMP field) of the last address used for a read or write over the pri-
mary interface. At reset, the register bits are set to 0. If the MSBs of the address
being used for the current primary interface read do not match those contained
in this internal register, a read cycle is not asserted for one H1/H3 clock cycle.
During this extra clock cycle, the address bus switches over to the new ad-
dress, but STRB is inactive (high). The contents of the internal register are re-
placed with the MSBs being used for the current read of the current address.
If the MSBs of the address being used for the current read match the bits in
the register, a normal read cycle takes place.
If repeated reads are performed from the same memory bank, no extra cycles
are inserted. When a read is performed from a different memory bank, memory
conflicts are avoided by the insertion of an extra cycle. This feature can be dis-
abled by setting BNKCMP to 0. The insertion of the extra cycle occurs only
when a read is performed. The changing of the MSBs in the internal register
occurs for all reads and writes over the primary interface.
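The bank-switching rules above can be summarized in a short behavioral model. This Python sketch covers only the comparison logic described in the text, not bus timing; the class name BankSwitchModel and its method names are hypothetical, and the 24-bit address width is taken from the surrounding description.

```python
ADDRESS_BITS = 24  # width of the external address


class BankSwitchModel:
    """Tracks the MSBs of the last primary-interface address."""

    def __init__(self, bnkcmp):
        self.bnkcmp = bnkcmp  # number of MSBs compared; 0 disables
        self.last_msbs = 0    # internal register is 0 at reset

    def access(self, address, is_read):
        """Return the number of extra cycles (0 or 1) for this access."""
        if self.bnkcmp:
            msbs = address >> (ADDRESS_BITS - self.bnkcmp)
        else:
            msbs = 0  # nothing is compared, so no bank change is seen
        # An extra cycle is inserted only for reads that change bank.
        extra = 1 if (is_read and msbs != self.last_msbs) else 0
        # The register is updated on reads and writes alike.
        self.last_msbs = msbs
        return extra
```

With BNKCMP = 8, a read at 010000h following a read at 000100h costs one extra cycle, and a further read at 010004h costs none. A write to a new bank adds no cycle but still updates the register, so a following read in that bank proceeds normally.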
Figure 7–26 illustrates the addition of an inactive cycle when switches be-
tween banks of memory occur.
[Figure 7–26. Bank-switching timing: H3, H1, STRB, R/W, and RDY, with the
extra cycle marked.]
Peripherals
The TMS320C3x features two timers, two serial ports (one on the
TMS320C31), and an on-chip direct memory access (DMA) controller. These
peripheral modules are controlled through memory-mapped registers located
on the dedicated peripheral bus.
8.1 Timers
The TMS320C3x timer modules are general-purpose, 32-bit, timer/event
counters, with two signaling modes and internal or external clocking (see
Figure 8–1). You can use the timer modules to signal to the TMS320C3x or the
external world at specified intervals or to count external events. With an inter-
nal clock, you can use the timer to signal an external A/D converter to start a
conversion, or it can interrupt the TMS320C3x DMA controller to begin a data
transfer. The timer interrupt is one of the internal interrupts. With an external
clock, the timer can count external events and interrupt the CPU after a speci-
fied number of events. Each timer has an I/O pin that you can use as an input
clock to the timer, an output clock signal, or a general-purpose I/O pin.
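[Figure 8–1. Timer block diagram: 32-bit period and counter values feed a
comparator (period = counter?), which drives the pulse generator; INV, TSTAT,
and the timer output are also shown.]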
- Global-Control Register
The global-control register determines the operating mode of the timer,
monitors the timer status, and controls the function of the I/O pin of the timer.
- Period Register
The period register specifies the timer’s signaling frequency.
- Counter Register
The counter register contains the current value of the incrementing count-
er. You can increment the timer on the rising edge or the falling edge of the
input clock. The counter is zeroed and can cause an internal interrupt
whenever its value equals that in the period register. The pulse generator
generates two types of external clock signals: pulse or clock. The memory
map for the timer modules is shown in Figure 8–2.
[Figure 8–2: memory map for the Timer 0 and Timer 1 modules]
Bit     15–12  11     10   9       8    7    6    5–4  3      2       1    0
Field   xx     TSTAT  INV  CLKSRC  C/P  HLD  GO   xx   DATIN  DATOUT  I/O  FUNC
Access         R      R/W  R/W     R/W  R/W  R/W       R      R/W     R/W  R/W
2 DATOUT 0 DATOUT drives TCLK when the TMS320C3x is in I/O port mode.
You can use DATOUT as an input to the timer.
6 GO 0 The GO bit resets and starts the timer counter. When GO = 1 and
the timer is not held, the counter is zeroed and begins increment-
ing on the next rising edge of the timer input clock. The GO bit is
cleared on the same rising edge. GO = 0 has no effect on the
timer.
7 HLD 0 Counter hold signal. When this bit is 0, the counter is disabled and
held in its current state. If the timer is driving TCLK, the state of
TCLK is also held. The internal divide-by-two counter is also held
so that the counter can continue where it left off when HLD is set to
1. You can read and modify the timer registers while the timer is
being held. RESET has priority over HLD. Table 8–2 shows the
effect of writing to GO and HLD.
† x = 0 or 1
10 INV 0 Inverter control bit. If an external clock source is used and INV = 1, the
external clock is inverted as it goes into the counter. If the output of the
pulse generator is routed to TCLK and INV = 1, the output is inverted
before it goes to TCLK (see Figure 8–1). If INV = 0, no inversion is
performed on the input or output of the timer. The INV bit has no effect,
regardless of its value, when TCLK is used in I/O port mode.
11 TSTAT 0 This bit indicates the status of the timer. It tracks the output of the
uninverted TCLK pin. This flag sets a CPU interrupt on a transition from
0 to 1. A write has no effect.
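The bit fields above can be packed into a register value with a few shifts. The following is a minimal sketch, not TI code: the helper names pack_timer_gcr and unpack_timer_gcr are hypothetical, and the bit positions are read from the register layout and bit descriptions in this section. TSTAT and DATIN are read-only in hardware, so packing them has no meaning on a real device.

```python
# Bit positions taken from the register layout and bit descriptions in this section.
TIMER_GCR_BITS = {
    "FUNC": 0, "I/O": 1, "DATOUT": 2, "DATIN": 3,
    "GO": 6, "HLD": 7, "C/P": 8, "CLKSRC": 9, "INV": 10, "TSTAT": 11,
}

def pack_timer_gcr(fields):
    """Combine {field name: 0 or 1} into a register value (hypothetical helper)."""
    value = 0
    for name, bit in fields.items():
        if bit not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1")
        value |= bit << TIMER_GCR_BITS[name]
    return value

def unpack_timer_gcr(value):
    """Split a register value back into its single-bit fields."""
    return {name: (value >> pos) & 1 for name, pos in TIMER_GCR_BITS.items()}

# Internal clock, pin used as a timer pin, not held, GO set.
start_value = pack_timer_gcr({"CLKSRC": 1, "HLD": 1, "GO": 1, "FUNC": 1})
```

With these settings the packed value is 2C1h, and unpacking it shows C/P = 0 (pulse mode).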
[Figure: timer timing, including panel (b) TSTAT and timer output (INV = 0) when C/P = 1 (clock mode). Labeled intervals: 1/f(CLKSRC), 2/f(H1), period register/f(CLKSRC), and 2 x period register/f(CLKSRC); TINT marks the timer interrupts.]
The rate of timer signaling is determined by the frequency of the timer input
clock and the period register. The following equations are valid with either an
internal or an external timer clock:
Table 8–2 shows the result of a write using specified values of the GO and HLD
bits in the global control register.
GO = 1, HLD = 0: All timer operations are held, including zeroing of the counter. The
GO bit is not cleared until the timer is taken out of hold.
Certain boundary conditions affect timer operation. These conditions are listed
below:
- When the period and counter registers are 0, the operation of the timer is
dependent upon the C/P mode selected. In pulse mode (C/P = 0), TSTAT
is set and remains set. In clock mode (C/P = 1), the width of the cycle is
2/f(H1), and the external clocks are ignored.
- When the counter register is not 0 and the period register = 0, the counter
will count, roll over to 0, and then behave as described above.
- When the counter register is set to a value greater than the period register,
the counter may overflow when being incremented. Once the counter
reaches its maximum 32-bit value (0FFFFFFFFh), it simply rolls over to
0 and continues.
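The rollover cases in the list above reduce to 32-bit modular arithmetic. The sketch below is an illustration under that assumption only; the function name increments_until_match is hypothetical, and the exact clock edge on which the hardware compares the counter with the period register is not modeled.

```python
COUNTER_MASK = 0xFFFFFFFF  # the counter register is 32 bits wide

def increments_until_match(counter, period):
    """Number of increments before the counter value equals the period value.

    Uses modulo-2**32 arithmetic, so a counter that starts above the
    period first rolls over 0FFFFFFFFh -> 0 and then counts up to it.
    """
    return (period - counter) & COUNTER_MASK
```

For example, a counter of 0 with a period of 5 needs 5 increments, while a counter of 10 with a period of 5 must wrap through 0FFFFFFFFh first.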
Writes from the peripheral bus override register updates from the counter and
new status updates to the control register.
Figure 8–6 provides some examples of the TCLKx output when the period reg-
ister is set to various values and clock or pulse mode is selected.
[Figure 8–6: TCLKx output waveforms for several period-register values in clock mode and pulse mode; the labeled intervals range from H1 to 12H1]
[Figure: TCLK pin in I/O port mode, showing the internal/external boundary. (a) I/O = 0: DATIN. (b) I/O = 1: DATOUT, TCLK, and DATIN.]
- If CLKSRC = 1 and FUNC = 1, the timer input comes from the internal
clock, and the timer output goes to TCLK. This value can be inverted using
INV, and you can read in DATIN the value output on TCLK.
- If CLKSRC = 0 and FUNC = 0, the timer is driven according to the status
of the I/O bit. If I/O = 0, the timer input comes from TCLK. This value can
be inverted using INV, and you can read in DATIN the value of TCLK. If I/O
= 1, TCLK is an output pin. Then, TCLK and the timer are both driven by
DATOUT. All 0-to-1 transitions of DATOUT increment the counter. INV has
no effect on DATOUT. You can read in DATIN the value of DATOUT.
- If CLKSRC = 0 and FUNC = 1, TCLK drives the timer. If INV = 0, all 0-to-1
transitions of TCLK increment the counter. If INV = 1, all 1-to-0 transitions
of TCLK increment the counter. You can read in DATIN the value of TCLK.
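The three cases listed above can be written as a small decision function. This is a sketch only: the name timer_clock_source is hypothetical, the combination CLKSRC = 1 with FUNC = 0 is not covered by the cases above and is therefore rejected, and the falling-edge result for CLKSRC = 0, FUNC = 0, I/O = 0 with INV = 1 is an interpretation of "this value can be inverted using INV".

```python
def timer_clock_source(clksrc, func, io=0, inv=0):
    """Return (clock source, counting edge) for the bit combinations listed above."""
    if clksrc == 1 and func == 1:
        # Internal clock drives the timer; the timer output goes to TCLK.
        return ("internal clock", None)
    if clksrc == 0 and func == 0:
        if io == 1:
            # TCLK is an output; DATOUT drives both TCLK and the timer.
            return ("DATOUT", "rising")  # INV has no effect on DATOUT
        return ("TCLK", "falling" if inv else "rising")
    if clksrc == 0 and func == 1:
        return ("TCLK", "falling" if inv else "rising")
    raise ValueError("combination not covered by the cases listed above")
```

For instance, CLKSRC = 0, FUNC = 1, INV = 1 counts 1-to-0 transitions of TCLK.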
Figure 8–4 on page 8-6 shows the four timer modes of operation.
In pulse mode (C/P = 0):

$$ f(\text{interrupt}) = \frac{f(\text{timer clock})}{\text{period register}} $$

In clock mode (C/P = 1):

$$ f(\text{interrupt}) = \frac{f(\text{timer clock})}{2 \times \text{period register}} $$

where f(interrupt) is the timer interrupt frequency and f(timer clock) is the timer input clock frequency.
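As a worked illustration of the two interrupt-frequency relations above, the sketch below computes the rate for either mode. The function name timer_interrupt_hz is hypothetical, and the timer clock frequency is passed in rather than derived, because how it relates to H1 or CLKIN depends on the clock source selected.

```python
def timer_interrupt_hz(f_timer_clock_hz, period, clock_mode):
    """Interrupt rate: f/period in pulse mode, f/(2 x period) in clock mode."""
    if period < 1:
        # A period of 0 is a boundary condition handled separately in the text.
        raise ValueError("period register must be at least 1 for this formula")
    divisor = 2 * period if clock_mode else period
    return f_timer_clock_hz / divisor
```

With an 8-MHz timer clock and a period register of 0B5h (181) in pulse mode, the result is about 44.2 kHz; the same clock with a period of 4 in clock mode gives 1 MHz.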
1) Halt the timer by clearing the GO/HLD bits of the timer global-control regis-
ter. To do this, write a 0 to the timer global-control register. Note that the
timers are halted on RESET.
2) Configure the timer via the timer global-control register (with GO = HLD
= 0 ), the timer counter register, and timer period register, if necessary.
3) Start the timer by setting the GO/HLD bits of the timer global-control
register.
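The three steps above can be sketched as an ordered list of register writes. This is an illustration, not a driver: timer_init_writes is a hypothetical helper, registers are identified by name rather than address, and the order of the counter and period writes within step 2 is an assumption.

```python
GO = 1 << 6   # bit 6 of the timer global-control register
HLD = 1 << 7  # bit 7 of the timer global-control register

def timer_init_writes(mode_bits, period, counter=0):
    """Return the (register, value) writes for the halt/configure/start steps."""
    config = mode_bits & ~(GO | HLD)  # step 2 is done with GO = HLD = 0
    return [
        ("global_control", 0),                  # 1) halt the timer
        ("global_control", config),             # 2) configure the mode...
        ("counter", counter),                   #    ...the counter...
        ("period", period),                     #    ...and the period
        ("global_control", config | GO | HLD),  # 3) start the timer
    ]
```

Calling timer_init_writes(0x2C1, 1) ends with a write of 2C1h to the global-control register, after an intermediate write of 201h with GO and HLD cleared.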
Serial Ports
The global-control register controls the global functions of the serial port and
determines the serial-port operating mode. Two port control registers control
the functions of the six serial port pins. The transmit buffer contains the next
complete word to be transmitted. The receive buffer contains the last complete
word received. Three additional registers are associated with the transmit/re-
ceive sections of the serial-port timer. A serial-port block diagram is shown in
Figure 8–8 on page 8-14, and the memory map of the serial ports is shown in
Figure 8–9 on page 8-15.
[Figure 8–8: serial-port block diagram: 16-bit receive and transmit timers (with CLKR/CLKX and TSTAT), the 32-bit shift registers RSR and XSR, and the DX pin]
1 XRDY 1 If XRDY = 1, the transmit buffer has written the last bit of data to the shifter
and is ready for a new word. A three H1/H3 cycle delay occurs from the
loading of the transmit shifter until XRDY is set to 1. The rising edge of this
signal sets XINT. If XRDY = 0, the transmit buffer has not written the last
bit of data to the transmit shifter and is not ready for a new word. XRDY =
1 at reset.
2 FSXOUT 0 This bit configures the FSX pin as an input (FSXOUT = 0) or an output
(FSXOUT = 1).
8 XVAREN 0 This bit specifies fixed (XVAREN = 0) or variable (XVAREN = 1) data rate
signaling when transmitting. With a fixed data rate, FSX is active for at least
one XCLK cycle and then goes inactive before transmission begins. With
variable data rate, FSX is active while all bits are being transmitted. When
you use an external FSX and variable data rate signaling, the DX pin is driv-
en by the transmitter when FSX is held active or when a word is being
shifted out.
9 RVAREN 0 This bit specifies fixed (RVAREN = 0) or variable (RVAREN = 1) data rate
signaling when receiving. With a fixed data rate, FSR is active for at least
one RCLK cycle and then goes inactive before the reception begins. With
variable data rate, FSR is active while all bits are being received.
10 XFSM 0 Transmit frame sync mode. Configures the port for continuous mode operation
(XFSM = 1) or standard mode (XFSM = 0). In continuous mode, only
the first word of a block generates a sync pulse, and the rest are simply
transmitted continuously to the end of the block. In standard mode, each
word has an associated sync pulse.
11 RFSM 0 Receive frame sync mode. Configures the port for continuous mode
(RFSM =1) or standard mode (RFSM = 0) operation. In continuous mode,
only the first word of a block generates a sync pulse, and the rest are simply
received continuously without expectation of another sync pulse. In stan-
dard mode, each word received has an associated sync pulse.
19–18 XLEN 00 These two bits define the word length of serial data transmitted. All data
is assumed to be right-justified in the transmit buffer when fewer than 32
bits are specified.
0 0 --- 8 bits 1 0 --- 24 bits
0 1 --- 16 bits 1 1 --- 32 bits
21–20 RLEN 00 These two bits define the word length of serial data received. All data is
right-justified in the receive buffer.
0 0 --- 8 bits 1 0 --- 24 bits
0 1 --- 16 bits 1 1 --- 32 bits
22 XTINT 0 Transmit timer interrupt enable. If XTINT = 0, the transmit timer interrupt
is disabled. If XTINT = 1, the transmit timer interrupt is enabled.
24 RTINT 0 Receive timer interrupt enable. If RTINT = 0, the receive timer interrupt is
disabled. If RTINT = 1, the receive timer interrupt is enabled.
26 XRESET 0 Transmit reset. If XRESET = 0, the transmit side of the serial port is reset.
To take the transmit side of the serial port out of reset, set XRESET to 1.
However, do not set XRESET to 1 until at least three cycles after RESET
goes inactive. This applies only to system reset. Setting XRESET to 0 does
not change the contents of any of the serial-port control registers. It places
the transmitter in a state corresponding to the beginning of a frame of data.
Resetting the transmitter generates a transmit interrupt. Reset this bit dur-
ing the time the mode of the transmitter is set. You can toggle XFSM with-
out resetting the global-control register.
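To make the field descriptions above concrete, the following sketch decodes a global-control register value into the fields documented in this section. The function name decode_sp_global_control is hypothetical, and only fields whose bit positions appear above are decoded; other bits of the register are ignored.

```python
WORD_LENGTH_BITS = {0b00: 8, 0b01: 16, 0b10: 24, 0b11: 32}

def decode_sp_global_control(value):
    """Decode the serial-port global-control fields described in this section."""
    def bit(n):
        return (value >> n) & 1
    return {
        "XRDY": bit(1),
        "FSXOUT": bit(2),
        "XVAREN": bit(8),
        "RVAREN": bit(9),
        "XFSM": bit(10),
        "RFSM": bit(11),
        "XLEN_bits": WORD_LENGTH_BITS[(value >> 18) & 0b11],
        "RLEN_bits": WORD_LENGTH_BITS[(value >> 20) & 0b11],
        "XTINT": bit(22),
        "RTINT": bit(24),
        "XRESET": bit(26),
    }
```

Decoding 0EBC0040h yields 32-bit transmit and receive words in standard mode with the transmitter out of reset; decoding 0E970300h yields 16-bit words with variable data-rate signaling in both directions.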
Bit  Field      Access
15   DRP        R/W
14   DXP        R/W
13   CLKRP      R/W
12   CLKXP      R/W
11   RFSM       R/W
10   XFSM       R/W
9    RVAREN     R/W
8    XVAREN     R/W
7    RCLK SRCE  R/W
6    XCLK SRCE  R/W
5    HS         R/W
4    RSR FULL   R
3    XSR EMPTY  R
2    FSXOUT     R/W
1    XRDY       R
0    RRDY       R
Bit     15–12  11     10      9     8     7      6       5     4     3      2       1     0
Pin            FSX    FSX     FSX   FSX   DX     DX      DX    DX    CLKX   CLKX    CLKX  CLKX
Field   xx     DATIN  DATOUT  I/O   FUNC  DATIN  DATOUT  I/O   FUNC  DATIN  DATOUT  I/O   FUNC
Access         R      R/W     R/W   R/W   R      R/W     R/W   R/W   R      R/W     R/W   R/W
This 32-bit port control register controls the function of the serial port
FSR, DR, and CLKR pins. At reset, all bits are set to 0. Table 8–5 defines the
register bits, the bit names, and functions. Figure 8–12 illustrates this port con-
trol register.
Bit     15–12  11     10      9     8     7      6       5     4     3      2       1     0
Pin            FSR    FSR     FSR   FSR   DR     DR      DR    DR    CLKR   CLKR    CLKR  CLKR
Field   xx     DATIN  DATOUT  I/O   FUNC  DATIN  DATOUT  I/O   FUNC  DATIN  DATOUT  I/O   FUNC
Access         R      R/W     R/W   R/W   R      R/W     R/W   R/W   R      R/W     R/W   R/W
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
xx xx xx xx RTSTAT xx RCLKSRC RC/P RHLD RGO XTSTAT xx XCLKSRC XC/P XHLD XGO
R R/W R/W R R/W R/W R R/W R/W R/W
15 0
Transmit Counter
15 0
Transmit Period
Data is shifted to the left (LSB to MSB). Figure 8–17 illustrates what happens
when words less than 32 bits are shifted into the serial port. In this figure, it is
assumed that an 8-bit word is being received and that the upper three bytes
of the receive buffer are originally undefined. In the first portion of the figure,
byte a has been shifted in. When byte b is shifted in, byte a is shifted to the left.
When the data receive register is read, both bytes a and b are read.
31 24 23 16 15 8 7 0
After Byte a X X X a
After Byte b X X a b
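The left-shifting behavior shown in Figure 8–17 can be modeled in a few lines. This is a sketch of the data movement only; the function name shift_in is hypothetical, and bit-by-bit clocking is collapsed into one step per word.

```python
def shift_in(buffer, word, nbits):
    """Shift an nbits-wide word into a 32-bit receive buffer from the right."""
    word &= (1 << nbits) - 1                      # keep only the received bits
    return ((buffer << nbits) | word) & 0xFFFFFFFF

# The upper bytes start out undefined; any starting value works here.
buf = 0x12345678
buf = shift_in(buf, 0xA1, 8)   # after byte a: X X X a
buf = shift_in(buf, 0xB2, 8)   # after byte b: X X a b
```

After both shifts, the low 16 bits of the buffer hold byte a followed by byte b (A1B2h), which is why a read returns both bytes.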
[Figure: CLKX pin configurations, showing XSR, DATOUT, DATIN, TSTAT, the timer, and INV on either side of the internal/external boundary. Panel (c): FUNC = 1 (serial-port mode), XCLKSRCE = 0 (input serial-port CLK), XCLKSRC = 0 (external CLK for timer).]
Transmit data is clocked out on the rising edge of the selected serial-port clock.
Receive data is latched into the receive shift register on the falling edge of the
serial-port clock. All data is transmitted and loaded MSB first and right-justi-
fied. If fewer than 32 bits are transferred, the data are right-justified in the 32-bit
transmit and receive buffers. Therefore, the LSBs of the transmit buffer are
the bits that are transmitted.
The transmit ready (XRDY) signal specifies that the data-transmit register
(DXR) is available to be loaded with new data. XRDY goes active as soon as
the data is loaded into the transmit shift register (XSR). The last word may still
be shifting out when XRDY goes active. If DXR is loaded before the last word
has completed transmission, the data bits transmitted are consecutive; that is,
the LSB of the first word immediately precedes the MSB of the second, with
all signaling valid as in two separate transmits. XRDY goes inactive when DXR
is loaded and remains inactive until the data is loaded into the shifter.
The receive ready (RRDY) signal is active as long as a new word of data is
loaded into the data receive register and has not been read. As soon as the
data is read, the RRDY bit is turned off.
An input FSX in the fixed data rate mode should go active for at least one serial
clock cycle and then inactive to initiate the data transfer. The transmitter then
sends the number of bits specified by the LEN bits. In the variable data-rate
mode, the transmitter begins sending from the time FSX goes active until the
number of specified bits has been shifted out. In the variable data-rate mode,
when the FSX status changes prior to all the data bits being shifted out, the
transmission completes, and the DX pin is placed in a high-impedance state.
An FSR input is exactly complementary to the FSX.
When using an external FSX, if DXR and XSR are empty, a write to DXR results
in a DXR-to-XSR transfer. This data is held in the XSR until an FSX occurs.
When the external FSX is received, the XSR begins shifting the data. If XSR
is waiting for the external FSX, a write to DXR will change DXR, but a DXR-to-
XSR transfer will not occur. XSR begins shifting when the external FSX is re-
ceived, or when it is reset using XRESET.
Similarly with FSR, the receiver continues shifting in new data and loading
DRR. If the data-receive buffer is not read before the next word is shifted in,
you will lose subsequent incoming data. You can use the RFSM bit to terminate
the receive-continuous mode.
Handshake Mode
The handshake mode (HS = 1) allows for direct connection between proces-
sors. In this mode, all data words are transmitted with a leading 1 (see
Figure 8–20). For example, if an eight-bit word is to be transmitted, the first bit
sent is a 1, followed by the eight-bit data word.
In this mode, once the serial port transmits a word, it will not transmit another
word until it receives a separately transmitted zero bit. Therefore, the 1 bit that
precedes every data word is, in effect, a request bit.
[Figure 8–20: DX waveform with the leading 1 preceding the data word]
After a serial port receives a word (with the leading 1) and that word has been
read from the DRR, the receiving serial port sends a single 0 to the transmitting
serial port. Thus, the single 0 bit acts as an acknowledge bit (see Figure 8–21).
This single acknowledge bit is sent every time the DRR is read, even if the DRR
does not contain new data.
[Figure 8–21: DX waveform showing the single 0 acknowledge bit]
When the serial port is placed in the handshake mode, the insertion and dele-
tion of a leading 1 for transmitted data, the sending of a 0 for acknowledgement
of received data, and the waiting for this acknowledge bit are all performed au-
tomatically. Using this scheme, it is simple to connect processors with no exter-
nal hardware and to guarantee secure communication. Figure 8–22 is a typi-
cal configuration.
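[Figure 8–22: typical handshake-mode configuration, pairing the pins CLKR/CLKX, FSR/FSX, and DR/DX between the two serial ports]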
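The request/acknowledge scheme described above can be sketched as a tiny state machine. This is an illustration of the protocol logic only, not of the hardware: handshake_frame and HandshakeSender are hypothetical names, data bits are ordered MSB first, and timing is not modeled.

```python
ACK_BIT = 0  # the receiver answers each word it has read with a single 0

def handshake_frame(word, nbits):
    """Bits sent for one word in handshake mode: a leading 1, then the data MSB first."""
    data = [(word >> i) & 1 for i in range(nbits - 1, -1, -1)]
    return [1] + data

class HandshakeSender:
    """Refuses to send a new word until the previous one is acknowledged."""

    def __init__(self):
        self.waiting_for_ack = False

    def send(self, word, nbits):
        if self.waiting_for_ack:
            return None  # previous word not yet acknowledged
        self.waiting_for_ack = True
        return handshake_frame(word, nbits)

    def receive_bit(self, bit):
        if bit == ACK_BIT:
            self.waiting_for_ack = False
```

An 8-bit word therefore occupies nine bit times on DX, and a second call to send returns nothing until receive_bit(0) has been called.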
- The transmit timer interrupt: The rising edge of XTSTAT causes a
single-cycle interrupt pulse to occur. When XTINT is 0, this interrupt pulse is
disabled.
- The receive timer interrupt: The rising edge of RTSTAT causes a single-
cycle interrupt pulse to occur. When RTINT is 0, this interrupt pulse is dis-
abled.
The following paragraphs and figures illustrate the functional timing of the vari-
ous serial-port modes of operation. The timing descriptions are presented with
the assumption that all signal polarities are configured to be positive, that is,
CLKXP = CLKRP = DXP = DRP = FSXP = FSRP = 0. Logical timing, in situa-
tions where one or more of these polarities are inverted, is the same except
with respect to the opposite polarity reference points, that is, rising vs. falling
edges, etc.
Fixed data-rate serial-port transfers can occur in two varieties: burst mode and
continuous mode. In burst mode, transfers of single words are separated by
periods of inactivity on the serial port. In continuous mode, there are no gaps
between successive word transfers; the first bit of a new word is transferred
on the next CLKX/R pulse following the last bit of the previous word. This oc-
curs continuously until the process is terminated.
In burst mode with fixed data-rate timing, FSX/FSR pulses initiate transfers,
and each transfer involves a single word. With an internally generated FSX
(see Figure 8–23), transmission is initiated by loading DXR. In this mode,
there is a delay of approximately 2.5 CLKX cycles (depending on CLKX and
H1 frequencies) from the time DXR is loaded until FSX occurs. With an exter-
nal FSX, the FSX pulse initiates the transfer, and the 2.5-cycle delay effectively
becomes a setup requirement for loading DXR with respect to FSX. Therefore,
in this case, you must load DXR no later than three CLKX cycles before FSX
occurs. Once the XSR is loaded from the DXR, an XINT is generated.
[Figure 8–23: fixed data-rate burst-mode timing: FSR/FSX (external), FSX (internal), and DX/DR bits A1 through AN]
In receive operations, once a transfer is initiated, FSR is ignored until the last
bit. For burst-mode transfers, FSR must be low during the last bit, or another
transfer will be initiated. After a full word has been received and transferred to
the DRR, an RINT is generated.
In fixed data-rate mode, you can perform continuous transfers even if R/XFSM
= 0, as long as properly timed frame synchronization is provided, or as long
as DXR is reloaded each cycle with an internally generated FSX (see
Figure 8–24).
[Figure 8–24: fixed data-rate continuous-mode timing with frame sync: CLKX/R, FSX (internal), FSR/FSX (external), and DR/DX words A1–AN, B1–BN, C1, with the XINT, RINT, and DXR-loaded events marked]
For receive operations and with externally generated FSX, once transfers
have begun, frame sync pulses are required only during the last bit transferred
to initiate another contiguous transfer. Otherwise, frame sync inputs are ig-
nored. Therefore, continuous transfers will occur if frame sync is held high.
With an internally generated FSX, there is a delay of approximately 2.5 CLKX
cycles from the time DXR is loaded until FSX occurs. This delay occurs each
time DXR is loaded; therefore, during continuous transmission, the instruction
that loads DXR must be executed by the N–3 bit for an N-bit transmission.
Since delays due to pipelining may vary, you should incorporate a conserva-
tive margin of safety in allowing for this delay.
Once the process begins, an XINT and an RINT are generated at the begin-
ning of each transfer. The XINT indicates that the XSR has been loaded from
DXR and can be used to cause DXR to be reloaded. To maintain continuous
transmission in fixed rate mode with frame sync, especially with an internally
generated FSX, DXR must be reloaded early in the ongoing transfer.
The RINT indicates that a full word has been received and transferred into the
DRR. RINT is therefore commonly used to indicate an appropriate time to read
DRR.
You can accomplish continuous serial-port transfers without the use of frame
sync pulses if R/XFSM are set to 1. In this mode, operation of the serial port
is similar to continuous operation with frame sync, except that a frame sync
pulse is involved only in the first word transferred, and no further frame sync
pulses are used. Following the first word transferred (see Figure 8–25), no in-
ternal frame sync pulses are generated, and frame sync inputs are ignored.
Additionally, you should set R/XFSM prior to or during the first word trans-
ferred; you must set R/XFSM no later than the transfer of the N–1 bit of the first
word, except for transmit operations. For transmit operations in the fixed data-
rate mode, XFSM must be set no later than the N–2 bit. You must clear
R/XFSM no later than the N–1 bit to be recognized in the current cycle.
[Figure 8–25: fixed data-rate continuous-mode timing without frame sync: CLKX/R, FSR/FSX (external), FSX (internal), and DR/DX words A1–AN, B1–BN, C1, with the DXR-loaded, load-DXR, and read-DRR events marked]
Timing of RINT and XINT and data transfers to and from DXR and DRR, re-
spectively, are the same as in fixed data-rate continuous mode with frame
sync. This mode of operation also exhibits the same delay of 2.5 CLKX cycles
after DXR is loaded before an internal FSX is generated. As in the case of con-
tinuous operation in fixed data-rate mode with frame sync, you must reload
DXR no later than transmission of the N–3 bit.
When you use continuous operation in fixed data-rate mode, R/XFSM can be
set and cleared as desired, even during active transfers, to enable or disable
the use of frame sync pulses as dictated by system requirements. Under most
conditions, the effect of changing the state of R/XFSM occurs during the trans-
fer in which the R/XFSM change was made, provided the change was made
early enough in the transfer. For transmit operations with internal FSX in fixed
data-rate mode, however, a one-word delay occurs before frame sync pulse
generation resumes when clearing XFSM to 0 (see Figure 8–26). Therefore,
in this case, one additional word is transferred before the next FSX pulse is
generated. Also note that, as discussed previously, the clearing of XFSM is
recognized during the transmission of the word currently being transmitted as
long as XFSM is cleared no later than the N–1 bit. The setting of XFSM is rec-
ognized as long as XFSM is set no later than the N–2 bit.
Figure 8–26. Exiting Fixed Continuous Mode Without Frame Sync, FSX Internal
[Timing diagram: CLKX, FSX (internal), and DX across consecutive words, labeled 1st Word through 5th Word]
- FSX/R pulses typically last for the entire transfer interval, although FSR
and external FSX are ignored after the first bit transferred. FSX/R pulses
in fixed data-rate mode typically last only one CLKX/R cycle but can last
longer.
- Data transfer begins during the CLKX/R cycle in which FSX/R occurs,
rather than the CLKX/R cycle following FSX/R, as is the case with fixed
data-rate timing.
- With variable data-rate timing, frame sync inputs are ignored until the end
of the last bit transferred, rather than the beginning of the last bit trans-
ferred, as is the case with fixed data-rate timing.
[Figure: variable data-rate burst-mode timing: CLKX/R, FSR/FSX (external), FSX (internal), and DX/DR bits A1 through AN]
When you transmit continuously in variable data-rate mode with frame sync,
timing is the same as for fixed data-rate mode, except for the differences be-
tween these two modes as described under Variable Data-Rate Timing Opera-
tion. The only other exception is that you must reload DXR no later than the
N–4 bit to maintain continuous operation of the variable data-rate mode (see
Figure 8–28); you must reload DXR no later than the N–3 bit to maintain con-
tinuous operation of the fixed data-rate mode.
[Figure 8–28: variable data-rate continuous-mode timing with frame sync: CLKX/R, FSR/FSX (external), FSX (internal), and DX/DR words A1–AN, B1–BN, C1, C2]
[Figure: variable data-rate continuous-mode timing with the same signals, with the XINT, RINT, DXR-loaded, and set-R/XFSM events marked]
The serial ports are controlled through memory-mapped registers on the dedi-
cated peripheral bus. Following is a general procedure for initializing and/or
reconfiguring the serial ports.
1) Halt the serial port by clearing the XRESET and/or RRESET bits of the ser-
ial-port global-control register. To do this, write a 0 to the serial-port global-
control register. Note that the serial ports are halted on RESET.
2) Configure the serial port via the serial-port global-control register (with
XRESET = RRESET = 0) and the FSX/DX/CLKX and FSR/DR/CLKR port-
control registers. If necessary, configure the receive/transmit registers:
timer control (with XHLD = RHLD = 0), timer counter, and timer period. Re-
fer to subsection 8.2.14 for more information.
3) Start the serial port operation by setting the XRESET and RRESET bits
of the serial-port global-control register and the XHLD and RHLD bits of
the serial-port receive/transmit timer-control register, if necessary.
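The initialization steps above can be sketched as an ordered list of register writes. This is an illustration under stated assumptions, not TI code: serial_port_init_writes is a hypothetical helper, registers are named rather than addressed, the RRESET position (bit 27) is assumed because its table row does not appear in this section, and the write order inside steps 2 and 3 is a choice made for the sketch.

```python
XRESET = 1 << 26  # transmit reset bit, per the bit description in this section
RRESET = 1 << 27  # assumption: receive reset bit position, not shown in this section
XHLD = 1 << 1     # timer-control register layout: bit 1
RHLD = 1 << 7     # timer-control register layout: bit 7

def serial_port_init_writes(global_ctrl, x_port_ctrl, r_port_ctrl,
                            timer_ctrl=None, timer_period=0, timer_counter=0):
    """Return the (register, value) writes for the halt/configure/start steps."""
    writes = [
        ("global_control", 0),                                # 1) halt
        ("global_control", global_ctrl & ~(XRESET | RRESET)), # 2) configure
        ("fsx_dx_clkx_port_control", x_port_ctrl),
        ("fsr_dr_clkr_port_control", r_port_ctrl),
    ]
    if timer_ctrl is not None:
        writes += [
            ("timer_control", timer_ctrl & ~(XHLD | RHLD)),   # XHLD = RHLD = 0
            ("timer_counter", timer_counter),
            ("timer_period", timer_period),
        ]
    writes.append(("global_control", global_ctrl))            # 3) start
    if timer_ctrl is not None:
        writes.append(("timer_control", timer_ctrl))
    return writes
```

The caller passes the final global-control value with whichever reset bits it wants set, so a transmit-only setup can leave the receive side in reset.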
Since the FSX is set as an output and continuous mode is disabled when hand-
shake mode is selected, you should set the XFSM and RFSM bits to 0 and the
FSXOUT bit to 1 in the global control register. You should set the XRESET,
RRESET, and HS bits to 1 in order to start the handshake communication. You
should set the polarity of the serial port pins active (high) for simplification. Al-
though the CLKX/CLKR can be set as either input or output, you should set
the CLKX as output and the CLKR as input. The rest of the bits are user-confi-
gurable as long as both serial ports have consistent setup.
You need the serial port timer only if the CLKX or CLKR is configured as an
output. Since only the CLKX is configured as an output, you should set the tim-
er control register to 0Fh. When the serial port timer is used, you should also
set the serial timer register to the proper value for the clock speed. The serial
port timer clock speed setup is similar to the TMS320C3x timer. Refer to Sec-
tion 8.1 on page 8-2 for detailed information on timer clock generation.
The maximum clock frequency for serial transfers is F(CLKIN)/4 if the internal
clock is used and F(CLKIN)/5.2 if an external clock is used. Therefore, if two
TMS320C3xs have the same system clock, the timer period register should
be set equal to or greater than 1, which makes the clock frequency equal to
F(CLKIN)/8.
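The frequency limits above can be checked numerically. The sketch below assumes that the serial-port timer runs in clock mode from the internal clock, so that its output is F(CLKIN)/(8 x period) for a period of 1 or more; that general form is extrapolated from the period-of-1 case stated above. Both function names are hypothetical.

```python
def internal_serial_clock_hz(f_clkin_hz, period):
    """Serial clock from the internal timer in clock mode (assumed form)."""
    if period < 1:
        raise ValueError("period must be at least 1 for this formula")
    return f_clkin_hz / (8 * period)

def within_serial_limits(f_clock_hz, f_clkin_hz, external):
    """Check a serial clock against the F(CLKIN)/4 or F(CLKIN)/5.2 limit."""
    limit = f_clkin_hz / 5.2 if external else f_clkin_hz / 4
    return f_clock_hz <= limit
```

With a 32-MHz CLKIN and a period of 1, the transmitter produces a 4-MHz clock. The receiving device sees that clock as external, and 4 MHz is below its F(CLKIN)/5.2 limit of about 6.15 MHz, whereas an 8-MHz clock would not be.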
Example 8–1 and Example 8–2 are serial port register setups for the above
case. (Assume two TMS320C3xs have the same system clock.)
Since the data has a leading 1 and the acknowledge signal is a 0 in the hand-
shake mode, the TMS320C3x serial port can distinguish between the data and
the acknowledge signal. Therefore, even if the TMS320C3x serial port re-
ceives the data before the acknowledge signal, the data will not be misinter-
preted as the acknowledge signal and be lost. In addition, the acknowledge
signal is not generated until the data is read from the data receive register
(DRR). Therefore, the TMS320C3x will not transmit the data and the acknowl-
edge signal simultaneously.
Example 8–3 sets up the CPU to transfer data (128 words) from an array buffer
to the serial port 0 output register when the previous value stored in the serial
port output register has been sent. Serial port 0 is initialized to transmit 32-bit
data words with an internally generated frame sync and a bit-transfer rate of
8H1 cycles/bit.
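        LDI   @SOURCE,AR0       ; AR0 = ADDRESS OF THE SOURCE ARRAY
        LDI   *AR0++,R1         ; R1 = FIRST WORD; AR0 ADVANCES
        STI   R1,*+AR1(8)       ; WRITE R1 TO THE WORD AT AR1 + 8
        LDI   8,IR0             ; IR0 = 8, THE SAME OFFSET, FOR INDEXED STORES
        LDI   2,R0              ; R0 = MASK FOR BIT 1 (XRDY)
        LDI   126,RC            ; REPEAT COUNT: BLOCK RUNS RC + 1 = 127 TIMES
        RPTB  LOOP              ; REPEAT THE BLOCK THAT ENDS AT LOOP
WAIT    AND   *AR1,R0,R2        ; WAIT UNTIL XRDY BIT = 1
        BZ    WAIT
LOOP    STI   R1,*+AR1(IR0)     ; STORE R1 AT AR1 + IR0 ...
||      LDI   *++AR0(1),R1      ; ... IN PARALLEL WITH LOADING THE NEXT WORD
        BU    $                 ; DONE: BRANCH TO SELF
        .END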
The TLC3204x analog interface chips (AIC) from Texas Instruments offer a
zero-glue-logic interface to the TMS320C3x family of DSPs. The interface is
shown in Figure 8–30 as an example of the TMS320C3x serial-port configura-
tion and operation.
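[Figure 8–30: TMS320C3x-to-AIC connections: XF0 to RESET, CLKR0 and CLKX0 to SCLK, FSR0 to FSR, DR0 to DR, FSX0 to FSX, DX0 to DX, and TCLK0 to MCLK. The AIC side also shows WORD, VCC, GND, the analog outputs OUT+ and OUT–, and the analog inputs IN+ and IN–.]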
The TMS320C3x resets the AIC through the external pin XF0. It also gener-
ates the master clock for the AIC through the timer 0 output pin, TCLK0. (Pre-
cise selection of a sample rate may require the use of an external oscillator
rather than the TCLK0 output to drive the AIC MCLK input.) In turn, the AIC
generates the CLKR0 and CLKX0 shift clocks as well as the FSR0 and FSX0
frame synchronization signals.
A typical use of the AIC requires an 8-kHz sample rate of the analog signal.
If the clock input frequency to the TMS320C3x device is 30 MHz, you should
load the following values into the serial port and timer registers.
Serial Port:
Port global control register: 0E970300h
FSX/DX/CLKX port control register 00000111h
FSR/DR/CLKR port control register 00000111h
Timer:
Timer global control register 000002C1h
Timer period register 00000001h
The DSP201/2 and DSP101/2 family of D/As and A/Ds from Burr Brown also
offer a zero-glue-logic interface to the TMS320C3x family of DSPs. The inter-
face is shown in Example 8–4. This interface is used as an example of the
TMS320C3x serial-port configuration and operation.
[Figure: TMS320C3x interface to the Burr-Brown A/D and D/A converters; labels include CASC tied to +5 V on both converters, 12.29 MHz, and two 22-pF capacitors]
The DSP102 A/D is interfaced to the TMS320C3x serial port receive side; the
DSP202 D/A is interfaced to the transmit side. The A/Ds and D/As are hard-
wired to run in cascade mode. In this mode, when the TMS320C3x initiates a
convert command to the A/D via the TCLK0 pin, both analog inputs are con-
verted into two 16-bit words, which are concatenated to form one 32-bit word.
The A/D signals the TMS320C3x via the A/D’s SYNC signal (connected to the
TMS320C3x FSR0 pin) that serial data is to be transmitted. The 32-bit word
is then serially transmitted, MSB first, out the SOUTA serial pin of the DSP102
to the DR0 pin of the TMS320C3x serial port. The TMS320C3x is programmed
to drive the analog interface bit clock from the CLKX0 pin of the TMS320C3x.
The bit clock drives both the A/D’s and D/A’s XCLK input. The TMS320C3x
transmit clock also acts as the input clock on the receive side of the
TMS320C3x serial port. Since the receive clock is synchronous to the internal
clock of the TMS320C3x, the receive clock can run at full speed (that is,
f(H1)/2).
Similarly, on receiving a convert command, the pipelined D/A converts the last
word received from the TMS320C3x and signals the TMS320C3x via the
SYNC signal (connected to the TMS320C3x FSX0 pin) to begin transmitting
a 32-bit word representing the two channels of data to be converted. The data
transmitted from the TMS320C3x DX0 pin is input to both the SINA and SINB
inputs of the D/A as shown in the figure.
The TMS320C3x is set up to transfer bits at the maximum rate of about eight
Mbps, with a dual-channel sample rate of about 44.1 kHz. Assuming a 32-MHz
CLKIN, you can configure this standard-mode fixed-data-rate signaling inter-
face by setting the registers as described below:
Serial Port:
Port global-control register: 0EBC0040h
FSX/DX/CLKX port-control register 00000111h
FSR/DR/CLKR port-control register 00000111h
Receive/transmit timer-control register 0000000Fh
Timer:
Timer global-control register 000002C1h
Timer period register 000000B5h
DMA Controller
A DMA transfer consists of two operations: a read from a memory location and
a write to a memory location. The DMA controller can read from and write to
any location in the TMS320C3x memory map. This includes all
memory-mapped peripherals. The operation of the DMA is controlled with the
following set of memory-mapped registers:
- DMA global-control register
- DMA source-address register
- DMA destination-address register
- DMA transfer-counter register
Table 8–7 shows these registers, their memory-mapped addresses, and their
functions. Each of these DMA registers is discussed in the succeeding subsec-
tions.
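Since each DMA transfer is a read followed by a write, a block move can be modeled as a simple loop. This is a behavioral sketch only: dma_transfer is a hypothetical function, memory is modeled as a Python list, and synchronization, bus timing, and the exact transfer-counter semantics are left out.

```python
def dma_transfer(memory, src, dst, count, src_step=1, dst_step=1):
    """Copy `count` words, one read and one write per transfer.

    src_step and dst_step stand in for the increment/decrement controls:
    +1, -1, or 0 to keep an address fixed (for example, a peripheral register).
    Returns the final (source, destination) addresses.
    """
    for _ in range(count):
        data = memory[src]   # read operation
        memory[dst] = data   # write operation
        src += src_step
        dst += dst_step
    return src, dst

mem = list(range(8)) + [0] * 8
final_addresses = dma_transfer(mem, 0, 8, 4)
```

Keeping the destination step at 0 models a transfer into a single memory-mapped peripheral register, which is possible because the DMA can address any location in the memory map.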
Reserved 808001h
Reserved 808002h
Reserved 808003h
Reserved 808005h
Reserved 808007h
Reserved 808009h
Reserved 80800Ah
Reserved 80800Bh
Reserved 80800Ch
Reserved 80800Dh
Reserved 80800Eh
Reserved 80800Fh
Bits    Name    Reset Value   Function
3–2     STAT    0–0           These bits indicate the status of the DMA and change
                              every cycle (see Table 8–10).
9–8     SYNC    0–0           The SYNC bits determine the timing synchronization
                              between the events initiating the source and the
                              destination transfers. The interpretation of the
                              SYNC bits is shown in Table 8–11.
11      TCINT   0             If TCINT = 1, the DMA interrupt is set when the
                              transfer counter makes a transition to 0. If
                              TCINT = 0, the DMA interrupt is not set when the
                              transfer counter makes a transition to 0.
Note: When the DMA completes a transfer, the START bits remain at 11 (base 2). The DMA starts when the START bits are set
to 11 and one of the following conditions applies:
Table 8–9. START Bits and Operation of the DMA (Bits 0–1)
START Function
00 DMA read or write cycles in progress will be completed; any data read will
be ignored. Any pending read or write will be cancelled. The DMA is reset
so that when it starts, a new transaction begins; that is, a read is
performed. (Reset value)
01 DMA is being held in the middle of a DMA transfer, that is, between a read
and a write.
10 Reserved.
11 DMA busy; that is, DMA is performing a read or write or waiting for a
source or destination synchronization interrupt.
Table 8–11. SYNC Bits and Synchronization of the DMA (Bits 8–9)
SYNC Function
00 No synchronization. Enabled interrupts are ignored. (Reset value)
Bits     15–12   11      10    9–8    7        6        5        4        3–2    1–0
Field    xx      TCINT   TC    SYNC   DECDST   INCDST   DECSRC   INCSRC   STAT   START
Access   –       R/W     R/W   R/W    R/W      R/W      R/W      R/W      R      R/W
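The field positions in the register layout above can be captured in a small helper that builds or decodes a control word. This is a minimal Python sketch for illustration, not part of any TI tool; `pack` and `unpack` are hypothetical names, and the only semantics assumed are the bit positions shown in the figure and the read-only status of STAT.

```python
# Field layout taken from the register figure: name -> (lowest bit, width).
FIELDS = {
    "START": (0, 2), "STAT": (2, 2),
    "INCSRC": (4, 1), "DECSRC": (5, 1), "INCDST": (6, 1), "DECDST": (7, 1),
    "SYNC": (8, 2), "TC": (10, 1), "TCINT": (11, 1),
}
READ_ONLY = {"STAT"}  # marked R in the figure


def pack(**values: int) -> int:
    """Build a DMA global-control word from named field values."""
    word = 0
    for name, value in values.items():
        if name in READ_ONLY:
            raise ValueError(f"{name} is read-only")
        lsb, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} out of range")
        word |= value << lsb
    return word


def unpack(word: int) -> dict:
    """Split a DMA global-control word into its named fields."""
    return {name: (word >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in FIELDS.items()}


# START = 11, increment destination, stop at count 0, interrupt at count 0.
control = pack(START=0b11, INCDST=1, TC=1, TCINT=1)
print(hex(control))
```

Keeping the layout in one table means a change to a field position needs to be made in only one place.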
Table 8–12 lists the bits, names, and functions of the CPU/DMA interrupt en-
able register. Figure 8–32 shows the IE register. The priority and decoding
schemes of CPU and DMA interrupts are identical. Note that when the DMA
receives an interrupt, this interrupt is acted upon according to the SYNC field
of the DMA control register. Also note that an interrupt can affect the DMA but
not the CPU and can affect the CPU but not the DMA. Refer to subsection 3.1.8
on page 3-7 and to Chapter 6.
Bit (DMA)   Bit (CPU)   Name     Access
31–27       15–11       xx       –
26          10          EDINT    R/W
25          9           ETINT1   R/W
24          8           ETINT0   R/W
23          7           ERINT1   R/W
22          6           EXINT1   R/W
21          5           ERINT0   R/W
20          4           EXINT0   R/W
19          3           EINT3    R/W
18          2           EINT2    R/W
17          1           EINT1    R/W
16          0           EINT0    R/W
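To make the two halves of the IE register concrete, the Python sketch below builds an enable mask from bit names. It assumes that the (DMA) row occupies the upper 16 bits (bits 31–16) and mirrors the (CPU) bit order; `ie_value` is an illustrative name, not an assembler or library function.

```python
# CPU-half bit positions from the IE register figure.
IE_BITS = {
    "EINT0": 0, "EINT1": 1, "EINT2": 2, "EINT3": 3,
    "EXINT0": 4, "ERINT0": 5, "EXINT1": 6, "ERINT1": 7,
    "ETINT0": 8, "ETINT1": 9, "EDINT": 10,
}
DMA_OFFSET = 16  # assumption: the (DMA) enables sit 16 bits above the CPU ones


def ie_value(cpu=(), dma=()) -> int:
    """Return an IE word enabling the named interrupts for the CPU and the DMA."""
    value = 0
    for name in cpu:
        value |= 1 << IE_BITS[name]
    for name in dma:
        value |= 1 << (IE_BITS[name] + DMA_OFFSET)
    return value


# DMA synchronized by serial port 0 transmit; CPU interrupted by the DMA.
ieval = ie_value(cpu=["EDINT"], dma=["EXINT0"])
print(hex(ieval))
```

A value of this kind could play the role of the IEVAL word that the initialization examples OR into the IE register, under the stated assumption about the DMA half.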
- Read data from the address specified by the DMA source register
- Write data that has been read to the address specified by the DMA desti-
nation register
A transfer is complete only when the read and write are complete. You can stop
a transfer by setting the START bits to the desired value. When the DMA is re-
started (START = 1 1), it completes any pending transfer.
At the end of a DMA read, the source address is modified as specified by the
INCSRC and DECSRC bits of the DMA global-control register. At the end of
a DMA write, the destination address is modified as specified by the INCDST
and DECDST bits of the DMA global-control register. At the end of every DMA
write, the DMA transfer counter is decremented.
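The read, write, address-update, and counter-update sequence described above can be modeled in a few lines. The following Python sketch is a behavioral illustration only: memory is a plain list, the transfer stops when the counter reaches 0 (as if TC = 1), and setting both the increment and the decrement bit for an address is assumed to leave that address unchanged.

```python
def dma_block_transfer(memory, src, dst, count, *,
                       incsrc=0, decsrc=0, incdst=0, decdst=0):
    """Model a block of DMA transfers; returns the final (src, dst, count)."""
    if count <= 0:
        raise ValueError("this model needs a positive transfer count")
    while count:
        word = memory[src]          # DMA read from the source address
        src += incsrc - decsrc      # source address modified after the read
        memory[dst] = word          # DMA write to the destination address
        dst += incdst - decdst      # destination address modified after the write
        count -= 1                  # transfer counter decremented
    return src, dst, count


# Clear a 128-word array by copying one zero word with a fixed source
# address and an incrementing destination address.
memory = [0] + [7] * 128
final_state = dma_block_transfer(memory, src=0, dst=1, count=128, incdst=1)
print(final_state, sum(memory))
```

The usage mirrors the array-clearing scenario of Example 8–5: a fixed source, an incrementing destination, and a count of 128.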
DMA on-chip reads and writes (reads and writes from on-chip memory and pe-
ripherals) are single-cycle. DMA off-chip reads are two cycles. The first cycle
is the external read, and the second cycle loads the DMA register. The external
read cycle is identical to a CPU read cycle. DMA off-chip writes are identical
to CPU off-chip writes. If the DMA has been started and is transferring data
over either external bus, you should not modify the bus-control register asso-
ciated with that bus. If you must modify the bus-control register (see Chapter
7), stop the DMA, make the modification, and then restart the DMA. Failure to
do this may produce an unexpected zero-wait-state bus access.
Through the 24-bit source and destination registers, the DMA is capable of ac-
cessing any memory-mapped location in the TMS320C3x memory map.
Table 8–13, Table 8–14, and Table 8–15 show the number of cycles a DMA
transfer requires, depending on whether the source and destination are on-
chip memory and peripherals, the external port, or the I/O port. T represents
the number of transfers to be performed, Cr represents the number of wait-
states for the source read, and Cw represents the number of wait-states for the
destination write. Each entry in the table represents the total cycles required
to do the T transfers, assuming that there are no pipeline conflicts.
Accompanying each table is a figure illustrating the timing of the DMA transfer.
|R| and |W| represent single-cycle reads and writes, respectively. |R.R| and
|W.W| represent multicycle reads and writes. |Cr| and |Cw| show the number
of wait cycles for a read and write.
[Timing diagram: H1 cycles 1–19. Source on-chip, single-cycle reads (R); destination on-chip, single-cycle writes (W).]
Legend:
T = Number of transfers
Cr = Source-read wait states
Cw = Destination-write wait states
|R| = Single-cycle reads
|W| = Single-cycle writes
|R.R| = Multicycle reads
|W.W| = Multicycle writes
|I| = Internal register cycle
[Timing diagrams: H1 cycles 1–19, destination on the primary bus. Three panels: source on-chip (single-cycle reads R, multicycle writes W.W with Cw wait cycles); source on the primary bus (multicycle reads R.R with Cr wait cycles and an internal register cycle I, then multicycle writes W.W with Cw wait cycles); source on the expansion bus (multicycle reads R.R with Cr wait cycles and an internal register cycle I, multicycle writes W.W with Cw wait cycles).]
[Timing diagrams: H1 cycles 1–19, destination on the expansion bus. Three panels: source on-chip (single-cycle reads R, multicycle writes W.W with Cw wait cycles); source on the primary bus (multicycle reads R.R with Cr wait cycles and an internal register cycle I, multicycle writes W.W with Cw wait cycles); source on the expansion bus (multicycle reads R.R with Cr wait cycles and an internal register cycle I, then multicycle writes W.W with Cw wait cycles).]
Table 8–16 shows the maximum DMA transfer rates, assuming that there are
no wait states (Cr = Cw = 0). Table 8–17 shows the maximum DMA transfer
rates, assuming there is one wait state for the read (Cr = 1) and no wait states
for the write (Cw = 0). Table 8–18 shows the maximum DMA transfer rates,
assuming there is one wait state for the read (Cr = 1) and one wait state for the
write (Cw = 1).
In each table, the time for the complete transfer (the read and the write) is con-
sidered. Since one bus access is required for the read and another for the
write, internal bus transfer rates will be twice the DMA transfer rate. It is also
assumed that no conflicts with the CPU exist. Rates are listed in Mwords/sec.
A word is 32 bits (4 bytes).
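The unit relationships in this paragraph reduce to two multiplications. The short Python sketch below shows them; the input rate of 1.0 Mwords/sec is a hypothetical placeholder and is not a value taken from Table 8–16, Table 8–17, or Table 8–18.

```python
BYTES_PER_WORD = 4  # a word is 32 bits


def to_mbytes_per_sec(mwords_per_sec: float) -> float:
    """Convert a DMA transfer rate from Mwords/sec to Mbytes/sec."""
    return mwords_per_sec * BYTES_PER_WORD


def internal_bus_accesses_per_sec(mwords_per_sec: float) -> float:
    """Each transfer needs one read access and one write access (in millions/sec)."""
    return 2 * mwords_per_sec


hypothetical_rate = 1.0  # Mwords/sec, placeholder value for illustration
mbytes = to_mbytes_per_sec(hypothetical_rate)
accesses = internal_bus_accesses_per_sec(hypothetical_rate)
print(mbytes, accesses)
```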
No Synchronization
Source Synchronization
When SYNC = 0 1, the DMA is synchronized to the source (see Figure 8–34).
A read will not be performed until an interrupt is received by the DMA. Then
all DMA interrupts are disabled globally. However, no bits in the DMA interrupt
enable register are changed.
Destination Synchronization
When SYNC = 1 0, the DMA is synchronized to the destination. First, all
interrupts are ignored until the read is complete. Though the DMA interrupts are
considered globally disabled, no bits in the DMA interrupt-enable register are
changed. A write will not be performed until an interrupt is received by the
DMA. Figure 8–35 shows the synchronization mechanism when SYNC = 1 0.
Source and Destination Synchronization
When SYNC = 1 1, the DMA is synchronized to both the source and destina-
tion. A read is performed when an interrupt is received. A write is performed
on the following interrupt. Source and destination synchronization when
SYNC = 1 1 is shown in Figure 8–36.
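One practical consequence of the four SYNC settings is the number of synchronizing interrupts a block transfer consumes. The Python sketch below tabulates this from the descriptions above; `interrupts_needed` is an illustrative helper, and it assumes every interrupt arrives and is counted exactly once.

```python
# Interrupts consumed per transfer for each SYNC setting:
# 00 = none, 01 = one per read, 10 = one per write, 11 = one per read and one per write.
INTERRUPTS_PER_TRANSFER = {0b00: 0, 0b01: 1, 0b10: 1, 0b11: 2}


def interrupts_needed(sync: int, transfers: int) -> int:
    """Return how many synchronizing interrupts a block of transfers requires."""
    return INTERRUPTS_PER_TRANSFER[sync] * transfers


for sync in (0b00, 0b01, 0b10, 0b11):
    print(f"SYNC = {sync:02b}: {interrupts_needed(sync, 128)} interrupts for 128 transfers")
```

If fewer interrupts than this arrive, the block transfer cannot finish, which is why each expected interrupt must actually be received when synchronization is used.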
You can generate a DMA interrupt to the CPU whenever the transfer count
reaches 0, indicating that the last transfer has taken place. The TCINT bit in
the DMA global control register determines whether the interrupt will be gener-
ated. If TCINT = 1, the DMA interrupt is generated. If TCINT = 0, the DMA inter-
rupt is not generated. If the DMA interrupt is generated, the EDINT bit, bit 10
in the interrupt enable register, must also be set to enable the CPU to be inter-
rupted by the DMA.
A second bit in the DMA global control register, the TC bit, is also generally
associated with the state of the TCINT bit and the interrupt operation. The TC
bit determines whether transfers are terminated when the transfer counter be-
comes 0 or whether they are allowed to continue. If TC = 1, transfers are termi-
nated when the transfer count becomes 0. If TC = 0, transfers are not termi-
nated when the transfer count becomes 0.
- Ensure that each interrupt is received when you use interrupt synchroniza-
tion; otherwise, the DMA will never complete the block transfer.
- Use read/write synchronization when reading from or writing to serial ports
to guarantee data validity.
The following are indications that the DMA has finished a set of transfers:
- The DINT bit in the IIF register is set to 1 (interrupt polling). This requires
that the TCINT bit in the DMA control register be set first. This interrupt-
polling method does not cause any additional CPU-DMA access conflict.
- The transfer counter has a zero value. However, notice that the transfer
counter is decremented after the DMA read operation finishes (not after
the write operation). Nevertheless, a transfer counter with a 0 value can
be used as an indication of a transfer completion.
- The STAT bits in the DMA channel control register are set to 00 (base 2). You can
poll the DMA channel control register for this value. However, because the
DMA registers are memory-mapped into the peripheral bus address
space, this option can cause further CPU-DMA access conflicts.
Example 8–5, Example 8–6, and Example 8–7 illustrate initialization proce-
dures for the DMA.
When linking the examples, you should allocate section memory addresses
carefully to avoid CPU-DMA conflict. In the ’C3x, the CPU always prevails in
cases of conflict. In the event of a CPU program–DMA data conflict, the enab-
ling of the cache helps if the .text section is in external memory. For example,
when linking the code in Example 8–5, Example 8–6, and Example 8–7, the
.text section can be allocated into RAM0, .data into RAM1, and .bss into
RAM1, where RAM0 and RAM1 correspond to on-chip RAM block 0 and block
1, respectively.
In Example 8–5, the DMA initializes a 128-element array to 0. The DMA sends
an interrupt to the CPU after the transfer is completed. This program assumes
previous initialization of the CPU interrupt vector table (specifically the DMA-
to-CPU interrupt). The program initializes the ST and IE registers for interrupt
processing.
* DMA INITIALIZATION
        LDI     @DMA,AR0        ; POINT TO DMA GLOBAL CONTROL REGISTER
        LDI     @SPORT,AR1
        LDI     @RESET,R0
        STI     R0,*+AR1(4)     ; RESET SPORT TIMER
        LDI     @RESET1,R0
        STI     R0,*AR0         ; RESET DMA
        LDI     @SPRESET,R0
        STI     R0,*AR1         ; RESET SPORT
        LDI     @SOURCE,R0      ; INITIALIZE DMA SOURCE ADDRESS REGISTER
        STI     R0,*+AR0(4)
        LDI     @DESTIN,R0      ; INITIALIZE DMA DESTINATION ADDRESS REGISTER
        STI     R0,*+AR0(6)
        LDI     @COUNT,R0       ; INITIALIZE DMA TRANSFER COUNTER REGISTER
        STI     R0,*+AR0(8)
        OR      @IEVAL,IE       ; ENABLE INTERRUPTS
        OR      2000H,ST        ; ENABLE CPU INTERRUPTS GLOBALLY
        LDI     @CONTROL,R0     ; INITIALIZE DMA GLOBAL CONTROL REGISTER
        STI     R0,*AR0         ; START DMA TRANSFER
* SERIAL PORT INITIALIZATION
        LDI     @SRCTRL,R0      ; SERIAL-PORT RECEIVE CONTROL REG INITIALIZATION
        STI     R0,*+AR1(3)
        LDI     @STPERIOD,R0    ; SERIAL-PORT TIMER PERIOD INITIALIZATION
        STI     R0,*+AR1(6)
        LDI     @STCTRL,R0      ; SERIAL-PORT TIMER CONTROL REG INITIALIZATION
        STI     R0,*+AR1(4)
        LDI     @SGCCTRL,R0     ; SERIAL-PORT GLOBAL CONTROL REG INITIALIZATION
        STI     R0,*AR1
        BU      $
        .END
Example 8–7 sets up the DMA to transfer data (128 words) from an array buffer
to the serial port 0 output register, synchronized by the serial port transmit
interrupt XINT0. The DMA sends an interrupt to the CPU when the data transfer
completes. Serial port 0 is initialized to transmit 32-bit data words with an
internally generated frame sync and a bit-transfer rate of 8 H1 cycles/bit. The
receive-bit clock is internally generated and equal in frequency to one-half of
the ’C3x H1 frequency.
This program assumes previous initialization of the CPU interrupt vector table
(specifically the DMA-to-CPU interrupt). The serial port interrupt directly af-
fects only the DMA; therefore, no CPU serial port interrupt vector setting is re-
quired.
* CPU WRITES THE FIRST WORD (TRIGGERING EVENT ---> XINT IS GENERATED)
        LDI     @SOURCE,AR0
        LDI     *-AR0(1),R0
        STI     R0,*+AR1(8)
        BU      $
        .END
Chapter 9
Pipeline Operation
Pipeline Structure
Figure 9–1 illustrates these four levels of the pipeline structure. The levels are
indexed according to instruction and execution cycle. The perfect overlap in
the pipeline, where all four units operate in parallel, occurs at cycle (m). Those
levels about to be executed are at m + 1, and those just executed are at m – 1.
The TMS320C3x pipeline control allows a high-speed execution rate of one
execution per cycle. It also manages pipeline conflicts so that they are trans-
parent to the user. You do not need to take any special precautions to guaran-
tee correct operation.
CYCLE F D R E
m–3 W – – –
m–2 X W – –
m–1 Y X W –
m Z Y X W Perfect overlap
m+1 – Z Y X
m+2 – – Z Y
m+3 – – – Z
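The staircase pattern in the table above follows directly from shifting each instruction one stage per cycle. As an illustration, this minimal Python sketch regenerates the same rows for four instructions W, X, Y, and Z; it models an ideal pipeline with no conflicts, and `pipeline_table` is a hypothetical helper name.

```python
def pipeline_table(instructions, depth=4):
    """Return one row per cycle listing the occupant of the F, D, R, E stages."""
    rows = []
    count = len(instructions)
    for cycle in range(count + depth - 1):
        row = []
        for stage in range(depth):
            index = cycle - stage          # instruction that is `stage` steps behind fetch
            row.append(instructions[index] if 0 <= index < count else "-")
        rows.append(row)
    return rows


rows = pipeline_table(["W", "X", "Y", "Z"])
print("F D R E")
for row in rows:
    print(" ".join(row))
```

The fourth row (cycle m in the table) is the only one in which all four stages hold an instruction.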
Priorities from highest to lowest have been assigned to each of the functional
units as follows:
1) Execute (highest)
2) Read
3) Decode
4) Fetch
5) DMA (lowest)
When the processing of an instruction is ready to pass to the next higher pipe-
line level, but that level is not ready to accept a new input, a pipeline conflict
occurs. In this case, the lower-priority unit waits until the higher-priority unit
completes its currently executing function.
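The priority rule can be expressed as a simple ordered lookup. The Python sketch below is only an illustration of the ordering listed above; it treats arbitration as choosing a single winner among competing units, which is a simplification of the actual hardware.

```python
# Functional units ordered from highest to lowest priority.
PRIORITY = ["execute", "read", "decode", "fetch", "dma"]


def winner(requesters):
    """Return the highest-priority unit among those competing for a resource."""
    return min(requesters, key=PRIORITY.index)


print(winner({"dma", "fetch"}))
print(winner({"read", "decode", "dma"}))
```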
Despite the DMA controller’s low priority, you can minimize or even eliminate
conflicts with the CPU through suitable data structuring because the DMA con-
troller has its own data and address buses.
- Branch Conflicts
Branch conflicts involve most of those instructions or operations that read
and/or modify the PC.
- Register Conflicts
Register conflicts involve delays that can occur when reading from or writ-
ing to registers that are used for address generation.
- Memory Conflicts
Memory conflicts occur when the internal units of the TMS320C3x com-
pete for memory resources.
Example 9–1 shows the code and pipeline operation for a standard branch.
Pipeline Conflicts
PIPELINE OPERATION
PC F D R E
n BR – – –
n+1 MPYF BR – –
n+1 (nop) (nop) BR –
n+1 (nop) (nop) (nop) BR
THREE OR (nop) (nop) (nop)
STI OR (nop) (nop)
RPTS and RPTB both flush the pipeline, allowing the RS, RE, and RC registers
to be loaded at the proper time relative to the flow of the pipeline. If these regis-
ters are loaded without the use of RPTS or RPTB, no flushing of the pipeline
occurs. If you are not using any of the repeat modes, then you can use RS, RE,
and RC as general-purpose 32-bit registers and not cause any pipeline con-
flicts. In cases such as the nesting of RPTB due to nested interrupts, it might
be necessary to load and store these registers directly while using the repeat
modes. Since up to four instructions can be fetched before entering the repeat
mode, you should follow loads by a branch to flush the pipeline. If the RC is
changing when an instruction is loading it, the direct load takes priority over
the modification made by the repeat mode logic.
Delayed branches are implemented to guarantee the fetching of the next three
instructions. The delayed branches include BRD, BcondD, and DBcondD.
Example 9–2 shows the code and pipeline operation for a delayed branch.
PIPELINE OPERATION
PC F D R E
n BRD — — —
THREE → PC
- Group 1
This group includes auxiliary registers (AR0–AR7), index registers (IR0,
IR1), and block size register (BK).
- Group 2
This group includes the data page pointer (DP).
- Group 3
This group includes the system stack pointer (SP).
If an instruction writes to one of these three groups, the decode unit cannot use
any register within that particular group until the write is complete, that is, in-
struction execution is completed. In Example 9–3, an auxiliary register is
loaded, and a different auxiliary register is used on the next instruction. Since
the decode stage needs the result of the write to the auxiliary register, the de-
code of this second instruction is delayed two cycles. Every time the decode
is delayed, a refetch of the program word is performed; that is, the ADDF is
fetched three times. Since these are actual refetches, they can cause not only
conflicts with the DMA controller but also cache hits and misses.
PIPELINE OPERATION
PC F D R E
n LDI — — —
The case for reads of these groups is similar to the case for writes. If an
instruction must read a member of one of these groups, the use of that particu-
lar group by the decode for the following instruction is delayed until the read
is complete. The registers are read at the start of the execute cycle and there-
fore require only a one-cycle delay of the following decode. For four registers
(IR0, IR1, BK, or DP), there is no delay. For all other registers, including the
SP, the delay occurs.
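The write and read rules for the three register groups can be summarized as a lookup on two adjacent instructions. The Python sketch below is a simplified model under stated assumptions: it considers only back-to-back instructions, it uses the two-cycle write delay and one-cycle read delay given in the text, and `decode_delay` is a hypothetical helper rather than a tool feature.

```python
GROUP_1 = {f"AR{i}" for i in range(8)} | {"IR0", "IR1", "BK"}
GROUPS = [GROUP_1, {"DP"}, {"SP"}]
NO_READ_DELAY = {"IR0", "IR1", "BK", "DP"}  # reads of these cause no delay


def decode_delay(writes=(), reads=(), next_uses=()):
    """Cycles by which the decode of the next instruction is delayed."""
    delay = 0
    for group in GROUPS:
        if not group & set(next_uses):
            continue                       # next instruction does not use this group
        if group & set(writes):
            delay = max(delay, 2)          # write to the group: two-cycle delay
        elif (group & set(reads)) - NO_READ_DELAY:
            delay = max(delay, 1)          # read of the group: one-cycle delay
    return delay


print(decode_delay(writes={"AR1"}, next_uses={"AR2"}))         # load AR1, then address via AR2
print(decode_delay(reads={"AR0", "AR1"}, next_uses={"AR2"}))   # add AR0 and AR1, then address via AR2
```

Note that the delay applies even when the next instruction uses a different register of the same group, which is the situation in Example 9–3.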
In Example 9–4, two auxiliary registers are added together, with the result go-
ing to an extended-precision register. The next instruction uses a different aux-
iliary register as an address register.
PIPELINE OPERATION
PC F D R E
n ADDI — — —
Loop counter auxiliary registers for the decrement and branch (DBR) instruction
are regarded in the same way as they are for addressing. Therefore, the
operation shown in Example 9–3 and Example 9–4 can also occur for this
instruction.
- Program wait
A program fetch is prevented from beginning.
- Program fetch incomplete
A program fetch takes more than one cycle to complete.
- Execute only
An instruction sequence requires three CPU data accesses in a single
cycle.
- Hold everything
A primary or expansion bus operation must complete before another one
can proceed.
These four types of memory conflicts are illustrated in examples and dis-
cussed in the paragraphs that follow.
Program Wait
- A multicycle CPU data access or DMA data access over the external bus
is needed.
Example 9–5 illustrates a program wait until a CPU data access completes.
In this case, *AR0 and *AR1 are both pointing to data in RAM block 0, and the
MPYF instruction will be fetched from RAM block 0. This results in the conflict
shown in Example 9–5. Since no more than two accesses can be made to
RAM block 0 in a single cycle, the program fetch cannot begin and must wait
until the CPU data accesses are complete.
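The two-accesses-per-block limit gives a quick way to predict this kind of program wait. The following Python sketch counts the accesses aimed at each memory block in one cycle; the block names are arbitrary labels, and the model ignores wait states and DMA traffic.

```python
from collections import Counter

MAX_ACCESSES_PER_BLOCK = 2  # limit per RAM block per cycle, from the text


def fetch_must_wait(fetch_block, data_blocks):
    """True if the program fetch cannot start because its block is already full."""
    data_accesses = Counter(data_blocks)[fetch_block]
    return data_accesses + 1 > MAX_ACCESSES_PER_BLOCK


# Both data operands and the fetched instruction in RAM block 0: the fetch waits.
print(fetch_must_wait("RAM0", ["RAM0", "RAM0"]))
# Moving one operand to RAM block 1 removes the conflict.
print(fetch_must_wait("RAM0", ["RAM0", "RAM1"]))
```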
PIPELINE OPERATION
PC F D R E
n ADDF3 — — —
- Code that has been cached is executed, and the instruction prior to the
ADDF is one of the following (conditional or unconditional):
  - a delayed branch instruction, or
  - a delayed decrement and branch instruction.
Even though the DMA has the lowest priority, multicycle access cannot be
aborted. The program fetch must therefore wait until the DMA access com-
pletes.
PIPELINE OPERATION
PC F D R E
n ADDF — — —
A program fetch incomplete conflict occurs when a program fetch requires more
than one cycle to complete due to wait states. In Example 9–7, the MPYF and
ADDF are fetched from memory that supports single-cycle accesses. The
SUBF is fetched from memory that requires one wait state. One example
that demonstrates this conflict is a fetch across a bank boundary on the
primary port. See Section 7.4 on page 7-30.
PIPELINE OPERATION
PC F D R E
n MPYF — — —
Execute Only
The execute only type of memory pipeline conflict occurs when performing an
interlocked load or when a sequence of instructions requires three CPU data
accesses in a single cycle. There are three cases in which this occurs:
The first case is shown in Example 9–8. Since this sequence requires three
data memory accesses and only two are available, only the execute phase of
the pipeline is allowed to proceed. The dual reads required by the LDF || LDF
are delayed one cycle. Note that a refetch of the next instruction can occur.
PIPELINE OPERATION
PC F D R E
n STF — — —
Example 9–9 shows a parallel store followed by a single load or read. Since
the two parallel stores are required, the next CPU data memory read must wait
a cycle before beginning. One program memory refetch can occur.
PIPELINE OPERATION
PC F D R E
n STF STF — — —
The final case involves an interlocked load (LDII or LDFI) instruction and XF1
= 1. Since the interlocked loads use the XF1 pin as an acknowledge that the
read can complete, the loads might need to extend the read cycle, as shown
in Example 9–10. Note that a program refetch can occur.
PIPELINE OPERATION
PC F D R E
n NOT — — —
Hold Everything
The first type of hold everything conflict occurs when one of the external ports
is busy due to an access that has started but is not complete. In Example 9–11,
the first store is a two-cycle store. The CPU writes the data to an external port.
The port control then takes two cycles to complete the data write. The
LDF is a read over the same external port. Since the store is not complete, the
CPU continues to attempt LDF until the port is available.
PIPELINE OPERATION
PC F D R E
n STF — — —
n+4 Y X W LDF
The second type of hold everything conflict involves multicycle data reads. The
read has begun and continues until completed. In Example 9–12, the LDF is
performed from an external memory that requires several cycles to complete.
PIPELINE OPERATION
PC F D R E
n LDF — — —
n+1 I LDF — —
n+2 J I LDF —
n+3 K2 J I LDF
D = Decode, E = Execute, F = Fetch, R = Read, PC = Program Counter, I, J, K = Instruction Representations
The final type of hold everything conflict involves conditional calls and traps,
which are different from the other branch instructions. Whereas the other
branch instructions are conditional loads, the conditional calls and traps are
conditional stores, which require one cycle more than a conditional branch
(see Example 9–13). The added cycle is used to push the return address after
the call condition is evaluated.
PIPELINE OPERATION
PC F D R E
n CALLcond — — —
n+1 I CALLcond — —
Example 9–14, Example 9–15, and Example 9–16 demonstrate either some
common uses of these registers that do not produce a conflict or ways that you
can avoid the conflict.
PIPELINE OPERATION
PC F D R E
n LDF — — —
Resolving Register Conflicts
PIPELINE OPERATION
PC F D R E
n LDI — — —
Example 9–16. Write to DP Followed by a Direct Memory Read Without a Pipeline Conflict
        LDP     TABLE_ADDR
        POP     R0
        LDF     *-AR3(2),R1
        LDI     @TABLE_ADDR,AR0
        PUSHF   R6
        PUSH    R4
PIPELINE OPERATION
PC F D R E
n LDP — — —
Resolving Memory Conflicts
Table 9–1 shows how many accesses can be performed from the different
memory spaces when it is necessary to do a program fetch and a single data
access and still achieve maximum performance (one cycle). As shown in
Table 9–1, four cases achieve one-cycle maximization.
Table 9–1. One Program Fetch and One Data Access for Maximum Performance
          Primary Bus   Accesses From Dual-Access   Expansion Bus† or
Case #    Accesses      Internal Memory             Peripheral Accesses
1         1             1                           –
2         1             –                           1
3         –             2 from any combination      –
                        of internal memory
4         –             1                           1
† The expansion bus is available only on the TMS320C30.
Table 9–2 shows how many accesses can be performed from the different
memory spaces when it is necessary to do a program fetch and two data ac-
cesses and still achieve maximum performance (one cycle). Six conditions
achieve this maximization.
Table 9–2. One Program Fetch and Two Data Accesses for Maximum Performance
          Primary Bus   Accesses From Dual-Access       Expansion† or Peripheral
Case #    Accesses      Internal Memory                 Bus Accesses
1         1             2 from any combination          –
                        of internal memory
2†        1 Program     1 Data                          1 Data
3†        1 Data        1 Data                          1 Program
4         –             2 from same internal memory     –
                        block and 1 from a different
                        internal memory block
5         –             3 from different internal       –
                        memory blocks
6         –             2 from any combination          1
                        of internal memory
† The expansion bus is available only on the TMS320C30.
Clocking of Memory Accesses
[Figure: minor clock periods H1 and H3]
The precise operation of memory reads and writes can be defined according
to these minor clock periods. The types of memory operations that can occur
are program fetches, data loads and stores, and DMA accesses.
External program fetches always start at the beginning of H3, with the address
being presented on the external bus. At the end of H1, they are completed with
the latching of the instruction word.
Two-operand instructions include all instructions whose bits 31–29 are 000 or
010 (see Figure 9–2). In the case of a data read, bits 15–0 represent the src
operand. Internal data reads are always performed during H1. External data
reads always start at the beginning of H3, with the address being presented
on the external bus; they complete with the latching of the data word at the end
of H1.
[Figure: instruction word format; bit-field boundaries at bits 31, 24, 23, 16, 15, 8, 7, and 0]
In the case of a data store, bits 15–0 represent the dst operand. Internal data
stores are performed during H3. External data stores always start at the begin-
ning of H3, with the address and data being presented on the external bus.
Three-operand instructions include all instructions whose bits 31–29 are 001
(see Figure 9–3). The source operands, src1 and src2, come from either regis-
ters or memory. When one or more of the source operands are from memory,
these instructions are always memory reads.
[Figure: instruction word format; bit-field boundaries at bits 31, 24, 23, 16, 15, 8, 7, and 0]
If only one of the source operands is from memory (either src1 or src2) and is
located in internal memory, the data is read during H1. If the single memory
source operand is in external memory, the read starts at the beginning of H3,
with the address being presented on the external bus, and completes with the
latching of the data word at the end of H1.
If both source operands are to be fetched from memory, several cases occur.
If both operands are located in internal memory, the src1 read is performed
during H3 and the src2 read during H1, thus completing two memory reads in
a single cycle.
If src1 is in internal memory and src2 is in external memory, the src2 access
begins at the start of H3 and latches at the end of H1. At the same time, the
src1 access to internal memory is performed during H3. Again, two memory
reads are completed in a single cycle.
If src1 is in external memory and src2 is in internal memory, two cycles are nec-
essary to complete the two reads. In the first cycle, both operands are ad-
dressed. Since src1 takes an entire cycle to be read and latched from external
memory, the internal operation on src2 cannot be completed until the second
cycle. Ordering the operands so that src1 is located internally is necessary to
achieve single-cycle execution.
If src1 and src2 are both from external memory, two cycles are required to com-
plete the two reads. In the first cycle, the src1 access is performed and loaded
on the next H3; in the second cycle, the src2 access is performed and loaded
on that cycle’s H1.
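The four placement cases for src1 and src2 collapse into one rule: the pair of reads fits in a single cycle only when src1 is internal. The Python sketch below encodes the cases as described; it assumes zero wait states on external memory, and `operand_read_cycles` is an illustrative name.

```python
def operand_read_cycles(src1_external: bool, src2_external: bool) -> int:
    """Cycles needed to read both memory operands of a three-operand instruction."""
    if not src1_external:
        # src1 internal (read in H3); src2 internal (H1) or external (H3 to H1).
        return 1
    # src1 external: the second read cannot complete until the next cycle.
    return 2


for src1_external in (False, True):
    for src2_external in (False, True):
        where = lambda ext: "external" if ext else "internal"
        print(f"src1 {where(src1_external)}, src2 {where(src2_external)}: "
              f"{operand_read_cycles(src1_external, src2_external)} cycle(s)")
```

This is why the text recommends ordering the operands so that the internal one is src1 when only one operand is in external memory.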
PIPELINE OPERATION
PC F D R E
n STI
Two cycles are required for the MSTRB store. Two other cycles are required for the
dummy MSTRB read of *AR3 (because the read follows a write). One cycle is required
for an actual MSTRB read of *AR3.
PIPELINE OPERATION
PC F D R E
n STI
The next class of instructions includes every instruction that has a store in par-
allel with another instruction. Bits 31 and 30 for these instructions are equal
to 1 1.
The instruction word format for those operations that perform a multiply or ALU
operation in parallel with a store is shown in Figure 9–4. If the store operation
to dst2 is external or internal, it is performed during H3. Two bus cycles are
required for external stores, but only one CPU cycle is necessary to complete
the write.
formed during H1. Note that memory reads are performed by the CPU during
the read (R) phase of the pipeline, and stores are performed during the ex-
ecute (E) phase.
[Figure: instruction word format; bit-field boundaries at bits 31, 24, 23, 16, 15, 8, 7, and 0]
The instruction word format for those instructions that have parallel stores to
memory is shown in Figure 9–5. If both destination operands, dst1 and dst2,
are located in internal memory, dst1 is stored during H3 and dst2 during H1,
thus completing two memory stores in a single cycle.
If dst1 is in external memory and dst2 is in internal memory, the dst1 store be-
gins at the start of H3. The dst2 store to internal memory is performed during
H1. Two bus cycles are required for the external store, but only one CPU cycle
is necessary to complete the write. Again, two memory stores are completed
in a single cycle.
If dst1 and dst2 are both written to external memory, a single CPU cycle is still
all that is necessary to complete the stores. In this case, four bus cycles are
required.
1) In the first cycle, both dst1 and dst2 are written to the port, and the external
bus access for dst1 begins.
2) The store for dst1 is completed on the second cycle, and the store for dst2
begins on the third external bus cycle.
3) Finally, the store for dst2 is completed on the fourth external bus cycle.
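The cycle counts for parallel stores can be summarized as follows. This Python sketch encodes the cases given above (one CPU cycle in every case, two external bus cycles per external store). The case of dst1 internal with dst2 external is not described in this section, so treating it symmetrically is an assumption, as is the zero-wait-state external bus.

```python
def parallel_store_cycles(dst1_external: bool, dst2_external: bool) -> dict:
    """CPU cycles and external bus cycles used by a pair of parallel stores."""
    external_stores = int(dst1_external) + int(dst2_external)
    return {
        "cpu_cycles": 1,                             # the CPU always completes in one cycle
        "external_bus_cycles": 2 * external_stores,  # two bus cycles per external store
    }


print(parallel_store_cycles(False, False))  # both destinations internal
print(parallel_store_cycles(True, False))   # dst1 external, dst2 internal
print(parallel_store_cycles(True, True))    # both destinations external
```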
[Figure: instruction word format; bit-field boundaries at bits 31, 24, 23, 16, 15, 8, 7, and 0]
Memory addressing for parallel multiplies and adds is similar to that for three-
operand instructions. The parallel multiplies and adds include all instructions
whose bits 31–30 = 10 (see Figure 9–6).
For these operations, src3 and src4 are both located in memory. If both operands
are located in internal memory, the src3 read is performed during H3, and the
src4 read is performed during H1, thus completing two memory reads in a single cycle.
If src3 is in internal memory and src4 is in external memory, the src4 access
begins at the start of H3 and latches at the end of H1. At the same time, the
src3 access to internal memory is performed during H3. Again, two memory
reads are completed in one cycle.
If src3 is in external memory and src4 is in internal memory, two cycles are nec-
essary to complete the two reads. In the first cycle, the internal src4 access
is performed. During the H3 of the next cycle, the src3 access is performed.
If src3 and src4 are both from external memory, two cycles are necessary to
complete the two reads. In the first cycle, the src3 access is performed; in the
second cycle, the src4 access is performed.
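The instruction classes discussed in this section are distinguished by the top bits of the 32-bit instruction word. The Python sketch below classifies a word using only the bit patterns quoted in the text (bits 31–29 for two- and three-operand instructions, bits 31–30 for the parallel forms); any other pattern is reported as "other", and the function name is illustrative.

```python
def instruction_class(word: int) -> str:
    """Classify a 32-bit instruction word by its most significant bits."""
    top3 = (word >> 29) & 0b111
    if top3 in (0b000, 0b010):
        return "two-operand"
    if top3 == 0b001:
        return "three-operand"
    top2 = top3 >> 1
    if top2 == 0b11:
        return "parallel store"
    if top2 == 0b10:
        return "parallel multiply and add"
    return "other"


for sample in (0x00000000, 0x20000000, 0x40000000, 0x80000000, 0xC0000000):
    print(f"{sample:08X}h: {instruction_class(sample)}")
```

The sample words are arbitrary bit patterns chosen to exercise each branch; they are not meant to be valid encodings of particular instructions.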
[Figure: instruction word format; bit-field boundaries at bits 31, 24, 23, 16, 15, 8, 7, and 0]
The TMS320C3x instruction set can also use one of 20 condition codes with
any of the 10 conditional instructions, such as LDFcond. This chapter defines
the condition codes and flags.
The assembler allows optional syntax forms to simplify the assembly language
for special-case instructions. These optional forms are listed and explained.
Instruction Set
The instruction set contains 113 instructions organized into the following func-
tional groups:
- Load-and-store
- Two-operand arithmetic/logical
- Three-operand arithmetic/logical
- Program control
- Interlocked operations
- Parallel operations
Two of these instructions can load data conditionally. This is useful for locating
the maximum or minimum value in a data set. See Section 10.2 on page 10-10
for detailed information on condition codes.
LDF Load floating-point value
POPF Pop floating-point value from stack
- Three-operand instructions can have two source operands (or one source
operand and a count operand) and a destination operand. A source oper-
and can be a memory word or a register. The destination of a three-oper-
and instruction is always a register.
Table 10–3 lists the instructions that have three-operand versions. Note that
you can omit the 3 in the mnemonic from three-operand instructions (see sub-
section 10.3.2 on page 10-16).
Mnemonic Description
Parallel Arithmetic with Store Instructions
ABSF Absolute value of a floating-point number and store floating-point value
|| STF
ABSI Absolute value of an integer and store integer
|| STI
ADDF3 Add floating-point values and store floating-point value
|| STF
ADDI3 Add integers and store integer
|| STI
AND3 Bitwise logical-AND and store integer
|| STI
ASH3 Arithmetic shift and store integer
|| STI
FIX Convert floating-point to integer and store integer
|| STI
FLOAT Convert integer to floating-point value and store floating-point value
|| STF
LDF Load floating-point value and store floating-point value
|| STF
LDI Load integer and store integer
|| STI
LSH3 Logical shift and store integer
|| STI
MPYF3 Multiply floating-point values and store floating-point value
|| STF
MPYI3 Multiply integer and store integer
|| STI
NEGF Negate floating-point value and store floating-point value
|| STF
NEGI Negate integer and store integer
|| STI
NOT Complement value and store integer
|| STI
OR3 Bitwise logical-OR value and store integer
|| STI
STF Store floating-point values
|| STF
STI Store integers
|| STI
SUBF3 Subtract floating-point value and store floating-point value
|| STF
SUBI3 Subtract integer and store integer
|| STI
XOR3 Bitwise exclusive-OR values and store integer
|| STI
Parallel Load Instructions
LDF Load floating-point
|| LDF
LDI Load integer
|| LDI
Parallel Multiply and Add/Subtract Instructions
MPYF3 Multiply and add floating-point
|| ADDF3
MPYF3 Multiply and subtract floating-point
|| SUBF3
MPYI3 Multiply and add integer
|| ADDI3
MPYI3 Multiply and subtract integer
|| SUBI3
Figure 10–1 on page 10-11 shows the condition flags in the low-order bits of
the status register. Following the figure is a list of status register condition flags
and descriptions of how the flags are set by most instructions. For specific de-
tails of the effect of a particular instruction on the condition flags, see the de-
scription of that instruction in subsection 10.3.3 on page 10-18.
Condition Codes and Flags
N Negative Flag
Logical operations assign N the state of the MSB of the output value. For
integer and floating-point operations, N is set if the result is negative and
cleared otherwise. Zero is treated as positive.
Z Zero Flag
For logical, integer, and floating-point operations, Z is set if the output is 0
and cleared otherwise.
V Overflow Flag
For integer operations, V is set if the result does not fit into the format
specified for the destination (that is, if the result lies outside the range
–2^31 ≤ result ≤ 2^31 – 1 of a 32-bit signed integer). Otherwise, V is cleared.
For floating-point operations, V is set if the exponent of the result is greater
than 127; otherwise, V is cleared. Logical operations always clear V.
C Carry Flag
When an integer addition is performed, C is set if a carry occurs out of the bit
corresponding to the MSB of the output. When an integer subtraction is per-
formed, C is set if a borrow occurs into the bit corresponding to the MSB of the
output. Otherwise, for integer operations, C is cleared. The carry flag is unaf-
fected by floating-point and logical operations. For shift instructions, this flag
is set to the final value shifted out; for a 0 shift count, this is set to 0.
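The N, Z, V, and C rules for an integer addition can be sketched as follows. This is a minimal model rather than TI code: add32_flags is a hypothetical helper, it assumes 32-bit two's-complement operands, it ignores the OVM saturation mode, and it takes N from the sign bit of the 32-bit result.

```python
MASK32 = 0xFFFFFFFF


def add32_flags(a: int, b: int, carry_in: int = 0):
    """Add two 32-bit words and return (result, flags).

    flags is a dict with the N, Z, V and C values that the rules
    in the text would produce for an integer addition.
    """
    a &= MASK32
    b &= MASK32
    total = a + b + (carry_in & 1)
    result = total & MASK32
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    flags = {
        "N": sign_r,                                        # MSB of the result
        "Z": int(result == 0),                              # zero result
        "V": int(sign_a == sign_b and sign_r != sign_a),    # signed overflow
        "C": int(total > MASK32),                           # carry out of the MSB
    }
    return result, flags


if __name__ == "__main__":
    # -53 + 53: the result is 0 and a carry leaves bit 31.
    print(add32_flags(0xFFFFFFCB, 0x35))
```

Adding 0FFFFFFCBh (–53) to 35h (53) in this model gives a zero result with Z and C set, because the unsigned sum carries out of bit 31.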
Table 10–9 lists the condition mnemonic, code, description, and flag for each
of the 20 condition codes.
Individual Instructions
Symbol Meaning
src Source operand
src1 Source operand 1
src2 Source operand 2
src3 Source operand 3
src4 Source operand 4
op1 || op2 Operation 1 performed in parallel with operation 2
C Carry bit
GIE Global interrupt enable bit
N Trap vector
PC Program counter
RM Repeat mode flag
SP System stack pointer
- Empty expressions are not allowed for the displacement in indirect mode:
LDI *+AR0(),R0 is not legal.
- You can use the LDP pseudo-op to load a register (usually DP) with the
eight MSBs of a relocatable address:
LDP addr,REG or LDP @addr,REG
The @ sign is optional.
If the destination REG is the DP, you can omit the DP in the operand. LDP
generates an LDI instruction with an immediate operand and a special re-
location type.
- You can write the parallel bars indicating part 2 of a parallel instruction
anywhere on the line from column 0 to the mnemonic. For example:
   ADDI        can be written as        ADDI
|| STI                                    || STI
- If the second operand of a parallel instruction is the same as the third
(destination register) operand, you can omit the third operand. This allows
you to write three-operand parallel instructions that look like normal
two-operand instructions. For example,
   ADDI *AR0,R2,R2      can be written as      ADDI *AR0,R2
|| MPYI *AR1,R0,R0                          || MPYI *AR1,R0
This applies to all parallel instructions that have a register second
operand: ADDI, ADDF, AND, MPYI, MPYF, OR, SUBI, SUBF,
and XOR.
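The LDP pseudo-op in the list above loads the eight MSBs of an address into a register. The sketch below shows that page value and how it could combine with an offset. It assumes a 24-bit address and a 16-bit direct-addressing offset; ldp_page and direct_address are hypothetical helper names, not assembler features.

```python
def ldp_page(addr: int) -> int:
    """Eight MSBs of a 24-bit address, the value LDP would place in DP."""
    return (addr >> 16) & 0xFF


def direct_address(dp: int, offset: int) -> int:
    """Combine a data-page value with a 16-bit offset (assumed layout)."""
    return ((dp & 0xFF) << 16) | (offset & 0xFFFF)


if __name__ == "__main__":
    # Address 8098AEh lies in data page 80h.
    print(hex(ldp_page(0x8098AE)))
    # With DP = 80h, the direct operand @980Ch refers to 80980Ch.
    print(hex(direct_address(0x80, 0x980C)))
```

This matches the examples in this chapter, where DP = 80h accompanies memory operands such as 8098AEh and 80980Ch.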
Example Instruction EXAMPLE
Each instruction begins with an assembler syntax expression. You can place
labels either before the command (instruction mnemonic) on the same line or
on the preceding line in the first column. The optional comment field that con-
cludes the syntax is not included in the syntax expression. Space(s) are
required between each field (label, command, operand, and comment fields).
The syntax examples illustrate the common one-line syntax and the two-line
syntax used in parallel addressing. Note that the two vertical bars || that indi-
cate a parallel addressing pair can be placed anywhere before the mnemonic
on the second line. The first instruction in the pair can have a label, but the sec-
ond instruction cannot have a label.
or
|src2 | → dst1
|| src3 → dst2
Operands are defined according to the addressing mode and/or the type of ad-
dressing used. Note that indirect addressing uses displacements and the in-
dex registers. Refer to Chapter 5 for detailed information on addressing.
Encoding
31 24 23 16 15 8 7 0
or
31 24 23 16 15 8 7 0
Encoding examples are shown using general addressing and parallel addres-
sing. The instruction pair for the parallel addressing example consists of
INST1 and INST2.
Description Instruction execution and its effect on the rest of the processor or on
memory contents are described. Any constraints on the operands imposed by the
processor or the assembler are discussed. The description parallels and
supplements the information given by the operation block.
Cycles 1
The digit specifies the number of cycles required to execute the instruction.
The seven condition flags stored in the status register (ST) are modified by the
majority of instructions only if the destination register is R7–R0. The flags pro-
vide information about the properties of the result or the output of arithmetic
or logical operations.
Mode Bit OVM Overflow Mode Flag. In general, integer operations are affected by the
OVM bit value (described in Table 3–2 on page 3-6).
Before Instruction:
DP = 80h
R5 = 0766900000h = 2.30562500e+02
Memory at 8098AEh = 5CDFh = 1.00001107e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
DP = 80h
R5 = 0066900000h = 1.80126953e + 00
Memory at 8098AEh = 5CDFh = 1.00001107e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
The sample code presented in the above format shows the effect of the code
on system pointers (for example, DP or SP), registers (for example, R1 or R5),
memory at specific locations, and the seven status bits. The values given for
the registers include the leading 0s to show the exponent in floating-point oper-
ations. Decimal conversions are provided for all register and memory loca-
tions. The seven status bits are listed in the order in which they appear in the
assembler and simulator (see Section 10.2 on page 10-10 and Table 10–9 on
page 10-13 for further information on these seven status bits).
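The decimal conversions shown next to the hexadecimal values can be reproduced with a short decoder. This is a sketch under stated assumptions: a 40-bit register value is taken as an 8-bit two's-complement exponent followed by a sign bit and 31 fraction bits, and a 32-bit memory word as an 8-bit exponent, a sign bit, and 23 fraction bits. An exponent of –128 is assumed to represent zero. The function names are hypothetical.

```python
def _decode(exp: int, man: int, frac_bits: int) -> float:
    """Shared decoder: exponent byte, then sign bit plus fraction bits."""
    if exp >= 0x80:                 # exponent is two's complement
        exp -= 0x100
    if exp == -128:                 # assumed encoding of zero
        return 0.0
    sign = (man >> frac_bits) & 1
    frac = (man & ((1 << frac_bits) - 1)) / (1 << frac_bits)
    lead = -2.0 if sign else 1.0    # implied 01.f or 10.f mantissa
    return (lead + frac) * 2.0 ** exp


def decode_extended(word40: int) -> float:
    """Decode a 40-bit extended-precision register value."""
    return _decode((word40 >> 32) & 0xFF, word40 & 0xFFFFFFFF, 31)


def decode_single(word32: int) -> float:
    """Decode a 32-bit floating-point memory word."""
    return _decode((word32 >> 24) & 0xFF, word32 & 0xFFFFFF, 23)


if __name__ == "__main__":
    print(decode_extended(0x0766900000))   # register value from the sample
    print(decode_single(0x058B4000))       # memory word from a later example
```

The leading exponent byte is why register values are printed with their leading 0s: dropping them would shift the exponent field.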
0 0 0 0 0 0 0 0 0 G dst src
Description The absolute value of the src operand is loaded into the dst register. The src
and dst operands are assumed to be floating-point numbers.
An overflow occurs if src (man) = 80000000h and src (exp) = 7Fh. The result
is dst (man) = 7FFFFFFFh and dst (exp) = 7Fh.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7– R0.
LUF Unaffected
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 0
N 0
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example ABSF R4,R7
Before Instruction:
R4 = 05C8000F971h = –9.90337307e + 27
R7 = 07D251100AEh = 5.48527255e + 37
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R4 = 05C8000F971h = –9.90337307e + 27
R7 = 05C7FFF068Fh = 9.90337307e + 27
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel ABSF and STF ABSF||STF
Encoding
31 24 23 16 15 8 7 0
Description A floating-point absolute value and a floating-point store are performed in par-
allel. All registers are read at the beginning and loaded at the end of the ex-
ecute cycle. This means that if one of the parallel operations (STF) reads from
a register and the operation being performed in parallel (ABSF) writes to the
same register, STF accepts as input the contents of the register before it is mo-
dified by the ABSF.
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
If src3 and dst1 point to the same register, src3 is read before the write to dst1.
An overflow occurs if src (man) = 80000000h and src (exp) = 7Fh. The result
is dst (man) = 7FFFFFFFh and dst (exp) = 7Fh.
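The read-before-write rule described above can be modeled in two phases. This is a simplified sketch: absf_par_stf is a hypothetical helper that works on Python floats, uses plain dictionaries for the register file and memory, and leaves out address-register updates, overflow handling, and status flags.

```python
def absf_par_stf(regs, mem, src2_addr, dst1, src3, dst2_addr):
    """Model ABSF src2,dst1 || STF src3,dst2.

    regs maps register names to values; mem maps addresses to values.
    """
    # Read phase: every operand is sampled before anything is written.
    src2_val = mem[src2_addr]
    src3_val = regs[src3]
    # Write phase: both results are committed at the end of the cycle.
    regs[dst1] = abs(src2_val)
    mem[dst2_addr] = src3_val


if __name__ == "__main__":
    regs = {"R4": 179.75}
    mem = {0x8098AF: -61.1875, 0x8098C4: 0.0}
    # R4 is both the ABSF destination and the STF source.
    absf_par_stf(regs, mem, 0x8098AF, "R4", "R4", 0x8098C4)
    print(regs["R4"], mem[0x8098C4])
```

Because R4 is sampled before ABSF overwrites it, the store writes the old value 179.75 while R4 ends up holding 61.1875.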
Cycles 1
Before Instruction:
AR3 = 809800h
IR1 = 0AFh
R4 = 733C00000h = 1.79750e + 02
AR7 = 8098C5h
Data at 8098AFh = 58B4000h = – 6.118750e + 01
Data at 8098C4h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR3 = 8098AFh
IR1 = 0AFh
R4 = 574C00000h = 6.118750e + 01
AR7 = 8098C5h
Data at 8098AFh = 58B4000h = –6.118750e + 01
Data at 8098C4h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Absolute Value of Integer ABSI
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 0 0 0 0 1 G dst src
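The bit row above can be read as fields of a 32-bit instruction word. The sketch below packs such a word. It assumes a layout that this extract does not spell out: the nine leading bits in bits 31–23, G in bits 22–21, dst in bits 20–16, and src in bits 15–0. The function encode_general is a hypothetical helper, not an assembler API.

```python
def encode_general(opcode9: int, g: int, dst: int, src: int) -> int:
    """Pack a general-format instruction word from its fields.

    Assumed widths: opcode9 is 9 bits, g is 2 bits, dst is 5 bits,
    src is 16 bits.
    """
    for name, value, width in (("opcode9", opcode9, 9), ("g", g, 2),
                               ("dst", dst, 5), ("src", src, 16)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (opcode9 << 23) | (g << 21) | (dst << 16) | src


if __name__ == "__main__":
    # Leading bits 0 0 0 0 0 0 0 0 1 with all other fields zero.
    print(f"{encode_general(0b000000001, 0, 0, 0):08X}")
```

With all other fields zero, the leading bits 0 0 0 0 0 0 0 0 1 alone give the word 00800000h under this assumed layout.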
Description The absolute value of the src operand is loaded into the dst register. The src
and dst operands are assumed to be signed integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7– R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 0
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
R0 = 0FFFFFFCBh = – 53
After Instruction:
R0 = 035h = 53
Before Instruction:
AR1 = 20h
R3 = 0h
Data at 20h = 0FFFFFFCBh = – 53
After Instruction:
AR1 = 20h
R3 = 35h = 53
Data at 20h = 0FFFFFFCBh = – 53
Parallel ABSI and STI ABSI||STI
Encoding
31 24 23 16 15 8 7 0
Description An integer absolute value and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle.
This means that, if one of the parallel operations (STI) reads from a register
and the operation being performed in parallel (ABSI) writes to the same regis-
ter, STI accepts as input the contents of the register before it is modified by the
ABSI.
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 0
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
AR5 = 8099E2h
R5 = 0h
R1 = 42h = 66
AR2 = 8098FFh
IR1 = 0Fh
Data at 8099E1h = 0FFFFFFCBh = – 53
Data at 8098FFh = 2h = 2
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR5 = 8099E2h
R5 = 35h = 53
R1 = 42h = 66
AR2 = 8098F0h
IR1 = 0Fh
Data at 8099E1h = 0FFFFFFCBh = – 53
Data at 8098FFh = 42h = 66
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Add Integer With Carry ADDC
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 0 0 0 1 0 G dst src
Description The sum of the dst and src operands and the carry (C) flag is loaded into the
dst register. The dst and src operands are assumed to be signed integers.
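A common use of an add-with-carry operation is multiword arithmetic. The sketch below chains two 32-bit additions into a 64-bit sum. It is an illustration only: add_with_carry and add64 are hypothetical helpers, and pairing a plain add for the low word with an add-with-carry for the high word is an assumed usage pattern, not something this section prescribes.

```python
MASK32 = 0xFFFFFFFF


def add_with_carry(dst: int, src: int, carry: int):
    """Return (dst + src + carry) as a 32-bit word plus the carry out."""
    total = (dst & MASK32) + (src & MASK32) + (carry & 1)
    return total & MASK32, total >> 32


def add64(a_hi: int, a_lo: int, b_hi: int, b_lo: int):
    """Add two 64-bit values held as (high, low) 32-bit word pairs."""
    lo, carry = add_with_carry(a_lo, b_lo, 0)    # low words, no carry in
    hi, _ = add_with_carry(a_hi, b_hi, carry)    # high words plus the carry
    return hi, lo


if __name__ == "__main__":
    hi, lo = add64(0x0, 0xFFFFFFFF, 0x0, 0x1)
    print(f"{hi:08X} {lo:08X}")
```

The carry out of the low-word addition is the only state passed between the two steps, which is the role the C flag plays in the description above.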
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a carry occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
R1 = 00FFFF5C25h = – 41,947
R5 = 00FFFF019Eh = – 65,122
LUF LV UF N Z V C = 0 0 0 0 0 0 1
After Instruction:
R1 = 00FFFF5C25h = – 41,947
R5 = 00FFFE5DC4h = – 107,068
LUF LV UF N Z V C = 0 0 0 1 0 0 1
Encoding
31 24 23 16 15 8 7 0
Description The sum of the src1 and src2 operands and the carry (C) flag is loaded into
the dst register. The src1, src2, and dst operands are assumed to be signed
integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a carry occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Add Integer With Carry, 3-Operand ADDC3
Before Instruction:
AR5 = 809908h
IR0 = 10h
R5 = 066h = 102
R2 = 0h
Data at 809908h = 0FFFFFFCBh = – 53
LUF LV UF N Z V C = 0 0 0 0 0 0 1
After Instruction:
AR5 = 809918h
IR0 = 10h
R5 = 066h = 102
R2 = 032h = 50
Data at 809908h = 0FFFFFFCBh = – 53
LUF LV UF N Z V C = 0 0 0 0 0 0 1
Before Instruction:
R2 = 02BCh = 700
R7 = 0F82h = 3970
R0 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 1
After Instruction:
R2 = 02BCh = 700
R7 = 0F82h = 3970
R0 = 0123Fh = 4671
LUF LV UF N Z V C = 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 G dst src
Description The sum of the dst and src operands is loaded into the dst register. The dst and
src operands are assumed to be floating-point numbers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example ADDF *AR4++(IR1),R5
Before Instruction:
AR4 = 809800h
IR1 = 12Bh
R5 = 0579800000h = 6.23750e+01
Data at 809800h = 86B2800h = 4.7031250e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR4 = 80992Bh
IR1 = 12Bh
R5 = 09052C0000h = 5.3268750e+02
Data at 809800h = 86B2800h = 4.7031250e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Add Floating-Point, 3-Operand ADDF3
Description The sum of the src1 and src2 operands is loaded into the dst register. The src1,
src2, and dst operands are assumed to be floating-point numbers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example 1 ADDF3 R6,R5,R1
or
ADDF3 R5,R6,R1
Before Instruction:
R6 = 086B280000h = 4.7031250e + 02
R5 = 0579800000h = 6.23750e+01
R1 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R6 = 086B280000h = 4.7031250e + 02
R5 = 0579800000h = 6.23750e + 01
R1 = 09052C0000h = 5.3268750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Before Instruction:
AR1 = 809820h
AR7 = 8099F0h
IR0 = 8h
R4 = 0h
Data at 809821h = 700F000h = 1.28940e + 02
Data at 8099F0h = 34C2000h = 1.27590e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR1 = 809820h
AR7 = 8099F8h
IR0 = 8h
R4 = 070DB20000h = 1.41695313e + 02
Data at 809821h = 700F000h = 1.28940e + 02
Data at 8099F0h = 34C2000h = 1.27590e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel ADDF3 and STF ADDF3||STF
Encoding
31 24 23 16 15 8 7 0
Description A floating-point addition and a floating-point store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle.
This means that if one of the parallel operations (STF) reads from a register
and the operation being performed in parallel (ADDF3) writes to the same reg-
ister, STF accepts as input the contents of the register before it is modified by
the ADDF3.
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Before Instruction:
AR3 = 809800h
IR1 = 0A5h
R2 = 070C800000h = 1.4050e + 02
R5 = 0h
R4 = 057B400000h = 6.281250e + 01
AR2 = 8098F3h
Data at 8098A5h = 733C000h = 1.79750e + 02
Data at 8098F3h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR3 = 809800h
IR1 = 0A5h
R2 = 070C800000h = 1.4050e+02
R5 = 0820200000h = 3.20250e + 02
R4 = 057B400000h = 6.281250e + 01
AR2 = 8098F3h
Data at 8098A5h = 733C000h = 1.79750e + 02
Data at 8098F3h = 57B4000h = 6.28125e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Add Integer ADDI
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 0 0 1 0 0 G dst src
Description The sum of the dst and src operands is loaded into the dst register. The
dst and src operands are assumed to be signed integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a carry occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
R3 = 0FFFFFFCBh = – 53
R7 = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R3 = 0FFFFFFCBh = – 53
R7 = 0h
LUF LV UF N Z V C = 0 0 0 0 1 0 1
Encoding
31 24 23 16 15 8 7 0
Description The sum of the src1 and src2 operands is loaded into the dst register. The src1,
src2, and dst operands are assumed to be signed integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a carry occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
R4 = 0DCh = 220
R7 = 0A0h = 160
R5 = 10h = 16
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Add Integer, 3-Operand ADDI3
After Instruction:
R4 = 0DCh = 220
R7 = 0A0h = 160
R5 = 017Ch = 380
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Before Instruction:
AR3 = 809802h
AR6 = 809930h
IR0 = 18h
R2 = 10h = 16
Data at 809801h = 2AF8h = 11,000
Data at 809930h = 3A98h = 15,000
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR3 = 809802h
AR6 = 809918h
IR0 = 18h
R2 = 06590h = 26,000
Data at 809801h = 2AF8h = 11,000
Data at 809930h = 3A98h = 15,000
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
Description An integer addition and an integer store are performed in parallel. All registers
are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (ADDI3) writes to the same register, STI
accepts as input the contents of the register before it is modified by the ADDI3.
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a carry occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Parallel ADDI3 and STI ADDI3||STI
Before Instruction:
AR0 = 80992Ch
IR0 = 0Ch
R5 = 0DCh = 220
R0 = 0h
R3 = 35h = 53
AR7 = 80983Bh
Data at 80992Ch = 12Ch = 300
Data at 80983Bh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR0 = 809920h
IR0 = 0Ch
R5 = 0DCh = 220
R0 = 208h = 520
R3 = 35h = 53
AR7 = 80983Bh
Data at 80992Ch = 12Ch = 300
Data at 80983Bh = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 0 0 1 0 1 G dst src
Description The bitwise logical-AND between the dst and src operands is loaded into the
dst register. The dst and src operands are assumed to be unsigned integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
R1 = 80h
R2 = 0AFFh
LUF LV UF N Z V C = 0 0 0 0 0 0 1
After Instruction:
R1 = 80h
R2 = 80h
LUF LV UF N Z V C = 0 0 0 0 0 0 1
Bitwise Logical-AND, 3-Operand AND3
Encoding
31 24 23 16 15 8 7 0
Description The bitwise logical-AND between the src1 and src2 operands is loaded into
the destination register. The src1, src2, and dst operands are assumed to be
unsigned integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
AR0 = 8098F4h
IR0 = 50h
AR1 = 809951h
R4 = 0h
Data at 8098F4h = 30h
Data at 809952h = 123h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR0 = 8098A4h
IR0 = 50h
AR1 = 809951h
R4 = 020h
Data at 8098F4h = 30h
Data at 809952h = 123h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Before Instruction:
AR5 = 80985Ch
R7 = 2h
R4 = 0h
Data at 80985Bh = 0AFFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR5 = 80985Ch
R7 = 2h
R4 = 2h
Data at 80985Bh = 0AFFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel AND3 and STI AND3||STI
Encoding
31 24 23 16 15 8 7 0
Description A bitwise logical-AND and an integer store are performed in parallel. All regis-
ters are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (AND3) writes to the same register, STI
accepts as input the contents of the register before it is modified by the AND3.
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
AR1 = 8099F1h
IR0 = 8h
R4 = 0A323h
R7 = 0h
R3 = 35h = 53
AR2 = 80983Fh
Data at 8099F9h = 5C53h
Data at 80983Fh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR1 = 8099F1h
IR0 = 8h
R4 = 0A323h
R7 = 03h
R3 = 35h = 53
AR2 = 80983Fh
Data at 8099F9h = 5C53h
Data at 80983Fh = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Bitwise Logical-AND With Complement ANDN
0 0 0 0 0 0 1 1 0 G dst src
Description The bitwise logical-AND between the dst operand and the bitwise logical com-
plement (∼) of the src operand is loaded into the dst register. The dst and src
operands are assumed to be unsigned integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example ANDN @980Ch,R2
Before Instruction:
DP = 80h
R2 = 0C2Fh
Data at 80980Ch = 0A02h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
DP = 80h
R2 = 042Dh
Data at 80980Ch = 0A02h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
Description The bitwise logical-AND between the src1 operand and the bitwise logical
complement (∼) of the src2 operand is loaded into the dst register. The src1,
src2, and dst operands are assumed to be unsigned integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example 1 ANDN3 R5,R3,R7
Before Instruction:
R5 = 0A02h
R3 = 0C2Fh
R7 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Bitwise Logical-ANDN, 3-Operand ANDN3
After Instruction:
R5 = 0A02h
R3 = 0C2Fh
R7 = 042Dh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Before Instruction:
R1 = 0CFh
AR5 = 809825h
IR0 = 5h
R0 = 0h
Data at 809825h = 0FFFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R1 = 0CFh
AR5 = 80982Ah
IR0 = 5h
R0 = 0F30h
Data at 809825h = 0FFFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
If (count ≥ 0):
dst << count → dst
Else:
dst >> |count| → dst
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 0 0 1 1 1 G dst count
Description The seven least significant bits of the count operand are used to generate the
two’s complement shift count of up to 32 bits.
If the count operand is greater than 0, the dst operand is left-shifted by the
value of the count operand. Low-order bits shifted in are 0-filled, and
high-order bits are shifted out through the carry (C) bit.
Arithmetic left-shift:
C ← dst ← 0
If the count operand is less than 0, the dst operand is right-shifted by the abso-
lute value of the count operand. The high-order bits of the dst operand are sign-
extended as it is right-shifted. Low-order bits are shifted out through the C bit.
Arithmetic right-shift:
sign of dst → dst → C
If the count operand is 0, no shift is performed, and the C bit is set to 0. The
count and dst operands are assumed to be signed integers.
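The shift rules above can be sketched as follows. This is a model under assumptions, not TI code: shift_count and ash are hypothetical helpers, the destination is treated as a 32-bit word, and the carry for left shifts longer than 32 bits is not modeled carefully.

```python
MASK32 = 0xFFFFFFFF


def shift_count(count_operand: int) -> int:
    """Seven LSBs of the operand, read as a two's-complement number."""
    c = count_operand & 0x7F
    return c - 0x80 if c & 0x40 else c


def ash(dst: int, count_operand: int):
    """Return (result, carry) for an arithmetic shift of a 32-bit word."""
    dst &= MASK32
    n = shift_count(count_operand)
    if n == 0:
        return dst, 0                      # no shift, C is cleared
    if n > 0:
        shifted = dst << n                 # zeros enter from the right
        carry = (shifted >> 32) & 1        # last bit shifted out of bit 31
        return shifted & MASK32, carry
    signed = dst - (1 << 32) if dst >> 31 else dst
    k = -n
    carry = (signed >> (k - 1)) & 1        # last bit shifted out of bit 0
    return (signed >> k) & MASK32, carry   # Python's >> sign-extends


if __name__ == "__main__":
    result, carry = ash(0xAEC00001, 0xFFE8)   # count field decodes to -24
    print(f"{result:08X} C={carry}")
```

Only the seven LSBs of the count matter, so an operand of 0FFE8h and a full 32-bit –24 produce the same right shift by 24.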
Cycles 1
Arithmetic Shift ASH
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
R1 = 10h = 16
R3 = 0AE000h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R1 = 10h
R3 = 0E0000000h
LUF LV UF N Z V C = 0 1 0 1 0 1 0
Before Instruction:
DP = 80h
R5 = 0AEC00001h
Data at 8098C3h = 0FFE8h = – 24
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
DP = 80h
R5 = 0FFFFFFAEh
Data at 8098C3h = 0FFE8h = – 24
LUF LV UF N Z V C = 0 0 0 1 0 0 1
Description The seven least significant bits of the count operand are used to generate the
two’s complement shift count of up to 32 bits.
If the count operand is greater than 0, the src operand is left-shifted by the
value of the count operand. Low-order bits shifted in are 0-filled, and
high-order bits are shifted out through the status register’s C bit.
Arithmetic left-shift:
C ← src ← 0
If the count operand is less than 0, the src operand is right-shifted by the abso-
lute value of the count operand. The high-order bits of the src operand are sign-
extended as they are right-shifted. Low-order bits are shifted out through the
C (carry) bit.
Arithmetic right-shift:
sign of src → src → C
If the count operand is 0, no shift is performed, and the C bit is set to 0. The
count, src, and dst operands are assumed to be signed integers.
Arithmetic Shift, 3-Operand ASH3
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N MSB of the output.
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Mode Bit OVM Operation is not affected by OVM bit value.
Example ASH3 *AR3– –(1),R5,R0
Before Instruction:
AR3 = 809921h
R5 = 02B0h
R0 = 0h
Data at 809921h = 10h = 16
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR3 = 809920h
R5 = 000002B0h
R0 = 02B00000h
Data at 809921h = 10h = 16
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Example ASH3 R1,R3,R5
Before Instruction:
R1 = 0FFFFFFF8h = – 8
R3 = 0FFFFCB00h
R5 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R1 = 0FFFFFFF8h = – 8
R3 = 0FFFFCB00h
R5 = 0FFFFFFCBh
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Description The seven least significant bits of the count operand register are used to gen-
erate the two’s complement shift count of up to 32 bits.
If the count operand is greater than 0, the src2 operand is left-shifted by the
value of the count operand. Low-order bits shifted in are 0-filled, and
high-order bits are shifted out through the C bit.
Arithmetic left-shift:
C ← src2 ← 0
If the count operand is less than 0, the src2 operand is right-shifted by the ab-
solute value of the count operand. The high-order bits of the src2 operand are
sign-extended as it is right-shifted. Low-order bits are shifted out through the
C bit.
Arithmetic right-shift:
sign of src2 → src2 → C
If the count operand is 0, no shift is performed, and the C bit is set to 0. The
count and dst operands are assumed to be signed integers.
All registers are read at the beginning and loaded at the end of the execute
cycle. This means that, if one of the parallel operations (STI) reads from a reg-
ister and the operation being performed in parallel (ASH3) writes to the same
register, STI accepts as input the contents of the register before it is modified
by the ASH3.
Parallel ASH3 and STI ASH3||STI
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N MSB of the output
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
AR6 = 809900h
IR1 = 8Ch
R1 = 0FFE8h = – 24
R0 = 0h
R5 = 35h = 53
AR2 = 8098A2h
Data at 809900h = 0AE000000h
Data at 8098A2h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR6 = 80998Ch
IR1 = 8Ch
R1 = 0FFE8h = – 24
R0 = 0FFFFFFAEh
R5 = 35h = 53
AR2 = 8098A2h
Data at 809900h = 0AE000000h
Data at 8098A2h = 35h = 53
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Encoding
31 24 23 16 15 8 7 0
Description Bcond signifies a standard branch that executes in four cycles, because a
pipeline flush also occurs when the condition is true (see Section 9.2 on
page 9-4). A branch is performed if the condition is true. If the src operand is
expressed in register addressing mode, the contents of the specified register
are loaded into the PC. If the src operand is expressed in PC-relative mode,
the assembler generates a displacement:
displacement = label – (PC of branch instruction + 1). This displacement is
stored as a 16-bit signed integer in the 16 least significant bits of the branch
instruction word. This displacement is added to the PC of the branch
instruction plus 1 to generate the new PC.
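As a rough illustration of the displacement arithmetic, the Python sketch below computes the 16-bit field and reconstructs the target from it. The function names are hypothetical and the range check is an assumption about how an assembler might reject far targets. The offset of 3 for delayed branches follows the BcondD description.

```python
def encode_displacement(label, branch_pc, delayed=False):
    """displacement = label - (PC of branch + 1), or + 3 for a delayed branch."""
    offset = 3 if delayed else 1
    disp = label - (branch_pc + offset)
    if not -0x8000 <= disp <= 0x7FFF:
        raise ValueError("target out of range for a 16-bit signed displacement")
    return disp & 0xFFFF                     # 16 LSBs of the instruction word

def branch_target(field, branch_pc, delayed=False):
    """Sign-extend the 16-bit field and add it to the branch PC plus offset."""
    offset = 3 if delayed else 1
    disp = field - 0x10000 if field & 0x8000 else field
    return branch_pc + offset + disp

field = encode_displacement(0x10, 0x20)      # backward branch
print(hex(field), hex(branch_target(field, 0x20)))
```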
The TMS320C3x provides 20 condition codes that you can use with this
instruction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction only
when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI, CMPI3,
TSTB, or TSTB3) is executed.
Cycles 4
Branch Conditionally (Standard) Bcond
Example BZ R0
Before Instruction:
PC = 2B00h
R0 = 0003FF00h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 3FF00h
R0 = 0003FF00h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Note:
If a BZ instruction is executed immediately following a RND instruction with
a 0 operand, the branch is not performed, because the zero (Z) flag is not set.
To circumvent this problem, execute a BZUF instead of a BZ instruction.
Description BcondD signifies a delayed branch that allows the three instructions after the
delayed branch to be fetched before the PC is modified. The effect is a
single-cycle branch, and the three instructions following BcondD will not affect
the cond.
A branch is performed if the condition is true. If the src operand is expressed
in register-addressing mode, the contents of the specified register are loaded
into the PC. If the src operand is expressed in PC-relative mode, the assembler
generates a displacement: displacement = label – (PC of branch instruction
+ 3). This displacement is stored as a 16-bit signed integer in the 16 least
significant bits of the branch instruction. This displacement is added to the PC of
the branch instruction plus 3 to generate the new PC.
The TMS320C3x provides 20 condition codes that you can use with this
instruction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction
only when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI, CMPI3,
TSTB, or TSTB3) is executed.
Cycles 1
Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Branch Conditionally (Delayed) BcondD
Before Instruction:
PC = 50h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 77h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Syntax BR src
Encoding
31 24 23 16 15 8 7 0
0 1 1 0 0 0 0 0 disp
Description BR performs a PC-relative branch that executes in four cycles, since a pipeline
flush also occurs upon execution of the branch; see Section 9.2 on page 9-4.
An unconditional branch is performed. The src operand is assumed to be a
24-bit unsigned integer. Note that bit 24 = 0 for a standard branch.
Cycles 4
Example BR 805Ch
Before Instruction:
PC = 80h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 805Ch
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Branch Unconditionally (Delayed) BRD
Operation src → PC
Encoding
31 24 23 16 15 8 7 0
0 1 1 0 0 0 0 1 src
Description BRD signifies a delayed branch that allows the three instructions after the
delayed branch to be fetched before the PC is modified. The effect is a
single-cycle branch.
Cycles 1
Before Instruction:
PC = 1Bh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 2Ch
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
0 1 1 0 0 0 1 0 src
Description A call is performed. The next PC value is pushed onto the system stack. The
src operand is loaded into the PC. The src operand is assumed to be a 24-bit
unsigned immediate operand.
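The call sequence can be sketched as a small host-side model. This Python fragment is an illustration under stated assumptions: memory is a plain dictionary, the function name is hypothetical, and the stack pointer is incremented before the store, which is inferred from the SP and memory values in the example for this instruction.

```python
def call(pc, sp, memory, src):
    """Model CALL: push the next PC, then load the 24-bit src into the PC."""
    next_pc = pc + 1
    sp += 1                      # assumption: SP is incremented before the write
    memory[sp] = next_pc         # return address goes on the system stack
    return src & 0xFFFFFF, sp    # src is a 24-bit unsigned immediate

memory = {}
pc, sp = call(0x5, 0x809801, memory, 0x123456)
print(hex(pc), hex(sp), hex(memory[sp]))
```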
Cycles 4
Before Instruction:
PC = 5h
SP = 809801h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 123456h
SP = 809802h
Data at 809802h = 6h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Call Subroutine Conditionally CALLcond
Description A call is performed if the condition is true. If the condition is true, the next PC
value is pushed onto the system stack. If the src operand is expressed in regis-
ter addressing mode, the contents of the specified register are loaded into the
PC. If the src operand is expressed in PC-relative mode, the assembler gener-
ates a displacement: displacement = label – (PC of call instruction + 1). This
displacement is stored as a 16-bit signed integer in the 16 least significant bits
of the call instruction word. This displacement is added to the PC of the call
instruction plus 1 to generate the new PC.
The TMS320C3x provides 20 condition codes that can be used with this
instruction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction only
when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI, CMPI3,
TSTB, or TSTB3) is executed.
Cycles 5
Example CALLNZ R5
Before Instruction:
PC = 123h
SP = 809835h
R5 = 789h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 789h
SP = 809836h
R5 = 789h
Data at 809836h = 124h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Compare Floating-Point CMPF
0 0 0 0 0 1 0 0 0 G dst src
Description The src operand is subtracted from the dst operand. The result is not loaded
into any register, thus allowing for nondestructive compares. The dst and src
operands are assumed to be floating-point numbers.
Cycles 1
Status Bits These condition flags are modified for all destination registers (R27 – R0).
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example CMPF *+AR4,R6
Before Instruction:
AR4 = 8098F2h
R6 = 070C800000h = 1.4050e+02
Data at 8098F3h = 070C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR4 = 8098F2h
R6 = 070C800000h = 1.4050e + 02
Data at 8098F3h = 070C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 1 0 0
Encoding
31 24 23 16 15 8 7 0
0 0 1 0 0 0 1 1 0 T 0 0 0 0 0 src1 src2
Description The src2 operand is subtracted from the src1 operand. The result is not loaded
into any register, thus allowing for nondestructive compares. The src1 and
src2 operands are assumed to be floating-point numbers. Although this in-
struction has only two operands, it is designated as a three-operand instruc-
tion because operands are specified in the three-operand format.
Cycles 1
Status Bits These condition flags are modified for all destination registers (R27 – R0).
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Compare Floating-Point, 3-Operand CMPF3
Before Instruction:
AR2 = 809831h
AR3 = 809852h
Data at 809831h = 77A7000h = 2.5044e + 02
Data at 809852h = 57A2000h = 6.253125e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR2 = 809831h
AR3 = 809851h
Data at 809831h = 77A7000h = 2.5044e + 02
Data at 809852h = 57A2000h = 6.253125e + 01
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 0 1 0 0 1 G dst src
Description The src operand is subtracted from the dst operand. The result is not loaded
into any register, thus allowing for nondestructive compares. The dst and src
operands are assumed to be signed integers.
Cycles 1
Status Bits These condition flags are modified for all destination registers (R27 – R0).
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is not affected by OVM bit value.
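The flag definitions above can be checked with a small host-side model. The Python sketch below is illustrative: the function name is hypothetical, N is taken as the MSB of the 32-bit difference, and only N, Z, V, and C are modeled (the latched LV flag and UF are left out).

```python
MASK32 = 0xFFFFFFFF

def cmpi_flags(dst, src):
    """Flags produced by dst - src on 32-bit signed integers (result discarded)."""
    a, b = dst & MASK32, src & MASK32
    diff = (a - b) & MASK32
    sign_a, sign_b, sign_r = a >> 31, b >> 31, diff >> 31
    return {
        "N": sign_r,                                        # negative result
        "Z": int(diff == 0),                                # zero result
        "V": int(sign_a != sign_b and sign_r != sign_a),    # signed overflow
        "C": int(a < b),                                    # borrow
    }

print(cmpi_flags(1000, 2200))
```

With dst = 1000 and src = 2200 the model reports N = 1 and C = 1, which agrees with the register values in the example that follows.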
Before Instruction:
R3 = 898h = 2200
R7 = 3E8h = 1000
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R3 = 898h = 2200
R7 = 3E8h = 1000
LUF LV UF N Z V C = 0 0 0 1 0 0 1
Compare Integer, 3-Operand CMPI3
Encoding
31 24 23 16 15 8 7 0
0 0 1 0 0 0 1 1 1 T 0 0 0 0 0 src1 src2
Description The src2 operand is subtracted from the src1 operand. The result is not loaded
into any register, thus allowing for nondestructive compares. The src1 and
src2 operands are assumed to be signed integers. Although this instruction
has only two operands, it is designated as a three-operand instruction be-
cause operands are specified in the three-operand format.
Cycles 1
Status Bits These condition flags are modified for all destination registers (R27 – R0).
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Before Instruction:
R7 = 03E8h = 1000
R4 = 0898h = 2200
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R7 = 03E8h = 1000
R4 = 0898h = 2200
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Decrement and Branch Conditionally (Standard) DBcond
Description DBcond signifies a standard branch that executes in four cycles because the
pipeline must be flushed if cond is true. The specified auxiliary register is
decremented, and a branch is performed if the condition is true and the specified
auxiliary register is greater than or equal to 0. The condition flags are those set
by the most recent instruction that affects the status bits.
The auxiliary register is treated as a 24-bit signed integer. The most significant
eight bits are unmodified by the decrement operation. The comparison of the
auxiliary register uses only the 24 least significant bits of the auxiliary register.
Note that the branch condition does not depend on the auxiliary register decre-
ment.
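The decrement and test can be sketched as follows. This Python model is illustrative only: the function name is hypothetical, and it reads the description as testing the auxiliary register after the decrement.

```python
def dbcond(ar, cond_true):
    """Return (new_ar, branch_taken) for a decrement-and-branch."""
    low = (ar - 1) & 0xFFFFFF                 # decrement the 24 LSBs only
    new_ar = (ar & 0xFF000000) | low          # the 8 MSBs are left unmodified
    signed = low - 0x1000000 if low & 0x800000 else low
    # cond_true comes from the status flags, not from the decrement itself
    return new_ar, bool(cond_true and signed >= 0)

print(dbcond(0x12, True))
```

A loop that starts with the register at n therefore branches n times and falls through on the pass where the register wraps to –1.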
If the src operand is expressed in register addressing mode, the contents of
the specified register are loaded into the PC. If the src operand is expressed
in PC-relative addressing mode, the assembler generates a displacement:
displacement = label – (PC of branch instruction + 1). This displacement is
stored as a 16-bit signed integer in the 16 least significant bits of the branch
instruction word. This displacement is added to the PC of the branch instruction
plus 1 to generate the new PC.
The TMS320C3x provides 20 condition codes that can be used with this
instruction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction only
when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI,
CMPI3, TSTB, or TSTB3) is executed.
Cycles 4
Before Instruction:
PC = 5Fh
AR3 = 12h
R2 = 9Fh
R3 = 80h
LUF LV UF N Z V C = 0 0 0 1 0 0 0
After Instruction:
PC = 9Fh
AR3 = 11h
R2 = 9Fh
R3 = 80h
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Decrement and Branch Conditionally (Delayed) DBcondD
Else, continue.
ARn register (0 ≤ n ≤ 7)
Encoding
31 24 23 16 15 8 7 0
Description DBcondD signifies a delayed branch that allows the three instructions after
the delayed branch to be fetched before the PC is modified. The effect is a
single-cycle branch. The specified auxiliary register is decremented, and a
branch is performed if the condition is true and the specified auxiliary register
is greater than or equal to 0. The condition flags are those set by the most
recent instruction that affects the status bits. The three instructions following
the DBcondD do not affect the cond.
The auxiliary register is treated as a 24-bit signed integer. The most significant
eight bits are unmodified by the decrement operation. The comparison of the
auxiliary register uses only the 24 least significant bits of the auxiliary register.
Note that the branch condition does not depend on the auxiliary register decre-
ment.
The TMS320C3x provides 20 condition codes that you can use with this in-
struction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction
only when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI,
CMPI3, TSTB, or TSTB3) is executed.
Cycles 1
Before Instruction:
PC = 100h
R2 = 26h
AR5 = 67h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 210h
R2 = 26h
AR5 = 66h
LUF LV UF N Z V C = 0 0 0 0 1 0 0
Floating-Point-to-Integer Conversion FIX
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 0 1 0 1 0 G dst src
Description The floating-point operand src is converted to the nearest integer less than or
equal to it in value, and the result is loaded into the dst register. The src oper-
and is assumed to be a floating-point number and the dst operand a signed
integer.
The exponent field of the result register (if it has one) is not modified.
Integer overflow occurs when the floating-point number is too large to be rep-
resented as a 32-bit two’s complement integer. In the case of integer overflow,
the result will be saturated in the direction of overflow.
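The conversion rule (round toward the nearest integer less than or equal to the operand, saturating on overflow) can be modeled on a host machine. In this Python sketch the function name is hypothetical and the source value is an ordinary Python float rather than a TMS320C3x floating-point word.

```python
import math

INT_MAX = 0x7FFFFFFF
INT_MIN = -0x80000000

def fix(x):
    """Return (integer result, V flag) for a floating-point-to-integer conversion."""
    n = math.floor(x)            # nearest integer less than or equal to x
    if n > INT_MAX:
        return INT_MAX, 1        # saturate in the direction of overflow
    if n < INT_MIN:
        return INT_MIN, 1
    return n, 0

print(fix(1345.4), fix(-1.5))
```

Note that negative values round away from zero under this rule, so –1.5 converts to –2.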
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
R1 = 0A28200000h = 1.3454e + 3
R2 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R1 = 0A28200000h = 1.3454e + 3
R2 = 541h = 1345
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel FIX and STI FIX||STI
Encoding
31 24 23 16 15 8 7 0
Description A floating-point to integer conversion is performed. All registers are read at the
beginning and loaded at the end of the execute cycle. This means that, if one
of the parallel operations (STI) reads from a register, and the operation being
performed in parallel (FIX) writes to the same register, STI accepts as input the
contents of the register before it is modified by FIX.
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Integer overflow occurs when the floating-point number is too large to be rep-
resented as a 32-bit two’s complement integer. In the case of integer overflow,
the result will be saturated in the direction of overflow.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
AR4 = 8098A2h
R1 = 0h
R0 = 0DCh = 220
AR2 = 80983Ch
Data at 8098A3h = 733C000h = 1.79750e + 02
Data at 80983Ch = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR4 = 8098A3h
R1 = 0B3h = 179
R0 = 0DCh = 220
AR2 = 80983Ch
Data at 8098A3h = 733C000h = 1.79750e + 02
Data at 80983Ch = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Integer-to-Floating-Point Conversion FLOAT
0 0 0 0 0 1 0 1 1 G dst src
Description The integer operand src is converted to the floating-point value equal to it, and
the result loaded into the dst register. The src operand is assumed to be a
signed integer, and the dst operand a floating-point number.
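As an illustration, the Python sketch below builds the 40-bit register image that corresponds to a signed 32-bit integer. It relies on assumptions that are not stated in this section: an 8-bit two's complement exponent in bits 39–32, a sign bit in bit 31, a 31-bit fraction with an implied leading bit, and an exponent of –128 for zero. The function name is hypothetical.

```python
def float_encode(n):
    """Encode a signed 32-bit integer as a 40-bit extended-precision word."""
    if n == 0:
        return 0x80 << 32                    # assumed: exponent -128 means 0
    if n > 0:
        e = n.bit_length() - 1               # n = 1.f * 2**e
        frac = (n - (1 << e)) << (31 - e)
        sign = 0
    else:
        e = (-n - 1).bit_length() - 1        # n = (-2 + 0.f) * 2**e
        frac = (n + (1 << (e + 1))) << (31 - e)
        sign = 1
    return ((e & 0xFF) << 32) | (sign << 31) | frac

print(f"{float_encode(174):010X}h")
```

For the source value 174 (0AEh) this yields 072E000000h.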
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example FLOAT *++AR2(2),R5
Before Instruction:
AR2 = 809800h
R5 = 034C200000h = 1.27578125e + 01
Data at 809802h = 0AEh = 174
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR2 = 809802h
R5 = 072E000000h = 1.74e + 02
Data at 809802h = 0AEh = 174
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Parallel FLOAT and STF FLOAT||STF
Before Instruction:
AR2 = 8098C5h
IR0 = 8h
R6 = 0h
R7 = 034C200000h = 1.27578125e + 01
AR1 = 809933h
Data at 8098CDh = 0AEh = 174
Data at 809933h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR2 = 8098C5h
IR0 = 8h
R6 = 072E000000h = 1.740e + 02
R7 = 034C200000h = 1.27578125e + 01
AR1 = 809933h
Data at 8098CDh = 0AEh = 174
Data at 809933h = 034C2000h = 1.27578125e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
0 0 0 1 1 0 1 1 0 G 0 0 0 0 0 src
Cycles 1
Before Instruction:
IACK = 1
PC = 300h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
IACK = 1
PC = 301h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Idle Until Interrupt IDLE
Syntax IDLE
Operation 1 → ST(GIE)
Next PC → PC
Idle until interrupt.
Operands None
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Description The global interrupt enable bit is set, the next PC value is loaded into the PC,
and the CPU idles until an interrupt is received. When the interrupt is received,
the contents of the PC are pushed onto the active system stack.
Cycles 1
0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Description The IDLE2 instruction serves the same function as IDLE, except that it re-
moves the functional clock input from the internal device. This allows for ex-
tremely low power mode. The PC is incremented once, and the device remains
in an idle state until one of the external interrupts (INT0–3) is asserted.
In IDLE2 mode, the ’C31 will behave as follows:
- The CPU, peripherals, and memory will retain their previous states.
- When the device is in the functional (nonemulation) mode, the clocks will
stop with H1 high and H3 low.
- The ’LC31 will remain in IDLE2 until one of the four external interrupts
(INT3 – INT0) is asserted for at least two H1 cycles. When one of the four
interrupts is asserted, the clocks start after a delay of one H1 cycle. The
clocks can start up in the phase opposite that in which they were stopped
(that is, H1 might start high when H3 was high before stopping, and H3
might start high when H1 was high before stopping.) However, the H1 and
H3 clocks remain 180° out of phase with each other.
- During IDLE2 operation, for one of the four external interrupts to be recog-
nized by the CPU and serviced, it must be asserted for at least two H1
cycles. For the processor to recognize only one interrupt when it restarts
operation, the interrupt must be asserted for less than three cycles.
- When the ’LC31 is in emulation mode, the H1 and H3 clocks will continue
to run normally, and the CPU will operate as if an IDLE instruction had been
executed. The clocks continue to run for correct operation of the emulator.
Delayed Branch
For correct device operation, the three instructions after a delayed
branch should not be IDLE or IDLE2 instructions.
Low-Power Idle IDLE2
Cycles 1
0 0 0 0 0 1 1 0 1 G dst src
Description The exponent field of the src operand is loaded into the exponent field of the
dst register. No modification of the dst register mantissa field is made unless
the value of the exponent loaded is the reserved value of the exponent for 0
as determined by the precision of the src operand. Then the mantissa field of
the dst register is set to 0. The src and dst operands are assumed to be float-
ing-point numbers. Immediate values are evaluated in the short floating-point
format.
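For a register source, the operation can be pictured as a field replacement on the 40-bit register image. The Python sketch below is illustrative: the function name is hypothetical, the exponent is assumed to occupy bits 39–32, and 80h (–128) is assumed to be the reserved exponent for 0. Short floating-point immediates are not modeled.

```python
def lde(src, dst):
    """Copy the exponent field of src into dst (40-bit register images)."""
    exp = (src >> 32) & 0xFF
    if exp == 0x80:                          # assumed reserved exponent for 0
        return exp << 32                     # the mantissa is cleared as well
    return (exp << 32) | (dst & 0xFFFFFFFF)  # the mantissa is left untouched

print(f"{lde(0x0200056F30, 0x0A056FE332):010X}h")
```

With R0 = 0200056F30h and R5 = 0A056FE332h, the result is 02056FE332h, as in the example for this instruction.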
Cycles 1
Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example LDE R0,R5
Before Instruction:
R0 = 0200056F30h = 4.00066337e + 00
R5 = 0A056FE332h = 1.06749648e + 03
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R0 = 0200056F30h = 4.00066337e + 00
R5 = 02056FE332h = 4.16990814e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Load Floating-Point LDF
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 0 1 1 1 0 G dst src
Description The src operand is loaded into the dst register. The dst and src operands are
assumed to be floating-point numbers.
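The hexadecimal and decimal values quoted in the floating-point examples can be cross-checked with a short host-side script. This Python sketch rests on assumptions about the 32-bit memory format that are not spelled out in this section: an 8-bit two's complement exponent in bits 31–24, a sign bit in bit 23, a 23-bit fraction with an implied leading bit, and an exponent of –128 for zero. Both function names are hypothetical.

```python
def widen(word32):
    """Place a 32-bit memory word in a 40-bit register (8 zero LSBs appended)."""
    return (word32 & 0xFFFFFFFF) << 8

def decode(word32):
    """Convert a 32-bit floating-point word to a Python float."""
    e = (word32 >> 24) & 0xFF
    e = e - 0x100 if e & 0x80 else e         # two's complement exponent
    if e == -128:
        return 0.0                           # assumed encoding of zero
    sign = (word32 >> 23) & 1
    frac = (word32 & 0x7FFFFF) / (1 << 23)
    return ((-2.0 if sign else 1.0) + frac) * 2.0 ** e

print(f"{widen(0x070C8000):010X}h", decode(0x070C8000))
```

For instance, 070C8000h decodes to 140.5 and widens to 070C800000h, the pair of values that appears in several examples in this chapter.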
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
DP = 80h
R2 = 0h
Data at 809800h = 010C52A0h = 2.19254303e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
DP = 80h
R2 = 010C52A000h = 2.19254303e + 00
Data at 809800h = 010C52A0h = 2.19254303e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Else:
dst is unchanged.
Encoding
31 24 23 16 15 8 7 0
Description If the condition is true, the src operand is loaded into the dst register; otherwise,
the dst register is unchanged. The dst and src operands are assumed to be
floating-point numbers.
The TMS320C3x provides 20 condition codes that can be used with this in-
struction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Note that an LDFU (load floating-point uncondi-
tionally) instruction is useful for loading R7–R0 without affecting condition
flags. Condition flags are set on a previous instruction only when the destina-
tion register is one of the extended-precision registers (R7–R0) or when one
of the compare instructions (CMPF, CMPF3, CMPI, CMPI3, TSTB, or TSTB3)
is executed.
Cycles 1
Load Floating-Point Conditionally LDFcond
Before Instruction:
After Instruction:
0 0 0 0 0 1 1 1 1 G dst src
Description The src operand is loaded into the dst register. An interlocked operation is sig-
naled over XF0 and XF1. The src and dst operands are assumed to be floating-
point numbers. Note that only direct and indirect modes are allowed. Refer to
Section 6.4 on page 6-12 for detailed description.
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example LDFI *+AR2,R7
Before Instruction:
AR2 = 8098F1h
R7 = 0h
Data at 8098F2h = 584C000h = – 6.28125e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR2 = 8098F1h
R7 = 0584C00000h = – 6.28125e + 01
Data at 8098F2h = 584C000h = – 6.28125e + 01
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Parallel LDF and LDF LDF||LDF
Encoding
31 24 23 16 15 8 7 0
Description Two floating-point loads are performed in parallel. If the LDFs load the same
register, the assembler issues a warning. The result is that of LDF src2, dst2.
Cycles 1
Before Instruction:
AR1 = 80985Fh
IR0 = 8h
R7 = 0h
AR7 = 80988Ah
R3 = 0h
Data at 809857h = 70C8000h = 1.4050e + 02
Data at 80988Ah = 57B4000h = 6.281250e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR1 = 809857h
IR0 = 8h
R7 = 070C800000h = 1.4050e + 02
AR7 = 80988Bh
R3 = 057B400000h = 6.281250e + 01
Data at 809857h = 70C8000h = 1.4050e + 02
Data at 80988Ah = 57B4000h = 6.281250e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel LDF and STF LDF||STF
Encoding
31 24 23 16 15 8 7 0
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Before Instruction:
AR2 = 8098E7h
R1 = 0h
R3 = 057B400000h = 6.28125e + 01
AR4 = 809900h
IR1 = 10h
Data at 8098E7h = 70C8000h = 1.4050e + 02
Data at 809900h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR2 = 8098E6h
R1 = 070C800000h = 1.4050e + 02
R3 = 057B400000h = 6.28125e + 01
AR4 = 809910h
IR1 = 10h
Data at 8098E7h = 70C8000h = 1.4050e + 02
Data at 809900h = 57B4000h = 6.28125e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Load Integer LDI
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 1 0 0 0 0 G dst src
Description The src operand is loaded into the dst register. The dst and src operands are
assumed to be signed integers. An alternate form of LDI, LDP, is used to load
the data page pointer register (DP). See the LDP instruction and subsec-
tion 10.3.2 on page 10-16.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
AR1 = 2Ch
IR0 = 5h
R5 = 3C5h = 965
Data at 27h = 26h = 38
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR1 = 2Ch
IR0 = 5h
R5 = 26h = 38
Data at 27h = 26h = 38
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Load Integer Conditionally LDIcond
Else:
dst is unchanged.
Encoding
31 24 23 16 15 8 7 0
Description If the condition is true, the src operand is loaded into the dst register; otherwise,
the dst register is unchanged. Regardless of the condition, the read of the src
takes place. The dst and src operands are assumed to be signed integers.
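Because the source read always takes place, any address update attached to an indirect operand also happens when the condition is false. The Python sketch below models this for a post-incremented indirect source. The function name is hypothetical, memory is a plain dictionary, and the choice of a post-increment operand is an assumption made for illustration.

```python
def ldi_cond_postinc(cond_true, ar, dst, memory):
    """Conditional integer load from *ARn++; returns (new_ar, new_dst)."""
    value = memory[ar]           # the read takes place regardless of cond
    ar += 1                      # so does the auxiliary register update
    if cond_true:
        dst = value              # dst changes only when the condition is true
    return ar, dst

memory = {0x8098F0: 0x27C}
print(ldi_cond_postinc(False, 0x8098F0, 0xFE2, memory))
```

With a false condition the auxiliary register still advances while the destination keeps its previous value.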
The TMS320C3x provides 20 condition codes that can be used with this in-
struction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Note that an LDIU (load integer unconditionally)
instruction is useful for loading R7–R0 without affecting the condition flags.
Condition flags are set on a previous instruction only when the destination reg-
ister is one of the extended-precision registers (R7–R0) or when one of the
compare instructions (CMPF, CMPF3, CMPI, CMPI3, TSTB, or TSTB3) is ex-
ecuted.
Cycles 1
Before Instruction:
AR0 = 8098F0h
Data at 8098F0h = 027Ch = 636
R6 = 0FE2h = 4,066
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR0 = 8098F1h
Data at 8098F0h = 027Ch = 636
R6 = 0FE2h = 4,066
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Load Integer, Interlocked LDII
0 0 0 0 1 0 0 0 1 G dst src
Description The src operand is loaded into the dst register. An interlocked operation is sig-
naled over XF0 and XF1. The src and dst operands are assumed to be signed
integers. Note that only the direct and indirect modes are allowed. Refer to
Section 6.4 on page 6-12 for detailed description.
Status Bits These condition flags are modified only if the destination register is R7– R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example LDII @985Fh,R3
Before Instruction:
DP = 80h
R3 = 0h
Data at 80985Fh = 0DCh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
DP = 80
R3 = 0DCh
Data at 80985Fh = 0DCh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
Description Two integer loads are performed in parallel. A warning is issued by the assem-
bler if the LDIs load the same register. The result is that of LDI src2, dst2.
Cycles 1
Parallel LDI and LDI LDI||LDI
Before Instruction:
AR1 = 809826h
R7 = 0h
AR7 = 8098C8h
IR0 = 10h
R1 = 0h
Data at 809825h = 0FAh = 250
Data at 8098C8h = 2EEh = 750
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR1 = 809826h
R7 = 0FAh = 250
AR7 = 8098D8h
IR0 = 10h
R1 = 02EEh = 750
Data at 809825h = 0FAh = 250
Data at 8098C8h = 2EEh = 750
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
Description An integer load and an integer store are performed in parallel. If src2 and dst2
point to the same location, src2 is read before the write to dst2.
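The read-before-write ordering can be modeled by performing every read before any write. The Python sketch below is illustrative only: the function name and argument names are hypothetical, and memory is a plain dictionary.

```python
def ldi_sti(memory, src2_addr, store_value, dst2_addr):
    """Model a parallel load and store; returns the loaded value."""
    loaded = memory[src2_addr]         # all reads happen first
    memory[dst2_addr] = store_value    # writes happen at the end of the cycle
    return loaded

memory = {0x100: 220}
print(ldi_sti(memory, 0x100, 53, 0x100), memory[0x100])
```

When both operands name the same location, the load returns the old contents (220) and the location then holds the stored value (53).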
Cycles 1
Parallel LDI and STI LDI||STI
Before Instruction:
AR1 = 8098E7h
R2 = 0h
R7 = 35h = 53
AR5 = 80982Ch
IR0 = 8h
Data at 8098E6h = 0DCh = 220
Data at 80982Ch = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR1 = 8098E7h
R2 = 0DCh = 220
R7 = 35h = 53
AR5 = 809834h
IR0 = 8h
Data at 8098E6h = 0DCh = 220
Data at 80982Ch = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 1 0 0 1 0 G dst src
Description The mantissa field of the src operand is loaded into the mantissa field of the
dst register. The dst exponent field is not modified. The src and dst operands
are assumed to be floating-point numbers. If the src operand is from memory,
the entire memory contents are loaded as the mantissa. If immediate address-
ing mode is used, bits 15–12 of the instruction word are forced to 0 by the as-
sembler.
Cycles 1
Before Instruction:
R2 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R2 = 001CC00000h = 1.22460938e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Load Data Page Pointer LDP
Operands src is the 8 MSBs of the absolute 24-bit source address (src).
The “, DP” in the operand is optional.
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 1 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 src
Description This pseudo-op is an alternate form of the LDIU instruction, except that LDP
is always in the immediate addressing mode. The src operand field contains
the eight MSBs of the absolute 24-bit src address (essentially, only
bits 23–16 of src are used). These eight bits are loaded into the eight LSBs
of the data page pointer.
The eight LSBs of the pointer are used in direct addressing as a pointer to the
page of data being addressed. There is a total of 256 pages, each page 64K
words long. Bits 31 – 8 of the pointer are reserved and should be kept set to 0.
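The page arithmetic can be sketched in a few lines. In this Python fragment the function names are hypothetical. It assumes that a direct address is formed by concatenating the eight LSBs of DP with a 16-bit offset, which is consistent with 256 pages of 64K words each.

```python
def data_page(address):
    """Eight MSBs (bits 23-16) of a 24-bit address, as loaded by LDP."""
    return (address >> 16) & 0xFF

def direct_address(dp, offset):
    """Combine the 8 LSBs of DP with a 16-bit offset into a 24-bit address."""
    return ((dp & 0xFF) << 16) | (offset & 0xFFFF)

print(hex(data_page(0x80985F)), hex(direct_address(0x80, 0x985F)))
```

For the address 80985Fh the page is 80h, and a direct operand of @985Fh with DP = 80h reaches the same location.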
Cycles 1
Before Instruction:
DP = 65h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
DP = 80h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Operation H1/16 → H1
Operands None
Encoding
31 23 0
0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Description Device continues to execute instructions, but at the reduced rate of the CLKIN
frequency divided by 16 (that is, in LOPOWER mode, an ’LC31 with a CLKIN
frequency of 32 MHz will perform in the same way as a 2-MHz ’LC31, which
has an instruction cycle time of 1000 ns). This allows for low-power operation.
The ’LC31 CPU slows down during the read phase of the LOPOWER instruction.
To exit the LOPOWER power-down mode, invoke the MAXSPEED
instruction (opcode = 1080 0000h). The ’LC31 resumes full-speed operation
during the read phase of the MAXSPEED instruction.
Delayed Branch
Do not run the IDLE2 instruction in the LOPOWER mode.
Cycles 1
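The clock arithmetic in the LOPOWER description can be checked with a short Python sketch. It assumes that one instruction cycle lasts two CLKIN periods at full speed, which is what the quoted 1000-ns cycle time of a 2-MHz device implies; the function name `instruction_cycle_ns` is hypothetical.

```python
def instruction_cycle_ns(clkin_mhz: float, lopower: bool = False) -> float:
    """Instruction cycle time in ns, assuming one cycle = two CLKIN periods."""
    # In LOPOWER mode the device behaves as if CLKIN were divided by 16.
    effective_mhz = clkin_mhz / 16 if lopower else clkin_mhz
    return 2 * 1000.0 / effective_mhz


print(instruction_cycle_ns(32.0, lopower=True))  # 1000.0
print(instruction_cycle_ns(32.0))                # 62.5
```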
Logical Shift LSH
Operation If count ≥ 0:
dst << count → dst
Else:
dst >> |count| → dst
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 1 0 0 1 1 G dst count
Description The seven least significant bits of the count operand are used to generate the
two’s complement shift count. If the count operand is greater than 0, the dst
operand is left-shifted by the value of the count operand. Low-order bits shifted
in are 0-filled, and high-order bits are shifted out through the carry (C) bit.
Logical left-shift:
C ← dst ← 0
If the count operand is less than 0, the dst is right-shifted by the absolute value
of the count operand. The high-order bits of the dst operand are 0-filled as they
are shifted to the right. Low-order bits are shifted out through the C bit.
Logical right-shift:
0 → dst → C
If the count operand is 0, no shift is performed, and the C bit is set to 0. The
count operand is assumed to be a signed integer, and the dst operand is
assumed to be an unsigned integer.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Before Instruction:
R4 = 018h = 24
R7 = 02ACh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R4 = 018h = 24
R7 = 0AC000000h
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Before Instruction:
AR5 = 809908h
IR0 = 4h
R5 = 0012C00000h
Data at 809904h = 0FFFFFFF4h = –12
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR5 = 809908h
IR0 = 4h
R5 = 0000012C00h
Data at 809904h = 0FFFFFFF4h = –12
LUF LV UF N Z V C = 0 0 0 0 0 0 0
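The shift rules above can be modeled in a few lines of Python. This is a sketch under stated assumptions, not a cycle-accurate model: the function name `lsh` is hypothetical, the register is treated as a plain 32-bit value, and the flags follow the Status Bits table (V is always 0, C is the last bit shifted out).

```python
MASK32 = 0xFFFFFFFF


def lsh(dst: int, count: int):
    """Model LSH on a 32-bit value; return (result, flags)."""
    # Only the seven LSBs of count are used, read as a two's-complement number.
    n = count & 0x7F
    if n >= 0x40:
        n -= 0x80
    dst &= MASK32
    if n > 0:
        wide = dst << n
        result = wide & MASK32
        carry = (wide >> 32) & 1        # last bit shifted out of bit 31
    elif n < 0:
        k = -n
        result = dst >> k               # high-order bits are 0-filled
        carry = (dst >> (k - 1)) & 1    # last bit shifted out of bit 0
    else:
        result, carry = dst, 0
    flags = {"N": result >> 31, "Z": int(result == 0), "V": 0, "C": carry}
    return result, flags


print(hex(lsh(0x2AC, 24)[0]))               # 0xac000000
print(hex(lsh(0x12C00000, 0xFFFFFFF4)[0]))  # 0x12c00
```

The two calls use the operand values of the two examples above: a count of 24 shifts left, and a count of 0FFFFFFF4h (–12) shifts right by 12.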
Logical Shift, 3-Operand LSH3
Description The seven least significant bits of the count operand are used to generate the
two’s complement shift count.
If the count operand is greater than 0, a copy of the src operand is left-shifted
by the value of the count operand, and the result is written to the dst. (The src
is not changed.) Low-order bits shifted in are 0-filled, and high-order bits are
shifted out through the C (carry) bit.
Logical left-shift:
C ← src ← 0
If the count operand is less than 0, the src operand is right-shifted by the
absolute value of the count operand. The high-order bits of the dst operand are
0-filled as they are shifted to the right. Low-order bits are shifted out through the
C bit.
Logical right-shift:
0 → src → C
If the count operand is 0, no shift is performed, and the C bit is set to 0. The
count operand is assumed to be a signed integer. The src and dst operands
are assumed to be unsigned integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Unaffected if dst is not R7–R0.
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
R4 = 018h = 24
R7 = 02ACh
R2 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R4 = 018h = 24
R7 = 02ACh
R2 = 0AC000000h
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Before Instruction:
AR4 = 809908h
IR1 = 4h
R5 = 012C00000h
R3 = 0h
Data at 809904h = 0FFFFFFF4h = –12
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR4 = 809908h
IR1 = 4h
R5 = 012C00000h
R3 = 0000012C00h
Data at 809904h = 0FFFFFFF4h = –12
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Description The seven least significant bits of the count operand are used to generate the
two’s complement shift count.
If the count operand is greater than 0, a copy of the src2 operand is left-shifted
by the value of the count operand, and the result is written to the dst1. (The
src2 is not changed.) Low-order bits shifted in are 0-filled, and high-order bits
are shifted out through the C (carry) bit.
Logical left-shift:
C ← src2 ← 0
If the count operand is less than 0, the src2 operand is right-shifted by the
absolute value of the count operand. The high-order bits of the dst operand are
0-filled as they are shifted to the right. Low-order bits are shifted out through
the C (carry) bit.
Logical right-shift:
0 → src2 → C
If the count operand is 0, no shift is performed, and the carry bit is set to 0.
The count operand is assumed to be a seven-bit signed integer, and the src2
and dst1 operands are assumed to be unsigned integers. All registers are read
at the beginning and loaded at the end of the execute cycle. This means that
if one of the parallel operations (STI) reads from a register and the operation
being performed in parallel (LSH3) writes to the same register, STI accepts as
input the contents of the register before it is modified by the LSH3.
Parallel LSH3 and STI LSH3||STI
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output.
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the last bit shifted out. 0 for a shift count of 0.
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
R2 = 18h = 24
AR3 = 8098C2h
R0 = 0h
R4 = 0DCh = 220
AR5 = 8098A3h
Data at 8098C3h = 0ACh
Data at 8098A2h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R2 = 18h = 24
AR3 = 8098C3h
R0 = 0AC000000h
R4 = 0DCh = 220
AR5 = 8098A3h
Data at 8098C3h = 0ACh
Data at 8098A2h = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Before Instruction:
R7 = 0FFFFFFF4h = –12
AR2 = 809863h
R2 = 0h
R0 = 12Ch = 300
AR0 = 8098B7h
Data at 809863h = 2C000000h
Data at 8098B8h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R7 = 0FFFFFFF4h = –12
AR2 = 809862h
R2 = 2C000h
R0 = 12Ch = 300
AR0 = 8098B7h
Data at 809863h = 2C000000h
Data at 8098B8h = 12Ch = 300
LUF LV UF N Z V C = 0 0 0 0 0 0 0
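The rule that all registers are read at the beginning and loaded at the end of the execute cycle can be illustrated with a small Python sketch. It is a simplified model only: the names `shift32` and `lsh3_par_sti` and the dictionary-based register file and memory are hypothetical, addresses are passed in already resolved, and auxiliary-register updates and status flags are left out.

```python
MASK32 = 0xFFFFFFFF


def shift32(value: int, count: int) -> int:
    """Logical shift by a 7-bit signed count: left if positive, right if negative."""
    n = count & 0x7F
    if n >= 0x40:
        n -= 0x80
    value &= MASK32
    return (value << n) & MASK32 if n >= 0 else value >> -n


def lsh3_par_sti(regs, mem, count_reg, src2_addr, dst1_reg, sti_src_reg, dst2_addr):
    """Model LSH3 || STI with read-before-write ordering."""
    # Read phase: every operand is sampled before anything is written.
    count = regs[count_reg]
    src2 = mem[src2_addr]
    store_value = regs[sti_src_reg]
    # Write phase: both results are committed at the end of the cycle.
    regs[dst1_reg] = shift32(src2, count)
    mem[dst2_addr] = store_value & MASK32


# Operand values taken from the second example above.
regs = {"R7": 0xFFFFFFF4, "R2": 0x0, "R0": 0x12C}
mem = {0x809863: 0x2C000000, 0x8098B8: 0x0}
lsh3_par_sti(regs, mem, "R7", 0x809863, "R2", "R0", 0x8098B8)
print(hex(regs["R2"]), hex(mem[0x8098B8]))  # 0x2c000 0x12c
```

If the shift destination and the STI source are the same register, the store still sees the old contents, because the value was sampled during the read phase.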
Restore Clock to Regular Speed MAXSPEED
Syntax MAXSPEED
Operation H1/16 → H1
Operands None
Encoding
31 23 16 15 8 7 0
0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Cycles 1
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 1 0 1 0 0 G dst src
Description The product of the dst and src operands is loaded into the dst register. The src
operand is assumed to be a single-precision floating-point number, and the dst
operand is an extended-precision floating-point number.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
R0 = 070C800000h = 1.4050e + 02
R2 = 034C200000h = 1.27578125e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R0 = 070C800000h = 1.4050e + 02
R2 = 0A600F2000h = 1.79247266e + 03
LUF LV UF N Z V C = 0 0 0 0 0 0 0
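The hexadecimal register values in the example above can be decoded with a short Python sketch. It assumes the extended-precision layout of an 8-bit two's-complement exponent (bits 39–32), a sign bit (bit 31), and a 31-bit fraction with an implied most significant bit, with exponent –128 representing zero; the floating-point format chapter of this guide remains the authoritative description. The function name `decode_ext` is hypothetical, and the sketch only decodes values and multiplies them as ordinary Python floats; it does not reproduce the hardware's rounding.

```python
def decode_ext(word40: int) -> float:
    """Decode a 40-bit extended-precision register value to a Python float."""
    exp = (word40 >> 32) & 0xFF
    if exp >= 0x80:
        exp -= 0x100                 # two's-complement exponent
    if exp == -128:
        return 0.0                   # reserved exponent: the value is zero
    man = word40 & 0xFFFFFFFF
    sign = man >> 31
    frac = (man & 0x7FFFFFFF) / 2**31
    # The implied bit is the complement of the sign: 01.f or 10.f in two's complement.
    significand = (-2.0 + frac) if sign else (1.0 + frac)
    return significand * 2.0**exp


a = decode_ext(0x070C800000)
b = decode_ext(0x034C200000)
print(a, b, a * b)                   # 140.5 12.7578125 1792.47265625
print(decode_ext(0x0A600F2000))      # 1792.47265625
```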
Multiply Floating Point, 3-Operand MPYF3
Encoding
31 24 23 16 15 8 7 0
Description The product of the src1 and src2 operands is loaded into the dst register. The
src1 and src2 operands are assumed to be single-precision floating-point
numbers, and the dst operand is an extended-precision floating-point number.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
R0 = 057B400000h = 6.281250e + 01
R7 = 0733C00000h = 1.79750e + 02
R1 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R0 = 057B400000h = 6.281250e + 01
R7 = 0733C00000h = 1.79750e + 02
R1 = 0D306A3000h = 1.12905469e + 04
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Before Instruction:
AR2 = 809800h
IR0 = 12Ah
R7 = 057B400000h = 6.281250e + 01
R2 = 0h
Data at 80992Ah = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR2 = 809800h
IR0 = 12Ah
R7 = 057B400000h = 6.281250e + 01
R2 = 0D09E4A000h = 8.82515625e + 03
Data at 80992Ah = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel MPYF3 and ADDF3 MPYF3||ADDF3
Operands srcA
srcB Any two indirect (disp = 0,1,IR0,IR1)
srcC Any two register (0 ≤ Rn ≤ 7)
srcD
Operation (P Field)
Encoding
31 24 23 16 15 8 7 0
Any combination of addressing modes can be coded for the four possible
source operands as long as two are coded as indirect and two are coded as
register. The assignment of the source operands srcA – srcD to the
src1 – src4 fields varies, depending on the combination of addressing modes
used, and the P field is encoded accordingly.
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 0
Z 0
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
AR5 = 8098C5h
AR1 = 8098A8h
IR0 = 4h
R0 = 0h
R5 = 0733C00000h = 1.79750e + 02
R7 = 070C800000h = 1.4050e + 02
R3 = 0h
Data at 8098C5h = 34C0000h = 1.2750e + 01
Data at 8098A4h = 1110000h = 2.265625e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR5 = 8098C6h
AR1 = 8098A4h
IR0 = 4h
R0 = 0467180000h = 2.88867188e + 01
R5 = 0733C00000h = 1.79750e + 02
R7 = 070C800000h = 1.4050e + 02
R3 = 0820200000h = 3.20250e + 02
Data at 8098C5h = 34C0000h = 1.2750e + 01
Data at 8098A4h = 1110000h = 2.265625e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Parallel MPYF3 and STF MPYF3||STF
Before Instruction:
AR2 = 80982Bh
R7 = 057B400000h = 6.281250e + 01
R0 = 0h
R3 = 086B280000h = 4.7031250e + 02
AR0 = 809860h
IR0 = 8h
Data at 80982Ah = 70C8000h = 1.4050e + 02
Data at 809860h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR2 = 80982Bh
R7 = 057B400000h = 6.281250e + 01
R0 = 0D09E4A000h = 8.82515625e + 03
R3 = 086B280000h = 4.7031250e + 02
AR0 = 809858h
IR0 = 8h
Data at 80982Ah = 70C8000h = 1.4050e + 02
Data at 809860h = 86B2800h = 4.7031250e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Operands srcA
srcB Any two indirect (disp = 0,1,IR0,IR1)
srcC Any two register (0 ≤ Rn ≤ 7)
srcD
Operation (P Field)
Encoding
31 24 23 16 15 8 7 0
Parallel MPYF3 and SUBF3 MPYF3||SUBF3
Any combination of addressing modes can be coded for the four possible
source operands as long as two are coded as indirect and two are coded as
register. The assignment of the source operands srcA – srcD to the src1 – src4
fields varies, depending on the combination of addressing modes used, and
the P field is encoded accordingly.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 0
Z 0
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
R5 = 034C000000h = 1.2750e + 01
AR7 = 809904h
IR1 = 8h
R0 = 0h
R7 = 0733C00000h = 1.79750e + 02
AR3 = 8098B2h
R2 = 0h
Data at 80990Ch = 1110000h = 2.265625e + 00
Data at 8098B2h = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R5 = 034C000000h = 1.2750e + 01
AR7 = 80990Ch
IR1 = 8h
R0 = 0467180000h = 2.88867188e + 01
R7 = 0733C00000h = 1.79750e + 02
AR3 = 8098B1h
R2 = 05E3000000h = – 3.9250e + 01
Data at 80990Ch = 1110000h = 2.265625e + 00
Data at 8098B2h = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Multiply Integer MPYI
0 0 0 0 1 0 1 0 1 G dst src
Description The product of the dst and src operands is loaded into the dst register. The src
and dst operands, when read, are assumed to be 24-bit signed integers. The
result is assumed to be a 48-bit signed integer. The output to the dst register
is the 32 least significant bits of the result.
Integer overflow occurs when any of the most significant 16 bits of the 48-bit
result differs from the most significant bit of the 32-bit output value.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.
Example MPYI R1,R5
Before Instruction:
R1 = 000033C251h = 3,392,081
R5 = 000078B600h = 7,910,912
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R1 = 000033C251h = 3,392,081
R5 = 00E21D9600h = – 501,377,536
LUF LV UF N Z V C = 0 1 0 1 0 1 0
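The overflow rule stated in the description can be expressed as a short Python sketch. It is a hedged model rather than a definitive implementation: the names `sign_extend` and `mpyi` are hypothetical, the operands are taken from the 24 LSBs of each register, and any effect of the OVM bit on the stored result is not modeled (OVM = 0 is assumed).

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mpyi(dst: int, src: int):
    """Return (32-bit result, overflow) for a 24 x 24-bit signed multiply."""
    product = sign_extend(dst, 24) * sign_extend(src, 24)
    p48 = product & 0xFFFFFFFFFFFF      # 48-bit two's-complement product
    result = p48 & 0xFFFFFFFF           # the 32 LSBs are written to dst
    msb = result >> 31
    upper16 = p48 >> 32
    # Overflow: any of the 16 MSBs differs from the MSB of the 32-bit output.
    overflow = upper16 != (0xFFFF if msb else 0x0000)
    return result, overflow


print(mpyi(0x78B600, 0x33C251) == (0xE21D9600, True))  # True
print(mpyi(173, 220))                                  # (38060, False)
```

The first call uses the operands of the example above: the 32 LSBs of the product are E21D9600h, and the upper 16 bits do not all match the output's MSB, so LV and V are set.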
Encoding
31 24 23 16 15 8 7 0
Description The product of the src1 and src2 operands is loaded into the dst register. The
src1 and src2 operands are assumed to be 24-bit signed integers. The result
is assumed to be a signed 48-bit integer. The output to the dst register is the
32 least significant bits of the result.
Integer overflow occurs when any of the most significant 16 bits of the 48-bit
result differs from the most significant bit of the 32-bit output value.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.
Multiply Integer, 3-Operand MPYI3
Before Instruction:
AR4 = 809850h
AR1 = 8098F3h
R2 = 0h
Data at 809850h = 0ADh = 173
Data at 8098F2h = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR4 = 809850h
AR1 = 8098F3h
R2 = 094ACh = 38,060
Data at 809850h = 0ADh = 173
Data at 8098F2h = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Before Instruction:
AR4 = 8099F8h
IR0 = 8h
R2 = 0C8h = 200
R7 = 0h
Data at 8099F0h = 32h = 50
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR4 = 8099F0h
IR0 = 8h
R2 = 0C8h = 200
R7 = 02710h = 10,000
Data at 8099F0h = 32h = 50
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Operands srcA
srcB Any two indirect (disp = 0,1,IR0,IR1)
srcC Any two register (0 ≤ Rn ≤ 7)
srcD
Operation (P Field)
Encoding
31 24 23 16 15 8 7 0
Description An integer multiplication and an integer addition are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle.
This means that if one of the parallel operations (MPYI3) reads from a register
and the operation being performed in parallel (ADDI3) writes to the same
register, then MPYI3 accepts as input the contents of the register before it is
modified by the ADDI3.
Parallel MPYI3 and ADDI3 MPYI3||ADDI3
Any combination of addressing modes can be coded for the four possible
source operands as long as two are coded as indirect and two are coded as
register. The assignment of the source operands srcA – srcD to the
src1 – src4 fields varies, depending on the combination of addressing modes
used, and the P field is encoded accordingly. To simplify processing when the
order is not significant, the assembler may change the order of operands in
commutative operations.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 0
Z 0
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
R7 = 14h = 20
R4 = 64h = 100
R0 = 0h
AR3 = 80981Fh
AR5 = 80996Eh
R3 = 0h
Data at 80981Eh = 0FFFFFFCBh = – 53
Data at 80996Eh = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R7 = 14h = 20
R4 = 64h = 100
R0 = 07D0h = 2000
AR3 = 80981Fh
AR5 = 80996Dh
R3 = 0h
Data at 80981Eh = 0FFFFFFCBh = – 53
Data at 80996Eh = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel MPYI3 and STI MPYI3||STI
Encoding
31 24 23 16 15 8 7 0
Description An integer multiplication and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (MPYI3) writes to the same register, STI
accepts as input the contents of the register before it is modified by the MPYI3.
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Integer overflow occurs when any of the most significant 16 bits of the 48-bit
result differs from the most significant bit of the 32-bit output value.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
AR0 = 80995Ah
R5 = 32h = 50
R7 = 0h
R2 = 0DCh = 220
AR3 = 80982Fh
Data at 80995Bh = 0C8h = 200
Data at 80982Eh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR0 = 80995Bh
R5 = 32h = 50
R7 = 2710h = 10000
R2 = 0DCh = 220
AR3 = 80982Fh
Data at 80995Bh = 0C8h = 200
Data at 80982Eh = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel MPYI3 and SUBI3 MPYI3||SUBI3
Operands srcA
srcB Any two indirect (disp = 0,1,IR0,IR1)
srcC Any two register (0 ≤ Rn ≤ 7)
srcD
Operation (P Field)
Encoding
31 24 23 16 15 8 7 0
Any combination of addressing modes can be coded for the four possible
source operands as long as two are coded as indirect and two are coded as
register. The assignment of the source operands srcA – srcD to the src1 – src4
fields varies, depending on the combination of addressing modes used, and the
P field is encoded accordingly. To simplify processing when the order is not
significant, the assembler may change the order of operands in commutative
operations.
Integer overflow occurs when any of the most significant 16 bits of the 48-bit
result differs from the most significant bit of the 32-bit output value.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 1 if an integer underflow occurs; 0 otherwise
N 0
Z 0
V 1 if an integer overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
R2 = 32h = 50
AR0 = 8098E3h
R0 = 0h
AR5 = 8099FCh
IR1 = 0Ch
R4 = 07D0h = 2000
Data at 8098E4h = 62h = 98
Data at 8099FCh = 4B0h = 1200
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R2 = 320h = 800
AR0 = 8098E4h
R0 = 01324h = 4900
AR5 = 8099F0h
IR1 = 0Ch
R4 = 07D0h = 2000
Data at 8098E4h = 62h = 98
Data at 8099FCh = 4B0h = 1200
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 1 0 1 1 0 G dst src
Description The difference of the 0, src, and C operands is loaded into the dst register. The
dst and src are assumed to be signed integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
R5 = 0FFFFFFCBh = – 53
R7 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 1
After Instruction:
R5 = 0FFFFFFCBh = – 53
R7 = 34h = 52
LUF LV UF N Z V C = 0 0 0 0 0 0 1
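The operation 0 – src – C can be sketched in Python as follows. This is an illustrative model only: the function name `negb` is hypothetical, the values are treated as 32-bit integers, and the V and LV flags are not modeled.

```python
MASK32 = 0xFFFFFFFF


def negb(src: int, carry_in: int):
    """Return (result, flags) for dst = 0 - src - C on 32-bit integers."""
    src &= MASK32
    raw = 0 - src - (carry_in & 1)
    result = raw & MASK32
    flags = {
        "N": result >> 31,
        "Z": int(result == 0),
        "C": int(raw < 0),   # a borrow occurs whenever src + C is nonzero
    }
    return result, flags


print(negb(0xFFFFFFCB, 1))  # (52, {'N': 0, 'Z': 0, 'C': 1})
```

With the carry input cleared, the same function gives a plain integer negation: `negb(174, 0)` returns 0FFFFFF52h (–174) with N = 1 and C = 1.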
Negate Floating Point NEGF
0 0 0 0 1 0 1 1 1 G dst src
Description The difference of the 0 and src operands is loaded into the dst register. The
dst and src operands are assumed to be floating-point numbers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Before Instruction:
AR3 = 809800h
R1 = 057B400025h = 6.28125006e + 01
Data at 809802h = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR3 = 809802h
R1 = 07F3800000h = –1.4050e + 02
Data at 809802h = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Encoding
31 24 23 16 15 8 7 0
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Parallel NEGF and STF NEGF||STF
Before Instruction:
AR4 = 8098E1h
R7 = 0h
R2 = 0733C00000h = 1.79750e + 02
AR5 = 809803h
Data at 8098E1h = 57B4000h = 6.281250e + 01
Data at 809804h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR4 = 8098E0h
R7 = 0584C00000h = – 6.281250e + 01
R2 = 0733C00000h = 1.79750e + 02
AR5 = 809804h
Data at 8098E1h = 57B4000h = 6.281250e + 01
Data at 809804h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 1 1 0 0 0 G dst src
Description The difference of the 0 and src operands is loaded into the dst register. The
dst and src operands are assumed to be signed integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
R5 = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R5 = 0FFFFFF52h = –174
LUF LV UF N Z V C = 0 0 0 1 0 0 1
Parallel NEGI and STI NEGI||STI
Encoding
31 24 23 16 15 8 7 0
Description An integer negation and an integer store are performed in parallel. All registers
are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (NEGI) writes to the same register, then
STI accepts as input the contents of the register before it is modified by the
NEGI.
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
AR3 = 80982Fh
R2 = 19h = 25
AR1 = 8098A5h
Data at 80982Eh = 0DCh = 220
Data at 8098A5h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR3 = 80982Fh
R2 = 0FFFFFF24h = – 220
AR1 = 8098A6h
Data at 80982Eh = 0DCh = 220
Data at 8098A5h = 19h = 25
LUF LV UF N Z V C = 0 0 0 1 0 0 1
No Operation NOP
0 0 0 0 1 1 0 0 1 G 0 0 0 0 0 src
Description If the src operand is specified in the indirect mode, the specified addressing
operation is performed, and a dummy memory read occurs. If the src operand
is omitted, no operation is performed.
Cycles 1
Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example 1 NOP
Before Instruction:
PC = 3Ah
After Instruction:
PC = 3Bh
Example 2 NOP *AR3––(1)
Before Instruction:
PC = 5h
AR3 = 809900h
After Instruction:
PC = 6h
AR3 = 8098FFh
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 1 1 0 1 0 G dst src
If src (exp) = –128 and src (man) = 0, then dst = 0, Z = 1, and UF = 0. If src (exp)
= –128 and src (man) ≠ 0, then dst = 0, Z = 0, and UF = 1. For all other cases
of the src, if a floating-point underflow occurs, then dst (man) is forced to 0 and
dst (exp) = –128. If src (man) = 0, then dst (man) = 0 and dst (exp) = –128.
Refer to Section 4.6 on page 4-18 for more information.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV Unaffected
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Normalize NORM
Before Instruction:
R1 = 0400003AF5h
R2 = 070C800000h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R1 = 0400003AF5h
R2 = F26BD40000h = 1.12451613e – 04
LUF LV UF N Z V C = 0 0 0 0 0 0 0
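The normalization in the example above can be reproduced with a small Python sketch. It is deliberately restricted to non-negative mantissas, and several details are assumptions rather than statements from this guide: the function name `norm_positive` is hypothetical, the normalized mantissa is taken to drop its implied leading 1, and an exponent below –127 is treated as underflow. Section 4.6 remains the authoritative description.

```python
def norm_positive(exp: int, man: int) -> int:
    """Normalize a non-negative 32-bit mantissa; return the 40-bit result word."""
    if not 0 <= man < 0x80000000:
        raise ValueError("this sketch handles non-negative mantissas only")
    if man == 0:
        return 0x80 << 32                    # exponent -128, mantissa 0
    shift = 32 - man.bit_length()            # moves the leading 1 up to bit 31
    new_exp = exp - shift
    if new_exp < -127:                       # assumed underflow threshold
        return 0x80 << 32
    # The leading 1 becomes the implied bit; bit 31 then holds the sign (0).
    fraction = (man << shift) & 0x7FFFFFFF
    return ((new_exp & 0xFF) << 32) | fraction


print(hex(norm_positive(4, 0x3AF5)))  # 0xf26bd40000
```

The call uses the source value 0400003AF5h from the example (exponent 4, mantissa 00003AF5h): the mantissa is shifted left by 18 places, so the exponent drops from 4 to –14 (0F2h).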
0 0 0 0 1 1 0 1 1 G dst src
Description The bitwise logical-complement of the src operand is loaded into the dst
register. The complement is formed by a logical-NOT of each bit of the src operand.
The dst and src operands are assumed to be unsigned integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example NOT @982Ch,R4
Before Instruction:
DP = 80h
R4 = 0h
Data at 80982Ch = 5E2Fh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
DP = 80h
R4 = 0FFFFA1D0h
Data at 80982Ch = 5E2Fh
LUF LV UF N Z V C = 0 0 0 1 0 0 0
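The result and flags in the example above can be checked with a few lines of Python. This is a minimal sketch with a hypothetical function name `not32`; only the N and Z flags are modeled.

```python
def not32(src: int):
    """Return (result, flags) for a 32-bit bitwise logical complement."""
    result = ~src & 0xFFFFFFFF
    flags = {"N": result >> 31, "Z": int(result == 0)}
    return result, flags


result, flags = not32(0x5E2F)
print(hex(result), flags)  # 0xffffa1d0 {'N': 1, 'Z': 0}
```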
Parallel NOT and STI NOT||STI
Encoding
31 24 23 16 15 8 7 0
Description A bitwise logical-NOT and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (NOT) writes to the same register, STI
accepts as input the contents of the register before it is modified by the NOT.
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
AR2 = 8099CBh
R3 = 0h
R7 = 0DCh = 220
AR4 = 809850h
IR1 = 10h
Data at 8099CCh = 0C2Fh
Data at 809840h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR2 = 8099CBh
R3 = 0FFFFF3D0h
R7 = 0DCh = 220
AR4 = 809840h
IR1 = 10h
Data at 8099CCh = 0C2Fh
Data at 809840h = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Bitwise Logical-OR OR
0 0 0 1 0 0 0 0 0 G dst src
Description The bitwise logical OR between the src and dst operands is loaded into the dst
register. The dst and src operands are assumed to be unsigned integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example OR *++AR1(IR1),R2
Before Instruction:
AR1 = 809800h
IR1 = 4h
R2 = 012560000h
Data at 809804h = 2BCDh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR1 = 809804h
IR1 = 4h
R2 = 012562BCDh
Data at 809804h = 2BCDh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
Description The bitwise logical-OR between the src1 and src2 operands is loaded into the
dst register. The src1, src2, and dst operands are assumed to be unsigned
integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Bitwise Logical-OR, 3-Operand OR3
Before Instruction:
AR1 = 809800h
IR1 = 4h
R2 = 012560000h
R7 = 0h
Data at 809804h = 2BCDh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR1 = 809804h
IR1 = 4h
R2 = 012560000h
R7 = 012562BCDh
Data at 809804h = 2BCDh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
A bitwise logical-OR and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (OR3) writes to the same register, then
STI accepts as input the contents of the register before it is modified by the
OR3.
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Parallel OR3 and STI OR3||STI
Before Instruction:
AR2 = 809830h
R5 = 800000h
R2 = 0h
R6 = 0DCh = 220
AR1 = 809883h
Data at 809831h = 9800h
Data at 809883h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR2 = 809831h
R5 = 800000h
R2 = 809800h
R6 = 0DCh = 220
AR1 = 809882h
Data at 809831h = 9800h
Data at 809883h = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 1 1 1 0 0 0 1 dst 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Description The top of the current system stack is popped and loaded into the dst register
(32 LSBs). The top of the stack is assumed to be a signed integer. The POP
is performed with a postdecrement of the stack pointer. The exponent bits of
an extended precision register (R7–R0) are left unmodified.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example POP R3
Before Instruction:
SP = 809856h
R3 = 012DAh = 4,826
Data at 809856h = FFFF0DA4h = – 62,044
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
SP = 809855h
R3 = 0FFFF0DA4h = –62,044
Data at 809856h = FFFF0DA4h = – 62,044
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Pop Floating Point POPF
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 1 1 1 0 1 0 1 dst 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Description The top of the current system stack is popped and loaded into the dst register
(32 MSBs). The top of the stack is assumed to be a floating-point number. The
POP is performed with a postdecrement of the stack pointer. The eight LSBs
of an extended precision register (R7–R0) are 0 filled.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example POPF R4
Before Instruction:
SP = 80984Ah
R4 = 025D2E0123h = 6.91186578e + 00
Data at 80984Ah = 5F2C1302h = 5.32544007e + 28
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
SP = 809849h
R4 = 5F2C130200h = 5.32544007e + 28
Data at 80984Ah = 5F2C1302h = 5.32544007e + 28
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 1 1 1 1 0 0 1 src 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Description The contents of the src register (32 LSBs) are pushed on the current system
stack. The src is assumed to be a signed integer. The PUSH is performed with
a preincrement of the stack pointer. The integer or mantissa portion of an
extended precision register (R7–R0) is saved with this instruction.
Cycles 1
Example PUSH R6
Before Instruction:
SP = 8098AEh
R6 = 025C128081h = 633,415,688
Data at 8098AFh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
SP = 8098AFh
R6 = 025C128081h = 633,415,688
Data at 8098AFh = 5C128081h = 1,544,716,417
LUF LV UF N Z V C = 0 0 0 0 0 0 0
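The stack discipline used by PUSH and POP (preincrement on push, postdecrement on pop) can be sketched as follows. This is a minimal Python model under stated assumptions: the state dictionary, the helper names push, pop, and to_signed32 are hypothetical, and status flags are not modeled.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(word):
    """Interpret a 32-bit word as a two's-complement integer."""
    word &= MASK32
    return word - (1 << 32) if word & 0x80000000 else word

def push(state, reg_value):
    """PUSH: preincrement SP, then store the 32 LSBs of the register."""
    state["SP"] += 1
    state["mem"][state["SP"]] = reg_value & MASK32

def pop(state):
    """POP: read the word at SP, then postdecrement SP."""
    word = state["mem"][state["SP"]]
    state["SP"] -= 1
    return word

# PUSH R6 example: R6 holds 40 bits, only the 32 LSBs reach the stack.
s = {"SP": 0x8098AE, "mem": {}}
push(s, 0x025C128081)

# POP R3 example.
t = {"SP": 0x809856, "mem": {0x809856: 0xFFFF0DA4}}
r3 = pop(t)
```

Because the stack grows toward higher addresses, a PUSH followed by a POP leaves SP where it started.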
PUSH Floating Point PUSHF
Encoding
31 24 23 16 15 8 7 0
0 0 0 0 1 1 1 1 1 0 1 src 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Description The contents of the src register (32 MSBs) are pushed on the current system
stack. The src is assumed to be a floating-point number. The PUSH is performed
with a preincrement of the stack pointer. The eight LSBs of the mantissa
are not saved. (Note the difference in R2 and the value on the stack in the
example below.)
Cycles 1
Example PUSHF R2
Before Instruction:
SP = 809801h
R2 = 025C128081h = 6.87725854e + 00
Data at 809802h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
SP = 809802h
R2 = 025C128081h = 6.87725854e + 00
Data at 809802h = 025C1280h = 6.87725830e + 00
LUF LV UF N Z V C = 0 0 0 0 0 0 0
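The loss of the eight mantissa LSBs on a PUSHF, and the zero fill on a later POPF, can be shown with two small helpers. This Python sketch is illustrative only; the function names pushf_word and popf_register are assumptions.

```python
def pushf_word(reg40):
    """PUSHF: keep the 32 MSBs of a 40-bit register; the 8 LSBs are dropped."""
    return (reg40 >> 8) & 0xFFFFFFFF

def popf_register(word32):
    """POPF: load the word into the 32 MSBs; the 8 LSBs are zero filled."""
    return (word32 & 0xFFFFFFFF) << 8

stacked = pushf_word(0x025C128081)   # value written by the PUSHF R2 example
restored = popf_register(stacked)    # what a later POPF would load
```

A PUSHF and POPF pair therefore restores the register only to single precision: 025C128081h comes back as 025C128000h.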
Syntax RETIcond
Operation If cond is true:
*SP– – → PC
1 → ST (GIE).
Else, continue.
Operands None
Encoding
31 24 23 16 15 8 7 0
0 1 1 1 1 0 0 0 0 0 0 cond 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Description A conditional return is performed. If the condition is true, the top of the stack
is popped to the PC, and a 1 is written to the global interrupt enable (GIE) bit
of the status register. This has the effect of enabling all interrupts for which the
corresponding interrupt enable bit is a 1.
The TMS320C3x provides 20 condition codes that can be used with this
instruction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction
only when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI,
CMPI3, TSTB, or TSTB3) is executed.
Cycles 4
Return From Interrupt Conditionally RETIcond
Example RETINZ
Before Instruction:
PC = 456h
SP = 809830h
ST = 0h
Data at 809830h = 123h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 123h
SP = 80982Fh
ST = 2000h
Data at 809830h = 123h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Syntax RETScond
Operation If cond is true:
*SP– – → PC.
Else, continue.
Operands None
Encoding
31 24 23 16 15 8 7 0
0 1 1 1 1 0 0 0 1 0 0 cond 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Description A conditional return is performed. If the condition is true, the top of the stack
is popped to the PC.
The TMS320C3x provides 20 condition codes that you can use with this
instruction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction only
when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI, CMPI3,
TSTB, or TSTB3) is executed.
Cycles 4
Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example RETSGE
Before Instruction:
PC = 123h
SP = 80983Ch
Data at 80983Ch = 456h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 456h
SP = 80983Bh
Data at 80983Ch = 456h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
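The two conditional returns differ only in the write to ST(GIE). The Python sketch below models both under stated assumptions: the function name ret_cond and the state dictionary are hypothetical, the GIE mask of 2000h is inferred from the RETINZ example (ST changes from 0h to 2000h), and "continue" is modeled as advancing PC by one word.

```python
GIE = 0x2000  # ST mask inferred from the RETINZ example

def ret_cond(state, cond_true, from_interrupt=False):
    """Model RETScond (default) or RETIcond (from_interrupt=True)."""
    if cond_true:
        state["PC"] = state["mem"][state["SP"]]  # *SP-- -> PC
        state["SP"] -= 1
        if from_interrupt:
            state["ST"] |= GIE                   # 1 -> ST(GIE)
    else:
        state["PC"] += 1                         # assumption: fall through to next word
    return state

reti = ret_cond({"PC": 0x456, "SP": 0x809830, "ST": 0x0,
                 "mem": {0x809830: 0x123}}, True, from_interrupt=True)
rets = ret_cond({"PC": 0x123, "SP": 0x80983C, "ST": 0x0,
                 "mem": {0x80983C: 0x456}}, True)
skipped = ret_cond({"PC": 0x123, "SP": 0x80983C, "ST": 0x0,
                    "mem": {0x80983C: 0x456}}, False)
```

When the condition is false, neither the stack pointer nor ST changes.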
Round Floating Point RND
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 0 0 1 0 G dst src
Description The result of rounding the src operand is loaded into the dst register. The src
operand is rounded to the nearest single-precision floating-point value. If the
src operand is exactly half-way between two single-precision values, it is
rounded to the most positive value.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs or the src operand is 0;
0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z Unaffected
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Before Instruction:
R5 = 0733C16EEFh = 1.79755599e + 02
R2 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R5 = 0733C16EEFh = 1.79755599e + 02
R2 = 0733C16F00h = 1.79755600e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
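The rounding in the example can be reproduced by adding half of a single-precision LSB to the 32-bit mantissa field and clearing the eight LSBs. The Python sketch below is a simplified model, not the device algorithm: the name rnd_simple is hypothetical, flags are not modeled, and the case where rounding carries out of the mantissa (which would need an exponent adjustment) is rejected rather than handled.

```python
def rnd_simple(reg40):
    """Round a 40-bit extended-precision value to single precision.

    Simplified: raises if rounding would change or carry out of the sign
    bit, because that case needs a renormalization not modeled here.
    """
    exponent = (reg40 >> 32) & 0xFF
    mantissa = reg40 & 0xFFFFFFFF          # sign bit plus 31 fraction bits
    rounded = (mantissa + 0x80) & ~0xFF    # add half an LSB, clear the 8 LSBs
    if rounded >> 31 != mantissa >> 31:
        raise NotImplementedError("mantissa overflow needs renormalization")
    return (exponent << 32) | rounded

r2 = rnd_simple(0x0733C16EEF)   # source value from the example
tie = rnd_simple(0x0733C16E80)  # exactly half-way: rounds to the more positive value
```

Because the mantissa is two's complement, adding 80h before truncating sends half-way cases toward the more positive neighbor for both positive and negative operands.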
Rotate Left ROL
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 0 0 1 1 1 1 dst 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Description The contents of the dst operand are left-rotated one bit and loaded into the dst
register. This is a circular rotation, with the MSB transferred into the LSB.
Rotate left: dst MSB → C, dst MSB → dst LSB
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the bit rotated out of the high-order bit. Unaffected
if dst is not R7 – R0.
Mode Bit OVM Operation is not affected by OVM bit value.
Example ROL R3
Before Instruction:
R3 = 80025CD4h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R3 = 0004B9A9h
LUF LV UF N Z V C = 0 0 0 0 0 0 1
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 0 1 0 0 1 1 dst 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Description The contents of the dst operand are left-rotated one bit through the carry bit
and loaded into the dst register. The MSB is rotated to the carry bit at the same
time the carry bit is transferred to the LSB.
Rotate left through carry: dst MSB → C, old C → dst LSB
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7– R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the bit rotated out of the high-order bit. If dst is not
R7–R0, then C is shifted into the dst but not changed.
Mode Bit OVM Operation is not affected by OVM bit value.
Example 1 ROLC R3
Before Instruction:
R3 = 00000420h
LUF LV UF N Z V C = 0 0 0 0 0 0 1
After Instruction:
R3 = 00000841h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Rotate Left Through Carry ROLC
Example 2 ROLC R3
Before Instruction:
R3 = 80004281h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R3 = 00008502h
LUF LV UF N Z V C = 0 0 0 0 0 0 1
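The difference between ROL and ROLC is only in what enters the LSB. The following Python sketch models both on 32-bit values; the function names and the (result, carry_out) return convention are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF

def rol(value):
    """Rotate left one bit; the MSB goes to both the LSB and the carry."""
    msb = (value >> 31) & 1
    return ((value << 1) | msb) & MASK32, msb

def rolc(value, carry_in):
    """Rotate left one bit through carry; the old carry enters the LSB."""
    msb = (value >> 31) & 1
    return ((value << 1) | carry_in) & MASK32, msb

rol_result = rol(0x80025CD4)        # ROL R3 example
rolc_one = rolc(0x00000420, 1)      # ROLC Example 1
rolc_two = rolc(0x80004281, 0)      # ROLC Example 2
```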
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 0 1 0 1 1 1 dst 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Description The contents of the dst operand are right-rotated one bit and loaded into the
dst register. The LSB is rotated into the carry bit and also transferred into the
MSB.
Rotate right: dst LSB → C, dst LSB → dst MSB
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7– R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the bit rotated out of the low-order bit. Unaffected
if dst is not R7–R0.
Mode Bit OVM Operation is not affected by OVM bit value.
Example ROR R7
Before Instruction:
R7 = 00000421h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R7 = 80000210h
LUF LV UF N Z V C = 0 0 0 1 0 0 1
Rotate Right Through Carry RORC
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 0 1 1 0 1 1 dst 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Description The contents of the dst operand are right-rotated one bit through the status
register’s carry bit. This could be viewed as a 33-bit shift. The carry bit value
is rotated into the MSB of the dst, while at the same time the dst LSB is rotated
into the carry bit.
Rotate right through carry: old C → dst MSB, dst LSB → C
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Set to the value of the bit rotated out of the low-order bit. If dst is not
R7 – R0, then C is shifted in but not changed.
Mode Bit OVM Operation is not affected by OVM bit value.
Example RORC R4
Before Instruction:
R4 = 80000081h
LUF LV UF N Z V C = 0 0 0 1 0 0 0
After Instruction:
R4 = 40000040h
LUF LV UF N Z V C = 0 0 0 0 0 0 1
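The right rotates mirror the left rotates, and the "33-bit shift" view of RORC can be checked directly: rotating 33 times returns the register and the carry to their starting values. This Python sketch is illustrative; the function names and return convention are assumptions.

```python
MASK32 = 0xFFFFFFFF

def ror(value):
    """Rotate right one bit; the LSB goes to both the MSB and the carry."""
    lsb = value & 1
    return ((value & MASK32) >> 1) | (lsb << 31), lsb

def rorc(value, carry_in):
    """Rotate right one bit through carry; the old carry enters the MSB."""
    lsb = value & 1
    return ((value & MASK32) >> 1) | (carry_in << 31), lsb

ror_result = ror(0x00000421)         # ROR R7 example
rorc_result = rorc(0x80000081, 0)    # RORC R4 example

# Treating carry plus register as one 33-bit quantity: 33 rotations are an identity.
v, c = 0x80000081, 0
for _ in range(33):
    v, c = rorc(v, c)
```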
0 1 1 0 0 1 0 0 src
Repeat Single RPTS
Operation src → RC
1 → ST (RM)
1 → S
Next PC → RS
Next PC → RE
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 0 1 1 1 G 1 1 0 1 1 src
Description The RPTS instruction allows you to repeat a single instruction a number of
times without any penalty for looping. Fetches can also be made from the
instruction register (IR), thus avoiding repeated memory access.
The src operand is loaded into the repeat counter (RC). A 1 is written into the
repeat mode bit of the status register ST (RM). A 1 is also written into the
repeat single bit (S). This indicates that the program fetches are to be performed
only from the instruction register. The next PC is loaded into the repeat end
address (RE) register and the repeat start address (RS) register.
For the immediate mode, the src operand is assumed to be an unsigned
integer and is not sign-extended.
Cycles 4
Before Instruction:
PC = 123h
ST = 0h
RS = 0h
RE = 0h
RC = 0h
AR5 = 0FFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 124h
ST = 100h
RS = 124h
RE = 124h
RC = 0FFh
AR5 = 0FFh
LUF LV UF N Z V C = 0 0 0 0 0 0 0
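The register updates listed under Operation can be written out as a small state transition. In this Python sketch the function name rpts and the state dictionary are hypothetical, the RM mask of 100h is inferred from the example (ST changes from 0h to 100h), and the repeated execution itself is not modeled.

```python
RM = 0x100  # ST mask inferred from the example

def rpts(state, src):
    """Apply the RPTS register updates to a CPU state dictionary."""
    next_pc = state["PC"] + 1
    state["RC"] = src & 0xFFFFFFFF   # src -> RC
    state["ST"] |= RM                # 1 -> ST(RM)
    state["S"] = 1                   # 1 -> S
    state["RS"] = next_pc            # Next PC -> RS
    state["RE"] = next_pc            # Next PC -> RE
    state["PC"] = next_pc
    return state

# Values from the example, with AR5 = 0FFh supplying src.
cpu = rpts({"PC": 0x123, "ST": 0x0, "RS": 0x0, "RE": 0x0, "RC": 0x0, "S": 0}, 0xFF)
```

RS and RE end up equal because the repeated block is a single instruction.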
Signal, Interlocked SIGI
Syntax SIGI
Operands None
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Description An interlocked operation is signaled over XF0 and XF1. After the interlocked
operation is acknowledged, the interlocked operation ends. SIGI ignores the
external ready signals. Refer to Section 6.4 on page 6-12 for detailed
information.
Cycles 1
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 1 0 0 0 G src dst
Description The src register is loaded into the dst memory location. The src and dst
operands are assumed to be floating-point numbers.
Cycles 1
Before Instruction:
DP = 80h
R2 = 052C501900h = 4.30782204e + 01
Data at 8098A1h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
DP = 80h
R2 = 052C501900h = 4.30782204e + 01
Data at 8098A1h = 52C5019h = 4.30782204e + 01
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Store Floating Point, Interlocked STFI
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 1 0 0 1 G src dst
Description The src register is loaded into the dst memory location. An interlocked
operation is signaled over pins XF0 and XF1. The src and dst operands are assumed
to be floating-point numbers. Refer to Section 6.4 on page 6-12 for detailed
information.
Cycles 1
Before Instruction:
R3 = 0733C00000h = 1.79750e + 02
AR4 = 80993Ch
Data at 80993Bh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R3 = 0733C00000h = 1.79750e + 02
AR4 = 80993Ch
Data at 80993Bh = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
Description Two STF instructions are executed in parallel. Both src1 and src2 are assumed
to be floating-point numbers.
Cycles 1
Before Instruction:
R4 = 070C800000h = 1.4050e + 02
AR3 = 809835h
R3 = 0733C00000h = 1.79750e + 02
AR5 = 8099D2h
Data at 809835h = 0h
Data at 8099D3h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel Store Floating Point STF||STF
After Instruction:
R4 = 070C800000h = 1.4050e + 02
AR3 = 809834h
R3 = 0733C00000h = 1.79750e + 02
AR5 = 8099D3h
Data at 809835h = 070C8000h = 1.4050e + 02
Data at 8099D3h = 0733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 1 0 1 0 G src dst
Description The src register is loaded into the dst memory location. The src and dst
operands are assumed to be signed integers.
Cycles 1
Before Instruction:
DP = 80h
R4 = 42BD7h = 273,367
Data at 80982Bh = 0E5FCh = 58,876
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
DP = 80h
R4 = 42BD7h = 273,367
Data at 80982Bh = 42BD7h = 273,367
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Store Integer, Interlocked STII
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 1 0 1 1 G src dst
Description The src register is loaded into the dst memory location. An interlocked
operation is signaled over pins XF0 and XF1. The src and dst operands are assumed
to be signed integers. Refer to Section 6.4 on page 6-12 for detailed
information.
Cycles 1
Before Instruction:
DP = 80h
R1 = 78Dh
Data at 8098AEh = 25Ch
After Instruction:
DP = 80h
R1 = 78Dh
Data at 8098AEh = 78Dh
Encoding
31 24 23 16 15 8 7 0
Description Two integer stores are performed in parallel. If both stores are executed to the
same address, the value written is that of STI src2, dst2.
Cycles 1
Before Instruction:
R0 = 0DCh = 220
AR2 = 809830h
IR0 = 8h
R5 = 35h = 53
AR0 = 8098D3h
Data at 809838h = 0h
Data at 8098D3h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel STI and STI STI||STI
After Instruction:
R0 = 0DCh = 220
AR2 = 809838h
IR0 = 8h
R5 = 35h = 53
AR0 = 8098D3h
Data at 809838h = 0DCh = 220
Data at 8098D3h = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0
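The tie-breaking rule for two stores to the same address (the value of STI src2, dst2 is the one written) can be modeled by applying that store last. This Python sketch is illustrative only: the function name sti_sti and the dictionary-based registers and memory are assumptions, the mapping of the example registers to src1 and src2 is assumed, and address-register updates are omitted.

```python
def sti_sti(regs, mem, src2, dst2_addr, src1, dst1_addr):
    """Model STI src2, dst2 || STI src1, dst1 with both reads before the writes."""
    value1 = regs[src1] & 0xFFFFFFFF
    value2 = regs[src2] & 0xFFFFFFFF
    mem[dst1_addr] = value1
    mem[dst2_addr] = value2   # applied last, so it wins on an address collision

# Distinct addresses, using the values from the example.
regs = {"R0": 0xDC, "R5": 0x35}
mem = {0x809838: 0x0, 0x8098D3: 0x0}
sti_sti(regs, mem, "R0", 0x809838, "R5", 0x8098D3)

# Same address (hypothetical case): the src2 value is the one that remains.
clash = {0x809838: 0x0}
sti_sti(regs, clash, "R0", 0x809838, "R5", 0x809838)
```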
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 1 1 0 1 G dst src
Description The difference of the dst, src, and C operands is loaded into the dst register.
The dst and src operands are assumed to be signed integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
AR5 = 809800h
R5 = 0FAh = 250
Data at 809800h = 0C7h = 199
LUF LV UF N Z V C = 0 0 0 0 0 0 1
After Instruction:
AR5 = 809804h
R5 = 032h = 50
Data at 809800h = 0C7h = 199
LUF LV UF N Z V C = 0 0 0 0 0 0 0
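The subtract-with-borrow result and its N, Z, V, and C flags can be computed as below. This Python sketch is a model under stated assumptions: the function name subb is hypothetical, the overflow mode is taken to be off (no saturation is modeled), and V uses the usual two's-complement overflow test.

```python
MASK32 = 0xFFFFFFFF

def subb(dst, src, carry_in):
    """Return (result, flags) for dst - src - C on 32-bit values, OVM off."""
    dst &= MASK32
    src &= MASK32
    result = (dst - src - carry_in) & MASK32
    borrow = int(dst < src + carry_in)   # unsigned borrow sets C
    # Signed overflow: operands differ in sign and the result's sign differs from dst's.
    overflow = (((dst ^ src) & (dst ^ result)) >> 31) & 1
    flags = {"N": result >> 31, "Z": int(result == 0), "V": overflow, "C": borrow}
    return result, flags

example = subb(0xFA, 0xC7, 1)        # 250 - 199 - 1, as in the example
wrapped = subb(0x0, 0x1, 0)          # borrow case
overflowed = subb(0x80000000, 0x1, 0)  # most negative value minus 1
```

Chaining subb calls, with each C output fed into the next call, extends the subtraction to operands wider than 32 bits.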
Subtract Integer With Borrow, 3-Operand SUBB3
Encoding
31 24 23 16 15 8 7 0
Description The difference of the src1 and src2 operands and the C flag is loaded into the
dst register. The src1, src2, and dst operands are assumed to be signed
integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
AR5 = 809800h
IR0 = 4h
R5 = 0C7h = 199
R0 = 0h
Data at 809800h = 0FAh = 250
LUF LV UF N Z V C = 0 0 0 0 0 0 1
After Instruction:
AR5 = 809804h
IR0 = 4h
R5 = 0C7h = 199
R0 = 32h = 50
Data at 809800h = 0FAh = 250
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Subtract Integer Conditionally SUBC
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 0 1 1 1 0 G dst src
Description The src operand is subtracted from the dst operand. The dst operand is loaded
with a value dependent on the result of the subtraction. If (dst – src) is greater
than or equal to 0, then (dst – src) is left-shifted one bit, the least significant
bit is set to 1, and the result is loaded into the dst register. If (dst – src) is less
than 0, dst is left-shifted one bit and loaded into the dst register. The dst and
src operands are assumed to be unsigned integers.
You can use SUBC to perform a single step of a multibit integer division. See
subsection 11.3.4 on page 11-26 for a detailed description.
Cycles 1
Before Instruction:
DP = 80h
R1 = 04F6h = 1270
Data at 8098C5h = 492h = 1170
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
DP = 80h
R1 = 0C9h = 201
Data at 8098C5h = 492h = 1170
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Before Instruction:
R0 = 07D0h = 2000
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R0 = 0FA0h = 4000
LUF LV UF N Z V C = 0 0 0 0 0 0 0
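To see how a single SUBC step builds up a division, the sketch below repeats the step against a left-aligned divisor. This is a Python illustration under stated assumptions, not the routine from subsection 11.3.4: the names subc and divide are hypothetical, operands are limited to values below 2**31, and the divisor used with the second example (1000h) is a made-up value larger than R0.

```python
MASK32 = 0xFFFFFFFF

def subc(dst, src):
    """One SUBC step on unsigned 32-bit values."""
    if dst >= src:
        return (((dst - src) << 1) | 1) & MASK32
    return (dst << 1) & MASK32

def divide(dividend, divisor):
    """Unsigned division from repeated SUBC steps.

    Assumes 0 <= dividend < 2**31 and 0 < divisor < 2**31.
    """
    shift = 0
    while (divisor << (shift + 1)) <= dividend:   # align divisor under dividend
        shift += 1
    steps = shift + 1
    acc = dividend
    for _ in range(steps):
        acc = subc(acc, divisor << shift)
    quotient = acc & ((1 << steps) - 1)   # low bits collect the quotient bits
    remainder = acc >> steps              # high bits hold the remainder
    return quotient, remainder

first = subc(0x4F6, 0x492)    # (1270 - 1170) << 1 | 1
second = subc(0x7D0, 0x1000)  # dst < src, so dst is only shifted
```

Each step shifts one quotient bit into the LSB, so after shift + 1 steps the accumulator holds the remainder above the quotient.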
Subtract Floating Point SUBF
0 0 0 1 0 1 1 1 1 G dst src
Description The difference of the dst operand minus the src operand is loaded into the
dst register. The dst and src operands are assumed to be floating-point
numbers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
After Instruction:
AR0 = 809808h
IR0 = 80h
R5 = 051D000000h = 3.9250e + 01
Data at 809888h = 70C8000h = 1.4050e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
Description The difference of the src1 and src2 operands is loaded into the dst register.
The src1, src2, and dst operands are assumed to be floating-point numbers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Subtract Floating Point, 3-Operand SUBF3
Before Instruction:
AR0 = 809888h
IR0 = 80h
AR1 = 809851h
R4 = 0h
Data at 809888h = 70C8000h = 1.4050e + 02
Data at 809851h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR0 = 809808h
IR0 = 80h
AR1 = 809851h
R4 = 51D000000h = 3.9250e + 01
Data at 809888h = 70C8000h = 1.4050e + 02
Data at 809851h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
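The hexadecimal operands in these floating-point examples can be checked by decoding them. The Python sketch below assumes the single-precision layout implied by the example values (two's-complement exponent in bits 31–24, sign in bit 23, fraction in bits 22–0, exponent –128 reserved for zero); the function name c3x_single_to_float is hypothetical.

```python
def c3x_single_to_float(word32):
    """Decode a 32-bit word: 8-bit exponent, sign bit, 23-bit fraction."""
    exponent = (word32 >> 24) & 0xFF
    if exponent >= 0x80:
        exponent -= 0x100                 # two's-complement exponent
    if exponent == -128:
        return 0.0                        # assumed reserved encoding for zero
    sign = (word32 >> 23) & 1
    fraction = (word32 & 0x7FFFFF) / float(1 << 23)
    mantissa = (-2.0 + fraction) if sign else (1.0 + fraction)
    return mantissa * 2.0 ** exponent

a = c3x_single_to_float(0x070C8000)      # memory operand at 809888h
b = c3x_single_to_float(0x0733C000)      # memory operand at 809851h
diff = c3x_single_to_float(0x051D0000)   # 32 MSBs of the result register
```

With these values, b minus a equals the decoded result, which matches the 3.9250e + 01 shown for the destination register.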
Before Instruction:
R7 = 57B400000h = 6.281250e + 01
R0 = 34C200000h = 1.27578125e + 01
R6 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R7 = 57B400000h = 6.281250e + 01
R0 = 34C200000h = 1.27578125e + 01
R6 = 5B7C80000h = – 5.00546875e + 01
LUF LV UF N Z V C = 0 0 0 1 0 0 0
Encoding
31 24 23 16 15 8 7 0
If src3 and dst1 point to the same location, src3 is read before the write to dst1.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Parallel SUBF3 and STF SUBF3||STF
Before Instruction:
R1 = 057B400000h = 6.28125e + 01
AR4 = 8098B8h
IR1 = 8h
R0 = 0h
R7 = 0733C00000h = 1.79750e + 02
AR5 = 809850h
IR0 = 10h
Data at 8098B0h = 70C8000h = 1.4050e + 02
Data at 809860h = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R1 = 057B400000h = 6.28125e + 01
AR4 = 8098B8h
IR1 = 8h
R0 = 061B600000h = 7.768750e + 01
R7 = 0733C00000h = 1.79750e + 02
AR5 = 809850h
IR0 = 10h
Data at 8098B0h = 70C8000h = 1.4050e + 02
Data at 809860h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 1 0 0 0 0 G dst src
Description The difference of the dst operand minus the src operand is loaded into the dst
register. The dst and src operands are assumed to be signed integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
R7 = 226h = 550
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R7 = 14Ah = 330
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Subtract Integer, 3-Operand SUBI3
Encoding
31 24 23 16 15 8 7 0
Description The difference of the src1 operand minus the src2 operand is loaded into the
dst register. The src1, src2, and dst operands are assumed to be signed
integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
R2 = 0866h = 2150
R7 = 0834h = 2100
R0 = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R2 = 0866h = 2150
R7 = 0834h = 2100
R0 = 032h = 50
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Before Instruction:
AR2 = 80985Eh
R4 = 0226h = 550
R3 = 0h
Data at 80985Dh = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR2 = 80985Eh
R4 = 0226h = 550
R3 = 014Ah = 330
Data at 80985Dh = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel SUBI3 and STI SUBI3||STI
Encoding
31 24 23 16 15 8 7 0
Description An integer subtraction and an integer store are performed in parallel. All registers
are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (SUBI3) writes to the same register, STI
accepts as input the contents of the register before it is modified by the SUBI3.
If src3 and dst1 point to the same location, src3 is read before the write to dst1.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
R7 = 14h = 20
AR2 = 80982Fh
IR0 = 10h
R1 = 0h
R3 = 35h = 53
AR7 = 80983Bh
Data at 80983Fh = 0DCh = 220
Data at 80983Ch = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R7 = 14h = 20
AR2 = 80982Fh
IR0 = 10h
R1 = 0C8h = 200
R3 = 35h = 53
AR7 = 80983Ch
Data at 80983Fh = 0DCh = 220
Data at 80983Ch = 35h = 53
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Subtract Reverse Integer With Borrow SUBRB
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 1 0 0 0 1 G dst src
Description The difference of the src, dst, and C operands is loaded into the dst register.
The dst and src operands are assumed to be signed integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
R4 = 03CBh = 971
R6 = 0258h = 600
LUF LV UF N Z V C = 0 0 0 0 0 0 1
After Instruction:
R4 = 03CBh = 971
R6 = 0172h = 370
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 1 0 0 1 0 G dst src
Description The difference of the src operand minus the dst operand is loaded into the dst
register. The dst and src operands are assumed to be floating-point numbers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF 1 if a floating-point underflow occurs; unchanged otherwise
LV 1 if a floating-point overflow occurs; unchanged otherwise
UF 1 if a floating-point underflow occurs; 0 otherwise
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if a floating-point overflow occurs; 0 otherwise
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
DP = 80h
R5 = 057B400000h = 6.281250e + 01
Data at 809905h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
DP = 80h
R5 = 0669E00000h = 1.16937500e + 02
Data at 809905h = 733C000h = 1.79750e + 02
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Subtract Reverse Integer SUBRI
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 1 0 0 1 1 G dst src
Description The difference of the src operand minus the dst operand is loaded into the dst
register. The dst and src operands are assumed to be signed integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV 1 if an integer overflow occurs; unchanged otherwise
UF 0
N 1 if a negative result is generated; 0 otherwise
Z 1 if a 0 result is generated; 0 otherwise
V 1 if an integer overflow occurs; 0 otherwise
C 1 if a borrow occurs; 0 otherwise
Mode Bit OVM Operation is affected by OVM bit value.
Before Instruction:
AR5 = 809900h
IR0 = 8h
R3 = 0DCh = 220
Data at 809900h = 226h = 550
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR5 = 809908h
IR0 = 8h
R3 = 014Ah = 330
Data at 809900h = 226h = 550
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Syntax SWI
Operands None
Encoding
31 24 23 16 15 8 7 0
0 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Description The SWI instruction performs an emulator interrupt. This is a reserved
instruction and should not be used in normal programming.
Cycles 4
Trap Conditionally TRAPcond
Syntax TRAPcond N
Operation 0 → ST(GIE)
If cond is true:
Next PC → *++SP,
Trap vector N → PC.
Else:
ST(GIE) is restored to its previous value; continue.
Operands N (0 ≤ N ≤ 31)
Encoding
31 24 23 16 15 8 7 0
0 1 1 1 0 1 0 0 0 0 0 cond 0 0 0 0 0 0 0 0 0 0 1 N
Description Interrupts are disabled globally when 0 is written to ST(GIE). If the condition
is true, the contents of the PC are pushed onto the system stack, and the PC
is loaded with the contents of the specified trap vector (N). If the condition is
not true, ST(GIE) is set to its value before the TRAPcond instruction changes
it.
The TMS320C3x provides 20 condition codes that can be used with this
instruction (see Table 10–9 on page 10-13 for a list of condition mnemonics,
condition codes, and flags). Condition flags are set on a previous instruction
only when the destination register is one of the extended-precision registers
(R7–R0) or when one of the compare instructions (CMPF, CMPF3, CMPI,
CMPI3, TSTB, or TSTB3) is executed.
Cycles 5
Example TRAPZ 16
Before Instruction:
PC = 123h
SP = 809870h
ST = 0h
Trap Vector 16 = 10h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
PC = 10h
SP = 809871h
Data at 809871h = 124h
ST = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
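The sequence in the Description (clear GIE, then either take the trap or restore GIE) can be sketched as a state transition. This Python model rests on assumptions: the function name trap_cond, the state dictionary, and the trap_vectors table are hypothetical, the GIE mask of 2000h is assumed, and the not-taken path is modeled as advancing PC by one word.

```python
GIE = 0x2000  # assumed ST(GIE) mask

def trap_cond(state, cond_true, n, trap_vectors):
    """Model TRAPcond N on a CPU state dictionary."""
    saved_gie = state["ST"] & GIE
    state["ST"] &= ~GIE                              # 0 -> ST(GIE)
    if cond_true:
        state["SP"] += 1                             # Next PC -> *++SP
        state["mem"][state["SP"]] = state["PC"] + 1
        state["PC"] = trap_vectors[n]                # trap vector N -> PC
    else:
        state["ST"] |= saved_gie                     # GIE returns to its old value
        state["PC"] += 1
    return state

vectors = {16: 0x10}
taken = trap_cond({"PC": 0x123, "SP": 0x809870, "ST": 0x0, "mem": {}},
                  True, 16, vectors)
not_taken = trap_cond({"PC": 0x123, "SP": 0x809870, "ST": 0x2000, "mem": {}},
                      False, 16, vectors)
```

A taken trap leaves interrupts disabled until the handler re-enables them, for example through a return that writes 1 to ST(GIE).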
Test Bit Fields TSTB
0 0 0 1 1 0 1 0 0 G dst src
Description The bitwise logical-AND of the dst and src operands is formed, but the result
is not loaded in any register. This allows for nondestructive compares. The dst
and src operands are assumed to be unsigned integers.
Cycles 1
Status Bits These condition flags are modified for all destination registers (R27 – R0).
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Example TSTB *–AR4(1),R5
Before Instruction:
AR4 = 8099C5h
R5 = 898h = 2200
Data at 8099C4h = 767h = 1895
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR4 = 8099C5h
R5 = 898h = 2200
Data at 8099C4h = 767h = 1895
LUF LV UF N Z V C = 0 0 0 0 1 0 0
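Since TSTB only sets flags, a typical use is to ask whether any bit of a mask is set in a word. The Python sketch below models the N and Z results; the function name tstb and the flag dictionary are assumptions made for illustration.

```python
def tstb(src, dst):
    """Return the N and Z flags of dst AND src; no operand is modified."""
    result = (dst & src) & 0xFFFFFFFF
    return {"N": result >> 31, "Z": int(result == 0)}

no_common_bits = tstb(0x767, 0x898)     # operands from the TSTB example
some_common_bits = tstb(0x1568, 0xFBC4)
top_bit_set = tstb(0x80000000, 0xFFFFFFFF)
```

In the first case 898h and 767h share no set bits, which is why the example ends with Z = 1.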
Encoding
31 24 23 16 15 8 7 0
0 0 1 0 0 1 1 1 1 T 0 0 0 0 0 src1 src2
Description The bitwise logical-AND between the src1 and src2 operands is formed but is
not loaded into any register. This allows for nondestructive compares. The
src1 and src2 operands are assumed to be unsigned integers. Although this
instruction has only two operands, it is designated as a three-operand
instruction because operands are specified in the three-operand format.
Cycles 1
Status Bits These condition flags are modified for all destination registers (R27 – R0).
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Test Bit Fields, 3-Operands TSTB3
Before Instruction:
AR5 = 809885h
IR0 = 80h
AR0 = 80992Ch
Data at 809885h = 898h = 2200
Data at 80992Dh = 767h = 1895
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR5 = 809805h
IR0 = 80h
AR0 = 80992Ch
Data at 809885h = 898h = 2200
Data at 80992Dh = 767h = 1895
LUF LV UF N Z V C = 0 0 0 0 1 0 0
Before Instruction:
R4 = 0FBC4h
AR6 = 8099F8h
IR0 = 8h
Data at 8099F8h = 1568h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R4 = 0FBC4h
AR6 = 8099F0h
IR0 = 8h
Data at 8099F8h = 1568h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Encoding
31 24 23 16 15 8 7 0
0 0 0 1 1 0 1 0 1 G dst src
Description The bitwise exclusive-OR of the src and dst operands is loaded into the dst
register. The dst and src operands are assumed to be unsigned integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
R1 = 0FFA32h
R2 = 0FF5C1h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R1 = 0FFA32h
R2 = 000FF3h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Bitwise Exclusive-OR, 3-Operand XOR3
Encoding
31 24 23 16 15 8 7 0
Description The bitwise exclusive-OR between the src1 and src2 operands is loaded into
the dst register. The src1, src2, and dst operands are assumed to be unsigned
integers.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
AR3 = 809800h
IR0 = 10h
R7 = 0FFFFh
R4 = 0h
Data at 809800h = 5AC3h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR3 = 809810h
IR0 = 10h
R7 = 0FFFFh
R4 = 0A53Ch
Data at 809800h = 5AC3h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Before Instruction:
R5 = 0FFA32h
AR1 = 809826h
R1 = 0h
Data at 809825h = 0FF5C1h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
R5 = 0FFA32h
AR1 = 809826h
R1 = 000FF3h
Data at 809825h = 0FF5C1h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
Parallel XOR3 and STI XOR3||STI
Encoding
31 24 23 16 15 8 7 0
Description A bitwise exclusive-OR and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle. This
means that, if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (XOR3) writes to the same register, STI
accepts as input the contents of the register before it is modified by the XOR3.
If src2 and dst2 point to the same location, src2 is read before the write to dst2.
Cycles 1
Status Bits These condition flags are modified only if the destination register is R7 – R0.
LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a 0 output is generated; 0 otherwise
V 0
C Unaffected
Mode Bit OVM Operation is not affected by OVM bit value.
Before Instruction:
AR1 = 80987Eh
R3 = 85h
R6 = 0DCh = 220
AR2 = 8098B4h
IR0 = 8h
Data at 80987Eh = 85h
Data at 8098ACh = 0h
LUF LV UF N Z V C = 0 0 0 0 0 0 0
After Instruction:
AR1 = 80987Fh
R3 = 0h
R6 = 0DCh = 220
AR2 = 8098B4h
IR0 = 8h
Data at 80987Eh = 85h
Data at 8098ACh = 0DCh = 220
LUF LV UF N Z V C = 0 0 0 0 1 0 0
Chapter 11
Software Applications
The purpose of this chapter is to explain how to use the instruction set, the
architecture, and the interface of the TMS320C3x processor. It presents coding
examples for frequently used applications and discusses more involved examples
and applications. This chapter defines the principles involved in the
applications and provides the corresponding assembly-language code for
instructional purposes and for immediate use. Whenever the detailed explanation
of the underlying theory is too extensive to be included in this manual,
appropriate references are given for further information.
Processor Initialization
You can reset the processor by applying a low level to the RESET input for
several cycles. At this time, the TMS320C3x terminates execution and puts the
reset vector (that is, the contents of memory location 0) in the program counter.
The reset vector normally contains the address of the system-initialization
routine. The hardware reset also initializes various registers and status bits.
After reset, you can further initialize the processor by executing instructions
that set up operational modes, memory pointers, interrupts, and the remaining
functions needed to meet system requirements.
To configure the processor at reset, you should initialize the following internal
functions:
- Memory-mapped registers
- Interrupt structure
In addition to the initialization performed during the hardware reset (for
conditions after hardware reset, see Chapter 12), Example 11–1 shows coding for
initializing the TMS320C3x to the following machine state:
- All interrupts are enabled.
- The overflow mode is disabled.
- The data memory page pointer is set to 0.
- The internal memory is filled with 0s.
Note that all constants larger than 16 bits should be placed in memory and
accessed through direct or indirect addressing.
.data
MASK .word 0FFFFFFFFH
BLK0 .word 0809800H ; Beginning address of RAM block 0
BLK1 .word 0809C00H ; Beginning address of RAM block 1
STCK .word 0809F00H ; Beginning of stack
CTRL .word 0808000H ; Pointer for peripheral-bus memory map
DMACTL .word 0000000H ; Init for DMA control (0)
TIM0CTL .word 0000000H ; Init of timer 0 control (32)
TIM1CTL .word 0000000H ; Init of timer 1 control (48)
SERGLOB0 .word 0000000H ; Init of serial 0 glbl control (64)
SERPRTX0 .word 0000000H ; Init of serial 0 xmt port control (66)
SERPRTR0 .word 0000000H ; Init of serial 0 rcv port control (67)
SERTIM0 .word 0000000H ; Init of serial 0 timer control (68)
SERGLOB1 .word 0000000H ; Init of serial 1 glbl control (80)
SERPRTX1 .word 0000000H ; Init of serial 1 xmt port control (82)
SERPRTR1 .word 0000000H ; Init of serial 1 rcv port control (83)
SERTIM1 .word 0000000H ; Init of serial 1 timer control (84)
PARINT .word 0000000H ; Init of parallel interface control (100)
IOINT .word 0000000H ; Init of I/O interface control (96)
*
.text
*
* THE ADDRESS AT MEMORY LOCATION 0 DIRECTS EXECUTION TO BEGIN HERE
* FOR RESET PROCESSING THAT INITIALIZES THE PROCESSOR. WHEN RESET
* IS APPLIED, THE FOLLOWING REGISTERS ARE INITIALIZED TO 0:
*
* BITS: 31–14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
*
* INTERNAL DATA MEMORY INITIALIZATION TO FLOATING POINT 0
*
*
* THE PROCESSOR IS INITIALIZED. THE REMAINING APPLICATION–
* DEPENDENT PART OF THE SYSTEM (BOTH ON– AND OFF–CHIP) SHOULD
* NOW BE INITIALIZED.
*
* FIRST, INITIALIZE THE CONTROL REGISTERS. IN THIS EXAMPLE,
* EVERYTHING IS INITIALIZED TO 0, SINCE THE ACTUAL INITIALIZATION IS
* APPLICATION-DEPENDENT.
*
LDI @CTRL,AR0 ; Load in AR0 the pointer to control
* ; registers
LDI @DMACTL,R0
STI R0,*+AR0(0) ; Init DMA control
LDI @TIM0CTL,R0
STI R0,*+AR0(32) ; Init timer 0 control
LDI @TIM1CTL,R0
STI R0,*+AR0(48) ; Init timer 1 control
LDI @SERGLOB0,R0
STI R0,*+AR0(64) ; Init serial 0 global control
LDI @SERPRTX0,R0
STI R0,*+AR0(66) ; Init serial 0 xmt control
LDI @SERPRTR0,R0
STI R0,*+AR0(67) ; Init serial 0 rcv control
LDI @SERTIM0,R0
STI R0,*+AR0(68) ; Init serial 0 timer control
LDI @SERGLOB1,R0
STI R0,*+AR0(80) ; Init serial 1 global control
LDI @SERPRTX1,R0
STI R0,*+AR0(82) ; Init serial 1 xmt control
LDI @SERPRTR1,R0
STI R0,*+AR0(83) ; Init serial 1 rcv control
LDI @SERTIM1,R0
STI R0,*+AR0(84) ; Init serial 1 timer control
LDI @PARINT,R0
STI R0,*+AR0(100) ; Init parallel interface control (C30 only)
LDI @IOINT,R0
STI R0,*+AR0(96) ; Init I/O interface control
*
LDI @STCK,SP ; Init the stack pointer
OR 2000H,ST ; Global interrupt enable
*
BR BEGIN ; Branch to the beginning of application
.end
11.2.1 Subroutines
The TMS320C3x has a 24-bit program counter (PC) and a practically unlimited
software stack. The CALL and CALLcond subroutine calls increment the stack
pointer and store the next value of the PC on the stack. At the end of the
subroutine, RETScond performs a conditional return.
Example 11–2 illustrates the use of a subroutine to determine the dot product
of two vectors. Given two vectors of length N, represented by the arrays
a [0], a [1],..., a [N –1] and b [0], b [1],..., b [N –1], the dot product is computed
from the expression
d = a [0] b [0] + a [1] b [1] + ... + a [N –1] b [N –1]
Processing proceeds in the main routine to the point where the dot product is
to be computed. It is assumed that the arguments of the subroutine have been
appropriately initialized. At this point, a CALL is made to the subroutine,
transferring control to that section of the program memory for execution, then
returning to the calling routine via the RETS instruction when execution has
completed. Note that for this particular example, it would suffice to save the
register R2. However, a larger number of registers are saved for demonstration
purposes. The saved registers are stored on the system stack. This stack
should be large enough to accommodate the maximum anticipated storage
requirements. You could use other methods of saving registers equally well.
Program Control
* CALL DOT
* .
* .
* .
*
* SUBROUTINE DOT
*
*
* EQUATION: d = a(0) * b(0) + a(1) * b(1) + ... + a(N-1) * b(N-1)
*
* THE DOT PRODUCT OF a AND b IS PLACED IN REGISTER R0. N MUST
* BE GREATER THAN OR EQUAL TO 2.
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* AR0 | ADDRESS OF a(0)
* AR1 | ADDRESS OF b(0)
* RC | LENGTH OF VECTORS (N)
*
* REGISTERS USED AS INPUT: AR0, AR1, RC
* REGISTER MODIFIED: R0
* REGISTER CONTAINING RESULT: R0
*
*
*
.global DOT
*
DOT PUSH ST ; Save status register
PUSH R2 ; Use the stack to save R2’s
PUSHF R2 ; Lower 32 and upper 32 bits
PUSH AR0 ; Save AR0
PUSH AR1 ; Save AR1
PUSH RC ; Save RC
* ; Initialize R0:
MPYF3 *AR0,*AR1,R0 ; a(0) * b(0) -> R0
LDF 0.0,R2 ; Initialize R2
SUBI 2,RC ; Set RC = N-2
*
* DOT PRODUCT (1 <= i < N)
*
RPTS RC ; Setup the repeat single
MPYF3 *++AR0(1),*++AR1(1),R0 ; a(i) * b(i) -> R0
|| ADDF3 R0,R2,R2 ; a(i-1)*b(i-1) + R2 -> R2
*
ADDF3 R0,R2,R0 ; a(N-1)*b(N-1) + R2 -> R0
*
* RETURN SEQUENCE
*
POP RC ; Restore RC
POP AR1 ; Restore AR1
POP AR0 ; Restore AR0
POPF R2 ; Restore top 32 bits of R2
POP R2 ; Restore bottom 32 bits of R2
POP ST ; Restore ST
RETS ; Return
*
* end
*
.end
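The software pipelining in the DOT listing can be hard to see: each MPYF3 || ADDF3 pair reads its operands before either result is written, so the ADDF3 always accumulates the product from the previous pass. The following Python sketch models that behavior. It is an illustration only: the function name `dot_pipelined` is not part of the manual, and ordinary Python floats stand in for the extended-precision registers.

```python
def dot_pipelined(a, b):
    """Model of the DOT loop: multiply and accumulate overlap by one step."""
    n = len(a)
    assert n == len(b) and n >= 2       # the listing requires N >= 2
    r0 = a[0] * b[0]                    # MPYF3 *AR0,*AR1,R0
    r2 = 0.0                            # LDF 0.0,R2
    for i in range(1, n):               # RPTS RC with RC = N-2 runs N-1 times
        # Both operations read their inputs first, then write their results.
        new_r0 = a[i] * b[i]            # MPYF3: a(i) * b(i) -> R0
        new_r2 = r0 + r2                # ADDF3: a(i-1)*b(i-1) + R2 -> R2
        r0, r2 = new_r0, new_r2
    return r0 + r2                      # final ADDF3 adds the last product


result = dot_pipelined([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])   # 32.0
```

The final addition outside the loop is needed because the last product is still waiting in R0 when the repeat ends.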
The CALL and CALLcond instructions and the interrupt routines push the
value of the PC onto the stack. RETScond and RETIcond then pop the stack
and place the value in the program counter. You can also use the PUSH and
POP instructions to maneuver the integer value of any register onto and off the
stack, respectively. There are two additional instructions, PUSHF and POPF,
for floating-point numbers. You can push and pop floating-point numbers to
registers R7–R0. This feature makes it easy to save all 40 bits of the
extended-precision registers (see Example 11–2). Using PUSH and PUSHF on the
same register saves the lower 32 and upper 32 bits. PUSH saves the lower
32; PUSHF, the upper 32. POPF, followed by POP, will recover this
extended-precision number. It is important to perform the integer and floating-point
PUSH and POP in the order given above. POPF forces the least significant
eight bits of the extended-precision registers to 0 and therefore must be
performed first.
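The ordering rule can be checked with a small Python model of a 40-bit extended-precision register. This is a sketch under stated assumptions: the helper names are illustrative, the register is treated as a plain 40-bit integer with the exponent in bits 39-32, and POP is assumed to leave bits 39-32 unchanged.

```python
MASK32 = 0xFFFFFFFF
MASK40 = 0xFFFFFFFFFF


def push(stack, reg):
    """PUSH: store the lower 32 bits (bits 31-0) of a 40-bit register."""
    stack.append(reg & MASK32)


def pushf(stack, reg):
    """PUSHF: store the upper 32 bits (bits 39-8) of a 40-bit register."""
    stack.append((reg >> 8) & MASK32)


def popf(stack, reg):
    """POPF: load bits 39-8 and force the 8 least significant bits to 0."""
    return (stack.pop() << 8) & MASK40


def pop(stack, reg):
    """POP: load bits 31-0; the exponent field (bits 39-32) is kept."""
    return (reg & ~MASK32 & MASK40) | stack.pop()


stack = []
r2 = 0x7F12345678                  # arbitrary 40-bit test pattern
push(stack, r2)
pushf(stack, r2)
restored = popf(stack, 0)          # POPF first: low 8 bits are cleared
restored = pop(stack, restored)    # POP second: restores bits 31-0
```

With POPF followed by POP, `restored` equals the original pattern. Reversing the two restores would leave the eight least significant bits at 0.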
You can easily read and write to the SP to create multiple stacks for different
program segments. SP is not initialized by the hardware during reset. It is
therefore important to remember to initialize its value so that SP points to a
predetermined memory location. This avoids the problem of SP attempting to
write into ROM or otherwise write over useful data.
Even when an interrupt is disabled, you can read the interrupt flag register (IF)
and take appropriate action, depending on whether the interrupt has occurred.
This can be useful when an interrupt-driven interface is not implemented.
Example 11–3 shows the case in which a subroutine is called when interrupt 1
has not occurred.
When interrupt processing begins, the PC is pushed onto the stack, and the
interrupt vector is loaded in the PC. Interrupts are then disabled by clearing
the GIE bit (GIE = 0), and the program continues from the address loaded in the
PC. Since all interrupts are disabled, interrupt processing can proceed without
further interruption, unless the interrupt service routine re-enables interrupts.
Except for very simple interrupt service routines, it is important to ensure that
the processor context is saved during execution of this routine. You must save
the context before you execute the routine itself and restore it after the routine
is finished. This procedure is called context switching. Context switching is also
useful for subroutine calls, especially during extensive use of the auxiliary and
the extended-precision registers. This section contains code examples of
context switching and an interrupt service routine.
Example 11–4 and Example 11–5 show saving and restoring of the
TMS320C3x state. In both examples, the stack is used for saving the registers,
and it expands towards higher addresses. If you don’t want to use the stack
pointed at by SP, you can create a separate stack by using an auxiliary register
as the stack pointer. Registers saved in these examples are:
- Extended-precision registers R7 through R0
- Auxiliary registers AR7 through AR0
- Data-page pointer DP
- Index registers IR0 and IR1
- Block-size register BK
- Status register ST
- Interrupt-related registers IE and IF
- I/O flag IOF
- Repeat-related registers RS, RE, and RC
Conditional delayed branches use the conditions that exist at the end of the
instruction immediately preceding the delayed branch. Sometimes a branch
is necessary in the flow of a program, but fewer than three instructions can be
placed after a delayed branch. For faster execution, it is still advantageous to
use a delayed branch. This is shown in Example 11–7, with NOPs taking the
place of the unused instructions. The trade-off is more instruction words for
less execution time.
Example 11–8 shows an application of the block repeat construct. In this
example, an array of 64 elements is flipped over by exchanging the elements that
are equidistant from the ends of the array. In other words, if the original array is
a(1), a(2), ..., a(63), a(64), the final array after the rearrangement is
a(64), a(63), ..., a(2), a(1).
Because the exchange operation is done on two elements at the same time,
it requires 32 operations. The repeat counter RC is initialized to 31. In general,
if RC contains the number N, the loop will be executed N + 1 times. The loop
is defined by the RPTB instruction and the EXCH label.
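The relationship between the repeat counter and the number of exchanges can be sketched in Python as follows. The function name `flip_array` and the two index variables are illustrative stand-ins for the auxiliary registers used by the assembly example, not names taken from it.

```python
def flip_array(a):
    """Exchange elements equidistant from the two ends, in place."""
    rc = len(a) // 2 - 1          # repeat counter: loop body runs rc + 1 times
    lo, hi = 0, len(a) - 1        # two pointers, one at each end of the array
    for _ in range(rc + 1):
        a[lo], a[hi] = a[hi], a[lo]
        lo += 1
        hi -= 1
    return a


data = list(range(1, 65))         # 64 elements, so rc = 31
flip_array(data)
```

For the 64-element array, `rc` is 31 and the body runs 32 times, matching the count given above.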
In principle, it is possible to nest repeat blocks. However, there is only one set
of control registers: RS, RE, and RC. It is therefore necessary to save these
registers before entering an inside loop. It might be more practical to
implement a nested loop by the more traditional method of using a register as a
counter and then using a delayed branch rather than using the nested repeat
block approach.
Example 11–9 shows another example of using the block repeat, in this case to
find the maximum of 147 numbers.
The single-instruction repeat uses the control registers RS, RE, and RC in the
same way as the block repeat. The advantage over the block repeat is that the
instruction is fetched only once, and then the buses are available for moving
operands. Note that the single-instruction repeat construct is not interruptible,
while block repeat is interruptible.
Logical and Arithmetic Operations
You can use direct memory access (DMA) in parallel with CPU operations to
accomplish such data transfers. The DMA operation is explained in detail in
subsection 8.3 on page 8-43. An alternative to DMA is to perform data
transfers under program control using load and store instructions in a repeat mode.
Example 11–14 shows the transfer of a block of 512 floating-point numbers
from external memory to block 1 of the on-chip RAM.
The TMS320C3x can implement fast Fourier transforms (FFT) with bit-reversed
addressing. If the data to be transformed is in the correct order, the final
result of the FFT is scrambled in bit-reversed order. To recover the
frequency-domain data in the correct order, you must swap certain memory locations.
The bit-reversed addressing mode makes swapping unnecessary. The next
time data needs to be accessed, the access is performed in a bit-reversed
manner rather than sequentially. The base address of bit-reversed addressing
must be located on a boundary of the size of the table. For example, if
IR0 = 2^(n–1), the n LSBs of the base address must be 0.
In bit-reversed addressing, IR0 holds a value equal to one-half the size of the
FFT, if real and imaginary data are stored in separate arrays. During
accessing, the auxiliary register is indexed by IR0, but with reverse carry propagation.
Example 11–15 illustrates a 512-point complex FFT being moved from the
place of computation (pointed at by AR0) to a location pointed at by AR1. In
this example, real and imaginary parts XR(i) and XI(i) of the data are not stored
in separate arrays, but they are interleaved XR(0), XI(0), XR(1), XI(1), ...,
XR(N-1), XI(N-1). Because of this arrangement, the length of the array is 2N
instead of N, and IR0 is set to 512 instead of 256.
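Reverse carry propagation can be modeled in Python as shown below. This is a minimal sketch: the function names are hypothetical, offsets are taken relative to a base address whose n LSBs are 0, and the reversed-bit addition is written out with string reversal for clarity rather than speed.

```python
def reverse_carry_add(addr, ir0, nbits):
    """Add ir0 to addr with the carry propagating from MSB toward LSB."""
    def rev(x):
        # Reverse the order of the nbits low-order bits of x.
        return int(format(x, "0{}b".format(nbits))[::-1], 2)

    mask = (1 << nbits) - 1
    return rev((rev(addr) + rev(ir0)) & mask)


def bit_reversed_offsets(ir0, count):
    """First `count` offsets visited when stepping by ir0 in this mode."""
    nbits = ir0.bit_length()          # table size is 2 * ir0 = 2**nbits
    offsets, addr = [], 0
    for _ in range(count):
        offsets.append(addr)
        addr = reverse_carry_add(addr, ir0, nbits)
    return offsets


separate = bit_reversed_offsets(4, 8)        # 8-point data, IR0 = 4
interleaved = bit_reversed_offsets(8, 8)     # 8 complex points, IR0 = 8
```

For separate arrays of 8 points, the offsets are 0, 4, 2, 6, 1, 5, 3, 7. For 8 interleaved complex points with IR0 = 8, the offsets are all even, and halving them gives the same bit-reversed order, which is why IR0 is doubled in the interleaved case.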
SUBC implements binary division in the same manner as long division. The
divisor (which is assumed to be smaller than the dividend) is
shifted left i – j times to be aligned with the dividend. Then, using SUBC, the
shifted divisor is subtracted from the dividend. For each subtraction that does
not produce a negative answer, the dividend is replaced by the difference. It
is then shifted to the left, and a 1 is put in the LSB. If the difference is negative,
the dividend is simply shifted left by 1. This operation is repeated
i – j + 1 times.
Long Division:
                                   00000000000000000000000000000110    Quotient
00000000000000000000000000000101 ) 00000000000000000000000000100001
                                                             –101
                                                               110
                                                              –101
                                                                 11    Remainder
SUBC Method:
00000000000000000000000000100001    Dividend
00000000000000000000000000101000    Divisor (aligned)
                                    Negative difference (first SUBC command)
                ↓
00000000000000000000000001000010    New dividend + quotient
00000000000000000000000000101000    Divisor
00000000000000000000000000011010    Difference (> 0) (second SUBC command)
                ↓
00000000000000000000000000110101    New dividend + quotient
00000000000000000000000000101000    Divisor
00000000000000000000000000001101    Difference (> 0) (third SUBC command)
                ↓
00000000000000000000000000011011    New dividend + quotient
00000000000000000000000000101000    Divisor
                                    Negative difference (fourth SUBC command)
                ↓
00000000000000000000000000110110    Final result
Remainder = 11 (bits above the four LSBs); Quot. = 0110 (four LSBs)
When the SUBC command is used, both the dividend and the divisor must be
positive. Example 11–16 shows a realization of the integer division
in which the sign of the quotient is properly handled. The last instruction
before returning modifies the condition flag in case subsequent operations
depend on the sign of the result.
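The SUBC procedure for positive operands can be modeled in Python as follows. This is a sketch rather than a translation of Example 11–16: the function names are illustrative, Python integers are unbounded (the 32-bit register width is not modeled), and i and j are taken to be the bit positions of the most significant 1 in the dividend and the divisor.

```python
def subc(dst, src):
    """One SUBC step: conditional subtract, then shift a quotient bit in."""
    diff = dst - src
    if diff >= 0:
        return (diff << 1) | 1    # keep the difference, quotient bit = 1
    return dst << 1               # discard the difference, quotient bit = 0


def divide_unsigned(dividend, divisor):
    """Return (quotient, remainder) for positive integers using SUBC steps."""
    if dividend < divisor:
        return 0, dividend
    shift = dividend.bit_length() - divisor.bit_length()   # i - j
    aligned = divisor << shift
    acc = dividend
    steps = shift + 1                                      # i - j + 1
    for _ in range(steps):
        acc = subc(acc, aligned)
    quotient = acc & ((1 << steps) - 1)   # low bits hold the quotient
    remainder = acc >> steps              # upper bits hold the remainder
    return quotient, remainder


q, r = divide_unsigned(33, 5)
```

For 33 divided by 5, the accumulator passes through the same values as the SUBC trace above (66, 53, 27, 54), giving a quotient of 6 and a remainder of 3.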
.globl DIVI
SIGN .set R2
TEMPF .set R3
TEMP .set IR0
COUNT .set IR1
DIVI:
*
* DETERMINE SIGN OF RESULT. GET ABSOLUTE VALUE OF OPERANDS.
*
If the dividend is less than the divisor and you want fractional division, you can
perform a division after you determine the desired accuracy of the quotient in
bits. If the desired accuracy is k bits, start by shifting the dividend left by k
positions. Then apply the algorithm described above, with i replaced by i + k. It is
assumed that i + k is less than 32.
x [0] = 1.0 * 2^(–e –1)
This algorithm properly treats the boundary conditions when the input number
either is 0 or has a very large value. When the input is 0, the exponent
e = – 128. Then the calculation of x [0] yields an exponent equal to
– (– 128) –1 = 127, and the algorithm will overflow and saturate. On the other
hand, in the case of a very large number, e = 127, the exponent of x [0] will be
– 127 – 1 = – 128. This will cause the algorithm to yield 0, which is a reasonable
handling of that boundary condition.
NEGI R1
SUBI 1,R1 ; Now we have -e-1, the exponent of x[0]
ASH 24,R1
PUSH R1
POPF R1 ; Now R1 = x[0] = 1.0 * 2**(-e-1)
*
* NOW THE ITERATIONS BEGIN.
*
MPYF R1,R0,R2 ; R2 = v * x[0]
SUBRF 2.0,R2 ; R2 = 2.0 - v * x[0]
MPYF R2,R1 ; R1 = x[1] = x[0] * (2.0 - v * x[0])
*
MPYF R1,R0,R2 ; R2 = v * x[1]
SUBRF 2.0,R2 ; R2 = 2.0 - v * x[1]
MPYF R2,R1 ; R1 = x[2] = x[1] * (2.0 - v * x[1])
*
MPYF R1,R0,R2 ; R2 = v * x[2]
SUBRF 2.0,R2 ; R2 = 2.0 - v * x[2]
MPYF R2,R1 ; R1 = x[3] = x[2] * (2.0 - v * x[2])
*
MPYF R1,R0,R2 ; R2 = v * x[3]
SUBRF 2.0,R2 ; R2 = 2.0 - v * x[3]
MPYF R2,R1 ; R1 = x[4] = x[3] * (2.0 - v * x[3])
*
RND R1 ; This minimizes error in the LSBs
*
* FOR THE LAST ITERATION WE USE THE FORMULATION:
* x[5] = (x[4] * (1.0 - (v * x[4]))) + x[4]
*
MPYF R1,R0,R2 ; R2 = v * x[4] = 1.0..01.. => 1
SUBRF 1.0,R2 ; R2 = 1.0 - v * x[4] = 0.0..01... => 0
MPYF R1,R2 ; R2 = x[4] * (1.0 - v * x[4])
ADDF R2,R1 ; R1 = x[5] = (x[4]*(1.0-(v*x[4])))+x[4]
*
RND R1,R0 ; Round since this is followed by a MPYF
*
* NOW THE CASE OF v < 0 IS HANDLED.
*
NEGF R0,R2
LDF R3,R3 ; This sets condition flags
LDFN R2,R0 ; If v < 0, then R0 = -R0
*
RETS
*
* END
*
.end
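The iteration used in the listing, x[i+1] = x[i] * (2.0 - v * x[i]) starting from x[0] = 1.0 * 2**(-e-1), can be tried out with the following Python sketch. It is only a model: the function name is illustrative, Python double-precision floats replace the C3x formats, and a zero input raises an exception here instead of saturating.

```python
import math


def reciprocal(v, iterations=5):
    """Approximate 1/v with x[i+1] = x[i] * (2.0 - v * x[i])."""
    if v == 0.0:
        raise ZeroDivisionError("the C3x routine saturates for a zero input")
    a = abs(v)
    # math.frexp returns a = m * 2**p with 0.5 <= m < 1, so e = p - 1 and
    # the starting estimate x[0] = 2**(-e - 1) equals 2**(-p).
    _, p = math.frexp(a)
    x = math.ldexp(1.0, -p)
    for _ in range(iterations):
        x = x * (2.0 - a * x)
    return -x if v < 0.0 else x


approx = reciprocal(3.0)
```

With this starting estimate, v * x[0] lies between 0.5 and 1, and the relative error is squared on each pass, so five iterations bring it below about 2.4e-10.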
At the ith iteration, the estimate x[i] of 1 / SQRT(v) is computed from v and the
previous estimate x[i-1] according to this formula:
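Assuming the standard Newton-Raphson update for 1 / SQRT(v), namely x[i] = x[i-1] * (1.5 - (v/2) * x[i-1] * x[i-1]), the computation can be sketched in Python as follows. The function name, the choice of starting estimate, and the iteration count are assumptions made for this illustration and are not taken from the manual's routine.

```python
import math


def inverse_sqrt(v, iterations=8):
    """Approximate 1/sqrt(v), v > 0, by Newton-Raphson iteration."""
    if v <= 0.0:
        raise ValueError("v must be positive")
    # v = m * 2**p with 0.5 <= m < 1; start from a power of two no larger
    # than 1/sqrt(v) so that the iteration converges.
    _, p = math.frexp(v)
    x = math.ldexp(1.0, -((p + 1) // 2))
    for _ in range(iterations):
        x = x * (1.5 - 0.5 * v * x * x)
    return x


estimate = inverse_sqrt(4.0)
```

The starting value keeps v * x[0] * x[0] between 0.25 and 1, which is inside the region where this iteration converges.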
In the instruction set, operations ADDC (add with carry) and SUBB (subtract
with borrow) use the status carry bit for extended-precision arithmetic. The
carry bit is affected by the arithmetic operations of the ALU and by the rotate
and shift instructions. It can also be manipulated directly by setting the status
register to certain values. For proper operation, the overflow mode bit should
be reset (OVM = 0) so that the accumulator results are not loaded with the
saturation values. Example 11–19 and Example 11–20 show 64-bit addition and
64-bit subtraction. The first operand is stored in the registers R0 (low word) and
R1 (high word). The second operand is stored in R2 and R3. The result is
stored in R0 and R1.
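The carry and borrow chaining can be modeled in Python as follows. This is a sketch only: the function names are illustrative, each argument is an unsigned 32-bit word named after the register that holds it, and the instruction names in the comments indicate which step each line corresponds to.

```python
MASK32 = 0xFFFFFFFF


def add64(r0, r1, r2, r3):
    """(R1:R0) + (R3:R2) -> (R1:R0); each argument is a 32-bit word."""
    low = r0 + r2                     # ADDI: low words, produces the carry
    carry = low >> 32
    high = r1 + r3 + carry            # ADDC: high words plus carry
    return low & MASK32, high & MASK32


def sub64(r0, r1, r2, r3):
    """(R1:R0) - (R3:R2) -> (R1:R0); each argument is a 32-bit word."""
    borrow = 1 if r0 < r2 else 0      # SUBI: low words, produces the borrow
    low = (r0 - r2) & MASK32
    high = (r1 - r3 - borrow) & MASK32   # SUBB: high words minus borrow
    return low, high


low_word, high_word = add64(0xFFFFFFFF, 0, 1, 0)
```

Adding 1 to 00000000 FFFFFFFFh carries into the high word and yields 00000001 00000000h.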
When two 32-bit numbers are multiplied, a 64-bit product results. The
procedure for multiplication is to split the 32-bit magnitude values of the multiplicand
X and the multiplier Y into two parts (X1,X0) and (Y1,Y0), respectively, with 16
bits each. The operation is done on unsigned numbers, and the product is
adjusted for the sign bit. Example 11–21 shows the implementation of a 32-bit by
32-bit multiplication.
* X0*Y0 16+16 P1
* X0*Y1 16+16 P2
* X1*Y0 16+16 P3
* X1*Y1 16+16 P4
* ––––––––––––––
* W1 W0
*
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R0 | MULTIPLIER AND LOW WORD OF THE PRODUCT
* R1 | MULTIPLICAND AND UPPER WORD OF THE PRODUCT
*
*
* REGISTERS USED AS INPUT: R0, R1
* REGISTERS MODIFIED: R0, R1, R2, R3, R4, AR0, AR1
* REGISTER CONTAINING RESULT: R0,R1
*
*
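The partial-product scheme listed in the comments above (P1 through P4 combined into the words W1 and W0) can be checked with the following Python sketch. The function names are illustrative, and the sign adjustment shown is one straightforward way to do it, not necessarily the method used in Example 11–21.

```python
def mul32x32(x, y):
    """Unsigned 32 x 32 -> 64-bit product built from 16-bit halves."""
    x0, x1 = x & 0xFFFF, x >> 16
    y0, y1 = y & 0xFFFF, y >> 16
    p1 = x0 * y0                      # weight 2**0
    p2 = x0 * y1                      # weight 2**16
    p3 = x1 * y0                      # weight 2**16
    p4 = x1 * y1                      # weight 2**32
    total = p1 + ((p2 + p3) << 16) + (p4 << 32)
    w0 = total & 0xFFFFFFFF           # low word of the product
    w1 = total >> 32                  # upper word of the product
    return w0, w1


def mul32x32_signed(x, y):
    """Signed version: multiply magnitudes, then adjust the sign."""
    negative = (x < 0) != (y < 0)
    w0, w1 = mul32x32(abs(x), abs(y))
    product = (w1 << 32) | w0
    if negative:
        product = -product
    product &= 0xFFFFFFFFFFFFFFFF     # 64-bit two's complement
    return product & 0xFFFFFFFF, product >> 32


w0, w1 = mul32x32(0xFFFFFFFF, 0xFFFFFFFF)
```

Each partial product fits in 32 bits because both factors are 16 bits wide, which is what makes the split workable on a 32-bit machine.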
In fixed-point arithmetic, the binary point that separates the integer from the
fractional part of the number is fixed at a certain location. For example, if a
32-bit number has the binary point after the most significant bit (which is also
the sign bit), only fractional numbers (numbers with absolute values less than
1) can be represented. Such a number is called a Q31 number,
which is a number with 31 fractional bits. All operations assume that the binary
point is fixed at this location. The fixed-point system, although simple to
implement in hardware, imposes limitations in the dynamic range of the represented
number, which causes scaling problems in many applications. You can avoid
this difficulty by using floating-point numbers.
m * b^e
TMS320C3x floating-point format:
8 1 23
e s f
In a 32-bit word representing a floating-point number, the first eight bits
correspond to the exponent expressed in two’s-complement format. There is one
bit for sign and 23 bits for the mantissa. The mantissa is expressed in
two’s-complement form, with the binary point after the most significant nonsign bit.
Since this bit is the complement of the sign bit s, it is suppressed. In other
words, the mantissa actually has 24 bits. A special case occurs when
e = –128. In this case, the number is interpreted as 0, independently of the
values of s and f (which are set to 0 by default). To summarize, the values of
the represented numbers in the TMS320C3x floating-point format are as follows:
2^e * (01.f) if s = 0
2^e * (10.f) if s = 1
0 if e = –128
IEEE floating-point format:
1 8 23
s e f
The IEEE floating-point format uses sign-magnitude notation for the mantissa,
and the exponent is biased by 127. In a 32-bit word representing a
floating-point number, the first bit is the sign bit. The next eight bits correspond
to the exponent, which is expressed in an offset-by-127 format (the actual
exponent is e –127). The following 23 bits represent the absolute value of the
mantissa with the most significant 1 implied. The binary point is after this most
significant 1. In other words, the mantissa actually has 24 bits. There are
several special cases, summarized below.
These are the values of the represented numbers in the IEEE floating-point
format:
(–1)^s * 2^(e –127) * (01.f) if 0 < e < 255
Special cases:
(–1)^s * 0.0 if e = 0 and f = 0 (zero)
(–1)^s * 2^(–126) * (0.f) if e = 0 and f <> 0 (denormalized)
(–1)^s * infinity if e = 255 and f = 0 (infinity)
NaN (not a number) if e = 255 and f <> 0
Based on these definitions of the formats, two versions of the conversion
routines were developed. One version handles the complete definition of the
formats. The other ignores some of the special cases (typically the ones that are
rarely used), but it has the benefit of executing faster than the complete
conversion. For this discussion, the two versions are referred to as the complete
version and the fast version, respectively.
Example 11–22 shows the fast conversion from IEEE to TMS320C3x floating-point
format. It properly handles the general case when 0 < e < 255, and also
handles 0s (that is, e = 0 and f = 0). The other special cases (denormalized,
infinity, and NaN) are not treated and, if present, will give erroneous results.
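The field manipulation behind the fast conversion can be sketched in Python from the two value tables above. This is an illustration, not a transcription of Example 11–22: the function names are hypothetical, the TMS320C3x number is held as a 32-bit word with the exponent in bits 31-24, the sign in bit 23, and the fraction in bits 22-0, and only normal numbers and zero are handled.

```python
import struct


def ieee_to_c3x(x):
    """Convert a float to a 32-bit TMS320C3x word (normal numbers and 0)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]   # IEEE single bits
    sign = bits >> 31
    e_ieee = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    if e_ieee == 0 and frac == 0:
        return 0x80000000              # e = -128 represents zero
    e = e_ieee - 127                   # remove the IEEE exponent bias
    if sign:
        if frac == 0:
            e -= 1                     # -1.0 * 2**e is stored as -2.0 * 2**(e-1)
        else:
            frac = (1 << 23) - frac    # two's-complement fraction
    return ((e & 0xFF) << 24) | (sign << 23) | frac


def c3x_to_float(word):
    """Decode a 32-bit TMS320C3x word using the value table above."""
    e = (word >> 24) & 0xFF
    if e >= 128:
        e -= 256                       # exponent is two's complement
    if e == -128:
        return 0.0
    s = (word >> 23) & 1
    f = (word & 0x7FFFFF) / float(1 << 23)
    mantissa = (-2.0 + f) if s else (1.0 + f)   # 10.f or 01.f
    return mantissa * 2.0 ** e


word = ieee_to_c3x(-6.0)
```

The negative case is where the two formats differ most: a sign-magnitude mantissa of 1.f becomes the two's-complement mantissa 10.f', and a negative power of two needs its exponent reduced by 1 because its mantissa is stored as exactly -2.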
Example 11–23 shows the complete conversion between the IEEE and
TMS320C3x formats. In addition to the general case and the 0s, it handles the
special cases as follows:
TSTB *+AR1(7),R0
RETSNZ ; Return if NaN
LDI R0,R0
LDFGT *+AR1(8),R0 ; If positive, infinity =
; most positive number
LDFN *+AR1(5),R0 ; If negative, infinity =
; most negative number
RETS
*
* TITLE TMS320C3x TO IEEE CONVERSION (FAST VERSION)
*
*
* SUBROUTINE TOIEEE
*
* FUNCTION: CONVERSION BETWEEN THE TMS320C3x FORMAT AND THE IEEE
* FLOATING-POINT FORMAT. THE NUMBER TO BE CONVERTED
* IS IN THE UPPER 32 BITS OF R0. THE RESULT WILL BE IN
* THE LOWER 32 BITS OF R0.
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R0 | NUMBER TO BE CONVERTED
* AR1 | POINTER TO TABLE WITH CONSTANTS
*
*
TOIEEE1 LDF R0,R0 ; Determine the sign of the number
LDFZ *+AR1(4),R0 ; If 0, load appropriate number
BND NEG ; Branch to NEG if negative (delayed)
ABSF R0 ; Take the absolute value
; of the number
LSH 1,R0 ; Eliminate the sign bit in R0
PUSHF R0
POP R0 ; Place number in lower 32 bits of R0
ADDI *+AR1(2),R0 ; Add exponent bias (127)
LSH -1,R0 ; Add the positive sign
CONT TSTB *+AR1(5),R0
RETSNZ ; If e > 0, return
TSTB *+AR1(7),R0
RETSZ ; If e = 0 & f = 0, return
PUSH R0
POPF R0
LSH -1,R0 ; Shift f right by one bit
PUSHF R0
POP R0
ADDI *+AR1(6),R0 ; Add 1 to the MSB of f
RETS
NEG POP R0 ; Place number in lower 32 bits of R0
BRD CONT
ADDI *+AR1(2),R0 ; Add exponent bias (127)
LSH -1,R0 ; Make space for the sign
ADDI *+AR1(3),R0 ; Add the negative sign
RETS
Application-Oriented Operations
11.4.1 Companding
In telecommunications, conserving channel bandwidth while preserving
speech quality is a primary concern. This is achieved by quantizing the
speech samples logarithmically. An 8-bit logarithmic quantizer produces
speech quality equivalent to a 13-bit uniform quantizer. The logarithmic
quantization is achieved by companding (COMpress/exPANDing). Two international
standards have been established for companding: the µ-law standard (used
in the United States and Japan), and the A-law standard (used in Europe).
Detailed descriptions of µ-law and A-law companding are presented in an
application report on companding routines included in the book Digital Signal
Processing Applications with the TMS320 Family (literature number SPRA012A).
Example 11–26 and Example 11–27 show µ-law compression and expansion
(that is, linear to µ-law and µ-law to linear conversion), while Example 11–28
and Example 11–29 show A-law compression and expansion. For expansion,
using a look-up table is an alternative approach. A look-up table trades
memory space for speed of execution. Since the compressed data is eight bits
long, you can construct a table with 256 entries containing the expanded data.
If the compressed data is stored in the register AR0, the following two
instructions will put the expanded data in register R0:
ADDI @TABL,AR0 ; @TABL = BASE ADDRESS OF TABLE
LDI *AR0,R0 ; PUT EXPANDED NUMBER IN R0
You could use the same look-up table approach for compression, but the
required table length would then be 16,384 words for µ-law or 8,192 words for
A-law. If this memory size is not acceptable, use the subroutines presented in
Example 11–26 or Example 11–28.
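The 256-entry expansion table can be generated offline. The Python sketch below builds one for µ-law, assuming the usual G.711 decoding rule with a bias of 33 on a 14-bit linear scale and bit-inverted codes on the line; the scaling and sign conventions of the manual's own expansion routine may differ, and the names `mu_law_expand`, `TABLE`, and `expand` are illustrative.

```python
def mu_law_expand(code):
    """Expand one 8-bit mu-law code to a linear sample (G.711 convention)."""
    code = ~code & 0xFF               # undo the bit inversion used for transmission
    sign = code & 0x80
    exponent = (code >> 4) & 0x07     # segment number
    mantissa = code & 0x0F            # step within the segment
    magnitude = (((mantissa << 1) + 33) << exponent) - 33
    return -magnitude if sign else magnitude


# 256-entry table indexed directly by the compressed byte.
TABLE = [mu_law_expand(code) for code in range(256)]


def expand(compressed):
    """Table-lookup equivalent of the two-instruction sequence above."""
    return TABLE[compressed & 0xFF]
```

Once the table is in memory, each expansion costs one address computation and one load, which is the space-for-speed trade described above.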
LDI 0,R2
LDI R1,R1 ; If number is negative,
LDILT 80H,R2 ; set sign bit
ADDI R2,R0 ; R0 = compressed number
NOT R0 ; Reverse all bits for transmission
RETS
Digital filters are a common requirement for digital signal processing systems.
There are two types of digital filters: finite impulse response (FIR) and infinite
impulse response (IIR). Each of these types can have either fixed or adaptable
coefficients. This section presents the fixed-coefficient filters first, followed by
the adaptive filters.
If the FIR filter has an impulse response h [0], h [1],..., h [N –1], and x [n]
represents the input of the filter at time n, the output y [n] at time n is given by this
equation:
y [n] = h [0] x [n] + h [1] x [n –1] + ... + h [N –1] x [n – (N –1)]
Two features of the TMS320C3x that facilitate the implementation of the FIR
filters are parallel multiply/add operations and circular addressing. The former
permits the performance of a multiplication and an addition in a single machine
cycle, while the latter makes a finite buffer of length N sufficient for the data x.
Figure 11–1 shows the arrangement of the memory locations necessary to
implement circular addressing, while Example 11–30 presents the TMS320C3x
assembly code for an FIR filter.
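The role of the circular buffer can be illustrated with a short Python model. This is a sketch with illustrative names (`FirFilter`, `step`, `newest`); the modulo operation plays the part of circular addressing, and the storage order inside the buffer is an assumption that need not match Figure 11–1.

```python
class FirFilter:
    """FIR filter whose delay line is a circular buffer of length N."""

    def __init__(self, h):
        self.h = list(h)                  # impulse response h[0..N-1]
        self.x = [0.0] * len(self.h)      # circular buffer for the input
        self.newest = 0                   # index of the most recent sample

    def step(self, sample):
        n = len(self.h)
        self.newest = (self.newest + 1) % n    # circular pointer update
        self.x[self.newest] = sample           # overwrite the oldest sample
        acc = 0.0
        for k in range(n):                     # y[n] = sum of h[k] * x[n-k]
            acc += self.h[k] * self.x[(self.newest - k) % n]
        return acc


fir = FirFilter([0.5, 0.25, 0.125])
impulse_response = [fir.step(s) for s in (1.0, 0.0, 0.0, 0.0)]
```

Feeding a unit impulse returns the coefficients in order, and no data is ever shifted: only the pointer moves, which is why a buffer of length N is sufficient.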
*
ADDF R0,R2,R0 ; Add last product
*
* RETURN SEQUENCE
*
RETS ; Return
*
* end
*
.end
As in the case of FIR filters, the address for the start of the values d must be
a multiple of 4; that is, the last two bits of the beginning address must be 0. The
block-size register BK must be initialized to 3.
*
.global IIR1
*
IIR1 MPYF3 *AR0,*AR1,R0
* ; a2 * d(n-2) -> R0
MPYF3 *++AR0(1),*AR1--(1)%,R1
* ; b2 * d(n-2) -> R1
*
MPYF3 *++AR0(1),*AR1,R0 ; a1 * d(n-1) -> R0
|| ADDF3 R0,R2,R2 ; a2*d(n-2)+x(n) -> R2
*
MPYF3 *++AR0(1),*AR1--(1)%,R0 ; b1 * d(n-1) -> R0
|| ADDF3 R0,R2,R2 ; a1*d(n-1)+a2*d(n-2)+x(n) -> R2
*
MPYF3 *++AR0(1),R2,R2 ; b0 * d(n) -> R2
|| STF R2,*AR1++(1)%
*
* ; Store d(n) and point to d(n-1)
*
ADDF R0,R2 ; b1*d(n-1)+b0*d(n) -> R2
ADDF R1,R2,R0 ; b2*d(n-2)+b1*d(n-1)
; +b0*d(n) -> R0
*
* RETURN SEQUENCE
*
RETS ; Return
*
* end
*
.end
In the more general case, the IIR filter contains N > 1 biquads. The equations
for its implementation are given by the following pseudo-C language code:
y [–1,n] = x [n]
for (i = 0; i < N; i ++){
d [i,n] = a2 [i] d [i,n – 2] + a1 [i] d [i,n – 1] + y [i – 1,n]
y [i,n] = b2 [i] d [i,n – 2] + b1 [i] d [i,n – 1] + b0 [i] d [i,n]
}
y [n] = y [N – 1,n]
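The cascade equations can be exercised with the following Python sketch. The function name, the order of coefficients in each tuple, and the layout of the state list are assumptions made for the illustration; they are not the memory layout used by the assembly routine.

```python
def iir_cascade(x, biquads, state):
    """Process one input sample through N cascaded biquads.

    biquads: list of (a1, a2, b0, b1, b2) tuples, one per section.
    state:   list of [d(n-1), d(n-2)] pairs, updated in place.
    """
    y = x                                  # input to the first biquad
    for (a1, a2, b0, b1, b2), d in zip(biquads, state):
        d1, d2 = d
        dn = a2 * d2 + a1 * d1 + y         # d[i,n]
        y = b2 * d2 + b1 * d1 + b0 * dn    # y[i,n], input to next biquad
        d[0], d[1] = dn, d1                # shift the delay line
    return y


biquads = [(0.5, 0.0, 1.0, 0.0, 0.0)]      # one section: d(n) = 0.5 d(n-1) + x(n)
state = [[0.0, 0.0]]
response = [iir_cascade(s, biquads, state) for s in (1.0, 0.0, 0.0)]
```

The output of each section feeds the next one, so the same two lines of arithmetic are repeated N times per input sample, once per biquad.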
You should initialize the block register BK to 3; the beginning of each set of d
values (that is, d [i,n ], i = 0...N – 1) should be at an address that is a multiple
of 4 (where the last two bits are 0).
*
*
* SUBROUTINE IIR2
*
*
*
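* EQUATIONS: y(-1,n) = x(n)
*
* FOR (i = 0; i < N; i++)
* {
* d(i,n) = a2(i) * d(i,n-2) + a1(i) * d(i,n-1) + y(i-1,n)
* y(i,n) = b2(i) * d(i,n-2) + b1(i) * d(i,n-1) + b0(i) * d(i,n)
* }
* y(n) = y(N-1,n)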
*
* TYPICAL CALLING SEQUENCE:
*
* load R2
* load AR0
* load AR1
* load IR0
* load IR1
* load BK
* load RC
* CALL IIR2
*
*
* ARGUMENT ASSIGNMENT:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R2 | INPUT SAMPLE x(n)
* AR0 | ADDRESS OF FILTER COEFFICIENTS (a2(0))
* AR1 | ADDRESS OF DELAY NODE VALUES (d(0,n-2))
* BK | BK = 3
* IR0 | IR0 = 4
* IR1 | IR1 = 4*N-4
* RC | NUMBER OF BIQUADS (N) -2
*
* CYCLES: 17 + 6N WORDS: 17
*
*
*
*
.global IIR2
*
IIR2 MPYF3 *AR0, *AR1, R0
* ; a2(0) * d(0,n-2) -> R0
MPYF3 *++AR0(1), *AR1--(1)%, R1
* ; b2(0) * d(0,n-2) -> R1
*
MPYF3 *++AR0(1),*AR1,R0 ; a1(0) * d(0,n-1) -> R0
|| ADDF3 R0, R2, R2 ; First sum term of d(0,n)
*
MPYF3 *++AR0(1),*AR1--(1)%,R0 ; b1(0) * d(0,n-1) -> R0
|| ADDF3 R0, R2, R2 ; Second sum term of d(0,n)
MPYF3 *++AR0(1),R2,R2 ; b0(0) * d(0,n) -> R2
|| STF R2, *AR1--(1)%
*
* ; Store d(0,n);
; point to
; d(0,n-2)
RPTB LOOP ; Loop for 1 <= i < N
*
MPYF3 *++AR0(1),*++AR1(IR0),R0 ; a2(i) * d(i,n-2) -> R0
|| ADDF3 R0,R2,R2 ; First sum term of y(i-1,n)
*
MPYF3 *++AR0(1),*AR1--(1)%,R1 ; b2(i) * d(i,n-2) -> R1
|| ADDF3 R1,R2,R2 ; Second sum term
; of y(i-1,n)
*
MPYF3 *++AR0(1),*AR1,R0 ; a1(i) * d(i,n-1) -> R0
|| ADDF3 R0,R2,R2 ; First sum of d(i,n)
*
MPYF3 *++AR0(1),*AR1--(1)%,R0 ; b1(i) * d(i,n-1) -> R0
|| ADDF3 R0,R2,R2 ; Second sum term of d(i,n)
*
STF R2, *AR1--(1)%
* ; Store d(i,n);
; point to d(i,n-2)
LOOP MPYF3 *++AR0(1), R2,R2
* ; b0(i) * d(i,n) -> R2
*
*
* FINAL SUMMATION
*
ADDF R0,R2 ; First sum term of y(N-1,n)
ADDF3 R1,R2,R0 ; Second sum term
; of y(N-1,n)
*
NOP *AR1--(IR1) ; Return to first biquad
NOP *AR1--(1)% ; Point to d(0,n-1)
*
* RETURN SEQUENCE
*
RETS ; Return
* end
*
.end
In some applications in digital signal processing, you must adapt a filter over
time to keep track of changing conditions. The book Theory and Design of
Adaptive Filters by Treichler, Johnson, and Larimore (Wiley-Interscience,
1987) presents the theory of adaptive filters. Although, in theory, both FIR and
IIR structures can be used as adaptive filters, the stability problems and the
local optimum points that IIR filters exhibit make them less attractive for
such an application. Hence, until further research makes IIR filters a better
choice, only FIR filters are used in the adaptive algorithms of practical
applications.
β is a constant for the computation. You can interleave the updating of the filter
coefficients with the computation of the filter output so that it takes three cycles
per filter tap to do both. The updated coefficients are written over the old filter
coefficients. Example 11–33 shows the implementation of an adaptive FIR filter
on the TMS320C3x. The memory organization and the positioning of the
data in memory should follow the same rules that apply to the FIR filter
described in subsection 11.4.2.1 on page 11-58.
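The interleaving of output computation and coefficient update can be sketched in C as follows. This is a minimal illustration, not a translation of Example 11–33: the function name lms_step is hypothetical, the update is assumed to take the usual LMS form h(n+1,i) = h(n,i) + β x(n–i), and the scale factor beta (2 * mu * err in the listing's argument table) is assumed to be supplied by the caller.

```c
#include <stddef.h>

/* One adaptive FIR step (sketch).
 * h[i]: coefficient h(n,i), overwritten in place with h(n+1,i)
 * x[i]: input sample x(n-i)
 * beta: precomputed scale factor, assumed to be 2 * mu * err
 * Returns the filter output y(n), computed with the old coefficients. */
float lms_step(float *h, const float *x, size_t n_taps, float beta)
{
    float y = 0.0f;
    for (size_t i = 0; i < n_taps; i++) {
        y += h[i] * x[i];        /* output term uses the old coefficient */
        h[i] += beta * x[i];     /* updated coefficient overwrites it */
    }
    return y;
}
```

Both statements in the loop body read the same h[i] and x[i], which is what makes it possible to share the operand fetches between the two computations.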
*
* SUBROUTINE LMS
* + h(n,N-1)*x(n-(N-1))
*
* TYPICAL CALLING SEQUENCE:
*
* load R4
* load AR0
* load AR1
* load RC
* load BK
* CALL LMS
*
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R4 | SCALE FACTOR (2 * mu * err)
* AR0 | ADDRESS OF h(n,N-1)
* AR1 | ADDRESS OF x(n-(N-1))
* RC | LENGTH OF FILTER - 2 (N-2)
* BK | LENGTH OF FILTER (N)
*
.global LMS
* ; Initialize R0:
LMS MPYF3 *AR0, *AR1, R0
* ; h(n,N-1) * x(n-(N-1)) -> R0
LDF 0.0,R2 ; Initialize R2
*
* ; Initialize R1:
MPYF3 *AR1++(1)%, R4, R1
* ; x(n-(N-1)) * tmuerr -> R1
ADDF3 *AR0++(1), R1, R1
* ; h(n,N-1) + x(n-(N-1)) *
* ; tmuerr -> R1
*
* RETURN SEQUENCE
*
RETS ; Return
*
* end
*
.end
for (i = 0; i < K; i++) {
p (i) = 0
for (j = 0; j < N; j++)
p (i) = p (i) + m (i,j) * v (j)
}
Figure 11–4 shows the data memory organization for matrix-vector multiplication,
and Example 11–34 shows the TMS320C3x assembly code that implements
it. Note that in Example 11–34, K (number of rows) should be greater
than 0, and N (number of columns) should be greater than 1.
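For reference, the same computation can be written as a short C function. This is a sketch for comparison with the assembly version; the name mat_vec and the flat row-by-row array layout (m(0,0), m(0,1), ..., m(0,N–1), m(1,0), ...) are assumptions taken from the memory organization described above.

```c
#include <stddef.h>

/* p = M * v, with M stored row by row (k_rows rows, n_cols columns). */
void mat_vec(const float *m, const float *v, float *p,
             size_t k_rows, size_t n_cols)
{
    for (size_t i = 0; i < k_rows; i++) {
        float acc = 0.0f;                    /* p(i) = 0 */
        for (size_t j = 0; j < n_cols; j++)
            acc += m[i * n_cols + j] * v[j]; /* p(i) += m(i,j) * v(j) */
        p[i] = acc;
    }
}
```

Unlike the assembly routine, this version places no lower bound on the number of columns.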
               Matrix Storage    Input Vector Storage    Result Vector Storage
Low Address    m(0, 0)           v(0)                    p(0)
               m(0, 1)           v(1)                    p(1)
               •                 •                       •
               •                 •                       •
               •                 •                       •
               m(0, N – 1)       v(N – 1)                p(K – 1)
               m(1, 0)
               m(1, 1)
               •
               •
               •
High Address
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* AR0 | ADDRESS OF M(0,0)
* AR1 | ADDRESS OF V(0)
* AR2 | ADDRESS OF P(0)
* AR3 | NUMBER OF ROWS - 1 (K-1)
* R1 | NUMBER OF COLUMNS - 2 (N-2)
*
.global MAT
*
* SETUP
*
MAT LDI R1,IR0 ; Number of columns-2 -> IR0
ADDI 2,IR0 ; IR0 = N
*
* FOR (i = 0; i < K; i++) LOOP OVER THE ROWS
*
RETS ; Return
* end
*
.end
Fourier transforms are an important tool in digital signal processing
systems. The transform converts information from the time domain to the
frequency domain; the inverse Fourier transform converts information from
the frequency domain back to the time domain. Computationally efficient
implementations of the Fourier transform are known as fast Fourier
transforms (FFTs). The theory of FFTs can be found in books such as DFT/FFT
and Convolution Algorithms by C.S. Burrus and T.W. Parks (John Wiley,
1985) and Digital Signal Processing Applications with the TMS320 Family by
Texas Instruments (literature number SPRA012A).
.text
* INITIALIZE
FFTSIZ .word N
LOGFFT .word M
SINTAB .word SINE
INPUT .word INP
OUTPUT .word OUTP
LDI @FFTSIZ,IR1
LSH -2,IR1 ; IR1 = N/4, pointer for SIN/COS table
LDI 0,AR6 ; AR6 holds the current stage number
LDI @FFTSIZ,IR0
LSH 1,IR0 ; IR0 = 2*N1 (because of real/imag)
LDI @FFTSIZ,R7 ; R7 = N2
LDI 1,AR7 ; Initialize repeat counter
; of first loop
LDI 1,AR5 ; Initialize IE index (AR5 = IE)
* OUTER LOOP
* FIRST LOOP
RPTB BLK1
ADDF *AR0,*AR2,R0 ; R0 = X(I)+X(L)
SUBF *AR2++,*AR0++,R1 ; R1 = X(I)-X(L)
ADDF *AR2,*AR0,R2 ; R2 = Y(I)+Y(L)
SUBF *AR2,*AR0,R3 ; R3 = Y(I)-Y(L)
STF R2,*AR0-- ; Y(I) = R2 and...
|| STF R3,*AR2-- ; Y(L) = R3
BLK1 STF R0,*AR0++(IR0) ; X(I) = R0 and...
|| STF R1,*AR2++(IR0) ; X(L) = R1 and AR0,2 = AR0,2 + 2*n
CMPI @LOGFFT,AR6
BZD END
* SECOND LOOP
RPTB BLK2
SUBF *AR2,*AR0,R2 ; R2 = X(I)-X(L)
SUBF *+AR2,*+AR0,R1
* ; R1 = Y(I)-Y(L)
MPYF R2,R6,R0 ; R0 = R2*SIN and...
|| ADDF *+AR2,*+AR0,R3
* ; R3 = Y(I)+Y(L)
MPYF R1,*+AR4(IR1),R3 ; R3 = R1 * COS and ...
|| STF R3,*+AR0 ; Y(I) = Y(I)+Y(L)
CMPI R7,AR1
BNE INLOP ; Loop back to the inner loop
RPTB BITRV
LDF *+AR0(1),R0
|| LDF *AR0++(IR0)B,R1
BITRV STF R0,*+AR1(1)
|| STF R1,*AR1++(IR1)
.globl SINE
.globl N
.globl M
N .set 64
M .set 6
.data
SINE
.float 0.000000
.float 0.098017
.float 0.195090
.float 0.290285
.float 0.382683
.float 0.471397
.float 0.555570
.float 0.634393
.float 0.707107
.float 0.773010
.float 0.831470
.float 0.881921
.float 0.923880
.float 0.956940
.float 0.980785
.float 0.995185
COSINE
.float 1.000000
.float 0.995185
.float 0.980785
.float 0.956940
.float 0.923880
.float 0.881921
.float 0.831470
.float 0.773010
.float 0.707107
.float 0.634393
.float 0.555570
.float 0.471397
.float 0.382683
.float 0.290285
.float 0.195090
.float 0.098017
.float 0.000000
.float -0.098017
.float -0.195090
.float -0.290285
.float -0.382683
.float -0.471397
.float -0.555570
.float -0.634393
.float -0.707107
.float -0.773010
.float -0.831470
.float -0.881921
.float -0.923880
.float -0.956940
.float -0.980785
.float -0.995185
.float -1.000000
.float -0.995185
.float -0.980785
.float -0.956940
.float -0.923880
.float -0.881921
.float -0.831470
.float -0.773010
.float -0.707107
.float -0.634393
.float -0.555570
.float -0.471397
.float -0.382683
.float -0.290285
.float -0.195090
.float -0.098017
.float 0.000000
.float 0.098017
.float 0.195090
.float 0.290285
.float 0.382683
.float 0.471397
.float 0.555570
.float 0.634393
.float 0.707107
.float 0.773010
.float 0.831470
.float 0.881921
.float 0.923880
.float 0.956940
.float 0.980785
.float 0.995185
The radix-2 algorithm has tutorial value, because the functioning of the FFT
algorithm is relatively easy to understand. However, radix-4 implementation
can increase execution speed by reducing the amount of arithmetic required.
Example 11–37 shows the generic implementation of a complex, DIF FFT in
radix-4. A companion table, such as the one in Example 11–36, should set
M equal to log4 N, that is, the base-4 logarithm of the FFT size N.
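Because the radix-4 routine needs M = log4 N while the radix-2 routine needs M = log2 N, it can be useful to check the pair (N, M) before assembling the companion table. The helper below is a small sketch for that purpose; the name ilog is hypothetical and is not part of any TI tool.

```c
/* Integer logarithm (sketch): returns m such that base^m == n,
 * or -1 if n is not an exact power of base. */
int ilog(unsigned n, unsigned base)
{
    int m = 0;
    if (n == 0 || base < 2)
        return -1;
    while (n % base == 0) {
        n /= base;      /* strip one factor of base */
        m++;
    }
    return (n == 1) ? m : -1;
}
```

For the 64-point table of Example 11–36, ilog(64, 2) gives the radix-2 value M = 6, and ilog(64, 4) gives the radix-4 value M = 3. A size such as 32 is rejected for radix-4 because it is not a power of 4.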
*
.globl FFT ; Entry point for execution
.globl N ; FFT size
.globl M ; LOG4(N)
.globl SINE ; Address of sine table
.text
* INITIALIZE
FFT:
* OUTER LOOP
LOOP:
LDI @INPUT,AR0 ; AR0 points to X(I)
ADDI R0,AR0,AR1 ; AR1 points to X(I1)
ADDI R0,AR1,AR2 ; AR2 points to X(I2)
ADDI R0,AR2,AR3 ; AR3 points to X(I3)
LDI @RPTCNT,RC
SUBI 1,RC ; RC should be one less than desired #
* FIRST LOOP
RPTB BLK1
ADDF *+AR0,*+AR2,R1
* ; R1 = Y(I)+Y(I2)
ADDF *+AR3,*+AR1,R3
* ; R3 = Y(I1)+Y(I3)
ADDF R3,R1,R6 ; R6 = R1+R3
SUBF *+AR2,*+AR0,R4
* ; R4 = Y(I)-Y(I2)
STF R6,*+AR0 ; Y(I) = R1+R3
SUBF R3,R1 ; R1 = R1-R3
LDF *AR2,R5 ; R5 = X(I2)
|| LDF *+AR1,R7 ; R7 = Y(I1)
ADDF *AR3,*AR1,R3 ; R3 = X(I1)+X(I3)
ADDF R5,*AR0,R1 ; R1 = X(I)+X(I2)
|| STF R1,*+AR1 ; Y(I1) = R1-R3
ADDF R3,R1,R6 ; R6 = R1+R3
SUBF R5,*AR0,R2 ; R2 = X(I)-X(I2)
|| STF R6,*AR0++(IR0) ; X(I) = R1+R3
SUBF R3,R1 ; R1 = R1-R3
SUBF *AR3,*AR1,R6 ; R6 = X(I1)-X(I3)
SUBF R7,*+AR3,R3 ; -R3 = Y(I1)-Y(I3)
|| STF R1,*AR1++(IR0) ; X(I1) = R1-R3
SUBF R6,R4,R5 ; R5 = R4-R6
ADDF R6,R4 ; R4 = R4+R6
STF R5,*+AR2 ; Y(I2) = R4-R6
|| STF R4,*+AR3 ; Y(I3) = R4+R6
SUBF R3,R2,R5 ; R5 = R2-R3
ADDF R3,R2 ; R2 = R2+R3
BLK1 STF R5,*AR2++(IR0) ; X(I2) = R2-R3
|| STF R2,*AR3++(IR0) ; X(I3) = R2+R3
LDI @STAGE,AR7
ADDI 1,AR7
CMPI @LOGFFT,AR7
BZD END
STI AR7,@STAGE ; Current FFT stage
LDI 1,AR7
STI AR7,@IA1 ; Init IA1 index
LDI 2,AR7
STI AR7,@LPCNT ; Init loop counter for inner loop
; INLOP:
LDI 2,AR6 ; Increment inner loop counter
ADDI @LPCNT,AR6
LDI @LPCNT,AR0
LDI @IA1,AR7
ADDI @IEINDX,AR7 ; IA1 = IA1+IE
ADDI @INPUT,AR0 ; (X(I),Y(I)) pointer
STI AR7,@IA1
* SECOND LOOP
RPTB BLK2
ADDF *+AR2,*+AR0,R3
* ; R3 = Y(I)+Y(I2)
ADDF *+AR3,*+AR1,R5
* ; R5 = Y(I1)+Y(I3)
ADDF R5,R3,R6 ; R6 = R3+R5
SUBF *+AR2,*+AR0,R4
* ; R4 = Y(I)-Y(I2)
SUBF R5,R3 ; R3 = R3-R5
ADDF *AR2,*AR0,R1 ; R1 = X(I)+X(I2)
ADDF *AR3,*AR1,R5 ; R5 = X(I1)+X(I3)
MPYF R3,*+AR5(IR1),R6 ; R6 = R3*CO2
|| STF R6,*+AR0 ; Y(I) = R3+R5
ADDF R5,R1,R7 ; R7 = R1+R5
SUBF *AR2,*AR0,R2 ; R2 = X(I)-X(I2)
SUBF R5,R1 ; R1 = R1-R5
MPYF R1,*AR5,R7 ; R7 = R1*SI2
|| STF R7,*AR0++(IR0) ; X(I) = R1+R5
SUBF R7,R6 ; R6 = R3*CO2-R1*SI2
SUBF *+AR3,*+AR1,R5
* ; R5 = Y(I1)-Y(I3)
MPYF R1,*+AR5(IR1),R7 ; R7 = R1*CO2
|| STF R6,*+AR1 ; Y(I1) = R3*CO2-R1*SI2
MPYF R3,*AR5,R6 ; R6 = R3*SI2
ADDF R7,R6 ; R6 = R1*CO2+R3*SI2
ADDF R5,R2,R1 ; R1 = R2+R5
SUBF R5,R2 ; R2 = R2-R5
SUBF *AR3,*AR1,R5 ; R5 = X(I1)-X(I3)
SUBF R5,R4,R3 ; R3 = R4-R5
ADDF R5,R4 ; R4 = R4+R5
MPYF R3,*+AR4(IR1),R6 ; R6 = R3*CO1
|| STF R6,*AR1++(IR0) ; X(I1) = R1*CO2+R3*SI2
CMPI @LPCNT,R0
BP INLOP ; Loop back to the inner loop
BR CONT
RPTB BLK3
ADDF *AR2,*AR0,R1 ; R1 = X(I)+X(I2)
SUBF *AR2,*AR0,R2 ; R2 = X(I)-X(I2)
ADDF *+AR2,*+AR0,R3
* ; R3 = Y(I)+Y(I2)
SUBF *+AR2,*+AR0,R4
* ; R4 = Y(I)-Y(I2)
ADDF *AR3,*AR1,R5 ; R5 = X(I1)+X(I3)
SUBF R1,R5,R6 ; R6 = R5-R1
ADDF R5,R1 ; R1 = R1+R5
ADDF *+AR3,*+AR1,R5
* ; R5 = Y(I1)+Y(I3)
SUBF R5,R3,R7 ; R7 = R3-R5
ADDF R5,R3 ; R3 = R3+R5
STF R3,*+AR0 ; Y(I) = R3+R5
|| STF R1,*AR0++(IR0) ; X(I) = R1+R5
SUBF *AR3,*AR1,R1 ; R1 = X(I1)-X(I3)
SUBF *+AR3,*+AR1,R3
* ; R3 = Y(I1)-Y(I3)
STF R6,*+AR1 ; Y(I1) = R5-R1
CMPI @LPCNT,R0
BPD INLOP ; Loop back to the inner loop
RPTB BITRV
LDF *+AR0(1),R0
|| LDF *AR0++(IR0)B,R1
BITRV STF R0,*+AR1(1)
|| STF R1,*AR1++(IR1)
Example 11–39 shows the implementation of a radix-2 real inverse FFT. The
inverse transformation assumes that the input data is given in the order
presented at the output of the forward transformation and produces a time signal
in the proper order (that is, bit reversing takes place at the end of the program).
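The bit-reversing step mentioned above reorders the data so that the element at index i trades places with the element whose index is i with its bits reversed. The following C sketch shows an in-place version; the function names are hypothetical, and the explicit index computation stands in for the bit-reversed addressing mode that the assembly routines use.

```c
#include <stddef.h>

/* Reverse the low `bits` bits of index i. */
size_t bit_reverse(size_t i, unsigned bits)
{
    size_t r = 0;
    for (unsigned b = 0; b < bits; b++) {
        r = (r << 1) | (i & 1u);    /* move the lowest bit of i into r */
        i >>= 1;
    }
    return r;
}

/* In-place bit-reversal permutation of an array of length 2^log_n. */
void bit_reverse_permute(float *data, unsigned log_n)
{
    size_t n = (size_t)1 << log_n;
    for (size_t i = 0; i < n; i++) {
        size_t j = bit_reverse(i, log_n);
        if (i < j) {                /* exchange each pair only once */
            float t = data[i];
            data[i] = data[j];
            data[j] = t;
        }
    }
}
```

The i < j test plays the same role as the address comparison in the in-place assembly loops: without it, every pair would be swapped twice and the data would end up unchanged.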
*
* VER DATE COMMENTS
* ––– –––––––––––– –––––––––––––––––––––––––––––––––––––––––––––––––––
* 1.0 18th July 91 Original release.
* 2.0 23rd July 91 Most stages modified.
* Minimum FFT size increased from 32 to 64.
* Faster in place bit reversing algorithm.
* Program size increased by about 100 words.
* One extra data word required.
*****************************************************************************
FP .set AR3
;
; Initialize C function.
;
.sect ".ffttext"
_ffft_rl: PUSH FP ; Preserve C environment.
LDI SP,FP
PUSH R4
PUSH R5
PUSH R6
PUSHF R6
PUSH R7
PUSHF R7
PUSH AR4
PUSH AR5
PUSH AR6
PUSH AR7
PUSH DP
LDP FFT_SIZE ; Init. DP pointer.
LDI *-FP(2),R0 ; Move arguments from stack.
STI R0,@FFT_SIZE
LDI *-FP(3),R0
STI R0,@LOG_SIZE
LDI *-FP(4),R0
STI R0,@SOURCE_ADDR
LDI *-FP(5),R0
STI R0,@DEST_ADDR
LDI *-FP(6),R0
STI R0,@SINE_TABLE
LDI *-FP(7),R0
STI R0,@BIT_REVERSE
;
; Check bit reversing mode (on or off).
;
; BIT_REVERSING = 0, then OFF
; (no bit reversing).
; BIT_REVERSING <> 0, Then ON.
;
LDI @BIT_REVERSE,R0
CMPI 0,R0
BZ MOVE_DATA
;
; Check bit reversing type.
;
; If SourceAddr = DestAddr, then in place
; bit reversing.
; If SourceAddr <> DestAddr, then
; standard bit reversing.
;
LDI @SOURCE_ADDR,R0
CMPI @DEST_ADDR,R0
BEQ IN_PLACE
;
; Bit reversing Type 1 (from source to
; destination).
;
; NOTE: abs(SOURCE_ADDR – DEST_ADDR)
; must be > FFT_SIZE, this is not
; checked.
;
LDI @FFT_SIZE,R0
SUBI 2,R0
LDI @FFT_SIZE,IR0
LSH -1,IR0 ; IR0 = half FFT size.
LDI @SOURCE_ADDR,AR0
LDI @DEST_ADDR,AR1
LDF *AR0++,R1
RPTS R0
LDF *AR0++,R1
|| STF R1,*AR1++(IR0)B
STF R1,*AR1++(IR0)B
BR START
;
; In-place bit reversing.
;
LDI @FFT_SIZE,RC
LSH -2,RC
SUBI 3,RC
LDI @DEST_ADDR,AR0
LDI AR0,AR1
LDI AR0,AR2
NOP *AR1++(IR0)B
NOP *AR2++(IR0)B
LDF *++AR0(IR1),R0
LDF *AR1,R1
CMPI AR1,AR0 ; Xchange locs only if AR0<AR1.
LDFGT R0,R1
LDFGT *AR1++(IR0)B,R1
RPTB BITRV1
LDF *++AR0(IR1),R0
|| STF R0,*AR0
LDF *AR1,R1
|| STF R1,*AR2++(IR0)B
CMPI AR1,AR0
LDFGT R0,R1
BITRV1: LDFGT *AR1++(IR0)B,R0
STF R0,*AR0
STF R1,*AR2
LDI @FFT_SIZE,RC
LSH -1,RC
LDI @DEST_ADDR,AR0
ADDI RC,AR0
ADDI 1,AR0
LDI AR0,AR1
LDI AR0,AR2
LSH -1,RC
SUBI 3,RC
NOP *AR1++(IR0)B
NOP *AR2++(IR0)B
LDF *++AR0(IR1),R0
LDF *AR1,R1
CMPI AR1,AR0 ; Xchange locs only if AR0<AR1.
LDFGT R0,R1
LDFGT *AR1++(IR0)B,R1
RPTB BITRV2
LDF *++AR0(IR1),R0
|| STF R0,*AR0
LDF *AR1,R1
|| STF R1,*AR2++(IR0)B
CMPI AR1,AR0
LDFGT R0,R1
BITRV2: LDFGT *AR1++(IR0)B,R0
STF R0,*AR0
STF R1,*AR2
LDI @FFT_SIZE,RC
LSH -1,RC
LDI RC,IR0
LDI @DEST_ADDR,AR0
LDI AR0,AR1
ADDI 1,AR0
ADDI IR0,AR1
LSH -1,RC
LDI RC,IR0
SUBI 2,RC
LDF *AR0,R0
LDF *AR1,R1
RPTB BITRV3
LDF *++AR0(IR1),R0
|| STF R0,*AR1++(IR0)B
BITRV3: LDF *AR1,R1
|| STF R1,*-AR0(IR1)
STF R0,*AR1
STF R1,*AR0
BR START
;
; Check data source locations.
;
; If SourceAddr = DestAddr, then
; do nothing.
; If SourceAddr <> DestAddr, then move
; data.
;
LDI @FFT_SIZE,R0
SUBI 2,R0
LDI @SOURCE_ADDR,AR0
LDI @DEST_ADDR,AR1
LDF *AR0++,R1
RPTS R0
LDF *AR0++,R1
|| STF R1,*AR1++
STF R1,*AR1
;
; Perform first and second FFT loops.
;
; AR1 I1 0 [X(I1) + X(I2)] + [X(I3) + X(I4)]
;
; AR2 I2 1 [X(I1) – X(I2)]
;
; AR3 I3 2 [X(I1) + X(I2)] – [X(I3) + X(I4)]
;
; AR4 I4 3 –[X(I3) – X(I4)]
;
; AR1 4
;
;
;
; Perform third FFT loop.
;
; Part A:
;
; AR1 I1 0 X(I1) + X(I3)
; 1
;
; I2 2
; 3
;
; AR2 I3 4 X(I1) – X(I3)
; 5
;
; AR3 I4 6 –X(I4)
;
; 7
;
; AR1 8
;
; 9
;
;
;
LDI @DEST_ADDR,AR1
LDI AR1,AR2
LDI AR1,AR3
ADDI 4,AR2
ADDI 6,AR3
LDI 8,IR0
LDI @FFT_SIZE,RC
LSH -3,RC
SUBI 2,RC
SUBF3 *AR2,*AR1,R1
ADDF3 *AR2,*AR1,R2
NEGF *AR3,R3
RPTB LOOP3_A
LDF *+AR2(IR0),R0 ; R0 = X(I3)
|| STF R2,*AR1++(IR0)
SUBF3 R0,*AR1,R1 ; R1 = X(I1) – X(I3)
|| STF R1,*AR2++(IR0) ;
ADDF3 R0,*AR1,R2 ; R2 = X(I1) + X(I3)
|| STF R3,*AR3++(IR0) ;
LOOP3_A: NEGF *AR3,R3 ; R3 = –X(I4)
;
STF R2,*AR1 ; X(I1)
STF R1,*AR2 ; X(I3)
STF R3,*AR3 ; X(I4)
;
; Part B:
;
;
;
; 0
; AR0 I1 1 X[I1] + [X(I3)*COS+ X(I4)*COS]
; 2
; AR1 I2 3 X[I1] – [X(I3)*COS+ X(I4)*COS]
; 4
; AR2 I3 5 –X[I2] – [X(I3)*COS– X(I4)*COS]
; 6
; AR3 I4 7 X[I2] – [X(I3)*COS– X(I4)*COS]
; 8
; AR0 9 NOTE: COS(2*pi/8) = SIN(2*pi/8)
LDI @FFT_SIZE,RC
LSH -3,RC
LDI RC,IR1
SUBI 3,RC
LDI 8,IR0
LDI @DEST_ADDR,AR0
LDI AR0,AR1
LDI AR0,AR2
LDI AR0,AR3
ADDI 1,AR0
ADDI 3,AR1
ADDI 5,AR2
ADDI 7,AR3
LDI @SINE_TABLE,AR7 ; Initialize table pointers.
LDF *++AR7(IR1),R7 ; R7 = COS(2*pi/8)
; *AR7 = COS(2*pi/8)
MPYF3 *AR7,*AR2,R0 ; R0 = X(I3)*COS
MPYF3 *AR3,R7,R1 ; R1 = X(I4)*COS
ADDF3 R0,R1,R2 ; R2 = [X(I3)*COS + X(I4)*COS]
MPYF3 *AR7,*+AR2(IR0),R0
|| SUBF3 R0,R1,R3 ; R3 = –[X(I3)*COS – X(I4)*COS]
SUBF3 *AR1,R3,R4 ; R4 = –X(I2) + R3
ADDF3 *AR1,R3,R4 ; R4 = X(I2) + R3
|| STF R4,*AR2++(IR0) ; X(I3)
SUBF3 R2,*AR0,R4 ; R4 = X(I1) – R2
|| STF R4,*AR3++(IR0) ; X(I4)
ADDF3 *AR0,R2,R4 ; R4 = X(I1) + R2
|| STF R4,*AR1++(IR0) ; X(I2)
;
RPTB LOOP3_B ;
MPYF3 *AR3,R7,R1 ;
|| STF R4,*AR0++(IR0) ; X(I1)
ADDF3 R0,R1,R2
MPYF3 *AR7,*+AR2(IR0),R0
|| SUBF3 R0,R1,R3
SUBF3 *AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2++(IR0)
SUBF3 R2,*AR0,R4
|| STF R4,*AR3++(IR0)
LOOP3_B: ADDF3 *AR0,R2,R4
|| STF R4,*AR1++(IR0)
MPYF3 *AR3,R7,R1
|| STF R4,*AR0++(IR0)
ADDF3 R0,R1,R2
SUBF3 R0,R1,R3
SUBF3 *AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2
SUBF3 R2,*AR0,R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
|| STF R4,*AR1
STF R4,*AR0
;
; Perform fourth FFT loop.
;
; Part A:
;
; AR1 I1 0 X(I1) + X(I3)
; 1
; 2
; 3
; I2 4
; 5
; 6
; 7
; AR2 I3 8 X(I1) – X(I3)
; 9
; 10
; 11
; AR3 I4 12 –X(I4)
; 13
; 14
; 15
; AR1 I5 16
; 17
;
;
;
;
LDI @DEST_ADDR,AR1
LDI AR1,AR2
LDI AR1,AR3
ADDI 8,AR2
ADDI 12,AR3
LDI 16,IR0
LDI @FFT_SIZE,RC
LSH -4,RC
SUBI 2,RC
SUBF3 *AR2,*AR1,R1
ADDF3 *AR2,*AR1,R2
NEGF *AR3,R3
RPTB LOOP4_A
LDF *+AR2(IR0),R0 ; R0 = X(I3)
|| STF R2,*AR1++(IR0)
SUBF3 R0,*AR1,R1 ; R1 = X(I1) – X(I3)
|| STF R1,*AR2++(IR0) ;
ADDF3 R0,*AR1,R2 ; R2 = X(I1) + X(I3)
|| STF R3,*AR3++(IR0) ;
LOOP4_A: NEGF *AR3,R3 ; R3 = –X(I4)
;
STF R2,*AR1 ; X(I1)
|| STF R1,*AR2 ; X(I3)
STF R3,*AR3 ; X(I4)
;
; Part B:
;
; 0
; AR0 I1 (3rd) 1 X[I1] + [X(I3)*COS+ X(I4)*SIN]
; I1 (2nd) 2 .
; I1 (1st) 3 .
; 4
; I2 (1st) 5 .
; I2 (2nd) 6 .
; AR1 I2 (3rd) 7 X[I1] – [X(I3)*COS+ X(I4)*SIN]
; 8
; AR2 I3 (3rd) 9 –X[I2] – [X(I3)*COS– X(I4)*COS]
; I3 (2nd) 10 .
; AR4 I3 (1st) 11 .
; 12
; I4 (1st) 13 .
; I4 (2nd) 14 .
; AR3 I4 (3rd) 15 X[I2] – [X(I3)*SIN– X(I4)*COS]
; 16
; AR0 17
;
;
LDI @FFT_SIZE,RC
LSH -4,RC
LDI RC,IR1
LDI 2,IR0
SUBI 3,RC
LDI @DEST_ADDR,AR0
LDI AR0,AR1
LDI AR0,AR2
LDI AR0,AR3
LDI AR0,AR4
ADDI 1,AR0
ADDI 7,AR1
ADDI 9,AR2
ADDI 15,AR3
ADDI 11,AR4
LDI @SINE_TABLE,AR7
LDF *++AR7(IR1),R7 ; R7 = SIN(1*[2*pi/16])
; *AR7 = COS(3*[2*pi/16])
LDI AR7,AR6
LDF *++AR6(IR1),R6 ; R6 = SIN(2*[2*pi/16])
; *AR6 = COS(2*[2*pi/16])
LDI AR6,AR5
LDF *++AR5(IR1),R5 ; R5 = SIN(3*[2*pi/16])
; *AR5 = COS(1*[2*pi/16])
LDI 16,IR1
|| STF R4,*AR2--
SUBF3 R2,*++AR0(IR0),R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
|| STF R4,*AR1
MPYF3 *++AR3,R6,R1
|| STF R4,*AR0
ADDF3 R0,R1,R2
MPYF3 *AR5,*-AR4(IR0),R0
|| SUBF3 R0,R1,R3
SUBF3 *++AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2
SUBF3 R2,*--AR0,R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
|| STF R4,*AR1
MPYF3 *--AR2,R7,R4
|| STF R4,*AR0
MPYF3 *++AR3,R7,R1
MPYF3 *AR5,*AR3,R0
|| ADDF3 R0,R1,R2
MPYF3 *AR7,*++AR4(IR1),R0
|| SUBF3 R4,R0,R3
SUBF3 *++AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2++(IR1)
SUBF3 R2,*--AR0,R4
|| STF R4,*AR3++(IR1)
LOOP4_B: ADDF3 *AR0,R2,R4
|| STF R4,*AR1++(IR1)
MPYF3 *++AR2(IR0),R5,R4
|| STF R4,*AR0++(IR1)
MPYF3 *--AR3(IR0),R5,R1
MPYF3 *AR7,*AR3,R0
|| ADDF3 R0,R1,R2
MPYF3 *AR6,*-AR4,R0
|| SUBF3 R4,R0,R3
SUBF3 *--AR1(IR0),R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2--
SUBF3 R2,*++AR0(IR0),R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
|| STF R4,*AR1
MPYF3 *++AR3,R6,R1
|| STF R4,*AR0
ADDF3 R0,R1,R2
MPYF3 *AR5,*-AR4(IR0),R0
|| SUBF3 R0,R1,R3
SUBF3 *++AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2
SUBF3 R2,*--AR0,R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
STF R4,*AR1
MPYF3 *--AR2,R7,R4
|| STF R4,*AR0
MPYF3 *++AR3,R7,R1
MPYF3 *AR5,*AR3,R0
|| ADDF3 R0,R1,R2
SUBF3 R4,R0,R3
SUBF3 *++AR1,R3,R4
ADDF3 *AR1,R3,R4
|| STF R4,*AR2
SUBF3 R2,*--AR0,R4
|| STF R4,*AR3
ADDF3 *AR0,R2,R4
STF R4,*AR1
STF R4,*AR0
;
; Perform remaining FFT loops (loop 4 onwards).
;
; LOOP
; 1st 2nd
;
; X’(I1) 0 0 X’(I1)+ X’(I3)
; AR1 X(I1) (1st) 1 1 X(I1) + [X(I3)*COS + X(I4)*SIN]
; X(I1) (2nd) 2 2 .
; X(I1) (3rd) 3 3 .
; .
; .
; A
; X’(I2) 8 16
; B .
; .
;
; X(I2) (3rd) 13 29 .
; X(I2) (2nd) 14 30 .
; AR2 X(I2) (1st) 15 31 X[I1] – [X(I3)*COS + X(I4)*SIN]
; X’(I3) 16 32 X’(I1)– X’(I3)
; AR3 X(I3) (1st) 17 33 –X[I2]– [X(I3)*SIN – X(I4)*COS]
; X(I3) (2nd) 18 34 .
; X(I3) (3rd) 19 35 .
; .
; .
; C
; X’(I4) 24 48 –X’(I4)
; D .
; .
;
; X(I4) (3rd) 29 61 .
; X(I4) (2nd) 30 62 .
; AR4 X(I4) (1st) 31 63 X[I2] – [X(I3)*SIN – X(I4)*COS]
; 32 64
; AR1 33 65
;
LDI @FFT_SIZE,IR0
LSH -2,IR0
STI IR0,@SEPARATION
LSH -2,IR0
LDI 5,R5
LDI 3,R7
LDI 16,R6
LDI @DEST_ADDR,AR5
LDI @DEST_ADDR,AR1
LSH -1,IR0
LSH 1,R7
LOOP: ADDI 1,R7
LSH 1,R6
LDI AR1,AR4
LDF *-AR0(IR1),R3
MPYF3 *AR4,R3,R4
|| STF R4,*AR1++
MPYF3 *AR3,R3,R1
MPYF3 *AR0,*AR3,R0
|| SUBF3 R1,R0,R3
LDI R6,IR1
ADDF3 R0,R4,R2
SUBF3 *AR2,R3,R4
ADDF3 *AR2,R3,R4
|| STF R4,*AR3++(IR1)
SUBF3 R2,*AR1,R4
|| STF R4,*AR4++(IR1)
ADDF3 *AR1,R2,R4
|| STF R4,*AR2++(IR1)
STF R4,*AR1++(IR1)
SUBI3 AR5,AR1,R0
CMPI @FFT_SIZE,R0
BLTD INLOP ; LOOP BACK TO THE INNER LOOP
LDI @SINE_TABLE,AR0 ; AR0 POINTS TO SIN/COS TABLE
LDI R7,IR1
LDI R7,RC
ADDI 1,R5
CMPI @LOG_SIZE,R5
BLED LOOP
LDI @DEST_ADDR,AR1
LSH -1,IR0
LSH 1,R7
;
; Return to C environment.
;
.end
*
* No more.
*
*****************************************************************************
*****************************************************************************
* VER DATE COMMENTS
* ––– –––––––––––– –––––––––––––––––––––––––––––––––––––––––––––––––––
* 1.0 18th Feb 92 Original release. Started from forward real FFT
* routine written by Alex Tessarolo, rev 2.0 .
*
*****************************************************************************
*
* SYNOPSIS: int ifft_rl( FFT_SIZE, LOG_SIZE, SOURCE_ADDR,
* DEST_ADDR, SINE_TABLE, BIT_REVERSE );
*
* int FFT_SIZE ; 64, 128, 256, 512, 1024, ...
* int LOG_SIZE ; 6, 7, 8, 9, 10, ...
* float *SOURCE_ADDR ; Points to where data is originated
* ; and operated on.
* float *DEST_ADDR ; Points to where data will be stored.
* float *SINE_TABLE ; Points to the SIN/COS table.
* int BIT_REVERSE ; = 0, bit reversing is disabled.
* ; <> 0, bit reversing is enabled.
*
* NOTE: 1) If SOURCE_ADDR = DEST_ADDR, then in place bit
* reversing is performed, if enabled (more
* processor intensive).
* 2) FFT_SIZE must be >= 64 (this is not checked).
*
*****************************************************************************
*
* REGISTERS USED: R0, R1, R2, R3, R4, R5, R6, R7
* AR0, AR1, AR2, AR3, AR4, AR5, AR6, AR7
* IR0, IR1
* RC, RS, RE
* DP
*
* MEMORY REQUIREMENTS: Program = 322 words (approximately)
* Data = 7 words
* Stack = 12 words
*
*****************************************************************************
*
* BENCHMARKS: Assumptions – Program in RAM0
* – Reserved data in RAM0
* – Stack on primary/expansion bus RAM
* – Sine/cosine tables in RAM0
* – Processing and data destination in RAM1
* – Primary/expansion bus RAM, 0 wait state
*
* FFT Size Bit Reversing Data Source Cycles(C30)
* –––––––– ––––––––––––– ––––––––––– –––––––––––
* 1024 OFF RAM1 25892 approx.
* Note: This number does not include the C callable overheads.
* Add 57 cycles for these overheads.
*****************************************************************************
FP .set AR3
;
; Initialize C Function.
;
.sect ".iffttext"
;
; Perform last FFT loops first (loop 2 onwards).
;
; LOOP
; 1st 2nd
;
; X’(I1) 0 0 X’(I1)+ X’(I3)
; AR1 X(I1) (1st) 1 1 X(I1) + [X(I2)
; X(I1) (2nd) 2 2 .
; X(I1) (3rd) 3 3 .
; .
; .
; A
; X’(I2) 8 16 X’(I2)* 2
; B .
; .
;
; X(I2) (3rd) 13 29 .
; X(I2) (2nd) 14 30 .
; AR2 X(I2) (1st) 15 31 X[I4] – [X(I3)
; X’(I3) 16 32 X’(I1)– X’(I3)
; AR3 X(I3) (1st) 17 33 [X(I1)–X(I2)]*COS–[X(I3)+X(I4)]*SIN
; X(I3) (2nd) 18 34 .
; X(I3) (3rd) 19 35 .
; .
; .
; C
; X’(I4) 24 48 –X’(I4)* 2
; D .
; .
;
; X(I4) (3rd) 29 61 .
; X(I4) (2nd) 30 62 .
; AR4 X(I4) (1st) 31 63 [X(I1)–X(I2)]*SIN+[X(I3)+X(I4)]*COS
; 32 64
; AR1 33 65
;
;
LDI AR1,AR2
ADDI 2,AR2 ; AR2 points at B.
ADDI R6,AR4
SUBI R7,AR4 ; AR4 points at D.
LDI AR4,AR3
SUBI 2,AR3 ; AR3 points at C.
LDI R7,IR1
LDI R7,RC
RPTB IN_BLK
SUBI3 AR5,AR1,R0
CMPI @FFT_SIZE,R0
BLTD INLOP ; Loop back to the inner loop
NOP *AR2++(IR1) ; Dummy
LDI R7,IR1
LDI R7,RC
ADDI 1,R5
CMPI @LOG_SIZE,R5 ; Next stage if any left
BLED LOOP
LDI @SOURCE_ADDR,AR1
LSH 1,IR0 ; Double step in sine table
LSH -1,R7
;
; Perform third FFT loop.
;Part A:
; AR1 I1 0 X (I1) + X(I3)
;
; 1
; AR2 I2 2 2 * X(I2)
;
; 3
; AR3 I3 4 X (I1) – X(I3)
;
; 5
; AR4 I4 6 –2 * X(I4)
;
; 7
; AR1 8
;
; 9
;
;
;
;
;
LDI @SOURCE_ADDR,AR1
LDI AR1,AR2
LDI AR1,AR3
LDI AR1,AR4
ADDI 2,AR2
ADDI 4,AR3
ADDI 6,AR4
LDI 8,IR0
LDI @FFT_SIZE,RC
LSH -3,RC
SUBI 1,RC
LDI @SINE_TABLE,AR0 ; AR0 points at SIN/COS table.
RPTB LOOP3_A
LDF *AR3,R3
ADDF3 R3,*AR1,R0 ; R0 = X’(I1) + X’(I3)
SUBF3 R3,*AR1,R1 ; R1 = X’(I1) – X’(I3)
LDF *AR4,R2 ;
|| STF R0,*AR1++(IR0) ; X’(I1)
MPYF -2.0,R2 ; R2 = –2*X’(I4)
LDF *AR2,R3 ;
|| STF R1,*AR3++(IR0) ; X’(I3)
MPYF 2.0,R3 ; R3 = 2*X’(I2)
LOOP3_A: STF R3,*AR2++(IR0) ; X’(I2)
|| STF R2,*AR4++(IR0) ; X’(I4)
;
; Part B:
;
; 0
; AR1 I1 1 X(I1) + X(I2)
; 2
; AR2 I2 3 X(I1) – X(I3)
; 4
; AR3 I3 5 [X(I1)– X(I2)]*COS– [X(I3)+ X(I4)]*SIN
; 6
; AR4 I4 7 [X(I1)– X(I2)]*SIN+ [X(I3)+ X(I4)]*COS
; 8
; AR1 9 NOTE: COS(2*pi/8) = SIN(2*pi/8)
;
;
;
LDI @SOURCE_ADDR,AR1
LDI AR1,AR2
LDI AR1,AR3
LDI AR1,AR4
ADDI 1,AR1
ADDI 3,AR2
ADDI 5,AR3
ADDI 7,AR4
LDI @SINE_TABLE,AR7 ; AR7 points at SIN/COS table.
LDI @FFT_SIZE,RC
LSH -3,RC
LDI RC,IR1
SUBI 2,RC
;
; Perform first and second FFT loops.
;
; AR1 I1 0 X(I1) + X(I3) + 2*X(I2)
; AR2 I2 1 X(I1) + X(I3) – 2*X(I2)
; AR3 I3 2 X(I1) – X(I3) – 2*X(I4)
; AR4 I4 3 X(I1) – X(I3) + 2*X(I4)
; AR1 4
;
;
;
LDI @SOURCE_ADDR,AR1
LDI AR1,AR2
LDI AR1,AR3
LDI AR1,AR4
ADDI 1,AR2
ADDI 2,AR3
ADDI 3,AR4
LDI 4,IR0
LDI @FFT_SIZE,RC
LSH -2,RC
SUBI 2,RC
;
; Check bit reversing mode (on or off).
;
; BIT_REVERSING = 0, then OFF (no bit reversing).
; BIT_REVERSING <> 0, then ON.
;
LDI @BIT_REVERSE,R0
CMPI 0,R0
BZ MOVE_DATA
;
; Check bit reversing type.
;
; If SourceAddr = DestAddr, then in place bit reversing.
; If SourceAddr <> DestAddr, then standard bit reversing.
;
LDI @SOURCE_ADDR,R0
CMPI @DEST_ADDR,R0
BEQ IN_PLACE
;
; Bit reversing type 1 (from source to destination).
;
; NOTE: abs(SOURCE_ADDR – DEST_ADDR) must be > FFT_SIZE, this is not checked.
;
LDI @FFT_SIZE,R0
SUBI 2,R0
LDI @FFT_SIZE,IR0
LSH -1,IR0 ; IR0 = half FFT size.
LDI @SOURCE_ADDR,AR0
LDI @DEST_ADDR,AR1
LDF *AR0++,R1
RPTS R0
LDF *AR0++,R1
|| STF R1,*AR1++(IR0)B
STF R1,*AR1++(IR0)B
BR DIVISION
;
; In-place bit reversing.
;
LDI @FFT_SIZE,RC
LSH -2,RC
SUBI 3,RC
LDI @DEST_ADDR,AR0
LDI AR0,AR1
LDI AR0,AR2
NOP *AR1++(IR0)B
NOP *AR2++(IR0)B
LDF *++AR0(IR1),R0
LDF *AR1,R1
CMPI AR1,AR0 ; Xchange locations only if AR0<AR1.
LDFGT R0,R1
LDFGT *AR1++(IR0)B,R1
RPTB BITRV1
LDF *++AR0(IR1),R0
|| STF R0,*AR0
LDF *AR1,R1
|| STF R1,*AR2++(IR0)B
CMPI AR1,AR0
LDFGT R0,R1
BITRV1: LDFGT *AR1++(IR0)B,R0
STF R0,*AR0
STF R1,*AR2
LDI @FFT_SIZE,RC
LSH -1,RC
LDI @DEST_ADDR,AR0
ADDI RC,AR0
ADDI 1,AR0
LDI AR0,AR1
LDI AR0,AR2
LSH -1,RC
SUBI 3,RC
NOP *AR1++(IR0)B
NOP *AR2++(IR0)B
LDF *++AR0(IR1),R0
LDF *AR1,R1
CMPI AR1,AR0 ; Xchange locations only if AR0<AR1.
LDFGT R0,R1
LDFGT *AR1++(IR0)B,R1
RPTB BITRV2
LDF *++AR0(IR1),R0
|| STF R0,*AR0
LDF *AR1,R1
|| STF R1,*AR2++(IR0)B
CMPI AR1,AR0
LDFGT R0,R1
BITRV2: LDFGT *AR1++(IR0)B,R0
STF R0,*AR0
STF R1,*AR2
LDI @FFT_SIZE,RC
LSH -1,RC
LDI RC,IR0
LDI @DEST_ADDR,AR0
LDI AR0,AR1
ADDI 1,AR0
ADDI IR0,AR1
LSH -1,RC
LDI RC,IR0
SUBI 2,RC
LDF *AR0,R0
LDF *AR1,R1
RPTB BITRV3
LDF *++AR0(IR1),R0
|| STF R0,*AR1++(IR0)B
BITRV3: LDF *AR1,R1
|| STF R1,*-AR0(IR1)
STF R0,*AR1
STF R1,*AR0
BR DIVISION
;
; Check data source locations.
;
; If SourceAddr =
; DestAddr, then do nothing.
; If SourceAddr <>
; DestAddr, then move data.
;
LDI @FFT_SIZE,R0
SUBI 2,R0
LDI @SOURCE_ADDR,AR0
LDI @DEST_ADDR,AR1
LDF *AR0++,R1
RPTS R0
LDF *AR0++,R1
|| STF R1,*AR1++
STF R1,*AR1
; Return to C environment.
;
.end
*
* No more.
*
*****************************************************************************
*
1 024† 39 500
1 024† 1.975
† This benchmark is based on the Meyer and Schwarz program found in Digital Signal Processing Applications With the TMS320
Family, Volume 3.
If H(z) is the transfer function of a digital filter that has only poles, then
A(z) = 1/H(z) is a filter that has only zeros; it is called the inverse filter. The
inverse lattice filter is shown in Figure 11–5. These equations describe the filter in
mathematical terms:
Initial conditions:
f (0,n) = b (0,n) = x (n)
Final conditions:
y (n) = f ( p,n)
In the above equations, f (i,n) is the forward error, b (i,n) is the backward error,
k (i ) is the i-th reflection coefficient, x (n) is the input, and y (n) is the output
signal. The order of the filter (that is, the number of stages) is p. In the linear
predictive coding (LPC) method of speech processing, the inverse lattice filter
is used during analysis, and the (forward) lattice filter during speech synthesis.
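A C sketch of one sample of the inverse lattice filter is given below. It assumes the standard lattice recurrences f(i,n) = f(i–1,n) + k(i) b(i–1,n–1) and b(i,n) = b(i–1,n–1) + k(i) f(i–1,n) for i = 1,...,p, together with the initial and final conditions stated above. The function name and the array layout (k[i–1] holds k(i), b_prev[i] holds b(i,n–1)) are choices made for this illustration.

```c
#include <stddef.h>

/* One output sample of a p-stage inverse (all-zero) lattice filter.
 * k[i-1]    : reflection coefficient k(i), i = 1..p
 * b_prev[i] : b(i,n-1) on entry, b(i,n) on return, i = 0..p-1 */
float lattice_inverse(const float *k, float *b_prev, size_t p, float x)
{
    float f = x;                         /* f(0,n) = x(n) */
    float b = x;                         /* b(0,n) = x(n) */
    for (size_t i = 1; i <= p; i++) {
        float b_del = b_prev[i - 1];     /* b(i-1,n-1) */
        b_prev[i - 1] = b;               /* keep b(i-1,n) for the next sample */
        b = b_del + k[i - 1] * f;        /* b(i,n), using f(i-1,n) */
        f = f + k[i - 1] * b_del;        /* f(i,n) */
    }
    return f;                            /* y(n) = f(p,n) */
}
```

For a single stage with k(1) = 0.5, a unit impulse produces the outputs 1 and 0.5, which corresponds to A(z) = 1 + 0.5 z^–1.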
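[Figure 11–5: inverse lattice filter. Stage gains K1, K2, ..., Kp appear on both cross branches; unit delays z^–1 act on the backward errors b(0,n), b(1,n), ..., b(p–1,n).]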
Figure 11–6 shows the data memory organization of the inverse lattice filter
on the TMS320C3x.
[Figure 11–6: two memory columns running from low to high address; the entries at the high address are k(p) and b(p –1, n –1).]
* load R2
* load AR0
* load AR1
* load RC
* CALL LATINV
*
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R2 | f(0,n) = x(n)
* AR0 | ADDRESS OF FILTER COEFFICIENTS (k(1))
* AR1 | ADDRESS OF BACKWARD PROPAGATION
* | VALUES (b(0,n-1))
* RC | RC = p - 2
*
.global LATINV
*
* i = 1
*
LATINV MPYF3 *AR0, *AR1, R0
[Figure: lattice filter. Branch gains –Kp, ..., –K2, –K1 and Kp, ..., K2, K1; unit delays z^–1; backward errors b(p–1,n), ..., b(2,n), b(1,n).]
Initial conditions:
f (p,n) = x (n), b (i,n – 1) = 0 for i = 1,...,p
Final conditions:
y (n) = f (0,n)
The data memory organization is identical to that of the inverse filter, as shown
in Figure 11–6 on page 11-126. Example 11–41 shows the implementation of
the lattice filter on the TMS320C3x.
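The recurrences that the comments in Example 11–41 describe, F(i–1,n) = F(i,n) – K(i) B(i–1,n–1) and B(i,n) = B(i–1,n–1) + K(i) F(i–1,n), can be sketched in C as follows. The function name and array layout (k[i–1] holds k(i), b_prev[i] holds b(i,n–1)) are assumptions made for this illustration, and the sketch also assumes b(0,n) = f(0,n), as in the usual all-pole lattice.

```c
#include <stddef.h>

/* One output sample of a p-stage (forward, all-pole) lattice filter.
 * k[i-1]    : reflection coefficient k(i), i = 1..p
 * b_prev[i] : b(i,n-1) on entry, b(i,n) on return, i = 0..p-1 */
float lattice_forward(const float *k, float *b_prev, size_t p, float x)
{
    float f = x;                                   /* f(p,n) = x(n) */
    for (size_t i = p; i >= 1; i--) {
        f -= k[i - 1] * b_prev[i - 1];             /* f(i-1,n) */
        if (i < p)                                 /* b(p,n) is not needed */
            b_prev[i] = b_prev[i - 1] + k[i - 1] * f;  /* b(i,n) */
    }
    if (p > 0)
        b_prev[0] = f;                             /* b(0,n) = f(0,n) */
    return f;                                      /* y(n) = f(0,n) */
}
```

Running the stages from p down to 1 lets the backward values be updated in place: each b_prev[i] is read as b(i,n–1) one stage before it is overwritten with b(i,n). For a single stage with k(1) = 0.5, a unit impulse yields 1, –0.5, 0.25, the impulse response of 1/(1 + 0.5 z^–1).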
* LOAD AR0
* LOAD AR1
* LOAD RC
* CALL LATICE
*
*
* ARGUMENT ASSIGNMENTS:
* ARGUMENT | FUNCTION
* ––––––––––+–––––––––––––––––––––––––––––––––––––
* R2 | F(P,N) = E(N) = EXCITATION
* AR0 | ADDRESS OF FILTER COEFFICIENTS (K(P))
* AR1 | ADDRESS OF BACKWARD PROPAGATION VALUES (B(P-1,N-1))
* IR0 | 3
* RC | RC = P - 3
*
* REGISTERS USED AS INPUT: R2, AR0, AR1, RC
* REGISTERS MODIFIED: R0, R1, R2, R3, RS, RE, RC, AR0, AR1
* REGISTER CONTAINING RESULT: R2 (f(0,n))
*
* STACK USAGE: NONE
*
* PROGRAM SIZE: 12 WORDS
*
* EXECUTION CYCLES: 15 + 3 * (P-2)
*
.global LATICE
*
*
LATICE MPYF3 *AR0,*AR1,R0
* ; K(P) * B(P-1,N-1) -> R0
; Assume F(P,N) -> R2
SUBF3 R0,R2,R2 ; F(P,N)-K(P)*B(P-1,N-1)
; = F(P-1,N) -> R2
|| MPYF3 *--AR0(1),*--AR1(1),R0
; K(P-1) * B(P-2,N-1) -> R0
SUBF3 R0,R2,R2 ; F(P-1,N)-K(P-1)*B(P-2,N-1)
; = F(P-2,N) -> R2
|| MPYF3 *--AR0(1),*--AR1(1),R0
; K(P-2) * B(P-3,N-1) -> R0
MPYF3 R2,*+AR0(1),R1 ; F(P-2,N) * K(P-1) -> R1
ADDF3 R1,*+AR1(1),R3 ; F(P-2,N) * K(P-1) + B(P-2,N-1)
; = B(P-1,N) -> R3
11-130
Programming Tips
4) If it doesn’t, identify the places where most of the execution time is spent.
When writing a C program, you can increase the execution speed by maximizing
the use of register variables. For more information, refer to the
TMS320C3x C Compiler Reference Guide.
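As a small illustration of this tip, the function below declares its accumulator and loop counter with the standard C register storage class. This is only a sketch: the function name is hypothetical, and how many register variables a compiler honors, and for which types, depends on the compiler, so the reference guide remains the authority.

```c
/* Dot product with the hot loop variables requested as register variables. */
float dot(const float *a, const float *b, int n)
{
    register float acc = 0.0f;   /* accumulator kept in a register if possible */
    register int i;              /* loop counter */
    for (i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}
```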
You must observe certain conventions when writing a C-callable routine.
These conventions are outlined in the Runtime Environment chapter of the
TMS320C3x C Compiler Reference Guide. Certain registers are saved by the
calling function, and others need to be saved by the called function. The C
compiler manual helps achieve a clean interface. The end result is the read-
ability and natural flow of a high-level language combined with the efficiency
and special-feature use of assembly language.
The LOPOWER instruction will slow down the H1/H3 clock by a factor of 16
during the read phase of the instruction. The MAXSPEED instruction will wake
the device from the low-power mode and return it to full frequency during
MAXSPEED’s read cycle. However, the H1/H3 clock may resume with the
phase opposite from before the clocks were shut down.
The IDLE2 instruction has the same functions that the IDLE instruction has,
except that the clock is stopped during the execute phase of the IDLE2 instruc-
tion. The clock pin will stop with H1 high and H3 low. The status of all of the
signals will remain the same as in the execute phase of the IDLE2 instruction.
In emulation mode, however, the clocks will continue to run, and IDLE2 will op-
erate identically to IDLE. The external interrupts INT(0–3) are the only signals
that start the processor up from the mode the device was in. Therefore, you
must enable the external interrupt before going to IDLE2 power-down mode.
(See Example 11–42.) If the proper external interrupt is not set up before
executing IDLE2 to power down, the only way to wake up the processor is with
a device RESET.
.sect "INTRPT"
RESET .word START ; Reset vector
INT0 .word INT0_ISR ; INT0 interrupt vector
INT1 .word INT1_ISR ; INT1 interrupt vector
INT2 .word INT2_ISR ; INT2 interrupt vector
INT3 .word INT3_ISR ; INT3 interrupt vector
: :
: :
.text
: :
: :
LDP @SP_ADR
LDI @SP_ADR,SP ; Set up stack pointer
OR 01h, IE ; Enable INT0
IDLE2 ; Set GIE = 1 and stop clock
: :
: :
: :
: :
INT0_ISR RETI ; Return to instruction after IDLE2
There will be one cycle of delay while waking up the processor from the IDLE2
power-down mode before the clocks start up. This adds one extra cycle from
the time the interrupt pad goes low until the interrupt is taken. The interrupt pad
needs to be low for at least two cycles. The clocks may start up in the phase
opposite from before the clocks were stopped.
Chapter 12
Hardware Applications
System Configuration Options Overview
[Figure: TMS320C3x external interfaces. Primary bus: data D31–D0, address A23–A0, control R/W, STRB, RDY. External DMA interface: HOLD, HOLDA. External interrupt interface: INT3–0, IACK. External flags: XF1–0. Timer interface: TCLK0, TCLK1. System reset: RESET. Master clock: X1, X2/CLKIN. System clock outputs: H1, H3. ROM enable (TMS320C30 only): MC/MP. Boot load enable (TMS320C31 only): MCBL/MP. Serial port 0: CLKX0, DX0, FSX0, CLKR0, DR0, FSR0. Serial port 1 (TMS320C30 only): CLKX1, DX1, FSX1, CLKR1, DR1, FSR1. Expansion bus (TMS320C30 only): data XD31–XD0, address XA12–XA0, control XR/W, XRDY, IOSTRB, MSTRB.]
All of the interfaces are independent of one another, and you can perform dif-
ferent operations simultaneously on each interface.
The primary and expansion buses implement the memory-mapped interface
to the device. The external direct memory access (DMA) interface allows ex-
ternal devices to cause the processor to relinquish the primary bus and allow
direct memory access.
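[Figure: possible system configurations. The TMS320C3x connects to peripherals through its buses, the external DMA interface, the interrupt interface, the timer interface, and the external flags. Devices shown include the TCM29C13 codec, the TLC3204x AIC (analog I/O), bit I/O devices, and clock and reset generators.]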
tion 15.1 in Table 13–13 on page 13-33):
$$t_{c(H)} - \left[\, t_{d(H1L-A)} + t_{su(D)R} \,\right]$$
For example, for full-speed, zero-wait-state interface to any device, the 60-ns
TMS320C3x requires a read access time of 30 ns from address stable to data
valid. Because for most memories access time from chip select is the same
as access time from address, it is theoretically possible to use 30-ns memories
at full speed with the TMS320C3x-33. This requires that there be no delays
between the processor and the memories. However, because of
interconnection delays and because some gating is normally required for chip-
select generation, this is usually not the case. Therefore, slightly faster memo-
ries are required in most systems.
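The access-time budget described above can be checked numerically. The sketch below is illustrative only: the function name and the `path_delay` parameter are hypothetical, and the 14-ns and 16-ns values are placeholders chosen so that they sum to the 30 ns implied by the text. Take the real values of t_d(H1L–A) and t_su(D)R from the timing tables.

```python
def max_memory_access_ns(t_c_h, t_d_h1l_a, t_su_d_r, path_delay=0.0):
    """Time available to the memory from address valid to data valid.

    t_c_h      : H1 cycle time, ns
    t_d_h1l_a  : delay from H1 low to address valid, ns (placeholder value below)
    t_su_d_r   : data setup time required for reads, ns (placeholder value below)
    path_delay : interconnect and chip-select gating delay, ns (hypothetical)
    """
    return t_c_h - (t_d_h1l_a + t_su_d_r) - path_delay


ideal_budget = max_memory_access_ns(60.0, 14.0, 16.0)
budget_with_gating = max_memory_access_ns(60.0, 14.0, 16.0, path_delay=5.0)
```

With 5 ns assumed for gating and interconnection, the budget drops from 30 ns to 25 ns, which is why slightly faster memories are normally chosen.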
Among currently available RAMs, there are two distinct categories of devices
with different interface characteristics:
- RAMs without output enable control lines (OE), which include the one-bit-
wide organized RAMs and most of the four-bit wide RAMs
- RAMs with OE controls, which include the byte-wide RAMs and a few of
the four-bit wide RAMs
Primary Bus Interface
Many of the fastest RAMs do not provide OE control; they use chip-select (CS)
controlled write cycles to ensure that data outputs do not turn on for write oper-
ations. In CS-controlled write cycles, the write control line (WE) goes low be-
fore CS goes low, and internal logic holds the outputs disabled until the cycle
is completed. Using CS-controlled write cycles is an efficient way to interface
fast RAMs without OE controls to the TMS320C30 at full speed.
In the case of RAMs with OE controls, using this signal can add flexibility to
many systems. Additionally, many of these devices can be interfaced by using
CS-controlled write cycles with OE tied low in the same manner as with RAMs
without OE controls. There are, however, two requirements for interfacing to
OE RAMs in this manner. First, the RAM’s OE input must be gated with chip
select and WE internally so that the device’s outputs do not turn on unless a
read is being performed. Second, the RAM must allow its address inputs to
change while WE is low; some RAMs specifically prohibit this.
[Figure: TMS320C3x interface to four CY7C186-25 RAMs. Primary address bus lines A12–A0 drive the A12–A0 inputs of each RAM, and the I/O7–I/O0 pins of each RAM drive one byte of the data bus (D31–D24, D23–D16, and so on). A23 is also shown as an input to the RAMs.]
In this circuit, the two chip selects on the RAM are driven by STRB and A23,
which are ANDed together internally. A23 locates the RAM at addresses
00000h through 03FFFh in external memory, and STRB establishes the CS-
controlled write cycle. The WE control input is then driven by the TMS320C3x
R/W signal, and the OE input is not used and is therefore connected to ground.
[Figure: timing diagram showing H1, A23–A0 (valid), CS1 = STRB, CS2, and D31–D0 (valid), with intervals t1 and t2]
During write operations, as shown in Figure 12–5, the RAM’s outputs do not
turn on at all, because of the use of the chip-select controlled write cycles. The
chip-select controlled write cycles are generated because R/W goes active
(low) before the STRB term of the chip-select input. Because the RAM’s output
drivers are disabled whenever the WE input is low (regardless of the state of
the OE input), bus conflicts with the TMS320C3x are automatically avoided
with this interface. The circuit’s data setup and hold times (t1 and t2 in the timing
diagram) of approximately 50 and 20 ns, respectively, also easily meet the
RAM’s timing requirements of 10 and 0 ns.
[Figure: timing diagram for write operations showing H1, A23–A0, CS1 = STRB, WE = R/W, and D31–D0, with intervals t1 and t2]
Note that the CY7C186’s OE control is gated internally with CS; therefore, the
RAM’s outputs are not enabled unless the device is selected. This is critical
if there are any other devices connected to the same bus; if there are no other
devices connected to the bus, OE need not be gated internally with chip select.
You can easily interface RAMs without OE controls to the TMS320C3x by us-
ing an approach similar to that used with RAMs with OE controls. If only one
bank of memory is implemented and no other devices are present on the bus,
the memories’ CS input can usually be connected to STRB directly. If several
devices must be selected, however, a gate is generally required to AND the
device select and STRB to drive the CS input to generate the chip-select con-
trolled write cycles. In either case, the WE input is driven by the TMS320C3x
R/W signal. Provided sufficiently fast gating is used, 25-ns RAMs can still be
used.
As with the case of RAMs with OE control lines, this approach works well if only
a few banks of memory are implemented where the chip-select decode can
be accomplished with only one level of gating. If many banks are required to
implement very large memory spaces, bank switching can be used to provide
for multiple bank select generation while still maintaining full-speed accesses
within each bank. Bank switching is discussed in detail in subsection 12.2.3.
When enabled, internally generated wait states affect all external cycles, re-
gardless of the address accessed. If different numbers of wait states are re-
quired for various external devices, the external RDY input may be used to tai-
lor wait-state generation to specific system requirements.
If the logical AND (electrical OR) of the wait count and external ready signals
is selected, the later of the two signals will control the internal ready signal, and
both signals must occur. Accordingly, external ready control must be imple-
mented for each wait-state device, and the wait count ready signal must be en-
abled.
If the logical OR (or electrical AND, since the signals are low true) of the exter-
nal and internal wait-count ready signals is selected, the earlier of the two sig-
nals will generate a ready condition and allow the cycle to be completed. Both
signals need not be present.
You can use the OR of the two ready signals if conditions occur that require
termination of bus cycles prior to the number of wait states implemented with
external logic. In this case, a shorter wait count is specified internally than the
number of wait states implemented with the external ready logic, and the bus
cycle is terminated after the wait count. This feature can also be a safeguard
against inadvertent accesses to nonexistent memory that would never re-
spond with ready and would therefore lock up the TMS320C3x.
If the OR of the two ready signals is used, however, and the internal wait-state
count is less than the number of wait states implemented externally, the exter-
nal ready generation logic must have the ability to reset its sequencing to allow
a new cycle to begin immediately following the end of the internal wait count.
This requires that, under these conditions, consecutive cycles be from inde-
pendently decoded areas of memory and that the external ready generation
logic be capable of restarting its sequence as soon as a new cycle begins.
Otherwise, the external ready generation logic might lose synchronization with
bus cycles and therefore generate improperly timed wait states.
Additionally, the AND of the two ready signals can extend the number of wait
states for devices that already have external ready logic implemented but re-
quire additional wait states under certain unique circumstances.
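The difference between the two ready-combination modes can be summarized with a small behavioral model. This is a sketch under stated assumptions, not a description of the device logic: the function name is hypothetical, wait states are counted in whole H1 cycles, and `None` stands for external logic that never asserts ready.

```python
def cycle_wait_states(internal_count, external_count, mode):
    """Number of wait states before the bus cycle completes.

    internal_count : wait states programmed in the internal generator
    external_count : wait states inserted by external ready logic,
                     or None if external ready is never asserted
    mode           : "AND" (both must occur) or "OR" (either suffices)
    Returns None when the cycle would never complete.
    """
    if mode == "AND":
        if external_count is None:
            return None               # processor waits indefinitely
        return max(internal_count, external_count)   # later signal controls
    if mode == "OR":
        if external_count is None:
            return internal_count     # internal count ends the cycle
        return min(internal_count, external_count)   # earlier signal controls
    raise ValueError("mode must be 'AND' or 'OR'")
```

For example, `cycle_wait_states(2, None, "OR")` returns 2, which models the safeguard against accesses to nonexistent memory, while the same access in "AND" mode never completes.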
- Logically ORing all of the separate ready timing signals together to con-
nect to the physical ready input
Once the region of address space being accessed has been established, a
timing circuit of some sort is normally used to provide a ready indication to the
processor at the appropriate point in the cycle to satisfy each device’s unique
requirements.
Finally, since indications of ready status from multiple devices are typically
present, the signals are logically ORed by using a single gate to drive the RDY
input.
Figure 12–6. Circuit for Generation of Zero, One, or Two Wait States for Multiple Devices
[Circuit diagram not reproduced. Parts shown: a 74ALS138 decoder (inputs A, B, and C from the TMS320C30 address bus; enables G1, G2A, and G2B; outputs Y0–Y7 used as device selects), a 74AS32, two 74AS20 gates that collect the select signals of the one-wait-state and two-wait-state devices, two 74ACT112 J-K flip-flops clocked by H1, and a 74AS21 that drives RDY. Other signals shown: STRB, A23, RESET, +5 V, and a 4.7-kΩ resistor.]
Example Circuit
In this circuit, full-speed devices drive ready directly through the ’74AS21, and
the two flip-flops delay wait-state devices’ select signals one or two H1 cycles
to provide one or two wait states.
With this circuit, devices requiring wait states might take up to 36 ns from a val-
id address on the TMS320C3x to provide inputs to the ’74AS20’s inputs. This
usually allows sufficient time for any decoding required in generating select
signals for slower devices in the system. For example, the 74ALS138, driven
by address and STRB, can generate select decodes in 22 ns, which easily
meets the TMS320C3x-33’s timing requirements.
With this circuit, unused inputs to either the 74AS20s or the 74AS21 should
be tied to a logic high level to prevent noise from generating spurious wait
states.
If more than two wait states are required by devices within a system, other ap-
proaches can be employed for ready generation. If between three and seven
wait states are required, additional flip-flops can be included in the same man-
ner shown in Figure 12–6, or internally generated wait states can be used in
conjunction with external hardware. If more than seven wait states are re-
quired, an external circuit using a counter may be used to supplement the ca-
pabilities of the internal wait-state generators.
When bank switching is enabled, any time a portion of the high order address
lines changes, as defined by the contents of the BNKCMPR register, STRB
goes high for one full H1 cycle. Provided STRB is included in chip-select de-
codes, this causes all devices to be disabled during this period. The next bank
of devices is not enabled until STRB goes low again.
In general, bank switching is not required during writes, because these cycles
always exhibit an inherent one-half H1 cycle setup of address information be-
fore STRB goes low. Thus, when you use bank switching for read/write de-
vices, a minimum of half of one H1 cycle of address setup is provided for all
accesses. Therefore, large amounts of memory can be implemented without
wait states or extra hardware required for isolation between banks. Also, note
that access time for cycles during bank switching is the same as that for cycles
without bank switching, and, accordingly, full-speed accesses can still be ac-
complished within each bank.
When you use bank switching to implement large multiple-bank memory sys-
tems, an important consideration is address line fanout. In addition to the
parametric specifications, which must be taken into account, AC characteristics
are also crucial in memory system design. With large memory arrays, which commonly
require large numbers of address line inputs to be driven in parallel, capacitive
loading of address outputs is often quite large. Because all TMS320C3x timing
specifications are guaranteed up to a capacitive load of 80 pF, driving greater
loads will invalidate guaranteed AC characteristics. Therefore, it is often nec-
essary to provide buffering for address lines when driving large memory ar-
rays. AC timings for buffer performance can then be derated according to man-
ufacturer specifications to accommodate a wide variety of memory array sizes.
The circuit shown in Figure 12–7 illustrates the use of bank switching with Cy-
press Semiconductor’s CY7C185 25-ns 8K × 8 CMOS static RAM. This circuit
implements 32K 32-bit words of memory with one-wait-state accesses within
each bank.
The wait state for this bank memory is generated by using the wait-state gener-
ator circuit presented in the previous section. Because A23 is the signal that
enables the entire bank memory system, the inverted version of this signal is
ANDed with STRB to derive a one-wait-state device select. This signal is then
connected in the circuit along with the other one-wait-state device selects.
Thus, any time a bank memory access is made, one wait state is generated.
Each of the four banks in this circuit is selected by using a decode of A15–A13
generated by the 74AS138 (see Figure 12–8). With the BNKCMPR register
set to 0Bh, the banks will be selected on even 8K-word boundaries starting at
location 080A000h in external memory space.
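The bank-compare behavior described above can be illustrated with a short model. It assumes, consistent with the text, that a BNKCMPR value of 0Bh causes the 11 most significant address bits (A23–A13) to be compared between consecutive accesses; the function name and the sample addresses are hypothetical.

```python
ADDRESS_BITS = 24   # primary bus address lines A23-A0


def crosses_bank(prev_addr, next_addr, bnkcmpr):
    """True if the compared high-order address bits differ between accesses.

    bnkcmpr is the number of most significant address bits compared;
    0 compares nothing, so no bank switch is ever detected.
    """
    shift = ADDRESS_BITS - bnkcmpr
    return (prev_addr >> shift) != (next_addr >> shift)


same_bank = crosses_bank(0x80A000, 0x80BFFF, 0x0B)
next_bank = crosses_bank(0x80BFFF, 0x80C000, 0x0B)
```

With 11 bits compared, the 13 remaining address bits span 8K words, which matches the 8K-word bank boundaries stated above.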
[Figure: bank-switched memory circuit. Two 74ALS2541 buffers drive the buffered address lines BA0–BA12 and BR/W from A0–A12 and R/W. A 74AS138 decodes A15–A13 into BANKSEL0–BANKSEL3, with A23 on an enable input. Banks 0–3 each connect to the 32-bit data bus D31–D0. Other parts and signals shown: 74AS04, STRB, BSTRB.]
The 74ALS2541 buffers used on the address lines are necessary in this design
because the total capacitive load presented to each address line is a maximum
of 16 × 10 pF or 160 pF (bank memory plus zero-wait-state static RAM), which
exceeds the TMS320C3x rated capacitive loading of 80 pF. Using the
manufacturer’s derating curves for these devices at a load of 80 pF (the load
presented by the bank memory) predicts propagation delays at the output of
the buffers of a maximum of 16 ns. The access time of a read cycle within a
bank of the memory is therefore the sum of the memory access time and the
maximum buffer propagation delay, or 25 + 16 = 41 ns, which, since it falls be-
tween 30 and 90 ns, requires one wait state on the TMS320C3x-33.
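The wait-state estimate above can be written as a short calculation. This sketch assumes that the zero-wait-state budget is 30 ns and that each wait state adds one 60-ns H1 cycle, which is inferred from the 30-ns and 90-ns bounds quoted in the text. The function name is hypothetical, and the model applies to the primary bus only.

```python
import math


def primary_bus_wait_states(access_ns, zero_wait_ns=30.0, cycle_ns=60.0):
    """Smallest number of wait states that covers the given access time."""
    if access_ns <= zero_wait_ns:
        return 0
    return math.ceil((access_ns - zero_wait_ns) / cycle_ns)


memory_access_ns = 25.0      # CY7C185 access time
buffer_delay_ns = 16.0       # derated 74ALS2541 propagation delay
bank_wait_states = primary_bus_wait_states(memory_access_ns + buffer_delay_ns)
```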
This circuit cannot be implemented without bank switching because the turn-on
and turn-off delays of the data outputs cause bus conflicts. Here, the propagation
delay of the 74AS138 is involved only during bank switches, when there is
sufficient time between cycles to allow new chip selects to be decoded.
The timing of this circuit for read operations using bank switching is shown in
Figure 12–9. With the BNKCMPR register set to 0Bh, when a bank switch oc-
curs, the bank address on address lines A23–A13 is updated during the extra
H1 cycle while STRB is high. Then, after chip-select decodes have stabilized
and the previously selected bank has disabled its outputs, STRB goes low for
the next read cycle. Further accesses occur at normal bus timings with one
wait state, as long as another bank switch is not necessary. Write cycles do
not require bank switching due to the inherent address setup provided in their
timings.
[Figure: read timing with bank switching, showing H1, A23–A13 (valid), A12–A0 (valid), STRB, BANKSEL0, BANKSEL1, and D31–D0 (bank 0 on bus, then bank 1 on bus), with intervals t1 through t6]
Expansion Bus Interface
Unlike the primary bus, both read and write cycles on the I/O portion of the ex-
pansion bus are two H1 cycles in duration and exhibit the same timing. The
XR/W signal is high for reads and low for writes. Since I/O accesses take two
cycles, many peripherals that require wait states if interfaced either to the pri-
mary bus or by using MSTRB can be used in a system without the need for wait
states. Specifically, in cases where there is only one device on the expansion
bus, devices with address access times greater than the 30 ns required by the
primary bus, but less than 59 ns, can be interfaced to the I/O bus of the
TMS320C30-33 without wait states.
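[Figure: A/D converter interface on the expansion bus. XA12 drives the converter's CS input. 74AS32 and 74AS04 gates combine IOSTRB and XR/W to produce IOW, which drives SC, and IOR, which drives OE. 74LS244 buffers connect the converter outputs to the XD bus. Other pins and values shown: VCC, VDD, +12 V, +5 V, REFOUT, REFIN, 50 Ω, 12/8, SYNC, EOCEN.]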
In this application, the converter’s chip select is driven by XA12, which maps
this device at 804000h in I/O address space. Conversions are initiated by writ-
ing any data value to the device, and the conversion results are obtained by
reading from the device after the conversion is completed. To generate the de-
vice’s start conversion (SC) and output enable (OE) inputs, IOSTRB is ANDed
with XR/W. Therefore, the converter is selected whenever XA12 is low; OE is
driven when reads are performed, while SC is driven when writes are per-
formed.
As with many A/D converters, at the end of a read cycle the AD1678 data out-
put lines enter a high-impedance state. This occurs after the output enable
(OE) or read control line goes inactive. Also common with these types of de-
vices is that the data output buffers often require a substantial amount of time
to actually attain a full high-impedance state. When used with the
TMS320C30-33, devices must have their outputs fully disabled no later than
65 ns following the rising edge of IOSTRB because the TMS320C30 will begin
driving the data bus at this point if the next cycle is a write. If this timing is not
met, bus conflicts between the TMS320C30 and the AD1678 might occur, po-
tentially causing degraded system performance and even failure due to dam-
aged data bus drivers. The actual disable time for the AD1678 can be as long
as 80 ns; therefore, buffers are required to isolate the converter outputs from
the TMS320C30. The buffers used here are 74LS244s that are enabled when
the AD1678 is read and turned off 30.8 ns following IOSTRB going high.
Therefore, the TMS320C30-33 requirement of 65 ns is met.
When data is read following a conversion, the AD1678 takes 100 ns after its
OE control line is asserted to provide valid data at its outputs. Thus, including
the propagation delay of the 74LS244 buffers, the total access time for reading
the converter is 118 ns. This requires two wait states on the TMS320C30-33
expansion I/O bus.
The two wait states required in this case are implemented using software wait
states; however, depending on the overall system configuration, it might be
necessary to implement a separate wait-state generator for the expansion bus
(refer to subsection 12.2.2 on page 12-9). This would be the case if multiple
devices that required different numbers of wait states were connected to the
expansion bus.
Figure 12–11 shows the timing for read operations between the
TMS320C30-33 and the AD1678. At the beginning of the cycle, the address
and XR/W lines become valid t1 = 10 ns following the falling edge of H1. Then,
after t2 = 10 ns from the next rising edge of H1, IOSTRB goes low, beginning
the active portion of the read cycle. After t3 = 5.8 ns (the control logic propaga-
tion delay), the IOR signal goes low, asserting the OE input to the AD1678. The
’74LS244 buffers take t4 = 30 ns to enable their outputs, and then, following
the converter's access delay and the buffer propagation delay (t5 = 100 + 18
= 118 ns), data is provided to the TMS320C30. This provides approximately
46 ns of data setup before the rising edge of IOSTRB. Therefore, this design
easily satisfies the TMS320C30-33’s requirement of 15 ns of data setup time
for reads.
[Figure: read timing between the TMS320C30-33 and the AD1678, showing H1, XA12–XA0, IOSTRB, IOR, and read data, with intervals t1 through t5]
Unlike the primary bus, read and write cycles on the I/O expansion bus are
timed the same with the exception that XR/W is high for reads and low for
writes and that the data bus is driven by the TMS320C30 during writes. When
writing to the AD1678, the ’74LS244 buffers do not turn on and no data is trans-
ferred. The purpose of writing to the converter is only to generate a pulse on
the converter’s SC input, which initiates a conversion cycle. When a conver-
sion cycle is completed, the AD1678’s EOC output is used to generate an inter-
rupt on the TMS320C30 to indicate that the converted data can be read.
As with A/D converters, D/A converters are also available in a number of vari-
eties. One of the major distinctions between various types of D/A converters
is whether or not the converter includes both latches to store the digital value
to be converted to an analog quantity, and the interface to control those
latches. With latches and control logic included with the converter, interface
design is often simplified; however, internal latches are often included only in
slower D/A converters.
Because slower converters limit signal bandwidths, the converter used in this
design was selected to allow a reasonably wide range of signal frequencies
to be processed, and to illustrate the technique of interfacing to a converter that
uses external data latches.
[Figure: D/A converter interface on the expansion bus. Two 74LS377 latches (U25 and U26) capture XD0–XD7 and XD8–XD11 from the XD bus and drive the AD565A inputs, from bit 12 (LSB) to bit 1 (MSB). The latch enable inputs are driven from XA12 and the clock inputs from IOW. An LM318 amplifier produces the analog output DACOUT. Other pins and values shown: VCC, VEE, +12 V, –12 V, REF OUT, REF IN, REF GND, 20 V SPAN, 10 V SPAN, AGND, power GND, 50 Ω, 10 pF, 2.4 K.]
The external data latches used in this interface are ’74LS377 devices that have
both clock and enable inputs. These latches serve as a convenient interface
with the TMS320C30; the enable inputs provide a device select function, and
the clock inputs latch the data. Therefore, with the enable input driven by in-
verted XA12 and the clock input by IOW, which is the AND of IOSTRB and
XR/W, data will be stored in the latches when a write is performed to I/O ad-
dress 805000h. Reading this address has no effect on the circuit.
Figure 12–13 shows a timing diagram of a write operation to the D/A converter
latches.
[Figure: timing of a write to the D/A converter latches, showing H1, XA12–XA0, XA12, IOSTRB, IOW, and XD31–XD0, with intervals t1 through t6]
Because the write is actually being performed to the latches, the key timings
for this operation are the timing requirements for these devices. For proper op-
eration, these latches require simply a minimal setup and hold time of data and
control signals with respect to the rising edge of the clock input. Specifically,
the latches require a data setup time of 20 ns, enable setup of 25 ns, disable
setup of 10 ns, and data and enable hold times of 5 ns. This design provides
approximately 60 ns of enable setup, 30 ns of data setup, and 7.2 ns of data
hold time. Therefore, the setup and hold times provided by this design are well
in excess of those required by the latches. The key timing parameters for this
interface are summarized in Table 12–2.
Table 12–2. Key Timing Parameter for D/A Converter Write Operation
Time Interval    Event                          Time Period†
t1               H1 falling to address valid    10 ns
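The setup and hold comparison for the '74LS377 latches can be tabulated with a short script. The numbers are the ones quoted in the text above; the dictionary keys and the function name are hypothetical.

```python
# Requirements of the latches and values provided by the design, in ns,
# as quoted in the text.
required_ns = {"data setup": 20.0, "enable setup": 25.0, "data hold": 5.0}
provided_ns = {"data setup": 30.0, "enable setup": 60.0, "data hold": 7.2}


def timing_margins(provided, required):
    """Margin (provided minus required) for each parameter, rounded to 0.1 ns."""
    return {name: round(provided[name] - required[name], 1) for name in required}


margins_ns = timing_margins(provided_ns, required_ns)
all_met = all(margin >= 0 for margin in margins_ns.values())
```

A negative margin for any entry would indicate that the latch requirement is not met by the design.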
System Control Functions
[Figure: oscillator circuit connected to the TMS320C3x X2/CLKIN and X1 pins. Component labels shown: 13 MHz, 15 pF, 15 pF, 10 µH.]
$$\omega_P = \frac{1}{\sqrt{LC}} \qquad (4)$$
At frequencies significantly lower than ωP, the 1/(ωC) term in (3) becomes the
dominating term, while ωL can be neglected. This is expressed as
In (5), the LC circuit appears conductive at frequencies lower than ωP. On the
other hand, at frequencies much higher than ωP, the ωL term is the dominant
term in (3), and 1/(ωC) can be neglected. This is expressed as
[Figure: magnitude of the LC circuit impedance |z(ω)| versus ω (rad/s), with ωP = 1/√(LC) marked on the frequency axis]
Based on the discussion above, the design of the LC circuit proceeds as fol-
lows:
The reset input controls initialization of internal TMS320C3x logic and also
causes execution of the system initialization software. For proper system ini-
tialization, the reset signal must be applied for at least ten H1 cycles, i.e., 600
ns for a TMS320C3x operating at 33.33 MHz. Upon power-up, however, it can
take 20 ms or more before the system oscillator reaches a stable operating
state. Therefore, the power-up reset circuit should generate a low pulse on the
reset line for 100 to 200 ms. Once a proper reset pulse has been applied, the
processor fetches the reset vector from location 0, which contains the address
of the system initialization routine. Figure 12–16 shows a circuit that will
generate an appropriate power-up reset signal.
[Figure 12–16: reset circuit. The R1C1 network is connected between +5 V and DGND and drives the TMS320C3x reset pin; C1 = 4.7 µF.]
The voltage on the reset pin (RESET) is controlled by the R1C1 network. After
a reset, this voltage rises exponentially according to the time constant R1C1,
as shown in Figure 12–17.
[Figure 12–17: voltage on the reset pin versus time. The curve V = VCC(1 – e^(–t/τ)) starts at t0 = 0, reaches V1 at t1, and approaches VCC.]
The duration of the low pulse on the reset pin is approximately t1, which is the
time it takes for the capacitor C1 to be charged to 1.5 V. This is approximately
the voltage at which the reset input switches from a logic 0 to a logic 1. The
capacitor voltage is expressed as
$$V = V_{CC}\left[1 - e^{-t/\tau}\right] \qquad (7)$$
where τ = R1C1 is the reset circuit time constant. Solving equation (7) for t
results in
$$t = -R_1 C_1 \ln\left[1 - \frac{V}{V_{CC}}\right] \qquad (8)$$
Setting
R1 = 100 kΩ
C1 = 4.7 µF
VCC = 5 V
V = V1 = 1.5 V
results in t = 167 ms. Therefore, the reset circuit of Figure 12–16 provides a
low pulse of long enough duration to ensure the stabilization of the system os-
cillator.
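The 167-ms figure can be reproduced directly from equation (8). The following sketch uses the component values given above; the function name is hypothetical.

```python
import math


def reset_low_time_s(r1_ohms, c1_farads, v_threshold, v_cc):
    """Time for the R1C1 network to charge from 0 V to v_threshold (equation 8)."""
    return -r1_ohms * c1_farads * math.log(1.0 - v_threshold / v_cc)


t1_s = reset_low_time_s(100e3, 4.7e-6, 1.5, 5.0)   # about 0.168 s
min_reset_s = 10 * 60e-9                            # ten H1 cycles of 60 ns
```

The result lies inside the recommended 100-ms to 200-ms window and is far longer than the 600-ns minimum reset pulse.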
Four serial port modes on the TLC32044 allow direct interface to TMS320C3x
processors. When the transmit and receive sections of the AIC are operating
synchronously, it can interface to two SN54299 or SN74299 serial-to-parallel
shift registers. These shift registers can then interface in parallel to the
TMS320C30, to other TMS320 digital processors, or to external FIFO circuitry.
Output data pulses inform the processor that data transmission is complete or
allow the DSP to differentiate between two transmitted bytes. A flexible control
scheme is provided so that the functions of the AIC can be selected and ad-
justed coincidentally with signal processing via software control. Refer to the
TLC32044 data sheet for detailed information.
When you interface the AIC to the TMS320C3x via one of the serial ports, no
additional logic is required. This interface is shown in Figure 12–18. The serial
data, control, and clock signals connect directly between the two devices, and
the AIC’s master clock input is driven from TCLK0, one of the TMS320C3x’s
internal timer outputs. The AIC’s WORD/BYTE input is pulled high, selecting
16-bit serial port transfers to optimize serial port data transfer rate. The
TMS320C3x’s XF0 pin, configured as an output, is connected to the AIC’s re-
set (RST) input to allow the AIC to be reset by the TMS320C3x under program
control. This allows the TMS320C3x timer and serial port to be initialized be-
fore beginning conversions on the AIC.
Serial-Port Interface
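[Figure: TMS320C30 to TLC32044 interface. Connections: FSX0 to FSX, DX0 to DX, FSR0 to FSR, DR0 to DR, CLKX0 and CLKR0 to SHIFT CLK, TCLK0 to MSTR CLK, and XF0 to RST. WORD/BYTE is tied to +5 V. Analog pins IN+, IN–, OUT+, and OUT–, the supply pins VDD, VCC+, and VCC–, and the grounds AGND and DGND are also shown.]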
To provide the master clock input for the AIC, the TCLK0 timer is configured
to generate a clock signal with a 50% duty cycle at a frequency of f(H1)/4 or
4.167 MHz. To accomplish this, the global control register for timer 0 is set to
the value 3C1h, which establishes the desired operating modes. The period
register for timer 0 is set to 1, which sets the required division ratio for the H1
clock.
To properly communicate with the AIC, the TMS320C30 serial port must be
configured appropriately by initializing several TMS320C30 registers and
memory locations. First, reset the serial port by setting the serial port global
control register to 2170300h. (The AIC should also be reset at this time. See
description below of resetting the AIC via XF0.) This resets the serial port logic,
configures the serial port operating modes, including data transfer lengths,
and enables the serial port interrupts. This also configures another important
aspect of serial port operation: polarity of serial port signals. Because active
polarity of all serial port signals is programmable, it is critical to set appropriate-
ly the bits in the serial port global control register that control the polarity. In this
application, all polarities are set to positive except FSX and FSR, which are
driven by the AIC and are true low.
The serial port transmit and receive control registers must also be initialized
for proper serial port operation. In this application, both of these registers are
set to 111h, which configures all of the serial port pins in the serial port mode,
rather than the general-purpose digital I/O mode.
When the operations described above are completed, interrupts are enabled,
and, provided that the serial port interrupt vector(s) are properly loaded, serial
port transfers can begin after the serial port is taken out of reset. You can do
this by loading E170300h into the serial port global control register.
To begin conversion operations on the AIC and subsequent transfers of data
on the serial port, first reset the AIC by setting XF0 to 0 at the beginning of the
TMS320C3x initialization routine. Set XF0 to 0 by setting the TMS320C3x IOF
register to 2. This sets the AIC to a default configuration and halts serial port
transfers and conversion operations until reset is set high. Once the
TMS320C3x serial port and timer have been initialized as described above,
set XF0 high by setting the IOF register to 6. This allows the AIC to begin oper-
ating in its default configuration, which in this application is the desired mode.
In this mode, all internal filtering is enabled, sample rate is set at approximately
6.4 kHz, and the transmit and receive sections of the device are configured to
operate synchronously. This mode of operation is appropriate for a variety of
applications; if a 5.184-MHz master clock input is used, the default configura-
tion results in an 8-kHz sample rate, which makes this device ideal for speech
and telecommunications applications.
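The IOF values and the sample-rate figures in the paragraph above can be cross-checked with a small Python sketch. Two assumptions are made: that bit 1 of IOF selects XF0 as an output and bit 2 holds the level driven on it (the bit positions are not given in this section), and that the 4.167-MHz figure quoted earlier is the AIC master clock. The function names are hypothetical.

```python
def xf0_state(iof):
    """Decode the XF0 fields of an IOF value (bit positions are assumed)."""
    is_output = bool(iof & 0x2)   # assumed I/O-select bit: 1 = XF0 is an output
    level = (iof >> 2) & 1        # assumed output-data bit: level driven on XF0
    return is_output, level


def default_sample_rate(master_clock_hz):
    """Scale the default 8 kHz at 5.184 MHz ratio to another master clock."""
    clocks_per_sample = 5_184_000 // 8_000   # 648, derived from the text
    return master_clock_hz / clocks_per_sample


print(xf0_state(0x2), xf0_state(0x6))         # AIC held in reset, then released
print(round(default_sample_rate(4_167_000)))  # close to the quoted 6.4 kHz
```

Under these assumptions, IOF = 2 drives XF0 low as an output and IOF = 6 drives it high, and a 4.167-MHz master clock gives a default sample rate of roughly 6.43 kHz.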
In addition to the benefit of a convenient default operating configuration, the
AIC can also be programmed for a wide variety of other operating configura-
tions. Sample rates and filter characteristics can be varied, and numerous con-
nections in the device can be configured to establish different internal architec-
tures by enabling or disabling various functional blocks.
To configure the AIC in a fashion different from the default state, you must first
send the device a serial data word with the two LSBs set to 1. The two LSBs
of a transmitted data word are not part of the transferred data information and
are not set to 1 during normal operation. This condition indicates that the next
serial transmission will contain secondary control information, not data. This
information is then used to load various internal registers and specify internal
configuration options. Four different types of secondary control words are dis-
tinguished by the state of the two LSBs of the transferred control information.
Note that each transferred secondary control word must be preceded by a data
word with the two LSBs set to 1.
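The two-word handshake described above can be sketched in Python as follows. This is an illustration only: it assumes 16-bit serial words with the sample in the upper 14 bits, and the function names are hypothetical. The contents of the control word itself are not modeled.

```python
SECONDARY_REQUEST = 0b11   # two LSBs set to 1: the next word is control data


def frame_control_word(sample, control_word):
    """Return the two 16-bit words needed to send one secondary control word."""
    # Assumption: the sample occupies the upper 14 bits of the data word.
    request = ((sample << 2) | SECONDARY_REQUEST) & 0xFFFF
    return [request, control_word & 0xFFFF]


def is_secondary_request(word):
    """True if a data word announces that a secondary control word follows."""
    return (word & 0b11) == SECONDARY_REQUEST


words = frame_control_word(0x0123, 0xABCD)
print([hex(w) for w in words], is_secondary_request(words[0]))
```

Sending several control words means repeating the pair each time, because every secondary control word must be preceded by its own data word with both LSBs set.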
The TMS320C3x can communicate with the AIC either synchronously or
asynchronously, depending on the information in the control register. The op-
erating sequence for synchronous communication with the TMS320C30
shown in Figure 12–19 is as follows:
1) The FSX or FSR pin is brought low.
2) One 16-bit word is transmitted, or one 16-bit word is received.
3) The FSX or FSR pin is brought high.
4) The E0DX or E0DR pin emits a low-going pulse.
Serial-Port Interface
(Timing diagram not reproduced; signals labeled: SHIFT CLK, FSR, FSX, E0DR, E0DX.)
The execution of the IDLE2 instruction causes the H1 and H3 processor clocks
to be held at a constant level until the occurrence of an external interrupt. To
use the TMS320C31 IDLE2 power management feature effectively, interrupts
must be generated with or without the presence of the H1 clock. For normal
(non-IDLE2) operation, however, the interrupt inputs must be synchronized
with the falling edge of the H1 clock. An interrupt must satisfy the following
conditions:
- It must meet the setup time on the falling edge of H1, and
- It must be at least one cycle and less than two cycles in duration.
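The duration condition in the list above can be written as a one-line check. This Python sketch covers only the pulse-width rule, not the setup time, and the 60-ns H1 period used in the example is a hypothetical value chosen for illustration.

```python
def pulse_width_ok(width_ns, h1_period_ns):
    """True if the pulse lasts at least one and less than two H1 cycles."""
    return h1_period_ns <= width_ns < 2 * h1_period_ns


H1_PERIOD_NS = 60.0   # hypothetical H1 period, for illustration only
for width in (45.0, 60.0, 90.0, 120.0):
    print(width, pulse_width_ok(width, H1_PERIOD_NS))
```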
For an interrupt to be recognized during IDLE2 operation and turn the clocks
back on, it must first be held low for one H1 cycle. The logic in Figure 12–21
can be used to generate an interrupt signal to the TMS320C31 with the correct
timing during non-IDLE2 and IDLE2 operation. Figure 12–21 shows the inter-
rupt circuit, which uses a 16R4 PLD to generate the appropriate interrupt sig-
nal.
Figure 12–21. Interrupt Generation Circuit for Use With IDLE2 Operation
TMS320C31 TIBPAL16R4
Interrupt
INTx Source 2 12
H1 CLK
Example 12–1 shows the PLD equations for the 16R4 using the ABEL lan-
guage. This implementation makes the following assumptions regarding the
interrupt source:
Low-Power-Mode Interrupt Interface
Notice that the interrupt is driven active as soon as the interrupt source goes
active. It goes inactive again on detection of two H3 rising edges. These two
rising edges ensure that the interrupt is recognized during normal operation
and after the end of IDLE2 operation (when the clocks turn on again). The
interrupt goes inactive after the two H3 clocks are counted and does not go
active again until the interrupt source has gone inactive and then returned to
active.
Example 12–1. State Machine and Equations for the Interrupt Generation 16R4 PLD
MODULE INTERRUPT_GENERATION
TITLE 'INTERRUPT_GENERATION FOR IDLE2 AND NON-IDLE2 TMS320C31A TMS320C31'
c3xu5 device 'P16R4';
"inputs
h3 Pin 1;
intsrc_ Pin 2; "Interrupt source
"output
intx_ Pin 12; "Interrupt input signal to the TMS320C31
sync_src_ Pin 14; "Internal signal used to synchronize the
"input to the H1 clock
same_ Pin 15; "Keeps track if the new interrupt source
"has occurred. If active, no new interrupt
"has occurred.
"This logic makes the following assumptions:
"The duration of the interrupt source is at least one H1
"cycle in duration. It takes one H1 cycle to turn the H1
"clock on again.
"The interrupt source is pulse- or level-triggered. If the
"source stays active after being asserted, it is regarded
"as the same interrupt request and not a new one.
c,H,L,X = .C.,1,0,.X.;
source = !intsrc_;
sync = !sync_src_;
samesrc = !same_;
c3xint = !intx_;
"state bits
outstate = [samesrc,sync];
idle = ^b00;
sync_st = ^b01; "synchronize state
wait = ^b10; "wait for interrupt source to go inactive
state_diagram outstate
state idle:
if (source) then sync_st
else idle;
state sync_st:
if (source) then wait
else idle;
state wait:
if (source) then wait
else idle;
equations
!intx_ = (source # sync) & !samesrc;
@page
"Test interrupt generation logic
test_vectors
([h3, source] -> [outstate,c3xint])
[ c, L ] -> [idle, L ]; "check start from idle
[ L, H ] -> [idle, H ]; "test normal interrupt operation
[ c, H ] -> [sync_st, H ];
[ c, L ] -> [idle, L ];
[ c, L ] -> [idle, L ];
[ L, H ] -> [idle, H ]; "test coming out of idle2 operation
[ L, H ] -> [idle, H ];
[ c, H ] -> [sync_st, H ];
[ c, L ] -> [idle, L ];
[ c, H ] -> [sync_st, H ]; "test same source
[ c, H ] -> [wait, L ];
[ c, H ] -> [wait, L ];
[ c, L ] -> [idle, L ];
[ L, H ] -> [idle, H ]; "test idle2 operation
[ L, H ] -> [idle, H ];
[ L, H ] -> [idle, H ];
end interrupt_generation
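The behavior encoded in Example 12–1 can be checked on a host with a small Python model. This is a sketch, not a replacement for the ABEL source: it treats all signals as active-high logical values, models only the three named states, and uses hypothetical function names.

```python
IDLE, SYNC_ST, WAIT = "idle", "sync_st", "wait"


def next_state(state, source):
    """State after a rising H3 edge (mirrors the state_diagram section)."""
    if not source:
        return IDLE
    return SYNC_ST if state == IDLE else WAIT


def c3xint(state, source):
    """Combinational output: (source # sync) & !samesrc."""
    sync = state == SYNC_ST
    samesrc = state == WAIT
    return bool((source or sync) and not samesrc)


def run(vectors, state=IDLE):
    """Apply (clocked, source) pairs; return (state, c3xint) after each."""
    trace = []
    for clocked, source in vectors:
        if clocked:
            state = next_state(state, source)
        trace.append((state, c3xint(state, source)))
    return trace


# The "test same source" vectors: the source stays active for three H3 edges.
for step in run([(True, True), (True, True), (True, True), (True, False)]):
    print(step)
```

With the source held active, the model asserts the interrupt through the first H3 edge, drops it on the second, and keeps it low until the source has gone inactive, which matches the test vectors in the example.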
XDS Target Design Considerations
Figure 12–23 shows a portion of logic in the emulator pod. Note that 33-Ω re-
sistors have been added to the EMU0, EMU1, and EMU2 lines; this minimizes
cable reflections.
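(Figure 12–23 not reproduced. Labels recoverable from the drawing: devices
74LVT240, 74F175, 74AS1004, and TL7705A (RESIN); 33-Ω resistors on EMU1 (Pin 1),
EMU0 (Pin 2), and EMU2 (Pin 3); 180-Ω and 270-Ω resistors, +5 V, and jumpers JP1
and JP2; EMU3 (Pin 9); H3 (Pin 11); PD (VCC Pin 7); a 100-Ω resistor;
GND (Pins 2, 4, 6, 8, 10, 12).)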
Figure 12–25. Signals Between the Emulator and the ’C3x With No Signals Buffered
(distance: 2 inches or less)
Figure 12–26. Signals Between the Emulator and the ’C3x With Transmission Signals
Buffered (distance: 2 to 6 inches)
- All signals buffered. The distance between the emulation header and the
’C3x is greater than 6 inches but less than 12 inches. All ’C3x emulation
signals, EMU0, EMU1, EMU2, EMU3, and H3, are buffered through the
same package. (See Figure 12–27.)
H3 Buffer Restrictions
(Mechanical dimension drawing not reproduced.)
Note: All dimensions are in inches and are nominal unless otherwise specified.
22 kΩ
TBC 22 kΩ 22 kΩ C3x
TMS0 EMU0
TMS1 EMU1
TD0 EMU2
TCKO EMU4
TCKI H1 (Clock)
TDI0 EMU3
TDI1 EMU5
TMS2/EVNT0 EMU6
TMS3/EVNT1
TMS4/EVNT2
TMS5/EVNT3
Notes: 1) In a ’C3x design, the TBC can connect to only one ’C3x device.
2) The ’C3x device’s H1 clock drives TCKI on the TBC. This is different from the
emulation header connections where H3 is used.
Chapter 13
Pinout and Pin Assignments
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
H3 D2 D3 D7 D10 D13 D16 D17 D19 D22 D25 D28 XA0 XA1 XA5
A
X2/CLKIN CVSS H1 D4 D8 D11 D15 D18 D20 D24 D27 D31 XA4 IVSS XA6
B
EMU5 X1 DVSS D0 D5 D9 D14 VSS D21 D26 D30 XA3 DVSS XA7 XA10
C
XR/W XRDY VBBP DDVDD D1 D6 D12 VDD D23 D29 XA2 ADVDD XA9 XA11 MC/MP
D
RDY HOLDA MSTRB VSUBS LOCATOR DDVDD XA8 XA12 EMU3 EMU1
E
RESET STRB HOLD IOSTRB EMU4/SHZ EMU2 EMU0 A0
F
IACK XF0 XF1 R/W A1 A2 A3 A4
G
INT1 INT0 VSS VDD MDVDD TMS320C30 ADVDD VDD VSS A6 A5
Top View
H
INT2 INT3 RSV0 RSV1 A11 A9 A8 A7
J
RSV2 RSV3 RSV5 RSV7 A17 A14 A12 A10
K
RSV4 RSV6 RSV9 CLKR1 IODVDD A22 A18 A15 A13
L
RSV8 RSV10 FSR1 PDVDD CLKX0 EMU6 XD5 VDD XD16 XD22 XD27 IODVDD A21 A19 A16
M
DR1 CLKX1 DVSS CLKR0 TCLK1 XD2 XD7 VSS XD14 XD19 XD23 XD28 DVSS A23 A20
N
FSX1 DX1 FSR0 TCLK0 XD1 XD4 XD8 XD10 XD13 XD17 XD20 XD24 XD29 CVSS XD31
P
DR0 FSX0 DX0 XD0 XD3 XD6 XD9 XD11 XD12 XD15 XD18 XD21 XD25 XD26 XD30
XA5 XA1 XA0 D28 D25 D22 D19 D17 D16 D13 D10 D7 D3 D2 H3
A
XA6 IVSS XA4 D31 D27 D24 D20 D18 D15 D11 D8 D4 H1 CVSS X2/CLKIN
B
XA10 XA7 DVSS XA3 D30 D26 D21 VSS D14 D9 D5 D0 DVSS X1 EMU5
C
MC/MP XA11 XA9 ADVDD XA2 D29 D23 VDD D12 D6 D1 DDVDD VBBP XRDY XR/W
D
EMU1 EMU3 XA12 XA8 DDVDD LOCATOR VSUBS MSTRB HOLDA RDY
E
A0 EMU0 EMU2 EMU4/SHZ IOSTRB HOLD STRB RESET
F
A4 A3 A2 A1 R/W XF1 XF0 IACK
G
A5 A6 VSS VDD ADVDD TMS320C30 MDVDD VDD VSS INT0 INT1
Bottom View H
A7 A8 A9 A11 RSV1 RSV0 INT3 INT2
J
A10 A12 A14 A17 RSV7 RSV5 RSV3 RSV2
K
A13 A15 A18 A22 IODVDD CLKR1 RSV9 RSV6 RSV4
L
A16 A19 A21 IODVDD XD27 XD22 XD16 VDD XD5 EMU6 CLKX0 PDVDD FSR1 RSV10 RSV8
M
A20 A23 DVSS XD28 XD23 XD19 XD14 VSS XD7 XD2 TCLK1 CLKR0 DVSS CLKX1 DR1
N
XD31 CVSS XD29 XD24 XD20 XD17 XD13 XD10 XD8 XD4 XD1 TCLK0 FSR0 DX1 FSX1
P
XD30 XD26 XD25 XD21 XD18 XD15 XD12 XD11 XD9 XD6 XD3 XD0 DX0 FSX0 DR0
(Package outline dimensions and the pinout diagram for the 208-pin package are
not reproduced; the signal-to-pin assignments appear in the table that follows
Figure 13–5.)
Figure 13–5. TMS320C30 PPM 208-Pin Plastic Quad Flat Pack—PQL Package
30,7 (1.209)
30,5 (1.201) SQ
156 105
157 104
0,28 (0.01102)
0,18 (0.00709)
0,20 (0.008)
0,12 (0.005)
208 53
1 52 3,6 (0.142)
3,4 (0.134)
28,1 (1.106)
SQ
27,9 (1.098)
0,60 (0.024)
0,40 (0.016)
Signal Pin Signal Pin Signal Pin Signal Pin Signal Pin
A0 139 D6 197 EMU0 140 RSV3 37 XA10 148
A1 138 D7 196 EMU1 141 RSV4 38 XA11 147
A2 137 D8 195 EMU2 142 RSV5 39 XA12 146
A3 136 D9 194 EMU3 143 RSV6 40 XD0 64
A4 135 D10 193 EMU4/SHZ 144 RSV7 41 XD1 65
A5 134 D11 192 EMU5 9 RSV8 42 XD2 66
A6 129 D12 191 EMU6 63 RSV9 43 XD3 69
A7 128 D13 190 FSR0 56 RSV10 44 XD4 70
A8 127 D14 189 FSR1 46 R/W 20 XD5 71
A9 126 D15 188 FSX0 59 STRB 19 XD6 72
A10 125 D16 187 FSX1 49 TCLK0 61 XD7 73
A11 124 D17 186 H1 204 TCLK1 62 XD8 74
A12 123 D18 180 H3 205 VBBP 8 XD9 75
A13 122 D19 179 HOLD 15 VDD 26 XD10 76
A14 119 D20 178 HOLDA 14 VDD 27 XD11 82
A15 118 D21 177 IACK 24 VDD 77 XD12 83
A16 117 D22 176 INT0 25 VDD 78 XD13 84
A17 116 D23 175 INT1 31 VDD 130 XD14 85
A18 115 D24 174 INT2 32 VDD 131 XD15 86
A19 114 D25 173 INT3 33 VDD 181 XD16 87
A20 113 D26 170 IODVDD 67 VDD 182 XD17 88
A21 112 D27 169 IODVDD 68 VSS 29 XD18 89
A22 111 D28 168 IODVDD 102 VSS 30 XD19 90
A23 110 D29 167 IODVDD 103 VSS 80 XD20 91
ADVDD 120 D30 166 IVSS 153 VSS 81 XD21 92
ADVDD 121 D31 165 IVSS 154 VSS 132 XD22 93
ADVDD 157 DDVDD 171 MC/MP 145 VSS 133 XD23 94
ADVDD 158 DDVDD 172 MDVDD 16 VSS 184 XD24 95
CLKR0 57 DDVDD 206 MDVDD 17 VSS 185 XD25 96
CLKR1 47 DDVDD 207 MSTRB 11 VSUBS 7 XD26 97
CLKX0 58 DR0 55 NC 28 X1 6 XD27 98
CLKX1 48 DR1 45 NC 79 X2/CLKIN 5 XD28 99
CVSS 3 DVSS 1 NC 104 XA0 164 XD29 100
CVSS 4 DVSS 2 NC 183 XA1 163 XD30 101
CVSS 107 DVSS 51 NC 208 XA2 162 XD31 109
CVSS 108 DVSS 52 PDVDD 53 XA3 161 XF0 23
D0 203 DVSS 105 PDVDD 54 XA4 160 XF1 22
D1 202 DVSS 106 RDY 18 XA5 159 XRDY 10
D2 201 DVSS 155 RESET 21 XA6 152 XR/W 13
D3 200 DVSS 156 RSV0 34 XA7 151 XSTRB 12
D4 199 DX0 60 RSV1 35 XA8 150
D5 198 DX1 50 RSV2 36 XA9 149
† ADVDD, CVSS, DDVDD, DVSS, IODVDD, IVSS, MDVDD, PDVDD, VDD, and VSS pins are on a common plane internal to the
device.
(Pinout diagram and package outline drawing for the 132-pin package are not
reproduced; the pin assignments appear in the two tables that follow.)
Signal Pin Signal Pin Signal Pin Signal Pin Signal Pin
A0 29 D4 76 EMU0 124 VDD 40 VSS 84
A1 28 D5 75 EMU1 125 VDD 49 VSS 85
A2 27 D6 74 EMU2 126 VDD 59 VSS 86
A3 26 D7 73 EMU3 123 VDD 65 VSS 101
A4 25 D8 68 FSR0 110 VDD 66 VSS 102
A5 24 D9 67 FSX0 114 VDD 74 VSS 109
A6 23 D10 64 H1 81 VDD 83 VSS 113
A7 22 D11 63 H3 82 VDD 91 VSS 117
A8 21 D12 62 HOLD 90 VDD 97 VSS 119
A9 20 D13 60 HOLDA 89 VDD 104 VSS 128
A10 19 D14 58 IACK 99 VDD 105 X1 88
A11 18 D15 56 INT0 100 VDD 115 X2/CLKIN 87
A12 17 D16 55 INT1 103 VDD 121 XF0 96
A13 16 D17 54 INT2 106 VDD 131 XF1 98
A14 15 D18 53 INT3 107 VDD 132
A15 14 D19 52 MCBL/MP 127 VSS 3
A16 13 D20 50 RDY 92 VSS 4
A17 12 D21 48 RESET 95 VSS 17
A18 11 D22 47 R/W 94 VSS 19
A19 10 D23 46 SHZ 118 VSS 30
A20 9 D24 45 STRB 93 VSS 35
A21 8 D25 44 TCLK0 120 VSS 36
A22 7 D26 43 TCLK1 122 VSS 37
A23 6 D27 41 VSS 42
CLKR0 5 D28 39 VSS 51
CLKX0 4 D29 38 VDD 6 VSS 57
D0 3 D30 34 VDD 15 VSS 61
D1 2 D31 31 VDD 24 VSS 69
D2 1 DR0 108 VDD 32 VSS 70
D3 130 DX0 116 VDD 33 VSS 71
† VDD and VSS pins are on a common plane internal to the device.
Pin Signal Pin Signal Pin Signal Pin Signal Pin Signal
1 A21 31 D31 61 VSS 91 VDD 121 VDD
2 A20 32 VDD 62 D12 92 RDY 122 TCLK1
3 VSS 33 VDD 63 D11 93 STRB 123 EMU3
4 VSS 34 D30 64 D10 94 R/W 124 EMU0
5 A19 35 VSS 65 VDD 95 RESET 125 EMU1
6 VDD 36 VSS 66 VDD 96 XF0 126 EMU2
7 A18 37 VSS 67 D9 97 VDD 127 MCBL/MP
8 A17 38 D29 68 D8 98 XF1 128 VSS
9 A16 39 D28 69 VSS 99 IACK 129 A23
10 A15 40 VDD 70 VSS 100 INT0 130 A22
11 A14 41 D27 71 VSS 101 VSS 131 VDD
12 A13 42 VSS 72 D7 102 VSS 132 VDD
13 A12 43 D26 73 D6 103 INT1
14 A11 44 D25 74 VDD 104 VDD
15 VDD 45 D24 75 D5 105 VDD
16 A10 46 D23 76 D4 106 INT2
17 VSS 47 D22 77 D3 107 INT3
18 A9 48 D21 78 D2 108 DR0
19 VSS 49 VDD 79 D1 109 VSS
20 A8 50 D20 80 D0 110 FSR0
21 A7 51 VSS 81 H1 111 CLKR0
22 A6 52 D19 82 H3 112 CLKX0
23 A5 53 D18 83 VDD 113 VSS
24 VDD 54 D17 84 VSS 114 FSX0
25 A4 55 D16 85 VSS 115 VDD
26 A3 56 D15 86 VSS 116 DX0
27 A2 57 VSS 87 X2/CLKIN 117 VSS
28 A1 58 D14 88 X1 118 SHZ
29 A0 59 VDD 89 HOLDA 119 VSS
30 VSS 60 D13 90 HOLD 120 TCLK0
† VDD and VSS pins are on a common plane internal to the device.
Signal Descriptions
XF1, XF0 2 I/O/Z External flag pins. They are used as general- S R
purpose I/O pins or to support interlocked pro-
cessor instructions.
XF1, XF0 2 I/O/Z External flag pins. These are used as general- S R
purpose I/O pins or to support interlocked pro-
cessor instructions.
Reserved (4 Pins) ¶
EMU3 1 O Reserved.
† Input (I), output (O), high-impedance state (Z)
‡ S = SHZ active, H = HOLD active, R = RESET active
§ The recommended decoupling capacitor value is 0.1 µF.
¶ Follow the connections specified for the reserved pins. 18- to 22-kΩ pull-up resistors are recommended. All +5-volt supply pins
must be connected to a common supply plane, and all ground pins must be connected to a common ground plane.
Electrical Specifications
’C30/’C31 ’LC31-33
Electrical Characteristic Min Nom‡ Max Min Nom‡ Max Unit
VOH High-level output voltage (VDD = Min, IOH = Max) 2.4 3 2.0 V
CLKIN 25 25
(Test load circuit diagram not reproduced; labels shown: tester pin electronics,
IOL, IOH, VLoad, CT, output under test.)
Signal Transition Levels
1.0 V
0.6 V
10%
0.8 V
- For a high-to-low transition on an input signal, the level at which the input
is said to be no longer high is 2.0 volts, and the level at which the input is
said to be low is 0.8 volt.
- For a low-to-high transition on an input signal, the level at which the input
is said to be no longer low is 0.8 volt, and the level at which the input is said
to be high is 2.0 volts.
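The two input thresholds above divide the voltage range into three regions, which can be expressed as a small Python helper. This is an illustrative sketch; the constant and function names are hypothetical.

```python
V_HIGH = 2.0   # volts: at or above this level the input is said to be high
V_LOW = 0.8    # volts: at or below this level the input is said to be low


def classify_input(volts):
    """Classify an input voltage using the transition levels quoted above."""
    if volts >= V_HIGH:
        return "high"
    if volts <= V_LOW:
        return "low"
    return "in transition"   # no longer low, not yet high (or the reverse)


for v in (0.4, 0.8, 1.4, 2.0, 3.3):
    print(v, classify_input(v))
```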
13.5 Timing
Table 13–12 defines the timing parameters for the X2/CLKIN, H1, and H3 in-
terface signals. The numbers shown in parentheses in Figure 13–11 and
Figure 13–12 correspond with those in the No. column of Table 13–12. Refer
to the RESET timing in Figure 13–23 on page 13-48 for CLKIN to H1/H3 delay
specification.
(Figure 13–11 and Figure 13–12, the timing diagrams for X2/CLKIN and for H1 and
H3, are not reproduced.)
(Two (M)STRB timing diagrams not reproduced; signals shown: H3, H1, (M)STRB,
(X)R/W, (X)A, (X)D, (X)RDY.)
Table 13–14 defines memory read timing parameters for IOSTRB. The num-
bers shown in parentheses in Figure 13–15 and Figure 13–16 correspond
with those in the No. column of Table 13–14 and Table 13–15.
(Figure 13–15 and Figure 13–16, the IOSTRB timing diagrams, are not reproduced;
signals shown: H3, H1, IOSTRB, (X)R/W, (X)A, (X)D, (X)RDY.)
Table 13–15 defines memory write timing parameters for IOSTRB. The num-
bers shown in parentheses in Figure 13–15 and Figure 13–16 correspond
with those in the No. column of Table 13–14 and Table 13–15.
Table 13–16. Timing Parameters for XF0 and XF1 When Executing LDFI or LDII
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) td(H3H–XF0L) H3 high to XF0 low delay 19 15 13 12 ns
Figure 13–17. Timing for XF0 and XF1 When Executing LDFI or LDII
Fetch
LDFI or LDII Decode Read Execute
H3
H1
(M)STRB
(X)R/W
(X)A
(X)D
(X)RDY
(1)
(3)
XF1 Pin
Table 13–17 defines the timing parameters for the XF0 and XF1 pins during
execution of STFI or STII. The number shown in parentheses in Figure 13–18
corresponds with the number in the No. column of Table 13–17.
Table 13–17. Timing Parameters for XF0 When Executing STFI or STII
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) td(H3H–XF0H) H3 high to XF0 high delay 19 15 13 12 ns
XF0 is always set high at the beginning of the execute phase of the interlock
store instruction. When no pipeline conflicts occur, the address of the store is
also driven at the beginning of the execute phase of the interlock store instruc-
tion. However, if a pipeline conflict prevents the store from executing, the ad-
dress of the store will not be driven until the store can execute.
Fetch
STFI or STII Decode Read Execute
H3
H1
(M)STRB
(X)R/W
(X)A
(X)D
(X)RDY (1)
XF0 Pin
Table 13–18. Timing Parameters for XF0 and XF1 When Executing SIGI
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) td(H3H–XF0L) H3 high to XF0 low delay 19 15 13 12 ns
Figure 13–19. Timing for XF0 and XF1 When Executing SIGI
Fetch
SIGI Decode Read Execute
H3
H1
(1)
(3) (2)
XF0
(4)
XF1
Table 13–19. Timing Parameters for Loading the XF Register When Configured as an Output
Pin
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) tv(H3H–XF) H3 high to XF valid 19 15 13 12 ns
Figure 13–20. Timing for Loading XF Register When Configured as an Output Pin
Fetch Load
Instruction Decode Read Execute
H3
H1
OUTXF
1 or 0
Bit
(1)
XF Pin
H1
(2)
IOXF
Bit (3)
(1)
XF Pin Output
H3
H1
IOXF
Bit
(1)
XF Pin
(Three plots of CLKIN to H1/H3 delay (ns) versus case temperature (°C) are not
reproduced. All are for 4.75 V ≤ VDD ≤ 5.25 V: one for the TMS320C30-33; one for
the TMS320C31-27, TMS320C31-33, TMS320C31-33 (extended temperature), and
TMS320C30-40, with the extended temperature range marked; and one for the
TMS320C31-50.)
H3
H1
SHZ
(1) (2)
All I/O Pins
Note: Enabling SHZ destroys TMS320C3x register and memory contents. Assert SHZ = 1 and reset the TMS320C3x to restore
it to a known condition.
The interrupt (INT) pins are asynchronous inputs that can be asserted at any
time during a clock cycle. The TMS320C3x interrupts are level-sensitive, not
edge-sensitive. Interrupts are detected on the falling edge of H1. Therefore,
interrupts must be set up and held to the falling edge of H1 for proper detection.
The CPU and DMA respond to detected interrupts on instruction fetch bound-
aries only.
For the processor to recognize only one interrupt on a given input, an interrupt
pulse must be set up and held to:
The TMS320C3x can accept an interrupt from the same source every two H1
clock cycles.
If the specified timings are met, the exact sequence shown in Figure 13–28 will
occur; otherwise, an additional delay of one clock cycle is possible.
H3
H1
(1)
INT3 – INT0
Pin
(2)
INT3 – INT0
Flag
Vector First
ADDR Address Instruction
Address
Data
H3
H1
(1)
(2)
IACK
ADDR
Data
Table 13–27 defines the serial port timing parameters for eight ’C3x devices.
The numbers shown in parentheses in Figure 13–30 and Figure 13–31 corre-
spond with those in the No. column of Table 13–27.
(1) (2)
H1
(1)
(3)
(3)
CLKX/R
(5)
(4)
(6) (15)
(8)
FSX(EXT)
(11)
(12)
Notes: 1) Timing diagrams show operations with CLKXP = CLKRP = FSXP = FSRP = 0.
2) Timing diagrams depend on the length of the serial port word, where n = 8, 16, 24, or 32 bits, respectively.
(10)
DR Bit n-1 Bit n-2 Bit n-3
(7) (8)
Notes: 1) Timing diagrams show operation with CLKXP = CLKRP = FSXP = FSRP = 0.
2) Timing diagrams depend on the length of the serial port word, where n = 8, 16, 24, or 32 bits, respectively.
3) The timings that are not specified expressly for the variable data rate mode are the same as those that are specified
for the fixed data rate mode.
Table 13–28 defines the timing parameters for the HOLD and HOLDA signals.
The numbers shown in parentheses in Figure 13–32 correspond with those in
the No. column of Table 13–28.
The NOHOLD bit of the primary bus control register (see subsection 7.1.1 on
page 7-3) overrides the HOLD signal. When this bit is set, the device comes
out of hold and prevents future hold cycles.
Asserting HOLD prevents the processor from accessing the primary bus. Pro-
gram execution continues until a read from or a write to the primary bus is re-
quested. In certain circumstances, the first write will be pending, thus allowing
the processor to continue until a second write is encountered.
H3
H1
(1) (1)
(4)
HOLD
(3) (3)
(6)
HOLDA
(7) (8) (9)
STRB
(11)
(10)
R/W
(12) (13)
A
(16)
D Write Data
Note: HOLDA will go low in response to HOLD going low and will continue to remain low until one H1 cycle after HOLD goes
back high, as shown in Figure 13–32.
H3
H1
(2) (3)
(1) (3)
Peripheral
Pin
Table 13–30 and Table 13–31 show the timing parameters for changing the
peripheral pin from a general-purpose output pin to a general-purpose input
pin and vice versa. The numbers shown in parentheses in Figure 13–34 and
Figure 13–35 correspond to those shown in the No. column of Table 13–30
and Table 13–31, respectively.
Table 13–30. Timing Parameters for Peripheral Pin Changing From General-Purpose Output
to Input Mode
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) th(H3H) Hold after H1 high 19 15 13 10 ns
Table 13–31. Timing Parameters for Peripheral Pin Changing From General-Purpose Input to
Output Mode
’C30-33
’C30-27 ’C31-33 ’C30-40
’C31-27 ’LC31 ’C31-40 ’C31-50
No. Name Description Min Max Min Max Min Max Min Max Unit
(1) td(GPIOH1H) H1 high to peripheral pin 19 15 13 10 ns
switching from input to out-
put delay
Figure 13–34. Timing for Change of Peripheral Pin From General-Purpose Output to
Input Mode
H3
H1
IO (2)
Control Bit (3)
(1)
Peripheral
Output
Pin
Data Bit
Data
Sampled Data
Seen
Figure 13–35. Timing for Change of Peripheral Pin From General-Purpose Input to
Output Mode
Execution of Store
of Peripheral Control
Register
H3
H1
IO
Control
Bit
(1)
Peripheral
Pin
Table 13–32 and Table 13–33 define the timing parameters for the timer pin.
The numbers shown in parentheses in Figure 13–36 correspond with those in
the No. column of Table 13–32 and Table 13–33.
H3
H1
(2) (3)
(1) (3)
Peripheral
Pin
(5)
(4)
Instruction Opcodes
The opcode fields for all TMS320C3x instructions are shown in Table A–1. Bits
in the table marked with a hyphen are defined in the individual instruction de-
scriptions (see Chapter 10). Table A–1, along with the instruction descriptions,
fully defines the instruction words. The opcodes are listed in numerical order.
Note that an undefined operation may occur if an illegal opcode is executed.
INSTRUCTION 31 30 29 28 27 26 25 24 23
NOP 0 0 0 0 1 1 0 0 1
NORM 0 0 0 0 1 1 0 1 0
NOT 0 0 0 0 1 1 0 1 1
POP 0 0 0 0 1 1 1 0 0
POPF 0 0 0 0 1 1 1 0 1
PUSH 0 0 0 0 1 1 1 1 0
PUSHF 0 0 0 0 1 1 1 1 1
OR 0 0 0 1 0 0 0 0 0
RND 0 0 0 1 0 0 0 1 0
ROL 0 0 0 1 0 0 0 1 1
ROLC 0 0 0 1 0 0 1 0 0
ROR 0 0 0 1 0 0 1 0 1
RORC 0 0 0 1 0 0 1 1 0
RPTS 0 0 0 1 0 0 1 1 1
STF 0 0 0 1 0 1 0 0 0
STFI 0 0 0 1 0 1 0 0 1
STI 0 0 0 1 0 1 0 1 0
STII 0 0 0 1 0 1 0 1 1
SIGI 0 0 0 1 0 1 1 0 0
SUBB 0 0 0 1 0 1 1 0 1
SUBC 0 0 0 1 0 1 1 1 0
SUBF 0 0 0 1 0 1 1 1 1
SUBI 0 0 0 1 1 0 0 0 0
SUBRB 0 0 0 1 1 0 0 0 1
SUBRF 0 0 0 1 1 0 0 1 0
SUBRI 0 0 0 1 1 0 0 1 1
TSTB 0 0 0 1 1 0 1 0 0
XOR 0 0 0 1 1 0 1 0 1
IACK 0 0 0 1 1 0 1 1 0
ADDC3 0 0 1 0 0 0 0 0 0
ADDF3 0 0 1 0 0 0 0 0 1
ADDI3 0 0 1 0 0 0 0 1 0
AND3 0 0 1 0 0 0 0 1 1
ANDN3 0 0 1 0 0 0 1 0 0
ASH3 0 0 1 0 0 0 1 0 1
CMPF3 0 0 1 0 0 0 1 1 0
CMPI3 0 0 1 0 0 0 1 1 1
LSH3 0 0 1 0 0 1 0 0 0
MPYF3 0 0 1 0 0 1 0 0 1
MPYI3 0 0 1 0 0 1 0 1 0
OR3 0 0 1 0 0 1 0 1 1
SUBB3 0 0 1 0 0 1 1 0 0
SUBF3 0 0 1 0 0 1 1 0 1
SUBI3 0 0 1 0 0 1 1 1 0
TSTB3 0 0 1 0 0 1 1 1 1
XOR3 0 0 1 0 1 0 0 0 0
LDFcond 0 1 0 0 – – – – –
LDIcond 0 1 0 1 – – – – –
BR(D)† 0 1 1 0 0 0 0 – –
CALL 0 1 1 0 0 0 1 – –
RPTB 0 1 1 0 0 1 0 – –
SWI 0 1 1 0 0 1 1 – –
Bcond(D)† 0 1 1 0 1 0 – – –
DBcond(D)† 0 1 1 0 1 1 – – –
CALLcond 0 1 1 1 0 0 – – –
TRAPcond 0 1 1 1 0 1 0 – –
RETIcond 0 1 1 1 1 0 0 0 0
RETScond 0 1 1 1 1 0 0 0 1
MPYF3||ADDF3 1 0 0 0 0 0 0 0 –
1 0 0 0 0 0 0 1 –
1 0 0 0 0 0 1 0 –
1 0 0 0 0 0 1 1 –
MPYF3||SUBF3 1 0 0 0 0 1 0 0 –
1 0 0 0 0 1 0 1 –
1 0 0 0 0 1 1 0 –
1 0 0 0 0 1 1 1 –
MPYI3||ADDI3 1 0 0 0 1 0 0 0 –
1 0 0 0 1 0 0 1 –
1 0 0 0 1 0 1 0 –
1 0 0 0 1 0 1 1 –
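For rows of the table in which all nine listed bits (31 through 23) are defined, the mnemonic can be looked up directly from the top of the instruction word. The Python sketch below uses a handful of rows copied from the table above; the dictionary and function names are hypothetical, and rows containing hyphens need the additional fields described in the individual instruction descriptions, so they are not handled here.

```python
# Bits 31-23 for a few fully specified rows of the opcode table.
TOP9 = {
    0b000011001: "NOP",
    0b000011110: "PUSH",
    0b000100000: "OR",
    0b000101010: "STI",
    0b001000010: "ADDI3",
}


def decode_top9(word):
    """Look up a 32-bit instruction word by its bits 31-23."""
    return TOP9.get((word >> 23) & 0x1FF, "not in this excerpt")


print(decode_top9(0b000011001 << 23))   # word whose top nine bits match NOP
print(decode_top9(0xFFFFFFFF))
```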
For information on pricing and availability, contact the nearest TI field sales of-
fice or authorized distributor.
TMS320C3x Development Support Tools
- Simulator. Simulates via software the operation of the ’C3x and can be
used in C and assembly software development. This product is currently
available for PC (DOS and Windows) and SPARC workstations. Refer to
the TMS320C3x C Source Debugger User’s Guide (SPRU054) for de-
tailed information.
Because ’C3x and ’C5x XDS510 emulators also come with the same emu-
lator board (or box), you can buy the ’C3x C source debugger software as a
separate product called ’C3x C Source Debugger Conversion Software.
This enables you to debug ’C3x/’C4x/’C5x applications with the same
emulator board. The emulator cable that comes with the ’C5x XDS510
emulator is not compatible with the ’C3x. You need a JTAG emulation con-
version cable. Refer to the TMS320C3x C Source Debugger User’s Guide
(SPRU053) for detailed information on the ’C3x emulator.
For a more detailed description of services and products offered by third par-
ties, refer to the TMS320 Third Party Support Reference Guide (literature
number SPRU052) and the TMS320 Software Cooperative Data Sheet Pack-
et (literature number SPRT111). Call the Literature Response Center at (800)
477–8924 to request a copy.
- Fax at (713)274–2324
To find out more about the BBS, refer to the TMS320 Family Development
Support Reference Guide (literature number SPRU011).
TMS320C3x Part Ordering Information
Device Technology Operating Frequency Package Type Typical Power Dissipation
TMS320C30GEL 0.8-µm CMOS 33 MHz Ceramic 181-pin PGA 1.00 W
TMX and TMP devices and TMDX development support tools are shipped with
the following disclaimer:
TMS devices and TMDS development support tools have been fully character-
ized, and their quality and reliability have been fully demonstrated. TI’s stan-
dard warranty applies to TMS devices and TMDS development support tools.
TMDX development support products are intended for internal evaluation pur-
poses only. They are covered by TI’s Warranty and Update Policy for Micropro-
cessor Development Systems products; however, they should be used by cus-
tomers only with the understanding that they are developmental in nature.
Figure B–1 presents a legend for reading the complete device name for any
TMS320 family member.
Appendix C
- Our customers,
Our customer’s perception of quality is the governing criterion for judging per-
formance. This concept is the basis for TI Corporate Quality Policy, which is
as follows:
“For every product or service we offer, we shall define the requirements that
solve the customer’s problems, and we shall conform to those requirements
without exception.”
Reliability Stress Tests
- Piece parts (such as lead frame, mold compound, mount material, bond
wire, or lead finish)
- Manufacturing site
Table C–1 lists the microprocessor and microcontroller reliability tests, the du-
ration of the test, and sample size. Table C–2 contains definitions and descrip-
tions of terms used in those tests.
Electrostatic discharge, ± 2 kV 15 15
Mechanical sequence – 22
Thermal sequence – 22
Thermal/mechanical sequence – 22
PIND – 45
Solderability 22 22
Solder heat 22 22
Resistance to solvents 15 15
Lead integrity 15 15
Lead pull 22 –
Salt atmosphere 15 15
Flammability (UL94-V0) 3 –
Thermal impedance 5 5
Table C–3 lists the TMS320C3x devices, the approximate number of transis-
tors, and the equivalent gates. The numbers have been determined from de-
sign verification runs.
TMS320C31 PQFP Reflow Soldering Precautions
Once the bag seal is broken, the devices should be stored at less than 60% RH
and less than 30°C and should be reflow soldered within two days of removal.
If these conditions are not met, TI recommends baking the devices in a clean
oven at 125°C and 10% maximum RH for 25 hours. This procedure restores the
devices to their dry-packed moisture level.
TI recommends that the reflow process not exceed two solder cycles and that
the temperature not exceed 220°C.
Topic Page
D.1 Fundamental Power Dissipation Characteristics . . . . . . . . . . . . . . . . . D-2
D.2 Current Requirement for Internal Circuitry . . . . . . . . . . . . . . . . . . . . . . D-5
D.3 Current Requirement for Output Driver Circuitry . . . . . . . . . . . . . . . . . D-9
D.4 Calculation of Total Supply Current . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-18
D.5 Example Supply Current Calculations . . . . . . . . . . . . . . . . . . . . . . . . . D-26
D.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-28
D.7 Photo of IDD for FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-29
D.8 FFT Assembly Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-30
Fundamental Power Dissipation Characteristics
D.1.2 Dependencies
The power supply current consumption depends on many factors. Four are
system-related:
- Operating frequency,
- Supply voltage,
- Operating temperature, and
- Output load
The total power supply current for the device is described in this equation,
which applies the four basic power supply current components and the
dependencies described above:
where
Iibus is the current component due to internal bus usage, including data value
and cycle time dependencies,
Ixbus is the current component due to external bus usage, including data
value, wait state, cycle time, and capacitive load dependencies,
This appendix explains, in detail, how to determine the power supply current
requirement for the TMS320C30. If a less detailed analysis is sufficient, the
minimum, typical, and maximum values can be used to determine a rough esti-
mate of the power supply current requirements. The minimum power supply
current requirement is 110 mA. The typical and average current consumption
is 200 mA, as described in the TMS320C30 data sheet, and will be associated
with most algorithms running on the device unless data output is excessive.
Each part of an algorithm behaves differently, depending on its internal and ex-
ternal bus usage. To analyze the power supply current requirement, you must
partition an algorithm into segments with distinct concentrations of internal or
external bus usage. The analysis that follows is applied to each distinct pro-
gram segment to determine the power supply current requirement for that sec-
tion. The average power supply current requirement can then be calculated
from the requirements of each segment of the algorithm.
All TMS320C30 supply current measurements were performed on the test set-
up shown in Figure D–1. The test setup consists of a TMS320C30, 8K words
of zero-wait-state Cypress Semiconductor SRAMs (CY7C186–25PC), and
RC loads on all data and address lines. A Tektronix Current Probe (P6042)
measures the power supply current in all VDD lines of the device. The supply
voltage on the output load is 2.15 V. Unless otherwise specified, all measure-
ments are made at a supply voltage of 5.0 V, an input clock frequency of 33
MHz, a capacitive load of 80 pF, and an operating temperature of 25°C.
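[Figure D–1: test setup. A TMS320C30 drives CY7C186-25PC SRAM on the primary
bus (32 data lines, 24 address lines) and on the expansion bus (32 data lines,
13 address lines), with R = 825 Ω and C loads on the lines; a Tektronix
Current Probe (P6042) measures the supply current.]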
Current Requirement for Internal Circuitry
D.2.1 Quiescent
Quiescent refers to the baseline supply current drawn by the TMS320C30 dur-
ing minimal internal activity, such as executing the IDLE instruction or branch-
ing to self. It includes the current required to fetch an instruction from on- or
off-chip memory. The quiescent requirement for the TMS320C30 is 110 mA.
Examples of quiescent current include:
- Maintaining timers and serial ports
- Executing the IDLE instruction
- TMS320C30 in HOLD mode pending external bus access
- TMS320C30 in reset
- Branching to self
The internal bus operations include all operations that utilize the internal buses
extensively, such as accessing internal RAM every cycle. No distinction is
made between internal reads (such as instruction or operand fetches from in-
ternal ROM or internal RAM banks) and internal writes (such as operand
stores to internal RAM banks), because internally they are equal. Significant
use of internal buses adds a term to the power supply current requirement that
is data-dependent. Since switching requires more current, moving changing
data at high rates requires higher power supply current.
Pipeline conflicts, use of cache, fetches from external wait-state memory, and
writes to external wait-state memory all affect the internal and external bus
cycles of an algorithm executing on the TMS320C30. Therefore, the internal
bus usage of the algorithm must be determined to accurately calculate power
supply current requirements. The TMS320C30 software simulator and XDS
emulator both provide benchmarking and timing capabilities that allow bus
usage to be determined.
The current resulting from internal bus usage varies roughly exponentially with
transfer rates. Figure D–2 shows internal bus current requirements for trans-
ferring alternating data (AAAAAAAAh to 55555555h) at several transfer rates
(expressed as the transfer cycle time). A transfer cycle time less than 1
implies multiple accesses per single H1 cycle (for example, when direct memory
access (DMA) is in use). Transfer cycle times greater than 1 refer to
single-cycle transfers with one or more cycles between them. The minimum
transfer cycle time is one-third, which corresponds to three accesses in a
single H1 cycle.
The data set AAAAAAAAh to 55555555h exhibits the maximum current for
these types of operations. Less current is required for transferring other data
patterns, and current values can be derated accordingly as described later in
this subsection.
As the transfer rate decreases (that is, as the transfer cycle time increases),
the incremental IDD approaches 0 mA. Transfer cycle times longer than seven H1
cycles do not add any significant current. Figure D–2 represents the
incremental IDD due to internal bus operations, which is added to the
quiescent and internal operations current values.
For example, the maximum transfer rate corresponds to three accesses every
cycle or one-third H1 transfer cycle time. At this rate, 85 mA is added to the
quiescent (110 mA) and internal operation (55 mA) current values for a total
of 250 mA.
Figure D–2 shows the internal bus current requirement when transferring As,
followed by 5s, for various transfer rates. Figure D–3 shows the data
dependence of the internal bus current requirement when the data is other
than As followed by 5s. The trapezoidal region bounds all possible data values
transferred. The lower line represents the scale factor for transferring the
same data. The upper line represents the scale factor for transferring
alternating data (all 0s to all Fs or all As to all 5s, etc.).
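[Figure D–2: incremental IDD (mA) for internal bus operations versus transfer
cycle time.]
[Figure D–3: normalized IDD versus data complexity. Upper line: alternating
data (0s–Fs, As–5s). Lower line: same data (0s–0s, Fs–Fs).]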
Since the number of possible permutations of data values is quite large, the
extent to which data varies is referred to as relative data complexity. This
term represents a relative measure of the extent to which data values are
changing and the number of bits that are changing state. Relative data
complexity therefore ranges from 0, signifying minimal variation of data, to a
normalized value of 1, signifying greatest data variation.
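One way to put a number on relative data complexity is to count how many bus
bits toggle between successive words. The sketch below is only an
illustration: the function name and the toggle-fraction metric are
assumptions, not the guide's formal definition, and the figures in this
appendix remain the reference for the actual scale factors.

```python
def relative_data_complexity(words, width=32):
    """Return the fraction of bus bits that toggle between successive words.

    0.0 means the same word is repeated; 1.0 means every bit flips on
    every transfer (for example, AAAAAAAAh followed by 55555555h).
    This metric is an assumption made for illustration only.
    """
    if len(words) < 2:
        return 0.0
    mask = (1 << width) - 1
    # XOR marks the bits that differ between neighbouring words.
    toggled = sum(bin((a ^ b) & mask).count("1")
                  for a, b in zip(words, words[1:]))
    return toggled / (width * (len(words) - 1))


print(relative_data_complexity([0xAAAAAAAA, 0x55555555] * 4))  # 1.0
print(relative_data_complexity([0xFFFFFFFF] * 8))              # 0.0
```

A value near 0.5 from such a measurement would correspond to the nominal
case used when no statistical knowledge of the data is available.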
If a statistical knowledge of the data exists, Figure D–3 can be used to deter-
mine the exact power supply requirement according to internal bus usage. For
example, Figure D–3 indicates a 63% scale factor when all Fs are moved inter-
nally every cycle with two accesses per cycle. This scale factor is multiplied
by 55 mA (from Figure D–2, at one-half H1 cycle transfer time), yielding 34.65
mA because of internal bus usage. Therefore, an algorithm running under
these conditions requires about 200 mA of power supply current (110 + 55 +
34.65).
Since a statistical knowledge of the data might not be readily available, a nomi-
nal scale factor will suffice. The median between the minimum and maximum
values at 50% relative data complexity yields a value of 0.80. This value will
serve as an estimate of a nominal scale factor. Therefore, you can use this
nominal data scale factor of 80% for internal bus data dependency, adding 44
mA to 110 mA (quiescent) and 55 mA (internal operations) to yield 210 mA. As
an upper bound, assume worst case conditions of three accesses of alternat-
ing data every cycle, adding 85 mA to 110 mA (quiescent) and 55 mA (internal
operations) to yield 250 mA.
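The three estimates above follow one pattern: quiescent current plus internal
operations current plus a scaled internal bus term. The sketch below restates
that arithmetic; the function and constant names are illustrative
assumptions, while the 110 mA, 55 mA, 85 mA, 0.63, and 0.80 values are the
ones quoted in this subsection.

```python
I_QUIESCENT_MA = 110.0     # quiescent current quoted in this appendix
I_INTERNAL_OPS_MA = 55.0   # internal operations current quoted in this appendix


def internal_current_ma(ibus_figure_ma, data_scale=0.80):
    """Quiescent + internal operations + scaled internal bus current (mA).

    ibus_figure_ma is the value read from Figure D-2 for the transfer rate;
    data_scale is the data dependency factor read from Figure D-3.
    """
    return I_QUIESCENT_MA + I_INTERNAL_OPS_MA + data_scale * ibus_figure_ma


# Two accesses per cycle (55 mA from Figure D-2), all Fs (63% scale factor).
print(f"{internal_current_ma(55.0, 0.63):.2f}")  # 199.65
# Same rate with the nominal 80% scale factor.
print(f"{internal_current_ma(55.0):.2f}")        # 209.00
# Worst case: three accesses per cycle of alternating data.
print(f"{internal_current_ma(85.0, 1.0):.2f}")   # 250.00
```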
Current Requirement for Output Driver Circuitry
Accordingly, the highest values of supply current are exhibited when external
writes are being performed at high speed. During reads, or when the external
buses are not being used, the TMS320C30 is not driving the data bus; this
eliminates the most significant component of output buffer current. Further-
more, in typical cases, only a few address lines are changing, or the whole ad-
dress bus is static. Under these conditions, an insignificant amount of supply
current is consumed. Therefore, when no external writes are being performed
or when writes are performed infrequently, current due to output buffer circuitry
can be ignored.
When external writes are being performed, the current required to supply the
output buffers depends on several considerations. As with internal bus opera-
tions, current required for output drivers depends on the data being transferred
and the rate at which transfers are being made. Additionally, output driver cur-
rent requirements depend on the number of wait states implemented, because
wait states affect rates at which bus signals switch. Finally, current values are
also dependent upon external bus DC and capacitive loading.
External operations involve writes external to the device and constitute the
major power supply current component. The power supply current for the ex-
ternal buses is made up of three components and is summarized in the follow-
ing equation:
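   Ixbus = Ibase + Iprim + Iexp
where
   Ibase is the 60-mA baseline current for the external buses,
   Iprim is the current component due to the primary bus, and
   Iexp is the current component due to the expansion bus.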
The remainder of this section describes in detail the calculation of external bus
current components.
As previously mentioned, to obtain accurate current values, you must first es-
tablish timing of write cycles on the buses. To determine the rate and timings
at which write cycles to the external buses occur, you must analyze program
activity, including any pipeline conflicts that may exist. Information from this
manual and the TMS320C30 emulator or simulator is useful in making these
determinations. Note that effects from the use of cache must also be ac-
counted for in these analyses because use of cache can affect whether in-
structions are fetched from external memory.
When evaluating external write activity in a given program segment, you must
consider whether a particular level of external write activity constitutes signifi-
cant activity. If writes are being performed at a slow enough rate, they do not
significantly impact supply current requirements; therefore, current due to ex-
ternal writes can be ignored. This is the case, however, only if writes are being
performed at very slow rates on both the primary and the expansion buses. If
writes are being performed at high speed on only one of the two external
buses, you should still use the approach described in this section to calculate
current requirements.
Note that, although you obtain negative incremental current values under
some circumstances, the total contribution for external buses, including base-
line current, must always be positive. The reason is that, when external buses
are used minimally, total current requirements always approach the current
contribution due to internal components, which is solely a function of internal
activity. This places a lower limit on current contributions resulting from the pri-
mary and expansion buses, because the total current due to external buses
is the sum of the 60-mA baseline value and the primary and expansion bus
components. This effect is discussed in further detail in the rest of this subsec-
tion.
When you have established bus-write cycle timing, you can use Figure D–4
to determine the contribution to supply current due to this bus activity.
Figure D–4 shows values of current contribution from the primary bus for vari-
ous numbers of wait states and H1 cycles between writes. These characteris-
tics are exhibited when writes of alternating 55555555h and AAAAAAAAh are
being performed at a capacitive load of 80 pF per output signal line. The condi-
tions exhibit the highest current values on the device. The values presented
in the figure represent incremental or additional current contributed by the pri-
mary bus output driver circuitry under the given conditions. Current values ob-
tained from this graph are later scaled and added to several other current
terms to calculate the total current for the device. As indicated in the figure, the
lower curve represents the current contribution for 18 or more cycles between
writes.
Figure D–4. Primary Bus Current Versus Transfer Rate and Wait States
[Plot: Primary Bus Analysis (80 pF, As/5s). Incremental IDD (mA) versus wait
states (0 to 7), with curves for q = 1, q = 2, q = 4, and q ≥ 18, where q is
the number of cycles between writes.]
Note that number of cycles between writes refers to the number of H1 cycles
between the active portion of the write cycles as defined in Chapter 13—that
is, between H1 cycles when STRB, MSTRB, or IOSTRB and R/W (or XR/W,
as the case may be) are low. As shown in Figure D–4, the minimum number
of cycles between writes is 1 because with back-to-back writes there is one H1
cycle between active portions of the writes.
To further illustrate the relationship of current and write cycle time, Figure D–5
shows the characteristics of current for various numbers of cycles between
writes for zero wait states. The information on this curve can be used to obtain
more precise values of current if zero wait states are being used and the num-
ber of cycles between writes does not fall on one of the curves in Figure D–4.
Figure D–5. Primary Bus Current Versus Transfer Rate at Zero Wait States
[Plot: Primary Bus Duty Cycle Analysis (80 pF, As/5s). Incremental IDD (mA)
versus number of cycles between writes (0 to 20).]
Note that, although these graphs contain negative current values, negative
current has not necessarily actually occurred. The negative values exist be-
cause the graphs represent a current offset from a common baseline current
value, which is not necessarily the lowest current exhibited. Using this ap-
proach to depict current contributions due to different components simplifies
current calculations because it allows calculations to be made independently.
Independent calculations are possible because information about relation-
ships between different sections of the device are included implicitly in the in-
formation for each section.
Figure D–4 and Figure D–5 show that the contribution of writes for external
bus activities becomes insignificant if writes are being performed at intervals
of more than 18 cycles. Under these conditions, you should use the incremen-
tal value of –30-mA current contribution due to the primary bus. Note, however,
that you should use a value of –30 mA only if the expansion bus is being used
extensively. This is because the total contribution for external buses, including
baseline current, must always be positive. If the expansion bus is not being
used and the primary bus is being used minimally, the current contribution due
to the primary bus must always be greater than or equal to 20 mA. This ensures
that the correct total current value is obtained when summing external bus
components. Once a current value has been obtained from Figure D–4 or
Figure D–5, this value can, if necessary, be scaled by a data dependency fac-
tor, as described at the end of this section. This scaled value is then summed
along with several other current terms to determine the total supply current.
Calculation of total supply current is described in detail in Section D.4 on page
D-18.
Figure D–6. Expansion Bus Current Versus Transfer Rate and Wait States
[Plot: Expansion Bus Analysis (80 pF, As/5s). Incremental IDD (mA) versus wait
states (0 to 7), with curves for q = 1, q = 2, q = 4, and q ≥ 18, where q is
the number of cycles between writes.]
Figure D–7. Expansion Bus Current Versus Transfer Rate at Zero Wait States
[Plot: Expansion Bus Duty Cycle Analysis (80 pF, As/5s). Incremental IDD (mA)
versus number of cycles between writes (0 to 20).]
Regardless of the approach you take for scaling, once you determine the scale
factors for primary and expansion buses, apply these factors to scale the cur-
rent values found by using the graphs in the previous two subsections. For ex-
ample, if a nominal scale factor of 0.85 is used and the system uses zero wait
states with two cycles between accesses on both the primary and expansion
buses, the current contribution from the two buses is as follows:
Primary: 0.85 x 80 mA = 68 mA
Expansion: 0.85 x 40 mA = 34 mA
[Two plots: normalized IDD versus data complexity (0 to 1) for the external
buses. In each plot the upper line is alternating data (0s–Fs) and the lower
line is same data (0s–0s, Fs–Fs).]
Once you account for cycle timing and data dependencies, you should include
capacitive loading effects in a manner similar to that of data dependency.
Figure D–10 shows the scale factor to be applied to the current values
obtained above as a function of actual load capacitance if the load capacitance
presented to the buses is less than 80 pF.
The slope of the load capacitance line in Figure D–10 is 0.26% normalized IDD
per pF. While this slope may be used to interpolate scale factors for loads
greater than 80 pF, the TMS320C30 is specified to drive output loads of less
than 80 pF, and interface timings cannot be guaranteed at higher loads. With
data dependency and capacitive load scale factors applied to the current val-
ues for primary and expansion buses, the total supply current required for the
device for a particular application can be calculated, as described in the next
section.
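The data dependency and load capacitance factors can be combined into one
scaled bus term. The sketch below assumes that the load line in Figure D–10
is straight, passes through a scale factor of 1.0 at the 80-pF reference
load, and falls by 0.26% per pF below that; the function names are
illustrative and do not come from the guide.

```python
REFERENCE_LOAD_PF = 80.0   # load at which the bus graphs were measured
SLOPE_PER_PF = 0.0026      # 0.26% normalized IDD per pF (Figure D-10)


def load_scale(load_pf):
    """Capacitive load scale factor, assumed to be 1.0 at 80 pF."""
    return 1.0 - SLOPE_PER_PF * (REFERENCE_LOAD_PF - load_pf)


def bus_current_ma(figure_ma, data_scale, load_pf=REFERENCE_LOAD_PF):
    """Scale a graph value (mA) by data dependency and load capacitance."""
    return figure_ma * data_scale * load_scale(load_pf)


# Values from the example above: zero wait states, two cycles between accesses.
print(f"{bus_current_ma(80.0, 0.85):.1f}")   # 68.0 (primary bus)
print(f"{bus_current_ma(40.0, 0.85):.1f}")   # 34.0 (expansion bus)
# Same primary bus activity with a lighter, hypothetical 30-pF load.
print(f"{bus_current_ma(80.0, 0.85, load_pf=30.0):.2f}")   # 59.16
```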
[Figure D–10: normalized IDD versus load capacitance (0 to 80 pF).]
Note that numerous VDD and VSS pins on the device are routed to a variety of
internal connections, not all of which are common. Externally, however, all of
these pins should be connected in parallel to 5 V and ground planes, respec-
tively, with as low impedance as possible.
Calculation of Total Supply Current
4) If external writes are being performed at high speed (see section D.3 on
page D-9), add 60 mA and then add the values calculated for primary and
expansion bus current components. If only one external bus is being used,
the appropriate incremental current for the unused bus should still be in-
cluded because the current offsets include components required for oper-
ating both buses. Note, however, that, as discussed previously, the total
current contribution for external buses, including baseline, must always be
positive.
The current value resulting from summing these components is the total de-
vice current requirement for a given program activity.
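The summation in this step, including the rule that the external bus total
cannot go negative, can be sketched as follows. The function names are
illustrative assumptions. The 209-mA internal figure and the 68-mA and 34-mA
bus values are taken from examples elsewhere in this appendix, and the –80 mA
expansion bus value is a hypothetical number chosen only to show the lower
limit.

```python
I_BASE_MA = 60.0   # baseline external bus current from this appendix


def external_bus_current_ma(i_prim_ma, i_exp_ma):
    """Baseline plus primary and expansion components, never below zero."""
    return max(0.0, I_BASE_MA + i_prim_ma + i_exp_ma)


def total_current_ma(internal_ma, i_prim_ma=0.0, i_exp_ma=0.0,
                     high_speed_writes=False):
    """Total device current for one program segment (mA).

    The external bus term is added only when high-speed external writes
    are being performed; otherwise it is ignored, as the text describes.
    """
    if not high_speed_writes:
        return internal_ma
    return internal_ma + external_bus_current_ma(i_prim_ma, i_exp_ma)


print(total_current_ma(209.0))                                      # 209.0
print(total_current_ma(209.0, 68.0, 34.0, high_speed_writes=True))  # 371.0
print(external_bus_current_ma(-30.0, -80.0))                        # 0.0
```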
Once the total current for a particular program segment has been determined,
the dependencies that affect total current requirements are applied as a scale
factor in the same manner as data dependencies discussed in other sections.
Figure D–11 shows the relative scale factors to be applied to the supply current
values as a function of both VDD and operating frequency.
Power supply current consumption does not vary significantly with operating
temperature. However, if desired, a scale factor of 2% normalized IDD per 50°C
change in operating temperature may be used to derate current within the spe-
cified range noted in the TMS320C30 data sheet. This temperature depen-
dence is shown graphically in Figure D–12. Note that a temperature scale fac-
tor of 1.0 corresponds to current values at 25°C, which is the temperature at
which all other references in the document are made.
[Figure D–11: normalized IDD versus f(CLKIN) (MHz, 0 to 30), one line per
supply voltage, with VDD in 0.25-V increments (VDD = 4.5 V labeled).]
[Figure D–12: normalized IDD versus operating temperature.]
summarized in the following equation:
where
   Iq    = 110 mA
   Iiops = 55 mA
   Iibus = D1 x f1 (see Table D–1)
   Ixbus = Ibase + Iprim + Iexp
with
   Ibase = 60 mA
   Iprim = D2 x C2 x f2 (see Table D–1)
Table D–1 describes the symbols used in the power supply current equation.
The table displays figure numbers from which the value can be obtained.
   I = (0.8 x 250 mA) + (0.2 x 300 mA) = 260 mA
Using this approach, average current for any number of program segments
can be calculated.
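The same weighted average extends to any number of program segments. The
sketch below is a minimal illustration; the function name and the check that
the time fractions sum to 1 are assumptions added for clarity.

```python
def average_current_ma(segments):
    """Time-weighted average current (mA).

    segments is a list of (time_fraction, current_ma) pairs, one per
    program segment; the fractions are expected to sum to 1.
    """
    total_fraction = sum(fraction for fraction, _ in segments)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(fraction * current for fraction, current in segments)


# 80% of the time at 250 mA and 20% of the time at 300 mA.
print(f"{average_current_ma([(0.8, 250.0), (0.2, 300.0)]):.1f}")  # 260.0
```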
   P = I x V
[Figure D–13: output driver current paths, labeled IDD, IOUT, ISS, and VDD.]
Furthermore, external loads draw supply current only when outputs are being
driven high, because, when outputs are in the logic 0 state, the device is
sinking current that is supplied from an external source. Therefore, this
current component does not flow through IDD, but it does contribute to power
dissipation with a magnitude of:
   P = VOL x IOL
where VOL is the low-level output voltage and IOL is the current being sunk by
the output as shown in Figure D–13. The power dissipation component due
to outputs being driven low should be calculated and added to the total power
dissipation.
When outputs with DC loads are being switched, the power dissipation compo-
nents from outputs being driven high and outputs being driven low are aver-
aged and added to the total device power dissipation. You should calculate
power components due to DC loading of the outputs separately for each pro-
gram segment before you calculate average power.
Note that any unused inputs that are left disconnected may float to a voltage
level that will cause input buffer circuits to remain in the linear region and there-
fore contribute a significant component to power supply current. Accordingly,
any unused inputs should be made inactive by being either grounded or pulled
high if absolute minimum power dissipation is desired. If several unused inputs
must be pulled high, they may be pulled high together through one resistor to
minimize component count and board space.
Note that the average power should be determined by calculating the power
for each program segment (including considerations described above) and
performing a time average of these values, rather than simply multiplying the
average current as determined in the previous subsection by VDD.
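A per-segment power calculation followed by a time average might look like
the sketch below. It is a simplified illustration: the function names are
assumptions, only the low-level DC load term (VOL x IOL) is modeled, and the
0.4-V and 40-mA load values are hypothetical numbers rather than data sheet
values.

```python
def segment_power_w(i_dd_ma, v_dd=5.0, v_ol=0.0, i_ol_ma=0.0):
    """Power for one program segment in watts.

    Supply term (IDD x VDD) plus the dissipation caused by current sunk
    through outputs driven low (VOL x IOL), which does not appear in IDD.
    """
    return (i_dd_ma * v_dd + i_ol_ma * v_ol) / 1000.0


def average_power_w(segments):
    """Time-average power; segments is a list of (time_fraction, power_w)."""
    return sum(fraction * power for fraction, power in segments)


processing = segment_power_w(250.0)                          # no DC load
data_output = segment_power_w(300.0, v_ol=0.4, i_ol_ma=40.0)  # hypothetical load
average = average_power_w([(0.8, processing), (0.2, data_output)])
print(f"{average:.4f} W")  # 1.3032 W
```

Keeping the DC load term inside each segment is what makes this result differ
from the average current multiplied by VDD.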
D.5.1 Processing
The processing portion of the algorithm is 95% of the total algorithm. During
this portion, the power supply current is required only for the internal circuitry.
Data is processed in several loops that compose a majority of the algorithm.
During these loops, two operands are transferred on every cycle. The current
required for internal bus usage, then, is 55 mA, taken from Figure D–2 on page
D-7. The data is assumed to be random. A data value scale factor of 0.8 is
used from Figure D–3 on page D-7. This value scales 55 mA, yielding 44 mA
for internal bus operations. Adding 44 mA to the quiescent current requirement
and internal operations current requirement yields a current requirement of
209 mA for the major portion of the algorithm.
   I = Iq + Iiops + Iibus
   I = 110 mA + 55 mA + (55 mA)(0.8) = 209 mA
Example Supply Current Calculations
   I = Iq + Iibus + Ixbus
or,
D.6 Summary
An accurate power supply current requirement for the TMS320C30 cannot be
expressed simply in terms of operating frequency, supply voltage, and output
load capacitance. The specification must be more complete and depends on
device functionality and system parameters. The current components related
to device functionality are due to quiescent current, internal operations, inter-
nal bus operations, and external bus operations. Those related to system pa-
rameters are due to operating frequency, supply voltage, output load capaci-
tance, and operating temperature. The typical power supply current require-
ment is 200 mA, and the minimum, or quiescent, is 110 mA.
Photo of IDD for FFT
[Photo: oscilloscope trace of IDD for the FFT. Vertical scale 100 to 400 mA;
horizontal scale 500 µs/div.]
SINTAB:                         ; setup
        .WORD   SINE
RAM0:
        .WORD   809800h
OUTBUF:
        .WORD   800h
        .TEXT
        LDI     N,IR0
        LSH     -1,IR0
; LENGTH-TWO BUTTERFLIES
        LDI     @RAM0,AR0
        LDI     IR0,RC
        SUBI    1,RC
        RPTB    BLK1
        ADDF    *+AR0,*AR0++,R0
        SUBF    *AR0,*-AR0,R1
BLK1    STF     R0,*-AR0
||      STF     R1,*AR0++
        LDI     @RAM0,AR0
        LDI     2,IR0
        LDI     N,RC
        LSH     -2,RC
        SUBI    1,RC
        RPTB    BLK2
        ADDF    *+AR0(IR0),*AR0++(IR0),R0
        SUBF    *AR0,*-AR0(IR0),R1
        NEGF    *+AR0,R0
||      STF     R0,*-AR0(IR0)
BLK2    STF     R1,*AR0++(IR0)
||      STF     R0,*+AR0
        LDI     N,IR0
        LSH     -2,IR0
        LDI     3,R5
        LDI     1,R4
        LDI     2,R3
LOOP    LSH     -1,IR0
        LSH     1,R4
        LSH     1,R3
        LDI     @RAM0,AR5
INLOP:
        LDI     IR0,AR0
        ADDI    @SINTAB,AR0
        LDI     R4,IR1
        LDI     AR5,AR1
        ADDI    1,AR1
        LDI     AR1,AR3
        ADDI    R3,AR3
        LDI     AR3,AR2
        SUBI    2,AR2
        ADDI    R3,AR2,AR4
        LDF     *AR5++(IR1),R0
        ADDF    *+AR5(IR1),R0,R1
        SUBF    R0,*++AR5(IR1),R0
||      STF     R1,*-AR5(IR1)
        NEGF    R0
        NEGF    *++AR5(IR1),R1
||      STF     R0,*AR5
        STF     R1,*AR5
; INNERMOST LOOP
        LDI     N,IR1
        LSH     -2,IR1
        LDI     R4,RC
        SUBI    2,RC
        RPTB    BLK3
        MPYF    *AR3,*+AR0(IR1),R0
        MPYF    *AR4,*AR0,R1
        MPYF    *AR4,*+AR0(IR1),R1
||      ADDF    R0,R1,R2
        MPYF    *AR3,*AR0++(IR0),R0
        SUBF    R0,R1,R0
        SUBF    *AR2,R0,R1
        ADDF    *AR2,R0,R1
||      STF     R1,*AR3++
        ADDF    *AR1,R2,R1
||      STF     R1,*AR4--
        SUBF    R2,*AR1,R1
||      STF     R1,*AR1++
BLK3    STF     R1,*AR2--
        SUBI    @RAM0,AR5
        ADDI    R4,AR5
        CMPI    N,AR5
        BLTD    INLOP
        ADDI    @RAM0,AR5
        NOP
        NOP
        ADDI    1,R5
        CMPI    M,R5
        BLE     LOOP
        LDI     RAM0,AR1
        B       FFT
        .END
Appendix E
This appendix contains the standalone data sheet for the military version of the
’C3x digital signal processor, the SMJ320C3x Digital Signal Processor.
Appendix F
Texas Instruments (TI) offers many products for total system solutions, includ-
ing memory options, data acquisition, and analog input/output devices. This
appendix describes a variety of devices that interface directly to the TMS320
DSPs in rapidly expanding applications.
Multimedia Applications
Figure F–1 shows both the central role of the multimedia computer and the
multimedia system’s ability to integrate the various media to optimize informa-
tion flow and processing.
[Figure F–1: multimedia computer at the center, connected to video input,
video monitor, image sensor, microphone, and facsimile/modem.]
The TLC32047 wide-band analog interface circuit (AIC) is well suited for multi-
media applications because it features wide-band audio and up to 25-kHz
sampling rates. The TLC32047 is a complete analog-to-digital and digital-to-
analog interface system for the TMS320 DSPs. The nominal bandwidths of the
filters accommodate 11.4 kHz, and this bandwidth is programmable. The
application circuit shown in Figure F–2 handles both speech encoding and
modem communication functions, which are associated with multimedia appli-
cations.
[Figure F–2: application circuit with AIC, DSP, encrypt/decrypt, DAA, hybrid
(HYB), and phone line blocks, a TMS320 DSP/TLC32047 interface, a controller,
and memory.]
Figure F–3 shows the interfacing of the TMS320C25 DSP to the TLC32047
AIC, which constitutes a building block of the 9600-bps V.32 bis modem shown
in Figure F–2.
[Figure F–3: TMS320C25-to-TLC32047 interface. Signal labels: DX, FSR, DR,
CLKR, CLKX, SHIFT CLK; supply labels: VCC– (–5 V), VDD (+5 V), ANLG GND,
DGTL GND.]
TLC32044 Analog interface (AIC) Serial 14 19.2 kHz Speech and modems
TLC32040 Analog interface (AIC) Serial 14 19.2 kHz Speech and modems
TLC32071 Analog interface (AIC) Parallel 8 1 MHz Servo ctrl / disk drive
TMS57013/4 Dual audio DAC + digital filter Serial 16/18 32, 37.8, 44.1, 48 kHz Digital audio
Telecommunications Applications
[Schematic: TMS320C25-to-TCM320AC36 interface. Signal labels: DR, DX, CLKX,
CLKR, FSX, FSR, DOUT, DIN, DCLKX, DCLKR, PDN, ANLGIN, MIC_GS, EAR_A, EAR_B;
codec input and output; 5-V supply and reset.]
[Timing diagram: CLKR/CLKX and FSX/FSR with receive timing (DR/DOUT) and
transmit timing (DX/PCMIN), each showing bits 1 to 8, MSB to LSB.]
[Figure residue: telecommunications application map (analog phones,
neighborhood concentrator, cellular phone, PBX, low-speed modem) labeled with
TI parts TCM1520, TSP50C1x, TCM29C13, TP3054, TP305x, TCM1060/30,
TCM9050/51, TCM5089, TCM3105, TCM291x, TCM29Cxx, and TMS320xx, followed by a
modem block diagram with two TMS320C25 devices (echo canceler and
transmitter, receiver), a TLC320AC01 with ADC and DAC, an RS-232 interface,
and the telephone line.]
Dedicated Speech Synthesis Applications
Dedicated speech synthesis chips are a good alternative for low-cost applica-
tions. The speech synthesis technology provided by the dedicated chips is ei-
ther linear-predictive coding (LPC) or continuously variable slope delta modu-
lation (CVSD). Table F–5 shows the characteristics of the TI voice synthesiz-
ers.
In addition to the speech synthesizers, TI has low-cost memories that are ideal
for use with these chips. TI can also be of assistance in developing and pro-
cessing the speech data that is used in these speech synthesis systems.
Table F–6 shows speech memory devices of different capabilities. Additional-
ly, audio filters are outlined in Table F–7.
CLK ÷ 50
TLC10/20 General-purpose dual filter 2 N/A No
CLK ÷ 100
CLK ÷ 50
TLC04/14 Low pass, Butterworth filter 4 N/A No
CLK ÷ 100
Servo Control/Disk Drive Applications
[Block diagram: disk drive system with SCSI data bus and SCSI interface to
the host, RAM buffer, buffer control, data sequencer, data separator, address
decode, disk head select, control blocks, the TLC32071, and an SN74LS393,
with connections to the disk heads and the spindle motor.]
Table F–9 lists analog/digital interface devices used for servo control.
TLC1543 10 21 µs 11 Serial
TLC1549 10 21 µs 1 Serial
1 µs 8
AIC TLC32071 8 (ADC) Parallel
9 MHz 1
Figure F–10 shows the interfacing of the TMS320C14 and the TLC32071.
[Figure F–10: TMS320C14 connected to the TLC32071 through address decode
logic. Signal labels: D0–D7, A0–A2, WE, DEN, RESET, CSCNTRL, CSAN.]
For further information on these servo control products, please call TI Linear
Applications at (214) 997–3772.
Modem Applications
The AIC interfaces directly with serial-input TMS320 DSPs, which execute the
modem’s high-speed encoding and decoding algorithms. The TLC320C4x family
performs level-shifting, filtering, and A/D and D/A data conversion. The DSP’s
software-programmable features provide the flexibility required for modem
operations and make it possible to modify and upgrade systems easily. Under
DSP control, the AIC’s sampling rates permit designers to include fallback
modes without additional analog hardware in most cases. Phase adjustments can
be made in real time so that the A/D and D/A conversions can be synchronized
with the upcoming signal. In addition, the chip has a built-in loopback
feature to support modem self-test requirements.
For further information or application assistance, please call TI Linear
Applications at (214) 997–3772.
Figure F–11 shows a V.32 bis modem implementation using the TMS320C25
and a TLC320AC01. The upper TMS320C25 performs echo cancellation and
transmit data functions, while the lower TMS320C25 performs receive data
and timing recovery functions. The echo canceler simulates the telephone
channel and generates an estimated echo of the transmit data signal.
Figure F–11. High-Speed V.32 Bis and Multistandard Modem With the TLC320AC01 AIC
[Figure F–11 block labels: RS–232 serial I/F and I/O control; TMS320C25/C5X
echo canceler and transmitter; TMS320C25/C5X receiver; TLC320AC01 (ADC and
DAC); TLC320AC01 (fine-tune echo cancel, ADC and DAC); D/A; summing junctions;
telephone line.]
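The echo canceler described above models the telephone channel and subtracts an
estimated echo of the transmit signal from the received signal. The guide does
not name the adaptation algorithm, so the following is only a minimal sketch
that assumes a least-mean-squares (LMS) adaptive FIR filter. The function name,
tap count, step size, and the three-tap echo path are all hypothetical values
chosen for illustration.

```python
import random


def lms_echo_canceller(tx, rx, num_taps=8, mu=0.05):
    """Adaptive FIR echo canceller sketch (LMS update, assumed algorithm).

    tx: transmitted samples (the reference signal)
    rx: received samples that contain an echo of tx
    Returns (residual, weights), where residual = rx - estimated echo.
    """
    weights = [0.0] * num_taps   # adaptive model of the echo path
    history = [0.0] * num_taps   # most recent tx samples, newest first
    residual = []
    for x, d in zip(tx, rx):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        err = d - echo_estimate  # what remains after cancellation
        # LMS update: move each tap along the error gradient.
        weights = [w + mu * err * h for w, h in zip(weights, history)]
        residual.append(err)
    return residual, weights


# Hypothetical echo path, used only to exercise the sketch.
random.seed(0)
echo_path = [0.5, -0.3, 0.1]
tx = [random.uniform(-1.0, 1.0) for _ in range(5000)]
rx = [
    sum(echo_path[k] * tx[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(tx))
]
residual, weights = lms_echo_canceller(tx, rx)
```

In this noise-free setup the first three weights settle close to the assumed
echo path and the residual shrinks toward zero. A real modem would also have to
cope with far-end signal, noise, and fixed-point arithmetic.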
With the extensive use of the TMS320 DSPs in consumer electronics, much
electromechanical control and signal processing can be done in the digital
domain. Digital systems generally require some form of analog interface,
usually in the form of high-performance ADCs and DACs. Figure F–12 shows the
general performance requirements for a variety of applications.
[Figure F–12 (performance/application chart): sampling frequency in MSPS (axis
marks at 10, 30, 100, and 300) versus resolution in bits (4 to 10), with
regions for instrumentation, HDTV, broadcasting, ADTV, DVTR, and fax/PC.]
Advanced Digital Electronics Applications for Consumers
[Figure: video signal block diagram. Labels shown: TV IF amplifier, ADC,
TMS320 DSP, DAC, video signal buffer, CRT, field memory, system controller,
clock generator.]
Video cassette recorders (VCRs), compact disc (CD) and digital audio tape
(DAT) players, and personal computers (PCs) are a few of the products that
have taken a major position in the marketplace in recent years. The audio
channels for compact disc and DAT require 16-bit A/D resolution to meet the
distortion and noise standards. See Figure F–14 for a block diagram of a
typical digital audio system.
[Figure F–14 block labels: digital audio data; TMS57001 sound processor;
TMS57013/4 dual 16/18-bit DAC + digital filter; PWM output; analog power
amplifier; analog output; left (L) and right (R) channels; 1fs and 1024fs
rates.]
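As a rough check on why compact disc and DAT audio call for 16-bit resolution,
the ideal signal-to-quantization-noise ratio of an N-bit converter can be
computed for a full-scale sine wave. The sketch below uses the standard
textbook result (about 6.02N + 1.76 dB). It is not a figure taken from this
guide, and the function name is an illustrative assumption.

```python
import math


def ideal_quantization_snr_db(bits):
    """Ideal SNR in dB of an N-bit quantizer driven by a full-scale sine wave."""
    # The ratio of sine RMS to quantization-noise RMS is 2**bits * sqrt(1.5).
    return 20.0 * math.log10((2 ** bits) * math.sqrt(1.5))


for bits in (8, 10, 16):
    print(bits, round(ideal_quantization_snr_db(bits), 1))
```

Under these ideal assumptions a 16-bit converter offers roughly 98 dB, compared
with roughly 50 dB to 62 dB for 8- to 10-bit converters. Real converters fall
short of the ideal figure because of distortion, jitter, and analog noise.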
The motion and motor control systems usually use 8- to 10-bit ADCs for the
lower frequency servo loop. Tape or disk systems use motor or motion control
for proper positioning of the record or playback heads. As the storage medium
compresses data into an increasingly smaller physical size, the positioning
systems require more precision.
The converters have dual channels so that the right and left stereo signals can
be transformed into analog signals with only one chip. Functions such as
muting, attenuation, de-emphasis, and zero data detection can be selected to
suit the application. These functions are controlled by external 16-bit serial
data from a controller such as a microcomputer.
This appendix contains the source code for the TMS320C3x boot loader.
Boot Loader Source Code
************************************************************************
* C31BOOT – TMS320C31 BOOT LOADER PROGRAM
* (C) COPYRIGHT TEXAS INSTRUMENTS INC., 1990
*
* NOTE: 1. AFTER DEVICE RESET, THE PROGRAM IS SET TO WAIT FOR
* THE EXTERNAL INTERRUPTS. THE FUNCTION SELECTION OF
* THE EXTERNAL INTERRUPTS IS AS FOLLOWS:
* –––––––––––––––––––––––––––––––––––––––––––––––––––
* INTERRUPT PIN | FUNCTION
* –––––––––––––––|–––––––––––––––––––––––––––––––––––
* 0 | EPROM boot loader from 1000H
* –––––––––––––––|–––––––––––––––––––––––––––––––––––
* 1 | EPROM boot loader from 400000H
* –––––––––––––––|–––––––––––––––––––––––––––––––––––
* 2 | EPROM boot loader from FFF000H
* –––––––––––––––|–––––––––––––––––––––––––––––––––––
* 3 | Serial port 0 boot loader
* –––––––––––––––––––––––––––––––––––––––––––––––––––
*
* 2. THE EPROM BOOT LOADER LOADS WORD, HALFWORD, OR BYTE-
* WIDE PROGRAMS TO SPECIFIED LOCATIONS. THE
* 8 LSBs OF FIRST MEMORY SPECIFY THE MEMORY WIDTH OF
* THE EPROM. IF THE HALFWORD OR BYTE-WIDE PROGRAM IS
* SELECTED, THE LSBs ARE LOADED FIRST, FOLLOWED BY THE MSBs.
* THE FOLLOWING WORD CONTAINS THE CONTROL WORD FOR
* THE LOCAL MEMORY REGISTER. THE PROGRAM BLOCKS FOLLOW.
* THE FIRST TWO WORDS OF EACH PROGRAM BLOCK CONTAIN
* THE BLOCK SIZE AND MEMORY ADDRESS TO BE LOADED INTO.
* WHEN THE ZERO BLOCK SIZE IS READ, THE PROGRAM BLOCK
* LOADING IS TERMINATED. THE PC WILL BRANCH TO THE
* STARTING ADDRESS OF THE FIRST PROGRAM BLOCK.
*
* 3. IF SERIAL PORT 0 IS SELECTED FOR BOOT LOADING, THE
* PROCESSOR WILL WAIT FOR THE INTERRUPT FROM THE
* RECEIVE SERIAL PORT 0 AND PERFORM THE DOWNLOAD.
* AS WITH THE EPROM LOADER, PROGRAMS CAN BE LOADED
* INTO DIFFERENT MEMORY BLOCKS. THE FIRST TWO WORDS OF EACH
* PROGRAM BLOCK CONTAIN THE BLOCK SIZE AND MEMORY ADDRESS
* TO BE LOADED INTO. WHEN THE ZERO BLOCK SIZE IS READ,
* PROGRAM BLOCK LOADING IS TERMINATED. IN OTHER WORDS,
* IN ORDER TO TERMINATE THE PROGRAM BLOCK LOADING,
* A ZERO HAS TO BE ADDED AT THE END OF THE PROGRAM BLOCK.
* AFTER THE BOOT LOADING IS COMPLETED, THE PC WILL BRANCH
* TO THE STARTING ADDRESS OF THE FIRST PROGRAM BLOCK.
*
************************************************************************
        .global check
        .sect   "vectors"
reset   .word   check
int0    .word   809FC1h
int1    .word   809FC2h
int2    .word   809FC3h
int3    .word   809FC4h
xint0   .word   809FC5h
rint0   .word   809FC6h
        .word   809FC7h
        .word   809FC8h
tint0   .word   809FC9h
tint1   .word   809FCAh
dint    .word   809FCBh
        .word   809FCCh
        .word   809FCDh
        .word   809FCEh
        .word   809FCFh
        .word   809FD0h
        .word   809FD1h
        .word   809FD2h
        .word   809FD3h
        .word   809FD4h
        .word   809FD5h
        .word   809FD6h
        .word   809FD7h
        .word   809FD8h
        .word   809FD9h
        .word   809FDAh
        .word   809FDBh
        .word   809FDCh
        .word   809FDDh
        .word   809FDEh
        .word   809FDFh
***************************************************************************
***************************************************************************
        .space  5
        .space  1
        .space  29
        LDI     *+AR0(4Ch),R1
        LDI     R0,R0           ; test load address flag
        BNN     end_s
load_s  STI     R1,*AR4++(1)    ; store new word to dest. address
end_s   RETSU                   ; return from subroutine
        .space  22
        .space  26
        .space  14
        .space  1
        .end
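The block format described in note 2 of the listing's header comment (a memory
width word, a control word, then program blocks that each start with a size and
a destination address, ended by a zero size) can be modeled in a few lines.
This is a sketch of that description only, not a translation of the boot
loader itself. The function names, the dictionary standing in for target
memory, and all sample values are hypothetical.

```python
def assemble_words(units, width_bits):
    """Combine narrow memory units into 32-bit words, LSBs first."""
    per_word = 32 // width_bits
    mask = (1 << width_bits) - 1
    words = []
    for i in range(0, len(units) - per_word + 1, per_word):
        word = 0
        for k in range(per_word):
            # Unit k supplies bits k*width_bits and up (LSBs are read first).
            word |= (units[i + k] & mask) << (k * width_bits)
        words.append(word)
    return words


def parse_boot_stream(words):
    """Walk a boot stream laid out as the header comment describes."""
    width_word = words[0] & 0xFF   # 8 LSBs of the first word: memory width
    control_word = words[1]        # control word for the memory register
    memory = {}                    # hypothetical stand-in for target memory
    entry = None                   # start address of the first program block
    pos = 2
    while True:
        size = words[pos]
        if size == 0:              # a zero block size ends the loading
            break
        dest = words[pos + 1]
        if entry is None:
            entry = dest
        for offset in range(size):
            memory[dest + offset] = words[pos + 2 + offset]
        pos += 2 + size
    return width_word, control_word, memory, entry


# Hypothetical byte-wide image: two blocks followed by the terminating zero.
image_words = [0x08, 0x1000, 2, 0x809800, 0xAAAA, 0xBBBB, 1, 0x809900, 0xCCCC, 0]
image_bytes = [(w >> shift) & 0xFF for w in image_words for shift in (0, 8, 16, 24)]
words = assemble_words(image_bytes, 8)
width_word, control_word, memory, entry = parse_boot_stream(words)
```

After loading, `entry` holds the destination address of the first block, which
matches the header comment's statement that the PC branches to the starting
address of the first program block.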
Index
bitwise exclusive-OR instruction 10-206
  3-operand instruction 10-207
bitwise logical-complement instruction 10-148
bitwise logical-AND instruction 10-42
  3-operand instruction 10-43
bitwise logical-ANDN instruction 10-47
  3-operand instruction 10-48
bitwise logical-OR instruction 10-151
  3-operand instruction 10-152
block
  moves 11-25
  repeat 11-18
  repeat modes 6-2 to 6-7
    control bits 6-3
    nested block repeats 6-7
    operation 6-3 to 6-4
    RC register value 6-6 to 6-7
    restrictions 6-6
    RPTB instruction 6-4 to 6-5
    RPTS instruction 6-5
  repeat registers (RC, RE, RS) 3-11, 6-2
  size (BK) register 3-4
block diagram
  architectural 2-3
  functional 1-5
boot loader 3-26
  external memory loading 3-30
  interrupt and trap vector mapping 3-33
  invoking 3-26
  mode selection 3-29
  operations 3-26
  precautions 3-35
  serial-port loading 3-33
boot loader source code G-1 to G-6
BR instruction 10-60
branch conflicts 9-4 to 9-6
branch unconditionally (delayed) instruction 10-58, 10-61
branch unconditionally (standard) instruction 10-56, 10-60
branches 6-8
  delayed 6-8 to 6-9, 11-17
BRD instruction 10-61
breakdown of numbers B-9 to B-10
buffered signals 12-43
  MPSD 12-42
buffering 12-41
bulletin board service (BBS) B-5 to B-6
bus operation 7-1 to 7-32
  external 2-26
  internal 2-22
buses
  DMA 2-22
  program 2-22
busy-waiting example 6-14
byte-wide configured memory 3-31

C
C (HLL) routines 11-131 to 11-134
C compiler B-2
’C30, memory maps 2-14
’C30 power dissipation D-1 to D-32
  FFT assembly code D-30 to D-32
  photo of IDD for FFT D-29
  summary D-28
’C31
  memory maps 2-15
  interrupt and trap memory maps 3-34
  reserved memory locations 2-31
’C3x DSPs 1-2
cache
  architecture 3-21 to 3-23
  control bits 3-24
    cache clear bit (CC) 3-24
    cache enable bit (CE) 3-24
    cache freeze bit (CF) 3-25
  hit 3-23
  instruction 2-12
  memory 2-11, 3-21
    algorithm 3-23 to 3-24
    architecture 3-21
    instruction 3-21
  miss 3-23
  segment 3-24
  word 3-23
CALL instruction 6-10, 10-62
call subroutine conditionally instruction 10-63
call subroutine instruction 10-62
CALLcond instruction 6-10, 10-63 to 10-64
calls 6-10 to 6-11
carry flag 10-12
cautions x
C-callable routines 11-131
options overview (system configuration) 12-2
OR instruction 10-151
OR3 and STI instructions (parallel) 10-154 to 10-155
OR3 instruction 10-152 to 10-153
ordering information B-7 to B-10
ORing of the ready signals 12-9 to 12-10
output driver circuitry current requirement D-9 to D-17
  capacitive load dependence D-16 to D-18
  data dependency D-14 to D-16
  expansion bus D-13 to D-14
  primary bus D-10 to D-12
output value formats 10-10
overflow 4-15, 4-22
overflow condition flag 10-12

P
parallel ABSF and STF instructions 10-23 to 10-24
parallel ABSI and STI instructions 10-27 to 10-28
parallel ADDF3 and MPYF3 instructions 10-119 to 10-121
parallel ADDF3 and STF instructions 10-35 to 10-36
parallel ADDI3 and MPYI3 instructions 10-130 to 10-132
parallel ADDI3 and STI instructions 10-40 to 10-41
parallel addressing modes 2-16, 5-21 to 5-22
parallel AND3 and STI instructions 10-45 to 10-46
parallel ASH3 and STI instructions 10-54 to 10-55
parallel bus 12-19
  See also expansion bus interface
parallel FIX and STI instructions 10-77 to 10-78
parallel FLOAT and STF instructions 10-80 to 10-81
parallel instruction set summary 2-23 to 2-24
parallel instructions advantages 11-132
parallel LDF and LDF instructions 10-91 to 10-92
parallel LDF and STF instructions 10-93 to 10-94
parallel LDI and LDI instructions 10-100 to 10-101
parallel LDI and STI instructions 10-102 to 10-103
parallel LSH3 and STI instructions 10-112 to 10-114
parallel MPYF3 and ADDF3 instructions 10-119 to 10-121
parallel MPYF3 and STF instructions 10-122 to 10-123
parallel MPYF3 and SUBF3 instructions 10-124 to 10-126
parallel MPYI3 and ADDI3 instructions 10-130 to 10-132
parallel MPYI3 and STI instructions 10-133 to 10-134
parallel MPYI3 and SUBI3 instructions 10-135 to 10-137
parallel multiplies and adds 9-29
parallel NEGF and STF instructions 10-140 to 10-141
parallel NEGI and STI instructions 10-143 to 10-144
parallel NOT and STI instructions 10-149 to 10-150
parallel operations instructions 10-7 to 10-8
parallel OR3 and STI instructions 10-154 to 10-155
parallel STF and ABSF instructions 10-23 to 10-24
parallel STF and ADDF3 instructions 10-35 to 10-36
parallel STF and FLOAT instructions 10-80 to 10-81
parallel STF and LDF instructions 10-93 to 10-94
parallel STF and MPYF3 instructions 10-122 to 10-123
parallel STF and NEGF instructions 10-140 to 10-141
parallel STF and STF instructions 10-176 to 10-177
parallel STF and SUBF3 instructions 10-190 to 10-191
parallel STI and ABSI instructions 10-27 to 10-28
parallel STI and ADDI3 instructions 10-40 to 10-41
parallel STI and AND3 instructions 10-45 to 10-46
parallel STI and ASH3 instructions 10-54 to 10-55
parallel STI and FIX instructions 10-77 to 10-78
parallel STI and LDI instructions 10-102 to 10-103
parallel STI and LSH3 instructions 10-112 to 10-114
parallel STI and MPYI3 instructions 10-133 to 10-134
parallel STI and NEGI instructions 10-143 to 10-144