Floating-Point Digital Signal Processor

The TMS320C6713B is a high-performance floating-point digital signal processor (DSP) from Texas Instruments. It operates at up to 300 MHz, where it delivers up to 1800 MFLOPS. It is based on an advanced VLIW architecture and features a two-level cache architecture and a variety of peripherals, including multichannel audio serial ports. The DSP is supported by TI's eXpressDSP development tools and is suited to multichannel and multifunction applications.

SPRS294B − OCTOBER 2005 − REVISED JUNE 2006

description
The TMS320C67x™ DSPs (including the TMS320C6713B device†) compose the floating-point DSP generation
in the TMS320C6000™ DSP platform. The C6713B device is based on the high-performance, advanced
very-long-instruction-word (VLIW) architecture developed by Texas Instruments (TI), making this DSP an
excellent choice for multichannel and multifunction applications.
Operating at 225 MHz, the C6713B delivers up to 1350 million floating-point operations per second (MFLOPS),
1800 million instructions per second (MIPS), and with dual fixed-/floating-point multipliers up to 450 million
multiply-accumulate operations per second (MMACS).
Operating at 300 MHz, the C6713B delivers up to 1800 million floating-point operations per second (MFLOPS),
2400 million instructions per second (MIPS), and with dual fixed-/floating-point multipliers up to 600 million
multiply-accumulate operations per second (MMACS).
The C6713B uses a two-level cache-based architecture and has a powerful and diverse set of peripherals. The
Level 1 program cache (L1P) is a 4K-byte direct-mapped cache and the Level 1 data cache (L1D) is a 4K-byte
2-way set-associative cache. The Level 2 memory/cache (L2) consists of a 256K-byte memory space that is
shared between program and data space. 64K bytes of the 256K bytes in L2 memory can be configured as
mapped memory, cache, or combinations of the two. The remaining 192K bytes in L2 serve as mapped SRAM.
The C6713B has a rich peripheral set that includes two Multichannel Audio Serial Ports (McASPs), two
Multichannel Buffered Serial Ports (McBSPs), two Inter-Integrated Circuit (I2C) buses, one dedicated
General-Purpose Input/Output (GPIO) module, two general-purpose timers, a host-port interface (HPI), and a
glueless external memory interface (EMIF) capable of interfacing to SDRAM, SBSRAM, and asynchronous
peripherals.
The two McASP interface modules each support one transmit and one receive clock zone. Each McASP
has eight serial data pins, which can be individually allocated to either of the two zones. The serial port supports
time-division multiplexing on each pin from 2 to 32 time slots. The C6713B has sufficient bandwidth to support
all 16 serial data pins transmitting a 192 kHz stereo signal. Serial data in each zone may be transmitted and
received on multiple serial data pins simultaneously and formatted in a multitude of variations on the Philips
Inter-IC Sound (I2S) format.
In addition, the McASP transmitter may be programmed to output multiple S/PDIF, IEC60958, AES-3, CP-430
encoded data channels simultaneously, with a single RAM containing the full implementation of user data and
channel status fields.
The McASP also provides extensive error-checking and recovery features, such as the bad clock detection
circuit for each high-frequency master clock which verifies that the master clock is within a programmed
frequency range.
The two I2C ports on the TMS320C6713B allow the DSP to easily control peripheral devices and communicate
with a host processor. In addition, the standard multichannel buffered serial port (McBSP) may be used to
communicate with serial peripheral interface (SPI) mode peripheral devices.
The TMS320C6713B device has two bootmodes: from the HPI or from external asynchronous ROM. For more
detailed information, see the bootmode section of this data sheet.
The TMS320C67x DSP generation is supported by the TI eXpressDSP™ set of industry benchmark
development tools, including a highly optimizing C/C++ Compiler, the Code Composer Studio™ Integrated
Development Environment (IDE), JTAG-based emulation and real-time debugging, and the DSP/BIOS™
kernel.

TMS320C6000, eXpressDSP, Code Composer Studio, and DSP/BIOS are trademarks of Texas Instruments.
† Throughout the remainder of this document, TMS320C6713B shall be referred to as C6713B or 13B.

POST OFFICE BOX 1443 • HOUSTON, TEXAS 77251−1443 11


 
   

device characteristics
Table 2 provides an overview of the C6713B DSP. The table shows significant features of the device, including
the capacity of on-chip RAM, the peripherals, the execution time, and the package type with pin count. For more
details on the C67x DSP device part numbers and part numbering, see Figure 12.
Table 2. Characteristics of the C6713B Processor

HARDWARE FEATURES                  INTERNAL CLOCK SOURCE      C6713B (FLOATING-POINT DSP)
                                                              GDP/ZDP                       PYP

Peripherals
  (Not all peripheral pins are available at the same time. For more details, see the
  Device Configurations section. Peripheral performance is dependent on chip-level configuration.)
  EMIF                             SYSCLK3 or ECLKIN          1 (32 bit)                    1 (16 bit)
  EDMA (16 Channels)               CPU clock frequency        1                             1
  HPI (16 bit)                     SYSCLK2                    1                             1
  McASPs                           AUXCLK, SYSCLK2†           2                             2
  I2Cs                             SYSCLK2                    2                             2
  McBSPs                           SYSCLK2                    2                             2
  32-Bit Timers                    1/2 of SYSCLK2             2                             2
  GPIO Module                      SYSCLK2                    1                             1

On-Chip Memory
  Size (Bytes)                                                264K                          264K
  Organization                                                4K-Byte (4KB) L1 Program (L1P) Cache
                                                              4KB L1 Data (L1D) Cache
                                                              64KB Unified L2 Cache/Mapped RAM
                                                              192KB L2 Mapped RAM

CPU ID+CPU Rev ID                  Control Status Register    0x0203                        0x0203
                                   (CSR.[31:16])

BSDL File                          For the C6713B BSDL file, contact your Field Sales Representative.

Frequency (MHz)                                               300, 225, 200                 225, 200, 167

Cycle Time (ns)                                               3.3 ns (GDP-300, ZDP-300)     5 ns (PYP-200)
                                                              4.4 ns (GDP-225, ZDP-225)     4.4 ns (PYP-225)
                                                              5 ns (GDPA-200, ZDPA-200)     6 ns (PYPA-167)
                                                                                            5 ns (PYPA-200)

Voltage
  Core (V)                                                    1.20‡ V                       1.2 V
                                                              1.4 V (−300)
  I/O (V)                                                     3.3 V                         3.3 V

Clock Generator Options
  Prescaler                                                   /1, /2, /3, ..., /32
  Multiplier                                                  x4, x5, x6, ..., x25
  Postscaler                                                  /1, /2, /3, ..., /32

Packages
  27 x 27 mm                                                  272-Ball BGA (GDP)            −
                                                              272-Ball BGA (ZDP)
  28 x 28 mm                                                  −                             208-Pin PowerPAD PQFP (PYP)

Process Technology (µm)                                       0.13                          0.13

Product Status                                                PD§                           PD§
  (Product Preview (PP), Advance Information (AI), Production Data (PD))
† AUXCLK is the McASP internal high-frequency clock source for serial transfers. SYSCLK2 is the McASP system clock used for the clock
check (high-frequency) circuit.
‡ This value is compatible with existing 1.26-V designs.
§ PRODUCTION DATA information is current as of publication date. Products conform to specifications per the terms of Texas Instruments
standard warranty. Production processing does not necessarily include testing of all parameters.

C67x is a trademark of Texas Instruments.



 
   

functional block and CPU (DSP core) diagram

[Figure: functional block and CPU (DSP core) diagram of the digital signal processor. Blocks shown:
- Peripherals, connected through pin multiplexing: EMIF (32-bit), McASP1, McASP0, McBSP1, McBSP0, I2C1, I2C0, Timer 1, Timer 0, GPIO, HPI (16-bit)
- Enhanced DMA controller (16 channel)
- L1P cache (direct mapped, 4K bytes total)
- L1D cache (2-way set associative, 4K bytes)
- L2 cache/memory (4 banks, 64K bytes total, up to 4-way) and L2 memory (192K bytes)
- C67x CPU: instruction fetch, instruction dispatch, instruction decode; data path A with the A register file and units .L1†, .S1†, .M1†, .D1; data path B with the B register file and units .D2, .M2†, .S2†, .L2†; control registers, control logic, test, in-circuit emulation, interrupt control
- Clock generator and PLL (x4 through x25 multiplier, /1 through /32 dividers)
- Power-down logic]

† In addition to fixed-point instructions, these functional units execute floating-point instructions.

EMIF interfaces to:
−SDRAM
−SBSRAM
−SRAM,
−ROM/Flash, and
−I/O devices

McBSPs interface to:
−SPI Control Port
−High-Speed TDM Codecs
−AC97 Codecs
−Serial EEPROM

McASPs interface to:
−I2S Multichannel ADC, DAC, Codec, DIR
−DIT: Multiple Outputs



 
   

CPU (DSP core) description


The TMS320C6713B floating-point digital signal processor is based on the C67x CPU. The CPU fetches
advanced very-long instruction words (VLIW) (256 bits wide) to supply up to eight 32-bit instructions to the eight
functional units during every clock cycle. The VLIW architecture features controls by which all eight units do not
have to be supplied with instructions if they are not ready to execute. The least significant bit of every 32-bit instruction
determines whether the next instruction belongs to the same execute packet as that instruction, or whether
it should be executed in the following clock cycle as part of the next execute packet. Fetch packets are always 256
bits wide; however, the execute packets can vary in size. The variable-length execute packets are a key
memory-saving feature, distinguishing the C67x CPU from other VLIW architectures.
The CPU features two sets of functional units. Each set contains four units and a register file. One set contains
functional units .L1, .S1, .M1, and .D1; the other set contains units .D2, .M2, .S2, and .L2. The two register files
each contain 16 32-bit registers for a total of 32 general-purpose registers. The two sets of functional units, along
with two register files, compose sides A and B of the CPU (see the functional block and CPU diagram and
Figure 1). The four functional units on each side of the CPU can freely share the 16 registers belonging to that
side. Additionally, each side features a single data bus connected to all the registers on the other side, by which
the two sets of functional units can access data from the register files on the opposite side. While register access
by functional units on the same side of the CPU as the register file can service all the units in a single clock cycle,
register access using the register file across the CPU supports one read and one write per cycle.
The C67x CPU executes all C62x instructions. In addition to C62x fixed-point instructions, six of the eight
functional units (.L1, .S1, .M1, .M2, .S2, and .L2) also execute floating-point instructions. The remaining two
functional units (.D1 and .D2) also execute the new LDDW instruction, which loads 64 bits per CPU side for a
total of 128 bits per cycle.
Another key feature of the C67x CPU is the load/store architecture, where all instructions operate on registers
(as opposed to data in memory). Two sets of data-addressing units (.D1 and .D2) are responsible for all data
transfers between the register files and the memory. The data address driven by the .D units allows data
addresses generated from one register file to be used to load or store data to or from the other register file. The
C67x CPU supports a variety of indirect addressing modes using either linear- or circular-addressing modes
with 5- or 15-bit offsets. All instructions are conditional, and most can access any one of the 32 registers. Some
registers, however, are singled out to support specific addressing or to hold the condition for conditional
instructions (if the condition is not automatically “true”). The two .M functional units are dedicated to multiplies.
The two .S and .L functional units perform a general set of arithmetic, logical, and branch functions with results
available every clock cycle.
The processing flow begins when a 256-bit-wide instruction fetch packet is fetched from a program memory.
The 32-bit instructions destined for the individual functional units are “linked” together by “1” bits in the least
significant bit (LSB) position of the instructions. The instructions that are “chained” together for simultaneous
execution (up to eight in total) compose an execute packet. A “0” in the LSB of an instruction breaks the chain,
effectively placing the instructions that follow it in the next execute packet. If an execute packet crosses the
fetch-packet boundary (256 bits wide), the assembler places it in the next fetch packet, while the remainder of
the current fetch packet is padded with NOP instructions. The number of execute packets within a fetch packet
can vary from one to eight. Execute packets are dispatched to their respective functional units at the rate of one
per clock cycle and the next 256-bit fetch packet is not fetched until all the execute packets from the current fetch
packet have been dispatched. After decoding, the instructions simultaneously drive all active functional units
for a maximum execution rate of eight instructions every clock cycle. While most results are stored in 32-bit
registers, they can be subsequently moved to memory as bytes or half-words as well. All load and store
instructions are byte-, half-word, or word-addressable.
