DSP Processor Architecture: TMS 320C67XX Blackfin Processor On Chip Resources and Programming Considerations

The document discusses the TMS320C67x Blackfin processor. It describes the processor's VLIW architecture and how it supports floating-point arithmetic. Details are given about the various memory sizes and peripherals available on different C67x models. Programming considerations for the C67x like using intrinsics and optimizations are outlined. SIMD instructions allow it to perform twice as fast as the C62x processor.

Uploaded by

KAMRAN12345786

DSP Processor Architecture:

TMS 320C67XX Blackfin Processor


On chip resources and Programming
Considerations
By: Kamran H. Pathan
Roll No: 9

Summary
The TMS320C67x is a family of 32-bit floating-point DSP processors.
Its architecture is based on VLIW, similar to that of the
fixed-point TMS320C62x and TMS320C64x processors.
The TMS320C67x extends the TMS320C62x instruction set to
support floating-point arithmetic.
Hence the C67x is upward compatible with the C62x, but not with the C64x.
The C67x has high precision and a large dynamic range, which suits
applications such as radar, sonar, 3-D graphics, wireless
base stations and medical imaging.
The C67x processor can execute up to 8 instructions per cycle.
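
The "large dynamic range" point can be made concrete with a rough calculation. The sketch below is only an illustration: it assumes IEEE single precision as the C67x floating-point format and compares it with a 16-bit Q15 fixed-point format, which is an example chosen here and not something stated in the slides. The helper names are hypothetical.

```c
#include <float.h>

/* 20*log10(2): dB gained for every doubling of the representable range. */
#define DB_PER_BIT 6.0206

/* Approximate dynamic range in dB of a format spanning `bits` powers of two. */
static double dynamic_range_db(int bits)
{
    return DB_PER_BIT * (double)bits;
}

/* Powers of two between the smallest and largest normalized float.
   FLT_MIN is 2^(FLT_MIN_EXP - 1); FLT_MAX is just under 2^FLT_MAX_EXP. */
static int float_range_bits(void)
{
    return FLT_MAX_EXP - (FLT_MIN_EXP - 1);
}
```

Calling dynamic_range_db(15) gives roughly 90 dB for Q15, while dynamic_range_db(float_range_bits()) gives roughly 1529 dB for single precision, counting normalized values only.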

On Chip Resources
On chip resources include memory, peripherals and external
memory interfaces.
The table below shows the on-chip resources for the TMS320C67x.

Processor  On-chip data memory  On-chip program memory  Peripherals and external-memory interfaces
C6701      16K x 32             16K x 32                4-channel DMA, 16-bit HPI, 2 BSP, 2 timers, and 2 32-bit EMIF
C6711      32K bits L1 cache    32K bits L1 cache       16-channel enhanced DMA, 16-bit HPI, 2 BSP, 2 timers, and 2 32-bit external memory interface
C6712      32K bits L1 cache    32K bits L1 cache       16-channel enhanced DMA, 16-bit HPI, 2 serial ports, 2 timers and 16-bit EMIF
C6713      4K byte L1 cache     4K byte L1 cache        16-channel enhanced DMA, 16-bit HPI, 2 McBSPs, 2 timers and a 32-bit EMIF
Programming Considerations
Writing correct and efficient assembly code for the C67x processor
can be a very challenging task because of the complex architecture and
deep pipeline.
Therefore, programming in C is highly recommended for the C67x.
The user may also write the code in linear assembly (using the .sa
extension), which is assembly code in which registers have not yet been allocated.
The assembly optimizer assigns registers, inserts NOP instructions
automatically, and applies loop optimization before passing the code
to the assembler and linker.
In addition, using intrinsics in C code can further enhance the
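
As a rough illustration of C code written with the optimizer in mind, the sketch below uses a hypothetical kernel (saxpy, not taken from the slides). It assumes a C99 compiler for `restrict`, and it assumes the TI C6000 compiler's MUST_ITERATE pragma, guarded by a TI-specific macro so that other compilers skip it.

```c
#include <assert.h>

/* Hypothetical example kernel: y[i] = a * x[i] + y[i].
   `restrict` promises the compiler that x and y do not overlap,
   which gives it more freedom to schedule loads and stores in parallel. */
void saxpy(float a, const float *restrict x, float *restrict y, int n)
{
    int i;

    assert(n >= 1);   /* the trip-count hint below relies on this */

#ifdef __TI_COMPILER_VERSION__
    /* Assumed TI-specific hint: the loop runs at least once. */
#pragma MUST_ITERATE(1)
#endif
    for (i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

The loop body is kept free of function calls and branches. That is a common guideline for loops the compiler is expected to pipeline, and it is applied here as a design choice, not as a rule from the slides.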

Programming Considerations
Like the C62x, the C67x processor uses optimization
methods such as parallel optimization, filling delay slots, loop
unrolling, and SIMD optimization.
SIMD optimization is further enhanced in the C67x processor
by its long (64-bit) data path.
For example, the C67x processor can perform LDDW, which reads
64 bits of data into a register pair.
It can read two words or four short words, thus accessing two
single-precision floating-point values at a time.
C67x instructions also perform 2 32 x 32-bit or 4 16 x 16-bit
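
The idea of fetching two single-precision values per load can be sketched in portable C. The names float_pair, load_pair and dot_pairs are hypothetical, and the 8-byte copy only stands in for a 64-bit load such as LDDW. Whether a given compiler actually emits LDDW for this code is not guaranteed. TI's compiler also offers intrinsics for aligned 8-byte accesses, which are left out here so the sketch stays portable.

```c
#include <assert.h>
#include <string.h>

/* Two single-precision values that together fill one 64-bit double word
   (assumes a 4-byte float and no padding in the struct). */
typedef struct {
    float lo;
    float hi;
} float_pair;

/* Copy 8 bytes in one step; stands in for a single 64-bit load. */
static float_pair load_pair(const float *p)
{
    float_pair v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Dot product unrolled by two: each iteration consumes one pair
   from each input array. n must be even. */
float dot_pairs(const float *a, const float *b, int n)
{
    float sum_lo = 0.0f;
    float sum_hi = 0.0f;
    int i;

    assert(n % 2 == 0);

    for (i = 0; i < n; i += 2) {
        float_pair x = load_pair(&a[i]);
        float_pair y = load_pair(&b[i]);
        sum_lo += x.lo * y.lo;   /* even-indexed products */
        sum_hi += x.hi * y.hi;   /* odd-indexed products */
    }
    return sum_lo + sum_hi;
}
```

Keeping two separate accumulators makes the two multiply-accumulate chains independent, which is what lets a VLIW machine issue them in parallel.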

Conclusion
In this presentation we had an overview of the TMS320C67x DSP
processor.
The on-chip resources and memory of the various C67x processors
were compared.
Programming considerations for the TMS320C67x were highlighted.
We also saw how SIMD instructions double the performance
relative to the C62x processor.

Questions??

Thank You
