On-Chip Peripherals TM4C UNIT-II
In little-endian machines such as the Intel x86, the low-order byte is stored at the lower
address and the high-order byte at the higher address.
Ex: Intel's 8086, 80x86, TI's MSP430
In big-endian machines such as the Motorola MC680x0, the high-order byte is stored at the
lower address and the low-order byte at the higher address.
Ex: Motorola MC680x0, Freescale HCS08
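The two byte orders described above can be illustrated with a short sketch. It uses Python's standard struct module to lay out one 32-bit value both ways; the value 0x12345678 and the helper name show are arbitrary choices made for this illustration.

```python
import struct
import sys

value = 0x12345678  # arbitrary 32-bit example value

little = struct.pack("<I", value)  # "<" = little-endian: low-order byte first
big = struct.pack(">I", value)     # ">" = big-endian: high-order byte first


def show(label, data):
    """Print each byte next to its offset from the base address."""
    print(label)
    for offset, byte in enumerate(data):
        print(f"  base+{offset}: 0x{byte:02X}")


show("little-endian:", little)
show("big-endian:", big)
print("byte order of the machine running this script:", sys.byteorder)
```

In the little-endian layout the byte 0x78 sits at the lowest address, while in the big-endian layout the byte 0x12 does.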
ARM Architecture:
ARM cores are designed specifically for embedded systems. The needs of embedded
systems can be satisfied only if features of RISC and CISC are considered together for
processor design. So ARM architecture is not a pure RISC architecture. It has a blend of both
RISC and CISC features.
Figure: Performance and capability graph of Classic ARM and Cortex embedded processors.
Figure: Performance and capability graph of Classic ARM and Cortex application processors.
The ARM architecture has evolved considerably along the road map from classic ARM to ARM
Cortex. The above figures compare the performance and capability of classic ARM processors with
the Cortex embedded and Cortex application series of processors. Although ARM had earlier
architecture versions (v1, v2, v3 and v4), the classic group of ARM starts with v4T. The classic group is
divided into four basic families: ARM7, ARM9, ARM10 and ARM11.
ARM7 has a three-stage (fetch, decode, execute) pipeline and a von Neumann architecture in which
instructions and data share the same bus. It executes the v4T instruction set. T stands for Thumb.
ARM9 has a five-stage (fetch, decode, execute, memory, write) pipeline with higher
performance and a Harvard architecture with separate instruction and data buses. ARM9 executes
the v4T and v5TE instruction sets. E stands for enhanced instructions.
ARM10 has a six-stage (fetch, issue, decode, execute, memory, write) pipeline with an optional
vector floating point unit that delivers high floating point performance. ARM10 executes
the v5TE instruction set.
ARM11 has an eight-stage pipeline with high performance and power efficiency, and it executes the v6
instruction set. With the addition of a vector floating point unit, it performs fast floating point
operations.
Nomenclature:
An ARM processor implementation is described by the product nomenclature given below:
ARM [x][y][z][T][D][M][I][E][J][F][-S]
x - Family
y - Memory management
z - Cache size
T - Thumb state
D - JTAG Debug option
M - Fast multiplier
I - Embedded ICE Macrocell
E - Enhanced instructions
J - Jazelle state (Java)
F - Vector floating point unit
S - Synthesizable version
Referring to the nomenclature, ARM7TDMI can be understood as an ARM7 processor with
Thumb implementation, JTAG debug, fast multiplier and Embedded ICE macrocell. Similarly, ARM926EJ-S is an
ARM9 processor with MMU and cache implementation, enhanced instructions, Jazelle state and a
synthesizable core.
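As a rough illustration of how the nomenclature is read, the sketch below splits a classic ARM product name into its parts. The function name decode_arm_name is hypothetical, and the rule that the last two digits are y and z whenever three or more digits are present is an assumption made for this sketch, not an official ARM rule.

```python
import re

# Suffix letters and their meanings, copied from the nomenclature list above.
FEATURES = {
    "T": "Thumb state",
    "D": "JTAG Debug option",
    "M": "Fast multiplier",
    "I": "Embedded ICE Macrocell",
    "E": "Enhanced instructions",
    "J": "Jazelle state (Java)",
    "F": "Vector floating point unit",
}


def decode_arm_name(name):
    """Split a classic ARM product name into family, y/z digits and features."""
    match = re.fullmatch(r"ARM(\d+)([TDMIEJF]*)(-S)?", name)
    if match is None:
        raise ValueError(f"not a classic ARM product name: {name!r}")
    digits, letters, synth = match.groups()
    # Assumption: with three or more digits, the last two are y and z.
    if len(digits) >= 3:
        family, y, z = digits[:-2], digits[-2], digits[-1]
    else:
        family, y, z = digits, None, None
    features = [FEATURES[ch] for ch in letters]
    if synth:
        features.append("Synthesizable version")
    return {"family": "ARM" + family, "y": y, "z": z, "features": features}


for product in ("ARM7TDMI", "ARM926EJ-S"):
    print(product, "->", decode_arm_name(product))
```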
Bus Matrix:
The processor contains a bus matrix that arbitrates memory accesses from the processor core and the optional
Debug Access Port (DAP) to the external memory system, the internal
System Control Space and the various debug components. It arbitrates requests from the different
bus masters in the system. The bus matrix is connected to the code interface for accessing
code memory, to the SRAM and peripheral interface for data memory and other peripherals, and to the
optional MPU for managing different memory regions.
Debug Access Port (DAP):
The DAP, an implementation of the ARM debug interface, enables debug access to various
master ports on the ARM SoC. It provides system access for the debugger tool using AHB-AP,
APB-AP and JTAG-AP without halting the processor. The Embedded Trace Macrocell
(ETM) generates instruction trace. The Instrumentation Trace Macrocell (ITM) allows
software-generated debug messages and also generates timestamp information. The Data Watchpoint and
Trace (DWT) unit can be used to generate data trace, event trace and profiling trace
information. The Flash Patch and Breakpoint (FPB) unit implements hardware breakpoints and patches
code and data from Code space to System space. The Serial Wire Viewer (SWV) is a one-bit ETM
port. SWV provides different types of information such as program counter values, data read and
write cycles, peripheral values, event counters and exceptions.
The Cortex-M4 architecture provides an optional FPU that is IEEE 754 single-precision
compliant. The core instruction set supports various signal processing operations. It executes
single instruction multiple data (SIMD) instructions with 16-bit data types. The floating point
core supports addition, multiplication and hardware division. The processor has a 32x32 multiply and
accumulate (MAC) unit that produces 64-bit results. Embedded signal processing applications
that involve data compression, statistical signal processing, or measuring, filtering and
compressing real-world analog signals can use the Cortex-M4 with FPU.
Conversions between fixed point and floating point data formats, and instructions with
floating point immediate data.
Saturation math.
Decoupled three-stage pipeline.
Three modes of operation: full-compliance mode, flush-to-zero mode and default NaN
mode.
Can be disabled when not in use to conserve energy.
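Two of the signal processing ideas mentioned above, saturation math and a 32x32 multiply and accumulate with a 64-bit result, can be modelled in a few lines. This is only a behavioural sketch in Python, not the processor's instructions; the function names saturating_add32 and mac_32x32_64 are made up for this illustration.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def saturating_add32(a, b):
    """Add two signed 32-bit values, clamping at the limits instead of wrapping."""
    return max(INT32_MIN, min(INT32_MAX, a + b))


def mac_32x32_64(acc, a, b):
    """Add the product of two signed 32-bit values to a signed 64-bit accumulator."""
    total = (acc + a * b) & ((1 << 64) - 1)  # keep only the low 64 bits
    # Reinterpret the 64-bit pattern as a signed (two's complement) value.
    return total - (1 << 64) if total >= (1 << 63) else total


print(saturating_add32(INT32_MAX, 1) == INT32_MAX)  # clamps at the upper limit
print(mac_32x32_64(10, -3, 4))                      # 10 + (-3 * 4)
```

With ordinary wrap-around arithmetic the first addition would overflow to a large negative number; saturation keeps the result at the largest representable value, which is usually the less harmful outcome in signal processing.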
Introduction to the TM4C family viz. TM4C123x & TM4C129x and its targeted applications:
TIVA TM4C123GH6PM Microcontroller:
Features:
The TM4C123GH6PM microcontroller has a 32-bit ARM Cortex-M4 CPU core with an 80 MHz
clock rate.
Memory protection unit (MPU) provides protected operating system functionality and
floating point unit (FPU) supports IEEE 754 single-precision operations.
JTAG/SWD/ETM for serial wire debug and trace (SWD/T).
Nested vectored interrupt controller (NVIC) reduces interrupt response latency.
System control block (SCB) holds the system configuration information.
The microcontroller has the following memory integrated in it: 256 KB flash memory, 32 KB
SRAM, 2 KB EEPROM and ROM loaded with the TivaWare software library and boot loader.
Serial communication peripherals: 2 CAN controllers, a full-speed USB
controller, 8 UARTs, 4 I2C modules and 4 Synchronous Serial Interface (SSI) modules.
An on-chip voltage regulator, two analog comparators and two 12-channel 12-bit analog to
digital converters with a sample rate of 1 million samples per second (1 MSPS) are the analog
functions built into the device.
Two quadrature encoder interface (QEI) modules with index and two PWM modules
are the advanced motion control functions integrated into the device that facilitate
wheel and motor control.
Various system functions integrated into the device are: a Direct Memory Access
controller, clock and reset circuitry with a 16 MHz precision oscillator, six 32-bit timers,
six 64-bit timers, twelve 32/64-bit capture compare PWM (CCP) pins, a battery-backed
hibernation module with real-time clock (RTC), 2 watchdog timers and 43 GPIOs.
Few Applications:
TIVA TM4C129CNCZAD Microcontroller:
Features:
The TM4C129CNCZAD microcontroller has a 32-bit ARM Cortex-M4F CPU core with a 120
MHz clock rate.
Memory protection unit (MPU) provides a privileged mode for protected operating
system functionality and floating point unit (FPU) supports IEEE 754 compliant
single-precision operations.
JTAG/SWD/ETM for serial wire debug and trace.
Nested vectored interrupt controller (NVIC) reduces interrupt response latency and provides
high-performance interrupt handling for time-critical applications.
The microcontroller has the following memory integrated in it: 1 MB flash memory, 256 KB
SRAM, 6 KB EEPROM and ROM loaded with the TivaWare software library and boot
loader.
Serial communication peripherals: 2 CAN controllers, a full-speed and high-speed
USB controller, 8 UARTs, 10 I2C modules and 4 Synchronous Serial Interface
(SSI) modules.
An on-chip voltage regulator, three analog comparators, two 12-channel 12-bit analog
to digital converters with a sample rate of 2 million samples per second (2 MSPS) and a
temperature sensor are the analog functions built into the device.
One quadrature encoder interface (QEI) and one PWM module with 8 PWM outputs
are the advanced motion control functions integrated into the device that facilitate
wheel and motor control.
Various system functions integrated into the device are: a Micro Direct Memory Access
controller, clock and reset circuitry with a 16 MHz precision oscillator, eight 32-bit
timers, a low-power battery-backed hibernation module with real-time clock (RTC),
2 watchdog timers and 140 GPIOs.
Cyclic Redundancy Check (CRC) computation module is used for message transfer and
safety system checks. The CRC module can be used in combination with the AES and DES
modules.
Advanced Encryption Standard (AES) and Data Encryption Standard (DES) accelerator
modules provide hardware-accelerated data encryption and decryption functions.
Secure Hash Algorithm/Message Digest Algorithm (SHA/MD5) module provides
hardware-accelerated hash functions for secured data applications.
Bit-banded on-chip SRAM: 32 KB, 0x20000000 to 0x20007FFF
SRAM:
The on-chip SRAM starts at address 0x2000.0000 of the device memory map. To reduce the
number of read-modify-write (RMW) operations, ARM provides a technology called
bit-banding. It aliases each bit of the SRAM and peripheral regions to its own word address, so that
an individual bit can be accessed in a single atomic operation. For SRAM, the bit-band
base is located at address 0x2200.0000. Bit-band aliases are computed according to the following
formula.
Bit-band alias = bit-band base + (byte offset * 32) + (bit number * 4)
Note: Bit-banding is a technique for accessing and modifying individual bits in a register. It
lets a read-modify-write operation complete as a single atomic access.
The region of memory whose bits are to be modified is known as the bit-band
region, and the region of memory onto which the device maps those bits is known
as the bit-band alias.
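The alias formula above can be checked with a small calculation. The sketch below uses the SRAM base (0x2000.0000) and bit-band base (0x2200.0000) given in the text; the function name bitband_alias and the example address 0x2000.1000 are choices made for this illustration.

```python
SRAM_BASE = 0x20000000          # start of the on-chip SRAM (bit-band region)
SRAM_BITBAND_BASE = 0x22000000  # start of the SRAM bit-band alias region


def bitband_alias(byte_address, bit_number,
                  region_base=SRAM_BASE, alias_base=SRAM_BITBAND_BASE):
    """Return the alias word address for one bit of a byte in the bit-band region."""
    if not 0 <= bit_number <= 7:
        raise ValueError("bit_number selects a bit within one byte (0 to 7)")
    byte_offset = byte_address - region_base
    if byte_offset < 0:
        raise ValueError("address lies below the bit-band region")
    return alias_base + byte_offset * 32 + bit_number * 4


# Bit 3 of the byte at 0x2000.1000:
# 0x2200.0000 + (0x1000 * 32) + (3 * 4) = 0x2202.000C
print(hex(bitband_alias(0x20001000, 3)))
```

Writing 1 or 0 to the resulting alias address sets or clears only that one bit, so no separate read, modify and write sequence is needed in software.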
The SRAM is implemented using two 32-bit wide SRAM banks (separate SRAM
arrays). The banks are partitioned so that one bank contains all even words (the even
bank) and the other contains all odd words (the odd bank). A write access that is followed
immediately by a read access to the same bank incurs a stall of a single clock cycle.
Internal ROM:
The internal ROM of the TM4C123GH6PM device is located at address 0x0100.0000
of the device memory map. The ROM contains:
TivaWare™ Boot Loader and vector table
TivaWare™ Peripheral Driver Library (DriverLib) release of product-specific
peripherals and interfaces
Advanced Encryption Standard (AES) cryptography tables
Cyclic Redundancy Check (CRC) error detection functionality
The boot loader is used as an initial program loader (when the Flash memory is
empty) as well as an application-initiated firmware upgrade mechanism (by calling back to
the boot loader). The Peripheral Driver Library APIs in ROM can be called by applications,
reducing flash memory requirements and freeing the Flash memory to be used for other
purposes (such as additional features in the application). Advanced Encryption Standard (AES)
is a publicly defined encryption standard used by the U.S. Government, and Cyclic
Redundancy Check (CRC) is a technique to validate whether a block of data has the same contents
as when previously checked.
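The idea behind a CRC check, validating that a block of data is unchanged, can be sketched as follows. The example uses the CRC-32 routine from Python's standard zlib module; this is not claimed to be the same polynomial or API as the ROM functions of the device, and the data block and the helper name is_unchanged are hypothetical.

```python
import zlib

block = b"firmware image contents"  # hypothetical data block
stored_crc = zlib.crc32(block)      # CRC recorded while the block is known to be good


def is_unchanged(data, expected_crc):
    """Recompute the CRC and compare it with the value stored earlier."""
    return zlib.crc32(data) == expected_crc


corrupted = bytearray(block)
corrupted[0] ^= 0x01                # flip a single bit to simulate corruption

print(is_unchanged(block, stored_crc))             # the block still matches
print(is_unchanged(bytes(corrupted), stored_crc))  # the single-bit error is detected
```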
Peripheral:
All peripheral devices, including timers and ADCs, are memory-mapped I/O (MMIO) in the address space
0x40000000 to 0x400FFFFF. Since the number of supported peripherals differs among
ICs of the ARM families, the upper limit of 0x400FFFFF varies from device to device.
Private Peripheral Bus (PPB):
The Private Peripheral Bus is used for system peripherals such as the System Timer (SysTick),
Nested Vectored Interrupt Controller (NVIC), System Control Block (SCB), Memory
Protection Unit (MPU) and Floating Point Unit (FPU).
R13: Used as the stack pointer that holds the address of the top of the stack in the
current processor mode.
R14: Used as the link register that saves the content of the program counter when control is
transferred by an exception or by a branch-with-link instruction in the program.
R15: Used as the program counter that points to the next instruction to be executed. In
ARM state, all instructions are 32 bits (four bytes) long, so the PC is always aligned to a
word boundary. This means that the least significant two bits of the PC are always zero. The
PC can also be halfword (16-bit) aligned for Thumb state (16-bit instructions) or byte aligned
for Jazelle state (8-bit instructions), as supported by different versions of the ARM architecture.
Programming Model:
The programming model of a processor is basically the set of working registers used to
perform the operations defined in its instruction set. The ARM programming model has a total of 37
registers in its register bank, which are segmented for the different modes of operation as shown
in the figure below. The user mode register set is also shared by system mode.
Each of the remaining privileged modes has a set of banked registers which are active
and accessible to the programmer only when the core enters the corresponding mode.
The banked registers for a particular mode are physical replicas of a few of the user mode
registers, along with a saved program status register (SPSR), shown by shading in the figure.
If the processor mode changes, for example from user to FIQ mode due to the
occurrence of a fast hardware interrupt (FIQ), the banked registers R8-R14 of FIQ mode
replace the corresponding user mode registers, but the remaining user mode registers (R0-R7)
can still be used in FIQ mode after saving their previous contents.
This means that registers R8-R14 of user mode are unaffected by the mode change. The
purpose of the banked registers is to reduce the context saving overhead. There is only one
dedicated PC (R15) and one CPSR for all the operating modes.
When a mode changes because of an exception, the PC and CPSR contents are saved in the link register
(R14) and the SPSR of the new mode respectively. On returning to the previous mode,
special instructions are used to restore the saved register contents. There is no SPSR
available in user mode. One important point is that when a mode change is forced by software,
the CPSR content is not saved in the SPSR; that happens only when an exception occurs.
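The figure of 37 registers can be verified by counting. The text above only spells out the FIQ bank (R8-R14 plus an SPSR); the sketch below additionally assumes the usual classic ARM arrangement in which the IRQ, Supervisor, Abort and Undefined modes each bank R13, R14 and an SPSR. The register naming with mode suffixes and the function visible_registers are illustrative choices.

```python
# Registers seen in user and system mode: R0-R15 plus the CPSR.
USER_REGISTERS = [f"R{n}" for n in range(16)] + ["CPSR"]

# Banked registers per privileged mode. Only the FIQ bank is stated in the
# text; the other four entries are an assumption of this sketch.
BANKED = {
    "FIQ": [f"R{n}_fiq" for n in range(8, 15)] + ["SPSR_fiq"],
    "IRQ": ["R13_irq", "R14_irq", "SPSR_irq"],
    "Supervisor": ["R13_svc", "R14_svc", "SPSR_svc"],
    "Abort": ["R13_abt", "R14_abt", "SPSR_abt"],
    "Undefined": ["R13_und", "R14_und", "SPSR_und"],
}

total = len(USER_REGISTERS) + sum(len(regs) for regs in BANKED.values())


def visible_registers(mode):
    """List the registers a programmer sees in the given mode."""
    if mode in ("User", "System"):
        return list(USER_REGISTERS)
    banked = BANKED[mode]
    replaced = {name.split("_")[0] for name in banked}  # e.g. "R8", "R13"
    shared = [reg for reg in USER_REGISTERS if reg not in replaced]
    return shared + banked


print("total registers:", total)
print("FIQ mode view:", visible_registers("FIQ"))
```

The count is 17 (R0-R15 and CPSR) + 8 (FIQ bank) + 4 x 3 (the other banked modes) = 37, and the FIQ view shows R0-R7 still shared while R8-R14 come from the FIQ bank.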
Addressing Modes:
Addressing mode is the way of addressing data or operand in the instruction. Every
processor instruction set offers different addressing modes to determine the address of
operands. Some fundamental addressing modes used by most of the processors are: register
addressing, immediate addressing, direct addressing and register indirect addressing. In
register addressing mode, the operand is held in a register which is specified in the
instruction. In immediate addressing mode, the operand is held in the instruction. In direct
addressing mode, the operand resides in the memory whose address is specified in the
instruction. Similarly in register indirect addressing mode, the operand is held in the memory
whose address resides in a register that is specified in the instruction.
The ARM supports the following addressing modes:
1) Register Addressing Mode
2) Relative Addressing Mode
3) Immediate Addressing Mode
4) Register Indirect Addressing
5) Register Offset Addressing Mode
6) Register based with Offset Addressing Mode
Pre-Indexed Addressing
Pre-Indexed with write back
Post-Indexed
1) Register Addressing: The operands are in registers.
Ex: MOV R1, R2 // Move the content of R2 to R1
SUB R0, R1, R2 // Subtract the content of R2 from R1 and move the result to R0
2) Relative Addressing: The target address is specified in the instruction as an offset relative
to the program counter.
Ex: BEQ LOOP // Branch to LOOP if the previous instruction set the zero flag, i.e., Z=1
3) Immediate Addressing: Operand2 is an immediate value.
Ex: SUB R0, R0, #1 // Save (R0 - 1) to R0
MOV R0, #0xFF00 // Put 0xFF00 into R0
4) Register Indirect Addressing: The address of the memory location that holds the operand
is in a register.
Ex: LDR R1, [R2] // Load R1 with the data pointed to by register R2
LDR R3, [R2] // ARM is a load/store architecture, so the data pointed to by R2 is loaded first
ADD R0, R1, R3 // Then add it to R1 and put the result into R0
5) Register Offset Addressing: Operand2 is a register whose value is shifted before it is used.
Ex: MOV R0, R2, LSL #3 // (R2 << 3), then move to R0
AND R0, R1, R2, LSR R3 // (R2 >> R3), logically AND with R1 and move the result
to R0
6) Register based with Offset Addressing: The effective memory address is calculated
from a base address and an offset. The offset can be an immediate offset, a register offset or
a scaled register offset.
1) Pre-Indexed Addressing
Ex: LDR R2, [R3, #8] // Immediate offset // Take the value in R3, add 8, use it as the address and load the data
from that address to R2
STR R1, [R0, -R2] // Register offset // Use (R0-R2) as the address of the memory and
store the data of R1 to that address.
LDR R3, [R1, R2, LSR #8] // Scaled register offset // Use (R1+ (R2>>8)) as the address
and load the data from that address to R3.
2) Pre-Indexed with write back, also called auto-indexing with pre-indexed addressing.
The symbol ! indicates that the instruction saves the calculated address in the base address
register.
Ex: LDR R0, [R1, #4]! // Immediate offset // Use (R1+4) as the address, load the data
from that address to R0 and update R1 to (R1+4)
STR R1, [R2, R0]! // Register offset // Use (R2+R0) as the address and store the data
from R1 to that address. Update R2 to (R2+R0)
STR R3, [R1, R2, LSL #4]! // Scaled register offset // Use (R1+ (R2<<4)) as the address
and store the data from R3 to that address. Update R1 to (R1+ (R2<<4))
3) Post-Indexed, also called auto-indexing with post-indexed addressing.
Ex: LDR R0, [R1], #4 // Immediate offset // Load the data pointed to by R1 to R0 and
then update R1 to (R1+4).
STR R1, [R3], R4 // Register offset // Store the data in R1 to the memory location
pointed to by R3 and then update R3 to (R3+R4)
LDR R2, [R0], -R3, LSR #4 // Scaled register offset // Load the data from the address
pointed to by R0 to R2 and then update R0 to (R0-(R3>>4)).
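The difference between the three indexed modes is easiest to see by tracking two things: the address actually used for the memory access and the value left in the base register afterwards. The sketch below models only that address arithmetic in Python; the function names and the register values are made up for this illustration, and no memory access is simulated.

```python
MASK32 = 0xFFFFFFFF  # addresses and registers are 32 bits wide


def pre_indexed(base, offset):
    """[Rn, offset]: access at base+offset, base register unchanged."""
    return (base + offset) & MASK32, base


def pre_indexed_writeback(base, offset):
    """[Rn, offset]!: access at base+offset, base register updated to that address."""
    address = (base + offset) & MASK32
    return address, address


def post_indexed(base, offset):
    """[Rn], offset: access at base, base register then updated to base+offset."""
    return base, (base + offset) & MASK32


r1 = 0x1000  # hypothetical base register value
for mode in (pre_indexed, pre_indexed_writeback, post_indexed):
    address, new_base = mode(r1, 4)
    print(f"{mode.__name__}: access at {address:#x}, base becomes {new_base:#x}")

# Scaled register offset, as in LDR R2, [R0], -R3, LSR #4
r0, r3 = 0x2000, 0x100
print([hex(v) for v in post_indexed(r0, -(r3 >> 4))])
```

Each function returns the pair (access address, new base value), so the three "offset of 4" cases give (0x1004, 0x1000), (0x1004, 0x1004) and (0x1000, 0x1004) respectively.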