UART Vs I2C Vs SPI

The document compares the UART, SPI, and I2C communication protocols. It provides details on how UART works, including definitions of start/stop bits and baud rate. UART is asynchronous, uses only two wires (TX and RX), and supports simplex, half-duplex, and full-duplex communication. Examples where UART is used include USB-to-UART converters and Arduino/Seeeduino boards, which use UART for programming via USB.

Uploaded by

Kishore Kishu

UART vs I2C vs SPI – Communication Protocols and Uses


By yida 2 years ago
When it comes to communication protocols, UART, SPI and I2C are the most common
hardware interfaces used in microcontroller development.

This article compares the three interfaces, UART, SPI and I2C. We will look at
how each protocol works, weigh the advantages and disadvantages of each
interface, and give some examples of how these interfaces are used with
microcontrollers.

UART Interface
What is UART?
 Stands for Universal Asynchronous Receiver/Transmitter (UART)
 A simple serial communication protocol that allows the host to communicate
with an auxiliary device.
 UART supports bi-directional, asynchronous, serial data transmission.
 It has two data lines, one to transmit (TX) and another to receive (RX). On
Arduino boards these are digital pin 0 (RX) and digital pin 1 (TX).
 TX and RX are cross-connected between two devices: the TX of one goes to the
RX of the other. (eg. USB and computer)
 UART can also handle synchronization management issues between
computers and external serial devices.

How does it work?


 It can operate between devices in 3 ways:
 Simplex = data transmission in one direction
 Half-duplex = data transmission in either direction but not
simultaneously
 Full-duplex = data transmission in both directions simultaneously
 Once connected, data flows from TX of the transmitting UART to RX of the
receiving UART.
 UART is an asynchronous serial transmission protocol = no shared clock line
 The transmitting UART converts parallel data from the master device (eg. CPU)
into serial form and transmits it serially to the receiving UART, which then
converts the serial data back into parallel data for the receiving device.

Ref: Basics of UART communication.

 As UART has no clock line, the transmitting UART adds start and stop bits
to mark the start and end of each data packet.
 This helps the receiving UART know when to start and stop reading bits.
When the receiving UART detects a start bit, it reads the following bits at
the defined BAUD rate.
 UART data transmission speed is referred to as the BAUD rate. 115,200 is a
common setting (BAUD rate is the symbol rate, which for UART is the same as
the bit rate).
 Both UARTs must operate at about the same baud rate. The figure usually
quoted is that a difference of more than 10% throws the bit timing off and
renders the data unusable; in practice the margin is tighter, because the
timing error accumulates over every bit in the frame. The user must also
ensure both UARTs are configured for the same data packet structure.
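
The baud rate tolerance above is easier to judge with some arithmetic. The
sketch below is an idealised model, not a description of any particular UART:
it assumes the receiver samples each bit at its nominal centre, that a frame
is 10 bits long (start + 8 data + stop), and that a sample is still good as
long as it lands inside the correct bit. The function names are made up for
this example.

```python
def last_sample_drift(tx_baud, rx_baud, frame_bits=10):
    """Drift of the receiver's final sample point, in transmitter bit periods.

    The receiver samples bit k at (k + 0.5) of ITS OWN bit periods after the
    start edge. Expressed in transmitter bit periods that is
    (k + 0.5) * tx_baud / rx_baud, so the error grows with every bit.
    """
    ratio = tx_baud / rx_baud
    last_bit_centre = frame_bits - 0.5   # centre of the stop bit
    return last_bit_centre * (ratio - 1.0)


def frame_is_readable(tx_baud, rx_baud, frame_bits=10):
    """True if the last sample still falls inside the intended bit."""
    return abs(last_sample_drift(tx_baud, rx_baud, frame_bits)) < 0.5


if __name__ == "__main__":
    for rx in (115200, 117504, 103680):          # 0%, +2%, -10% receiver error
        drift = last_sample_drift(115200, rx)
        print(rx, round(drift, 3), frame_is_readable(115200, rx))
```

Under these assumptions the last sample has drifted by half a bit once the
mismatch reaches roughly 5%, so a 10% difference would already corrupt the
end of the frame. Real receivers have further error sources, so treat the
numbers as an upper bound rather than a design rule.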

UART Working Protocol


 A UART that is transmitting data will first receive data from a data bus that is
sent by another component (eg. CPU).
 After getting the data from the data bus, it will add a start bit, a parity bit,
and a stop bit to create the data packet.
 The data packet is then transmitted at the TX pin where the receiving UART
will read the data packet at its RX pin. Data is sent until there is no data left in
the transmitting UART.
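
The packet-building step described above can be sketched in a few lines. This
is only an illustration: it assumes 8 data bits sent least significant bit
first (the usual UART convention), one stop bit, and an idle-high line where
the start bit is 0 and the stop bit is 1. The function name
`build_uart_frame` is hypothetical.

```python
def build_uart_frame(byte, parity="even"):
    """Return the line levels for one UART frame, in transmission order."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    if parity not in ("even", "odd", "none"):
        raise ValueError("parity must be 'even', 'odd' or 'none'")

    # Data bits go out least significant bit first.
    data_bits = [(byte >> i) & 1 for i in range(8)]

    frame = [0] + data_bits                  # start bit pulls the line low
    if parity != "none":
        # Even parity: the parity bit makes the total number of 1s even.
        parity_bit = sum(data_bits) % 2
        if parity == "odd":
            parity_bit ^= 1
        frame.append(parity_bit)
    frame.append(1)                          # stop bit returns the line high
    return frame


if __name__ == "__main__":
    print(build_uart_frame(0x41))            # ASCII 'A' with even parity
```

With parity enabled the frame is 11 bits long; with `parity="none"` it is the
common 10-bit frame.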

Data Transmission and Receiving

 While data is being transmitted from the transmit FIFO, the ‘BUSY’ flag is
asserted and stays active for the whole process.
 FIFO = First in, First out. It’s a UART buffer that forces each
byte to be passed in sequence to the receiving UART.
 The ‘BUSY’ bit only becomes inactive once the FIFO is empty and every bit,
including the stop bit, has been transmitted.
 When the UART receiver is idle and the data input goes low (a start bit has
been received), the receive counter starts running and the input is sampled
on the 8th cycle of Baud16 (a clock running at 16 times the baud rate).
 If RX is still low on the 8th cycle of Baud16, the start bit is valid.
Otherwise it is treated as a false start bit and ignored.
 If the start bit is valid, data bits are sampled every 16th cycle of Baud16
based on the length of the data character. If the parity mode is enabled, the
parity bit is also detected.
 If RX is high, a valid stop bit will be acknowledged. Otherwise, a framing error
will occur. 
 When a complete data packet is received, the data is stored in the receiving
FIFO.
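
The receive steps above can be modelled with a list of RX samples taken at
16 times the baud rate. This is a simplified sketch and not the actual
hardware logic: parity is left out, only the first low sample is examined as
a possible start bit, and the names `receive_frame`, `oversample` and
`FramingError` are invented for the example.

```python
OVERSAMPLE = 16  # Baud16 cycles per bit


class FramingError(Exception):
    """Raised when the stop bit is not high."""


def oversample(bits):
    """Repeat each bit 16 times to mimic sampling the line with Baud16."""
    return [b for b in bits for _ in range(OVERSAMPLE)]


def receive_frame(samples, data_bits=8):
    """Decode one frame, or return None if no valid start bit is found."""
    if 0 not in samples:
        return None                              # line stayed idle (high)
    start = samples.index(0)                     # first low sample
    mid = start + OVERSAMPLE // 2 - 1            # 8th Baud16 cycle of start bit
    stop_idx = mid + OVERSAMPLE * (data_bits + 1)
    if stop_idx >= len(samples):
        raise ValueError("not enough samples for a complete frame")
    if samples[mid] != 0:
        return None                              # glitch: false start bit

    value = 0
    for k in range(data_bits):
        # Every 16th cycle after the start-bit check lands mid-bit.
        value |= samples[mid + OVERSAMPLE * (k + 1)] << k   # LSB first

    if samples[stop_idx] != 1:
        raise FramingError("stop bit was low")
    return value


if __name__ == "__main__":
    bits = [0] + [(0x41 >> i) & 1 for i in range(8)] + [1]
    print(hex(receive_frame([1] * 5 + oversample(bits))))
```

Sampling in the middle of each bit is what gives the receiver some tolerance
to small baud rate differences and to noise near the bit edges.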

Interrupt Control

 The goal of interrupts is to send the content of a buffer automatically.


 User can use interrupts in the event of:
 FIFO Overflow Error
 Line-break error (RX signal remains 0 for the whole frame, including
the parity and stop bits.)
 Parity error
 Frame error (Stop bit not 1)
 Receiving timeout (receiving FIFO has data but not full and
subsequent data does not transmit)
 Transmitting
 Receiving

FIFO Operation
 The UART module of the Stellaris family of ARM CPUs contains two 16-byte
FIFOs: one for transmission and one for reception.
 They can be configured to trigger interrupts at various depths. For example,
1/8, 1/4, 1/2, 3/4, and 7/8 depth.
 If the receiving FIFO is set to trigger an interrupt at 1/4 depth, a receive
interrupt is triggered when the UART has received 4 bytes.
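
The trigger depths translate into byte counts as follows. This is plain
arithmetic on the 16-byte FIFO size given above; the names `FIFO_DEPTH` and
`trigger_threshold` are only for illustration.

```python
from fractions import Fraction

FIFO_DEPTH = 16  # bytes, as stated for the Stellaris UART FIFOs


def trigger_threshold(level, depth=FIFO_DEPTH):
    """Number of bytes in the FIFO at which the interrupt fires."""
    return int(Fraction(level) * depth)


if __name__ == "__main__":
    for level in ("1/8", "1/4", "1/2", "3/4", "7/8"):
        print(level, "->", trigger_threshold(level), "bytes")
```

The 7/8 level corresponds to 14 bytes, which is where the "up to 14" figure
for data handled per interrupt comes from.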

Working process of transmitting FIFO:


1. Transmission starts as soon as data is written to the transmitting FIFO.
Sending takes time, so other data that needs to be sent can continue to be
written to the transmitting FIFO in the meantime.
2. When the transmitting FIFO is full, the program has to wait before writing
more, or the data will be lost.
3. The transmitting FIFO sends the data bit by bit until it is completely
empty. Each time a byte has been sent, a slot is freed in the transmitting
FIFO.

Working process of receiving FIFO:

1. When the hardware receives data, it is stored in the receiving FIFO. The
program retrieves the data from the receiving FIFO, which removes it and
frees up space. If the data is not read out and the receiving FIFO becomes
full, further incoming data will be lost.
2. The FIFOs address the problem of the UART interrupting the CPU too
frequently, which is inefficient. With UART communication, interrupt mode is
simpler and more efficient than polling. Without FIFOs, every byte would
trigger its own interrupt. With FIFOs, a single interrupt can cover a run of
data (up to 14 bytes), which improves the transmission and reception
efficiency.
3. As long as the FIFOs are serviced in time, no data is lost. Once the UART
is initialized, the interrupt routine does everything automatically.

Loopback

 UART has an internal loopback function for diagnostics or debugging, where
data sent from TX is received by the RX input.

Serial Infrared Protocol

 UART has an IrDA Serial Infrared (SIR) encoder/decoder module. The IrDA SIR
module translates between an asynchronous UART data stream and a half-
duplex serial SIR interface.
 It is used to provide a digital coded output and a decoded input to the
UART. The UART signal pin can be connected to an infrared transceiver for
the IrDA SIR physical layer connection.
Advantages of Using UART
 Simple to operate, well documented as it is a widely used method with a lot
of resources online
 No clock needed
 Parity bit to allow for error checking

Disadvantages of Using UART


 Size of the data frame is limited to a maximum of 9 bits
 Does not support multiple masters or multiple slaves
 Baud rates of each UART must be within 10% of each other to prevent data
loss.
 Low data transmission speeds

Examples of UART in Microcontrollers:


USB CP2102 Serial Converter

 Highly-integrated USB to UART bridge controller providing a simple solution
for updating RS-232 designs to USB using minimum components and PCB
space. It provides USB connectivity to devices with a UART interface.
 It uses a standard USB type A male connector and a 6-pin TTL connector.
 This USB CP2102 Serial Converter is a small adapter that lets an
Arduino/Seeeduino board accept firmware upgrades from a computer.

FT232r USB UART / USB to UART 5V


 Seeed offers a similar product: USB to UART 5V
 This is a USB to serial UART interface which simplifies USB to serial designs.
 Reduces external component count while operating efficiently with a USB
host controller using as little as possible of the total USB bandwidth
available.
 The USB to UART 5V is based on the CH340, a USB bus converter chip that
converts USB to a serial interface.
 The chip can also convert USB to IrDA infrared or to a printer
interface, and the board can be used for uploading code or
communicating with MCUs.

UART Seeeduino V4.2

 All Arduino boards have at least one serial port (UART) which communicates
on digital pins 0 (RX) and 1 (TX), as well as with the computer via USB.
 This is an Arduino-compatible board based on the ATmega328P MCU.
With an ATmega16U2 as a UART-to-USB converter, the board can basically
work like an FTDI chip and it can be programmed via a micro-USB cable.

Base Shield V2
 Arduino Uno is the most popular Arduino board so far, however, it is
sometimes frustrating when your project requires a lot of sensors or LEDs
and your jumper wires are in a mess.
 The purpose of this product is to help you get rid of the breadboard and
jumper wires. With the many Grove connectors on the baseboard, you can add
all the Grove modules to the Arduino Uno very conveniently!
 These devices can be connected via UART and I2C (the next communication
peripheral which I am going to touch on!)

I2C Interface
What is I2C?
 Stands for Inter-Integrated Circuit (I2C)
 It is a serial communications protocol similar to UART. However, it is not
typically used for PC-device communication but instead with modules and
sensors.
 It is a simple, bidirectional, two-wire synchronous serial bus: only two
wires are needed to transmit information between devices connected to the
bus.
 It is useful for projects that require many different parts (eg. sensors,
pins, expansions and drivers) working together, as up to 127 devices can be
connected to the mainboard with 7-bit addressing while maintaining a clear
communication pathway!
 This is because I2C uses an address system and a shared bus = many
different devices can be connected using the same wires, all data is
transmitted on a single data wire, and the pin count stays low. However, the
tradeoff for this simplified wiring is that it is slower than SPI.
 The speed of I2C also depends on data speed, wire quality and external
noise
 The I2C protocol is also used as a two-wire interface to connect low-speed
devices like microcontrollers, EEPROMs, A/D and D/A converters, I/O
interfaces and other similar peripherals in embedded systems.

How does it work?


 It has 2 lines: SCL (serial clock line) and SDA (serial data line).
 SCL is the clock line for synchronizing transmission. SDA is the data line
through which bits of data are sent or received.
 The master device initiates each data transfer on the bus and generates the
clock that drives it. Any device it addresses is considered a slave device.
 The roles of transmitter and receiver on the bus are not fixed. They depend
on the direction of data transfer at the time.
 If the master wants to send data to the slave, the master must first address
the slave, then send the data, and finally terminate the data transfer.
 If the master wants to receive data from the slave, the master must again
address the slave first. The master then receives the data sent by the slave
and terminates the receiving process. In both cases the master is
responsible for generating the timing clock and terminating the data
transfer.
 Both lines must be connected to the power supply through pull-up
resistors. When the bus is idle, both lines are at a high level.
 The capacitance of the line affects the bus transmission speed. As the
pull-up current on the bus is small, too much capacitance slows the signal
edges and may cause transmission errors. Thus, the bus load capacitance must
not exceed 400pF, from which the allowable length of the bus and the number
of connected devices can be estimated.
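
The link between bus capacitance and pull-up resistor can be estimated with
an RC model. The sketch below assumes the rise time is measured from 30% to
70% of the supply voltage, which gives t = R x C x ln(7/3), or about
0.8473 x R x C. The maximum rise times used here (1000 ns for standard mode,
300 ns for fast mode) are taken from the I2C-bus specification and are not
stated in this article, so treat them as assumptions. The function name
`max_pullup_ohms` is made up for the example.

```python
import math

# Assumed maximum rise times from the I2C-bus specification (seconds).
MAX_RISE_TIME = {"standard": 1000e-9, "fast": 300e-9}

# Charging from 0.3*VDD to 0.7*VDD through R into C takes R*C*ln(0.7/0.3).
RISE_FACTOR = math.log(7 / 3)


def max_pullup_ohms(mode, bus_capacitance_farads):
    """Largest pull-up resistance that still meets the rise time limit."""
    return MAX_RISE_TIME[mode] / (RISE_FACTOR * bus_capacitance_farads)


if __name__ == "__main__":
    for mode in ("standard", "fast"):
        ohms = max_pullup_ohms(mode, 400e-12)    # worst case: 400 pF bus
        print(mode, round(ohms), "ohm")
```

A larger bus capacitance forces a smaller pull-up resistor, which in turn
raises the current each device must sink, so long or heavily loaded buses
run into limits quickly.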

I2C Working Protocol


Data Transmission Method

 The master signals the start of a transfer to every connected slave by
switching the SDA line from a high voltage level to a low voltage level
while SCL is high, and then switching the SCL line from high to low.
 The master then sends the 7 or 10-bit address of the slave it wants to
communicate with, followed by a read/write bit.
 Each slave compares the address with its own. If the address
matches, the slave returns an ACK bit by pulling the SDA line low for
one bit. If the address does not match, the slave leaves the SDA
line high.
 The master will then send or receive the data frame. After each data frame
has been transferred, the receiving device returns another ACK bit to the
sender to acknowledge successful transmission.
 To stop the data transmission, the master sends a stop signal by
switching SCL high before switching SDA high.
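
The sequence above can be modelled as a list of bus events. This sketch
covers only 7-bit addressing and a master write; it does not simulate the
electrical signals. The first byte on the bus is the 7-bit address shifted
left by one with the read/write bit in the lowest position (1 = read,
0 = write). The function names and the example address 0x48 are chosen
purely for illustration.

```python
def address_byte(address, read):
    """Combine a 7-bit address and the read/write bit into the first byte."""
    if not 0 <= address <= 0x7F:
        raise ValueError("7-bit address must be in the range 0x00..0x7F")
    return (address << 1) | (1 if read else 0)


def write_transaction(address, payload, devices_on_bus):
    """Return the bus events for a master write to one slave."""
    events = ["START"]
    acked = address in devices_on_bus        # only a matching slave pulls SDA low
    events.append(("ADDR", address_byte(address, read=False),
                   "ACK" if acked else "NACK"))
    if acked:
        for byte in payload:
            events.append(("DATA", byte, "ACK"))   # receiver ACKs each frame
    events.append("STOP")
    return events


if __name__ == "__main__":
    bus = {0x48, 0x20}                       # hypothetical slave addresses
    for event in write_transaction(0x48, [0x01, 0x83], bus):
        print(event)
```

If no slave answers the address, the master sees a NACK and ends the
transfer with a stop signal straight away.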

Clock Synchronisation

 All masters generate their own clocks on the SCL line to transmit messages
on the I2C bus.
 Data is only valid during the high period of the clock.
 Clock synchronization relies on the wired-AND connection of the I2C
interfaces to the SCL line. A high-to-low transition on SCL makes every
device start counting its low period, and once a device’s clock goes low, it
keeps the SCL line in this state until its own clock reaches the high level.
 If another clock is still in its low period, that low-to-high switch does
not change the state of the SCL line. The SCL line is always held low by the
device with the longest low period. During this time, devices with a shorter
low period enter a high wait state.
 When all relevant devices have completed their low period, the clock line
goes high. 
 After that, there is no difference in the state of the device clock and the SCL
line, and all devices begin to count their high period. The device that first
completes the high period will pull the SCL line low again.
 The low period of the synchronous SCL clock is determined by the device
with the longest low clock period, while the high period is determined by the
device with the shortest high clock period.
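
The wired-AND behaviour can be checked with a small tick-based simulation.
This is a simplified model: time advances in whole ticks, every master
starts its low period on the same tick, and each master is described only by
its own low and high period lengths. The function name `simulate_scl` is
hypothetical.

```python
def simulate_scl(masters, ticks):
    """Return the SCL level for each tick.

    masters: list of (low_ticks, high_ticks) pairs, one per master.
    """
    n = len(masters)
    outputs = [0] * n            # 0 = master pulls SCL low, 1 = released
    counters = [0] * n
    trace = []
    for _ in range(ticks):
        scl = min(outputs)       # wired-AND: any low output wins
        trace.append(scl)
        if scl == 0:
            # Masters still driving low keep counting their low period.
            for i, (low, _high) in enumerate(masters):
                if outputs[i] == 0:
                    counters[i] += 1
                    if counters[i] >= low:
                        outputs[i] = 1       # release and wait for SCL to rise
                        counters[i] = 0
        else:
            # SCL is really high: everyone counts their high period.
            counters = [c + 1 for c in counters]
            if any(c >= high for c, (_low, high) in zip(counters, masters)):
                # The first master to finish pulls SCL low; all restart.
                outputs = [0] * n
                counters = [0] * n
    return trace


if __name__ == "__main__":
    print(simulate_scl([(3, 4), (5, 2)], 14))
```

With masters of (3 low, 4 high) and (5 low, 2 high) ticks, the resulting
clock is low for 5 ticks and high for 2, matching the rule that the longest
low period and the shortest high period win.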

Transmission Modes
Fast Mode:

 Fast mode devices can receive and transmit at up to 400kbit/s. At a
minimum they must be able to synchronize with a 400kbit/s transmission; they
can then extend the low period of the SCL signal to slow down the
transmission.
 Fast mode devices are backwards compatible and can communicate with
standard mode devices in 0 to 100 kbit/s I2C bus systems. However, as
standard mode devices are not upward compatible, they cannot operate in a
fast I2C bus system. The fast mode I2C bus specification has the following
characteristics compared to the standard mode:
 The maximum bit rate is increased to 400 kbit/s;
 Adjusted the timing of the serial data (SDA) and serial clock (SCL)
signals.
 Has the function of suppressing glitch and the SDA and SCL inputs
have Schmitt triggers.
 The output buffer has a slope control function for the falling edges
of the SDA and SCL signals
 Once the power supply of the fast mode device is turned off, the
I/O pins of SDA and SCL must be left floating so they cannot block the bus.
 The external pull-up device connected to the bus must be tuned to
accommodate the shortest maximum allowable rise time of the
fast mode I2C bus. For buses with a maximum load of 200pF, the
pull-up device of each bus can be a resistor. For a bus with a load
between 200pF and 400pF, the pull-up device can be a current
source (maximum 3mA) or a switched resistor circuit.

High-Speed Mode:

 Hs mode devices can transmit information at bit rates up to 3.4 Mbit/s and
remain fully backwards compatible with fast mode or standard mode (F/S
mode) devices that can communicate bi-directionally in a speed mixed bus
system.
 The Hs mode transmission has the same serial bus principle and data format
as the F/S mode system, except that arbitration and clock synchronization
are not performed.
 The I2C bus specification in high-speed mode is as follows:
 In high speed (Hs) mode, the master device has an open-drain
output buffer for the high-speed (SDAH) signal and an open-drain
pull-down and current source pull-up circuit at the high-speed
serial clock (SCLH) output. This shortens the rise time of the SCLH
signal and at any time, only one host current source is active;
 In the Hs mode of a multi-master system, arbitration and clock
synchronization are not performed in order to speed up the bit
processing capability. The arbitration process always finishes after
the master code has been transmitted in the F/S mode.
 The Hs mode master device generates a serial clock signal with a
high to low ratio of 1:2, which relaxes the timing requirements
for setup and hold time.
 The Hs mode device can have a built-in bridge. During Hs mode
transmission, the SDAH and SCLH lines of the Hs mode device are
separated from the SDA and SCL lines, which reduces the capacitive
loading of the SDAH and SCLH lines and makes rise and fall times faster.
 The only difference between Hs mode slave devices and F/S slave
devices is the speed at which they operate.
 The Hs mode device can suppress glitches, and the SDAH and SCLH
inputs also have a Schmitt trigger;
 The output buffer of the Hs mode device has a slope control
function for the falling edges of the SDAH and SCLH signals.

Advantages of using I2C


 Has a low pin/signal count even with numerous devices on the bus
 Flexible, as it supports multi-master and multi-slave communication.
 Simple as it only uses 2 bidirectional wires to establish communication
among multiple devices.
 Adaptable as it can adapt to the needs of various slave devices.

Disadvantages of using I2C


 Slower speed as it relies on pull-up resistors rather than the push-pull
drivers used by SPI. Its open-drain design = limited speed.
 Requires more space as the resistors consume valuable PCB real estate.
 May become complex as the number of devices increases.

Examples of I2C in Microcontrollers


Grove – I2C Hub (6 Port)
 I2C is a very popular communication protocol. In the Grove system, I2C is
used by 80+ sensors for communication, 19 of which are related to
environmental monitoring.
 Today more and more MCUs use 3.3V communication levels, but the
traditional Arduino Uno still uses 5V, which means many modules,
especially sensor modules, need level shifting when used with it.
 We have worked on this, and most of the Grove sensor modules now
have a level shifting function, so users do not need to think about whether
their MCU is 3.3V or 5V. This is in line with Grove’s motto: plug in and
use it, it’s that simple. For a more detailed sensor compatibility review,
you can view our Grove Selection Guide.

4-Channel 16-Bit ADC for Raspberry Pi (ADS1115)

 This product by Seeed is fully compatible with Raspberry Pi.


 It is used for a Raspberry Pi without an analog-to-digital converter, or when
you need a more accurate ADC.
 We provide 4-channel 16-bit ADC for Raspberry Pi (ADS1115) over I2C, a 4-
channel ADC based on Texas Instrument ADS1115, which is a high-precision,
low-power, 16-bit ADC chip.
I2C Arduino

 I2C communication can also be used between two Arduino boards


 Used only for short-distance communication and uses a synchronised clock
pulse.
 Mainly used to communicate with sensors or other devices which have to
send information to a master. 

I2C Driver/Adapter-Easily Driver I2C Devices

 I²C Driver is an easy-to-use, open-source tool for controlling I²C devices.
It works with Windows, Mac, and Linux, and has a built-in colour screen that
shows a live “dashboard” of all the I²C activity.
 The built-in display shows a heatmap of all active network nodes, so on an
I²C network with multiple devices you can see which ones are the most
active.
 When an I²C Driver is connected to an existing I²C bus, it “snoops” the traffic
and displays it on the screen.

 This provides an excellent tool for debugging I²C issues because you can
listen in on the conversation as it happens.

MCP 23017
Ref: Electronicwings, MCP23017 16-bit GPIO Expander.

 16-bit, general-purpose parallel I/O expansion for the I2C bus. Similar to
MCP23S17 except for serial interface (I2C vs SPI).
 Port expander that gives the user virtually identical ports compared to
standard microcontrollers.

PCF 8574
Ref: PCF8574 Serial Interface Module Board LCD Converter.

 Provides general-purpose remote I/O expansion via the two-wire
bidirectional I2C-bus (serial clock (SCL), Serial Data (SDA)).
 Seeed will be using this in our future products, do keep a lookout!

Grove Base Hat for Raspberry Pi


 What is Grove?
 It is a modular, standardized connector prototyping system. Grove
takes a building block approach to assembling electronics, which
makes it easier to connect, experiment and build, and simplifies
the learning process.
 The Grove series of sensors, actuators, and displays has grown into a
large family, and today we introduce the Raspberry Pi to the whole Grove
System.
 The Grove Base Hat for Raspberry Pi provides Digital/Analog/I2C/
PWM/UART ports to meet all your needs.
 With the help of the built-in MCU, a 12-bit 8-channel ADC is also available
for Raspberry Pi. Currently, more than 60 Grove modules support the Grove
Base Hat for Raspberry Pi.

SPI Interface
What is SPI?
 Stands for Serial Peripheral Interface (SPI)
 It is similar to I2C in purpose, but it is a different form of
serial communications protocol, designed for connecting microcontrollers to
peripherals.
 Operates at full-duplex where data can be sent and received simultaneously.
 Operates at faster data transmission rates = 8 Mbit/s or more
 It is typically faster than I2C due to the simple protocol. Even though the
data/clock lines are shared between devices, each device requires its own
chip select wire.
 Used in places where speed is important. (eg. SD cards, display modules or
when info updates and changes quickly like thermometers)

How does it work?


 Devices can be connected in 2 ways:
1. Selecting each device with a Chip Select line. A separate Chip
Select line is required for each device. This is the most common
way RPi’s currently use SPI.
2. Daisy chaining, where the data out of each device is connected to
the data in line of the next.
 The protocol itself sets no limit on the number of SPI devices that can be
connected. However, there are practical limits due to the number of hardware
select lines available on the main device with the chip select method, or
the complexity of passing data through devices in the daisy-chaining method.
 In point-to-point communication, the SPI interface does not require
addressing operations and is full-duplex communication, which is simple
and efficient.

SPI Working Protocol


 The SPI communicates via 4 lines which are:
 MOSI – Master data output, slave data input
 MISO – Master data input, slave data output
 SCLK – Clock signal, generated by the master device, up to fPCLK/2,
slave mode frequency up to fCPU/2
 NSS – Slave enable signal, controlled by the master device, some
ICs will be labelled as CS (Chip select)
 In a multi-slave system, each slave requires a separate enable signal, which
is slightly more complicated on hardware than the I2C system.
 Internally, the SPI interface is essentially two simple shift registers,
one in the master and one in the slave. The transmitted data is typically
8 bits. It is transmitted bit by bit under the slave enable signal and the
shift pulse generated by the master device, most significant bit first.
 The SPI interface provides synchronous serial data transmission between the
CPU and a peripheral low-speed device. As it is full-duplex communication,
the data transmission speed is overall faster than the I2C bus and can reach
speeds of a few Mbps.
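
The two-shift-register picture can be sketched directly. This is a minimal
model that assumes 8-bit registers and most significant bit first, as
described above; clock polarity and phase are ignored, and the function name
`spi_transfer` is invented for the example.

```python
def spi_transfer(master_byte, slave_byte):
    """Clock 8 bits through both shift registers and return their contents."""
    master = master_byte & 0xFF
    slave = slave_byte & 0xFF
    for _ in range(8):                       # one iteration per SCLK pulse
        mosi = (master >> 7) & 1             # master shifts out its MSB
        miso = (slave >> 7) & 1              # slave shifts out its MSB
        master = ((master << 1) & 0xFF) | miso   # and each shifts the other's in
        slave = ((slave << 1) & 0xFF) | mosi
    return master, slave


if __name__ == "__main__":
    master_now, slave_now = spi_transfer(0xA5, 0x3C)
    print(hex(master_now), hex(slave_now))
```

After eight clock pulses the two registers have swapped contents, which is
why every SPI write is also a read: the master always receives a byte while
it sends one.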

Advantages of using SPI


 The protocol is simple as there is no complicated slave addressing system
like I2C.
 It is the fastest protocol compared to UART and I2C.
 No start and stop bits unlike UART which means data can be transmitted
continuously without interruption
 Separate MISO and MOSI lines which means data can be transmitted and
received at the same time

Disadvantages of using SPI


 More pins are occupied, which puts a practical limit on the number of
devices.
 There is no flow control specified, and no acknowledgement mechanism
to confirm whether data is received, unlike I2C
 Uses four lines – MOSI, MISO, SCLK, NSS
 No form of error check unlike in UART (using parity bit)
 Only 1 master

Examples of SPI in Microcontrollers:


MCP 3008 / Grove I2C ADC

 Seeed does offer a similar product which has the same functions: Grove I2C
ADC but its communication peripheral is I2C.
 The MCP 3008 is a 10-bit, 8-channel analogue-to-digital converter (ADC).
 It connects to the Raspberry Pi over an SPI serial connection, using
either the hardware SPI bus or any four GPIO pins with software SPI.

Serial CAN-BUS Module based on MCP2551 and MCP2515


 This Seeed product: Serial CAN Bus module provides your Arduino with CAN
bus capabilities and allows you to hack your vehicle. It lets you read and
write messages to the CAN bus.
 CAN bus is a messaging protocol system that lets various microcontrollers
and sensors within a vehicle talk to each other. CAN provides long-distance
communication at medium speed with high reliability.
 This Serial CAN Bus module can also be connected to your Arduino through
the on-board Grove connector.
 Interfaces with microcontrollers via SPI.

SPI Seeeduino V4.2

 SPI serial communication can be used for communication between two
Arduinos, where one Arduino acts as the master and the other acts as a
slave.
 Used to communicate over short distances at high speed.
 This is the same product as in the UART example above: Seeeduino V4.2

ENC28J60 OVERLAYS HAT for Raspberry pi

 The Pi zero ENC28J60 is a simple Network Adapter module for Pi zero that is
very easy to assemble and configure.
 It allows your Raspberry Pi zero to access the network smoothly, and it is
easy to do system updates and software installation operations.
 Microchip’s ENC28J60 is a 28-pin, 10BASE-T stand-alone Ethernet controller
with an SPI interface.
 The SPI interface serves as a communication channel between the host
controller and the ENC28J60.

SPI Driver/Adapter-Easily Driver SPI Devices

 This is a similar product to the I2C Driver/Adapter-Easily Driver I2C
Devices, but for SPI instead. It is an easy-to-use tool for controlling SPI
devices. It works with Windows, Mac, and Linux, and has a built-in colour
screen that shows a live logic-analyzer display of all SPI traffic.
 Similarly, it uses a standard FTDI USB serial chip to talk to the PC, so no
special drivers need to be installed. The board includes 3.3 and 5 V supplies
with voltage and current monitoring.

 SPI flash is very common, and by using a test clip, SPIDriver makes it
convenient to read and write SPI flash in-circuit. A short script is all it
takes to read or write an Atmel flash. SPI LED strips are also easy to hook
up to the SPI Driver, and you can control them directly, which makes
them much more fun!
 Using SPI in this scenario is fast enough to smoothly animate long strips
and achieve POV effects. Short strips can also be powered directly by the
SPIDriver’s beefy 470 mA built-in supply.

So, which of these communication peripherals is the “best”? UART, SPI or I2C?
Unfortunately, there is no “best” communication peripheral. Each communication
peripheral has its own advantages and disadvantages.

Thus, you should pick the communication peripheral that suits your project
best. For example, if you want the fastest communication peripheral, SPI would
be the ideal pick. On the other hand, if you want to connect many devices
without it being too complex, I2C will be the ideal pick as it can connect up
to 127 devices and it is simple to manage.
Summary
In summary, I have compiled all the various advantages/disadvantages and
functions of the various communication protocols and compared them so you can
easily pick which is the best for your project. Do keep in mind that the device,
accessory, module or sensor you are using must support the communication
protocol as well.

Protocol                  | UART             | I2C                            | SPI
Complexity                | Simple           | Easy to chain multiple devices | Complex as devices increase
Speed                     | Slowest          | Faster than UART               | Fastest
Number of devices         | Up to 2 devices  | Up to 127, but gets complex    | Many, but gets complex
Number of wires           | 2 (TX, RX)       | 2 (SDA, SCL)                   | 4 (MOSI, MISO, SCLK, NSS)
Duplex                    | Full duplex      | Half duplex                    | Full duplex
No. of masters and slaves | Single to single | Multiple slaves and masters    | 1 master, multiple slaves

Understanding AMBA Bus Architecture and Protocols
The Advanced Microcontroller Bus Architecture (AMBA) bus protocols are a set of interconnect
specifications from ARM that standardize on-chip communication mechanisms between
various functional blocks (or IP) for building high-performance SOC designs. These designs
typically have one or more microcontrollers or microprocessors along with several other
components (internal memory or external memory bridge, DSP, DMA, accelerators and
various other peripherals like USB, UART, PCIE, I2C etc), all integrated on a single chip. The
primary motivation of the AMBA protocols is to have a standard and efficient way of
interconnecting these blocks, with re-use across multiple designs.
The first step in learning the AMBA protocols is to understand where exactly these different
protocols are used, how they evolved and how all of them fit into an SOC design. The following
diagram (reference from the AMBA 2.0 spec) illustrates a traditional AMBA-based SOC design
that uses the AHB (Advanced High-performance Bus) or ASB (Advanced System Bus) protocols for
high bandwidth interconnect and an APB (Advanced Peripheral Bus) protocol for low bandwidth
peripheral interconnects.

With an increasing number of functional blocks (IP) being integrated into SOC designs, the
shared bus protocols (AHB/ASB) started hitting limitations, and in 2003 the new AMBA 3 revision
introduced a point-to-point connectivity protocol: AXI (Advanced Extensible Interface). Further,
in 2010, an enhanced version was introduced: AXI 4. The following diagram illustrates this
evolution of protocols along with the SOC design trends in industry.

The following diagram illustrates how an AXI interconnect can be used to build an SOC with various
functional blocks talking through a master-slave protocol. The interconnect could be a custom
crossbar or switch design, or even an off-the-shelf NoC (Network on Chip) IP that supports
multiple AXI masters and slaves. The AXI interconnect helps scale up the number of connected
agents compared to the previous AHB/ASB buses. An AXI-to-APB bridge on one of the
slave ports is normally used to bridge communications to a set of peripherals sharing an APB
bus.
Further evolution happened in the era of mobile devices and smartphones, with SOCs
integrating dual/quad/octa-core processors with shared caches and the
need for hardware-managed coherency across the memory subsystem. This
led to the introduction of ACE (AXI Coherency Extensions) in AMBA
revision 4.
Lastly, in the current era of heterogeneous computing for HPC and data
center markets, the integration trend continues with an increasing number of
processor cores along with several heterogeneous computing elements like
GPUs, DSPs, FPGAs, memory controllers and IO subsystems. In 2013, AMBA
5 introduced the CHI (Coherent Hub Interface) protocol as a redesign of
the AXI/ACE protocol. The signal-based AXI/ACE protocol was replaced with
the new packet-based, layered CHI protocol, which can scale well for the
near-term future.
Now that you hopefully understand how the protocols evolved and how each
of them fits into an SOC design, here are a few basics and references to
resources that you can use to learn more about each protocol.
ARM has open sourced all of the protocols, and all the specifications can be
downloaded from the ARM website for free by signing up.
1. APB: The Advanced Peripheral Bus (APB) is used for connecting low-bandwidth peripherals. It is
a simple non-pipelined protocol that can be used to communicate (read or write) from a
bridge/master to a number of slaves through the shared bus. Reads and writes share the
same set of signals, and burst data transfers are not supported. The latest spec (APB 2.0) is
available on the ARM website, and it is a relatively easy protocol to learn.
2. AHB: The Advanced High-performance Bus (AHB) is used for connecting components that need
higher bandwidth on a shared bus. These could be an internal memory or an external memory
interface, DMA, DSP etc., but the shared bus limits the number of agents. Similar to APB,
this is a shared bus protocol for multiple masters and slaves, but higher bandwidth is possible
through burst data transfers. The latest spec can be found on the ARM website, and the protocol
is relatively easy to learn.
3. AHB-lite: The AHB-lite protocol is a simplified version of AHB. It supports only a
single-master design, which removes the need for arbitration, retry, split transactions etc.
4. AXI: The Advanced Extensible Interface (AXI) is useful for high-bandwidth and low-latency
interconnects. It is a point-to-point interconnect and overcomes the limitations of a shared bus
protocol in terms of the number of agents that can be connected. The protocol is also an
enhancement over AHB in that it supports multiple outstanding data transfers (pipelined),
burst data transfers, separate read and write paths, and different bus widths.
5. AXI-lite: The AXI-lite protocol is a simplified version of AXI that does not support
burst data transfers.
6. AXI-stream: The AXI-stream protocol is another flavor of the AXI protocol that supports only
streaming of data from a master to a slave. Unlike full AXI or AXI-lite, the stream protocol has no
separate read/write channels, as the intent is to stream in one direction only. Multiple streams of
data can be transferred (even with interleaving) between a master and a slave. This is useful in
designs like video streaming applications.
7. The full AXI and AXI-lite specifications can be downloaded from the ARM website. The AXI-stream
protocol has a separate spec, which is also available for download.
8. ACE: The AXI Coherency Extensions (ACE) protocol is an extension to the AXI4 protocol and evolved
in the era of multiple CPU cores with coherent caches being integrated on a single chip. The ACE
protocol extends the AXI read and write data channels by introducing separate snoop address, snoop
data and snoop response channels. These extra channels provide mechanisms to implement a
snoop-based coherency protocol. If you are new to coherency, understanding it is a
prerequisite for learning the ACE protocol. The spec is available for download from ARM as
part of the AXI4 spec.
9. ACE-Lite: ACE also has a simplified version of the protocol for agents that do not have
a cache of their own but are still part of the shareable coherency domain. Typical agents like DMA
or network interface agents implement this “one-way” coherency using the ACE-Lite protocol.
10. CHI (Coherent Hub Interface): The ACE protocol was developed as an extension to AXI to
support coherent interconnects. ACE uses signal-level communication between
master and slave, so the interconnects need a large number of wires, with added channels for
snoops and responses. This worked well for small coherent clusters in dual/quad-core mobile
SOC designs. With an increasing number of coherent clusters on an SOC, along with other
heterogeneous compute elements and memory controllers, the AMBA 5 revision introduced the CHI
protocol as a complete redesign of the ACE protocol. CHI uses a layered, packet-based
communication protocol with protocol, link and physical layers, and
also supports QoS-based flow control and retry mechanisms.

What is AMBA, and why use it?


The Advanced Microcontroller Bus Architecture, or AMBA, is an open-standard, on-chip
interconnect specification for the connection and management of functional blocks in system-on-
a-chip (SoC) designs.

Essentially, AMBA protocols define how functional blocks communicate with each other.

The following diagram shows an example of an SoC design. This SoC has several functional
blocks that use AMBA protocols, like AXI, to communicate with each other:

Where is AMBA used?


AMBA simplifies the development of designs with multiple processors and large numbers of
controllers and peripherals. However, the scope of AMBA has increased over time, going far
beyond just microcontroller devices.

Today, AMBA is widely used in a range of ASIC and SoC parts. These parts include
applications processors that are used in devices like IoT subsystems, smartphones, and
networking SoCs.

Why use AMBA?


AMBA provides several benefits:

Efficient IP reuse

IP reuse is an essential component in reducing SoC development costs and timescales.


AMBA specifications provide the interface standard that enables IP reuse. Therefore,
thousands of SoCs, and IP products, are using AMBA interfaces.

Flexibility

AMBA offers the flexibility to work with a range of SoCs. IP reuse requires a common
standard while supporting a wide variety of SoCs with different power, performance, and
area requirements. Arm offers a range of interface specifications that are optimized for
these different requirements.

Compatibility

A standard interface specification, like AMBA, allows compatibility between IP
components from different design teams or vendors.

Support

AMBA is well supported. It is widely implemented and supported throughout the
semiconductor industry, including support from third-party IP products and tools.

Bus interface standards, like AMBA, are differentiated through the performance that they enable.
The two main characteristics of bus interface performance are:

Bandwidth

The rate at which data can be driven across the interface. In a synchronous system, the
maximum bandwidth is limited by the product of the clock speed and the width of the
data bus.

Latency

The delay between the initiation and completion of a transaction. In a burst-based
system, the latency figure often refers to the completion of the first transfer rather than
the entire burst.

The efficiency of your interface depends on the extent to which it achieves the maximum
bandwidth with zero latency.
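
As a rough illustration of the bandwidth definition above, the peak figure for a synchronous interface can be computed from the clock frequency and the data bus width. The function name and the example numbers (a 64-bit bus clocked at 200 MHz) are assumptions chosen for illustration, not values taken from any AMBA specification.

```python
def peak_bandwidth_bytes_per_s(clock_hz, bus_width_bits):
    """Upper bound on bandwidth: one full-width transfer on every clock cycle."""
    return clock_hz * bus_width_bits / 8


# Hypothetical interface: 64-bit data bus clocked at 200 MHz.
peak = peak_bandwidth_bytes_per_s(200e6, 64)
print(f"{peak / 1e9:.1f} GB/s")  # prints "1.6 GB/s"
```

A real interface only approaches this bound when every cycle carries data, which is why latency and stalls matter as much as the raw figure.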

How has AMBA evolved?


AMBA has evolved over the years to meet the demands of processors and new technologies, as
shown in the following diagram:
AMBA
Arm introduced AMBA in the late 1990s. The first AMBA buses were the Advanced System
Bus (ASB) and the Advanced Peripheral Bus (APB). ASB has been superseded by more recent
protocols, while APB is still widely used today.

APB is designed for low-bandwidth control accesses, for example, register interfaces on system
peripherals. This bus has a simple address and data phase and a low complexity signal list.

AMBA 2
In 1999, AMBA 2 added the AMBA High-performance Bus (AHB), which is a single clock-edge
protocol. A simple transaction on the AHB consists of an address phase and a subsequent data
phase. Access to the target device is controlled through a MUX, admitting access to one manager
at a time. AHB is pipelined for performance, while APB is not pipelined for design simplicity.

AMBA 3
In 2003, Arm introduced the third generation, AMBA 3, which includes ATB and AHB-Lite.

Advanced Trace Bus (ATB), is part of the CoreSight on-chip debug and trace solution.

AHB-Lite is a subset of AHB. This subset simplifies the design for a bus with a single manager.

Advanced eXtensible Interface (AXI), the third generation of AMBA interface defined in the
AMBA 3 specification, is targeted at high performance, high clock frequency system designs.
AXI includes features that make it suitable for high-speed submicrometer interconnect.

AMBA 4
In 2010, the AMBA 4 specifications were introduced, starting with AMBA 4 AXI4 and then
AMBA 4 AXI Coherency Extensions (ACE) in 2011.

ACE extends AXI with additional signaling introducing system-wide coherency. This system-
wide coherency allows multiple processors to share memory and enables technology like
big.LITTLE processing. At the same time, the ACE-Lite protocol enables one-way coherency.
One-way coherency enables a network interface to read from the caches of a fully coherent ACE
processor.

The AXI4-Stream protocol is designed for unidirectional data transfers from manager to
subordinate with reduced signal routing, which is ideal for implementation in FPGAs.

AMBA 5
In 2014, the AMBA 5 Coherent Hub Interface (CHI) specification was introduced, with a
redesigned high-speed transport layer and features designed to reduce congestion. There have
been several editions of the CHI protocol, and each new version adds new features.

In 2016, the AHB-Lite protocol was updated to AHB5, to complement the Armv8-M
architecture, and extend the TrustZone security foundation from the processor to the system.

In 2019, the AMBA Adaptive Traffic Profiles (ATP) was introduced. ATP complements the
existing AMBA protocols and is used for modeling high-level memory access behavior in a
concise, simple, and portable way.

AXI5, ACE5 and ACE5-Lite extend prior generations, to include a number of performance and
scalability features to align with and complement AMBA CHI. Some of the new features and
options include:

• Support for high-frequency, non-blocking coherent data transfer between many
processors.
• A layered model to allow separation of communication and transport protocols
for flexible topologies, such as a cross-bar, ring, mesh or ad hoc.
• Cache stashing to allow accelerators or IO devices to stash critical data within a
CPU cache for low-latency access.
• Far atomic operations enable the interconnect to perform high-frequency updates
to shared data.
• End-to-end data protection and poisoning signalling.

AXI protocol overview


AXI is an interface specification that defines the interface of IP blocks, rather than the
interconnect itself.

The following diagram shows how AXI is used to interface an interconnect component:
There are only two AXI interface types, manager and subordinate. These interface types are
symmetrical. All AXI connections are between manager interfaces and subordinate interfaces.

AXI interconnect interfaces contain the same signals, which makes integration of different IP
relatively simple. The previous diagram shows how AXI connections join manager and
subordinate interfaces. The direct connection gives maximum bandwidth between the manager
and subordinate components with no extra logic. And with AXI, there is only a single protocol to
validate.

AXI in a multi-manager system


The following diagram shows a simplified example of an SoC system, which is composed of
managers, subordinates, and the interconnect that links them all:

An Arm processor is an example of a manager, and a simple example of a subordinate is a
memory controller.

The AXI protocol defines the signals and timing of the point-to-point connections between
managers and subordinates.

Note: The AXI protocol is a point-to-point specification, not a bus specification. Therefore, it
describes only the signals and timing between interfaces.

The previous diagram shows that each AXI manager interface is connected to a single AXI
subordinate interface. Where multiple managers and subordinates are involved, an interconnect
fabric is required. This interconnect fabric also implements subordinate and manager interfaces,
where the AXI protocol is implemented.

The following diagram shows that the interconnect is a complex element that requires its own
AXI manager and subordinate interfaces to communicate with external function blocks:
The following diagram shows an example of an SoC with various processors and function
blocks:

The previous diagram shows all the connections where AXI is used. You can see that AXI3 and
AXI4 are used within the same SoC, which is common practice. In such cases, the interconnect
performs the protocol conversion between the different AXI interfaces.

AXI channels
The AXI specification describes a point-to-point protocol between two interfaces: a manager and
a subordinate. The following diagram shows the five main channels that each AXI interface uses
for communication:

Write operations use the following channels:

• The manager sends an address on the Write Address (AW) channel and transfers
data on the Write Data (W) channel to the subordinate.
• The subordinate writes the received data to the specified address. Once the
subordinate has completed the write operation, it responds with a message to the
manager on the Write Response (B) channel.

Read operations use the following channels:

• The manager sends the address it wants to read on the Read Address
(AR) channel.
• The subordinate sends the data from the requested address to the manager on
the Read Data (R) channel.
The subordinate can also return an error message on the Read Data (R) channel.
An error occurs if, for example, the address is not valid, or the data is corrupted, or
the access does not have the right security permission.
Note: Each channel is unidirectional, so a separate Write Response channel is needed to pass
responses back to the manager. However, there is no need for a Read Response channel,
because a read response is passed as part of the Read Data channel.

Using separate address and data channels for read and write transfers helps to maximize the
bandwidth of the interface. There is no timing relationship between the groups of read and write
channels. This means that a read sequence can happen at the same time as a write sequence.

Each of these five channels contains several signals, and every signal in a channel
carries that channel's prefix:

• AW for signals on the Write Address channel
• AR for signals on the Read Address channel
• W for signals on the Write Data channel
• R for signals on the Read Data channel
• B for signals on the Write Response channel
Note: B stands for buffered, because the response from the subordinate happens after all writes
have completed.
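
The prefix convention above can be sketched as a small lookup that maps a signal name to its channel. The function name is hypothetical, and the two global signals (ACLK and ARESETn, which come from the AXI specification rather than from this text) are handled separately because ARESETn would otherwise look like a Read Address signal.

```python
# Prefix -> channel name, following the list above.
CHANNEL_PREFIXES = {
    "AW": "Write Address",
    "AR": "Read Address",
    "W": "Write Data",
    "R": "Read Data",
    "B": "Write Response",
}

# Assumption: clock and reset are global and belong to no channel.
GLOBAL_SIGNALS = {"ACLK", "ARESETn"}


def channel_of(signal):
    """Return the AXI channel that a signal name belongs to."""
    if signal in GLOBAL_SIGNALS:
        return "Global"
    for prefix, channel in CHANNEL_PREFIXES.items():
        if signal.startswith(prefix):
            return channel
    raise ValueError(f"{signal} does not match any AXI channel prefix")


for name in ["AWADDR", "WLAST", "BRESP", "ARVALID", "RDATA"]:
    print(name, "->", channel_of(name))
```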

Main AXI features


The AXI protocol has several key features that are designed to improve bandwidth and latency of
data transfers and transactions, as you can see here:

Independent read and write channels

AXI supports two different sets of channels, one for write operations, and one for read
operations. Having two independent sets of channels helps to improve the bandwidth
performance of the interfaces. This is because read and write operations can happen at
the same time.

Multiple outstanding addresses

AXI allows for multiple outstanding addresses. This means that a manager can issue
transactions without waiting for earlier transactions to complete. This can improve
system performance because it enables parallel processing of transactions.

No strict timing relationship between address and data operations


With AXI, there is no strict timing relationship between the address and data operations.
This means that, for example, a manager could issue a write address on the Write
Address channel, but there is no time requirement for when the manager has to provide
the corresponding data to write on the Write Data channel.

Support for unaligned data transfers

For any burst that is made up of data transfers wider than one byte, the first bytes
accessed can be unaligned with the natural address boundary. For example, a 32-bit data
packet that starts at a byte address of 0x1002 is not aligned to the natural 32-bit address
boundary.

Out-of-order transaction completion

Out-of-order transaction completion is possible with AXI. The AXI protocol includes
transaction identifiers, and there is no restriction on the completion of transactions with
different ID values. This means that a single physical port can support out-of-order
transactions by acting as several logical ports, each of which handles its transactions in
order.

Burst transactions based on start address

AXI managers only issue the starting address for the first transfer. For any following
transfers, the subordinate will calculate the next transfer address based on the burst
type.
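
Two of the features above, unaligned starts and start-address-based bursts, can be sketched together. The example assumes an incrementing burst, one of the burst types defined in the AXI specification but not named in this text, in which the first transfer uses the start address as issued and every later transfer is aligned to the transfer size. The function names are hypothetical.

```python
def is_aligned(addr, size_bytes):
    """True if addr sits on a natural boundary for transfers of size_bytes."""
    return addr % size_bytes == 0


def incr_burst_addresses(start, size_bytes, length):
    """Addresses a subordinate would derive for an incrementing burst.

    Only `start` is issued by the manager; the rest are calculated.
    """
    aligned = start - (start % size_bytes)
    return [start] + [aligned + n * size_bytes for n in range(1, length)]


print(is_aligned(0x1002, 4))  # False: 0x1002 is not on a 32-bit boundary
print([hex(a) for a in incr_burst_addresses(0x1002, 4, 3)])
# ['0x1002', '0x1004', '0x1008']
```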

Channel transfers and transactions


This section explains the handshake principle for AXI channels, and shows how the handshake is
the underpinning mechanism for all read and write transactions.

Channel handshake
The AXI4 protocol defines five different channels, as described in AXI channels. All of these
channels share the same handshake mechanism that is based on
the VALID and READY signals, as shown in the following diagram:

The VALID signal goes from the source to the destination, and READY goes from the destination to the
source.
Whether the source or destination is a manager or subordinate depends on which channel is
being used. For example, the manager is a source for the Read Address channel, but a
destination for the Read Data channel.
The source uses the VALID signal to indicate when valid information is available.
The VALID signal must remain asserted, meaning set to high, until the destination accepts the
information. Signals that remain asserted in this way are called sticky signals.
The destination indicates when it can accept information using the READY signal.
The READY signal goes from the channel destination to the channel source.

This mechanism is not an asynchronous handshake, and requires the rising edge of the clock for
the handshake to complete.

Differences between transfers and transactions
When designing interconnect fabric, you must know the capabilities of the managers and
subordinates that are being connected. Knowing this information lets you include sufficient
buffering, tracking, and decode logic to support the various data transfer ordering possibilities
that allow performance improvements in faster devices.

Using standard terminology makes understanding the interactions between connected
components easier. AXI makes a distinction between transfers and transactions:

• A transfer is a single exchange of information, with
one VALID and READY handshake.
• A transaction is an entire burst of transfers, containing an address transfer, one or
more data transfers, and, for write sequences, a response transfer.
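
To make the distinction concrete, the following sketch counts the transfers (handshakes) that make up one transaction. The function name and its arguments are hypothetical; the counting rule is the one given in the definitions above.

```python
def transfers_in_transaction(kind, data_transfers):
    """Number of VALID/READY handshakes in one read or write transaction."""
    if kind not in ("read", "write"):
        raise ValueError("kind must be 'read' or 'write'")
    if data_transfers < 1:
        raise ValueError("a transaction has at least one data transfer")
    address = 1                               # one AW or AR transfer
    response = 1 if kind == "write" else 0    # only writes use the B channel
    return address + data_transfers + response


print(transfers_in_transaction("write", 3))  # 5: AW + 3 x W + B
print(transfers_in_transaction("read", 3))   # 4: AR + 3 x R
```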

Channel transfer examples


This section examines some examples of possible handshakes between source and destination. It
shows several possible combinations of VALID and READY sequences that conform to the
AXI protocol specifications.
In the first example, shown in the following diagram, we have a clock signal, followed by an
information bus, and then the VALID and READY signals:

This example has the following sequence of events:


1. In clock cycle 2, the VALID signal is asserted, indicating that the data on the information
channel is valid.
2. In clock cycle 3, the following clock cycle, the READY signal is asserted.
3. The handshake completes on the rising edge of clock cycle 4, because
both READY and VALID signals are asserted.

The following diagram shows another example:

This example has the following sequence of events:

1. In clock cycle 1, the READY signal is asserted.


2. The VALID signal is not asserted until clock cycle 3.
3. The handshake completes on the rising edge of clock cycle 4, when
both VALID and READY are asserted.

The final example shows both VALID and READY signals being asserted during the clock
cycle 3, as seen in the following diagram:

Again, the handshake completes on the rising edge of clock cycle 4, when
both VALID and READY are asserted.
In all three examples, information is passed down the channel when READY and VALID are
asserted on the rising edge of the clock signal.

Read and write handshakes must adhere to the following rules:

• A source must not wait for READY to be asserted before asserting VALID.
• A destination can wait for VALID to be asserted before asserting READY.
These rules mean that READY can be asserted before or after VALID, or even at the same time.
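
The three examples above can be reproduced with a small cycle-based sketch. It follows the convention used in this section: a level driven during one clock cycle is sampled on the rising edge that starts the next cycle. The function names and the list-of-booleans waveform format are assumptions made for illustration.

```python
def handshake_edge(valid, ready):
    """Return the clock cycle whose rising edge completes the handshake.

    valid[i] and ready[i] hold the levels driven during clock cycle i + 1.
    """
    for i, (v, r) in enumerate(zip(valid, ready)):
        if v and r:
            return i + 2  # both high during cycle i + 1, sampled at cycle i + 2
    return None  # no handshake within the simulated window


def asserted_from(cycle, n_cycles=5):
    """Waveform that is low before `cycle` and stays high from then on."""
    return [c >= cycle for c in range(1, n_cycles + 1)]


# (cycle in which VALID rises, cycle in which READY rises) for the three examples
for valid_rise, ready_rise in [(2, 3), (3, 1), (3, 3)]:
    edge = handshake_edge(asserted_from(valid_rise), asserted_from(ready_rise))
    print(f"VALID from cycle {valid_rise}, READY from cycle {ready_rise}: "
          f"completes on the rising edge of cycle {edge}")
```

In all three cases the result is cycle 4, regardless of which signal was asserted first.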

Write transaction: single data item


This section describes the process of a write transaction for a single data item, and the different
channels that are used to complete the transaction.

This write transaction involves the following channels:

• Write Address (AW)
• Write (W)
• Write Response (B)
First, there is a handshake on the Write Address (AW) channel, as shown in the following
diagram:

This handshake is where the manager communicates the address of the write to the subordinate. The
handshake has the following sequence of events:

1. The manager puts the address on AWADDR and asserts AWVALID in clock cycle 2.


2. The subordinate asserts AWREADY in clock cycle 3 to indicate its ability to receive the address
value.
3. The handshake completes on the rising edge of clock cycle 4.

After this first handshake, the manager transfers the data to the subordinate on the Write
(W) channel, as shown in the following diagram:
The data transfer has the following sequence of events:

1. The subordinate is waiting for data with WREADY set to high in clock cycle n.


2. The manager puts the data on the WDATA bus and asserts WVALID in clock cycle n+2.
3. The handshake completes on the rising edge of clock cycle n+3.

Finally, the subordinate uses the Write Response (B) channel, to confirm that the write
transaction has completed once all WDATA has been received. This response is shown in the
following diagram:

The write response has the following sequence of events:

1. The manager asserts BREADY.


2. The subordinate drives BRESP to indicate success or failure of the write transaction, and
asserts BVALID.

The handshake completes on the rising edge of clock cycle n+3.

Write transaction: multiple data items


AXI is a burst-based protocol, which means that it is possible to transfer multiple data items in a
single transaction. A single address transfer on the AW channel, with
associated burst width and length information, covers all the data items in the burst.
The following diagram shows an example of a multiple data transfer:

In this case, the AW channel indicates a sequence of three transfers, and on the W channel, we
see three data transfers.
The manager drives WLAST high to indicate the final WDATA. This means that the
subordinate can either count the data transfers or just monitor WLAST.
Once all WDATA transfers are received, the subordinate gives a single BRESP value on
the B channel. One single BRESP covers the entire burst. If the subordinate decides that any of
the transfers contain an error, it must wait until the entire burst has completed before it informs
the manager that an error occurred.
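
The write burst behaviour described above can be sketched as a toy subordinate model. The class, its word-addressed memory and the out-of-range error condition are hypothetical; the response names OKAY and SLVERR are encodings from the AXI specification rather than from this text. The point of the sketch is that an error is remembered during the burst and reported once, in the single BRESP at the end.

```python
class MemorySubordinate:
    """Hypothetical memory behind an AXI subordinate interface."""

    def __init__(self, size_bytes):
        self.size_bytes = size_bytes
        self.mem = {}

    def write_burst(self, awaddr, beats, word_bytes=4):
        """Accept (WDATA, WLAST) beats and return one BRESP for the burst."""
        error = False
        for n, (wdata, wlast) in enumerate(beats):
            addr = awaddr + n * word_bytes
            if addr + word_bytes > self.size_bytes:
                error = True          # remember the error, keep accepting data
            else:
                self.mem[addr] = wdata
            if wlast:                 # WLAST marks the final data transfer
                break
        return "SLVERR" if error else "OKAY"


sub = MemorySubordinate(size_bytes=16)
print(sub.write_burst(0x0, [(0xA, False), (0xB, False), (0xC, True)]))  # OKAY
print(sub.write_burst(0x8, [(0x1, False), (0x2, False), (0x3, True)]))  # SLVERR
```

The second burst fails because its third beat falls outside the 16-byte memory, yet the first two beats are still written and only one response is returned.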

Read transaction: single data item


This section looks in detail at the process of a read transaction for a single data item, and the
different channels used to complete the transaction.

This read transaction involves the following channels:

• Read Address (AR)
• Read (R)
First, there is a handshake on the Read Address (AR) channel, as shown in the following
diagram:

The handshake has the following sequence of events:


1. In clock cycle 2, the manager communicates the address of the read to the subordinate
on ARADDR and asserts ARVALID.
2. In clock cycle 3, the subordinate asserts ARREADY to indicate that it is ready to receive the
address value.

The handshake completes on the rising edge of clock cycle 4.

Next, on the Read (R) channel, the subordinate transfers the data to the manager. The following
diagram shows the data transfer process:

The data transfer handshake has the following sequence of events:

1. In clock cycle n, the manager indicates that it is waiting to receive the data by
asserting RREADY.
2. The subordinate retrieves the data and places it on RDATA in clock cycle n+2. In this
case, because this is a single data transaction, the subordinate also sets the RLAST signal to
high.
At the same time, the subordinate uses RRESP to indicate the success or failure of the
read transaction to the manager, and asserts RVALID.

3. Because RREADY is already asserted by the manager, the handshake completes on the rising
edge of clock cycle n+3.

Read transaction: multiple data items


The AXI protocol also allows a read burst of multiple data transfers in the same transaction. This
is similar to the write burst that is described in Write transaction: multiple data items.

The following diagram shows an example of a burst read transfer:


In this example, we transfer a single address on the AR channel to transfer multiple data items, with
associated burst width and length information.

Here, the AR channel indicates a sequence of three transfers, therefore on the R channel, we see
three data transfers from the subordinate to the manager.
On the R channel, the subordinate transfers the data to the manager. In this example, the
manager is waiting for data as shown by RREADY set to high. The subordinate drives
valid RDATA and asserts RVALID for each transfer.
One difference between a read transaction and a write transaction is that for a read transaction
there is an RRESP response for every transfer in the transaction. This is because, in the write
transaction, the subordinate has to send the response as a separate transfer on the B channel. In
the read transaction, the subordinate uses the same channel to send the data back to the manager
and to indicate the status of the read operation.

If an error is indicated for any of the transfers in the transaction, the full indicated length of the
transaction must still be completed. There is no such thing as early burst termination.
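
A read burst can be sketched in the same style. Here each data transfer carries its own RRESP, and the burst always runs to its full length even when one transfer reports an error. The function name, the dictionary-based memory and the choice to return 0 as data on an error are assumptions for illustration; OKAY and SLVERR are response encodings from the AXI specification.

```python
def read_burst(mem, araddr, length, word_bytes=4):
    """Return one (RDATA, RRESP, RLAST) tuple per transfer in the burst."""
    beats = []
    for n in range(length):
        addr = araddr + n * word_bytes
        if addr in mem:
            rdata, rresp = mem[addr], "OKAY"
        else:
            rdata, rresp = 0, "SLVERR"   # error reported on this transfer only
        rlast = (n == length - 1)        # RLAST is set on the final transfer
        beats.append((rdata, rresp, rlast))
    return beats


memory = {0x0: 0xA, 0x8: 0xC}            # address 0x4 is deliberately missing
for beat in read_burst(memory, 0x0, 3):
    print(beat)
```

The middle transfer reports SLVERR, but the third transfer is still delivered with RLAST set, because there is no early burst termination.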

Active transactions
Active transactions are also known as outstanding transactions.

An active read transaction is a transaction for which the read address has been transferred, but
the last read data has not yet been transferred at the current point in time.

With reads, the data must come after the address, so there is a simple reference point for when
the transaction starts. This is shown in the following diagram:
For write transactions, the data can come after the address, but leading write data is also allowed. The
start of a write transaction can therefore be either of the following:

• The transfer of the write address
• The transfer of leading write data

Therefore, an active write transaction is a transaction for which the write address or leading write
data has been transferred, but the write response has not yet been transferred.
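
The definitions of active read and write transactions can be sketched as a small tracker. The class and the transaction labels ("r0", "w0") are hypothetical bookkeeping handles, not AXI signals; the rules for when a transaction starts and ends are the ones stated above.

```python
class ActiveTransactionTracker:
    """Tracks which transactions are active (outstanding)."""

    def __init__(self):
        self.active_reads = set()
        self.active_writes = set()

    def read_address(self, txn):
        self.active_reads.add(txn)           # a read starts with its address

    def read_data(self, txn, last):
        if last:
            self.active_reads.discard(txn)   # ends with the last read data

    def write_address(self, txn):
        self.active_writes.add(txn)

    def write_data(self, txn):
        self.active_writes.add(txn)          # leading write data also starts it

    def write_response(self, txn):
        self.active_writes.discard(txn)      # ends with the write response


tracker = ActiveTransactionTracker()
tracker.write_data("w0")                     # leading write data
tracker.write_address("w0")
tracker.read_address("r0")
tracker.read_data("r0", last=False)
print(len(tracker.active_reads), len(tracker.active_writes))  # 1 1
tracker.read_data("r0", last=True)
tracker.write_response("w0")
print(len(tracker.active_reads), len(tracker.active_writes))  # 0 0
```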

The following diagram shows an active write transaction where the write address has been
transferred, but the write response has not yet been transferred:

The following diagram shows an active write transaction where the leading write data has been
transferred, but the write response has not yet been transferred:
What are the AMBA protocols?
February 15, 2021 By Nikhil Agnihotri

As electronic miniaturization has long been a goal of chip manufacturers,
motherboard-based computer/electronic systems have eventually been replaced by
System-on-Chip (SoC) and Package-on-Package (PoP) ICs. Complex computer systems
are now condensed into smartphones and other handheld devices. These sophisticated
electronic devices and gadgets have an SoC at their heart, which manages all
computing and control. The SoC package comprises several intellectual property
(IP) cores. These IP cores come from different chip design companies and vendors.

Scalable, compatible, and efficient data communication between the various IP cores in an
SoC was a challenge. Chip designers initially handled this through laborious
redesign, compatibility testing, and the design of additional interfaces. This approach
lacked right-first-time coherency, leading to costly future redesigns. One of the widely
accepted and feasible solutions to this problem was introduced by Arm in 1996 as the
AMBA protocols.

What is AMBA?
AMBA (Advanced Microcontroller Bus Architecture) is a freely available, open standard
for the interconnection and management of IP cores in a System-on-Chip (SoC) IC. It allows
right-first-time development of multi-processor chip designs in a modular, reusable, and
scalable manner. This helps avoid costly redesigns and reduces the time-to-market of
integrated designs.
AMBA was first introduced in 1996 with Advanced Peripheral Bus (APB) and Advanced
System Bus (ASB) specifications. The second version of AMBA was introduced in 1999
and included Advanced High-Performance Bus (AHB) specifications. AMBA 3 that
included Advanced Extensible Interface (AXI), was introduced in 2003. AMBA 4
introduced AXI Coherency Extensions (ACE) in 2010 and AMBA 5, the latest version of
AMBA, introduced Coherent Hub Interface (CHI) in 2013.

AMBA Bus system


A traditional AMBA-based SoC design uses the Advanced System Bus (ASB) or Advanced
High-performance Bus (AHB) specifications for high-bandwidth communication with
blocks like the processor, on-chip RAM, memory interfaces, and DMA bus master, and it
uses the Advanced Peripheral Bus (APB) for low-bandwidth communication with blocks like
UART, GPIO, keypad, display, timer, etc. SoCs that have a large number of functional
blocks or IP cores require a point-to-point interconnect that uses the Advanced Extensible
Interface (AXI) specifications. The AXI bus manages communication using a master-slave
protocol and can be easily bridged with APB. Multiple AXI masters and
slaves can be connected through the same interconnect.

Mobile phones and smartphones that contain SoCs with multiple processor cores
sharing a common cache memory require management of coherency across the
memory subsystem. For this, the ACE specifications were introduced in AMBA 4. The
AXI/ACE specifications were redesigned as CHI to manage communication mechanisms
in heterogeneous computing systems. In contrast to the signal-based protocol in the AXI/ACE
specifications, CHI is a packet-based layered protocol that can scale up to
communication mechanisms between heterogeneous functional blocks like Digital
Signal Processors (DSP), Graphics Processing Units (GPU), I/O subsystems, and memory
controllers.

AMBA specifications
AMBA is a set of interconnect protocols. The latest version AMBA 5 includes the
following specifications:

1. APB: The Advanced Peripheral Bus (APB) has been part of AMBA since its
first release and has been revised in later versions. It is a simple non-pipelined
protocol used for master-slave communication with low-bandwidth peripherals. A number
of peripherals can be connected to a shared bus, which is managed via a
bridge (like an AXI-APB bridge) or directly by a master (processor/controller).
In the APB specifications, the same address and control signals are used for reads
and writes over the bus, and burst data transfers are not supported.
2. ASB: The Advanced System Bus (ASB) is a pipelined protocol for
communication with high-bandwidth and high-frequency
components. It supports burst transfers and multiple bus masters, and it
interconnects multiple masters and memories.
The bus consists of four types of blocks – Master, Arbiter, Slave, and
Decoder. At any time, only one master can access the bus. A master
gains access to the bus through the arbiter, and the decoder selects the
slave for communication. The master initiates a read or write
operation, and the selected slave responds to the request.
3. AHB: The Advanced High-Performance Bus (AHB) was introduced in AMBA 2.0. It
is an alternative to ASB where high-performance features are required. It supports
wider data bus configurations, single-cycle bus-master handover, split
transactions, and single clock-edge operation. Like ASB, the AHB bus also
requires additional components for managing communication, such as a
read multiplexer, write multiplexer, decoder, arbiter, and address and control
multiplexer. The bus system consists of three buses – the address bus, the write data
bus, and the read data bus. The address is used to select a slave, the write data
bus moves data from the master to the slave, and the read data bus
moves responses from slaves to masters. The master accesses the bus by
sending a request to the arbiter, and the decoder selects the slave. The bus is granted to
a master on the basis of a prioritization scheme. This scheme is not defined in the
AMBA specifications and differs between designs. There are 20 different
AHB signals in total compared to 15 signals in ASB.
4. AHB-lite: It is a simplified version of AHB. It supports communication
mechanisms with a single master without the need for any arbiter. It also
excludes some high-performance features of AHB like split transactions and
retries.
5. AXI: The Advanced Extensible Interface (AXI) is a point-to-point interconnect
specification that overcomes the limitations of shared bus protocols in
connecting multiple agents. It was specifically designed to manage
communication mechanisms with multi-core processors and controllers. AXI
specifications were introduced in AMBA 3.0. Instead of using a system bus, it uses
well-defined interfaces for high-bandwidth and low-latency communication
mechanisms. It has several enhanced features compared to AHB, like multiple
pipelined transfers, separate read/write wires, wider data bus widths, and burst
data transfers.
6. AXI-lite: It is a simplified version of the AXI protocol. It lacks burst data transfers
compared to full AXI specifications.
7. AXI-stream: This is a modification of the AXI protocol for supporting data
streaming from masters to slaves. In this protocol, data is moved only in one
direction from master to slave. The read/write channels are not separate in AXI-
stream unlike in the full AXI specification. It is possible to transfer multiple
streams of data between master and slave. This protocol is useful in applications
like video streaming, game streaming, etc.
8. ACE: The AXI Coherency Extensions (ACE) specifications were introduced in
AMBA 4.0. This specification is used to manage communication mechanisms in
multi-core processors/controllers with coherent cache memories shared between
them. The ACE specification extends the AXI read and write channels with
separate snoop address, snoop data, and snoop response channels.
These additional channels implement a snoop-based coherency protocol.
9. ACE-lite: ACE-lite is a simplified version of the full ACE specification. It was
designed to manage communication mechanisms with agents that do not have a
cache memory of their own but still can participate in a sharable coherency
system using one-way coherency. Examples of such agents are DMA controllers
and Network-on-Chip blocks.
10. CHI: The Coherent Hub Interface (CHI) is a redesign of the ACE protocol for much
more complex heterogeneous computing systems. The ACE protocol uses signal-level
master-slave communication, interconnecting agents with a large number of wires and
additional channels for snoops and responses. This works fine for small coherent
clusters like dual or quad-core mobile SoCs. However, with many heterogeneous
components like DSP, GPU, NPU, etc., AXI hits limitations because it is still a
signal-based protocol. CHI is a redesign of the AXI bus that uses packet-based
interface protocols instead of a signal-based bus system.
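
The arbiter role described for ASB and AHB above can be sketched in a few lines. This is only an illustration under a stated assumption: it uses a fixed-priority scheme in which the lowest-numbered master wins, which is one possible choice and not something the text above prescribes. The function name `arbitrate` is hypothetical.

```python
from typing import Optional, Sequence

def arbitrate(requests: Sequence[bool]) -> Optional[int]:
    """Grant the bus to one master at a time.

    Assumption: fixed priority, where master 0 has the highest priority.
    Real designs may use round-robin or other schemes instead.
    Returns the index of the granted master, or None if nobody is requesting.
    """
    for master_id, requesting in enumerate(requests):
        if requesting:
            return master_id
    return None

# Masters 1 and 2 both request the bus; only one of them is granted.
print(arbitrate([False, True, True]))
```

Whatever the scheme, the property that matters is that at most one master is granted the bus in any cycle.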

Conclusion
If you are in VLSI design, you most likely have heard or learned about AMBA protocols.
AMBA has evolved over the years to meet the needs of state-of-the-art SoC designs and
future IC developments. AMBA protocols are open-standard and can be downloaded
from the Arm website after free registration. This article gives you an overview of various
AMBA specifications. You can download the specifications from the Arm website and
learn about these chip design protocols in more depth.
APB Protocol
Introduction
Advanced Peripheral Bus (APB) is part of the Advanced Microcontroller Bus Architecture
(AMBA) family of protocols. The latest version of APB is v2.0, which was part of the AMBA 4
release. It is a low-cost interface optimized for minimal power consumption and
reduced interface complexity. Unlike AHB, it is a non-pipelined protocol used to connect
low-bandwidth peripherals, mostly external peripherals, to the SoC. In
APB, every transfer takes at least two clock cycles (SETUP cycle and ACCESS cycle) to
complete. It can also interface with the AHB and AXI protocols through bridges.

The above diagram depicts a block diagram of a system. The high-performance ARM
processor is the core of the system. Other components like the high-bandwidth on-chip RAM,
DMA bus master, and high-bandwidth memory interface are connected to the core by the system
bus, which is AHB in this case. Low-bandwidth peripherals like the UART, timer, keypad,
and PIO are connected to the system bus through the bridge using the peripheral bus, which
here is APB. In this scenario, the bridge acts as an AHB slave to the core
master and as the APB master to the low-bandwidth
external peripherals.

Generally, no other component originates APB transfers: the AHB-to-APB
bridge is the only component that acts as the APB master in a system.

Block Diagram & Signal Description


From the block diagram shown above:

• System bus slave interface – The system bus interface through which
AHB/AXI transactions reach the APB bridge.
• PCLK – Clock; generally the system clock is connected directly to it.
• PRESETn – Active-low asynchronous reset.
• PADDR[31:0] – Address bus from master to slave; can be up to 32 bits wide.
• PWDATA[31:0] – Write data bus from master to slave; can be up to 32 bits wide.
• PRDATA[31:0] – Read data bus from slave to master; can be up to 32 bits wide.
• PSELx – Slave select signal. There is one PSEL signal for each slave connected to the
master, so a master connected to n slaves drives n select signals
(e.g. PSEL1, PSEL2, ..., PSELn).
• PENABLE – Indicates the second and subsequent cycles of a transfer. When PENABLE is
asserted, the ACCESS phase of the transfer starts.
• PWRITE – Indicates a write when HIGH and a read when LOW.
• PREADY – Used by the slave to insert wait states into the transfer. Whenever the
slave is not ready to complete the transaction, it holds the master off
by de-asserting PREADY.
• PSLVERR – Indicates the success or failure of the transfer. HIGH indicates failure and
LOW indicates success.
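
As a rough illustration of the one-PSEL-per-slave idea in the list above, the sketch below derives the select lines from PADDR. The address map, the peripheral names in the comments, and the function name `psel_lines` are all hypothetical; a real bridge gets its address map from the SoC design.

```python
from typing import Dict, List, Tuple

# Hypothetical address map: slave index -> (base address, size in bytes).
PERIPHERALS: Dict[int, Tuple[int, int]] = {
    0: (0x4000_0000, 0x100),  # e.g. a UART
    1: (0x4000_0100, 0x100),  # e.g. a timer
    2: (0x4000_0200, 0x100),  # e.g. a keypad
}

def psel_lines(paddr: int) -> List[int]:
    """Return one PSEL value per slave; at most one entry is 1."""
    return [
        1 if base <= paddr < base + size else 0
        for _, (base, size) in sorted(PERIPHERALS.items())
    ]

print(psel_lines(0x4000_0104))  # address inside the second slave's window
```

An address outside every window leaves all select lines low, so no slave takes part in the transfer.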

Let’s see how typical write and read transfers are done in the APB protocol.

WRITE Transfer –  Without Wait States


• At T1, a write transfer starts with PADDR, PWDATA, PWRITE, and PSEL driven.
• They are registered at the next rising edge of PCLK, T2.
• This is the SETUP phase of the transfer.
• After T2, PENABLE and PREADY are registered at the rising edge of PCLK.
• When asserted, PENABLE indicates the start of the ACCESS phase.
• When asserted, PREADY indicates that the slave can complete the transfer at the next
rising edge of PCLK.
• PADDR, PWDATA, and the control signals must all remain valid until the transfer
completes at T3.
• PENABLE is de-asserted at the end of the transfer.
• PSEL is also de-asserted, unless the next transfer is to the same slave.
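
The two phases of a write without wait states can be written out as a cycle-by-cycle signal table. This is a minimal sketch, not a bus model: the function name `write_transfer` and the dictionary layout are assumptions made for illustration.

```python
from typing import Dict, List

def write_transfer(paddr: int, pwdata: int) -> List[Dict[str, int]]:
    """Return the signal values held in each clock cycle of a no-wait write."""
    # Address, data, and control are driven at T1 and held for the whole transfer.
    common = {"PSEL": 1, "PWRITE": 1, "PADDR": paddr, "PWDATA": pwdata}
    setup = dict(common, PENABLE=0)             # T1 to T2: SETUP phase
    access = dict(common, PENABLE=1, PREADY=1)  # T2 to T3: ACCESS phase
    return [setup, access]

for cycle in write_transfer(0x4000_0004, 0xCAFE):
    print(cycle)
```

Only PENABLE differs between the two cycles; everything the master drove in the SETUP phase is still valid when the transfer completes.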

WRITE Transfer –  With Wait States


• During the ACCESS phase, when PENABLE is high, the slave extends the transfer by
driving PREADY low.
• The PADDR, PWRITE, PSEL, PENABLE, PWDATA, PSTRB, and PPROT signals should remain
unchanged while PREADY is low.
• PREADY can take any value when PENABLE is low.
• It is recommended that the address and write signals are not changed immediately
after a transfer, but remain stable until another access occurs.
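
The effect of wait states on transfer length is simple arithmetic: one SETUP cycle, plus one ACCESS cycle for every cycle in which PREADY is sampled, ending with the first cycle where it is high. The helper names below are hypothetical and the sketch only counts cycles; it does not model the data path.

```python
from typing import Iterable

def access_cycles(pready_samples: Iterable[int]) -> int:
    """Count ACCESS-phase cycles up to and including the first PREADY = 1."""
    for count, pready in enumerate(pready_samples, start=1):
        if pready:
            return count
    raise ValueError("slave never asserted PREADY")

def transfer_cycles(wait_states: int) -> int:
    """Total cycles: one SETUP cycle plus the ACCESS cycles."""
    # The slave holds PREADY low for `wait_states` cycles, then drives it high.
    pready_samples = [0] * wait_states + [1]
    return 1 + access_cycles(pready_samples)

print(transfer_cycles(0), transfer_cycles(2))
```

With no wait states this gives the two-cycle minimum; each wait state adds exactly one cycle.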

READ Transfer –  Without Wait States

• At T1, a read transfer starts with PADDR, PWRITE, and PSEL driven.
• They are registered at the next rising edge of PCLK, T2.
• This is the SETUP phase of the transfer.
• After T2, PENABLE and PREADY are registered at the rising edge of PCLK.
• When asserted, PENABLE indicates the start of the ACCESS phase.
• When asserted, PREADY indicates that the slave can complete the transfer at the next
rising edge of PCLK by providing the data on PRDATA.
• The slave must provide the data before the end of the read transfer, i.e. before T3.
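
Seen from the slave side, the read sequence above might look like the following toy model. It assumes a slave that is a plain bank of registers and never inserts wait states; the class `RegisterSlave` and the helper `apb_read` are hypothetical names introduced for this sketch.

```python
from typing import Dict, Tuple

class RegisterSlave:
    """Toy APB slave: a bank of registers that never inserts wait states."""

    def __init__(self) -> None:
        self.regs: Dict[int, int] = {}

    def cycle(self, psel: int, penable: int, pwrite: int,
              paddr: int, pwdata: int = 0) -> Tuple[int, int]:
        """Evaluate one PCLK cycle and return (PREADY, PRDATA)."""
        if psel and penable:                   # ACCESS phase
            if pwrite:
                self.regs[paddr] = pwdata
                return 1, 0
            return 1, self.regs.get(paddr, 0)  # read data is valid with PREADY
        return 0, 0                            # SETUP or idle: PREADY is don't-care

def apb_read(slave: RegisterSlave, paddr: int) -> int:
    """Drive a SETUP cycle followed by an ACCESS cycle and return PRDATA."""
    slave.cycle(psel=1, penable=0, pwrite=0, paddr=paddr)                   # SETUP
    pready, prdata = slave.cycle(psel=1, penable=1, pwrite=0, paddr=paddr)  # ACCESS
    assert pready == 1, "this toy slave never inserts wait states"
    return prdata

uart = RegisterSlave()
uart.regs[0x08] = 0x55
print(hex(apb_read(uart, 0x08)))
```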

READ Transfer –  With Wait States

• During the ACCESS phase, when PENABLE is high, the slave extends the transfer by
driving PREADY low.
• The PADDR, PWRITE, PSEL, PENABLE, and PPROT signals should remain unchanged while
PREADY is low.

ERROR Response

Whenever there is a problem in the transfer, the slave signals an error response
by asserting PSLVERR. PSLVERR is only considered valid during the last
cycle of an APB transfer, when PSEL, PENABLE, and PREADY are all HIGH. It is recommended,
but not mandatory, that you drive PSLVERR low when it is not being sampled.

Transactions that receive an error response might or might not have changed the state of the
peripheral. For example, if an APB master performs a write transaction to an APB slave and
receives an error response, it is not guaranteed that the data was not written to the slave
peripheral.
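
The sampling rule for PSLVERR (valid only when PSEL, PENABLE, and PREADY are all HIGH) can be expressed as a small check. The function name `transfer_error` and the use of `None` for "not valid in this cycle" are choices made for this sketch.

```python
from typing import Optional

def transfer_error(psel: int, penable: int, pready: int,
                   pslverr: int) -> Optional[bool]:
    """Interpret PSLVERR for one clock cycle.

    Returns None when PSLVERR must be ignored (not the last cycle of a
    transfer), True for an error response, and False for success.
    """
    if psel and penable and pready:
        return bool(pslverr)
    return None

print(transfer_error(1, 1, 0, 1))  # wait state: PSLVERR is ignored
print(transfer_error(1, 1, 1, 1))  # last cycle: error response
```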

Error Response for a read transfer:


Error Response for a write transfer:

Protection Unit Support:

To support complex system designs, it is often necessary for both the interconnect and
other devices in the system to provide protection against illegal transactions. In the APB
protocol, this is provided by protection unit support, signaled on PPROT[2:0].

The three levels of access protection are

• PPROT[0]: LOW indicates normal access; HIGH indicates privileged access.
• PPROT[1]: LOW indicates secure access; HIGH indicates non-secure access.
• PPROT[2]: LOW indicates data access; HIGH indicates instruction access.
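
The three PPROT bits listed above decode independently, which a short helper makes concrete. The function name `decode_pprot` and the dictionary keys are hypothetical; only the bit meanings come from the list above.

```python
from typing import Dict

def decode_pprot(pprot: int) -> Dict[str, str]:
    """Decode the 3-bit PPROT value into its three access attributes."""
    return {
        "privilege": "privileged" if pprot & 0b001 else "normal",    # PPROT[0]
        "security": "non-secure" if pprot & 0b010 else "secure",     # PPROT[1]
        "kind": "instruction" if pprot & 0b100 else "data",          # PPROT[2]
    }

print(decode_pprot(0b101))
```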

Operating States

The APB Protocol operates in three operating states as shown below.

IDLE :  This is the default state of APB.

SETUP : When a transfer is required, PSELx is asserted and the bus moves into the SETUP
state. The bus remains in SETUP for only one clock cycle and always moves to the ACCESS
state on the next rising edge of the clock. So, the slave must be able to sample the address
and control information in the SETUP cycle itself.

ACCESS :  PENABLE is asserted to enter the ACCESS state. The PADDR, PWRITE, PSELx, and
PWDATA signals must remain stable during the ACCESS state.
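
The three operating states can be sketched as a next-state function. The IDLE and SETUP transitions follow the descriptions above; the exits from ACCESS (stay while PREADY is low, then go to SETUP if another transfer is pending, otherwise to IDLE) follow the state diagram in the APB specification rather than the text above. The names `State` and `next_state` are hypothetical.

```python
from enum import Enum

class State(Enum):
    IDLE = 0
    SETUP = 1
    ACCESS = 2

def next_state(state: State, transfer_pending: bool, pready: bool) -> State:
    """Return the bus state after the next rising edge of PCLK."""
    if state is State.IDLE:
        return State.SETUP if transfer_pending else State.IDLE
    if state is State.SETUP:
        return State.ACCESS            # SETUP always lasts exactly one cycle
    # ACCESS: wait states keep the bus here until the slave raises PREADY.
    if not pready:
        return State.ACCESS
    return State.SETUP if transfer_pending else State.IDLE

# One transfer with a single wait state, then back to IDLE.
state = State.IDLE
for pending, pready in [(True, False), (False, False), (False, False), (False, True)]:
    state = next_state(state, pending, pready)
    print(state.name)
```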
