Notes - Unit 5
EMBEDDED SYSTEMS IN ALL-PROGRAMMABLE SOC
AXI BUS
References:
The Zynq Book
AXI4 Specification
Connecting User Logic to AXI Interfaces of High-Performance Communication Blocks in the SmartFusion2 Devices – Libero SoC v11.4.
AXI4-FULL INTERFACE
The AXI protocol is burst-based and defines five independent transaction channels.
Write Channel Architecture: Address and Control data is transmitted to the slave before a burst of data is transmitted, and a Write Response is signaled following completion:
Write Address Channel
Write Data Channel
Write Response Channel
Read Channel Architecture: Address and Control data is transmitted to the slave before a burst of read data is transmitted to the master:
Read Address Channel
Read Data Channel
[Figure: AXI master and AXI slave connected by the Write Address, Write Data, and Write Response channels (write) and by the Read Address and Read Data channels (read).]
Data can move in both directions simultaneously.
Data transfer size: up to 256 data transfers (burst transactions).
AXI4-Lite: One data transfer per transaction. Bursts are not supported.
AXI4-Stream: One single channel for transmission of streaming data. It can burst an unlimited amount of data.
Write/Read Data Channel: The data bus can be: 8, 16, 32, 64, 128, 256, 512, or 1024 bits wide.
Burst Size: This is the number of bytes in each transfer, defined by the 3-bit signals S_AXI_AWSIZE and S_AXI_ARSIZE. The encodings are 000 (1 byte), 001 (2 bytes), 010 (4 bytes), 011 (8 bytes), 100 (16 bytes = 128 bits), and so on in powers of two up to 111 (128 bytes = 1024 bits).
The Burst Size must not exceed the Data Bus Width. If the AXI Width is greater than the Burst size, the AXI interface must
determine from the transfer address which byte lanes of data bus to use for each transfer (when writing, this can be done
using the WSTRB signal).
As a good rule of thumb, make the Burst Size the same as the width of the Write/Read data bus.
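The Burst Size encoding and the byte-lane selection described above can be sketched as follows. This is only an illustrative model: the function names are hypothetical, and the lane calculation assumes the usual little-endian lane numbering, where lane 0 carries the lowest-addressed byte.

```python
def bytes_per_transfer(axsize: int) -> int:
    """Decode a 3-bit AxSIZE value: 0b000 -> 1 byte, 0b001 -> 2 bytes, and so on."""
    if not 0 <= axsize <= 0b111:
        raise ValueError("AxSIZE is a 3-bit field")
    return 1 << axsize


def wstrb_for_transfer(address: int, axsize: int, bus_bytes: int) -> int:
    """Return a byte-lane mask (bit i = lane i) for one transfer on a bus of bus_bytes bytes."""
    nbytes = bytes_per_transfer(axsize)
    if nbytes > bus_bytes:
        raise ValueError("Burst Size must not exceed the data bus width")
    # Align the address down to the transfer size to find the first active lane.
    first_lane = (address % bus_bytes) // nbytes * nbytes
    return ((1 << nbytes) - 1) << first_lane
```

On a 32-bit bus (4 byte lanes), a full-width transfer enables all four lanes, while a 2-byte transfer at address 0x2 enables only lanes 2 and 3.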
Burst type: Defined by S_AXI_AWBURST and S_AXI_ARBURST. 00: FIXED (the address remains constant during the transaction), 01: INCR (the address increments by the transfer size after each transfer), 10: WRAP (the address increments like INCR but wraps around to a lower address at a boundary). The burst type refers to the address inside the peripheral where data should be placed. It is up to the recipient of the data to implement this feature.
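As an illustration of the three burst types, the sketch below lists the address of each transfer in a burst. It assumes the start address is aligned to the transfer size; for WRAP it also assumes the wrap boundary is the total burst size (bytes per transfer times number of transfers). The function name is hypothetical.

```python
FIXED, INCR, WRAP = 0b00, 0b01, 0b10


def burst_addresses(start: int, axlen: int, axsize: int, axburst: int) -> list:
    """Return the address used by each transfer of a burst (aligned start assumed)."""
    nbytes = 1 << axsize          # bytes per transfer
    length = axlen + 1            # number of transfers
    if start % nbytes:
        raise ValueError("this sketch only handles aligned start addresses")
    if axburst == FIXED:
        return [start] * length
    if axburst == INCR:
        return [start + i * nbytes for i in range(length)]
    if axburst == WRAP:
        total = nbytes * length   # the burst wraps inside a block of this size
        base = start - start % total
        return [base + (start - base + i * nbytes) % total for i in range(length)]
    raise ValueError("reserved burst type")
```

For example, a 4-transfer WRAP burst of 4-byte words starting at 0x18 visits 0x18, 0x1C, 0x10, 0x14.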
Burst Length: This is defined by the S_AXI_AWLEN and S_AXI_ARLEN signals, which encode the exact number of transfers in a burst minus one: 0x00 to 0xFF give 1 to 256 transfers for the INCR burst type. For all the other burst types, only 1 to 16 transfers are supported. (It seems that in Zynq, a burst can only be up to 16 words.)
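A small sketch of the length encoding, using the limits quoted in these notes (1 to 256 transfers for INCR, 1 to 16 for the other types). The function names are hypothetical.

```python
def axlen_for(transfers: int, burst_type: str = "INCR") -> int:
    """Return the AxLEN field value for a burst with the given number of transfers."""
    limit = 256 if burst_type == "INCR" else 16
    if not 1 <= transfers <= limit:
        raise ValueError(f"{burst_type} bursts support 1 to {limit} transfers")
    return transfers - 1          # the field stores (number of transfers - 1)


def burst_bytes(axlen: int, axsize: int) -> int:
    """Total bytes moved by a burst: transfers times bytes per transfer."""
    return (axlen + 1) * (1 << axsize)
```

With AWLEN = 03 and a Burst Size of 4 bytes, the burst moves 4 words, i.e., 16 bytes.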
Signals:
Global System Signals:
S_AXI_CLK: AXI4 clock
S_AXI_ARESETN: AXI4 active-low reset.
Each of the five channels has its own set of signals:
WRITE ADDRESS CHANNEL: S_AXI_AWID, S_AXI_AWADDR, S_AXI_AWLEN, S_AXI_AWSIZE, S_AXI_AWBURST, S_AXI_AWLOCK, S_AXI_AWCACHE, S_AXI_AWPROT, S_AXI_AWQOS, S_AXI_AWREGION, S_AXI_AWUSER, S_AXI_AWVALID, S_AXI_AWREADY
WRITE DATA CHANNEL: S_AXI_WDATA, S_AXI_WSTRB, S_AXI_WLAST, S_AXI_WUSER, S_AXI_WVALID, S_AXI_WREADY
WRITE RESPONSE CHANNEL: S_AXI_BID, S_AXI_BRESP, S_AXI_BUSER, S_AXI_BVALID, S_AXI_BREADY
READ ADDRESS CHANNEL: S_AXI_ARID, S_AXI_ARADDR, S_AXI_ARLEN, S_AXI_ARSIZE, S_AXI_ARBURST, S_AXI_ARLOCK, S_AXI_ARCACHE, S_AXI_ARPROT, S_AXI_ARQOS, S_AXI_ARREGION, S_AXI_ARUSER, S_AXI_ARVALID, S_AXI_ARREADY
READ DATA CHANNEL: S_AXI_RID, S_AXI_RDATA, S_AXI_RRESP, S_AXI_RLAST, S_AXI_RUSER, S_AXI_RVALID, S_AXI_RREADY
AXI4-FULL PROTOCOL
The VALID/READY handshake process is used by all five transaction channels ('Assert and Wait' rule).
VALID: Generated by the source only when information (address, data, and control) is available.
READY: Generated by the destination to indicate it can accept information.
A transfer occurs on the rising clock edge at which VALID = READY = 1. In the timing diagram, VALID then returns to 0, followed by READY.
Note: A source is not permitted to wait until READY is asserted before asserting VALID. Once asserted, VALID must remain asserted until the transfer occurs.
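The handshake rule can be checked on sampled waveforms with a short sketch like the one below. Each list holds the value of a signal at successive rising clock edges; the function names are hypothetical.

```python
def transfer_cycles(valid: list, ready: list) -> list:
    """Return the clock-edge indices at which a transfer occurs (VALID = READY = 1)."""
    return [i for i, (v, r) in enumerate(zip(valid, ready)) if v and r]


def valid_is_held(valid: list, ready: list) -> bool:
    """Check that VALID, once asserted, stays asserted until the handshake completes."""
    waiting = False               # True while VALID was high but READY was low
    for v, r in zip(valid, ready):
        if waiting and not v:
            return False          # the source withdrew VALID before the transfer
        waiting = bool(v and not r)
    return True
```

For instance, with VALID = 0,1,1,1,0 and READY = 0,0,0,1,0 there is a single transfer at edge 3, and the source correctly holds VALID while it waits.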
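[Timing diagram: VALID/READY handshake, showing ACLK, ARESETN, VALID, and READY.]
[Timing diagram: write burst. Write Address Channel: AWADDR = A, AWLEN = 03, AWVALID, AWREADY. Write Data Channel: WLAST, WVALID, WREADY. Write Response Channel: BRESP = OK, BVALID, BREADY.]
[Timing diagram: read burst. Read Address Channel: ARADDR = A, ARLEN = 03, ARVALID, ARREADY. Read Data Channel: RLAST, RVALID, RREADY, RRESP = OK for each of the four transfers.]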
AXI4-LITE INTERFACE
This is a reduced version of the AXI4-Full. It does not support bursts, i.e., each transaction carries only one data transfer.
Data bus: 32 or 64 bits.
WRITE ADDRESS CHANNEL: S_AXI_AWADDR, S_AXI_AWPROT, S_AXI_AWVALID, S_AXI_AWREADY
WRITE DATA CHANNEL: S_AXI_WDATA, S_AXI_WSTRB, S_AXI_WVALID, S_AXI_WREADY
WRITE RESPONSE CHANNEL: S_AXI_BRESP, S_AXI_BVALID, S_AXI_BREADY
READ ADDRESS CHANNEL: S_AXI_ARADDR, S_AXI_ARPROT, S_AXI_ARVALID, S_AXI_ARREADY
READ DATA CHANNEL: S_AXI_RDATA, S_AXI_RRESP, S_AXI_RVALID, S_AXI_RREADY
AXI4-LITE PROTOCOL
The AXI Master Interface provided by Zynq in Vivado sends both the Write Address and Write Data at the same time. When
Reading, the Master first requests to read an address and the AXI Slave responds with data.
Write cycle and Read Cycle (Xilinx AXI4-Lite, from Master's point of view)
S_AXI_AWREADY: Registered signal asserted for one clock cycle when S_AXI_AWVALID=S_AXI_WVALID='1' (this can happen immediately or after a few cycles).
S_AXI_WREADY: Registered signal asserted for one clock cycle when S_AXI_AWVALID=S_AXI_WVALID='1' (this can happen immediately or after a few cycles).
S_AXI_AWADDR: It is captured into axi_awaddr when S_AXI_AWVALID=S_AXI_WVALID='1' and S_AXI_AWREADY='0'.
S_AXI_ARREADY: It is asserted for one clock cycle when S_AXI_ARVALID is asserted (it can happen immediately or after a few cycles).
S_AXI_ARADDR: It is captured into the axi_araddr signal when S_AXI_ARVALID='1' and S_AXI_ARREADY='0'.
S_AXI_RVALID: It is asserted for one clock cycle right after both S_AXI_ARVALID and S_AXI_ARREADY are detected to be '1'. During that clock cycle, S_AXI_RREADY is still '1' (due to the AXI specification), so when S_AXI_RVALID becomes zero, S_AXI_RREADY follows suit and becomes zero.
[Figure: registers with enable E that capture S_AXI_AWADDR into axi_awaddr and S_AXI_ARADDR into axi_araddr, with the logic that generates S_AXI_AWREADY, S_AXI_WREADY, and S_AXI_ARREADY from the VALID signals.]
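The write-address behavior described above (a registered S_AXI_AWREADY pulse and the capture of the address) can be modeled cycle by cycle. This is a simplified sketch based only on the conditions stated in these notes, not a copy of the Vivado template; the function name is hypothetical.

```python
def simulate_write_address(awvalid: list, wvalid: list, awaddr: list):
    """Return (awready seen in each cycle, captured axi_awaddr)."""
    awready = 0                   # registered output, starts deasserted
    axi_awaddr = None
    trace = []
    for av, wv, addr in zip(awvalid, wvalid, awaddr):
        trace.append(awready)     # value visible during this cycle
        # Rising clock edge: compute the next register values.
        if not awready and av and wv:
            awready = 1           # asserted for exactly one cycle
            axi_awaddr = addr     # address captured at the same edge
        else:
            awready = 0
    return trace, axi_awaddr
```

With both VALID signals high in cycles 1 and 2, S_AXI_AWREADY is seen high only in cycle 2, which is the cycle where the handshake completes.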
[Timing diagram: AXI4-Lite write cycle. Write Address Channel: AWADDR = A, AWVALID, AWREADY. Write Data Channel: WDATA = D(A), WSTRB = 1111, WVALID, WREADY. Write Response Channel: BRESP = OK, BVALID, BREADY.]
[Timing diagram: AXI4-Lite read cycle. Read Address Channel: ARADDR = A, ARVALID, ARREADY. Read Data Channel: RDATA = D(A), RVALID, RREADY, RRESP = OK.]
[Figure: AXI4-Lite peripheral upix_ip. S_AXI_AWADDR and S_AXI_ARADDR are 4 bits wide, S_AXI_WDATA and S_AXI_RDATA are 32 bits wide, S_AXI_WSTRB is 4 bits wide, and S_AXI_BRESP is 2 bits wide. Slave Register 0 is selected when axi_awaddr(3..2) = 00, its bytes pass through 8-to-8 LUTs, and the result is read as Slave Register 1 when axi_araddr(3..2) = 01 and slv_reg_rden is asserted. Everything is clocked by S_AXI_ACLK.]
Address (S_AXI_AWADDR, S_AXI_ARADDR): In this example, we selected only two registers, but Vivado 2015.3 creates a template with a minimum of four 32-bit registers. So, we have 16 bytes, hence the 4-bit addresses, from which we only use the 2 MSBs to identify the Slave Registers: Register 0 is given the 00 code, and Register 1 the 01 code.
Note that Slave Registers 1 and 2 are only read by the processor, so we do not need a physical register for either of these so-called Slave Registers. A multiplexor suffices in this case.
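The address decoding can be illustrated with a short sketch: the two MSBs of the 4-bit byte address, bits (3..2), select one of four 32-bit registers, and a read is just a multiplexer over the available values. The function names and the example values are hypothetical.

```python
def register_index(axi_addr: int) -> int:
    """Select a 32-bit Slave Register from bits (3..2) of a 4-bit byte address."""
    return (axi_addr >> 2) & 0b11


def read_mux(axi_araddr: int, sources: list) -> int:
    """Model the read multiplexer: return the 32-bit value selected by axi_araddr(3..2)."""
    return sources[register_index(axi_araddr)] & 0xFFFFFFFF
```

Byte addresses 0x0, 0x4, 0x8, and 0xC map to register codes 00, 01, 10, and 11.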
Important: The pipelined divider captures input data when E = 1. We use the signal slv_reg_wren to determine whether data is present on Slave Register 0. However, data is present on Slave Register 0 on the cycle after slv_reg_wren = 1. That is why E is asserted on the next state (S2). This is an important consideration when designing more complex systems.
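The one-cycle delay can be seen in a small cycle-based model. This is a simplified sketch, not the FSM from the figure: it assumes a two-state machine that moves from S1 to S2 when slv_reg_wren = 1, asserts E only in S2, and returns to S1 after one cycle. The return to S1 and the function name are assumptions made for illustration.

```python
def simulate_enable(wren_seq: list, wdata_seq: list) -> list:
    """Return (state, E, Slave Register 0) as seen during each clock cycle."""
    slv_reg0 = 0
    state = "S1"
    log = []
    for wren, wdata in zip(wren_seq, wdata_seq):
        e = 1 if state == "S2" else 0      # E depends only on the current state
        log.append((state, e, slv_reg0))
        # Rising clock edge: the register and the FSM state update together.
        next_state = "S2" if (state == "S1" and wren) else "S1"
        if wren:
            slv_reg0 = wdata               # new data is visible from the next cycle
        state = next_state
    return log
```

In the cycle where slv_reg_wren = 1, Slave Register 0 still holds the old value; in the following cycle (state S2) the new value is present and E = 1, so the divider captures the correct data.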
Software application: The software routine writes a 32-bit word (A and B) and the divider starts processing. The software routine must write another 32-bit word (a dummy) to restart the process.
An improvement would be to let the software only write 32-bit words of actual data (A and B), while a more complex FSM would take care of asserting E and then de-asserting E (when the processor requests reading via slv_reg_rden). When using more Slave Registers, we need to consider axi_awaddr and axi_araddr to identify the registers to/from which we write/read.
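From the software side, the sequence can be sketched with a stand-in model of the peripheral. Several details here are assumptions, not taken from the notes: A is placed in the upper 16 bits and B in the lower 16 bits of the word, the quotient and remainder are read at byte offsets 0x4 and 0x8, and the dummy restart write is not modeled. The class and function names are hypothetical.

```python
REG0, REG1, REG2 = 0x0, 0x4, 0x8   # byte offsets for register codes 00, 01, 10


def pack_operands(a: int, b: int) -> int:
    """Pack A (upper 16 bits) and B (lower 16 bits) into one word (assumed layout)."""
    if not (0 <= a <= 0xFFFF and 0 < b <= 0xFFFF):
        raise ValueError("A must fit in 16 bits and B must be a nonzero 16-bit value")
    return (a << 16) | b


class DividerModel:
    """Software stand-in for the divider peripheral, used only to illustrate the accesses."""

    def __init__(self):
        self.q = 0
        self.r = 0

    def write(self, offset: int, word: int) -> None:
        if offset == REG0:
            a, b = word >> 16, word & 0xFFFF
            if b:
                self.q, self.r = divmod(a, b)   # result available for later reads

    def read(self, offset: int) -> int:
        return {REG1: self.q, REG2: self.r}.get(offset, 0)
```

A routine would write `pack_operands(100, 7)` to Register 0 and then read 14 and 2 back from the two read-only registers.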
[Figure: AXI4-Lite peripheral with a pipelined divider. Slave Register 0 (axi_awaddr(3..2) = 00) supplies the 32-bit input holding A and B. The divider outputs Q and R are multiplexed onto S_AXI_RDATA as Slave Register 1 (01) and Slave Register 2 (10). The divider has an enable input E and a valid output v.]
[Figure: FSM at S_AXI_ACLK. It enters S1 when S_AXI_ARESETN = 0, waits for slv_reg_wren = 1, and asserts E in S2.]
Reading bursts: according to the timing diagram obtained by simulating the Vivado template, this particular circuit can output only one word every two cycles.
Burst: This is configured by: i) S_AXI_AWSIZE and S_AXI_ARSIZE (bytes per transfer), ii) S_AXI_AWBURST and S_AXI_ARBURST (burst type), and iii) S_AXI_AWLEN and S_AXI_ARLEN (transfers per burst).
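The three burst parameters, together with the observed read rate of one word every two cycles, give a rough estimate of the data-phase duration of a read burst. The sketch below ignores the address phase and any wait states introduced by the master; the function name is hypothetical.

```python
def describe_burst(axsize: int, axburst: int, axlen: int, cycles_per_word: int = 2) -> dict:
    """Summarize a burst from its SIZE, BURST, and LEN fields."""
    names = {0b00: "FIXED", 0b01: "INCR", 0b10: "WRAP"}
    transfers = axlen + 1
    return {
        "bytes_per_transfer": 1 << axsize,
        "burst_type": names[axburst],
        "transfers": transfers,
        "total_bytes": transfers << axsize,
        "read_data_cycles": transfers * cycles_per_word,
    }
```

For a 4-word INCR read burst of 32-bit words (ARSIZE = 010, ARLEN = 03), the data phase takes about 8 cycles in this circuit.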
[Figure: AXI4-Full peripheral with an internal memory. Port widths: S_AXI_AWADDR and S_AXI_ARADDR (6 bits), S_AXI_AWLEN and S_AXI_ARLEN (8 bits), S_AXI_AWSIZE and S_AXI_ARSIZE (3 bits), S_AXI_AWBURST and S_AXI_ARBURST (2 bits), S_AXI_WDATA and S_AXI_RDATA (32 bits), S_AXI_WSTRB (4 bits), S_AXI_BRESP and S_AXI_RRESP (2 bits). Internal signals: axi_awv_awr_flag, axi_arv_arr_flag, axi_wready, mem_wren, mem_rden, axi_awaddr, axi_araddr.]
[Figure: the same AXI4-Full peripheral with 8-to-8 LUTs added next to the memory, using the same ports and internal signals (axi_awv_awr_flag, axi_arv_arr_flag, axi_wready, mem_wren, mem_rden, axi_awaddr, axi_araddr).]
Input Output
0xDEADBEEF 0xEED2DDF7
0xBEBEDEAD 0xDDDDEED2
0xFADEBEAD 0xFDEEDDD2
0xCAFEBEDF 0xE3FFDDEF
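The four rows above happen to be consistent with applying the same 8-to-8 LUT to each byte of the 32-bit word, with LUT(x) = round(16·sqrt(x)) saturated at 255. This formula is an inference from the table only; the notes do not state the LUT contents. A sketch that reproduces the table under that assumption:

```python
import math


def lut(x: int) -> int:
    """Assumed 8-to-8 LUT: scaled square root, saturated at 255 (inferred, not from the notes)."""
    return min(255, round(16 * math.sqrt(x)))


def process_word(word: int) -> int:
    """Apply the LUT independently to each of the four bytes of a 32-bit word."""
    out = 0
    for lane in range(4):
        byte = (word >> (8 * lane)) & 0xFF
        out |= lut(byte) << (8 * lane)
    return out
```

For example, `hex(process_word(0xDEADBEEF))` gives `0xeed2ddf7`, matching the first row.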
[Figure: AXI4-Full peripheral upix_ip with FIFOs. An input FIFO (iFIFO, 512x32, FWFT) receives S_AXI_WDATA; its output passes through 8-to-8 LUTs into an output FIFO (oFIFO, 512x32, FWFT) that drives S_AXI_RDATA. One FSM runs at S_AXI_ACLK and another at CLKFX. Signals shown include iempty, ifull, oempty, orden, axi_rvalid, mem_rden, mem_wren, and axi_arv_arr_flag.]
Instructor: Daniel Llamocca
ELECTRICAL AND COMPUTER ENGINEERING DEPARTMENT, OAKLAND UNIVERSITY
ECE-495/595: Special Topics – Reconfigurable Computing Fall 2015
[Figure: state diagrams of the two FIFO control FSMs. FSM at CLKFX: once iempty = 0 and ofull = 0, it asserts irden and owren. FSM at S_AXI_ACLK: it uses mem_wren and ifull to assert iwren, uses mem_rden, oempty, and axi_rvalid to assert orden, and uses a counter C (counting up to 15) together with AXI_ARESETN to generate fifo_fsm_rst.]
[Figure: AXI4-Full peripheral with a 2D DCT IP. An input FIFO (iFIFO, 512x32, FWFT) feeds an Input Interface, the 2D DCT IP (input X, output Y, enable E, valid v), an Output Buffer, and an Output Interface that fills an output FIFO (oFIFO, 512x32, FWFT) driving S_AXI_RDATA. One FSM runs at S_AXI_ACLK and another at CLKFX. Signals shown include irden, owren, iempty, ifull, oempty, orden, axi_rvalid, mem_wren, mem_rden, and axi_arv_arr_flag.]