Gem5 X TechnicalManual Wireless
* Embedded Systems Laboratory,
  Swiss Federal Institute of Technology, Lausanne (EPFL)
‡ School of Engineering and Management Vaud (HEIG-VD),
  University of Applied Sciences Western Switzerland (HES-SO)
** Department of Computer Architecture,
   Complutense University of Madrid
Based on v2.0 of the gem5-X technical manual by Yasir Qureshi*, William Simon*, Marina
Zapater*‡, Katzalin Olcoz**, and David Atienza*
December 2022
Contents
1 Executive Summary
    1.1 Abstract
    1.2 Release Information
    1.3 Collaboration and Contact Information
2 Running gem5-X Full System (FS) Mode with ARMv8 and Linux
    2.1 Necessary Files
        2.1.1 Full System Files
        2.1.2 Device Tree
    2.2 Quick-Start Guide
        2.2.1 Prerequisites
        2.2.2 Building the gem5 Binary
        2.2.3 Running Your FS Simulation
    2.3 Hot-fixes for running gem5-X on Ubuntu 20.04 using Docker
6 Core Clustering
7 Heterogeneous Cores
1 Executive Summary
1.1 Abstract
The gem5 architectural simulator is well established and widely used in both industry and
academia. Based on gem5, we present gem5-X ("a gem5-based full-system simulator
with architectural eXtensions"), a simulation framework that enables fast profiling and architectural
exploration and optimization for system-level architectural innovations. gem5-X provides
out-of-the-box simulation of ARM-based systems with a full Linux stack, along with several architectural
extensions such as ISA extensions, clustering, heterogeneous many-core simulation, and an HBM2 memory
model. Several enhanced features have also been added, such as advanced checkpointing, workload
automation (WA), and gperf profiler support.
This version of the gem5-X repository, named gem5-X-On-Chip-Wireless, further provides
support for emulating systems featuring in-package wireless communication enabled by
nano-antennae and on-chip transceivers. This gem5-X fork was developed in the context of the Wiplash
Horizon 2020 project (https://www.wiplash.eu).
In this technical manual, we first provide guidelines on how to use the various architectural features
and support enhancements of gem5-X. More information on downloading gem5-X and its source code
can be found at https://esl.epfl.ch/gem5-x.
Then, in Section 9, we describe the features specific to gem5-X-On-Chip-Wireless.
1.3 Collaboration and Contact Information
Because the scope of this project is very large, we are always interested in potential
collaboration efforts to develop new features and keep gem5-X up to date with gem5 master. For inquiries,
source code, and additional information, please contact us at one of the aforementioned email addresses.
2 Running gem5-X Full System (FS) Mode with ARMv8 and Linux
In this chapter we describe how to configure and run our ARMv8 64-bit FS simulation in gem5-X.
• A bootloader
• A disk image
Once you register for gem5-X at https://esl.epfl.ch/gem5-x, you will receive an email with a link
to all the system files, except for the device tree. The downloaded file is named full_system_images.tar.gz.
It contains the disk image, bootloader, and kernel binary. Extract it with the following
command:
    tar -zxvf full_system_images.tar.gz
The files are as follows:
• Kernels (vmlinux and vmlinux_wa) are at [path_to_full_system_images]/binaries/
• Disk image (gem5_ubuntu16.img) can be found at [path_to_full_system_images]/disks/
We now need to set up the path to full_system_images, so that the files under it can be found
and used by gem5-X during FS simulation.
    cd <path_to_gem5-X>
    ./apply-patch.sh <PATH_TO_FULL_SYSTEM_IMAGES>
Alternatively, you can set the M5_PATH environment variable:
    export M5_PATH=<PATH_TO_FULL_SYSTEM_IMAGES>
The full system files are now set up and ready to be used in FS mode.
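Before launching a simulation, it can help to confirm that the extracted directory has the layout described above. The following is a minimal sketch, assuming only the binaries/ and disks/ subdirectories listed earlier; the helper name check_m5_path is hypothetical and not part of gem5-X.

```python
import os


def check_m5_path(m5_path):
    """Return the expected subdirectories that are missing under m5_path."""
    # binaries/ holds the kernels, disks/ holds the disk image.
    expected = ["binaries", "disks"]
    return [d for d in expected if not os.path.isdir(os.path.join(m5_path, d))]


if __name__ == "__main__":
    root = os.environ.get("M5_PATH", "")
    missing = check_m5_path(root) if root else ["M5_PATH is not set"]
    print("OK" if not missing else "Missing: " + ", ".join(missing))
```

An empty list means both directories were found under the given path.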
2.2 Quick-Start Guide
2.2.1 Prerequisites
You will need to set up the gem5-X environment in order to compile and run the gem5-X binary
using the SCons (SConstruct) builder. If running on an Ubuntu-based host system, you can use the
following command to get all the required libraries. However, there are some known dependency
problems on the latest Ubuntu image, i.e., 20.04. If you are running such a host system, we
recommend you follow Section 2.3 to build a Docker image in which to run gem5-X.
    sudo apt install build-essential git m4 scons zlib1g \
        zlib1g-dev libprotobuf-dev protobuf-compiler libprotoc-dev \
        libgoogle-perftools-dev python-dev python-six python \
        libboost-all-dev swig
2.2.2 Building the gem5 Binary
Once the above is done, you will need to build an ARM gem5 binary. You can create multiple
builds, including .fast, .opt, and .debug. If you are only concerned with running experiments, we
recommend creating only gem5.fast. However, if you need to debug anything or want to
generate traces, you will need to build gem5.opt or gem5.debug. Do this with the following:
    cd <path_to_gem5-X>/
    scons build/ARM/gem5.{fast,opt,debug}
Additionally, if you would like to speed up the compilation process, you can use the option "-jN"
on the scons build line, where N is the number of threads you want to assign for compilation.
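As an illustration of the "-jN" option, the sketch below assembles the build command with N defaulting to the number of host CPUs. The helper name scons_command is hypothetical; only the scons target and the -j option come from the text above.

```python
import os


def scons_command(variant="fast", jobs=None):
    """Build the scons argument list for one gem5 ARM binary variant."""
    if variant not in ("fast", "opt", "debug"):
        raise ValueError("variant must be one of: fast, opt, debug")
    if jobs is None:
        # Assumption: use one compilation thread per host CPU.
        jobs = os.cpu_count() or 1
    return ["scons", "build/ARM/gem5." + variant, "-j" + str(jobs)]


print(" ".join(scons_command("opt", jobs=4)))
```

The resulting list can be passed to subprocess.run from the gem5-X root directory.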
2.2.3 Running Your FS Simulation
Once the build process is complete, you can launch your simulation in the following way:
    cd <path_to_gem5-X>/

    ./build/ARM/gem5.{fast,opt,debug} \
        --remote-gdb-port=0 \
        -d /path/to/your/output/directory \
        configs/example/fs.py \
        --cpu-clock=1GHz \
        --kernel=vmlinux \
        --machine-type=VExpress_GEM5_V1 \
        --dtb-file=<full_path_to_gem5-X>/system/arm/dt/armv8_gem5_v1_1cpu.dtb \
        -n 1 \
        --disk-image=gem5_ubuntu16.img \
        --caches \
        --l2cache \
        --l1i_size=32kB \
        --l1d_size=32kB \
        --l2_size=1MB \
        --l2_assoc=2 \
        --mem-type=DDR4_2400_4x16 \
        --mem-ranks=4 \
        --mem-size=4GB \
        --sys-clock=1600MHz
At this point you should be able to connect to your running gem5 instance from another terminal
with:
    telnet localhost 3456
Alternatively, you can build the terminal program provided with gem5-X and use it:
    cd <path_to_gem5-X>/util/term/
    make
    m5term 127.0.0.1 3456
Upon connecting to your gem5 instance, you should be able to see the kernel dmesg output, followed
finally by a login prompt and a terminal in gem5-X FS mode.
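When many configurations are explored, the long launch command is easier to manage if it is assembled from a table of options. The following is a minimal sketch under that assumption; fs_command is a hypothetical helper, and the flags shown are taken from the command above rather than from an exhaustive list.

```python
def fs_command(binary, outdir, options):
    """Assemble a gem5 fs.py launch command as an argument list."""
    cmd = [binary, "--remote-gdb-port=0", "-d", outdir, "configs/example/fs.py"]
    for flag, value in options.items():
        if value is None:
            cmd.append(flag)                 # boolean switch, e.g. --caches
        elif flag.startswith("--"):
            cmd.append(f"{flag}={value}")    # long option, e.g. --mem-size=4GB
        else:
            cmd.extend([flag, str(value)])   # short option, e.g. -n 1
    return cmd


options = {
    "--cpu-clock": "1GHz",
    "--kernel": "vmlinux",
    "-n": 1,
    "--caches": None,
    "--l2cache": None,
    "--mem-size": "4GB",
}
print(" ".join(fs_command("./build/ARM/gem5.fast", "/tmp/m5out", options)))
```

Keeping the options in a dictionary makes it straightforward to vary one parameter at a time in a sweep.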
2.3 Hot-fixes for running gem5-X on Ubuntu 20.04 using Docker
3 Support Enhancements of Gem5-X
In this chapter we will look into the following support enhancements we have added in gem5-X:
• Enhanced checkpointing
• gperf profiler
• File sharing between gem5-X and host system using 9P over Virtio
3.2 Gperf Profiler
It is often convenient to take a checkpoint using the SimpleAtomic CPU model just before your
region-of-interest (ROI) and then switch to an accurate in-order or OoO CPU. You can do this
from either the command prompt or a script using the following command:
    m5 checkpoint
If you are in a C/C++ program, you can use the following call within the program:
    system("m5 checkpoint");
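The same call can be issued from a script running inside the simulated system. The sketch below is a Python analogue of the C/C++ call, assuming a Python interpreter is available on the disk image and that the m5 utility is on the PATH; take_checkpoint is a hypothetical helper name.

```python
import subprocess


def take_checkpoint(run=subprocess.run):
    """Ask the simulator for a checkpoint by invoking the m5 utility.

    The runner is injectable so the function can be exercised on a host
    where the m5 binary does not exist.
    """
    return run(["m5", "checkpoint"], check=True)
```

Placing the call immediately before the ROI in the workload script keeps the checkpoint as close as possible to the code being measured.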
• After installation, check where DIOD is installed by typing "which diod". Update this path
in the file src/dev/virtio/VirtIO9P.py at line 62. Then re-compile gem5-X using the scons
command as usual.
• Use the kernel "vmlinux_wa" during the gem5 simulation. This file is provided with gem5-X under
full_system_images/binaries.
• Now any file under the "SHARED FOLDER ON HOST SYSTEM" appears in the /mnt directory
in the gem5 simulation.
3.4 Modifying disk image using QEMU
4 ARMv8 ISA Extension
This guide describes how to extend the ARMv8 ISA, using as an example the creation of a
custom "ADD1" instruction, which does exactly the same as the ARM ADD instruction, but uses
one of the unallocated opcodes.
In order to extend the ISA of any architecture in gem5, some unallocated opcodes need to
exist. There are several opportunities to extend the ARMv8 ISA in gem5-X, as there are many
unallocated opcodes. The complete ARMv8 ISA can be found at:
https://static.docs.arm.com/ddi0487/ca/DDI0487C_a_armv8_arm.pdf
3. Open the file aarch64.isa, which contains the top-level decoder functionality:
    vim aarch64.isa
4. Go to the end of the file and look for the function "def format Aarch64()". This is where the
top-level decoding is done according to Table C4-1. You will see that when bit[27]=0 and
bit[28]=0, there is a call to the "Unknown64()" function, as there is no instruction allocated for
this opcode. This is where we will add our instruction. Remove the line that
returns Unknown64() and add the following, so that our instruction is decoded by the
"decodeCusDataProcImm" function:
    // bit 28:27=00
    return decodeCusDataProcImm(machInst);
5. Now, we need to add this new function. To do so, add it at the beginning of the file under
    output header {{
    namespace Aarch64
    {
6. Now, to add the ADD functionality to this new function, we implement it in a similar way to
the "AddXImm" function in the file. We name our new instruction Add1XImm.
Please refer to the modified aarch64.isa file that can be found here under the modified files
folder.
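The top-level decode step selects on bits 28:27 of the 32-bit instruction word. The following sketch shows that bit-field test in isolation; it is only an illustration of the selection logic, not the gem5 decoder itself, and the helper names and the second example word are hypothetical.

```python
def bits(word, hi, lo):
    """Extract the inclusive bit field word[hi:lo] as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def in_custom_group(machine_word):
    """True when bits 28:27 are both zero, the previously unallocated group."""
    return bits(machine_word, 28, 27) == 0


# 0x91000400 is the standard encoding of "add x0, x0, #1" (bits 28:27 = 0b10).
print(bits(0x91000400, 28, 27), in_custom_group(0x91000400))
# The same word with bit 28 cleared falls in the 0b00 group. This value is
# only illustrative and is not claimed to be the actual ADD1 encoding.
print(bits(0x81000400, 28, 27), in_custom_group(0x81000400))
```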
4.2 Handling the new ADD1 instruction
7. The above adds the instruction to the decoding path. To implement the actual ADD
functionality, we need to modify the following:
    cd ../insts/ (full path is gem5/src/arch/arm/isa/insts)
    vim data64.isa
10. Note that we have assigned this instruction an OpClass of CusAluOp. To enable this, we had to
add the overrideOpClass parameter to different functions in the file. Please see the attached
data64.isa file here under the modified files folder to see how we did it.
2. We need to add the CusAlu unit under the following (please refer to the attached FuncUnit.py
file here under the modified files folder):
    class OpClass(Enum):
        vals = [
3. Open the file op_class.hh and add the following to the end:
    static const OpClass CusAluOp = Enums::CusAlu;
Edit the following files (see the attached files here under the modified files folder for details):
4.3 Testing that the ISA extension works
• FUPool.py
• FuncUnitConfig.py
Edit the following files (attached here under the modified files folder):
• base.cc
• exec_context.hh
7. cd ../../proto/ (full path is gem5/src/proto)
8. Edit the file inst.proto and add the following to the enum InstType:
    CusAlu = 34;
1. Run gem5 in SE (System Emulation) mode using the test program. The output should look
like this:
    C[0] = 13
    C[1] = 23
    C[2] = 33
    C[3] = 43
    C[4] = 53
    C[5] = 63
    C[6] = 73
    C[7] = 83
    C[8] = 93
    C[9] = 103
The program should also work in full-system mode, as we are using inline
assembly to call the newly added instruction.
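To check a run automatically, the printed values can be compared against the expected output listed above. This is a minimal sketch; the function names are hypothetical, and it assumes only the "C[i] = value" line format shown in this section.

```python
import re

# Expected values C[0]..C[9], copied from the output listed above.
EXPECTED = [13, 23, 33, 43, 53, 63, 73, 83, 93, 103]


def parse_results(text):
    """Map each index i to the value printed on its 'C[i] = value' line."""
    pattern = re.compile(r"C\[\s*(\d+)\s*\]\s*=\s*(-?\d+)")
    return {int(i): int(v) for i, v in pattern.findall(text)}


def output_matches(text):
    """True when every expected index is present with the expected value."""
    values = parse_results(text)
    return [values.get(i) for i in range(len(EXPECTED))] == EXPECTED
```

The captured simulator output can be passed to output_matches as a single string.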
5 High Bandwidth Memory v2 (HBM2)
High Bandwidth Memory (HBM) is based on 3D-stacked DRAM banks, made possible by
Through-Silicon Vias (TSVs), and achieves a bandwidth of up to 307.2 GB/s. To implement the
functional behavior of the HBM2 memory model in gem5-X, we extend the DRAM controller model
of gem5 according to the architectural details of HBM2. To obtain 8 channels with memory
interleaving, we instantiate 8 DRAM controllers, each 128 bits wide. We connect all 8 DRAM controllers
to a 1024-bit-wide system bus that connects to the cache hierarchy.
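The peak bandwidth figure follows from the channel count, the channel width, and the per-pin transfer rate. The sketch below works through that arithmetic; the 2.4 GT/s rate is an assumption chosen because it reproduces the 307.2 GB/s figure quoted above, and the function name is hypothetical.

```python
def peak_bandwidth_gb_s(channels, width_bits, transfers_per_s):
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes) of a multi-channel memory."""
    bytes_per_transfer = channels * width_bits / 8
    return bytes_per_transfer * transfers_per_s / 1e9


# 8 channels x 128 bits = 1024 bits, the width of the system bus.
print(8 * 128)
# Assumed rate of 2.4 GT/s per pin.
print(peak_bandwidth_gb_s(8, 128, 2.4e9))
# For comparison, a 2.0 GT/s rate gives a lower peak.
print(peak_bandwidth_gb_s(8, 128, 2.0e9))
```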
To use 8-channel HBM2 in gem5-X full system simulation, with appropriate bus widths throughout
the system all the way to the caches, use the following command:
    cd <path_to_gem5-X>/

    ./build/ARM/gem5.{fast,opt,debug} \
        --remote-gdb-port=0 \
        -d /path/to/your/output/directory \
        configs/example/fs.py \
        --cpu-clock=1GHz \
        --kernel=vmlinux \
        --machine-type=VExpress_GEM5_V1 \
        --dtb-file=<full_path_to_gem5-X>/system/arm/dt/armv8_gem5_v1_1cpu.dtb \
        -n 1 \
        --disk-image=gem5_ubuntu16.img \
        --caches \
        --l2cache \
        --l1i_size=32kB \
        --l1d_size=32kB \
        --l2_size=1MB \
        --l2_assoc=2 \
        --l2bus-width=128 \
        --membus-width=128 \
        --mem-type=HBM2_2000_4H_1x128 \
        --mem-ranks=1 \
        --mem-channels=8 \
        --mem-size=4GB \
        --sys-clock=1600MHz
No separate software support is required to use HBM2 in FS mode, and hence we are able to
boot the Ubuntu Linux distribution using HBM2.
6 Core Clustering
Core clustering enables a group of compute cores to have their own shared cache, which can be
the last-level cache (LLC), separate from other cores in the system. This reduces the resources shared
between different compute clusters in the system to just the crossbar interconnect and memory. In
addition, clustering is also used when different types of cores are used in the system. Cores of the same
type are clustered together with their own LLCs, which makes a heterogeneous system possible.
Clustering is now supported in gem5-X. To have different core clusters in gem5-X, use the following
command:
    ./build/ARM/gem5.{fast,opt,debug} \
        --remote-gdb-port=0 \
        -d /path/to/your/output/directory \
        configs/example/fs.py \
        --cpu-clock=1GHz \
        --kernel=vmlinux \
        --machine-type=VExpress_GEM5_V1 \
        --dtb-file=<path_to_gem5-X>/system/arm/dt/armv8_gem5_v1_<NUM_CORES>cpu.dtb \
        -n <NUM_OF_CORES> \
        --disk-image=gem5_ubuntu16.img \
        --caches \
        --l2cache \
        --l1i_size=32kB \
        --l1d_size=32kB \
        --l2_size=1MB \
        --l2_assoc=2 \
        --l2_cluster_size=<NUM_OF_CORE_PER_CLUSTER> \
        --mem-type=DDR4_2400_4x16 \
        --mem-channels=4 \
        --mem-ranks=4 \
        --mem-size=4GB \
        --sys-clock=1600MHz
This command will simulate a system with core clusters. Each cluster will have the number of
cores defined by the --l2_cluster_size parameter. The number of cores defined by the -n parameter should
be divisible by --l2_cluster_size. Dividing n by l2_cluster_size gives the number of clusters in
the system. Each cluster will have its own L2 (LLC) cache.
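The divisibility rule and the resulting number of clusters can be sketched as follows. The function name is hypothetical, and the assignment of consecutive core IDs to each cluster is an assumption made for illustration, not something stated by the manual.

```python
def cluster_layout(num_cores, l2_cluster_size):
    """Group core IDs into clusters of l2_cluster_size cores each."""
    if l2_cluster_size <= 0:
        raise ValueError("l2_cluster_size must be positive")
    if num_cores % l2_cluster_size != 0:
        raise ValueError("number of cores must be divisible by l2_cluster_size")
    # Assumption: cores are assigned to clusters in consecutive ID order.
    return [list(range(start, start + l2_cluster_size))
            for start in range(0, num_cores, l2_cluster_size)]


layout = cluster_layout(8, 4)
print(len(layout), layout)
```

With 8 cores and a cluster size of 4, the sketch yields 2 clusters, each of which would have its own L2 cache.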
7 Heterogeneous Cores
Heterogeneity enables different workloads with varying performance and energy constraints to
be allocated to different core types in the system. gem5-X supports both in-order and OoO cores
in the same system. Different core types are distributed into different clusters.
To use heterogeneity in gem5-X, the system is first launched to boot Linux and reach the
region-of-interest (ROI), with the following command:
    ./build/ARM/gem5.{fast,opt,debug} \
        --remote-gdb-port=0 \
        -d /path/to/your/output/directory \
        configs/example/fs.py \
        --cpu-clock=1GHz \
        --kernel=vmlinux \
        --machine-type=VExpress_GEM5_V1 \
        --dtb-file=<path_to_gem5-X>/system/arm/dt/armv8_gem5_v1_<NUM_CORES>cpu.dtb \
        -n <NUM_OF_CORES> \
        --disk-image=gem5_ubuntu16.img \
        --caches \
        --l2cache \
        --l1i_size=32kB \
        --l1d_size=32kB \
        --l2_size=1MB \
        --l2_assoc=2 \
        --l2_cluster_size=<NUM_OF_CORE_PER_CLUSTER> \
        --cluster_size_1=4 \
        --mem-type=DDR4_2400_4x16 \
        --mem-channels=4 \
        --mem-ranks=4 \
        --mem-size=4GB \
        --sys-clock=1600MHz
This command will simulate a system with core clusters. The parameter --cluster_size_1 defines
the size of the first cluster, whose cores are of type 1. This should be the same as --l2_cluster_size. All the cores in
the remaining clusters will be of type 2. For instance, if the number of cores is set to 16, and
both --l2_cluster_size and --cluster_size_1 are set to 4, the system will have 4 clusters,
each with 4 cores. The first cluster will have cores of type 1 and the remaining three clusters will
have cores of type 2.
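The rule for assigning core types to clusters can be written out as a short sketch. The function name is hypothetical; the constraints it checks are the ones stated in the paragraph above.

```python
def cluster_core_types(num_cores, l2_cluster_size, cluster_size_1):
    """Return the core type (1 or 2) of each cluster, in cluster order."""
    if num_cores % l2_cluster_size != 0:
        raise ValueError("number of cores must be divisible by l2_cluster_size")
    if cluster_size_1 != l2_cluster_size:
        raise ValueError("cluster_size_1 should equal l2_cluster_size")
    num_clusters = num_cores // l2_cluster_size
    # The first cluster holds type 1 cores; all remaining clusters hold type 2.
    return [1] + [2] * (num_clusters - 1)


print(cluster_core_types(16, 4, 4))
```

For the 16-core example, the sketch gives one type 1 cluster followed by three type 2 clusters.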
Once the ROI is reached, take a checkpoint using the "m5 checkpoint" command. One can then
resume from the checkpoint with the desired core types for each cluster. In the command below, type
1 cores are set to be in-order and type 2 cores to be OoO:
    ./build/ARM/gem5.{fast,opt,debug} \
        --remote-gdb-port=0 \
        -d /path/to/your/output/directory \
        configs/example/fs.py \
        --cpu-clock=1GHz \
        --kernel=vmlinux \
        --machine-type=VExpress_GEM5_V1 \
        --dtb-file=<path_to_gem5-X>/system/arm/dt/armv8_gem5_v1_<NUM_CORES>cpu.dtb \
        -n <NUM_OF_CORES> \
        --disk-image=gem5_ubuntu16.img \
        --caches \
        --l2cache \
        --l1i_size=32kB \
        --l1d_size=32kB \
        --l2_size=1MB \
        --l2_assoc=2 \
        --l2_cluster_size=<NUM_OF_CORE_PER_CLUSTER> \
        --cluster_size_1=4 \
        --mem-type=DDR4_2400_4x16 \
        --mem-channels=4 \
        --mem-ranks=4 \
        --mem-size=4GB \
        --sys-clock=1600MHz \
        -r 1 \
        --cpu-type=MinorCPU \
        --cpu-type2=DerivO3CPU
8 Scratchpad Memory (SPM)
Scratchpad Memories (SPMs) are software-programmable memories at the same level as the L1
cache, but controlled by the user. gem5-X supports SPMs, which are both local and shared
between two consecutive cores.
To use SPMs in gem5-X, the following command can be used:
    ./build/ARM/gem5.{fast,opt,debug} \
        --remote-gdb-port=0 \
        -d /path/to/your/output/directory \
        configs/example/fs.py \
        --cpu-clock=1GHz \
        --kernel=vmlinux \
        --machine-type=VExpress_GEM5_V1 \
        --dtb-file=<path_to_gem5-X>/system/arm/dt/armv8_gem5_v1_<NUM_CORES>cpu.dtb \
        -n <NUM_OF_CORES> \
        --disk-image=gem5_ubuntu16.img \
        --caches \
        --l2cache \
        --l1i_size=32kB \
        --l1d_size=32kB \
        --l2_size=1MB \
        --l2_assoc=2 \
        --mem-type=DDR4_2400_4x16 \
        --mem-ranks=4 \
        --mem-size=4GB \
        --sys-clock=1600MHz \
        --spm \
        --d_spm_size=128kB
The --spm option enables SPMs in gem5-X, and --d_spm_size defines the SPM size, which is
set to 128kB in the above example. Each SPM can be accessed by two consecutive cores. For
instance, SPM0 is accessible by core0 and core1, SPM1 by core1 and core2, SPM2 by core2 and
core3, and so on.
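The sharing pattern can be expressed in both directions: which cores reach a given SPM, and which SPMs a given core reaches. This is a minimal sketch with hypothetical function names; the total number of SPMs is passed in as a parameter because the manual does not state it.

```python
def cores_for_spm(spm_index):
    """Cores that can access SPM number spm_index: cores i and i + 1."""
    return (spm_index, spm_index + 1)


def spms_for_core(core, num_spms):
    """SPMs reachable from a core, given num_spms SPMs in the system."""
    # Core c shares SPM c with core c + 1 and SPM c - 1 with core c - 1.
    return [s for s in (core - 1, core) if 0 <= s < num_spms]


print(cores_for_spm(1))
print(spms_for_core(1, 3))
```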
Since this is the FS mode of gem5-X, SPMs need to be mapped using mmap before they can be used, as in
the following code:
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map mem_size bytes of SPM located at physical address mem_address
   through /dev/mem. Returns a virtual pointer, or NULL on failure. */
void *spm_mem_alloc(uint64_t mem_size, uint64_t mem_address)
{
    uint64_t alloc_mem_size, page_mask, page_size;
    void *mem_pointer;
    void *virt_addr;

    page_size = sysconf(_SC_PAGESIZE);
    /* Round the mapping size up to a whole number of pages. */
    alloc_mem_size = (((mem_size / page_size) + 1) * page_size);
    page_mask = (page_size - 1);

    int mem_dev = open("/dev/mem", O_RDWR | O_SYNC);
    if (mem_dev == -1)
    {
        perror("Cannot open /dev/mem");
        return NULL;
    }

    /* mmap needs a page-aligned offset, so map from the start of the page. */
    mem_pointer = mmap(NULL,
                       alloc_mem_size,
                       PROT_READ | PROT_WRITE,
                       MAP_SHARED,
                       mem_dev,
                       (off_t)(mem_address & ~page_mask));

    if (mem_pointer == MAP_FAILED)
    {
        perror("Cannot MAP");
        close(mem_dev);
        return NULL;
    }

    printf("Memory Mapped\n");
    /* Add back the offset of mem_address within its page. */
    virt_addr = (char *)mem_pointer + (mem_address & page_mask);

    return virt_addr;
}
The above code snippet returns a virtual pointer to the SPM in FS mode. The parameter uint64_t
mem_size defines the size of the memory allocated within the SPM. The parameter uint64_t
mem_address defines the address of the SPM in the physical memory space. For SPM0,
this should be at an offset after the main memory and I/O devices in gem5-X. For instance, if the
main memory size is 4GB, the offset for SPM0 should be 4GB + 2GB (I/O devices memory space),
i.e., 6GB = 6442450944. SPM1 should be at an offset defined by main-memory size + I/O devices +
SPM0 size.
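The address arithmetic can be checked with a short sketch. The function name is hypothetical, and extending the rule beyond SPM1 assumes that all SPMs have the same size and are laid out back to back, which the manual states only for SPM0 and SPM1.

```python
GIB = 1024 ** 3
IO_SPACE_BYTES = 2 * GIB  # I/O devices memory space, as stated above


def spm_base_address(index, main_mem_bytes, spm_bytes, io_bytes=IO_SPACE_BYTES):
    """Physical base address of SPM number `index`.

    Assumption: equally sized SPMs placed contiguously after main memory
    and the I/O space.
    """
    return main_mem_bytes + io_bytes + index * spm_bytes


print(spm_base_address(0, 4 * GIB, 128 * 1024))
print(spm_base_address(1, 4 * GIB, 128 * 1024))
```

The first value is the 6GB offset quoted above; the second adds one 128kB SPM to it.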
9 On-Chip Wireless Networking Integration and Modeling
On-chip wireless communication is enabled by interfacing a transceiver and a nano-antenna
to the system components. gem5-X-On-Chip-Wireless supports the modelling of this interconnect
strategy. The behaviour of a wireless link with different latencies, bandwidths, and Medium Access
Control (MAC) protocols can be explored. This extension was the basis for the conference paper
"System-Level Exploration of In-Package Wireless Communication for Multi-Chiplet Platforms".
9.1 Running gem5-X Full System Mode with ARMv8, Linux and wireless extensions
Table 1 reports the compatibility of gem5-X-ALPINE with respect to other gem5 extensions
in gem5-X. No guarantees of compatibility with any present or future gem5-X version should be
assumed beyond the ones provided in this table.
gem5-X-On-Chip-Wireless can be cloned from the associated repository via the following
command:
    git clone https://github.com/gem5-X/On-Chip-Wireless.git
After the environment is set up, wireless-capable systems can be specified either from a
terminal command line or from a configuration file.
        --caches \
        --l2cache \
        --l1i_size=32kB \
        --l1d_size=32kB \
        --l2_size=1MB \
        --l2_assoc=2 \
        --mem-type=DDR4_2400_4x16 \
        --mem-ranks=4 \
        --mem-size=4GB \
        --sys-clock=1600MHz \
        --membus-wireless \
        --wireless-bandwidth=12.5GB/s \
        --mac-protocol=exp_backoff
The command above generates a system with <NUM_CORES> cores, L1 and L2
caches of the defined sizes, and a CPU clock of 1GHz. The system will mount a disk containing
Ubuntu Linux and boot from it. Options specifically related to gem5-X-On-Chip-Wireless are in the
last three lines of the command. Using them, a wireless memory bus is instantiated, connecting
main memory with the L2 cache. The wireless link has a bandwidth of 12.5GB per second and
employs an exponential backoff protocol to arbitrate bus collisions.
Command line options related to in-package wireless links are:
• --mac-protocol=<exp_backoff / token_pass>: selects the MAC protocol, either
exponential backoff or token passing, as described in D5.3.
• --retry-slot-size=<SIZE>: sets the size of the retry slot when using the exponential backoff
protocol, specified as a multiple of the time required to transmit a byte according to the
available bandwidth.
• --backoff-ceil=<MAX_EXPONENT>: sets the upper limit on the size of the retransmission
window in the exponential backoff protocol.
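To make the roles of the retry slot size and the backoff ceiling concrete, the sketch below models a classic binary exponential backoff. This is an assumption made for illustration: the actual protocol is the one described in D5.3 and may differ in detail. All function names are hypothetical, and 1 GB is taken as 1e9 bytes.

```python
import random


def byte_time_ns(bandwidth_gb_s):
    """Time in nanoseconds to transmit one byte at the given bandwidth."""
    return 1.0 / bandwidth_gb_s


def backoff_delay_ns(attempt, retry_slot_size, backoff_ceil,
                     bandwidth_gb_s, rng=random):
    """Random retransmission delay after `attempt` collisions.

    Assumed model: the window doubles with each collision until the
    exponent reaches backoff_ceil, and the sender waits a uniformly
    chosen number of slots within that window.
    """
    exponent = min(attempt, backoff_ceil)
    slots = rng.randrange(2 ** exponent)  # 0 .. 2**exponent - 1
    slot_ns = retry_slot_size * byte_time_ns(bandwidth_gb_s)
    return slots * slot_ns


# At 12.5 GB/s one byte takes 0.08 ns, so a 64-byte slot lasts 5.12 ns.
print(byte_time_ns(12.5))
print(backoff_delay_ns(3, 64, 8, 12.5, random.Random(0)))
```

A larger ceiling spreads retransmissions over a wider window, which lowers the chance of repeated collisions but raises the worst-case delay.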
Defining systems via the command line offers a fast avenue towards exploring the performance
of in-package wireless. Nonetheless, it also limits flexibility in the system generation. Indeed,
only systems with a two-level cache hierarchy are supported, and only L1/L2 or L2/main-memory
wireless links can be instantiated.
In a system configuration file, a wireless link can be instantiated similarly to a standard gem5
crossbar, adding the parameters specific to wireless transmission (bandwidth, employed MAC
protocol, etc.). An example of a link using an exponential backoff protocol is reported below.
    system.wireless_link = WirelessXBar(
        clk_domain=system.clk_domain,
        bandwidth=options.wireless_bandwidth,
        mac_protocol=options.mac_protocol,
        retry_slot_size=options.retry_slot_size,
        backoff_ceil=options.backoff_ceil)
Alternatively, we provide components that define a wireless bus adapted to work as an
interconnect between caches and memories: WirelessL2XBar and WirelessSystemBar.
To add an element to the wireless interconnect, its port can be attached to the wireless
crossbar in the following way:
    system.l1cache.master = system.wireless_link.slave
    system.l2cache.slave = system.wireless_link.master
An example definition of a wireless bus between the L1 caches and the L2 cache can be
found in configs/common/CacheConfig_wirelessExample.py. It is run by executing
configs/example/fs_wirelessExample.py.