Thesis Final 1
May 2022
APPROVED:
Effective tooling can help researchers understand the origins of side-channel
leakage. This thesis presents two tools, Side-channel Observer Verification
Intellectual Property (SCO VIP) and Saidoyoki, that support side-channel
leakage assessment in pre- and post-silicon settings. SCO VIP is a functional
verification IP written for the industry-standard Universal Verification
Methodology (UVM). It can perform side-channel leakage assessments at the
register-transfer level (RTL). The second tool, Saidoyoki, is a highly
configurable printed circuit board that houses two in-house designed
cryptographic chips and all the infrastructure needed to perform side-channel
analysis or post-silicon leakage assessment. This thesis also presents two
case studies that demonstrate the capabilities of Saidoyoki and SCO VIP.
Acknowledgements
This thesis was made possible by the collaboration and help of my friends
in Vernam Lab. I would like to thank them and wish them the best of luck
in their research.
I would also like to thank Bilal Gece, who is formally my former manager
but at heart my elder brother, for his continuous belief in and support of
me and my education.
List of Figures
1 Introduction
2 Background
2.1 Hardware Security and Threats
2.1.1 Methods for Securing a Hardware
2.1.2 Side-channel Leakage in a Secure Hardware
2.1.3 Revealing the Secret Key
2.2 Pre-Silicon Leakage Assessment
2.2.1 Leakage Assessment in RTL
2.2.2 Leakage Assessment in GTL
2.3 Post-Silicon Leakage Assessment
2.4 Universal Verification Methodology
4.2.4 Measurements
4.3 Interfacing
4.3.1 Communication Interface
4.3.2 Physical Interface
5 Implementations
5.1 Pre-Silicon Leakage Assessment on AES-128
5.1.1 Method
5.1.2 Testbench
5.1.3 Statistical Analysis
5.2 Attacking PICOCHIP’s AES Coprocessor
5.2.1 Method
5.2.2 Hardware Setup
5.2.3 Results
6 Conclusion
List of Figures
5.4 KL divergence at clock cycles.
5.5 Runtime of SCO VIP in UVM with PICO’s AES coprocessor as the target device.
5.6 T values calculated from the data sets of all-zeros and all-ones.
5.7 Wiring between ChipWhisperer and Saidoyoki.
5.8 A randomly chosen trace and the attack region.
5.9 Correlation peaks for key bytes from zero to seven.
5.10 Correlation peaks for key bytes from eight to fifteen.
5.11 Detailed correlation plot of byte 1.
List of Code Listings
5.1 A sample testbench top module that imports the UVM base class libraries.
5.2 A sample test class that imports the agents, configures and drives the test sequence.
5.3 A sample environment class that imports the agents (and other components) and does the internal connections.
Chapter 1
Introduction
For more than a century, sensitive information has been transferred using
several different methods or combinations of them. These methods usually
employ electric, electromagnetic (EM), or optical signals. A piece of
information is encrypted at one end, sent over the communication channel, and
decrypted at the other end. The aim of cryptography is to protect this data
in transit from anyone who is not the intended recipient. Naturally,
encrypted data has been the subject of countless attempts to extract
sensitive information. A renowned instance of an attack on encrypted
information was carried out by Rejewski on the Enigma cipher in 1938 [2].
Although this development proved to be a remarkable advancement for
humanity, it also exposed a flaw in the protection of sensitive data.
Closer to the present, the Data Encryption Standard (DES) became the first
cipher to be standardized by the US National Bureau of Standards (NBS),
today known as the National Institute of Standards and Technology (NIST),
in 1977 [3]. DES is a symmetric-key block cipher [4] that takes a 64-bit
plaintext and a 56-bit key as inputs and produces a 64-bit ciphertext. DES
was later shown to be vulnerable to brute-force attacks because of its short
key length [5]. Another well-known cipher is the Advanced Encryption
Standard (AES), proposed in 1999 and adopted by NIST in 2001 [6]. AES is
also a symmetric-key block cipher; it operates on 128-bit plaintext blocks
and produces 128-bit ciphertexts. The AES key size can be 128, 192, or 256
bits, which is long enough to make brute-force attacks impractical. However,
not long after its publication, many methods were proposed to extract the
secret key from a device while it performs AES encryption or decryption.
As attack methods grew more effective and more time-efficient, the need for
testing against information leakage increased as well. As a result,
side-channel leakage assessment emerged as an essential part of hardware
security research. Test Vector Leakage Assessment (TVLA) is one of the most
frequently used methods for testing a cipher implementation in the
post-silicon phase [8]. However, finding a flaw in the later phases of the
design flow is costly in terms of time and money. To address this concern,
researchers proposed new methods for testing at the gate level (GTL) and the
register-transfer level (RTL). Recently, Yao et al. proposed Architecture
Correlation Analysis (ACA) to identify sources of side-channel leakage in
GTL [9]. Another work is RTL-PSC from He et al., which focuses on finding
vulnerable areas in an AES cipher at the earliest design phase, RTL [10].
Many other methods have also been proposed, focusing on different aspects of
side-channel leakage assessment [11, 12, 13].
Several hardware platforms exist for electrical measurement. Usually, to
observe the power side channel, an integrated circuit (IC) containing a
cipher, or a Central Processing Unit (CPU) executing a cipher algorithm, is
monitored for its power consumption during encryption or decryption. After
enough traces are collected, an attack method can be employed. Side-channel
leakage testing boards usually employ a Field Programmable Gate Array (FPGA)
as their target unit. SAKURA-G is one of the well-known hardware platforms
for hardware security research [14]. Although FPGAs offer flexibility, they
have the drawbacks of not using the original logic fabric of the design and,
consequently, not giving control over low-level circuitry.
To address the points mentioned above, this thesis presents two
infrastructures for side-channel leakage testing, one for the pre-silicon
and one for the post-silicon setting, to improve the quality of testing in
simulation and in measurement. It also presents implementation examples of
the proposed tools. The first, Side-channel Observer (SCO) Verification
Intellectual Property (VIP), is a Universal Verification Methodology (UVM)
VIP that closely monitors the toggle counts of the design under test and
provides data that can be used in side-channel assessment at RTL. The second
infrastructure is Saidoyoki, a printed circuit board (PCB) that houses two
in-house designed cryptographic Application-Specific Integrated Circuits
(ASICs) and provides power measurement and configuration flexibility. By
developing SCO VIP and Saidoyoki, this thesis makes the following
contributions:
• The Saidoyoki board houses ASICs as its test chips. Unlike designs
implemented on FPGA fabric, ASICs make use of the original logic
design, which is more accurate and free of additional unwanted behavior
caused by FPGA fabric.
• Saidoyoki is easy for researchers to use. Using only one cable, it
is possible to configure, program, and control the board.
• Saidoyoki has a variety of clock sources for the test chips. For example,
a test chip can receive its clock signal from an on-board clock generator,
an SMA connector, or a header.
Chapter 2
Background
This chapter provides the background that the core work of this thesis
relies on. The first section presents a general review of the commonly used
ciphers, types of side-channel leakage, and the methods used to reveal the
secured information. The next part discusses current side-channel assessment
techniques in the pre-silicon design phases, GTL and RTL. After this comes
the post-silicon phase, when a physical device is present, and the ways of
testing its side-channel vulnerabilities. Lastly, an industry-standard
design verification technique, UVM, is reviewed.
[Figure: classification of security methods and ciphers, with labels for
secret key, symmetric-key algorithm, and public-key algorithm.]
b0 b1 b2 b3
b4 b5 b6 b7
b8 b9 b10 b11
• ShiftRows: In this step, each row of the state array is cyclically shifted
to the left by the following number of bytes:
– Row 1: 0 (no shift)
– Row 2: 1
– Row 3: 2
– Row 4: 3
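The row shifts listed above can be illustrated with a minimal Python sketch.
The function name and the example state values are hypothetical and chosen
only to make the byte movement visible; the state is assumed to be stored as
a list of four rows.

```python
def shift_rows(state):
    """Cyclically shift row r of a 4x4 state to the left by r bytes."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


# Hypothetical state: the high nibble is the row, the low nibble the column.
state = [
    [0x00, 0x01, 0x02, 0x03],
    [0x10, 0x11, 0x12, 0x13],
    [0x20, 0x21, 0x22, 0x23],
    [0x30, 0x31, 0x32, 0x33],
]

shifted = shift_rows(state)
for row in shifted:
    print(" ".join(f"{byte:02x}" for byte in row))
```

The first row is left unchanged, while the last row moves by three
positions, so every column of the result holds bytes from four different
input columns.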
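[Figure: AES encryption flow. The key is expanded into round keys. The
plaintext passes through N-1 repetitions of one round (SubBytes, ShiftRows,
MixColumns, AddRoundKey), followed by the last round (SubBytes, ShiftRows,
AddRoundKey), which produces the ciphertext.]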
2.1.2 Side-channel Leakage in a Secure Hardware
Secure hardware designs aim to keep a sensitive piece of information out of
the reach of adversaries. Extracting this sensitive information from
encrypted data is virtually impossible. However, secure devices tend to give
hints about the data while the cryptographic algorithm is running. When a
cryptographic operation is underway, several other channels can be observed
that may reveal a piece of critical information. These channels are called
side channels. Some of the well-studied side channels discussed here are
power consumption, EM radiation, and operation time.
Power side-channel leakage is the one that has been most extensively used in
cryptanalysis. Its roots can be explained by analyzing the factors that
contribute to a device’s power consumption. The power consumption of a
digital circuit originates from the power consumption of each transistor in
the device. A transistor’s power consumption, as seen in Equation 2.1,
consists of the power consumed due to short-circuit current, leakage
current, and switching activity. Among these, the leakage-current and
short-circuit components are treated as static. The power consumption caused
by switching activity, in contrast, changes over time, based on the number
of transistors that switch from 0 to 1 or vice versa. When the power
consumption is measured, a noise component from the measurement equipment or
from the circuitry itself also adds to the total. Equation 2.2 shows the
measured consumption.
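The two relations described in this paragraph can be sketched as follows.
The symbol names are chosen here for illustration and are an assumption;
they may differ from the notation of Equations 2.1 and 2.2.

```latex
\begin{align*}
P_{\text{transistor}} &= P_{\text{short-circuit}} + P_{\text{leakage}} + P_{\text{switching}} \\
P_{\text{measured}}   &= \sum_{\text{transistors}} P_{\text{transistor}} + P_{\text{noise}}
\end{align*}
```

Only the switching term depends on the processed data, which is why toggle
activity is the quantity that side-channel analysis tries to isolate from
the other terms.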
Timing is another quantity that can be observed to read out the secret
data. In digital circuits, every logical operation is carried out with different
circuit structures, called gates. For example, the number of transistors used
in an AND gate is different from the XOR gate. Also, the composition of
these transistor networks is different from one to another. This difference
brings a distinction in the signal delay times of different operations. This
distinction is often exploited in cryptanalysis [18].
[Figure: diagram with labels plaintext byte [n], secret key byte [n], S-box,
and attack point.]
Figure 2.5: Hamming Weight and Hamming Distance calculation for AES.
method to predict the secret key. More advanced methods include Differential
Power Analysis (DPA) and Correlation Power Analysis (CPA) [20, 21]. Both
methods make use of statistical analysis of the collected data. This thesis
focuses on the CPA method.
The CPA method relies on statistical correlation analysis between the
hypothetical power consumption, HW or HD, and the measured power
consumption. For AES, the hypothetical power consumption is the HW or HD
value at the output of the first-round S-box, which depends on both the
plaintext and the secret key. The hypothetical power consumption is
calculated for every possible key value (key guess). This set of values,
along with the measurement results, is used to calculate the Pearson
Correlation Coefficient using Equation 2.3, where H and T are the
hypothetical power consumption and the measurement values, respectively.
Key Guess No.   Hypothetical Power   Symbolic Measurement   Correlation
1               4                    (graphic)              Low
2               2                    (graphic)              Low
3               3                    (graphic)              High
4               4                    (graphic)              Highest
\[
\rho(H, T) = \frac{\operatorname{cov}(H, T)}{\sqrt{V(H) \cdot V(T)}} \tag{2.3}
\]
After this, the results are expected to show a distinctive difference for
one key guess among all the other possible ones. This is because the
correlation coefficient is high when a key guess predicts high hypothetical
power consumption for the same encryptions in which the actual measurement
shows high power. For the correct key, the coefficient will be the highest.
Figure 2.6 is an example of this comparison, in which the key guess with the
highest correlation is the most likely prediction.
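The procedure above can be sketched for a single key byte in plain Python.
This is a minimal illustration under stated assumptions, not the setup used
in this thesis: the traces are simulated as the Hamming Weight of the
first-round S-box output plus Gaussian noise, and all names (`cpa_byte`,
`TRUE_KEY`, the noise level, the number of traces) are hypothetical.

```python
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using the AES reduction polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def sbox_entry(x):
    """AES S-box entry: multiplicative inverse followed by the affine transform."""
    inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
    out = inv
    for shift in range(1, 5):
        out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
    return out ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]


def hamming_weight(value):
    return bin(value).count("1")


def pearson(h, t):
    """Pearson correlation coefficient: cov(H, T) / sqrt(V(H) * V(T))."""
    n = len(h)
    mean_h, mean_t = sum(h) / n, sum(t) / n
    cov = sum((a - mean_h) * (b - mean_t) for a, b in zip(h, t))
    var_h = sum((a - mean_h) ** 2 for a in h)
    var_t = sum((b - mean_t) ** 2 for b in t)
    return cov / (var_h * var_t) ** 0.5


def cpa_byte(plaintexts, measurements):
    """Correlate the HW hypothesis of every key guess with the measurements."""
    correlations = []
    for guess in range(256):
        hypothesis = [hamming_weight(SBOX[p ^ guess]) for p in plaintexts]
        correlations.append(pearson(hypothesis, measurements))
    best = max(range(256), key=lambda g: correlations[g])
    return best, correlations


# Simulated measurements (assumption): HW leakage plus Gaussian noise.
rng = random.Random(0)
TRUE_KEY = 0x2B
plaintexts = [rng.randrange(256) for _ in range(500)]
measurements = [
    hamming_weight(SBOX[p ^ TRUE_KEY]) + rng.gauss(0, 0.5) for p in plaintexts
]

best_guess, correlations = cpa_byte(plaintexts, measurements)
print(f"recovered key byte: {best_guess:#04x}")
```

With real traces, `measurements` would be replaced by the sampled values at
one point in time, and the same correlation would be repeated for every
sample point and every key byte.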
introducing additional noise into the system. Randomization techniques are
often used to create artificial noise that reduces the Signal-to-Noise Ratio
(SNR) [22]. A low SNR reduces the correlation between the secret key and
the measurement. Masking countermeasures are based on the idea of splitting
the sensitive information into a number of shares so that the attacker
needs to find all of them to run a successful attack [23]. Although both
countermeasure methods are effective, they introduce significant overheads
in design resources. For example, a countermeasure by Das et al. introduces
an overhead of 1.63 times in power and 1.25 times in area [24]. Yet, none of
these countermeasures has been proven to be unbreakable.
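The share-splitting idea behind masking can be sketched in a few lines of
Python. Boolean (XOR) masking is assumed here as one common scheme; the
cited work may use a different one, and the function names are hypothetical.

```python
import secrets
from functools import reduce


def split_into_shares(secret_byte, num_shares):
    """Split a byte into Boolean shares whose XOR equals the original byte."""
    shares = [secrets.randbelow(256) for _ in range(num_shares - 1)]
    # The last share is chosen so that all shares XOR back to the secret.
    last = reduce(lambda a, b: a ^ b, shares, secret_byte)
    return shares + [last]


def recombine(shares):
    """XOR all shares together to recover the secret byte."""
    return reduce(lambda a, b: a ^ b, shares, 0)


shares = split_into_shares(0xA7, 3)
print(f"recombined: {recombine(shares):#04x}")
```

Because every proper subset of the shares is uniformly random, an attacker
has to combine leakage from all shares, which is what raises the cost of the
attack and also the design overhead.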
In [10], He et al. propose a framework named RTL-PSC that counts the
switching activity in a simulation of AES and then applies two metrics to
power side-channel analysis: Kullback-Leibler (KL) divergence and success
rate (SR) based on maximum likelihood estimation. KL divergence is a measure
of the statistical distance between two probability distributions. After
collecting the switch counts for 1000 encryption cycles of AES, RTL-PSC
derives two probability distribution functions. One function is derived from
the 1000 cycles of encryption when the key is all zeroes. The other is
derived when the key is all ones. This practice is followed to obtain the
largest Hamming Distance between the simulations. RTL-PSC assumes that these
functions follow a Gaussian distribution. KL divergence is then calculated
between these two functions for each clock cycle and each design unit.
Locations with high values are considered to leak more information than the
others. The other metric, SR, assumes that the adversary selects the key
guess that gives the highest likelihood value. Based on this, SR calculates
the probability that the chosen key is the correct key. Finally, RTL-PSC
sets threshold values and marks a design unit as vulnerable if one or both
of these metrics cross them. Inspired by RTL-PSC, the Side-Channel Observer
(SCO) VIP, which is introduced in Chapter 3, implements the KL divergence
metric in the assessment presented in Chapter 5.
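Under the Gaussian assumption described above, the KL divergence between the
two toggle-count distributions has a closed form. The following Python
sketch shows that calculation for one design unit at one clock cycle; the
function name and the sample counts are hypothetical, and this is not the
code of RTL-PSC or of SCO VIP.

```python
import math
from statistics import mean, pvariance


def gaussian_kl(counts_q, counts_p):
    """KL(q || p) between Gaussians fitted to two samples of toggle counts."""
    mu_q, mu_p = mean(counts_q), mean(counts_p)
    var_q, var_p = pvariance(counts_q), pvariance(counts_p)
    return 0.5 * (
        math.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )


# Hypothetical toggle counts of one block at one clock cycle.
counts_key_zeros = [10, 12, 11, 13, 12, 10, 11, 13]
counts_key_ones = [18, 20, 19, 21, 20, 18, 19, 21]

print(f"KL divergence: {gaussian_kl(counts_key_zeros, counts_key_ones):.3f}")
```

Both samples need a non-zero variance for the formula to be defined. In a
full assessment, the same calculation would be repeated for every design
unit and every clock cycle.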
2.2.2 Leakage Assessment in GTL
In the gate-level (GTL) design phase, side-channel leakage assessment
provides more accurate results than in RTL. This is because GTL is closer to
real-life conditions, where the design actually appears as logic cells and
the wires between them. Still, a gate-level model does not include noise,
although it does capture other low-level effects such as glitches and
physical routing. Examples of gate-level assessment methods include ACA
proposed by Yao et al. [9], GLIFT proposed by Oberg et al. [26], and Co-Co
proposed by Gigerl et al. [11].
The ACA method analyzes and ranks the logic cells in a gate-level design
based on their contribution to the power side-channel leakage. The authors
aim to localize the leaky cells in the design so that design engineers can
implement a local solution, which decreases the overhead caused by the
countermeasure. The proposed Leakage Impact Factor (LIF) is calculated for a
secure hardware design, and the cells are ranked based on their LIF values.
As a result, ACA shows that only a very small number of logic cells actually
contribute to side-channel leakage.
2.3 Post-Silicon Leakage Assessment
Post-silicon side-channel leakage assessment usually refers to statistical
methods, such as Test Vector Leakage Assessment (TVLA), that check whether
the sensitive variables of the cryptographic operation significantly affect
the side-channel measurements.
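A common way to perform this check in TVLA is Welch's t-test between two
sets of measurements taken at the same sample point. The Python sketch below
illustrates the statistic. The measurement values are made up, and the bound
of 4.5 on |t| is the commonly used TVLA threshold, which is an assumption
here because the text above does not state one.

```python
from statistics import mean, variance


def welch_t(set_a, set_b):
    """Welch's t-statistic between two measurement sets at one sample point."""
    var_a, var_b = variance(set_a), variance(set_b)
    return (mean(set_a) - mean(set_b)) / (
        (var_a / len(set_a) + var_b / len(set_b)) ** 0.5
    )


THRESHOLD = 4.5  # commonly used TVLA bound on |t| (assumption)

# Hypothetical measurements: a fixed-input set and two candidate second sets.
fixed_set = [1.0, 1.2, 0.8, 1.1, 0.9] * 20
same_mean_set = [0.9, 1.1, 1.0, 0.8, 1.2] * 20
shifted_set = [value + 0.5 for value in fixed_set]

for name, other in (("same mean", same_mean_set), ("shifted", shifted_set)):
    t = welch_t(fixed_set, other)
    verdict = "leak detected" if abs(t) > THRESHOLD else "no leak detected"
    print(f"{name}: {verdict}")
```

In practice the statistic is computed for every sample point of the traces,
and a device is flagged if the bound is exceeded at any of them.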
is an essential concept in UVM. Typically, every testbench needs basic
components like drivers, monitors, stimulus generators, and checkers. A
verification engineer can write all these components and perform the
verification process. However, these components can be written in countless
ways, and UVM resolves the confusion that this older practice brings. In
UVM, the functionality for a digital interface is implemented in structures
named agents. Agents are connected both to the related parts of the DUT and
to the rest of the testbench. For example, an agent can implement the Serial
Peripheral Interface (SPI) protocol. The agent is then responsible for
acting as a proper SPI endpoint.
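[Figure: a UVM agent containing a sequencer, a driver, and a monitor,
connected to the DUT through an interface.]
[Figure: UVM testbench hierarchy with labels test top, UVM test, UVM env,
UVM agent, sequences, interface, and DUT.]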
Chapter 3
3.1 Overview
The idea behind performing a pre-silicon side-channel assessment is to
detect information leakage at an early design phase. Because fixing a flaw
detected in a late design phase is costly in time and resources, previous
works in this area generally propose methods that aim to be compatible with
the design flow followed by almost every chip manufacturer. For example, He
et al. [10] mention that RTL-PSC, a side-channel leakage assessment
methodology in RTL based on toggle counting, was developed to be integrated
with the traditional ASIC and FPGA design flow. However, traditional chip
design flows are well-defined, so making changes is not always feasible.
Instead, adding a parallel thread to this process, using its native tools,
can be a good idea. Functional verification is a critical process in the
chip design flow, and UVM is the most widely used platform for performing
this verification. A UVM VIP is well suited to RTL side-channel leakage
assessment: it can collect and analyze toggle counts while other VIPs carry
out the native verification process.
is usually prepared to implement a specific protocol. It is different from a
UVM Agent because, although a UVM Agent is capable of handling low-level
signaling with its driver, it cannot work standalone. For example, an agent
for a Tightly Coupled Memory does know how to implement the memory
writes and reads. However, it needs consumable materials (data and timing
information) while performing its function.
[Figure: the SCO agent. Observers obs_a, obs_b, obs_c, and obs_d connect the
design to counters controlled by an enable-counting signal; an internal
scoreboard produces the analysis results.]
is created. Then, this array can be used in pre-silicon toggle counting side-
channel assessment applications. Figure 3.1 shows the block diagram of this
structure.
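The kind of toggle-count array referred to above can be pictured with a
short Python sketch. It is an illustration only and not the SystemVerilog
implementation of SCO VIP: it assumes that each observer delivers one
sampled value per clock cycle, and the observer names and sample values are
hypothetical.

```python
def count_toggles(samples):
    """Per-cycle toggle counts: Hamming distance between consecutive samples."""
    return [bin(prev ^ cur).count("1") for prev, cur in zip(samples, samples[1:])]


# Hypothetical values seen by two observers, one value per clock cycle.
observed = {
    "obs_a": [0b0000, 0b1010, 0b1011, 0b0011],
    "obs_b": [0b1111, 0b1111, 0b0000, 0b0001],
}

toggle_counts = {name: count_toggles(values) for name, values in observed.items()}
for name, counts in toggle_counts.items():
    print(name, counts)
```

Repeating this for every encryption gives counts indexed by block, clock
cycle, and encryption number, which is the shape of data that a statistical
metric such as KL divergence can then consume.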
Figure 3.2: Positions of the nets that may be counted twice in an example
circuit.
Chapter 4
Saidoyoki: Post-Silicon
Side-Channel Test Platform
4.1 Overview
The most accurate side-channel assessment can be made in the post-silicon
setting. This is because all the unforeseeable factors that affect the
measurement values exist in the physical circuit. In RTL, so far, the only
useful data seems to be the signal toggle counts, as they approximately
represent the dynamic power consumption of transistors. However, RTL lacks
most of the other factors that contribute to an accurate assessment of
side-channel leakage. Even at the gate level, where circuit properties are
much closer to reality, true noise, which is an important factor in
measurements, is still missing.
Figure 4.1: Simplified block diagram of Saidoyoki.
chips independently using the optional current probe port or the on-board
low noise amplifier (LNA). Because Saidoyoki uses ASICs instead of FPGAs, it
yields more concrete findings in side-channel research. Figure 4.1 is a
simplified block diagram of Saidoyoki.
Saidoyoki has been under development for the last two years, and two
versions have been developed so far. To prevent confusion, the board
referred to as Saidoyoki in [30] is version one. In this thesis, Saidoyoki
refers to the most recent design, version two. Figure 4.2 shows an image of
the version two board.
Figure 4.2: A photo of Saidoyoki V2 with a PICO mounted on.
Figure 4.3: (a) PICO Block diagram and (b) FAMEv2 Block diagram.
The PICO ASIC is a 180nm SoC with a RISC-V (RV32) core, 64 kByte of internal
memory, and several coprocessors. The program runs exclusively from off-chip
flash through a Quad-SPI flash ROM. The system is integrated on a single
bus. All coprocessors run as bus slaves and communicate with the RISC-V
software through memory-mapped registers. PICO contains cryptographic
accelerators for symmetric-key encryption (AES) and authenticated encryption
(ASCON), as well as hardware testing of true random bitstreams (TRNG test).
The sensors in PICO detect fault injection as well as side-channel leakage.
The FAME ASIC is a 180nm SoC with a LEON3 core, 128 kByte of internal
memory, and several coprocessors. The program can execute either from
on-chip SRAM or from off-chip flash through an SPI flash ROM. A debug unit,
controlled through an on-chip debug UART, provides program loading,
monitoring, and breakpoints. The coprocessors are isolated from the
processor through a bus bridge. All coprocessors operate exclusively as bus
slaves and communicate with the software through memory-mapped registers.
FAME contains cryptographic accelerators for symmetric-key encryption (AES
and AES+, a hardened version of AES) and pseudo-random stream generation
(KeyMill). The sensors in FAME detect timing faults injected through clock
glitching and voltage glitching.
[Figure: power network, with labels barrel jack, regulators, 3V3, 1V8, and
PICO (1V - 2V adjustable).]
4.2.2 Power
The power supply network of Saidoyoki provides a flexible and reliable power
supply for hardware security experiments. The board can be powered through
one of two input ports: a barrel jack with a 2.5mm center pin diameter or a
screw terminal. After an on/off slider switch, four Analog Devices
LT8083IDF#PBF linear voltage regulators are positioned in parallel. Two of
these regulators supply the general-purpose 1.8V and 3.3V rails, which
provide the I/O voltage for PICO and FAME and power the rest of the board.
The other two regulators are dedicated to the core voltage supply of the
test chips. Both are set to 1.8V by default, but their output voltage can be
adjusted for other experiments, such as fault attacks. The names of these
two regulators are written next to them on the board. Users can move the
related slider to the ADJ position to enable adjustable operation and then
set the accompanying variable resistor with a screwdriver. The output of the
adjustable regulators is limited to the range of 1V to 2V, which protects
the test chips from severely out-of-range voltages. A block diagram of the
power network can be seen in Figure 4.4.
[Figure: clocking circuit, with labels EEPROM, clock generator, headers,
switcher, clk_PICO, and clk_FAME.]
4.2.3 Clocking
Saidoyoki can supply independently configurable clock signals from several
sources to its test chips, PICO and FAME. The primary clock source is the
on-board clock generation IC, the CDCE925PWR from Texas Instruments, a clock
synthesizer with two programmable PLLs. The IC also has an internal EEPROM
that can store device configurations permanently. The CDCE925PWR is
configured over an I2C bus via the USB bridge. A user can program the
clocking chip before programming the test chips. If the internal EEPROM is
also programmed with the current configuration, the IC keeps supplying the
same clock after a power cycle. This device can supply a clock signal of up
to 230 MHz. Saidoyoki can also receive clock signals from an external
source. It houses an SMA connector and a selection header switch, which can
also be used as another clock supply port. Chapter 5 presents an
implementation in which Saidoyoki receives its clock signal from an external
device through this header switch. Figure 4.5 presents a block diagram of
the clocking circuit.
4.2.4 Measurements
Power measurement is the most critical part of post-silicon side-channel eval-
uation. Therefore, Saidoyoki focuses on the power measurement region by
providing several different methods to collect power side-channel data. A di-
agram of the power measurement region on Saidoyoki can be seen in Figure
4.6.
[Figure: PICO power measurement path between PICO_INT_REG and PICO_INT, with
the current probe port, the current probe enable switch, and the shunt
select block (0R1, 1R, 4R7, 10R).]
Saidoyoki has two power measurement circuits, one for PICO and one for FAME.
They are identical, so Figure 4.6 shows only the PICO part. The power
measurement block observes the core voltage supply rail of the test chip. It
sits between the output of the regulator and the power pins of the chip, so
it collects power measurements from the high side of the power network. The
two main elements of this structure are the current probe port and the low
noise amplifier (LNA). The first is the optional current probe port. When
enabled, this port breaks the series circuit and routes the current to a
screw terminal. The current is measured externally and returns to the board
through the other contact of the screw terminal. When disabled, the port
acts as a closed circuit and the current does not pass through the screw
terminal, which avoids adding unnecessary resistance to the network.
However, the additional resistance caused by the on/off switch cannot be
removed. The next stage in the network is the shunt resistor selection
block. Users can choose one of four precision shunt resistors for power
measurement: 0.1 Ohm, 1 Ohm, 4.7 Ohm, or 10 Ohm.
The shunt resistor block can also be bypassed when it is not needed. After
the shunt resistor selection comes the LNA. From this point on, the power
rail is routed to the related chip over a 50 Ohm trace, which provides
better signal quality. The LNA is an NXP BGA2801 MMIC wideband amplifier
that provides a gain of 22.2 dB at 250 MHz and is internally matched to
50 Ohm. This amplifier is used for side-channel data collection in
Chapter 5. The output of the LNA is available on an SMA connector.
Additionally, each power measurement block has two other SMA connectors, one
at the block input and one at the LNA input, for observing the power rail
when needed.
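The effect of the shunt choice on the measured signal can be estimated with
a first-order calculation, sketched below in Python. The shunt values and
the LNA gain come from the text above. The 1 mA current variation is a
hypothetical example, and the dB-to-voltage conversion assumes matched
50 Ohm terminations and ignores the frequency dependence of the gain.

```python
SHUNT_OHMS = (0.1, 1.0, 4.7, 10.0)  # selectable shunt values from the text
LNA_GAIN_DB = 22.2                  # LNA gain at 250 MHz from the text


def lna_voltage_gain(gain_db):
    """Convert a gain in dB to a voltage ratio (assumes matched impedances)."""
    return 10 ** (gain_db / 20)


def amplified_swing(delta_current_a, shunt_ohms, gain_db=LNA_GAIN_DB):
    """Voltage swing at the LNA output for a current variation in the shunt."""
    return delta_current_a * shunt_ohms * lna_voltage_gain(gain_db)


DELTA_I = 1e-3  # hypothetical 1 mA variation in the core supply current

for shunt in SHUNT_OHMS:
    swing_mv = amplified_swing(DELTA_I, shunt) * 1e3
    print(f"{shunt:>4} Ohm shunt: {swing_mv:.2f} mV at the LNA output")
```

A larger shunt gives a larger signal for the same current variation, but it
also drops more of the core supply voltage, so the choice is a trade-off
between signal amplitude and the operating voltage seen by the chip.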
[Figure: the FTDI 4-channel USB bridge and its channels: SPI/QSPI to the
PICO flash and FAME flash, I2C to the clock generator, UART to the PICO/FAME
user UART, and UART to the FAME debug UART.]
4.3 Interfacing
4.3.1 Communication Interface
One of the most essential features that make Saidoyoki ideal for side-channel
research is that it is possible to control every programmable component on
the board with a single USB micro-B port. Saidoyoki has a USB bus con-
troller (USB bridge) IC, FT4232H from FTDI, with four channels. Users can
access any of those channels through a computer. Figure 4.7 shows a block
diagram of the data communication network of Saidoyoki. These channels
are as follows:
Most of these channels are shared between PICO and FAME. This allows easy
board configuration without requiring different ports for programming,
debugging, and communication. Saidoyoki also has a GitHub page, where users
can find example code and programming scripts that flash binaries to PICO’s
or FAME’s program memory with a single Linux command.
4.3.2 Physical Interface
A wide range of flexibility comes with many components to be configured.
FTDI’s USB bridge can handle all the data communication, but there are
still many configuration points that should be set by hand. Although some of
these physical configuration points were mentioned above, this section gives
a complete list of them:
• The current probe ports can be enabled or disabled using the slider
switch. However, users should remember that when these switches are left in
the "enable" position but no current probe is connected, the power supply
rail remains open-circuit.
• The CDCE925PWR clock chip has a user-defined input port, S0, that can
be programmed by the user. This port can be grounded by a jumper.
It remains high when left unconnected.
• The CDCE925PWR clock chip shares its S1 and S2 user input ports with its
I2C port. If the chip is accidentally programmed to use these pins as user
input ports, it no longer uses the I2C bus. One way to recover from this
situation is to force the chip’s output voltage supply pin to ground. This
pin is connected to 3.3V by default through the related header; in such a
case, it can be connected to the ground header next to it instead. The chip
then temporarily restores the I2C bus for use.
• The flash chips share the same SPI/QSPI bus and respond as the same
slave. Therefore, when programming a flash chip, its nearby headers should
be set to "FT"; when a test chip is to read the program from its flash
memory, the related switches should be set to "FLASH". The switches of both
flash memory chips should not be in the "FT" position at the same time.
• The user UART is shared between PICO and FAME. The related headers
should be set accordingly to connect to the correct chip.
• FAME has two debug UART ports; the needed one should be selected
accordingly.
• Boot pin of FAME can be set to ground or to 3.3V using the related
slider switch.
• Test mode for FAME can be enabled using the related header.
• It is possible to reach all four channels of the USB bridge via headers.
More information about the physical design features and the physical con-
figuration before programming can be found on the GitHub page.
Chapter 5
Implementations
Figure 5.1: KL divergence value tendencies based on two different scenarios.
This thesis uses the KL divergence metric for its RTL side-channel leakage
assessment. KL divergence is a statistical measure of the distance between
two probability distributions. Figure 5.1 presents a visual explanation of
the KL divergence metric based on two hypothetical distribution scenarios.
\[
D(q \parallel p) = \int q(x) \log \frac{q(x)}{p(x)} \, dx \tag{5.1}
\]
the calculation of KL divergence values. First, KL divergence is calculated
for each block at each clock cycle of the encryption. Then, the resulting
values are normalized by the maximum value. Finally, locations whose
normalized KL divergence is larger than 0.5 are considered leaky points.
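The normalization and thresholding steps above can be sketched in Python as
follows. The block names and KL values are hypothetical, and the sketch
assumes that the maximum is taken over all blocks and clock cycles together.

```python
def leaky_points(kl_values, threshold=0.5):
    """Normalize KL values by their maximum and keep those above the threshold."""
    peak = max(kl_values.values())
    normalized = {key: value / peak for key, value in kl_values.items()}
    return {key: value for key, value in normalized.items() if value > threshold}


# Hypothetical KL divergence values indexed by (block name, clock cycle).
kl_values = {
    ("sbox", 1): 4.0,
    ("sbox", 2): 1.0,
    ("mixcol", 1): 2.5,
    ("keyexp", 1): 0.4,
}

flagged = leaky_points(kl_values)
for (block, cycle), value in flagged.items():
    print(f"{block} at cycle {cycle}: normalized KL = {value:.3f}")
```

Because the values are normalized by the global maximum, the location with
the largest KL divergence is always flagged, and the threshold selects the
other locations relative to it.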
5.1.2 Testbench
Universal Verification Methodology is an industry-standard framework used to
functionally verify digital circuits at RTL. Therefore, for a side-channel
leakage assessment method at RTL that is meant to become part of the
industrial design flow, UVM is a natural candidate for the implementation.
UVM testbenches are structued on a highly modular basis, this brings the
reusability feature to UVM which is one of the strongest factors contributing
38
DESIGN
Which clock,
which AES,
enable count
Counters
SCO VIP
Test counts array[block_name, clock_cycle, enc_number]
Scoreboard Test vectors -
seeding
Counts with 00s Counts with FFs
UVM ENV
Assessment Results
Figure 5.2: UVM testbench arcitecture for RTL side-channel assessment with
KL divergence metric.
39
to its ascension as an industry-standard. In Code Listings 5.1, 5.2, and 5.3,
it is shown that how UVM base class libraries and SCO VIP packages can be
imported, configured and used to start a simple test. UVM code structures
are usually long. Therefore, the mentioned listings show sample coding styles.
The rest of the source code of the testbench and the SCO VIP package can
be found in the project’s GitHub page. Although many RTL simulators have
the UVM base class libraries built-in, they are also available as open source
at the website of Accellera Systems Initiative.
    // testbench top
    `include "uvm_macros.svh"
    module top;
      import uvm_pkg::*;
      import test_pkg::*;
      // Variable declarations
      ...
      //
      aes_com_if aes_com_if_0(); // Interface for com agent
      if_aes_comp_core* if_aes_comp_core*_0(); // Interface for observers ('*' is a placeholder)
      aes_top DUT(/* DUT top level signals to agent interfaces */);
      bind_dut_internals(); // bind DUT signals to observer interfaces
      initial begin // make interfaces available to the uvm
        uvm_config_db#(virtual aes_com_if)::set(null, "*", "aes_com_vif", aes_com_if_0);
        ...;
      end
      initial begin
        generate_clock_and_reset();
      end
      initial begin
        run_test(); // specific method to start uvm_tests
      end
    endmodule
Code Listing 5.1: A sample testbench top module that imports the UVM
base class libraries.
    // test -- all associated files are combined in test_pkg
    class test extends uvm_test;
      `uvm_component_utils(test) // register the class as a uvm component
      env_base m_env; // bring uvm environment
      env_config_base m_cfg; // bring uvm env config class
      aes_com_env_config m_aes_com_env_config; // bring communication agent
      aes_com_agent_config m_aes_com_agent_config;
      aes_observer_env_config m_aes_observer_env_config; // bring SCO
      aes_observer_agent_config m_aes_observer_agent_config;
      aes_sequence m_aes_sequence; // bring test sequence
      // construct and register classes to uvm database
      // configure agents and the environment using the handles.
      task run_phase(uvm_phase phase);
        `uvm_info(get_type_name(), "In run phase of 'test'.", UVM_LOW)
        phase.raise_objection(this, "test");
        m_aes_sequence.start(); // start test sequence (sequencer handle omitted in this sample)
        phase.drop_objection(this, "test");
      endtask
    endclass
Code Listing 5.2: A sample test class that imports the agents, configures and
drives the test sequence.
we need to use the discrete version of this formula, which is given in Formula
5.2 where p and q are probability distributions, and D is KL divergence.
$$D(q \parallel p) = \sum_{x \in X} q(x) \log \frac{q(x)}{p(x)} \tag{5.2}$$
After the toggle counts are collected, every module has a data set of
toggle counts for each clock cycle. Each of these data sets includes 1000
data points, as both encryptions were repeated 1000 times. RTL-PSC proposes
that these data sets follow a Gaussian distribution. Therefore, their mean
and variance values are calculated. These values are then used to calculate
the probability functions and, finally, the KL divergence values for every
module at every clock cycle.
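As an illustration of this step, the sketch below fits a Gaussian to each of two toggle-count data sets and evaluates the KL divergence between them. It is a simplified stand-in rather than the SCO VIP or RTL-PSC implementation: it uses the closed-form KL divergence between two univariate Gaussians instead of evaluating the discrete sum in Formula 5.2, and the function name `gaussian_kl` and the toy toggle counts are hypothetical.

```python
import math
from statistics import fmean, pvariance


def gaussian_kl(counts_q, counts_p):
    """Closed-form D(q || p) for Gaussians fitted to two toggle-count sets."""
    mu_q, mu_p = fmean(counts_q), fmean(counts_p)
    var_q, var_p = pvariance(counts_q), pvariance(counts_p)
    if var_q <= 0 or var_p <= 0:
        raise ValueError("both data sets need non-zero variance")
    return (
        0.5 * math.log(var_p / var_q)
        + (var_q + (mu_q - mu_p) ** 2) / (2.0 * var_p)
        - 0.5
    )


# Toy toggle counts of one module at one clock cycle for the two data sets.
counts_zeros = [10, 12, 14, 12]
counts_ones = [20, 22, 24, 22]
kl_value = gaussian_kl(counts_zeros, counts_ones)
```

In practice each list would hold the 1000 toggle counts collected for one module at one clock cycle, and the function would be called once per module and clock cycle.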
Figure 5.3: Normalized KL divergence values of the RTL modules of PICO's
AES accelerator at every clock cycle.
Figure 5.5: Runtime of SCO VIP in UVM with PICO's AES coprocessor as
the target device.
Figure 5.6: T values calculated from the data sets of all-zeros and all-ones.
by analyzing the power traces that are taken from the target device while
it is performing a key-based operation. Previous works have shown that
CPA can effectively extract the secret key value from AES ciphers.
CPA does this by calculating the Pearson correlation coefficient for
each sample point and key guess. In CPA, the correlation is computed between
a hypothetical power consumption (power model) and the measured one.
Naturally, the correlation values calculated using the hypothetical
power consumption of the correct key guess stand out from the
others. As mentioned above, the power model can be Hamming Weight
or Hamming Distance. In this demonstration, we used the Hamming Distance
between the input and the output of the S-box operation in the first AES
round. Formula 5.3 shows the calculation of the power model, where HD is
the Hamming Distance, HW is the Hamming Weight, p is the plaintext, and
k is the key guess. In this attack, all calculations are made byte-wise.
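The byte-wise CPA procedure described above can be sketched in Python as follows. This is an illustrative sketch on synthetic data, not the analysis code used in the thesis. Since Formula 5.3 is not reproduced here, the power model is assumed to be HW((p XOR k) XOR Sbox(p XOR k)), that is, the Hamming Distance between the S-box input and output. The helper names, the trace layout, the noise levels, and the key byte 0x2B are all hypothetical.

```python
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """Derive the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def hd_model(p, k):
    """Hamming distance between the S-box input and output for guess k."""
    x = p ^ k
    return bin(x ^ SBOX[x]).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5


def cpa_byte(plaintexts, traces):
    """Return the key guess with the largest absolute correlation peak."""
    columns = list(zip(*traces))  # one tuple of measurements per sample point
    best_guess, best_peak = 0, -1.0
    for guess in range(256):
        model = [hd_model(p, guess) for p in plaintexts]
        peak = max(abs(pearson(model, column)) for column in columns)
        if peak > best_peak:
            best_guess, best_peak = guess, peak
    return best_guess, best_peak


# Synthetic demonstration: only sample index 1 leaks the modeled value.
rng = random.Random(1)
true_key = 0x2B
plaintexts = [rng.randrange(256) for _ in range(500)]
traces = [
    [rng.gauss(0, 1), hd_model(p, true_key) + rng.gauss(0, 0.5), rng.gauss(0, 1)]
    for p in plaintexts
]
recovered_key, peak = cpa_byte(plaintexts, traces)
```

A full attack would repeat `cpa_byte` for each of the sixteen key bytes, using the corresponding plaintext byte of every trace. The S-box is derived from its definition instead of being hard-coded, which keeps the sketch short at the cost of a brute-force inverse search.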
Figure 5.7: Wiring between ChipWhisperer and Saidoyoki.
signal. When the trigger signal arrives, it starts to collect the trace. On the
side of Saidoyoki, power traces are collected from the LNA using a 4.7 Ohm
shunt resistor. This operation repeats as many times as required.
5.2.3 Results
In the initial experiments, it was seen that after Saidoyoki issues the trigger
signal, all AES rounds together take at most 11500 sample points. Recall
that the sampling rate of the CW was 16 MS/s; 11500 sample points
therefore correspond to 718.75 µs, or 2875 clock cycles, after the trigger
is set. Figure 5.8 shows a random trace. In this trace, it is easy to
visually locate the S-box operation from which we calculated the power
model. This is the attack region.
After this, 400 thousand traces were collected from Saidoyoki. The range
from sample 1500 to sample 3500 was extracted from each of these traces
to decrease the analysis time. As a result of this analysis, CPA was able
to recover fifteen out of sixteen key bytes. For the only key byte that
could not be recovered, the correct key guess had the second-highest
correlation, very slightly below the highest one. Figure 5.9,
Figure 5.10, and Figure 5.11 show the correlation peaks observed for key
bytes zero to seven, eight to fifteen, and a more detailed plot of the erroneous
byte (byte 1), respectively.
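The sample-index figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text (16 MS/s, 11500 samples, 2875 clock cycles, and the window from sample 1500 to sample 3500). The 4 MHz clock is not stated here; it is a value implied by those numbers, and the variable and function names are hypothetical.

```python
SAMPLE_RATE_HZ = 16_000_000  # ChipWhisperer sampling rate quoted in the text
TRACE_SAMPLES = 11500        # samples covering all AES rounds
TRACE_CYCLES = 2875          # clock cycles quoted for the same span


def samples_to_us(samples):
    """Convert a sample index to microseconds after the trigger."""
    return samples * 1_000_000 / SAMPLE_RATE_HZ


samples_per_cycle = TRACE_SAMPLES / TRACE_CYCLES       # samples per clock cycle
implied_clock_hz = SAMPLE_RATE_HZ / samples_per_cycle  # derived, not stated
trace_duration_us = samples_to_us(TRACE_SAMPLES)

# Attack window kept for the CPA analysis (samples 1500 to 3500).
window_us = (samples_to_us(1500), samples_to_us(3500))
window_cycles = (1500 / samples_per_cycle, 3500 / samples_per_cycle)
```

Under these assumptions there are four samples per clock cycle, and the analyzed window covers 2000 samples, or 500 clock cycles, of each trace.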
One reason for failing to recover one byte could be electrical noise
on Saidoyoki. Saidoyoki version two is the improved version, which is
hardened explicitly against electrical noise. Although it is not possible
to eliminate the noise completely, traces collected from version two show
cleaner trends, and the CPA results show more distinct peaks than the
results from version one. Moreover, to reduce the distortion caused by
the noise, further experiments using more traces were carried out.
However, these experiments still showed the same behavior for byte 1:
a failure to recover it, by a slight margin. As a result, although
electrical noise is a strong candidate for the origin of this issue, other
factors could be in effect instead of, or together with, the noise. This
will be a future improvement action point for Saidoyoki version three.
Figure 5.9: Correlation peaks for key bytes from zero to seven (one panel
per byte, Byte 0 to Byte 7).
Figure 5.10: Correlation peaks for key bytes from eight to fifteen (one
panel per byte, Byte 8 to Byte 15).
Figure 5.11: Detailed correlation plot of byte 1.
Chapter 6
Conclusion
Towards the needs mentioned above, this thesis presented two side-channel
leakage assessment tools, SCO VIP and the Saidoyoki board. SCO VIP is a
UVM-compatible side-channel assessment tool at RTL, so that new
assessment methods based on toggle counting can easily be implemented in
an industry-standard verification environment. SCO VIP was demonstrated
by implementing the recently proposed RTL-PSC method, and it produced
results faster than the original work. Saidoyoki, on the other hand, is
designed to be a flexible and reliable post-silicon tool for side-channel
leakage experiments. Its highly configurable structure was shown to make
side-channel analysis experiments easier. The Saidoyoki board presented
in this thesis is the second version of the board. Saidoyoki version two
also proved itself in a CPA attack on a 128-bit AES hardware accelerator,
resulting in more distinct correlation peaks than version one.
Bibliography
[1] Hans Delfs and Helmut Knebl. Introduction to cryptography,
volume 2. Springer, 2002.
[5] Matt Curtin and Justin Dolske. A brute force search of des keyspace.
In 8th Usenix Symposium, January, pages 26–29. Citeseer, 1998.
[6] Joan Daemen and Vincent Rijmen. The rijndael block cipher: Aes pro-
posal. In First candidate conference (AeS1), pages 343–348, 1999.
[9] Yuan Yao, Tarun Kathuria, Baris Ege, and Patrick Schaumont. Archi-
tecture correlation analysis (aca): Identifying the source of side-channel
leakage at gate-level. In 2020 IEEE International Symposium on Hard-
ware Oriented Security and Trust (HOST), pages 188–196. IEEE, 2020.
[10] Miao He, Jungmin Park, Adib Nahiyan, Apostol Vassilev, Yier Jin,
and Mark Tehranipoor. Rtl-psc: Automated power side-channel leakage
assessment at register-transfer level. In 2019 IEEE 37th VLSI Test
Symposium (VTS), pages 1–6. IEEE, 2019.
[11] Barbara Gigerl, Vedad Hadzic, Robert Primas, Stefan Mangard, and
Roderick Bloem. Coco: Co-design and co-verification of masked
software implementations on CPUs. In 30th USENIX Security Symposium
(USENIX Security 21), pages 1469–1486, 2021.
[12] Davide Poggi, Philippe Maurine, Thomas Ordas, and Alexandre Sarafi-
anos. Protecting secure ics against side-channel attacks by identifying
and quantifying potential em and leakage hotspots at simulation stage.
In International Workshop on Constructive Side-Channel Analysis and
Secure Design, pages 129–147. Springer, 2021.
[13] Rajat Sadhukhan, Paulson Mathew, Debapriya Basu Roy, and Deb-
deep Mukhopadhyay. Count your toggles: a new leakage model for pre-
silicon power analysis of crypto designs. Journal of Electronic Testing,
35(5):605–619, 2019.
[14] Hendra Guntur, Jun Ishii, and Akashi Satoh. Side-channel attack user
reference architecture board sakura-g. In 2014 IEEE 3rd Global Con-
ference on Consumer Electronics (GCCE), pages 271–274. IEEE, 2014.
[15] Alfred J Menezes, Paul C Van Oorschot, and Scott A Vanstone. Hand-
book of applied cryptography. CRC press, 2018.
[16] Iqtadar Hussain, Tariq Shah, Hasan Mahmood, and Muhammad Asif
Gondal. A projective general linear group based algorithm for the con-
struction of substitution box for block ciphers. Neural Computing and
Applications, 22(6):1085–1093, 2013.
[17] Dakshi Agrawal, Bruce Archambeault, Josyula R Rao, and Pankaj
Rohatgi. The EM side-channel(s). In International workshop on
cryptographic hardware and embedded systems, pages 29–45. Springer, 2002.
[18] Onur Acıiçmez, Werner Schindler, and Çetin K Koç. Cache based re-
mote timing attack on the aes. In Cryptographers’ track at the RSA
conference, pages 271–286. Springer, 2007.
[19] Stefan Mangard. A simple power-analysis (spa) attack on implementa-
tions of the aes key expansion. In International Conference on Informa-
tion Security and Cryptology, pages 343–358. Springer, 2002.
[20] Paul Kocher, Joshua Jaffe, and Benjamin Jun. Differential power anal-
ysis. In Annual international cryptology conference, pages 388–397.
Springer, 1999.
[21] Eric Brier, Christophe Clavier, and Francis Olivier. Correlation power
analysis with a leakage model. In International workshop on crypto-
graphic hardware and embedded systems, pages 16–29. Springer, 2004.
[22] Yusuke Yano, Kengo Iokibe, Yoshitaka Toyota, and Toshiaki Teshima.
Signal-to-noise ratio measurements of side-channel traces for establish-
ing low-cost countermeasure design. In 2017 Asia-Pacific International
symposium on electromagnetic compatibility (APEMC), pages 93–95.
IEEE, 2017.
[24] Debayan Das, Shovan Maity, Saad Bin Nasir, Santosh Ghosh, Arijit
Raychowdhury, and Shreyas Sen. Asni: Attenuated signature noise in-
jection for low-overhead power side-channel attack immunity. IEEE
Transactions on Circuits and Systems I: Regular Papers, 65(10):3300–
3311, 2018.
[25] Tao Zhang, Jungmin Park, Mark Tehranipoor, and Farimah Farah-
mandi. Psc-tg: Rtl power side-channel leakage assessment with test
pattern generation. In 2021 58th ACM/IEEE Design Automation Con-
ference (DAC), pages 709–714. IEEE, 2021.
[26] Jason Oberg, Sarah Meiklejohn, Timothy Sherwood, and Ryan Kastner.
Leveraging gate-level properties to identify hardware timing channels.
IEEE Transactions on Computer-Aided Design of Integrated Circuits
and Systems, 33(9):1288–1301, 2014.
[27] Gilbert Goodwill, Benjamin Jun, Josh Jaffe, and Pankaj Rohatgi. A
testing methodology for side-channel resistance validation. In NIST
non-invasive attack testing workshop, volume 7, pages 115–136, 2011.
[28] Stephen E Fienberg and Nicole Lazar. William sealy gosset. In Statis-
ticians of the Centuries, pages 312–317. Springer, 2001.
[29] Juan Francesconi, J Agustin Rodriguez, and Pedro M Julian. Uvm based
testbench architecture for unit verification. In 2014 Argentine Confer-
ence on Micro-Nanoelectronics, Technology and Applications (EAMTA),
pages 89–94. IEEE, 2014.
[30] Pantea Kiaei, Zhenyuan Liu, Ramazan Kaan Eren, Yuan Yao, and
Patrick Schaumont. Saidoyoki: Evaluating side-channel leakage in pre-
and post-silicon setting. Cryptology ePrint Archive, 2021.