Standard Cheat Sheet
B2T: $(a_{n-1} \dots a_i \dots a_0)_2 \Rightarrow -a_{n-1} \cdot 2^{n-1} + \sum_{i=0}^{n-2} a_i \cdot 2^i$
B2H/H2B: 4 bits <-> 1 hex digit
T2U/U2T: T2U adds 2^n when the value is negative; U2T subtracts 2^n when the value is >= 2^(n-1)
H2T = H2B + B2T; T2B = T2U + U2B; T2H = T2U + U2H.
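The conversions above can be sketched in C. This is a minimal illustration rather than a definitive implementation; the function names `b2t`, `t2u`, and `u2t` and the restriction to widths 1 <= n <= 32 are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* B2T: interpret the low n bits of `bits` as an n-bit two's-complement value,
   i.e. -a_{n-1}*2^(n-1) + sum_{i=0}^{n-2} a_i*2^i. Assumes 1 <= n <= 32. */
int64_t b2t(uint32_t bits, int n) {
    assert(n >= 1 && n <= 32);
    int64_t value = 0;
    for (int i = 0; i <= n - 2; i++) {
        if ((bits >> i) & 1u) value += (int64_t)1 << i;   /* positive weights */
    }
    if ((bits >> (n - 1)) & 1u) value -= (int64_t)1 << (n - 1); /* sign bit has negative weight */
    return value;
}

/* T2U: add 2^n when the signed value is negative. */
int64_t t2u(int64_t t, int n) {
    assert(n >= 1 && n <= 32);
    return t < 0 ? t + ((int64_t)1 << n) : t;
}

/* U2T: subtract 2^n when the unsigned value is >= 2^(n-1). */
int64_t u2t(int64_t u, int n) {
    assert(n >= 1 && n <= 32);
    return u >= ((int64_t)1 << (n - 1)) ? u - ((int64_t)1 << n) : u;
}
```

For example, with n = 4 the bit pattern 1011 gives `b2t(0xB, 4) == -5`, and `t2u(-5, 4) == 11` maps it to the unsigned reading of the same bits.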
Memory: byte-addressable
x86-64 data size:
Byte (b) 1 byte 8 bits
Word (w) 2 bytes 16 bits
Double word (l) 4 bytes 32 bits
Quad word (q) 8 bytes 64 bits
Alignment: the starting address must be a multiple of
the data size, e.g., a valid starting address
of a double word satisfies address mod 4 = 0
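The alignment rule can be written as a one-line check. The helper name `is_aligned` is hypothetical, and the sketch assumes the data size is one of the four x86-64 sizes listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if `addr` is a valid starting address for an object of `size`
   bytes under the rule "address mod size == 0"; assumes size is 1, 2, 4, or 8. */
int is_aligned(uint64_t addr, size_t size) {
    assert(size == 1 || size == 2 || size == 4 || size == 8);
    return addr % size == 0;
}
```

For example, 0x1004 is a valid start for a double word (4 bytes) but not for a quad word (8 bytes).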
Components of x86-64 ISA
1. Data and address size
   1/2/4/8 bytes data size; 64-bit address
2. Supported instructions
   mov, add, jmp, ...
3. Registers
   16 general-purpose registers and usage conventions (optional)
4. Addressing Modes
5. Length and format of instructions

x & mask: clear bits that are 0 in mask
x | mask: set bits that are 1 in mask
x ^ mask: flip bits that are 1 in mask
~x: flip all bits
~x + 1 = -x: negation = flip bits + 1
A - B = A + ~B + 1
0 & x = 0; 1 & x = x; x & x = x.
0 | x = x; 1 | x = 1; x | x = x.
0 ^ x = x; 1 ^ x = ~x; x ^ x = 0.
Left shift (x << n): fill in 0
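The mask operations and the negation identity can be checked in C. This is a small sketch on 8-bit values; the function names are hypothetical, and the casts back to `uint8_t` are there because C promotes small integers to `int` before applying `~`.

```c
#include <stdint.h>

/* x & mask: keeps bits where mask is 1, clears bits where mask is 0. */
uint8_t clear_bits(uint8_t x, uint8_t mask) { return x & mask; }

/* x | mask: sets the bits that are 1 in mask. */
uint8_t set_bits(uint8_t x, uint8_t mask) { return x | mask; }

/* x ^ mask: flips the bits that are 1 in mask. */
uint8_t flip_bits(uint8_t x, uint8_t mask) { return x ^ mask; }

/* -x = ~x + 1, truncated to 8 bits. */
uint8_t negate(uint8_t x) { return (uint8_t)(~x + 1); }

/* A - B = A + ~B + 1, truncated to 8 bits. */
uint8_t subtract(uint8_t a, uint8_t b) { return (uint8_t)(a + ~b + 1); }
```

For example, `clear_bits(0xAB, 0x0F)` keeps only the low nibble (0x0B), and `subtract(10, 3)` gives 7 without using the `-` operator.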
q: 8 bytes  l: 4 bytes  w: 2 bytes  b: 1 byte
%rax        %eax        %ax         %al
%rbx        %ebx        %bx         %bl
%rcx        %ecx        %cx         %cl
%rdx        %edx        %dx         %dl
%rsi        %esi        %si         %sil
%rdi        %edi        %di         %dil
%rbp        %ebp        %bp         %bpl
%rsp        %esp        %sp         %spl
%r8         %r8d        %r8w        %r8b
%r9         %r9d        %r9w        %r9b
%r10        %r10d       %r10w       %r10b
%r11        %r11d       %r11w       %r11b
%r12        %r12d       %r12w       %r12b
%r13        %r13d       %r13w       %r13b
%r14        %r14d       %r14w       %r14b
%r15        %r15d       %r15w       %r15b
leaq src, dst: computes the address expression src and stores it in dst; it does not load Mem[src]
leaq imm(reg1,reg2,c), reg3: saves imm + reg1 + reg2*c into reg3 (c is 1, 2, 4, or 8)
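As a rough C model of the address arithmetic that leaq performs (the function name `lea` is hypothetical, and registers are modeled as plain 64-bit integers):

```c
#include <assert.h>
#include <stdint.h>

/* Models leaq imm(reg1,reg2,c), reg3: returns imm + reg1 + reg2*c.
   Only the address is computed; no memory is read. */
uint64_t lea(int64_t imm, uint64_t reg1, uint64_t reg2, unsigned c) {
    assert(c == 1 || c == 2 || c == 4 || c == 8);  /* the scale factor must be 1, 2, 4, or 8 */
    return (uint64_t)imm + reg1 + reg2 * c;
}
```

For example, `lea(8, 0x1000, 3, 4)` yields 0x1014, and `lea(0, x, x, 4)` computes 5*x, which is why compilers often use leaq for simple arithmetic.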
Unary (with b/w/l/q variants): C equivalent
incq dst: dst++
decq dst: dst--
negq dst: dst = -dst
notq dst: dst = ~dst
Binary (with b/w/l/q variants): C equivalent
addq src,dst: dst = dst + src
subq src,dst: dst = dst - src
andq src,dst: dst = dst & src
orq src,dst: dst = dst | src
xorq src,dst: dst = dst ^ src
salq src,dst: dst = dst << src
sarq src,dst: dst = dst >> src arithmetic
shrq src,dst: dst = dst >> src logical
32-bit variants set the upper 4 bytes to 0 for register destination
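The difference between sarq and shrq, and the zeroing of the upper 4 bytes by 32-bit variants, can be sketched in C. The function names are hypothetical. The arithmetic shift is emulated on unsigned values because `>>` on a negative signed value is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* shrq: logical right shift, vacated high bits are filled with 0. */
uint64_t shr64(uint64_t x, unsigned n) {
    assert(n < 64);
    return x >> n;
}

/* sarq: arithmetic right shift, vacated high bits are filled with the sign bit. */
uint64_t sar64(uint64_t x, unsigned n) {
    assert(n < 64);
    uint64_t logical = x >> n;
    if (n == 0 || !(x >> 63)) return logical;      /* non-negative: same as logical */
    uint64_t fill = ~(uint64_t)0 << (64 - n);      /* n one-bits in the high positions */
    return logical | fill;
}

/* addl with a register destination: the 32-bit sum is written and
   the upper 4 bytes of the 64-bit register become 0. */
uint64_t addl_model(uint64_t dst, uint64_t src) {
    return (uint64_t)(uint32_t)(dst + src);
}
```

For example, shifting 0x8000000000000000 right by 4 gives 0x0800000000000000 with `shr64` but 0xF800000000000000 with `sar64`.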
Number of memory accesses: leaq: 0; mov: 0 or 1; arithmetic & logical: 0, 1, or 2 (2 when a memory destination is both read and written)
All integer arithmetic & logical instructions set condition codes, but
1. leaq does not change any condition codes
2. For and, or, xor, and test, CF = 0, OF = 0; not does not change any condition codes
3. For shift operations, CF = last bit shifted out, OF = 0
4. inc and dec will set OF, but leave CF unchanged
cmp src1, src2 computes src2 - src1; test src1, src2 computes src2 & src1
Both set condition codes but do not store the result in src2
Jump based on relation of two variables:
cmp b, a    # compare a : b
j* label    # jump if a * b
Jump based on testing of one variable:
test a, a   # test a : 0
j* label    # jump if a * 0
Instruction 1                               cmp b, a            test b, a
Instruction 2                               Conditional         Conditional
je label   # equal / zero                   a == b              (a & b) == 0
jne label  # not equal / not zero           a != b              (a & b) != 0
js label   # negative                       a - b < 0           (a & b) < 0
jns label  # non-negative                   a - b >= 0          (a & b) >= 0
jg label   # greater (signed >)             a > b (signed)      (a & b) > 0
jge label  # greater or equal (signed >=)   a >= b (signed)     (a & b) >= 0
jl label   # less (signed <)                a < b (signed)      (a & b) < 0
jle label  # less or equal (signed <=)      a <= b (signed)     (a & b) <= 0
ja label   # above (unsigned >)             a > b (unsigned)    (a & b) > 0U
jae label  # above or equal (unsigned >=)   a >= b (unsigned)   (a & b) >= 0U
jb label   # below (unsigned <)             a < b (unsigned)    (a & b) < 0U
jbe label  # below or equal (unsigned <=)   a <= b (unsigned)   (a & b) <= 0U
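The jump conditions in the table can be traced back to the condition codes. The sketch below models cmp and test on 64-bit operands; the `Flags` struct and all function names are hypothetical, ZF (zero) and SF (sign) are the standard x86 flags alongside CF and OF, and the flag combinations used for each jump are the standard x86 ones (for example, jl is taken when SF != OF).

```c
#include <stdint.h>

typedef struct { int zf, sf, cf, of; } Flags;

/* cmp b, a: computes a - b, sets the flags, discards the result. */
Flags cmp_flags(uint64_t b, uint64_t a) {
    uint64_t r = a - b;
    Flags f;
    f.zf = (r == 0);
    f.sf = (int)(r >> 63);
    f.cf = (a < b);                                /* unsigned borrow */
    /* signed overflow: a and b differ in sign and the result's sign differs from a's */
    f.of = (int)(((a ^ b) & (a ^ r)) >> 63);
    return f;
}

/* test b, a: computes a & b, sets ZF and SF, clears CF and OF. */
Flags test_flags(uint64_t b, uint64_t a) {
    uint64_t r = a & b;
    Flags f = { r == 0, (int)(r >> 63), 0, 0 };
    return f;
}

int je(Flags f) { return f.zf; }                        /* equal / zero     */
int jl(Flags f) { return f.sf ^ f.of; }                 /* signed less      */
int jg(Flags f) { return !(f.sf ^ f.of) && !f.zf; }     /* signed greater   */
int jb(Flags f) { return f.cf; }                        /* unsigned below   */
int ja(Flags f) { return !f.cf && !f.zf; }              /* unsigned above   */
```

With a = -1 and b = 1, `jl(cmp_flags(1, (uint64_t)-1))` is taken (signed -1 < 1) while `jb` is not, since the same bits read as unsigned are the largest 64-bit value.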
a: address of x; a in %rdx; index i in %rcx; return value is stored in %rax
On write-hit
1. Write-through (update both cache and memory)
2. Write-back (only update cache, update memory
later, check dirty bit when a block is evicted)
On write-miss
1. No-write-allocate (directly update memory)
2. Write-allocate (load the block into cache, then update the cached copy)
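To make the trade-off concrete, the sketch below counts how many writes reach memory for a sequence of stores. It is a deliberately simplified model with assumed details: a cache holding a single block, stores only, and the common pairings write-back + write-allocate and write-through + no-write-allocate. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Write-back + write-allocate, one-block cache.
   Returns the number of memory writes caused by stores to `blocks`. */
int mem_writes_write_back(const uint64_t *blocks, size_t n) {
    int valid = 0, dirty = 0, writes = 0;
    uint64_t cached = 0;
    for (size_t i = 0; i < n; i++) {
        if (!valid || cached != blocks[i]) {   /* write miss */
            if (valid && dirty) writes++;      /* evicted block is dirty: write it back */
            valid = 1;                         /* write-allocate: load the block */
            cached = blocks[i];
        }
        dirty = 1;                             /* update only the cached copy */
    }
    return writes;  /* the last dirty block is still only in the cache */
}

/* Write-through + no-write-allocate: every store updates memory,
   on a hit (cache and memory) and on a miss (memory only). */
int mem_writes_write_through(const uint64_t *blocks, size_t n) {
    (void)blocks;
    return (int)n;
}
```

For the store sequence 1, 1, 1, 2 the write-back model writes memory once (when block 1 is evicted), while the write-through model writes memory four times.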
1. Compulsory miss occurs on the first access to a block
2. Conflict miss occurs when the cache is large enough, but multiple blocks all map
to the same cache set, e.g., referencing blocks 0, 8, 0, 8, ... could miss every time
Direct-mapped caches have more conflict misses than E-way set-associative caches (E > 1)
Fully associative caches do not have conflict misses
3. Capacity miss occurs when the set of active cache blocks (the working set)
is larger than the cache (will not fit even if the cache is fully associative)
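The three miss types can be observed with a tiny cache simulator. This is a sketch under stated assumptions: accesses are given as block numbers, a block maps to set (block mod S), replacement is LRU, and the names `count_misses` and `MAX_LINES` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LINES 64

/* Counts misses for a sequence of block numbers in a cache with S sets and
   E lines per set (E = 1: direct mapped; S = 1: fully associative), LRU policy. */
int count_misses(const uint64_t *blocks, int n, int S, int E) {
    assert(S >= 1 && E >= 1 && S * E <= MAX_LINES);
    uint64_t tag[MAX_LINES];
    int valid[MAX_LINES];
    int last_used[MAX_LINES];
    for (int j = 0; j < S * E; j++) { valid[j] = 0; last_used[j] = -1; }

    int misses = 0;
    for (int t = 0; t < n; t++) {
        int base = (int)(blocks[t] % (uint64_t)S) * E;   /* first line of the set */
        int hit = 0, victim = base;
        for (int j = base; j < base + E; j++) {
            if (valid[j] && tag[j] == blocks[t]) { hit = 1; victim = j; break; }
            /* empty lines have last_used = -1, so they are chosen before any used line */
            if (last_used[j] < last_used[victim]) victim = j;
        }
        if (!hit) {
            misses++;
            valid[victim] = 1;
            tag[victim] = blocks[t];
        }
        last_used[victim] = t;   /* mark the touched line as most recently used */
    }
    return misses;
}
```

With 4 lines total, the sequence 0, 8, 0, 8 misses 4 times when direct mapped (S = 4, E = 1) but only twice, the compulsory misses, when 2-way set-associative (S = 2, E = 2). Cycling through 5 blocks misses every time even when fully associative (S = 1, E = 4), which is a capacity miss.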
GOOD LUCK!