08 Machine Data
Instructors:
Randal E. Bryant and David R. O’Hallaron
Today
Arrays
▪ One-dimensional
▪ Multi-dimensional (nested)
▪ Multi-level
Structures
▪ Allocation
▪ Access
▪ Alignment
Floating Point
Array Allocation
Basic Principle
T A[L];
▪ Array of data type T and length L
▪ Contiguously allocated region of L * sizeof(T) bytes in memory
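char string[12];
▪ 12 bytes, from x to x + 12
int val[5];
▪ 20 bytes; elements start at x, x + 4, x + 8, x + 12, x + 16; array ends at x + 20
double a[3];
▪ 24 bytes; elements start at x, x + 8, x + 16; array ends at x + 24
char *p[3];
▪ 24 bytes; elements start at x, x + 8, x + 16; array ends at x + 24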
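The allocation rule can be checked directly with sizeof. The sketch below restates the four declarations from this slide; the helper val_stride is a hypothetical name added for illustration, and the concrete byte counts in the usage note assume a typical x86-64 system (4-byte int, 8-byte double and pointer).

```c
#include <assert.h>
#include <stddef.h>

/* The four arrays declared on the slide. */
char   string[12];
int    val[5];
double a[3];
char  *p[3];

/* Checked at compile time: an array of L elements of type T
   occupies exactly L * sizeof(T) bytes. */
static_assert(sizeof(string) == 12 * sizeof(char),   "char string[12]");
static_assert(sizeof(val)    == 5  * sizeof(int),    "int val[5]");
static_assert(sizeof(a)      == 3  * sizeof(double), "double a[3]");
static_assert(sizeof(p)      == 3  * sizeof(char *), "char *p[3]");

/* Hypothetical helper: byte distance between val[0] and val[1].
   Contiguous allocation means this is exactly sizeof(int). */
size_t val_stride(void) {
    return (size_t)((char *)&val[1] - (char *)&val[0]);
}
```

On an x86-64 machine, sizeof(val) should evaluate to 20 and sizeof(p) to 24, matching the address ranges listed above.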
Array Access
Basic Principle
T A[L];
▪ Array of data type T and length L
▪ Identifier A can be used as a pointer to array element 0: Type T*
int val[5]; holds 1 5 2 1 3
▪ Elements start at x, x + 4, x + 8, x + 12, x + 16; array ends at x + 20
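A short sketch of what "A can be used as a pointer to element 0" means in practice. The array contents come from the slide; the helper names byte_offset and same_element are hypothetical, introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

int val[5] = { 1, 5, 2, 1, 3 };

/* Byte distance from the start of val to element i.
   Pointer arithmetic on an int* scales by sizeof(int). */
ptrdiff_t byte_offset(size_t i) {
    return (char *)(val + i) - (char *)val;
}

/* val[i] and *(val + i) name the same element:
   the identifier val is used here as an int* to element 0. */
int same_element(size_t i) {
    return &val[i] == val + i && val[i] == *(val + i);
}
```

For instance, assert(same_element(4)) holds, and byte_offset(2) equals 2 * sizeof(int), which is 8 when int is 4 bytes.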
Array Example
#define ZLEN 5
typedef int zip_dig[ZLEN];
zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };
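zip_dig cmu; 1 5 2 1 3 (elements at addresses 16, 20, 24, 28, 32; ends at 36)
zip_dig mit; 0 2 1 3 9 (elements at addresses 36, 40, 44, 48, 52; ends at 56)
zip_dig ucb; 9 4 7 2 0 (elements at addresses 56, 60, 64, 68, 72; ends at 76)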
int get_digit
(zip_dig z, int digit)
{
return z[digit];
}
x86-64
# %rdi = z
# %rsi = digit
movl (%rdi,%rsi,4), %eax # z[digit]
◼ Register %rdi contains starting address of array
◼ Register %rsi contains array index
◼ Desired digit at %rdi + 4*%rsi
◼ Use memory reference (%rdi,%rsi,4)
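To make the scaled-index memory reference concrete, the sketch below pairs get_digit with a second version that computes the address base + 4*index by hand. The name get_digit_by_address is hypothetical, and the constant 4 assumes sizeof(int) == 4, as on x86-64.

```c
#include <assert.h>
#include <stdint.h>

#define ZLEN 5
typedef int zip_dig[ZLEN];

static_assert(sizeof(int) == 4, "this sketch assumes 4-byte int");

/* Indexing form, as on the slide. */
int get_digit(zip_dig z, int digit) {
    return z[digit];
}

/* Hypothetical equivalent: compute the address the way the
   memory reference (%rdi,%rsi,4) does, base + 4*index. */
int get_digit_by_address(int *z, long digit) {
    uintptr_t addr = (uintptr_t)z + 4 * (uintptr_t)digit;
    return *(int *)addr;
}
```

Both functions should return the same value for any valid index, since the compiler performs the same scaling implicitly for z[digit].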
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 6
Carnegie Mellon
# %rdi = z
movl $0, %eax # i = 0
jmp .L3 # goto middle
.L4: # loop:
addl $1, (%rdi,%rax,4) # z[i]++
addq $1, %rax # i++
.L3: # middle
cmpq $4, %rax # i:4
jbe .L4 # if <=, goto loop
rep; ret
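The C source for this loop is not shown here. A plausible reconstruction, offered as an assumption rather than the original code, is a loop that increments every element of a zip_dig; the function name zincr is hypothetical. The unsigned comparison (jbe) against 4 suggests an unsigned index such as size_t.

```c
#include <assert.h>
#include <stddef.h>

#define ZLEN 5
typedef int zip_dig[ZLEN];

/* Assumed source for the assembly above: increment each element.
   i <= 4 in the assembly is the same test as i < ZLEN here. */
void zincr(zip_dig z) {
    size_t i;
    for (i = 0; i < ZLEN; i++)
        z[i]++;          /* addl $1, (%rdi,%rax,4) */
}
```

Calling zincr on an array holding 1 5 2 1 3 should leave it holding 2 6 3 2 4.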
int A[R][C];
▪ Row-major order: A[0][0] ... A[0][C-1], A[1][0] ... A[1][C-1], ..., A[R-1][0] ... A[R-1][C-1]
▪ Total size: 4*R*C bytes
zip_dig pgh[4];
▪ Rows: 1 5 2 0 6, 1 5 2 1 3, 1 5 2 1 7, 1 5 2 2 1
int A[R][C];
▪ Row A[0] starts at A
▪ Row A[i] starts at A+(i*C*4)
▪ Row A[R-1] starts at A+((R-1)*C*4)
int A[R][C];
▪ Row A[i] starts at A+(i*C*4)
▪ Element A[i][j] is at A+(i*C*4)+(j*4)
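The row and element address formulas can be verified against what the compiler actually does. This is a minimal sketch: the dimensions R = 4 and C = 5 are arbitrary choices for the example, and row_offset and elem_offset are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

#define R 4   /* example row count (assumption) */
#define C 5   /* example column count (assumption) */

int A[R][C];

/* Byte offset of row i from the start of A: i * C * sizeof(int). */
ptrdiff_t row_offset(size_t i) {
    return (char *)A[i] - (char *)A;
}

/* Byte offset of A[i][j]: i * C * sizeof(int) + j * sizeof(int). */
ptrdiff_t elem_offset(size_t i, size_t j) {
    return (char *)&A[i][j] - (char *)A;
}
```

With 4-byte int, row_offset(2) is 2*5*4 = 40 and elem_offset(2, 3) is 40 + 3*4 = 52, in agreement with A+(i*C*4)+(j*4).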
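univ: array of 3 pointers, starting at address 160
▪ univ[0] at 160 holds 36, pointing to mit
▪ univ[1] at 168 holds 16, pointing to cmu
▪ univ[2] at 176 holds 56, pointing to ucb
cmu: 1 5 2 1 3 (addresses 16 to 36)
mit: 0 2 1 3 9 (addresses 36 to 56)
ucb: 9 4 7 2 0 (addresses 56 to 76)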
Computation
▪ Element access Mem[Mem[univ+8*index]+4*digit]
▪ Must do two memory reads
▪ First get pointer to row array
▪ Then access element within array
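A sketch of the multi-level layout in C, using the three zip_dig arrays defined earlier and the pointer order implied by the addresses listed for univ (mit, cmu, ucb). The names UCOUNT and get_univ_digit are assumptions introduced for the example.

```c
#include <assert.h>
#include <stddef.h>

#define ZLEN 5
#define UCOUNT 3   /* assumed name for the number of rows */
typedef int zip_dig[ZLEN];

zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };

/* Each entry is an 8-byte pointer to a separately allocated row. */
int *univ[UCOUNT] = { mit, cmu, ucb };

/* Two memory reads: first the row pointer univ[index],
   then the int at that pointer plus 4*digit. */
int get_univ_digit(size_t index, size_t digit) {
    return univ[index][digit];
}
```

The source expression univ[index][digit] looks the same as a nested-array access, but the first subscript loads a pointer from memory instead of only adding a scaled offset.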
Nested array: Mem[pgh+20*index+4*digit]
Multi-level array: Mem[Mem[univ+8*index]+4*digit]
N X N Matrix Code
Fixed dimensions
▪ Know value of N at compile time
#define N 16
typedef int fix_matrix[N][N];
/* Get element a[i][j] */
int fix_ele(fix_matrix a, size_t i, size_t j)
{
return a[i][j];
}
Variable dimensions, explicit indexing
▪ Traditional way to implement dynamic arrays
#define IDX(n, i, j) ((i)*(n)+(j))
/* Get element a[i][j] */
int vec_ele(size_t n, int *a, size_t i, size_t j)
{
return a[IDX(n,i,j)];
}
16 X 16 Matrix Access
Array Elements
▪ Address A + i * (C * K) + j * K
▪ C = 16, K = 4
n X n Matrix Access
Array Elements
▪ Address A + i * (C * K) + j * K
▪ C = n, K = 4
▪ Must perform integer multiplication
/* Get element a[i][j] */
int var_ele(size_t n, int a[n][n], size_t i, size_t j)
{
return a[i][j];
}
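The sketch below puts vec_ele and var_ele side by side with a third, hypothetical function addr_ele that spells out the address formula A + i*(C*K) + j*K with C = n and K = sizeof(int). The multiplication i * (n * sizeof(int)) is the integer multiply that cannot be avoided when n is only known at run time. The code relies on C99 variable-length array parameters.

```c
#include <assert.h>
#include <stddef.h>

#define IDX(n, i, j) ((i)*(n)+(j))

/* Explicit indexing into a flat int array. */
int vec_ele(size_t n, int *a, size_t i, size_t j) {
    return a[IDX(n, i, j)];
}

/* Implicit indexing; the compiler generates the multiply by n. */
int var_ele(size_t n, int a[n][n], size_t i, size_t j) {
    return a[i][j];
}

/* Hypothetical helper: the same access as raw byte arithmetic,
   A + i*(C*K) + j*K with C = n and K = sizeof(int). */
int addr_ele(size_t n, const int *a, size_t i, size_t j) {
    const char *base = (const char *)a;
    return *(const int *)(base + i * (n * sizeof(int)) + j * sizeof(int));
}
```

All three should return the same element for the same n, i, and j when given the same underlying storage.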
Structure Representation
struct rec {
int a[4];
size_t i;
struct rec *next;
};
▪ Layout starting at address r: a at offset 0, i at offset 16, next at offset 24; total size 32 bytes
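The field offsets can be confirmed with offsetof. This is a sketch assuming an x86-64 system, where int is 4 bytes and size_t and pointers are 8 bytes; the helper elem_addr is a hypothetical name added to show how the address of a[idx] is formed.

```c
#include <assert.h>
#include <stddef.h>

struct rec {
    int a[4];           /* offset 0, 16 bytes */
    size_t i;           /* offset 16 on x86-64 */
    struct rec *next;   /* offset 24 on x86-64 */
};

/* Hypothetical helper: address of r->a[idx] is r + 4*idx,
   because a sits at offset 0 and each int is 4 bytes. */
int *elem_addr(struct rec *r, size_t idx) {
    return &r->a[idx];
}
```

On such a system, offsetof(struct rec, i) is 16, offsetof(struct rec, next) is 24, and sizeof(struct rec) is 32.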
Following Linked List
struct rec {
int a[4];
int i;
struct rec *next;
};
▪ Layout starting at address r: a at offset 0, i at offset 16, next at offset 24; total size 32 bytes
▪ r->a[i] is element i of a
C Code
void set_val
(struct rec *r, int val)
{
while (r) {
int i = r->i;
r->a[i] = val;
r = r->next;
}
}
Register Value
%rdi r
%rsi val
.L11: # loop:
movslq 16(%rdi), %rax # i = M[r+16]
movl %esi, (%rdi,%rax,4) # M[r+4*i] = val
movq 24(%rdi), %rdi # r = M[r+24]
testq %rdi, %rdi # Test r
jne .L11 # if !=0 goto loop
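To connect the assembly to the structure layout, here is a sketch of the same loop with each field access written as an explicit byte offset. The function name set_val_offsets is hypothetical, and the struct uses int a[4] so that i falls at offset 16 and next at offset 24, matching the displacements in the assembly; this assumes an x86-64 layout.

```c
#include <assert.h>
#include <stddef.h>

struct rec {
    int a[4];
    int i;
    struct rec *next;
};

/* Hypothetical rewrite of set_val using byte offsets,
   one statement per memory reference in the assembly. */
void set_val_offsets(struct rec *r, int val) {
    while (r) {
        char *base = (char *)r;
        int i = *(int *)(base + offsetof(struct rec, i));         /* M[r+16] */
        *(int *)(base + sizeof(int) * (size_t)i) = val;           /* M[r+4*i] */
        r = *(struct rec **)(base + offsetof(struct rec, next));  /* M[r+24] */
    }
}
```

For a two-node list whose nodes have i = 1 and i = 3, calling set_val_offsets with val = 7 should store 7 into a[1] of the first node and a[3] of the second.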
Alignment Principles
Aligned Data
▪ Primitive data type requires K bytes
▪ Address must be multiple of K
▪ Required on some machines; advised on x86-64
Motivation for Aligning Data
▪ Memory accessed by (aligned) chunks of 4 or 8 bytes (system
dependent)
▪ Inefficient to load or store datum that spans quad word
boundaries
▪ Virtual memory trickier when datum spans 2 pages
Compiler
▪ Inserts gaps in structure to ensure correct alignment of fields
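The gaps inserted by the compiler can be observed with offsetof. The struct below is a hypothetical example chosen for this sketch, and is_aligned is likewise an illustrative helper; the specific offsets in the usage note assume x86-64 (4-byte int, 8-byte double).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example structure for illustration. */
struct example {
    char c;      /* offset 0, followed by 3 bytes of padding */
    int i[2];    /* offset 4, must be a multiple of 4 */
    double v;    /* offset 16, must be a multiple of 8 */
};

/* Returns 1 if address p is a multiple of k. */
int is_aligned(const void *p, size_t k) {
    return (uintptr_t)p % k == 0;
}
```

On x86-64, offsetof(struct example, i) is 4 and offsetof(struct example, v) is 16, so 3 bytes of padding follow c and 4 bytes follow i. The total size is 24 bytes.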
Arrays of Structures
▪ Overall structure length multiple of K
▪ Satisfy alignment requirement for every element
struct S2 {
double v;
int i[2];
char c;
} a[10];
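A sketch showing why the structure length is rounded up to a multiple of K. The declaration is the one from this slide; elem_offset is a hypothetical helper, and the sizes quoted assume x86-64, where K = 8 for this struct because of the double field.

```c
#include <assert.h>
#include <stddef.h>

struct S2 {
    double v;    /* offset 0 */
    int i[2];    /* offset 8 */
    char c;      /* offset 16, followed by 7 bytes of padding */
} a[10];

/* Hypothetical helper: byte offset of a[idx] from the start of a.
   Equals idx * sizeof(struct S2). */
ptrdiff_t elem_offset(size_t idx) {
    return (char *)&a[idx] - (char *)a;
}
```

The fields occupy 17 bytes, but sizeof(struct S2) is 24 on x86-64, so every a[idx].v lands on a multiple of 8. The whole array takes 240 bytes.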
▪ Element layout: i, 2 bytes padding, v, j, 2 bytes padding
▪ Element idx starts at a+12*idx; field j is at a+12*idx+8
Saving Space
Put large data types first
struct S4 {
char c;
int i;
char d;
} *p;
struct S5 {
int i;
char c;
char d;
} *p;
Effect (K=4)
▪ S4: c, 3 bytes padding, i, d, 3 bytes padding (12 bytes)
▪ S5: i, c, d, 2 bytes padding (8 bytes)
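The space saving can be measured with sizeof. The sketch below declares the two structures from this slide without the pointer variable p; the helper bytes_saved is a hypothetical name, and the sizes assume a 4-byte int with 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>

struct S4 {
    char c;   /* offset 0, then 3 bytes of padding */
    int i;    /* offset 4 */
    char d;   /* offset 8, then 3 bytes of padding */
};

struct S5 {
    int i;    /* offset 0 */
    char c;   /* offset 4 */
    char d;   /* offset 5, then 2 bytes of padding */
};

/* Hypothetical helper: bytes saved per element by reordering. */
size_t bytes_saved(void) {
    return sizeof(struct S4) - sizeof(struct S5);
}
```

With these assumptions sizeof(struct S4) is 12 and sizeof(struct S5) is 8, so placing the int first saves 4 bytes per structure.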
Background
History
▪ x87 FP
▪ Legacy, very ugly
▪ SSE FP
▪ Supported by Shark machines
▪ Special case use of vector instructions
▪ AVX FP
▪ Newest version
▪ Similar to SSE
▪ Documented in book
An XMM register can hold:
◼ 8 16-bit integers
◼ 4 32-bit integers
◼ 4 single-precision floats
◼ 2 double-precision floats
◼ 1 single-precision float
◼ 1 double-precision float
◼ SIMD Operations: Single Precision
addps %xmm0,%xmm1 (four single-precision additions, one per lane; sums in %xmm1)
◼ Scalar Operations: Double Precision
addsd %xmm0,%xmm1 (one double-precision addition; sum in %xmm1)
FP Basics
Arguments passed in %xmm0, %xmm1, ...
Result returned in %xmm0
All XMM registers caller-saved
float fadd(float x, float y)
{
return x + y;
}
double dadd(double x, double y)
{
return x + y;
}
FP Memory Referencing
Integer (and pointer) arguments passed in regular registers
FP values passed in XMM registers
Different mov instructions to move between XMM registers,
and between memory and XMM registers
# p in %rdi, v in %xmm0
movapd %xmm0, %xmm1 # Copy v
movsd (%rdi), %xmm0 # x = *p
addsd %xmm0, %xmm1 # t = x + v
movsd %xmm1, (%rdi) # *p = t
ret
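The C source for this assembly is not shown here. A function consistent with the comments (p in %rdi, v in %xmm0, old value of *p left in %xmm0 as the return value) could look like the sketch below; the name dincr is an assumption.

```c
#include <assert.h>

/* Assumed source for the assembly above: add v to *p
   and return the value *p held before the update. */
double dincr(double *p, double v) {
    double x = *p;   /* movsd (%rdi), %xmm0 */
    *p = x + v;      /* addsd, then movsd %xmm1, (%rdi) */
    return x;        /* x is already in %xmm0 */
}
```

The pointer argument travels in an integer register while v and the result use XMM registers, which is why the copy of v to %xmm1 is needed before %xmm0 is overwritten with x.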
Summary
Arrays
▪ Elements packed into contiguous region of memory
▪ Use index arithmetic to locate individual elements
Structures
▪ Elements packed into single region of memory
▪ Access using offsets determined by compiler
▪ Possibly require internal and external padding to ensure alignment
Combinations
▪ Can nest structure and array code arbitrarily
Floating Point
▪ Data held and operated on in XMM registers