Carnegie Mellon

Machine-Level Programming IV: Data
15-213: Introduction to Computer Systems
8th Lecture, Sep. 24, 2015

Instructors:
Randal E. Bryant and David R. O’Hallaron

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 1



Today
◼ Arrays
▪ One-dimensional
▪ Multi-dimensional (nested)
▪ Multi-level
◼ Structures
▪ Allocation
▪ Access
▪ Alignment
◼ Floating Point


Array Allocation
◼ Basic Principle
T A[L];
▪ Array of data type T and length L
▪ Contiguously allocated region of L * sizeof(T) bytes in memory

char string[12];   occupies x to x + 12
int val[5];        elements at x, x + 4, x + 8, x + 12, x + 16 (ends at x + 20)
double a[3];       elements at x, x + 8, x + 16 (ends at x + 24)
char *p[3];        elements at x, x + 8, x + 16 (ends at x + 24)


Array Access
◼ Basic Principle
T A[L];
▪ Array of data type T and length L
▪ Identifier A can be used as a pointer to array element 0: Type T*

int val[5];   1 5 2 1 3   at addresses x, x + 4, x + 8, x + 12, x + 16 (ends at x + 20)

◼ Reference   Type    Value
  val[4]      int     3
  val         int *   x
  val+1       int *   x + 4
  &val[2]     int *   x + 8
  val[5]      int     ??
  *(val+1)    int     5
  val + i     int *   x + 4i

Array Example
#define ZLEN 5
typedef int zip_dig[ZLEN];

zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };

zip_dig cmu;   1 5 2 1 3   at addresses 16, 20, 24, 28, 32 (ends at 36)
zip_dig mit;   0 2 1 3 9   at addresses 36, 40, 44, 48, 52 (ends at 56)
zip_dig ucb;   9 4 7 2 0   at addresses 56, 60, 64, 68, 72 (ends at 76)

◼ Declaration “zip_dig cmu” equivalent to “int cmu[5]”
◼ Example arrays were allocated in successive 20 byte blocks
▪ Not guaranteed to happen in general

Array Accessing Example


zip_dig cmu;   1 5 2 1 3   at addresses 16, 20, 24, 28, 32 (ends at 36)

int get_digit
(zip_dig z, int digit)
{
    return z[digit];
}

x86-64
# %rdi = z
# %rsi = digit
movl (%rdi,%rsi,4), %eax # z[digit]

◼ Register %rdi contains starting address of array
◼ Register %rsi contains array index
◼ Desired digit at %rdi + 4*%rsi
◼ Use memory reference (%rdi,%rsi,4)

Array Loop Example


void zincr(zip_dig z) {
size_t i;
for (i = 0; i < ZLEN; i++)
z[i]++;
}

# %rdi = z
movl $0, %eax # i = 0
jmp .L3 # goto middle
.L4: # loop:
addl $1, (%rdi,%rax,4) # z[i]++
addq $1, %rax # i++
.L3: # middle
cmpq $4, %rax # i:4
jbe .L4 # if <=, goto loop
rep; ret


Multidimensional (Nested) Arrays

◼ Declaration
T A[R][C];
▪ 2D array of data type T
▪ R rows, C columns
▪ Type T element requires K bytes

A[0][0]    • • •   A[0][C-1]
   •                  •
   •                  •
A[R-1][0]  • • •   A[R-1][C-1]

◼ Array Size
▪ R * C * K bytes
◼ Arrangement
▪ Row-Major Ordering

int A[R][C];
A[0][0] • • • A[0][C-1] | A[1][0] • • • A[1][C-1] | • • • | A[R-1][0] • • • A[R-1][C-1]
(4*R*C bytes in total)

Nested Array Example


#define PCOUNT 4
zip_dig pgh[PCOUNT] =
{{1, 5, 2, 0, 6},
{1, 5, 2, 1, 3 },
{1, 5, 2, 1, 7 },
{1, 5, 2, 2, 1 }};

zip_dig pgh[4];
pgh[0] at 76:    1 5 2 0 6
pgh[1] at 96:    1 5 2 1 3
pgh[2] at 116:   1 5 2 1 7
pgh[3] at 136:   1 5 2 2 1   (array ends at 156)

◼ “zip_dig pgh[4]” equivalent to “int pgh[4][5]”
▪ Variable pgh: array of 4 elements, allocated contiguously
▪ Each element is an array of 5 int’s, allocated contiguously
◼ “Row-Major” ordering of all elements in memory

Nested Array Row Access

◼ Row Vectors
▪ A[i] is array of C elements
▪ Each element of type T requires K bytes
▪ Starting address A + i * (C * K)

int A[R][C];
Row A[0]:     A[0][0] • • • A[0][C-1],       starting at A
Row A[i]:     A[i][0] • • • A[i][C-1],       starting at A+(i*C*4)
Row A[R-1]:   A[R-1][0] • • • A[R-1][C-1],   starting at A+((R-1)*C*4)


Nested Array Element Access

◼ Array Elements
▪ A[i][j] is element of type T, which requires K bytes
▪ Address A + i * (C * K) + j * K = A + (i * C + j) * K

int A[R][C];
Row A[0]:     A[0][0] • • • A[0][C-1],                 starting at A
Row A[i]:     A[i][0] • • • A[i][j] • • • A[i][C-1],   starting at A+(i*C*4)
Row A[R-1]:   A[R-1][0] • • • A[R-1][C-1],             starting at A+((R-1)*C*4)
Element A[i][j] is at A+(i*C*4)+(j*4)

Multi-Level Array Example


zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };
#define UCOUNT 3
int *univ[UCOUNT] = {mit, cmu, ucb};

◼ Variable univ denotes array of 3 elements
◼ Each element is a pointer
▪ 8 bytes
◼ Each pointer points to array of int’s

univ[0] at 160 holds 36 (points to mit)
univ[1] at 168 holds 16 (points to cmu)
univ[2] at 176 holds 56 (points to ucb)

cmu:   1 5 2 1 3   at addresses 16, 20, 24, 28, 32 (ends at 36)
mit:   0 2 1 3 9   at addresses 36, 40, 44, 48, 52 (ends at 56)
ucb:   9 4 7 2 0   at addresses 56, 60, 64, 68, 72 (ends at 76)


Element Access in Multi-Level Array


int get_univ_digit
(size_t index, size_t digit)
{
return univ[index][digit];
}

salq $2, %rsi # 4*digit
addq univ(,%rdi,8), %rsi # p = univ[index] + 4*digit
movl (%rsi), %eax # return *p
ret

◼ Computation
▪ Element access Mem[Mem[univ+8*index]+4*digit]
▪ Must do two memory reads
  ▪ First get pointer to row array
  ▪ Then access element within array

Array Element Accesses


Nested array
int get_pgh_digit
(size_t index, size_t digit)
{
    return pgh[index][digit];
}

Multi-level array
int get_univ_digit
(size_t index, size_t digit)
{
    return univ[index][digit];
}

Accesses look similar in C, but the address computations are very different:

Nested array:        Mem[pgh+20*index+4*digit]
Multi-level array:   Mem[Mem[univ+8*index]+4*digit]


N X N Matrix Code

◼ Fixed dimensions
▪ Know value of N at compile time

#define N 16
typedef int fix_matrix[N][N];
/* Get element a[i][j] */
int fix_ele(fix_matrix a, size_t i, size_t j)
{
    return a[i][j];
}

◼ Variable dimensions, explicit indexing
▪ Traditional way to implement dynamic arrays

#define IDX(n, i, j) ((i)*(n)+(j))
/* Get element a[i][j] */
int vec_ele(size_t n, int *a, size_t i, size_t j)
{
    return a[IDX(n,i,j)];
}

◼ Variable dimensions, implicit indexing
▪ Now supported by gcc

/* Get element a[i][j] */
int var_ele(size_t n, int a[n][n], size_t i, size_t j) {
    return a[i][j];
}

16 X 16 Matrix Access
◼ Array Elements
▪ Address A + i * (C * K) + j * K
▪ C = 16, K = 4

/* Get element a[i][j] */


int fix_ele(fix_matrix a, size_t i, size_t j) {
return a[i][j];
}

# a in %rdi, i in %rsi, j in %rdx


salq $6, %rsi # 64*i
addq %rsi, %rdi # a + 64*i
movl (%rdi,%rdx,4), %eax # M[a + 64*i + 4*j]
ret


n X n Matrix Access
◼ Array Elements
▪ Address A + i * (C * K) + j * K
▪ C = n, K = 4
▪ Must perform integer multiplication
/* Get element a[i][j] */
int var_ele(size_t n, int a[n][n], size_t i, size_t j)
{
return a[i][j];
}

# n in %rdi, a in %rsi, i in %rdx, j in %rcx


imulq %rdx, %rdi # n*i
leaq (%rsi,%rdi,4), %rax # a + 4*n*i
movl (%rax,%rcx,4), %eax # a + 4*n*i + 4*j
ret


Today
◼ Arrays
▪ One-dimensional
▪ Multi-dimensional (nested)
▪ Multi-level
◼ Structures
▪ Allocation
▪ Access
▪ Alignment
◼ Floating Point


Structure Representation

struct rec {
    int a[4];
    size_t i;
    struct rec *next;
};

Layout of r:   a at offset 0, i at offset 16, next at offset 24 (32 bytes total)

◼ Structure represented as block of memory
▪ Big enough to hold all of the fields
◼ Fields ordered according to declaration
▪ Even if another ordering could yield a more compact representation
◼ Compiler determines overall size + positions of fields
▪ Machine-level program has no understanding of the structures in the source code


Generating Pointer to Structure Member

struct rec {
    int a[4];
    size_t i;
    struct rec *next;
};

Layout of r:   a at offset 0, i at offset 16, next at offset 24 (32 bytes total)
Element a[idx] is at r+4*idx

◼ Generating Pointer to Array Element
▪ Offset of each structure member determined at compile time
▪ Compute as r + 4*idx

int *get_ap
(struct rec *r, size_t idx)
{
    return &r->a[idx];
}

# r in %rdi, idx in %rsi
leaq (%rdi,%rsi,4), %rax
ret


Following Linked List

struct rec {
    int a[4];
    int i;
    struct rec *next;
};

Layout of r:   a at offset 0, i at offset 16 (followed by 4 bytes of padding), next at offset 24 (32 bytes total)
Element a[i] is at r+4*i

◼ C Code
void set_val
(struct rec *r, int val)
{
    while (r) {
        int i = r->i;
        r->a[i] = val;
        r = r->next;
    }
}

Register   Value
%rdi       r
%rsi       val

.L11: # loop:
movslq 16(%rdi), %rax # i = M[r+16]
movl %esi, (%rdi,%rax,4) # M[r+4*i] = val
movq 24(%rdi), %rdi # r = M[r+24]
testq %rdi, %rdi # Test r
jne .L11 # if !=0 goto loop

Structures & Alignment

struct S1 {
    char c;
    int i[2];
    double v;
} *p;

◼ Unaligned Data
c at p, i[0] at p+1, i[1] at p+5, v at p+9 (ends at p+17)

◼ Aligned Data
▪ Primitive data type requires K bytes
▪ Address must be multiple of K

c at p+0, 3 bytes padding, i[0] at p+4, i[1] at p+8, 4 bytes padding, v at p+16 (ends at p+24)
p+4 is a multiple of 4; p+0, p+16, and p+24 are multiples of 8

Alignment Principles
◼ Aligned Data
▪ Primitive data type requires K bytes
▪ Address must be multiple of K
▪ Required on some machines; advised on x86-64
◼ Motivation for Aligning Data
▪ Memory accessed by (aligned) chunks of 4 or 8 bytes (system dependent)
  ▪ Inefficient to load or store datum that spans quad word boundaries
  ▪ Virtual memory trickier when datum spans 2 pages
◼ Compiler
▪ Inserts gaps in structure to ensure correct alignment of fields


Specific Cases of Alignment (x86-64)

◼ 1 byte: char, …
▪ no restrictions on address
◼ 2 bytes: short, …
▪ lowest 1 bit of address must be 0₂
◼ 4 bytes: int, float, …
▪ lowest 2 bits of address must be 00₂
◼ 8 bytes: double, long, char *, …
▪ lowest 3 bits of address must be 000₂
◼ 16 bytes: long double (GCC on Linux)
▪ lowest 4 bits of address must be 0000₂


Satisfying Alignment with Structures

struct S1 {
    char c;
    int i[2];
    double v;
} *p;

◼ Within structure:
▪ Must satisfy each element’s alignment requirement
◼ Overall structure placement
▪ Each structure has alignment requirement K
  ▪ K = Largest alignment of any element
▪ Initial address & structure length must be multiples of K
◼ Example:
▪ K = 8, due to double element

c at p+0, 3 bytes padding, i[0] at p+4, i[1] at p+8, 4 bytes padding, v at p+16 (ends at p+24)
p+4 is a multiple of 4; p+0, p+16, and p+24 are multiples of 8

Meeting Overall Alignment Requirement

struct S2 {
    double v;
    int i[2];
    char c;
} *p;

◼ For largest alignment requirement K
◼ Overall structure must be multiple of K

v at p+0, i[0] at p+8, i[1] at p+12, c at p+16, 7 bytes padding (ends at p+24)
p+24 is a multiple of K=8


Arrays of Structures

struct S2 {
    double v;
    int i[2];
    char c;
} a[10];

◼ Overall structure length multiple of K
◼ Satisfy alignment requirement for every element

a[0] at a+0, a[1] at a+24, a[2] at a+48, • • • (a[3] at a+72)
Within a[1]: v at a+24, i[0] at a+32, i[1] at a+36, c at a+40, 7 bytes padding (ends at a+48)

Accessing Array Elements

struct S3 {
    short i;
    float v;
    short j;
} a[10];

◼ Compute array offset 12*idx
▪ sizeof(S3), including alignment spacers
◼ Element j is at offset 8 within structure
◼ Assembler gives offset a+8
▪ Resolved during linking

a[0] at a+0, a[1] at a+12, • • •, a[idx] at a+12*idx
Within a[idx]: i at a+12*idx, 2 bytes padding, v at a+12*idx+4, j at a+12*idx+8, 2 bytes padding

short get_j(int idx)
{
    return a[idx].j;
}

# %rdi = idx
leaq (%rdi,%rdi,2),%rax # 3*idx
movzwl a+8(,%rax,4),%eax

Saving Space
◼ Put large data types first

struct S4 {
    char c;
    int i;
    char d;
} *p;

struct S5 {
    int i;
    char c;
    char d;
} *p;

◼ Effect (K=4)
S4: c, 3 bytes padding, i, d, 3 bytes padding (12 bytes)
S5: i, c, d, 2 bytes padding (8 bytes)


Today
◼ Arrays
▪ One-dimensional
▪ Multi-dimensional (nested)
▪ Multi-level
◼ Structures
▪ Allocation
▪ Access
▪ Alignment
◼ Floating Point


Background
◼ History
▪ x87 FP
  ▪ Legacy, very ugly
▪ SSE FP
  ▪ Supported by Shark machines
  ▪ Special case use of vector instructions
▪ AVX FP
  ▪ Newest version
  ▪ Similar to SSE
  ▪ Documented in book


Programming with SSE3


XMM Registers
◼ 16 total, each 16 bytes
◼ 16 single-byte integers
◼ 8 16-bit integers
◼ 4 32-bit integers
◼ 4 single-precision floats
◼ 2 double-precision floats
◼ 1 single-precision float
◼ 1 double-precision float


Scalar & SIMD Operations

◼ Scalar Operations: Single Precision
addss %xmm0,%xmm1
[Diagram: the low single-precision element of %xmm0 is added to the low element of %xmm1]

◼ SIMD Operations: Single Precision
addps %xmm0,%xmm1
[Diagram: the four single-precision elements of %xmm0 are added element-wise to those of %xmm1]

◼ Scalar Operations: Double Precision
addsd %xmm0,%xmm1
[Diagram: the low double-precision element of %xmm0 is added to the low element of %xmm1]

FP Basics
◼ Arguments passed in %xmm0, %xmm1, ...
◼ Result returned in %xmm0
◼ All XMM registers caller-saved

float fadd(float x, float y)
{
    return x + y;
}

# x in %xmm0, y in %xmm1
addss %xmm1, %xmm0
ret

double dadd(double x, double y)
{
    return x + y;
}

# x in %xmm0, y in %xmm1
addsd %xmm1, %xmm0
ret


FP Memory Referencing
◼ Integer (and pointer) arguments passed in regular registers
◼ FP values passed in XMM registers
◼ Different mov instructions to move between XMM registers, and between memory and XMM registers

double dincr(double *p, double v)
{
    double x = *p;
    *p = x + v;
    return x;
}

# p in %rdi, v in %xmm0
movapd %xmm0, %xmm1 # Copy v
movsd (%rdi), %xmm0 # x = *p
addsd %xmm0, %xmm1 # t = x + v
movsd %xmm1, (%rdi) # *p = t
ret

Other Aspects of FP Code

◼ Lots of instructions
▪ Different operations, different formats, ...
◼ Floating-point comparisons
▪ Instructions ucomiss and ucomisd
▪ Set condition codes CF, ZF, and PF
◼ Using constant values
▪ Set XMM0 register to 0 with instruction xorpd %xmm0, %xmm0
▪ Others loaded from memory


Summary
◼ Arrays
▪ Elements packed into contiguous region of memory
▪ Use index arithmetic to locate individual elements
◼ Structures
▪ Elements packed into single region of memory
▪ Access using offsets determined by compiler
▪ Possibly require internal and external padding to ensure alignment
◼ Combinations
▪ Can nest structure and array code arbitrarily
◼ Floating Point
▪ Data held and operated on in XMM registers
