10 - External Data
Readings and Exercises
• P & H: Sections 2.9 and 2.14
Objectives
At the end of this section, you will be able to:
1. Declare different variable types
2. Use the .data and .bss declaration sections
3. Define and use constants in the .text section
4. Use command-line arguments
Variable Types in C
1. Local (automatic)
▪ Stored in stack memory
2. Global
3. Static
▪ Local or Global
• Global and static variables are stored in a separate
section of RAM
▪ Not on the stack
Local Variables
• In the C language, local (automatic) variables are
always allocated in the stack frame for a function
▪ Scope: local to the block of code where declared
▪ Lifetime: life of the block of code
int count()
{
    int value = 0;
    return ++value;
}
Global Variables
• Scope: global (from declaration onwards across files)
• Lifetime: life of program
• Declared before main()
• Not on the stack
• Implicitly 0 initialized
int val;            // global variable
int main()
{
    val = 3;
    . . .
}
int f()
{
    int a;
    a = val;
    . . .
}
Static Variables
• Persist between function calls
• Not stack variables
• Initialized implicitly to 0
▪ Stack variables are not
• Initialized explicitly using constants
▪ Not allowed: static int i = myrand();
• A structure member cannot be declared static on its
own: the whole structure variable must be static
▪ Structure must be in one place in memory
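The rules above can be sketched in C. This is a minimal illustration rather than code from the course: `compute_start()` is a hypothetical stand-in for a value known only at run time (such as `myrand()`), and `next_ticket()` is an invented name. Because a static initializer must be a constant, the run-time value is assigned on the first call instead, guarded by a flag that relies on implicit zero initialization.

```c
/* Hypothetical stand-in for a value only known at run time (e.g. myrand()). */
int compute_start(void)
{
    return 10;
}

int next_ticket(void)
{
    static int ticket;          /* implicitly initialized to 0 */
    static int started;         /* implicitly initialized to 0 */
    static int step = 2 - 1;    /* constant expression: allowed */
    /* static int bad = compute_start();   not allowed: not a constant */
    int current;

    if (!started) {             /* true on the first call only */
        ticket = compute_start();
        started = 1;
    }
    current = ticket;
    ticket += step;             /* persists until the next call */
    return current;
}
```

Successive calls to `next_ticket()` continue from the stored value, since none of the three static variables live on the stack.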
Static Variables Example
• Variable count is non-static
int f()
{
    int count = 0;
    count++;
    return count;
}
int main()
{
    printf("%d ", f());
    printf("%d ", f());
}
Static Variables (cont’d)
• Variable count is static
• Initialized only once
▪ Set up before the program starts running, not re-initialized
on each call to f()
int f()
{
    static int count = 0;
    count++;
    return count;
}
int main()
{
    printf("%d ", f());
    printf("%d ", f());
}
Static Local Variables
• Scope: block of code where declared
• Lifetime: life of the program
▪ Persist from call to call of the function
int count()
{
    static int count = 0;
    return ++count;
}
Static Global Variables
• Scope: local to file (from declaration onwards)
• Lifetime: life of program
static int val;
int main()
{
    val = 3;
    . . .
}
Global versus Static
• Global variables can be accessed from other files
• Static global variables cannot: they are visible only in
the file where they are declared
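A short C sketch of the difference, using invented names (`shared_total`, `private_calls`, `add_to_total`, `calls_so_far`). Both variables live for the whole program; only the non-static one can be named from another source file.

```c
int shared_total = 0;            /* global: another file can reach it with
                                    "extern int shared_total;" */
static int private_calls = 0;    /* static global: visible in this file only */

int add_to_total(int n)
{
    private_calls++;
    shared_total += n;
    return shared_total;
}

/* Other files can only learn the call count through this function. */
int calls_so_far(void)
{
    return private_calls;
}
```

Exposing `private_calls` through a function is one common way to keep a file-local variable from being modified elsewhere.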
The text, data, and .bss Sections
• Programs may allocate 3 sections of memory:
▪ text
• Contains:
▪ Program text (machine code)
▪ Read-only, programmer-initialized data
• Is read-only memory
▪ Attempts to write to this memory cause a segmentation fault
▪ data
• Contains programmer-initialized data
• Is read/write memory
The text, data, and .bss Sections (cont’d)
▪ bss (block started by symbol)
• Contains zero-initialized data
• Is read/write memory
• These sections are located in low memory, just
after the section reserved for the OS kernel
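As a rough guide, the three kinds of data described above correspond to C declarations like the ones below. The names are invented for illustration, and the exact placement is an assumption: it depends on the compiler and linker, and many toolchains keep read-only data in its own read-only section next to the code.

```c
int start_value = 42;              /* programmer-initialized, read/write: data */
int scratch[8];                    /* zero-initialized, read/write: bss */
const char banner[] = "ready";     /* programmer-initialized, read-only */

/* Sums the bss array; every element starts at 0. */
int scratch_sum(void)
{
    int i, sum = 0;

    for (i = 0; i < 8; i++) {
        sum += scratch[i];
    }
    return sum;
}

/* Writing to the data section is allowed. */
int bump_start(void)
{
    start_value++;
    return start_value;
}
```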
The text, data, and .bss Sections (cont’d)
low    OS
       text            <- PC
       data
       bss
       Heap
       Free memory
       Stack           <- SP, FP
high
The text, data, and .bss Sections (cont’d)
• Pseudo-ops are used to indicate that what follows
goes into a particular section
▪ .text
• Is the default section when assembling
▪ .data
▪ .bss
The text, data, and .bss Sections (cont’d)
• The assembler uses a location counter for each
section
▪ Starts at 0, and increases as instructions and data are
processed
▪ The final step of assembly gathers all code and data
into the appropriate sections
• When the OS loads the program into RAM:
▪ The text and data sections are loaded first
▪ The bss section is then zeroed
External Variables
• Are non-local variables, allocated in the data or
bss sections
▪ Are used to implement C language global and static
local variables
• Can be allocated and initialized using the pseudo-
ops: .dword .word .hword .byte
▪ General form:
label: pseudo-op value1[, value2, . . .]
.data Section
• Eg:
         .data
a_m:     .hword 23
b_m:     .word  (11 * 4) - 2
c_m:     .dword 0
array_m: .byte  10, 20, 30
▪ Resulting memory contents:
a_m:     23
b_m:     42
c_m:     0
array_m: 10 20 30
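For comparison, the declarations above could be written in C roughly as follows. This is a sketch: the fixed-width types are chosen here to match the pseudo-op sizes, and `data_bytes()` is an invented helper. It counts payload bytes only; a C compiler may add alignment padding between the objects.

```c
#include <stddef.h>
#include <stdint.h>

int16_t a_m = 23;                   /* .hword: 2 bytes */
int32_t b_m = (11 * 4) - 2;         /* .word:  4 bytes, value 42 */
int64_t c_m = 0;                    /* .dword: 8 bytes */
int8_t  array_m[] = {10, 20, 30};   /* .byte:  1 byte per element */

/* Total bytes of initialized data, ignoring any padding. */
size_t data_bytes(void)
{
    return sizeof a_m + sizeof b_m + sizeof c_m + sizeof array_m;
}
```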
.data Section (cont’d)
• The labels represent 64-bit addresses
▪ Use adrp and add to put the address into a register
▪ Then use ldr or str to access the variable
• Eg: C code
int i = 2, j = 12, k = 0;
int main()
{
k = i + j;
. . .
}
.data Section (cont’d)
▪ C code:
int i = 2, j = 12, k = 0;
int main()
{
    k = i + j;
    . . .
}
▪ Assembly code:
        .data
i_m:    .word 2
j_m:    .word 12
k_m:    .word 0
        .text
        .balign 4
        .global main
main:   stp x29, x30, [sp, -16]!
        mov x29, sp
.data Section (cont’d)
int i = 2, j = 12, k = 0;
int main()
{
    k = i + j;
    . . .
}
        adrp x19, j_m
        add x19, x19, :lo12:j_m
        ldr w21, [x19]              // w21 = j
        . . .
Note on ADRP (Address of Page)
• Shifts a signed, 21-bit immediate left by 12 bits, adds it
to the value of PC with the bottom 12 bits cleared to
zero, and then writes the result to a general-purpose
register.
• This permits the calculation of the address of a 4KB-aligned
memory region.
• In conjunction with an ADD (immediate) instruction,
this allows for the calculation of any address within
±4GB of the current PC.
▪ Eg:
        adrp x19, i_m
        add x19, x19, :lo12:i_m
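The arithmetic behind the adrp and add pair can be worked through in C. This sketch only models the address calculation described above; the function names and the sample addresses in the usage note are invented, and it does not check that the page count actually fits in 21 bits.

```c
#include <stdint.h>

#define PAGE_BYTES UINT64_C(4096)         /* 2^12 bytes per page */

/* Address with its bottom 12 bits cleared to zero. */
uint64_t page_of(uint64_t addr)
{
    return addr & ~(PAGE_BYTES - 1);
}

/* Signed page count the assembler would encode in adrp for this PC and target. */
int64_t adrp_immediate(uint64_t pc, uint64_t target)
{
    return (int64_t)(page_of(target) - page_of(pc)) / (int64_t)PAGE_BYTES;
}

/* The :lo12: part: the target's offset within its 4 KB page. */
uint64_t lo12(uint64_t target)
{
    return target & (PAGE_BYTES - 1);
}

/* Register contents after adrp followed by add. */
uint64_t adrp_then_add(uint64_t pc, int64_t imm, uint64_t low)
{
    uint64_t page = page_of(pc) + (uint64_t)imm * PAGE_BYTES;   /* adrp */
    return page + low;                                          /* add  */
}
```

For example, with the PC at 0x400123 and a variable at 0x412ABC, the page count is 18 and the low 12 bits are 0xABC, so the pair reconstructs 0x400000 + 18 * 4096 + 0xABC = 0x412ABC.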
.bss Section
• Uninitialized space can be allocated with the
.skip pseudo-op
▪ Eg: 10 element int array
myarray: .skip 10 * 4
.bss Section (cont’d)
• The bss section usually only uses the .skip
pseudo-op
▪ All bss memory is zeroed before program execution
• Initializing memory to non-zero values (with .word,
.hword, etc.) doesn’t make sense
▪ Eg:
.bss
array_m: .skip 10 * 4 // int array
c_m: .skip 1 // char
h_m: .skip 2 // short int
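The operand of .skip is simply a byte count, which in C terms is the size of the object being reserved. A small sketch under that reading, with fixed-width types chosen here to match the sizes in the comments and invented helper names:

```c
#include <stddef.h>
#include <stdint.h>

static int32_t array_m[10];   /* .skip 10 * 4 */
static char    c_m;           /* .skip 1 */
static int16_t h_m;           /* .skip 2 */

size_t array_bytes(void) { return sizeof array_m; }
size_t char_bytes(void)  { return sizeof c_m; }
size_t short_bytes(void) { return sizeof h_m; }
```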
Constants in the .text Section
• Programmer-initialized constants are put into the
text section
▪ Must be before or between functions
▪ Eg:
        .text
        .balign 4
func1:  stp . . .
        . . .
        ret
The ASCII Character Set
• American Standard Code for Information
Interchange
▪ Encodes characters using 7 bits, stored in a byte
The ASCII Character Set (cont’d)
• In assembly, character constants can be denoted
with:
▪ The hex code
• mov w19, 0x5A
▪ The character in single quotes
• mov w19, 'Z'
• Note: the quote characters may interfere with m4, which
uses ` and ' as its own quote delimiters
▪ Note: in gdb, use p/c $w19 to print register
contents as a character
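The same equivalence holds in C, where a character constant is just a small integer. A minimal check, assuming the compiler uses ASCII for its character set (true on the systems this course targets, but not guaranteed by the C standard); both function names are invented:

```c
/* Nonzero if the quoted character and the hex code are the same value. */
int hex_and_quote_agree(void)
{
    return 'Z' == 0x5A;
}

/* Nonzero if c fits in the 7 bits that ASCII uses. */
int is_ascii(int c)
{
    return (c & ~0x7F) == 0;
}
```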
Creating and Addressing String Literals
• A string is an array of characters
• Could be initialized in memory one byte at a time
▪ Eg: "cheers"
.byte 'c', 'h', 'e', 'e', 'r', 's'
Creating and Addressing String Literals (cont’d)
• In C, strings are null terminated
▪ Could be done using two pseudo-ops:
.ascii "cheers"
.byte 0
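In C the terminator shows up as the difference between a string's length and its storage. The sketch below spells out the same seven bytes that the two pseudo-ops produce; the helper names are invented.

```c
#include <stddef.h>
#include <string.h>

/* Same bytes as:  .ascii "cheers"  followed by  .byte 0 */
const char cheers[] = {'c', 'h', 'e', 'e', 'r', 's', 0};

/* strlen counts characters up to, but not including, the 0 byte. */
size_t cheers_length(void)
{
    return strlen(cheers);
}

/* sizeof counts every byte of storage, terminator included. */
size_t cheers_storage(void)
{
    return sizeof cheers;
}
```

Without the final 0 byte, `strlen` would keep reading past the end of the array until it happened to find a zero.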
Creating and Addressing String Literals (cont’d)
• A string literal is a read-only array of characters,
allocated in the text section
▪ In C code, is delimited with " . . . "
▪ Eg:
int main()
{
    printf("Hello, world!\n");    // "Hello, world!\n" is the string literal
}
Creating and Addressing String Literals (cont’d)
.text
fmt: .string "Hello, world!\n" // string literal
.balign 4
.global main
main: stp x29, x30, [sp, -16]!
mov x29, sp
External Arrays of Pointers
• Created with a list of labels
• C code
#include <stdio.h>
int main()
{
    register int i;
    return 0;
}
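As an illustration of an external array of pointers in C, the sketch below builds one from a list of string literals. The four strings are placeholders assumed for this example, not taken from the listing above, and `format_season()` is an invented helper; only the format string matches the one used in the assembly version.

```c
#include <stdio.h>

/* Each element is a pointer to a string literal stored elsewhere.
   The strings are placeholders chosen for illustration. */
const char *season[] = {"spring", "summer", "fall", "winter"};

/* Formats entry i into buf and returns the character count reported by snprintf. */
int format_season(char *buf, size_t size, int i)
{
    return snprintf(buf, size, "season[%d] = %s\n", i, season[i]);
}
```

The array itself holds only addresses, which is why the assembly version can be written as a list of labels.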
External Arrays of Pointers (cont’d)
▪ Assembly code:
define(i_r, w19)
define(base_r, x20)
.text
fmt: .string "season[%d] = %s\n"
External Arrays of Pointers (cont’d)
.text
.balign 4
.global main
main: stp x29, x30, [sp, -16]!
mov x29, sp
mov i_r, 0
b test
bl printf
External Arrays of Pointers (cont’d)
add i_r, i_r, 1
test: cmp i_r, 4
b.lt top
Command-Line Arguments
• Allow you to pass values from the shell into your
program
• In C: main(int argc, char *argv[])
▪ argc: the number of arguments
▪ argv[]: an array of pointers to the arguments
(represented as strings)
Command-Line Arguments (cont’d)
▪ C code: myecho.c
#include <stdio.h>
int main(int argc, char *argv[])
{
    register int i;
    for (i = 0; i < argc; i++) {
        printf("%s\n", argv[i]);
    }
    return 0;
}
▪ Sample run:
prompt> ./myecho one two
./myecho
one
two
prompt>
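Because every argument arrives as a string, numeric input has to be converted before use. A minimal sketch with an invented function name, `sum_args()`, which a program's main() could call with its own argc and argv:

```c
#include <stdlib.h>

/* Adds up argv[1] .. argv[argc - 1], reading each as a decimal integer. */
long sum_args(int argc, char *argv[])
{
    long total = 0;
    int i;

    for (i = 1; i < argc; i++) {               /* argv[0] is the program name */
        total += strtol(argv[i], NULL, 10);    /* non-numeric text counts as 0 */
    }
    return total;
}
```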
Command-Line Arguments (cont’d)
▪ Assembly code:
• Note: argc is in w0 and argv[] is in x1 for main()
define(i_r, w19)
define(argc_r, w20)
define(argv_r, x21)
.balign 4
.global main
main: stp x29, x30, [sp, -16]!
mov x29, sp
Command-Line Arguments (cont’d)
mov i_r, 0 // i = 0
b test