2020fa CS61C 2020fa Module 2 C PDF
UC Berkeley Teaching Professor Dan Garcia
Computer Architecture (a.k.a. Machine Structures)
UC Berkeley Professor Bora Nikolić
Introduction to the
C Programming Language
Garcia, Nikolić
cs61c.org
ENIAC (U Penn, 1946)
§ First Electronic General-
Purpose Computer
§ Blazingly fast
ú Multiply in 2.8ms!
ú 10 decimal digits x 10
decimal digits
§ But needed 2-3 days to
set up a new program
§ Programmed with patch
cords and switches
ú At that time & before,
"computer" mostly referred
to people who did
calculations
Garcia, Nikolić
Introduction to C (3)
EDSAC (Cambridge, 1949)
§ First General Stored-
Program Computer
§ Programs held as
numbers in memory
ú This is the revolution:
It isn't just programmable,
but the program is just the
same type of data that the
computer computes on
ú Bits are not just the
numbers being
manipulated, but the
instructions on how to
manipulate the numbers!
§ 35-bit binary two's
complement words
Great Idea #1: Abstraction
(Levels of Representation/Interpretation)
High Level Language Program (e.g., C):
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
        ↓ Compiler
Assembly Language Program (e.g., RISC-V):
    lw x3, 0(x10)
    lw x4, 4(x10)
    sw x4, 0(x10)
    sw x3, 4(x10)
        ↓ Assembler
Machine Language Program (RISC-V):
    1000 1101 1110 0010 0000 0000 0000 0000
    1000 1110 0001 0000 0000 0000 0000 0100
    1010 1110 0001 0010 0000 0000 0000 0000
    1010 1101 1110 0010 0000 0000 0000 0100
        ↓ Architecture Implementation
Logic Circuit Description
[Figure: datapath fragment with Reg[rs1], ALU, pc+4, and an immediate generator taking inst[31:7] and producing imm[31:0]]
Anything can be represented as a number, i.e., data or instructions
Introduction to C (1/2)
§ Kernighan and Ritchie
ú C is not a “very high-level”
language, nor a “big” one, and is
not specialized to any particular
area of application. But its
absence of restrictions and its
generality make it more
convenient and effective for
many tasks than supposedly
more powerful languages.
§ Enabled first operating
system not written in
assembly language!
ú UNIX - A portable OS!
Introduction to C (2/2)
§ Why C?
ú We can write programs that allow us to exploit underlying
features of the architecture
memory management, special instructions, parallelism
§ C and derivatives (C++/Obj-C/C#) still one of the most
popular programming languages after >40 years!
§ If you are starting a new project where performance
matters, use either Go or Rust
ú Rust, “C-but-safe”: By the time your C is (theoretically) correct
w/all necessary checks it should be no faster than Rust
ú Go, “Concurrency”: Practical concurrent programming to take
advantage of modern multi-core microprocessors
Disclaimer
§ You will not learn how to fully code in C in these
lectures! You’ll still need your C reference
ú K&R is a must-have
ú Useful Reference: “JAVA in a Nutshell,” O’Reilly
Chapter 2, “How Java Differs from C”
ú Brian Harvey’s helpful transition notes
http://inst.eecs.berkeley.edu/~cs61c/resources/HarveyNotesC1-3.pdf
Compilation: Overview
§ C compilers map C programs directly into
architecture-specific machine code (string of 1s and 0s)
ú Unlike Java, which converts to architecture-independent
bytecode that may then be compiled by a just-in-time compiler
(JIT)
ú Unlike Python environments, which convert to bytecode at
runtime
These differ mainly in exactly when your program is converted
to low-level machine instructions (“levels of interpretation”)
§ For C, generally a two-part process of compiling .c files to
.o files, then linking the .o files into executables
ú Assembling is also done (but is hidden, i.e., done automatically,
by default); we’ll talk about that later
C Compilation Simplified Overview (more later)
[Figure: foo.c and bar.c (C source files, text) each pass through a compiler (compiler/assembler combined here); the linker then combines the results with pre-built object file libraries (lib.o).]
Compilation: Advantages
§ Reasonable compilation time: enhancements in
compilation procedure (Makefiles) allow only
modified files to be recompiled
§ Excellent run-time performance: generally much
faster than Scheme or Java for comparable code
(because it optimizes for a given architecture)
ú But these days, a lot of performance is in libraries:
ú Plenty of people do scientific computation in Python!?!
They have good libraries for accessing GPU-specific resources
Also, Python often makes it easy to drive many other
machines … wait for Spark™ lecture
Also, Python can call low-level C code to do work: Cython
Compilation: Disadvantages
§ Compiled files, including the executable, are
architecture-specific, depending on processor type
(e.g., MIPS vs. x86 vs. RISC-V) and the operating
system (e.g., Windows vs. Linux vs. MacOS)
§ Executable must be rebuilt on each new system
ú I.e., “porting your code” to a new architecture
§ “Change → Compile → Run [repeat]” iteration cycle
can be slow during development
ú but make only rebuilds changed pieces, and can compile
in parallel: make -j
ú linker is sequential though → Amdahl’s Law
C Pre-Processor (CPP)
foo.c → CPP → foo.i → Compiler
CPP Macros: A Warning...
§ You often see C preprocessor macros
defined to create small "functions"
ú But they aren't actual functions; the preprocessor just
changes the text of the program
ú In fact, all #define does is string replacement
ú #define min(X,Y) ((X)<(Y)?(X):(Y))
§ This can produce, umm, interesting errors
with macros, if foo(z) has a side-effect
ú next = min(w, foo(z));
ú expands to next = ((w)<(foo(z))?(w):(foo(z)));, which may call foo(z) twice
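A minimal sketch of the double-evaluation problem described above. Only the min macro comes from the slide; the counter calls and the functions foo and count_foo_calls are hypothetical names chosen for illustration.

```c
#define min(X,Y) ((X)<(Y)?(X):(Y))

static int calls = 0;          /* hypothetical: counts how often foo runs */

/* hypothetical function with a side effect: it bumps the counter */
static int foo(int z) {
    calls++;
    return z;
}

/* Returns how many times foo ran while evaluating min(w, foo(z)). */
int count_foo_calls(int w, int z) {
    calls = 0;
    int next = min(w, foo(z)); /* expands to ((w)<(foo(z))?(w):(foo(z))) */
    (void) next;               /* the value itself is not needed here */
    return calls;
}
```

When w is the smaller value, foo runs once; otherwise the macro text evaluates foo(z) a second time, so any side effect happens twice.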
C vs. Java (1/3)
                   C                          Java
Type of Language   Function Oriented          Object Oriented
Programming Unit   Function                   Class = Abstract Data Type
Compilation        gcc hello.c creates        javac Hello.java creates Java virtual
                   machine language code      machine language bytecode
C vs. Java (2/3)
                      C                    Java
Comments (C99         /* … */              /* … */ or // … end of line
same as Java)
Constants             #define, const       final
Preprocessor          Yes                  No
Variable naming       sum_of_squares       sumOfSquares
conventions
Accessing a library   #include <stdio.h>   import java.io.File;
From http://www.cs.princeton.edu/introcs/faq/c2java.html
C vs. Java (3/3) …operators nearly identical
§ arithmetic: +, -, *, /, %
§ assignment: =
§ augmented assignment: +=, -=, *=, /=, %=, &=, |=, ^=,
<<=, >>=
§ bitwise logic: ~, &, |, ^
§ bitwise shifts: <<, >>
§ boolean logic: !, &&, ||
§ equality testing: ==, !=
§ subexpression grouping: ()
§ order relations: <, <=, >, >=
§ increment and decrement: ++ and --
§ member selection: ., ->
ú Slightly different than Java because there are both structures and pointers to structures, more later
§ conditional evaluation: ? :
Has there been an update to ANSI C?
§ Yes! It’s called the “C99” or “C9x” std
ú To be safe: “gcc -std=c99” to compile
ú printf("%ld\n", __STDC_VERSION__); → 199901
§ References
ú en.wikipedia.org/wiki/C99
§ Highlights
ú Declarations in for loops, like Java
ú Java-like // comments (to end of line)
ú Variable-length non-global arrays
ú <inttypes.h>: explicit integer types
ú <stdbool.h> for boolean logic def’s
Has there been an update to C99?
§ Yes! It’s called the “C11” (C18 fixes bugs…)
ú You need “gcc -std=c11” (or c17) to compile
ú printf("%ld\n", __STDC_VERSION__); → 201112 (C11)
ú printf("%ld\n", __STDC_VERSION__); → 201710 (C17)
§ References
ú en.wikipedia.org/wiki/C11_(C_standard_revision)
§ Highlights
ú Multi-threading support!
ú Unicode strings and constants
ú Removal of gets()
ú Type-generic Macros (dispatch based on type)
ú Support for complex values
ú Static assertions, Exclusive create-and-open, …
C Syntax: main
§ To get the main function to accept
arguments, use this:
ú int main (int argc, char *argv[])
§ What does this mean?
ú argc will contain the number of strings on the
command line (the executable counts as one, plus
one for each argument). Here argc is 2:
unix% sort myFile
ú argv is a pointer to an array containing the
arguments as strings (more on pointers later).
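A small sketch of how argc and argv fit together. The function name show_args is hypothetical; a real program would receive argc and argv in main and could forward them to a helper like this one.

```c
#include <stdio.h>

/* Prints each command-line string on its own line and returns how many
   strings there were. Intended to be called with main's argc and argv. */
int show_args(int argc, char *argv[]) {
    for (int i = 0; i < argc; i++) {
        printf("argv[%d] = %s\n", i, argv[i]);
    }
    return argc;
}
```

For unix% sort myFile, main would see argc equal to 2, with argv[0] holding "sort" and argv[1] holding "myFile".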
C Syntax: True or False?
§ What evaluates to FALSE in C?
ú 0 (integer)
ú NULL (pointer: more on this later)
ú false (Boolean types provided by C99’s
stdbool.h)
§ What evaluates to TRUE in C?
ú …everything else…
ú Same idea as in Scheme
Only #f is false, everything else is true!
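The rules above can be checked with a few one-line helpers. This is only a sketch; the function names are hypothetical and exist just to make each truth test visible.

```c
#include <stdbool.h>
#include <stddef.h>

/* Each helper returns 1 if "if (argument)" would take the true branch. */
int int_is_true(int value)       { return value ? 1 : 0; }
int ptr_is_true(const void *ptr) { return ptr ? 1 : 0; }
int bool_is_true(bool flag)      { return flag ? 1 : 0; }
```

Only 0, NULL, and false select the false branch; any other integer (including negative ones) and any non-null pointer count as true.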
Typed Variables in C
§ Must declare the type of data a variable will hold
ú Types can't change. E.g, int var = 2;
Type           Description                                Example
int            Integer numbers (including negatives);     0, 78, -217, 0x7337
               at least 16 bits, can be larger
unsigned int   Unsigned integers                          0, 6, 35102
float          Floating point decimal                     0.0, 3.14159, 6.02e23
double         Equal or higher precision floating point   0.0, 3.14159, 6.02e23
char           Single character                           'a', 'D', '\n'
long           Longer int;                                0, 78, -217, 301720971
               size >= sizeof(int), at least 32b
long long      Even longer int;                           31705192721092512
               size >= sizeof(long), at least 64b
Integers: Python vs. Java vs. C
§ C: int should be integer type that target processor works
with most efficiently
§ Only guarantee:
ú sizeof(long long)
≥ sizeof(long) ≥ sizeof(int) ≥ sizeof(short)
ú Also, short >= 16 bits, long >= 32 bits
ú All could be 64 bits
ú This is why we encourage you to use intN_t and uintN_t!!
Language sizeof(int)
Python >=32 bits (plain ints), infinite (long ints)
Java 32 bits
C Depends on computer; 16 or 32 or 64
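A short sketch of the intN_t and uintN_t types recommended above, using the standard header <inttypes.h>. The function names sum_bytes and print_sum are hypothetical.

```c
#include <inttypes.h>  /* fixed-width types plus printf macros such as PRIu32 */
#include <stdio.h>

/* Sums a byte buffer into a type that is exactly 32 bits on every platform. */
uint32_t sum_bytes(const uint8_t *data, size_t n) {
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += data[i];
    }
    return total;
}

/* PRIu32 expands to the right printf conversion for uint32_t. */
void print_sum(uint32_t total) {
    printf("%" PRIu32 "\n", total);
}
```

Because the widths are part of the type names, the code does not depend on what sizeof(int) happens to be on the target machine.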
Consts and Enums in C
§ Constant is assigned a typed value once in the
declaration; value can't change during entire
execution of program
const float golden_ratio = 1.618;
const int days_in_week = 7;
const double the_law = 2.99792458e8;
ú You can have a constant version of any of the standard C
variable types
§ Enums: a group of related integer constants. E.g.,
enum cardsuit {CLUBS,DIAMONDS,HEARTS,SPADES};
enum color {RED, GREEN, BLUE};
Typed Functions in C
§ You have to declare the type of data you plan to return
from a function
§ Return type can be any C variable type, and is placed to
the left of the function name
§ You can also specify the return type as void
ú Just think of this as saying that no value will be returned
§ Also need to declare types for values passed into a function
§ Variables and functions MUST be declared before they are
used
int number_of_people () { return 3; }
float dollars_and_cents () { return 10.33; }
Structs in C
§ Typedef allows you to define new types.
typedef uint8_t BYTE;
BYTE b1, b2;
§ Structs are structured groups of variables e.g.,
typedef struct {
int length_in_seconds;
int year_recorded;
} SONG;
Dot notation: x.y = value
SONG song1;
song1.length_in_seconds = 213;
song1.year_recorded = 1994;
SONG song2;
song2.length_in_seconds = 248;
song2.year_recorded = 1988;
C Syntax : Control Flow (1/2)
§ Within a function, remarkably close to Java
constructs for control flow (Java inherited them from C)
ú A statement can be a {} of code or just a standalone statement
§ if-else
ú if (expression) statement
if (x == 0) y++;
if (x == 0) {y++;}
if (x == 0) {y++; j = j + y;}
ú if (expression) statement1 else statement2
A series of if/else if/else is ambiguous if you don't use {}s, so use
{}s to block the code
In fact, it is a bad C habit to leave the statement out of {}s; doing so has
resulted in some amusing errors...
§ while
ú while (expression) statement
ú do statement while (expression);
C Syntax : Control Flow (2/2)
§ for
for (initialize; check; update) statement
§ switch
switch (expression){
case const1: statements
case const2: statements
default: statements
}
break;
ú Note: until a break statement executes, execution keeps
falling through into the following cases of the switch statement
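The fall-through behavior noted above can be seen in a small sketch. The function name cases_executed and the case labels are hypothetical.

```c
/* Counts how many case bodies run for a given starting label when only
   the last numbered case ends with break. */
int cases_executed(int start) {
    int count = 0;
    switch (start) {
        case 1: count++;  /* falls through */
        case 2: count++;  /* falls through */
        case 3: count++;
                break;    /* execution leaves the switch here */
        default: count = -1;  /* reached only when start is not 1, 2, or 3 */
    }
    return count;
}
```

Starting at case 1 runs three bodies, starting at case 2 runs two, and starting at case 3 runs one, because nothing stops execution until the break.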
§ C also has goto
But it can result in spectacularly bad code if you use it, so don't!
First Big C Program: Compute Sines table
#include <stdio.h>
#include <math.h>

int main(void)
{
    int angle_degree;
    double angle_radian, pi, value;

    printf("Compute a table of the sine function\n\n");
    pi = 4.0*atan(1.0); /* could also just use pi = M_PI */

Output:
PI = 3.141593
Angle Sine
0 0.000000
10 0.173648
20 0.342020
30 0.500000
40 0.642788
50 0.766044
60 0.866025
70 0.939693
80 0.984808
90 1.000000
… etc …
C Syntax: Variable Declarations
§ Similar to Java, but with a few minor but
important differences
ú All variable declarations must appear before they
are used
ú All must be at the beginning of a block.
ú A variable may be initialized in its declaration;
if not, it holds garbage!
the contents are undefined…
§ Examples of declarations:
ú Correct: { int a = 0, b = 10; …
ú Incorrect in ANSI C: for (int i=0; …
ú Correct in C99 (and beyond): for (int i=0;…
An Important Note: Undefined Behavior…
§ A lot of C has “Undefined Behavior”
ú This means it is often unpredictable behavior
It will run one way on one computer…
But some other way on another
Or even just be different each time the program is
executed!
§ Often characterized as “Heisenbugs”
ú Bugs that seem random/hard to reproduce, and
seem to disappear or change when debugging
ú Cf. “Bohrbugs” which are repeatable
Address vs. Value
§ Consider memory to be a single huge array:
ú Each cell of the array has an address associated with it.
ú Each cell also stores some value.
ú Do you think they use signed or unsigned numbers?
Negative address?!
§ Don’t confuse the address referring to a memory
location with the value stored in that location.
§ For now, the abstraction lets us think we have
access to ∞ memory, numbered from 0…
[Figure: a row of memory cells at addresses 101, 102, 103, 104, 105, ...; two of the cells hold the values 23 and 42]
Pointers
§ An address refers to a particular
memory location. In other words, it
points to a memory location.
§ Pointer: A variable that contains the
address of a variable.
Pointer Syntax
§ int *p;
ú Tells compiler that variable p is address of an int
§ p = &y;
ú Tells compiler to assign address of y to p
ú & called the “address operator” in this context
§ z = *p;
ú Tells compiler to assign value at address in p to z
ú * called the “dereference operator” in this context
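The three statements above can be put together in one short sketch. The function name pointer_demo and the starting values are hypothetical.

```c
/* Walks through declaration, address-of, and dereference in order. */
int pointer_demo(void) {
    int y = 7;     /* hypothetical starting value */
    int z = 0;
    int *p;        /* p can hold the address of an int */
    p = &y;        /* p now holds the address of y */
    z = *p;        /* z gets the value stored at that address, i.e. 7 */
    *p = 9;        /* writing through p changes y itself */
    return z + y;  /* 7 + 9 */
}
```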
Pointers
§ How to create a pointer:
& operator: get address of a variable
int *p, x;    p: ?        x: ?
x = 3;        p: ?        x: 3
p = &x;       p → x       x: 3
Note the “*” gets used 2 different ways in this example. In the
declaration to indicate that p is going to be a pointer, and in
the printf to get the value pointed to by p.
Pointers
§ How to change a variable pointed to?
ú Use dereference * operator on left of =
before:     p → x       x: 3
*p = 5;     p → x       x: 5
Pointers and Parameter Passing (1/2)
§ Java and C pass parameters “by value”
ú procedure/function/method gets a copy of the
parameter, so changing the copy cannot change
the original
void addOne (int x) {
x = x + 1;
}
int y = 3;
addOne(y);
y is still = 3
Pointers and Parameter Passing (2/2)
§ How to get a function to change a value?
y is now = 4
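The code for this slide did not survive extraction. A minimal sketch of the usual answer, assuming the same addOne example as the previous slide, is to pass the address of y instead of its value. The wrapper add_one_demo is a hypothetical name.

```c
/* Takes the address of an int so the caller's variable can be changed. */
void addOne(int *p) {
    *p = *p + 1;
}

/* Mirrors the slide: y starts at 3 and is 4 after the call. */
int add_one_demo(void) {
    int y = 3;
    addOne(&y);   /* pass the address of y, not a copy of its value */
    return y;
}
```

The pointer itself is still passed by value; what changes is that the copy refers to the caller's y.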
More C Pointer Dangers
§ Declaring a pointer just allocates space to
hold the pointer – it does not allocate
something to be pointed to!
§ Local variables in C are not initialized, they
may contain anything.
§ What does the following code do?
void f()
{
int *ptr;
*ptr = 5;
}
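The function f above writes through a pointer that was never given a valid address, which is undefined behavior. Two safer variants are sketched below; the function names are hypothetical.

```c
#include <stdlib.h>

/* Option 1: point at an existing variable before dereferencing. */
int store_in_local(void) {
    int target = 0;
    int *ptr = &target;   /* ptr now refers to real storage */
    *ptr = 5;
    return target;
}

/* Option 2: allocate storage on the heap first.
   Returns 5, or -1 if malloc fails. */
int store_in_heap(void) {
    int *ptr = malloc(sizeof(int));
    if (ptr == NULL) {
        return -1;
    }
    *ptr = 5;
    int result = *ptr;
    free(ptr);            /* release what malloc handed out */
    return result;
}
```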
Pointers in C … The Good, Bad, and the Ugly
§ Why use pointers?
ú If we want to pass a large struct or array, it’s easier /
faster / etc. to pass a pointer than the whole thing
Otherwise we’d need to copy a huge amount of data
ú In general, pointers allow cleaner, more compact code
§ So what are the drawbacks?
ú Pointers are probably the single largest source of bugs in
C, so be careful anytime you deal with them
Most problematic with dynamic memory
management—coming up next time
Dangling references and memory leaks
Pointers
§ Pointers are used to point to any data type (int,
char, a struct, etc.).
§ Normally a pointer can only point to one type (int,
char, a struct, etc.).
ú void * is a type that can point to anything (generic
pointer)
ú Use sparingly to help avoid program bugs… and security
issues… and a lot of other bad things!
§ You can even have pointers to functions…
ú int (*fn) (void *, void *) = &foo;
fn is a pointer to a function that accepts two void * pointers and returns an int,
and is initially pointing to the function foo.
ú (*fn)(x, y) will then call the function
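A sketch of the function-pointer declaration above in use. The slide does not say what foo does, so the comparison body here is an assumption, as are the helper names apply and compare_ints.

```c
/* hypothetical body for foo: compares the two ints its arguments point to */
int foo(void *a, void *b) {
    int x = *(int *) a;
    int y = *(int *) b;
    return (x > y) - (x < y);   /* -1, 0, or 1 */
}

/* Calls whatever function fn points to. */
int apply(int (*fn)(void *, void *), void *x, void *y) {
    return (*fn)(x, y);         /* fn(x, y) is equivalent */
}

int compare_ints(int a, int b) {
    int (*fn)(void *, void *) = &foo;
    return apply(fn, &a, &b);
}
```

Because the parameters are void *, nothing stops a caller from passing pointers to the wrong type, which is one reason the slide suggests using void * sparingly.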
Pointers and Structures
typedef struct {
    int x;
    int y;
} Point;

Point p1;
Point p2;
Point *paddr;

/* dot notation */
int h = p1.x;
p2.y = p1.y;

/* arrow notation */
int h = paddr->x;
int h = (*paddr).x;
NULL pointers...
§ The pointer of all 0s is special
ú The "NULL" pointer, like in Java, Python, etc...
§ If you write to or read a null pointer, your program
should crash
§ Since "0 is false", it's very easy to do tests for null:
ú if(!p) { /* P is a null pointer */ }
ú if(q) { /* Q is not a null pointer */ }
Pointing to Different Size Objects
§ Modern machines are “byte-addressable”
ú Hardware’s memory composed of 8-bit storage cells, each has a unique address
[Figure: byte addresses 59 down to 42; a 32-bit integer is stored in four bytes, a 16-bit short is stored in two bytes, and an 8-bit character is stored in one byte]
Arrays (1/5)
§ Declaration:
ú int ar[2];
ú …declares a 2-element integer array
ú An array is really just a block of memory
§ Declaration and initialization
ú int ar[] = {795, 635};
ú declares and fills a 2-elt integer array
§ Accessing elements:
ú ar[num]
ú returns the element at index num (counting from 0).
Arrays (2/5)
§ Arrays are (almost) identical to
pointers
ú char *string and char string[]
are nearly identical declarations
ú They differ in very subtle ways: incrementing,
declaration of filled arrays
§ Key Concept: An array variable is a
“pointer” to the first element.
Arrays (3/5)
§ Consequences:
ú ar is an array variable but looks like a pointer in many
respects (though not all)
ú ar[0] is the same as *ar
ú ar[2] is the same as *(ar+2)
ú We can use pointer arithmetic to access arrays more
conveniently.
§ Declared arrays are only allocated while the scope
is valid
char *foo() {
char string[32]; ...;
return string;
} is incorrect
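foo above returns the address of a local array that stops existing when the function returns. One common alternative, sketched here with the hypothetical name fill_greeting, is to have the caller supply the buffer.

```c
#include <string.h>

/* The caller owns the buffer, so it is still valid after this function
   returns. Assumes size >= 1. */
char *fill_greeting(char *buffer, size_t size) {
    strncpy(buffer, "hello", size - 1);
    buffer[size - 1] = '\0';   /* strncpy does not always terminate */
    return buffer;
}
```

A caller would declare char string[32]; in its own scope and pass string together with sizeof string.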
Arrays (4/5)
§ Array size n; want to access from 0 to n-1, so
use a single named size for both the declaration
and the loop bound
ú Wrong
int i, ar[10];
for(i = 0; i < 10; i++){ ... }
ú Right
int ARRAY_SIZE = 10;
int i, a[ARRAY_SIZE];
for(i = 0; i < ARRAY_SIZE; i++){ ... }
Arrays (5/5)
§ Pitfall: An array in C does not know its own
length, & bounds not checked!
ú Consequence: We can accidentally access off the
end of an array.
ú Consequence: We must pass the array and its size
to a procedure which is going to traverse it.
§ Segmentation faults and bus errors:
ú These are VERY difficult to find; be careful!
ú You’ll learn how to debug these in lab…
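Since an array does not carry its length, the size has to travel with it. A minimal sketch, with the hypothetical function name sum_array:

```c
#include <stddef.h>

/* The length is passed alongside the pointer because ar alone has no size. */
int sum_array(const int ar[], size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++) {   /* i < n, never i <= n */
        total += ar[i];
    }
    return total;
}
```

At the call site, where ar is still a declared array, the element count can be computed as sizeof ar / sizeof ar[0].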
Pointer Arithmetic
§ pointer + n
ú Adds n*sizeof(“whatever pointer is
pointing to”) to the memory address
§ pointer – n
ú Subtracts n*sizeof(“whatever pointer is
pointing to”) from the memory address
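The scaling by sizeof can be observed directly. This is a sketch with hypothetical function names; it measures the byte distance by viewing the addresses as char pointers.

```c
#include <stddef.h>

/* Byte distance between p and p + n for an int pointer. */
ptrdiff_t bytes_advanced(int *p, int n) {
    return (char *) (p + n) - (char *) p;
}

/* Reads element 2 by moving forward 3 elements and back 1.
   Assumes ar has at least 4 elements. */
int third_element(int *ar) {
    int *q = ar + 3;   /* address moves forward by 3 * sizeof(int) */
    q = q - 1;         /* address moves back by 1 * sizeof(int) */
    return *q;         /* same as ar[2] */
}
```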
Pointers (1/4) …review…
§ Java and C pass parameters “by value”
ú procedure/function/method gets a copy of the
parameter, so changing the copy cannot change
the original
void addOne (int x) {
x = x + 1;
}
int y = 3;
addOne(y);
y is still = 3
Pointers (2/4) …review…
§ How to get a function to change a value?
y is now = 4
Pointers (3/4)
§ But what if you want to change a pointer?
ú What gets printed?
Pointers (4/4)
§ Idea! Pass a pointer to a pointer!
ú Declared as **h
ú Now what gets printed?
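The code for these two slides did not survive extraction. The sketch below is a reconstruction under assumptions: the function names, the array contents, and the wrappers are all hypothetical, chosen only to contrast passing a pointer with passing a pointer to a pointer.

```c
/* Receives a copy of the pointer, so the caller's pointer is unchanged. */
void increment_copy(int *p) {
    p = p + 1;
}

/* Receives the address of the pointer (a handle), so the caller's
   pointer really moves. */
void increment_ptr(int **h) {
    *h = *h + 1;
}

/* Value q points to after the by-value call. */
int after_copy(void) {
    int A[3] = {50, 60, 70};
    int *q = A;
    increment_copy(q);
    return *q;      /* q still points at A[0] */
}

/* Value q points to after the handle call. */
int after_handle(void) {
    int A[3] = {50, 60, 70};
    int *q = A;
    increment_ptr(&q);
    return *q;      /* q now points at A[1] */
}
```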
map (actually mutate_map easier)
#include <stdio.h>
int x10(int), x2(int);
void mutate_map(int [], int n, int(*)(int));
void print_array(int [], int n);

Output:
% ./map
3 1 4
6 2 8
60 20 80
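Only the prototypes and the program output survive on this slide. One way the four functions could be defined so that the output matches is sketched below; the bodies are assumptions, not the original code.

```c
#include <stdio.h>

int x2(int n)  { return 2 * n; }
int x10(int n) { return 10 * n; }

/* Replaces every element of A with the result of applying fp to it. */
void mutate_map(int A[], int n, int (*fp)(int)) {
    for (int i = 0; i < n; i++) {
        A[i] = (*fp)(A[i]);
    }
}

/* Prints the n elements of A on one line, separated by spaces. */
void print_array(int A[], int n) {
    for (int i = 0; i < n; i++) {
        printf("%d ", A[i]);
    }
    printf("\n");
}
```

Starting from {3, 1, 4}, applying x2 gives 6 2 8, and applying x10 to that result gives 60 20 80, matching the three output lines.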
Dynamic Memory Allocation (1/4)
§ C has operator sizeof() which gives size in bytes (of
type or variable)
§ Assuming the size of objects can be misleading and is bad
style, so use sizeof(type)
ú Many years ago an int was 16 bits, and programs were written with
this assumption.
ú What is the size of integers now?
§ “sizeof” knows the size of arrays:
int ar[3]; // Or: int ar[] = {54, 47, 99}
sizeof(ar) → 12 (with 4-byte ints)
ú …as well for arrays whose size is determined at run-time:
int n = 3;
int ar[n]; // Or: int ar[fun_that_returns_3()];
sizeof(ar) → 12 (with 4-byte ints)
Dynamic Memory Allocation (2/4)
§ To allocate room for something new to point to, use
malloc() (with the help of a typecast and
sizeof):
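The example code for this slide is missing from the extracted text. A minimal sketch of the pattern it describes (malloc with a typecast and sizeof) follows; the function name make_sequence is hypothetical.

```c
#include <stdlib.h>

/* Allocates an array of n ints on the heap and fills it with 0..n-1.
   Returns NULL if the allocation fails; the caller must free() the result. */
int *make_sequence(size_t n) {
    int *ptr = (int *) malloc(n * sizeof(int));  /* typecast and sizeof */
    if (ptr == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        ptr[i] = (int) i;   /* memory from malloc starts out as garbage */
    }
    return ptr;
}
```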
Dynamic Memory Allocation (3/4)
§ Once malloc() is called, the memory
location contains garbage, so don’t use it
until you’ve set its value.
§ After dynamically allocating space, we
must dynamically free it:
ú free(ptr);
§ Use this command to clean up.
ú Even though the program frees all memory on exit
(or when main returns), don’t be lazy!
ú You never know when your main will get
transformed into a subroutine!
Dynamic Memory Allocation (4/4)
§ The following two things will cause your program to
crash or behave strangely later on, and cause VERY
VERY hard to figure out bugs:
ú free()ing the same piece of memory twice
ú calling free() on something you didn’t get back from
malloc()
§ The runtime does not check for these mistakes
ú Memory allocation is so performance-critical that there
just isn’t time to do this
ú The usual result is that you corrupt the memory allocator’s
internal structure
ú You won’t find out until much later on, in a totally
unrelated part of your code!
Managing the Heap: realloc(p, size)
§ Resize a previously allocated block at p to a new size
§ If p is NULL, then realloc behaves like malloc
§ If size is 0, then realloc typically behaves like free, deallocating the
block from the heap (the C standard leaves this case implementation-defined)
§ Returns new address of the memory block; NOTE: it is likely to have
moved!
int *ip;
ip = (int *) malloc(10*sizeof(int));
/* always check for ip == NULL */
… … …
ip = (int *) realloc(ip,20*sizeof(int));
/* always check NULL, contents of first 10
elements retained */
… … …
realloc(ip,0); /* identical to free(ip) */
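Assigning the result of realloc straight back to ip, as above, loses the old block if realloc returns NULL. A sketch of the usual workaround with a temporary pointer follows; the function name grow_ints is hypothetical.

```c
#include <stdlib.h>

/* Grows *ip to hold new_count ints. Returns 1 on success, 0 on failure.
   On failure *ip is untouched, so the old block is neither lost nor leaked. */
int grow_ints(int **ip, size_t new_count) {
    int *tmp = (int *) realloc(*ip, new_count * sizeof(int));
    if (tmp == NULL) {
        return 0;
    }
    *ip = tmp;   /* the block may have moved; only use the new address */
    return 1;
}
```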
Arrays not implemented as you’d think
void foo() {
    int *p, *q, x;
    int a[4];
    p = (int *) malloc (sizeof(int));
    q = &x;

    *p = 1; // p[0] would also work here
    printf("*p:%u, p:%u, &p:%u\n", *p, p, &p);
    *q = 2; // q[0] would also work here
    printf("*q:%u, q:%u, &q:%u\n", *q, q, &q);
    *a = 3; // a[0] would also work here
    printf("*a:%u, a:%u, &a:%u\n", *a, a, &a);
}
[Figure: memory at addresses 0, 4, 8, ..., 60: p is at 12 and holds 40; q is at 16 and holds 20; x is at 20 and holds 2; a starts at 24 and a[0] holds 3; the unnamed malloc space at 40 holds 1]
Output:
*p:1, p:40, &p:12
*q:2, q:20, &q:16
*a:3, a:24, &a:24
K&R: “An array name is not a variable”
Mini-summary
§ Pointers and arrays are virtually the same
§ C knows how to increment pointers
§ C is an efficient language, with little protection
ú Array bounds not checked
ú Variables not automatically initialized
§ Use handles to change pointers
§ Dynamically allocated heap memory must be manually
deallocated in C.
ú Use malloc() and free() to allocate and deallocate
memory from heap.
§ (Beware) The cost of efficiency is more overhead for the
programmer.
ú “C gives you a lot of extra rope, don’t hang yourself with it!”
Linked List Example
§ Let’s look at an example of using structures,
pointers, malloc(), and free() to implement a
linked list of strings.
struct Node {
char *value;
struct Node *next;
};
typedef struct Node *List;
Linked List Example
/* add a string to an existing list */
List list_add(List list, char *string)
{
struct Node *node =
(struct Node*) malloc(sizeof(struct Node));
node->value =
(char*) malloc(strlen(string) + 1);
strcpy(node->value, string);
node->next = list;
return node;
}
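[Figure: step-by-step diagram of list_add with string “abc” on an existing list: node is allocated with undefined fields, node->value gets a newly allocated buffer, “abc” is copied into it, node->next is pointed at the old list, and node is returned as the new head]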
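A sketch of how list_add might be used and cleaned up afterwards. The struct and list_add follow the slides, with NULL checks added here as an assumption; list_length and list_free are hypothetical helpers that do not appear in the slides.

```c
#include <stdlib.h>
#include <string.h>

struct Node {
    char *value;
    struct Node *next;
};
typedef struct Node *List;

/* list_add as on the slide, plus NULL checks (not in the original).
   On allocation failure the list is returned unchanged. */
List list_add(List list, char *string) {
    struct Node *node = (struct Node *) malloc(sizeof(struct Node));
    if (node == NULL) {
        return list;
    }
    node->value = (char *) malloc(strlen(string) + 1);
    if (node->value == NULL) {
        free(node);
        return list;
    }
    strcpy(node->value, string);
    node->next = list;
    return node;
}

/* hypothetical helper: number of nodes in the list */
int list_length(List list) {
    int n = 0;
    for (; list != NULL; list = list->next) {
        n++;
    }
    return n;
}

/* hypothetical helper: frees every node and the string it owns */
void list_free(List list) {
    while (list != NULL) {
        List next = list->next;  /* save the link before freeing the node */
        free(list->value);
        free(list);
        list = next;
    }
}
```

Each node owns two heap blocks (the node and its string copy), so list_free releases both, and it reads next before freeing the node to avoid a use after free.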
Don’t forget the globals!
§ What is stored?
ú Structure declaration does not allocate memory
ú Variable declaration does allocate memory
§ So far we have talked about several different ways to allocate
memory for data:
ú Declaration of a local variable
int i; struct Node list; char *string; int ar[n];
ú “Dynamic” allocation at runtime by calling allocation function (malloc).
ptr = (struct Node *) malloc (sizeof(struct Node)*n);
§ One more possibility exists…
ú Data declared outside of any procedure
(i.e., before main).
ú Similar to #1 above, but has “global” scope.
int myGlobal;
main() {
}
C Memory Management
§ C has 3 pools of memory
ú Static storage: global variable storage, basically
permanent, entire program run
ú The Stack: local variable storage, parameters, return
address (location of “activation records” in Java or “stack
frame” in C)
ú The Heap (dynamic malloc storage): data lives until
deallocated by programmer
§ C requires knowing where objects are in memory,
otherwise things don’t work as expected
ú Java hides location of objects
Normal C Memory Management
§ A program’s address space
contains 4 regions:
ú stack: local variables, grows
downward
ú heap: space requested for pointers
via malloc(); resizes
dynamically, grows upward
ú static data: variables declared
outside main, does not grow or
shrink
ú code: loaded when program starts,
does not change
[Figure: address space from ~ FFFF FFFFhex at the top to ~ 0hex at the bottom, holding stack, heap, static data, and code in that order]
For now, OS somehow prevents accesses between
stack and heap (gray hash lines). Wait for virtual memory
Where are variables allocated?
§ If declared outside a procedure
(global), allocated in “static” storage
§ If declared inside procedure (local),
allocated on the “stack”
and freed when procedure returns.
ú NB: main() is a procedure
int myGlobal;
main() {
int myTemp;
}
The Stack
§ Stack frame includes:
ú Return “instruction” address
ú Parameters
ú Space for other local variables
§ Stack frames are contiguous
blocks of memory; stack pointer
tells where top stack frame is
§ When procedure ends, stack
frame is tossed off the stack; frees
memory for future stack frames
[Figure: a stack of frames, with SP pointing at the top frame]
Stack
§ Last In, First Out (LIFO) data structure
main ()
{ a(0);
}
void a (int m)
{ b(1);
}
void b (int n)
{ c(2);
}
void c (int o)
{ d(3);
}
void d (int p)
{
}
[Figure: the stack grows down; the stack pointer moves down one frame as each of a, b, c, and d is called]
The Heap (Dynamic memory)
§ Large pool of memory,
not allocated in contiguous order
ú back-to-back requests for heap memory could result in
blocks very far apart
ú where Java new command allocates memory
§ In C, specify number of bytes of memory explicitly to
allocate item
int *ptr;
ptr = (int *) malloc(sizeof(int));
/* malloc returns type (void *); the cast makes the
conversion explicit (C converts void * implicitly) */
ú malloc(): Allocates raw, uninitialized memory from
heap
Memory Management
§ How do we manage memory?
§ Code, Static storage are easy:
ú they never grow or shrink
§ Stack space is also easy:
ú stack frames are created and destroyed in last-in,
first-out (LIFO) order
§ Managing the heap is tricky:
ú memory can be allocated / deallocated at any
time
Heap Management Requirements
§ Want malloc() and free() to run
quickly
§ Want minimal memory overhead
§ Want to avoid fragmentation:
when most of our free memory is in many
small chunks
ú In this case, we might have many free bytes but not
be able to satisfy a large request since the free
bytes are not contiguous in memory.
Heap Management
§ An example
ú Request R1 for 100 bytes
ú Request R2 for 1 byte
[Figure: heap diagram showing block R1 (100 bytes) and two candidate locations for a later request, each labeled “R3?”]
K&R Malloc/Free Implementation
§ From Section 8.7 of K&R
ú Code in the book uses some C language features we
haven’t discussed and is written in a very terse style; don’t
worry if you can’t decipher the code
§ Each block of memory is preceded by a header that
has two fields:
size of the block and
a pointer to the next block
§ All free blocks are kept in a circular linked list, the
pointer field is unused in an allocated block
K&R Implementation
§ malloc() searches the free list for a block
that is big enough. If none is found, more
memory is requested from the operating
system. If what it gets can’t satisfy the
request, it fails.
§ free() checks if the blocks adjacent to
the freed block are also free
ú If so, adjacent free blocks are merged (coalesced)
into a single, larger free block
ú Otherwise, freed block is just added to the free list
Choosing a block in malloc()
§ If there are multiple free blocks of memory
that are big enough for some request, how
do we choose which one to use?
ú best-fit: choose the smallest block that is big
enough for the request
ú first-fit: choose the first block we see that is big
enough
ú next-fit: like first-fit but remember where we
finished searching and resume searching from
there
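The three policies can be compared with a small sketch. It is an illustration only: an array of free-block sizes stands in for the free list, and the function names and the cursor parameter are hypothetical rather than taken from K&R.

```c
#include <stddef.h>

/* Each function scans an array of free-block sizes and returns the index
   of the chosen block, or -1 if no block is big enough. */

/* first-fit: the first block that is big enough */
int first_fit(const size_t sizes[], int n, size_t request) {
    for (int i = 0; i < n; i++) {
        if (sizes[i] >= request) {
            return i;
        }
    }
    return -1;
}

/* best-fit: the smallest block that is big enough */
int best_fit(const size_t sizes[], int n, size_t request) {
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (sizes[i] >= request && (best == -1 || sizes[i] < sizes[best])) {
            best = i;
        }
    }
    return best;
}

/* next-fit: like first-fit, but the scan starts at *cursor and wraps
   around; *cursor is updated so the next search resumes after the
   chosen block. */
int next_fit(const size_t sizes[], int n, size_t request, int *cursor) {
    for (int k = 0; k < n; k++) {
        int i = (*cursor + k) % n;
        if (sizes[i] >= request) {
            *cursor = (i + 1) % n;
            return i;
        }
    }
    return -1;
}
```

With free blocks of 200, 50, 120, and 60 bytes and a 55-byte request, first-fit picks the 200-byte block, best-fit picks the 60-byte block, and two consecutive next-fit searches starting from the front pick the 200-byte block and then the 120-byte block.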
And in conclusion…
§ C has 3 pools of memory
ú Static storage: global variable storage, basically
permanent, entire program run
ú The Stack: local variable storage, parameters, return
address
ú The Heap (dynamic storage): malloc() grabs space from
here, free() returns it.
§ malloc() handles free space with freelist
§ Three ways to find free space when given a request:
ú First fit (find first one that’s free)
ú Next fit (same as first, but remembers where left off)
ú Best fit (finds most “snug” free space)
Pointers in C
§ Why use pointers?
ú If we want to pass a huge struct or array, it’s easier
/ faster / etc to pass a pointer than the whole thing.
ú In general, pointers allow cleaner, more compact
code.
§ So what are the drawbacks?
ú Pointers are probably the single largest source of
bugs in software, so be careful anytime you deal
with them.
ú Dangling reference (use ptr before malloc)
ú Memory leaks (tardy free, lose the ptr)
Writing off the end of arrays...
int *foo = (int *) malloc(sizeof(int) * 100);
int i;
....
for(i = 0; i <= 100; ++i) {
foo[i] = 0;
}
§ Corrupts other parts of the program...
ú Including internal C data
§ May cause crashes later
Returning Pointers into the Stack
§ Pointers in C allow access to deallocated memory,
leading to hard-to-find bugs !
int *ptr () {
    int y;
    y = 3;
    return &y;
}

main () {
    int *stackAddr, content;
    stackAddr = ptr();
    content = *stackAddr;
    printf("%d", content); /* 3 */
    content = *stackAddr;
    printf("%d", content); /*13451514 */
}
[Figure: three snapshots of the stack with SP marking the top: main alone; main with the frame for ptr() (y==3); main with the frame for printf() reusing the same space (y==?)]
Use After Free
§ When you keep using a pointer after freeing it...
struct foo *f;
....
f = malloc(sizeof(struct foo));
....
free(f);
....
bar(f->a);
§ Reads after the free may be corrupted
ú As something else takes over that memory. Your
program will probably get wrong info!
§ Writes corrupt other data!
ú Uh oh... Your program crashes later!
Forgetting realloc Can Move Data...
§ When you realloc it can copy data...
ú struct foo *f = malloc(sizeof(struct foo) * 10);
...
struct foo *g = f;
....
f = realloc(f, sizeof(struct foo) * 20);
§ Result is g may now point to invalid memory
ú So reads may be corrupted and writes may corrupt other pieces of
memory
Freeing the Wrong Stuff...
§ If you free() something never malloc'ed()
ú Including things like
struct foo *f = malloc(sizeof(struct foo) * 10);
...
f++;
...
free(f);
Double-Free...
§ E.g.,
struct foo *f = (struct foo *)
malloc(sizeof(struct foo) * 10);
...
free(f);
...
free(f);
§ May cause either a use after free (because
something else called malloc() and got that data)
or corrupt malloc’s data (because you are no
longer freeing a pointer returned by malloc)
Losing the initial pointer! (Memory Leak)
§ What is wrong with this code?
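The code this slide asks about did not survive extraction. As an assumption about the kind of bug the title describes, the comment below shows a typical way to lose the initial pointer, followed by a version that keeps it. All names are hypothetical.

```c
#include <stdlib.h>

/* Leaky pattern: the only pointer to the block is advanced, so the
   address malloc returned is lost and the block can never be freed.

       int *data = malloc(2 * sizeof(int));
       data++;
*/

/* Keeps the original pointer for free() and walks with a second one.
   Returns a + b, or 0 if the allocation fails. */
int sum_two(int a, int b) {
    int *data = malloc(2 * sizeof(int));
    if (data == NULL) {
        return 0;
    }
    data[0] = a;
    data[1] = b;
    int total = 0;
    for (int *cursor = data; cursor < data + 2; cursor++) {
        total += *cursor;   /* cursor moves; data stays put */
    }
    free(data);             /* still the address malloc returned */
    return total;
}
```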
Valgrind to the rescue…
§ Valgrind slows down your program by an order of
magnitude, but...
ú It adds a ton of checks designed to catch most (but not
all) memory errors
Memory leaks
Misuse of free
Writing over the end of arrays
And In Conclusion, …
§ C has three main memory segments in
which to allocate data:
ú Static Data: Variables outside functions
ú Stack: Variables local to function
ú Heap: Objects explicitly malloc-ed/free-d.
§ Heap data is biggest source of bugs in
C code