2020fa CS61C 2020fa Module 2 C PDF

Uploaded by

Kert Mantie

Great Ideas in Computer Architecture (a.k.a. Machine Structures)

UC Berkeley Teaching Professor Dan Garcia
UC Berkeley Professor Bora Nikolić

Introduction to the C Programming Language
Garcia, Nikolić

cs61c.org
ENIAC (U Penn, 1946)
§ First Electronic General-
Purpose Computer
§ Blazingly fast
ú Multiply in 2.8ms!
ú 10 decimal digits x 10
decimal digits
§ But needed 2-3 days to
set up a new program
§ Programmed with patch
cords and switches
ú At that time & before,
"computer" mostly referred
to people who did
calculations

EDSAC (Cambridge, 1949)
§ First General Stored-
Program Computer
§ Programs held as
numbers in memory
ú This is the revolution:
It isn't just programmable,
but the program is just the
same type of data that the
computer computes on
ú Bits are not just the
numbers being
manipulated, but the
instructions on how to
manipulate the numbers!
§ 35-bit binary two's
complement words

Great Idea #1: Abstraction
(Levels of Representation/Interpretation)
High Level Language Program (e.g., C):
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
        ↓ Compiler
Assembly Language Program (e.g., RISC-V):
    lw x3, 0(x10)
    lw x4, 4(x10)
    sw x4, 0(x10)
    sw x3, 4(x10)
        ↓ Assembler
Machine Language Program (RISC-V):
    1000 1101 1110 0010 0000 0000 0000 0000
    1000 1110 0001 0000 0000 0000 0000 0100
    1010 1110 0001 0010 0000 0000 0000 0000
    1010 1101 1110 0010 0000 0000 0000 0100
Hardware Architecture Description (e.g., block diagrams)
    [Figure: processor datapath block diagram]
        ↓ Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)
    [Figure: logic circuit computing Out = AB+CD from inputs A, B, C, D]

Anything can be represented as a number, i.e., data or instructions

Introduction to C (1/2)
§ Kernighan and Ritchie
ú C is not a “very high-level”
language, nor a “big” one, and is
not specialized to any particular
area of application. But its
absence of restrictions and its
generality make it more
convenient and effective for
many tasks than supposedly
more powerful languages.
§ Enabled first operating
system not written in
assembly language!
ú UNIX - A portable OS!

Introduction to C (2/2)
§ Why C?
ú We can write programs that allow us to exploit underlying
features of the architecture
  memory management, special instructions, parallelism
§ C and derivatives (C++/Obj-C/C#) still one of the most
popular programming languages after >40 years!
§ If you are starting a new project where performance
matters, use either Go or Rust
ú Rust, “C-but-safe”: By the time your C is (theoretically) correct
w/all necessary checks it should be no faster than Rust
ú Go, “Concurrency”: Practical concurrent programming to take
advantage of modern multi-core microprocessors

Disclaimer
§ You will not learn how to fully code in C in these
lectures! You’ll still need your C reference
ú K&R is a must-have
ú Useful Reference: “JAVA in a Nutshell,” O’Reilly
  Chapter 2, “How Java Differs from C”
ú Brian Harvey’s helpful transition notes
  http://inst.eecs.berkeley.edu/~cs61c/resources/HarveyNotesC1-3.pdf

§ Key C concepts: Pointers, Arrays, Implications for
Memory management
ú Key security concept: All of the above are unsafe: If your program
contains an error in these areas it might not crash immediately but
instead leave the program in an inconsistent (and often exploitable) state

Compilation: Overview
§ C compilers map C programs directly into
architecture-specific machine code (string of 1s and 0s)
ú Unlike Java, which converts to architecture-independent
bytecode that may then be compiled by a just-in-time compiler
(JIT)
ú Unlike Python environments, which convert to bytecode at
runtime
  These differ mainly in exactly when your program is converted
to low-level machine instructions (“levels of interpretation”)
§ For C, generally a two-part process of compiling .c files to
.o files, then linking the .o files into executables
ú Assembling is also done (but is hidden, i.e., done automatically,
by default); we’ll talk about that later

C Compilation Simplified Overview (more later)
foo.c, bar.c      C source files (text)
    ↓ Compiler    (compiler/assembler combined here)
foo.o, bar.o      Machine code object files
    ↓ Linker      (also takes lib.o: pre-built object file libraries)
a.out             Machine code executable file


Compilation: Advantages
§ Reasonable compilation time: enhancements in
compilation procedure (Makefiles) allow only
modified files to be recompiled
§ Excellent run-time performance: generally much
faster than Scheme or Java for comparable code
(because it optimizes for a given architecture)
ú But these days, a lot of performance is in libraries:
ú Plenty of people do scientific computation in Python!?!
  They have good libraries for accessing GPU-specific resources
  Also, many times Python allows the ability to drive many other
machines very easily … wait for Spark™ lecture
  Also, Python can call low-level C code to do work: Cython

Compilation: Disadvantages
§ Compiled files, including the executable, are
architecture-specific, depending on processor type
(e.g., MIPS vs. x86 vs. RISC-V) and the operating
system (e.g., Windows vs. Linux vs. MacOS)
§ Executable must be rebuilt on each new system
ú I.e., “porting your code” to a new architecture
§ “Change → Compile → Run [repeat]” iteration cycle
can be slow during development
ú but make only rebuilds changed pieces, and can compile
in parallel: make -j
ú linker is sequential though → Amdahl’s Law

C Pre-Processor (CPP)
foo.c → CPP → foo.i → Compiler

§ C source files first pass through macro processor, CPP, before
compiler sees code
§ CPP replaces comments with a single space
§ CPP commands begin with “#”
ú #include "file.h" /* Inserts file.h into output */
ú #include <stdio.h> /* Looks for file in standard
location, but no actual difference! */
ú #define PI (3.14159) /* Define constant */
ú #if/#endif /* Conditionally include text */
§ Use -save-temps option to gcc to see result of
preprocessing
ú Full documentation at: http://gcc.gnu.org/onlinedocs/cpp/

CPP Macros: A Warning...
§ You often see C preprocessor macros
defined to create small "functions"
ú But they aren't actual functions; instead the preprocessor just
changes the text of the program
ú In fact, all #define does is string replacement
ú #define min(X,Y) ((X)<(Y)?(X):(Y))
§ This can produce, umm, interesting errors
with macros, if foo(z) has a side-effect
ú next = min(w, foo(z));
ú next = ((w)<(foo(z))?(w):(foo(z))); /* after expansion: foo(z) may run twice */

C vs. Java (1/3)
                   C                          Java
Type of Language   Function Oriented          Object Oriented
Programming Unit   Function                   Class = Abstract Data Type
Compilation        gcc hello.c creates        javac Hello.java creates Java virtual
                   machine language code      machine language bytecode
Execution          a.out loads and            java Hello interprets bytecodes
                   executes program
Storage            Manual (malloc, free)      New allocates & initializes,
                                              Automatic (garbage collection) frees

hello, world in C:
#include <stdio.h>
int main(void)
{
    printf("Hi\n");
    return 0;
}

hello, world in Java:
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hi");
    }
}

From http://www.cs.princeton.edu/introcs/faq/c2java.html

C vs. Java (2/3)
                        C                          Java
Comments                /* … */                    /* … */ or // … end of line
(C99 same as Java)
Constants               #define, const             final
Preprocessor            Yes                        No
Variable declaration    At beginning of a block    Before you use it
(C99 same as Java)
Variable naming         sum_of_squares             sumOfSquares
conventions
Accessing a library     #include <stdio.h>         import java.io.File;

From http://www.cs.princeton.edu/introcs/faq/c2java.html

C vs. Java (3/3) …operators nearly identical
§ arithmetic: +, -, *, /, %
§ assignment: =
§ augmented assignment: +=, -=, *=, /=, %=, &=, |=, ^=,
<<=, >>=
§ bitwise logic: ~, &, |, ^
§ bitwise shifts: <<, >>
§ boolean logic: !, &&, ||
§ equality testing: ==, !=
§ subexpression grouping: ()
§ order relations: <, <=, >, >=
§ increment and decrement: ++ and --
§ member selection: ., ->
ú Slightly different than Java because there are both structures and pointers to structures, more later
§ conditional evaluation: ? :

Has there been an update to ANSI C?
§ Yes! It’s called the “C99” or “C9x” std
ú To be safe: “gcc -std=c99” to compile
ú printf("%ld\n", __STDC_VERSION__); → 199901
§ References
ú en.wikipedia.org/wiki/C99
§ Highlights
ú Declarations in for loops, like Java
ú Java-like // comments (to end of line)
ú Variable-length non-global arrays
ú <inttypes.h>: explicit integer types
ú <stdbool.h> for boolean logic def’s

Has there been an update to C99?
§ Yes! It’s called “C11” (C17, a.k.a. C18, fixes bugs…)
ú You need “gcc -std=c11” (or c17) to compile
ú printf("%ld\n", __STDC_VERSION__); → 201112 (C11)
ú printf("%ld\n", __STDC_VERSION__); → 201710 (C17)

§ References
ú en.wikipedia.org/wiki/C11_(C_standard_revision)

§ Highlights
ú Multi-threading support!
ú Unicode strings and constants
ú Removal of gets()
ú Type-generic Macros (dispatch based on type)
ú Support for complex values
ú Static assertions, Exclusive create-and-open, …

C Syntax: main
§ To get the main function to accept
arguments, use this:
ú int main (int argc, char *argv[])
§ What does this mean?
ú argc will contain the number of strings on the
command line (the executable counts as one, plus
one for each argument). Here argc is 2:
  unix% sort myFile
ú argv is a pointer to an array containing the
arguments as strings (more on pointers later).

C Syntax: True or False?
§ What evaluates to FALSE in C?
ú 0 (integer)
ú NULL (pointer: more on this later)
ú false (Boolean type provided by C99’s
stdbool.h)
§ What evaluates to TRUE in C?
ú …everything else…
ú Same idea as in Scheme
  Only #f is false, everything else is true!

Typed Variables in C
§ Must declare the type of data a variable will hold
ú Types can't change. E.g., int var = 2;

Type          Description                                Example
int           Integer Numbers (including negatives)      0, 78, -217, 0x7337
              At least 16 bits, can be larger
unsigned int  Unsigned Integers                          0, 6, 35102
float         Floating point decimal                     0.0, 3.14159, 6.02e23
double        Equal or higher precision floating point   0.0, 3.14159, 6.02e23
char          Single character                           'a', 'D', '\n'
long          Longer int,                                0, 78, -217, 301720971
              Size >= sizeof(int), at least 32b
long long     Even longer int,                           31705192721092512
              size >= sizeof(long), at least 64b

Integers: Python vs. Java vs. C
§ C: int should be integer type that target processor works
with most efficiently
§ Only guarantee:
ú sizeof(long long)
≥ sizeof(long) ≥ sizeof(int) ≥ sizeof(short)
ú Also, short >= 16 bits, long >= 32 bits
ú All could be 64 bits
ú This is why we encourage you to use intN_t and uintN_t!!

Language   sizeof(int)
Python     >=32 bits (plain ints), infinite (long ints)
Java       32 bits
C          Depends on computer; 16 or 32 or 64

Consts and Enums in C
§ Constant is assigned a typed value once in the
declaration; value can't change during entire
execution of program
const float golden_ratio = 1.618;
const int days_in_week = 7;
const double the_law = 2.99792458e8;
ú You can have a constant version of any of the standard C
variable types
§ Enums: a group of related integer constants. E.g.,
enum cardsuit {CLUBS,DIAMONDS,HEARTS,SPADES};
enum color {RED, GREEN, BLUE};

Typed Functions in C
§ You have to declare the type of data you plan to return
from a function
§ Return type can be any C variable type, and is placed to
the left of the function name
§ You can also specify the return type as void
ú Just think of this as saying that no value will be returned
§ Also need to declare types for values passed into a function
§ Variables and functions MUST be declared before they are
used
int number_of_people () { return 3; }
float dollars_and_cents () { return 10.33; }

Structs in C
§ Typedef allows you to define new types.
typedef uint8_t BYTE;
BYTE b1, b2;
§ Structs are structured groups of variables e.g.,
typedef struct {
int length_in_seconds;
int year_recorded;
} SONG;
Dot notation: x.y = value
SONG song1;
song1.length_in_seconds = 213;
song1.year_recorded = 1994;

SONG song2;
song2.length_in_seconds = 248;
song2.year_recorded = 1988;

C Syntax : Control Flow (1/2)
§ Within a function, remarkably close to Java
constructs (shows C’s legacy in Java) for control flow
ú A statement can be a {} block of code or just a standalone statement
§ if-else
ú if (expression) statement
if (x == 0) y++;
if (x == 0) {y++;}
if (x == 0) {y++; j = j + y;}
ú if (expression) statement1 else statement2
  There is an ambiguity in a series of if/else if/else if you don't use {}s, so use
{}s to block the code
  In fact, it is a bad C habit to not always have the statement in {}s, it has
resulted in some amusing errors...
§ while
ú while (expression) statement
ú do statement while (expression);

C Syntax : Control Flow (2/2)
§ for
for (initialize; check; update) statement
§ switch
switch (expression){
case const1: statements; break;
case const2: statements; break;
default: statements
}
ú Note: until you do a break statement things keep
executing in the switch statement
§ C also has goto
  But it can result in spectacularly bad code if you use it, so don't!


First Big C Program: Compute Sines table
#include <stdio.h>
#include <math.h>

int main(void)
{
    int angle_degree;
    double angle_radian, pi, value;

    printf("Compute a table of the sine function\n\n");
    pi = 4.0*atan(1.0); /* could also just use pi = M_PI */
    printf("Value of PI = %f \n\n", pi);

    printf("Angle\tSine\n");
    angle_degree = 0; /* initial angle value */
    while (angle_degree <= 360) { /* loop til angle_degree > 360 */
        angle_radian = pi * angle_degree / 180.0;
        value = sin(angle_radian);
        printf("%3d\t%f\n", angle_degree, value);
        angle_degree += 10; /* increment the loop index */
    }
    return 0;
}

Sample output (abbreviated):
PI = 3.141593
Angle Sine
0 0.000000
10 0.173648
20 0.342020
30 0.500000
40 0.642788
50 0.766044
60 0.866025
70 0.939693
80 0.984808
90 1.000000
… etc …

C Syntax: Variable Declarations
§ Similar to Java, but with a few minor but
important differences
ú All variable declarations must appear before they
are used
ú In ANSI C, all must be at the beginning of a block (C99 and beyond allow declarations anywhere in a block).
ú A variable may be initialized in its declaration;
if not, it holds garbage!
  the contents are undefined…
§ Examples of declarations:
ú Correct: { int a = 0, b = 10; …
ú Incorrect in ANSI C: for (int i=0; …
ú Correct in C99 (and beyond): for (int i=0;…

An Important Note: Undefined Behavior…
§ A lot of C has “Undefined Behavior”
ú This means it is often unpredictable behavior
  It will run one way on one computer…
  But some other way on another
  Or even just be different each time the program is
executed!
§ Often characterized as “Heisenbugs”
ú Bugs that seem random/hard to reproduce, and
seem to disappear or change when debugging
ú Cf. “Bohrbugs” which are repeatable

Address vs. Value
§ Consider memory to be a single huge array:
ú Each cell of the array has an address associated with it.
ú Each cell also stores some value.
ú Do you think they use signed or unsigned numbers?
Negative address?!
§ Don’t confuse the address referring to a memory
location with the value stored in that location.
§ For now, the abstraction lets us think we have
access to ∞ memory, numbered from 0…
101 102 103 104 105 ...
... 23 42 ...

Pointers
§ An address refers to a particular
memory location. In other words, it
points to a memory location.
§ Pointer: A variable that contains the
address of a variable.
Location (address): 101 102 103 104 105 ...
Value:              ... 23 42 104 ...
Name:               x (holds 23), y (holds 42), p (holds 104)

Pointer Syntax
§ int *p;
ú Tells compiler that variable p is address of an int

§ p = &y;
ú Tells compiler to assign address of y to p
ú & called the “address operator” in this context

§ z = *p;
ú Tells compiler to assign value at address in p to z
ú * called the “dereference operator” in this context

Pointers
§ How to create a pointer:
& operator: get address of a variable

int *p, x;    p: ?      x: ?
x = 3;        p: ?      x: 3
p = &x;       p: → x    x: 3

§ How to get a value pointed to?
* “dereference operator”: get value pointed to

printf("p points to %d\n", *p);

Note the “*” gets used 2 different ways in this example. In the
declaration to indicate that p is going to be a pointer, and in
the printf to get the value pointed to by p.


Pointers
§ How to change a variable pointed to?
ú Use dereference * operator on left of =

            p → x: 3
*p = 5;     p → x: 5

Pointers and Parameter Passing (1/2)
§ Java and C pass parameters “by value”
ú procedure/function/method gets a copy of the
parameter, so changing the copy cannot change
the original
void addOne (int x) {
x = x + 1;
}
int y = 3;
addOne(y);

y is still = 3

Pointers and Parameter Passing (2/2)
§ How to get a function to change a value?

void addOne (int *p) {
*p = *p + 1;
}
int y = 3;
addOne(&y);

y is now = 4

More C Pointer Dangers
§ Declaring a pointer just allocates space to
hold the pointer – it does not allocate
something to be pointed to!
§ Local variables in C are not initialized, they
may contain anything.
§ What does the following code do?
void f()
{
int *ptr;
*ptr = 5;
}

Pointers in C … The Good, Bad, and the Ugly
§ Why use pointers?
ú If we want to pass a large struct or array, it’s easier /
faster / etc. to pass a pointer than the whole thing
  Otherwise we’d need to copy a huge amount of data
ú In general, pointers allow cleaner, more compact code
§ So what are the drawbacks?
ú Pointers are probably the single largest source of bugs in
C, so be careful anytime you deal with them
  Most problematic with dynamic memory
management—coming up next time
  Dangling references and memory leaks

Pointers
§ Pointers are used to point to any data type (int,
char, a struct, etc.).
§ Normally a pointer can only point to one type (int,
char, a struct, etc.).
ú void * is a type that can point to anything (generic
pointer)
ú Use sparingly to help avoid program bugs… and security
issues… and a lot of other bad things!
§ You can even have pointers to functions…
ú int (*fn) (void *, void *) = &foo;
  fn is a pointer to a function that accepts two void * pointers and returns an int;
it is initially pointing to the function foo.
ú (*fn)(x, y) will then call the function

Pointers and Structures
typedef struct {
    int x;
    int y;
} Point;

Point p1;
Point p2;
Point *paddr;

/* dot notation */
int h = p1.x;
p2.y = p1.y;

/* arrow notation */
int h = paddr->x;
int h = (*paddr).x;

/* This works too */
p1 = p2;

NULL pointers...
§ The pointer of all 0s is special
ú The "NULL" pointer, like in Java, Python, etc...
§ If you write to or read a null pointer, your program
should crash
§ Since "0 is false", it's very easy to do tests for null:
ú if(!p) { /* p is a null pointer */ }
ú if(q) { /* q is not a null pointer */ }

Pointing to Different Size Objects
§ Modern machines are “byte-addressable”
ú Hardware’s memory composed of 8-bit storage cells, each has a unique address

§ A C pointer is just an abstracted memory address

§ Type declaration tells compiler how many bytes to fetch on each access
through pointer
ú E.g., 32-bit integer stored in 4 consecutive 8-bit bytes

§ But we actually want “word alignment”
ú Some processors will not allow you to address 32b values without being on 4 byte boundaries
ú Others will just be very slow if you try to access “unaligned” memory.

[Figure: byte addresses 59 down to 42, with pointers short *y, int *x, and char *z]
ú 16-bit short stored in two bytes
ú 32-bit integer stored in four bytes
ú 8-bit character stored in one byte

Arrays (1/5)
§ Declaration:
ú int ar[2];
ú …declares a 2-element integer array
ú An array is really just a block of memory
§ Declaration and initialization
ú int ar[] = {795, 635};
ú declares and fills a 2-elt integer array
§ Accessing elements:
ú ar[num]
ú returns the numth element.

Arrays (2/5)
§ Arrays are (almost) identical to
pointers
ú char *string and char string[]
are nearly identical declarations
ú They differ in very subtle ways: incrementing,
declaration of filled arrays
§ Key Concept: An array variable is a
“pointer” to the first element.

Arrays (3/5)
§ Consequences:
ú ar is an array variable but looks like a pointer in many
respects (though not all)
ú ar[0] is the same as *ar
ú ar[2] is the same as *(ar+2)
ú We can use pointer arithmetic to access arrays more
conveniently.
§ Declared arrays are only allocated while the scope
is valid
char *foo() {
char string[32]; ...;
return string;
} is incorrect

Arrays (4/5)
§ Array size n; want to access from 0 to n-1, so you
should use counter AND utilize a variable for
declaration & incr
ú Wrong
int i, ar[10];
for(i = 0; i < 10; i++){ ... }
ú Right
int ARRAY_SIZE = 10;
int i, a[ARRAY_SIZE];
for(i = 0; i < ARRAY_SIZE; i++){ ... }

§ Why? SINGLE SOURCE OF TRUTH
ú You’re utilizing indirection and avoiding maintaining
two copies of the number 10

Arrays (5/5)
§ Pitfall: An array in C does not know its own
length, & bounds not checked!
ú Consequence: We can accidentally access off the
end of an array.
ú Consequence: We must pass the array and its size
to a procedure which is going to traverse it.
§ Segmentation faults and bus errors:
ú These are VERY difficult to find; be careful!
ú You’ll learn how to debug these in lab…

Pointer Arithmetic
§ pointer + n
ú Adds n*sizeof(“whatever pointer is
pointing to”) to the memory address

§ pointer - n
ú Subtracts n*sizeof(“whatever pointer is
pointing to”) from the memory address

Pointers (1/4) …review…
§ Java and C pass parameters “by value”
ú procedure/function/method gets a copy of the
parameter, so changing the copy cannot change
the original
void addOne (int x) {
x = x + 1;
}
int y = 3;
addOne(y);

y is still = 3

Pointers (2/4) …review…
§ How to get a function to change a value?

void addOne (int *p) {
*p = *p + 1;
}
int y = 3;
addOne(&y);

y is now = 4

Pointers (3/4)
§ But what if you want to change a pointer?
ú What gets printed?

void IncrementPtr(int *p)
{ p = p + 1; }

int A[3] = {50, 60, 70};
int *q = A;
IncrementPtr(q);
printf("*q = %d\n", *q);

Output: *q = 50
(q still points to A[0]: only the local copy p was incremented)

Pointers (4/4)
§ Idea! Pass a pointer to a pointer!
ú Declared as **h
ú Now what gets printed?

void IncrementPtr(int **h)
{ *h = *h + 1; }

int A[3] = {50, 60, 70};
int *q = A;
IncrementPtr(&q);
printf("*q = %d\n", *q);

Output: *q = 60
(q itself was changed through h, so it now points to A[1])

map (actually mutate_map easier)
#include <stdio.h>

int x10(int), x2(int);
void mutate_map(int [], int n, int(*)(int));
void print_array(int [], int n);

int x2 (int n) { return 2*n; }
int x10(int n) { return 10*n; }

void mutate_map(int A[], int n, int(*fp)(int)) {
    for (int i = 0; i < n; i++)
        A[i] = (*fp)(A[i]);
}

void print_array(int A[], int n) {
    for (int i = 0; i < n; i++)
        printf("%d ",A[i]);
    printf("\n");
}

int main(void)
{
    int A[] = {3,1,4}, n = 3;
    print_array(A, n);
    mutate_map (A, n, &x2);
    print_array(A, n);
    mutate_map (A, n, &x10);
    print_array(A, n);
}

Output:
% ./map
3 1 4
6 2 8
60 20 80

Dynamic Memory Allocation (1/4)
§ C has operator sizeof() which gives size in bytes (of
type or variable)
§ Assuming the size of objects can be misleading and is bad
style, so use sizeof(type)
ú Many years ago an int was 16 bits, and programs were written with
this assumption.
ú What is the size of integers now?
§ “sizeof” knows the size of arrays:
int ar[3]; // Or: int ar[] = {54, 47, 99}
sizeof(ar) → 12 (with 4-byte ints)
ú …as well for arrays whose size is determined at run-time:
int n = 3;
int ar[n]; // Or: int ar[fun_that_returns_3()];
sizeof(ar) → 12 (with 4-byte ints)

Dynamic Memory Allocation (2/4)
§ To allocate room for something new to point to, use
malloc() (with the help of a typecast and
sizeof):

ptr = (int *) malloc (sizeof(int));


ú Now, ptr points to a space somewhere in memory of
size (sizeof(int)) in bytes.
ú (int *) simply tells the compiler what will go into that
space (called a typecast).
§ malloc is almost never used for 1 var
§ ptr = (int *) malloc (n*sizeof(int));
ú This allocates an array of n integers.

Dynamic Memory Allocation (3/4)
§ Once malloc() is called, the memory
location contains garbage, so don’t use it
until you’ve set its value.
§ After dynamically allocating space, we
must dynamically free it:
ú free(ptr);
§ Use this command to clean up.
ú Even though the program frees all memory on exit
(or when main returns), don’t be lazy!
ú You never know when your main will get
transformed into a subroutine!

Dynamic Memory Allocation (4/4)
§ The following two things will cause your program to
crash or behave strangely later on, and cause VERY
VERY hard to figure out bugs:
ú free()ing the same piece of memory twice
ú calling free() on something you didn’t get back from
malloc()
§ The runtime does not check for these mistakes
ú Memory allocation is so performance-critical that there
just isn’t time to do this
ú The usual result is that you corrupt the memory allocator’s
internal structure
ú You won’t find out until much later on, in a totally
unrelated part of your code!

Managing the Heap: realloc(p, size)
§ Resize a previously allocated block at p to a new size
§ If p is NULL, then realloc behaves like malloc
§ If size is 0, then realloc traditionally behaves like free, deallocating the
block from the heap (the C standard leaves this case implementation-defined, so prefer free(p))
§ Returns new address of the memory block; NOTE: it is likely to have
moved!
int *ip;
ip = (int *) malloc(10*sizeof(int));
/* always check for ip == NULL */
… … …
ip = (int *) realloc(ip,20*sizeof(int));
/* always check NULL, contents of first 10
elements retained */
… … …
realloc(ip,0); /* identical to free(ip) */

Arrays not implemented as you’d think
void foo() {
    int *p, *q, x;
    int a[4];
    p = (int *) malloc (sizeof(int));
    q = &x;

    *p = 1; // p[0] would also work here
    printf("*p:%u, p:%u, &p:%u\n", *p, p, &p);
    *q = 2; // q[0] would also work here
    printf("*q:%u, q:%u, &q:%u\n", *q, q, &q);
    *a = 3; // a[0] would also work here
    printf("*a:%u, a:%u, &a:%u\n", *a, a, &a);
}

Memory layout in the example (address: contents):
12: p = 40
16: q = 20
20: x = 2
24: a[0] = 3 (a starts at 24)
40: unnamed-malloc-space = 1

Output:
*p:1, p:40, &p:12
*q:2, q:20, &q:16
*a:3, a:24, &a:24

K&R: “An array name is not a variable”

Mini-summary
§ Pointers and arrays are virtually the same
§ C knows how to increment pointers
§ C is an efficient language, with little protection
ú Array bounds not checked
ú Variables not automatically initialized
§ Use handles to change pointers
§ Dynamically allocated heap memory must be manually
deallocated in C.
ú Use malloc() and free() to allocate and deallocate
memory from heap.
§ (Beware) The cost of efficiency is more overhead for the
programmer.
ú “C gives you a lot of extra rope, don’t hang yourself with it!”

Linked List Example
§ Let’s look at an example of using structures,
pointers, malloc(), and free() to implement a
linked list of strings.
struct Node {
char *value;
struct Node *next;
};
typedef struct Node *List;

/* Create a new (empty) list */
List ListNew(void)
{ return NULL; }

Linked List Example
/* add a string to an existing list */
List list_add(List list, char *string)
{
    struct Node *node =
        (struct Node*) malloc(sizeof(struct Node));
    node->value =
        (char*) malloc(strlen(string) + 1);
    strcpy(node->value, string);
    node->next = list;
    return node;
}

[Figure: after list_add(list, "abc"), node->value points to a new heap copy of "abc" and node->next points to the old list, which ends in NULL]


Don’t forget the globals!
§ What is stored?
ú Structure declaration does not allocate memory
ú Variable declaration does allocate memory
§ So far we have talked about several different ways to allocate
memory for data:
ú Declaration of a local variable
int i; struct Node list; char *string; int ar[n];
ú “Dynamic” allocation at runtime by calling allocation function (alloc).
ptr = (struct Node *) malloc(sizeof(struct Node) * n);
§ One more possibility exists…
ú Data declared outside of any procedure
(i.e., before main).
ú Similar to #1 above, but has “global” scope.
int myGlobal;
main() {
}

C Memory Management
§ C has 3 pools of memory
ú Static storage: global variable storage, basically
permanent, entire program run
ú The Stack: local variable storage, parameters, return
address (location of “activation records” in Java or “stack
frame” in C)
ú The Heap (dynamic malloc storage): data lives until
deallocated by programmer
§ C requires knowing where objects are in memory,
otherwise things don’t work as expected
ú Java hides location of objects

Normal C Memory Management
§ A program’s address space contains 4 regions:
ú stack: local variables, grows downward
ú heap: space requested for pointers via malloc(); resizes dynamically, grows upward
ú static data: variables declared outside main, does not grow or shrink
ú code: loaded when program starts, does not change
[Diagram: address space from ~FFFF FFFFhex at the top to ~0hex at the bottom, holding stack, heap, static data, and code in that order]
§ For now, OS somehow prevents accesses between stack and heap (gray hash lines). Wait for virtual memory.

Where are variables allocated?
§ If declared outside a procedure
(global), allocated in “static” storage
§ If declared inside procedure (local),
allocated on the “stack”
and freed when procedure returns.
ú NB: main() is a procedure

int myGlobal;        /* global: static storage */
int main(void) {
    int myTemp;      /* local: on the stack */
}
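The three storage locations can be seen side by side in one small function. This is only a sketch: call_count and count_call are hypothetical names chosen for illustration.

```c
#include <stdlib.h>

int call_count = 0;           /* static storage: lives for the whole run */

/* Returns a heap-allocated int holding how many times we were called. */
int *count_call(void)
{
    int local = 1;            /* stack: gone when count_call returns */
    call_count += local;

    int *result = malloc(sizeof(int));   /* heap: lives until free() */
    if (result != NULL)
        *result = call_count;
    return result;            /* fine: points into the heap, not the stack */
}
```

The caller is responsible for calling free() on each returned pointer; the global keeps its value between calls, while local is recreated on every call.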

The Stack
§ Stack frame includes:
ú Return “instruction” address
ú Parameters
ú Space for other local variables
§ Stack frames are contiguous blocks of memory; the stack pointer (SP) tells where the top stack frame is
§ When a procedure ends, its stack frame is tossed off the stack; this frees memory for future stack frames
[Diagram: a column of stack frames with SP pointing at the top frame]

Stack
§ Last In, First Out (LIFO) data structure
main ()
{ a(0);
}
void a (int m)
{ b(1);
}
void b (int n)
{ c(2);
}
void c (int o)
{ d(3);
}
void d (int p)
{
}
[Diagram: the stack grows down; the stack pointer moves down one frame with each call (main, a, b, c, d) and back up as each call returns]
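The LIFO behavior can be made visible by logging when each call starts and finishes. In this sketch the trace buffer and the note helper are additions for illustration; an uppercase letter marks a frame being pushed and a lowercase letter marks it being popped.

```c
#include <string.h>

char trace[32];               /* log of pushes (uppercase) and pops (lowercase) */

/* Append one character to the trace, if there is room. */
void note(char ch)
{
    size_t n = strlen(trace);
    if (n + 1 < sizeof trace) {
        trace[n] = ch;
        trace[n + 1] = '\0';
    }
}

/* Defined innermost-first so each function is declared before it is called. */
void d(int p) { (void)p; note('D'); note('d'); }
void c(int o) { (void)o; note('C'); d(3); note('c'); }
void b(int n) { (void)n; note('B'); c(2); note('b'); }
void a(int m) { (void)m; note('A'); b(1); note('a'); }
```

After a(0) returns, trace holds the calls in order followed by the returns in reverse order: the last frame pushed is the first one popped.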

The Heap (Dynamic memory)
§ Large pool of memory,
not allocated in contiguous order
ú back-to-back requests for heap memory could result in
blocks very far apart
ú where Java’s new command allocates memory
§ In C, specify number of bytes of memory explicitly to
allocate item
int *ptr;
ptr = (int *) malloc(sizeof(int));
/* malloc returns type (void *); the cast to the right
type is optional in C but required in C++ */
ú malloc(): Allocates raw, uninitialized memory from
heap

Memory Management
§ How do we manage memory?
§ Code and static storage are easy:
ú they never grow or shrink
§ Stack space is also easy:
ú stack frames are created and destroyed in last-in,
first-out (LIFO) order
§ Managing the heap is tricky:
ú memory can be allocated / deallocated at any
time

Heap Management Requirements
§ Want malloc() and free() to run
quickly
§ Want minimal memory overhead
§ Want to avoid fragmentation* –
when most of our free memory is in many
small chunks
ú In this case, we might have many free bytes but not
be able to satisfy a large request since the free
bytes are not contiguous in memory.

* This is technically called external fragmentation


Heap Management
§ An example
ú Request R1 for 100 bytes
ú Request R2 for 1 byte
ú Memory from R1 is freed
ú Request R3 for 50 bytes
[Diagram: R1 (100 bytes) is followed in memory by R2 (1 byte). Once R1 is freed, R3 could go either in the hole R1 left behind or after R2.]

K&R Malloc/Free Implementation
§ From Section 8.7 of K&R
ú Code in the book uses some C language features we
haven’t discussed and is written in a very terse style; don’t
worry if you can’t decipher the code
§ Each block of memory is preceded by a header that
has two fields:
size of the block and
a pointer to the next block
§ All free blocks are kept in a circular linked list; the
pointer field is unused in an allocated block

K&R Implementation
§ malloc() searches the free list for a block
that is big enough. If none is found, more
memory is requested from the operating
system. If what it gets can’t satisfy the
request, it fails.
§ free() checks if the blocks adjacent to
the freed block are also free
ú If so, adjacent free blocks are merged (coalesced)
into a single, larger free block
ú Otherwise, freed block is just added to the free list
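The coalescing step can be sketched on a simplified model. This is not the K&R code: instead of headers in a circular linked list, the free list here is assumed to be a plain array of (start, size) ranges sorted by address, and struct range and free_range are names made up for illustration.

```c
#include <stddef.h>

struct range { size_t start, size; };   /* one free block: [start, start + size) */

/*
 * Insert a freed block into free_list (sorted by start, *count entries,
 * with room for at least one more) and merge it with adjacent free neighbors.
 */
void free_range(struct range *free_list, int *count, struct range blk)
{
    int i = 0;
    while (i < *count && free_list[i].start < blk.start)
        i++;                                   /* find the insertion point */

    /* Merge with the upper neighbor if blk ends exactly where it begins. */
    if (i < *count && blk.start + blk.size == free_list[i].start) {
        free_list[i].start = blk.start;
        free_list[i].size += blk.size;
    } else {
        for (int j = *count; j > i; j--)       /* shift entries to make room */
            free_list[j] = free_list[j - 1];
        free_list[i] = blk;
        (*count)++;
    }

    /* Merge with the lower neighbor if it ends exactly where entry i begins. */
    if (i > 0 &&
        free_list[i - 1].start + free_list[i - 1].size == free_list[i].start) {
        free_list[i - 1].size += free_list[i].size;
        for (int j = i; j < *count - 1; j++)   /* close the gap */
            free_list[j] = free_list[j + 1];
        (*count)--;
    }
}
```

Freeing a block that sits exactly between two free blocks collapses all three into one entry, which is what keeps large requests satisfiable.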

Choosing a block in malloc()
§ If there are multiple free blocks of memory
that are big enough for some request, how
do we choose which one to use?
ú best-fit: choose the smallest block that is big
enough for the request
ú first-fit: choose the first block we see that is big
enough
ú next-fit: like first-fit but remember where we
finished searching and resume searching from
there
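The three policies differ only in where the search starts and when it stops. A minimal sketch, assuming the free list is given as an array of block sizes in list order; choose_block, enum policy, and the rover parameter are hypothetical names, and the sketch only picks a block without splitting or removing it.

```c
#include <stddef.h>

enum policy { FIRST_FIT, NEXT_FIT, BEST_FIT };

/*
 * Pick a free block of at least `want` bytes.
 * sizes[0..n-1] are the free-block sizes in list order; *rover is where the
 * previous next-fit search stopped. Returns the chosen index, or -1 if no
 * block is big enough.
 */
int choose_block(const size_t *sizes, int n, size_t want,
                 enum policy how, int *rover)
{
    int best = -1;
    int start = (how == NEXT_FIT) ? *rover : 0;

    for (int k = 0; k < n; k++) {
        int i = (start + k) % n;          /* wrap around like a circular list */
        if (sizes[i] < want)
            continue;                     /* too small, keep looking */
        if (how != BEST_FIT) {            /* first-fit and next-fit stop here */
            if (how == NEXT_FIT)
                *rover = (i + 1) % n;     /* resume after this block next time */
            return i;
        }
        if (best < 0 || sizes[i] < sizes[best])
            best = i;                     /* smallest block that still fits */
    }
    return best;
}
```

First-fit and next-fit return as soon as they find a fit, so they are fast; best-fit must scan the whole list but leaves the larger blocks intact.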

And in conclusion…
§ C has 3 pools of memory
ú Static storage: global variable storage, basically
permanent, entire program run
ú The Stack: local variable storage, parameters, return
address
ú The Heap (dynamic storage): malloc() grabs space from
here, free() returns it.
§ malloc() handles free space with a free list
§ Three ways to find free space when given a request:
ú First fit (find first one that’s big enough)
ú Next fit (same as first, but remembers where left off)
ú Best fit (finds most “snug” free space)

Pointers in C
§ Why use pointers?
ú If we want to pass a huge struct or array, it’s easier
/ faster / etc to pass a pointer than the whole thing.
ú In general, pointers allow cleaner, more compact
code.
§ So what are the drawbacks?
ú Pointers are probably the single largest source of
bugs in software, so be careful anytime you deal
with them.
ú Dangling reference (use ptr before malloc or after free)
ú Memory leaks (tardy free, lose the ptr)

Writing off the end of arrays...
int *foo = (int *) malloc(sizeof(int) * 100);
int i;
....
for(i = 0; i <= 100; ++i) { /* bug: writes foo[100], one past the end */
foo[i] = 0;
}
§ Corrupts other parts of the program...
ú Including internal C data
§ May cause crashes later
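For contrast, a corrected version of the loop. The helper name zeroed_ints is hypothetical; the only essential changes are the loop bound and the check on malloc's result.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate n ints and zero them without overrunning. */
int *zeroed_ints(size_t n)
{
    int *foo = malloc(sizeof(int) * n);
    if (foo == NULL)
        return NULL;                 /* allocation failed */
    for (size_t i = 0; i < n; ++i)   /* i < n: the last valid index is n - 1 */
        foo[i] = 0;
    return foo;                      /* caller must free() */
}
```

Deriving the loop bound from the same n that was passed to malloc keeps the two from drifting apart when the size changes.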

Returning Pointers into the Stack
§ Pointers in C allow access to deallocated memory,
leading to hard-to-find bugs!
int *ptr () {
    int y;
    y = 3;
    return &y;
}
main () {
    int *stackAddr, content;
    stackAddr = ptr();
    content = *stackAddr;
    printf("%d", content); /* 3 */
    content = *stackAddr;
    printf("%d", content); /*13451514 */
}
[Diagram: three stack snapshots. In main, SP is at main's frame; during ptr(), y == 3 lives in ptr's frame below main; during printf(), the same slot is reused, so y == ?]
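Two common ways to avoid returning the address of a local are sketched here. The names ptr_heap and ptr_out are hypothetical; which variant fits depends on who should own the storage.

```c
#include <stdlib.h>

/* Variant 1: the value lives on the heap, so it outlives the call. */
int *ptr_heap(void)
{
    int *y = malloc(sizeof *y);
    if (y != NULL)
        *y = 3;
    return y;            /* caller must free() */
}

/* Variant 2: the caller owns the storage and passes its address in. */
void ptr_out(int *y)
{
    *y = 3;
}
```

Variant 2 needs no free() at all, because the int lives in the caller's own frame, which is still on the stack when the callee writes to it.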

Use After Free
§ When you keep using a pointer after it has been freed...
struct foo *f;
....
f = malloc(sizeof(struct foo));
....
free(f);
....
bar(f->a); /* use after free */
§ Reads after the free may be corrupted
ú As something else takes over that memory. Your
program will probably get wrong info!
§ Writes corrupt other data!
ú Uh oh... Your program crashes later!
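One defensive habit, shown here as a sketch rather than a complete fix, is to clear the pointer at the moment it is freed. foo_release is a hypothetical helper, and struct foo is given a single int field only so the example compiles.

```c
#include <stdlib.h>

struct foo { int a; };   /* assumed layout, for illustration only */

/* Hypothetical helper: free *fp and clear the caller's pointer. */
void foo_release(struct foo **fp)
{
    free(*fp);           /* free(NULL) is a no-op, so a second call is harmless */
    *fp = NULL;          /* a later f->a typically crashes right away
                            instead of silently reading stale data */
}
```

This only protects the one pointer variable passed in; any other copies of the same address remain dangling.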

Forgetting realloc Can Move Data...
§ When you realloc, it can copy the data to a new location...
ú struct foo *f = malloc(sizeof(struct foo) * 10);
...
struct foo *g = f;
....
f = realloc(f, sizeof(struct foo) * 20);
§ Result is g may now point to invalid memory
ú So reads may be corrupted and writes may corrupt other pieces of
memory
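A sketch of a safer realloc pattern. foo_grow is a hypothetical helper and struct foo is given one int field so the example compiles; the key points are to hold the result in a temporary and to re-derive any aliases from the updated pointer afterward.

```c
#include <stdlib.h>

struct foo { int a; };   /* assumed layout, for illustration only */

/*
 * Hypothetical helper: grow *arr to new_count elements.
 * Returns 0 on success, -1 on failure (in which case *arr is still valid).
 */
int foo_grow(struct foo **arr, size_t new_count)
{
    struct foo *tmp = realloc(*arr, sizeof(struct foo) * new_count);
    if (tmp == NULL)
        return -1;       /* old block is untouched; do not lose it */
    *arr = tmp;          /* block may have moved: older copies are stale */
    return 0;
}
```

Assigning realloc's result straight back to the only pointer would leak the old block on failure, which is why the temporary is used.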

Freeing the Wrong Stuff...
§ If you free() something never malloc()'ed
ú Including things like
struct foo *f = malloc(sizeof(struct foo) * 10);
...
f++;
...
free(f); /* f no longer points to the start of the block */
§ malloc or free may get confused...
ú Corrupt its internal storage or erase other data...
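One way to avoid this is to leave the pointer returned by malloc untouched and walk the block with a second pointer. A minimal sketch; sum_squares is a hypothetical function chosen only to give the loop something to do.

```c
#include <stdlib.h>

/* Fill n ints with squares and sum them, walking with a separate cursor. */
int sum_squares(size_t n)
{
    int *base = malloc(n * sizeof *base);
    if (base == NULL)
        return -1;                       /* allocation failed */
    for (size_t i = 0; i < n; i++)
        base[i] = (int)(i * i);

    int total = 0;
    for (int *cur = base; cur < base + n; cur++)   /* advance cur, never base */
        total += *cur;

    free(base);                          /* exactly the pointer malloc returned */
    return total;
}
```

Since base is never modified, the address handed to free() is guaranteed to be the one malloc returned.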

Double-Free...
§ E.g.,
struct foo *f = (struct foo *)
malloc(sizeof(struct foo) * 10);
...
free(f);
...
free(f);
§ May cause either a use after free (because
something else called malloc() and got that data)
or corrupt malloc’s data (because you are no
longer freeing a pointer returned by malloc)

Losing the initial pointer! (Memory Leak)
§ What is wrong with this code?

int *plk = NULL;

void genPLK() {
    plk = malloc(2 * sizeof(int));
    … … …
    plk++;
}
§ This MAY be a memory leak if we don't keep a copy of the original malloc'ed pointer somewhere else

Valgrind to the rescue…
§ Valgrind slows down your program by an order of
magnitude, but...
ú It adds a ton of checks designed to catch most (but not
all) memory errors
- Memory leaks
- Misuse of free
- Writing over the end of arrays
§ Tools like Valgrind are absolutely essential for
debugging C code

And In Conclusion, …
§ C has three main memory segments in
which to allocate data:
ú Static Data: Variables outside functions
ú Stack: Variables local to function
ú Heap: Objects explicitly malloc-ed/free-d.
§ Heap data is biggest source of bugs in
C code
