C Programming 1
The Basics
C retains the basic philosophy that
programmers know what they are
doing; it only requires that they state
their intentions explicitly.
B Kernighan & D Ritchie, The C Programming Language, 1988
Three simple rules
If you violate any of the C rules, you end up with undefined behaviour.
● The stack.
○ Read-write, created during function-call setup, destroyed during function-call teardown.
○ Only possible to store data where storage size is known at compile-time.
● The heap.
○ Read-write, allocated from the operating system by a syscall, returned to the operating system by a syscall.
○ Can store data where storage size is unknown at compile-time.
● The data segment (or other in-process section).
○ Read-only, allocated within the process, stores literals and constants and so on.
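To make the three regions concrete, here is a minimal sketch that touches each one. The function name copy_greeting is hypothetical, and exactly where a compiler places each object is an implementation detail; this only illustrates the usual arrangement described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies a string literal through a fixed-size stack
 * buffer into heap memory. The caller must free() the result. */
char *copy_greeting(void)
{
    const char *literal = "hello";  /* the characters live in read-only memory */
    char on_stack[6];               /* size known at compile-time: stack */
    char *on_heap;

    assert(sizeof on_stack > strlen(literal));  /* room for 5 chars + '\0' */
    strcpy(on_stack, literal);

    on_heap = malloc(strlen(on_stack) + 1);     /* size computed at run-time: heap */
    if (on_heap == NULL)
        return NULL;
    strcpy(on_heap, on_stack);

    return on_heap;  /* on_stack is destroyed on return; the heap copy survives */
}
```

A caller would use it as char *s = copy_greeting(); and later call free(s).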
Memory (“all data has a size”)
By definition, sizeof(char) in C is 1. In practice, that is one 8-bit byte.
If you have a variable v, it must have some address (i.e. location) in memory.
If you have a memory address m, and you know the type of the value at that memory address, then you can read the value stored there by dereferencing it: *m.
A variable that stores a memory address is called a pointer (because it can “point” to other data which is at
the stored memory address).
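The relationship between a variable, its address, and a pointer can be sketched as follows. The helper names set_to_seven and read_back are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes through a pointer to change the caller's variable. */
void set_to_seven(int *m)
{
    assert(m != NULL);  /* dereferencing a null pointer is undefined behaviour */
    *m = 7;             /* m holds an address; *m is the int stored there */
}

/* Hypothetical helper: reads v back through its own address. */
int read_back(int v)
{
    int *p = &v;        /* &v is the address of v; p is a pointer-to-int */
    return *p;          /* the type (int) tells us how many bytes to read */
}
```

Calling set_to_seven(&x) on an int x changes x itself, because the function receives the location of x rather than a copy of its value.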
● You cannot use indexing or pointer arithmetic on a void pointer (void *), because there is no way of
knowing the size of the data that it points to.
● You cannot dereference it, because there is no way of knowing how to interpret the memory that it
points to.
However, you can cast to another pointer type and then do either of the above with it.
Example: *((int*)p) casts to a pointer-to-int, and then dereferences that.
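A small sketch of the cast-then-use pattern, assuming p is a void * that really does point at int data. The helper int_at is hypothetical, and it cannot verify that assumption itself; casting to the wrong type leads to undefined behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: p is assumed to point at an array of at least
 * (index + 1) ints. The function has no way to check this. */
int int_at(const void *p, size_t index)
{
    const int *ip;

    assert(p != NULL);
    ip = (const int *)p;  /* give the untyped address a type */
    return ip[index];     /* indexing works now: each step is sizeof(int) bytes */
}
```

With int xs[3] = { 10, 20, 30 }; and void *p = xs;, both *((int*)p) and int_at(p, 0) read the first element.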
“Zero is special”
A zero is special because:
● When used in a conditional, all non-zero values are “true”, and zero is false.
● Zero is conventionally placed at the end of things with an unknown or dynamic size.
○ When used in this way, it means: “stop reading the memory now, the data has ended”.
○ A zero, used in this way, is called a terminator.
● (…where have you seen this “zero is special” idea before?)
Zero is so special that all built-in types have their own version of zero!
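As an illustration, the sketch below uses zero both as a terminator and as "false", and lists the zero of a few built-in types. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the ints that come before the 0 terminator.
 * Assumes the caller's array really does end with a 0. */
size_t count_until_zero(const int *xs)
{
    size_t n = 0;

    assert(xs != NULL);  /* NULL is the pointer version of zero */
    while (xs[n])        /* any non-zero value is "true"; 0 stops the loop */
        n++;
    return n;
}

/* Each built-in type has its own zero, and each one is "false". */
int all_zeros_are_false(void)
{
    int    i = 0;
    double d = 0.0;
    char   c = '\0';
    int   *p = NULL;

    return !i && !d && !c && !p;
}
```

The drawback of a terminator is that zero can no longer appear as ordinary data inside the sequence.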
Just like in assembly, string constants are stored in the data segment. This part of memory is read-only.
(You can try to write to it, but doing so is undefined behaviour.)
Example: char *r = "testing"; will create space on the stack for a pointer-to-character named r, which
holds the address of a place in memory where 't', 'e', …, 'g', and '\0' are sequentially laid out.
Memory (arrays)
A pointer can point to the first character of a string, and we can assume that all subsequent values—until
we see a '\0'—are part of that string. Instead of copying around a whole string, we just copy the pointer
to the start of it!
● In fact, this is exactly what a C-style string is: just a bunch of memory with a terminator at the end
of it.
● We can even initialize a char array implicitly with a string, e.g. char y[] = "testing". At
compile-time, the compiler will allocate 8 bytes to the y array and initialize it as if you wrote
char y[] = { 't', 'e', 's', 't', 'i', 'n', 'g', '\0' } instead.
● If passing a non-char array, it is much more common to specify the size as an additional parameter.
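The two conventions above (terminator for strings, explicit size for other arrays) can be sketched like this. my_strlen and sum_ints are illustrative names; the standard library's strlen already does what my_strlen does.

```c
#include <assert.h>
#include <stddef.h>

/* Length of a C-style string: walk forward until the '\0' terminator. */
size_t my_strlen(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != '\0')
        n++;
    return n;
}

/* A non-char array has no terminator, so the caller passes the size. */
long sum_ints(const int *xs, size_t count)
{
    long total = 0;
    size_t i;

    assert(xs != NULL || count == 0);
    for (i = 0; i < count; i++)
        total += xs[i];
    return total;
}
```

For char y[] = "testing";, sizeof y is 8 (it counts the terminator) while my_strlen(y) is 7 (it does not). For an int array xs, the element count can be computed in the declaring scope as sizeof xs / sizeof xs[0].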
Use of pointers (2 / 5)
Pointers (i.e. variables that hold a memory location) are used everywhere!
We can ask the operating system for memory of a certain size, and we can store the address of the start
of that memory in a pointer. We can then give that memory back to the operating system when we’re
done with it. We request memory with functions from <stdlib.h>.
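A minimal sketch of requesting and returning memory with malloc and free from <stdlib.h>. The helper make_range is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap array holding 0, 1, ..., count-1,
 * or NULL if the allocation fails. The caller must free() the result. */
int *make_range(size_t count)
{
    int *xs;
    size_t i;

    assert(count > 0);
    xs = malloc(count * sizeof *xs);  /* ask for count ints' worth of bytes */
    if (xs == NULL)
        return NULL;                  /* the request can fail: always check */
    for (i = 0; i < count; i++)
        xs[i] = (int)i;
    return xs;
}
```

A caller writes int *xs = make_range(5); and, once finished, free(xs);. Using xs after the call to free is undefined behaviour.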
Our pointer can store the address of a function, and we can then call the function using the pointer 🤯.
● A pointer to function looks something like: int (*f)(char, char). This declares f as a pointer to
function taking parameters (char, char) and returning an int.
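A short sketch of storing a function's address and calling through it. The functions larger, smaller, and apply are made up for illustration; only the pointer type int (*)(char, char) comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Two functions whose type matches int (*)(char, char). */
int larger(char a, char b)  { return a > b ? a : b; }
int smaller(char a, char b) { return a < b ? a : b; }

/* Hypothetical helper: calls whatever function f points to. */
int apply(int (*f)(char, char), char a, char b)
{
    assert(f != NULL);
    return f(a, b);  /* (*f)(a, b) is equivalent */
}
```

With int (*f)(char, char) = larger;, the call f('a', 'z') runs larger. Passing smaller to apply instead changes the behaviour without changing apply itself.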
Use of pointers (5 / 5)
Our pointer can store the memory address of another pointer! That lets us create multidimensional
arrays 🤯.
● You can write char **m to have a pointer to pointer to char—or, as I might call it, an array of strings.
● You can write int **xs to have a pointer to pointer to int, which can serve as a table/matrix.
● You must allocate memory for each dimension manually.
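Allocating each dimension manually might look like the sketch below. matrix_new and matrix_free are hypothetical names, and using one allocation per row is only one possible layout.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates a rows x cols table of zeroed ints.
 * Returns NULL on failure. Release with matrix_free(). */
int **matrix_new(size_t rows, size_t cols)
{
    int **xs;
    size_t r;

    assert(rows > 0 && cols > 0);
    xs = malloc(rows * sizeof *xs);         /* first dimension: row pointers */
    if (xs == NULL)
        return NULL;
    for (r = 0; r < rows; r++) {
        xs[r] = calloc(cols, sizeof **xs);  /* second dimension: zeroed ints */
        if (xs[r] == NULL) {
            while (r > 0)                   /* undo the rows allocated so far */
                free(xs[--r]);
            free(xs);
            return NULL;
        }
    }
    return xs;
}

/* Frees every row, then the array of row pointers. */
void matrix_free(int **xs, size_t rows)
{
    size_t r;

    if (xs == NULL)
        return;
    for (r = 0; r < rows; r++)
        free(xs[r]);
    free(xs);
}
```

After int **m = matrix_new(2, 3);, the element in row 1, column 2 is m[1][2]. Every allocation made in matrix_new has a matching free in matrix_free.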
Big-endian vs Little-endian
int v = 123456789;
123456789 (in decimal) = 0x075bcd15 (in hexadecimal). In bytes, that is: 07 5b cd 15.
Things to note:
● &v is …6c9578
● sizeof(v) is 4
● Memory addresses are in char-units (i.e. bytes)
● When we know the address and we know the size, we also know exactly where it ends.
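One way to see the byte order on your own machine is to copy the bytes of v out in memory order. This is a sketch with hypothetical helper names, and it assumes a 4-byte int as in the notes above.

```c
#include <assert.h>
#include <string.h>

/* Copies the bytes of v, in memory order, into out.
 * out must have room for sizeof(int) bytes. */
void int_bytes(int v, unsigned char *out)
{
    assert(out != NULL);
    memcpy(out, &v, sizeof v);
}

/* Returns 1 if the least significant byte is stored first (little-endian). */
int is_little_endian(void)
{
    int v = 1;
    unsigned char first;

    memcpy(&first, &v, 1);  /* look at the byte at the lowest address */
    return first == 1;
}
```

For v = 123456789, a little-endian machine stores the bytes as 15 cd 5b 07, and a big-endian machine stores them as 07 5b cd 15.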