The C Language
Currently the most commonly used language for embedded systems
"High-level assembly"
Very portable: compilers exist for virtually every processor
Easy-to-understand compilation
Produces efficient code
Fairly concise
C History
Developed between 1969 and 1973 along with Unix
Due mostly to Dennis Ritchie
Designed for systems programming
BCPL
Designed by Martin Richards (Cambridge) in 1967
Typeless
    Everything an n-bit integer (a machine word)
    Pointers (addresses) and integers identical
Memory is an undifferentiated array of words
Natural model for word-addressed machines
Local variables depend on frame-pointer-relative addressing: dynamically-sized automatic objects not permitted
Strings awkward
C History
Original machine (DEC PDP-11) was very small
C History
Many language features designed to reduce memory use
    Forward declarations required for everything
    Designed to work in one pass: must know everything
    No function nesting
Hello World in C
    #include <stdio.h>

    void main()
    {
        printf("Hello, world!\n");
    }

Preprocessor used to share information among source files
- Clumsy
+ Cheaply implemented
+ Very flexible
Hello World in C
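    #include <stdio.h>

    void main()
    {
        printf("Hello, world!\n");
    }

Program mostly a collection of functions
main function special: the entry point
void qualifier indicates function does not return anything
(Hosted standard C declares main as returning int; void main is an implementation-defined form that many embedded toolchains accept.)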
Euclid's algorithm in C
    int gcd(int m, int n)
    {
        int r;
        while ((r = m % n) != 0) {
            m = n;
            n = r;
        }
        return n;
    }
"New style" function declaration lists number and type of arguments
Originally only listed return type. Generated code did not care how many arguments were actually passed.
Arguments are call-by-value
Euclid's algorithm in C
    int gcd(int m, int n)
    {
        int r;
        while ((r = m % n) != 0) {
            m = n;
            n = r;
        }
        return n;
    }
Stack frame (figure): n, m, ret. addr., r, with the frame pointer and stack pointer marking the frame
Excess arguments simply ignored
Automatic variable
    Storage allocated on stack when function entered, released when it returns.
    All parameters, automatic variables accessed w.r.t. frame pointer.
    Extra storage needed while evaluating large expressions also placed on the stack
Euclid's algorithm in C
    int gcd(int m, int n)
    {
        int r;
        while ((r = m % n) != 0) {
            m = n;
            n = r;
        }
        return n;
    }
Expression: C's basic type of statement.
Arithmetic and logical
Assignment (=) returns a value, so can be used in expressions
% is remainder
!= is not equal
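To make the point about assignment inside an expression concrete, here is a small sketch that places the slide's gcd next to an equivalent version with the assignment pulled out of the loop condition. The name gcd_verbose is hypothetical, and both versions assume a nonzero n.

```c
/* gcd as on the slide: the assignment inside the while condition
   both stores the remainder in r and yields it for the != test. */
int gcd(int m, int n)
{
    int r;
    while ((r = m % n) != 0) {
        m = n;
        n = r;
    }
    return n;
}

/* Hypothetical equivalent with the assignment written as separate
   statements; note that m % n must now appear twice. */
int gcd_verbose(int m, int n)
{
    int r = m % n;
    while (r != 0) {
        m = n;
        n = r;
        r = m % n;
    }
    return n;
}
```

Both functions compute the same result; the compact form avoids repeating the remainder computation.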
Euclid's algorithm in C
    int gcd(int m, int n)
    {
        int r;
        while ((r = m % n) != 0) {
            m = n;
            n = r;
        }
        return n;
    }
while: high-level control-flow statement. Ultimately becomes a conditional branch. Supports structured programming.
return: each function returns a single value, usually an integer. Returned through a specific register by convention.
Very natural mapping from C into PDP-11 instructions.
Complex addressing modes make frame-pointer-relative accesses easy.
Another idiosyncrasy: registers were memory-mapped, so taking address of a variable in a register is straightforward.
Pieces of C
Types and Variables
    Definitions of data in memory
Expressions
    Arithmetic, logical, and assignment operators in an infix notation
Statements
    Sequences of conditional, iteration, and branching instructions
Functions
    Groups of statements and variables invoked recursively
C Types
Basic types: char, int, float, and double
Meant to match the processor's native types
Declaration syntax: string of specifiers followed by a declarator
Declarator's notation matches that in an expression
Access a symbol using its declarator and get the basic type back
C Type Examples
    int i;                  Integer
    int *j, k;              j: pointer to integer; k: int
    unsigned char *ch;      ch: pointer to unsigned char
    float f[10];            Array of 10 floats
    ...                     2-arg function
    int a[3][5][10];        Array of three arrays of five arrays of ten ints
    int *func1(float);      Function returning int *
    int (*func2)(void);     Pointer to function returning int
C Typedef
Type declarations recursive, complicated.
Name new types with typedef
Instead of
    int (*func2)(void)
use
    typedef int func2t(void);
    func2t *func2;
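As an illustration of the typedef above, the following sketch declares pointers of type func2t * and calls through them. The functions return_one, return_two, call_through, and pick_and_call are hypothetical names introduced only for this example.

```c
/* Function type from the slide: takes no arguments, returns int. */
typedef int func2t(void);

/* Hypothetical functions that match the func2t type. */
static int return_one(void) { return 1; }
static int return_two(void) { return 2; }

/* Calls whichever function the pointer refers to. */
int call_through(func2t *func2)
{
    return func2();   /* equivalent to (*func2)() */
}

/* Selects a function from a table of func2t pointers and calls it. */
int pick_and_call(int which)
{
    func2t *table[2] = { return_one, return_two };
    return call_through(table[which & 1]);
}
```

With the typedef, an array of function pointers reads as func2t *table[2] rather than int (*table[2])(void).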
C Structures
A struct is an object with named fields:
    struct {
        char *name;
        int x, y;
        int h, w;
    } box;
Accessed using "dot" notation:
    box.x = 5;
    box.y = 2;
Copyright 2001 Stephen A. Edwards All rights reserved
Struct bit-fields
Way to aggressively pack data in memory
    struct {
        unsigned int baud : 5;
        unsigned int div2 : 1;
        unsigned int use_external_clock : 1;
    } flags;
Compiler will pack these fields into words
Very implementation dependent: no guarantees of ordering, packing, etc.
Usually less efficient: reading a field requires masking and shifting
C Unions
Can store objects of different types at different times
    union {
        int ival;
        float fval;
        char *sval;
    };
Useful for arrays of dissimilar objects
Potentially very dangerous
Good example of C's philosophy
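One common way to tame the danger is to pair the union with a tag that records which member is currently valid. The sketch below is one possible arrangement, not something the slides prescribe; enum kind, struct value, value_is_int, and count_strings are hypothetical names.

```c
/* Hypothetical tag recording which union member currently holds data. */
enum kind { KIND_INT, KIND_FLOAT, KIND_STRING };

struct value {
    enum kind kind;
    union {
        int ival;
        float fval;
        char *sval;
    } u;
};

/* Returns 1 if v currently holds an int equal to n, 0 otherwise.
   The tag is checked before the member is read. */
int value_is_int(const struct value *v, int n)
{
    return v->kind == KIND_INT && v->u.ival == n;
}

/* Counts the string elements in an array of dissimilar values. */
int count_strings(const struct value *vals, int n)
{
    int i, count = 0;
    for (i = 0; i < n; i++)
        if (vals[i].kind == KIND_STRING)
            count++;
    return count;
}
```

The language does not enforce the tag; code that sets one member and reads another still compiles, which is exactly the hazard noted above.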
C Storage Classes
    #include <stdlib.h>

    int global_static;          /* Linker-visible. Allocated at fixed location. */
    static int file_static;     /* Visible within file. Allocated at fixed location. */

    void foo(int auto_param)
    {
        static int func_static; /* Visible within func. Allocated at fixed location. */
        int auto_i, auto_a[10];
        double *auto_d = malloc(sizeof(double)*5);
    }
C Storage Classes
    #include <stdlib.h>

    int global_static;
    static int file_static;

    void foo(int auto_param)
    {
        static int func_static;
        int auto_i, auto_a[10];                     /* Space allocated on stack by function. */
        double *auto_d = malloc(sizeof(double)*5);  /* Space allocated on heap by library routine. */
    }
malloc() and free() use complicated non-constant-time algorithms
Each block generally consumes two additional words of memory
    Pointer to next empty block
    Size of this block
Common sources of errors
    Using uninitialized memory
    Using freed memory
    Not allocating enough
    Neglecting to free disused blocks (memory leaks)
Each segment contiguous in memory (no holes)
Segments do not move once allocated
malloc()
    Find memory area large enough for segment
    Mark that memory is allocated
free()
    Mark the segment as unallocated
Free memory tracked with a linked list; each free block holds a Next pointer and a Size
Allocation policy: first-fit
First large-enough free block selected
Free block divided into two
Previous next pointer updated
Newly-allocated region begins with a size value
free(a)
    Appropriate position in free list identified
    Newly-freed region added to adjacent free regions
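The first-fit steps above can be sketched as a toy allocator over a fixed static pool. This is a simplified illustration under stated assumptions, not how any particular C library implements malloc(): the names ff_malloc, ff_free, ff_free_blocks, and POOL_UNITS are hypothetical, sizes are counted in header-sized units, the free list is kept sorted by address, and payload alignment is only that of the header.

```c
#include <stddef.h>

#define POOL_UNITS 1024   /* hypothetical pool size, in header-sized units */

typedef struct block {
    size_t size;          /* block length in units, header included */
    struct block *next;   /* next free block; meaningful only while free */
} block;

static block pool[POOL_UNITS];
static block *free_list;
static int pool_ready;

static void pool_init(void)
{
    pool[0].size = POOL_UNITS;      /* one big free block */
    pool[0].next = NULL;
    free_list = &pool[0];
    pool_ready = 1;
}

void *ff_malloc(size_t nbytes)
{
    size_t units;
    block **link, *b;

    if (!pool_ready)
        pool_init();
    if (nbytes == 0 || nbytes > (POOL_UNITS - 1) * sizeof(block))
        return NULL;
    units = (nbytes + sizeof(block) - 1) / sizeof(block) + 1; /* +1 for header */

    for (link = &free_list; (b = *link) != NULL; link = &b->next) {
        if (b->size < units)
            continue;                   /* too small: keep searching */
        if (b->size == units) {
            *link = b->next;            /* exact fit: unlink whole block */
        } else {
            block *rest = b + units;    /* divide the free block into two */
            rest->size = b->size - units;
            rest->next = b->next;
            *link = rest;               /* previous next pointer updated */
            b->size = units;            /* allocated region begins with its size */
        }
        return b + 1;                   /* payload follows the header */
    }
    return NULL;                        /* no free block is large enough */
}

void ff_free(void *p)
{
    block *b, *prev = NULL, *cur = free_list;

    if (p == NULL)
        return;
    b = (block *)p - 1;
    while (cur != NULL && cur < b) {    /* find position in address order */
        prev = cur;
        cur = cur->next;
    }
    if (cur != NULL && b + b->size == cur) {    /* merge with following block */
        b->size += cur->size;
        b->next = cur->next;
    } else {
        b->next = cur;
    }
    if (prev == NULL) {
        free_list = b;
    } else if (prev + prev->size == b) {        /* merge with preceding block */
        prev->size += b->size;
        prev->next = b->next;
    } else {
        prev->next = b;
    }
}

/* Number of blocks currently on the free list (for inspection). */
int ff_free_blocks(void)
{
    int n = 0;
    block *b;
    for (b = free_list; b != NULL; b = b->next)
        n++;
    return n;
}
```

Both operations walk the list, which is why their running time depends on how fragmented the pool has become.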
Memory Pools
An alternative: memory pools
Separate management policy for each pool
Stack-based pool: can only free whole pool at once
    Very cheap operation
    Good for build-once data structures (e.g., compilers)
Useful in object-oriented programs
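A stack-based pool can be sketched in a few lines: allocation just advances an index, and the only way to free is to reset the whole pool. The names arena, arena_init, arena_alloc, arena_reset, and ARENA_UNITS are hypothetical, and the alignment union is an assumption chosen to keep the sketch portable.

```c
#include <stddef.h>

#define ARENA_UNITS 256   /* hypothetical capacity, in aligned units */

/* Unit type chosen so every allocation is suitably aligned. */
typedef union {
    long double ld;
    void *p;
    long l;
} max_align;

typedef struct {
    max_align storage[ARENA_UNITS];
    size_t used;          /* number of units handed out so far */
} arena;

void arena_init(arena *a)
{
    a->used = 0;
}

/* Allocates n bytes by bumping the index; returns NULL when full. */
void *arena_alloc(arena *a, size_t n)
{
    size_t units = (n + sizeof(max_align) - 1) / sizeof(max_align);
    void *p;

    if (units > ARENA_UNITS - a->used)
        return NULL;
    p = &a->storage[a->used];
    a->used += units;
    return p;
}

/* Frees everything at once: the only deallocation this pool supports. */
void arena_reset(arena *a)
{
    a->used = 0;
}
```

Allocation and reset are both constant-time, in contrast to the list searches a general-purpose malloc() and free() may perform.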
Arrays
Array: sequence of identical objects in memory
[Image: Filippo Brunelleschi, Ospedale degli Innocenti, Firenze, Italy, 1421]
int a[10]; means space for ten integers
By itself, a is the address of the first integer
*a and a[0] mean the same thing
The address of a is not stored in memory: the compiler inserts code to compute it when it appears
Ritchie calls this interpretation the biggest conceptual jump from BCPL to C
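The identities above can be checked directly. In this small sketch the function names sum and array_identities_hold are hypothetical.

```c
#include <stddef.h>

/* Sums n ints. The parameter is a pointer even when the caller
   passes an array: the array name decays to &a[0]. */
int sum(const int *p, size_t n)
{
    int total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += *(p + i);      /* same as p[i] */
    return total;
}

/* Returns 1 if the array/pointer identities hold. */
int array_identities_hold(void)
{
    int a[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    return *a == a[0]                       /* *a and a[0] are the same */
        && a == &a[0]                       /* a is the address of the first int */
        && *(a + 3) == a[3]
        && sizeof a == 10 * sizeof(int)     /* space for ten integers */
        && sum(a, 10) == 55;
}
```

Note that sizeof a reports the size of the whole array, one of the few contexts where the array name is not converted to a pointer.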
Multidimensional Arrays
Array declarations read right-to-left
    int a[10][3][2];
"an array of ten arrays of three arrays of two ints"
In memory (figure): ten consecutive groups, each made of three pairs of ints, stored contiguously
Multidimensional Arrays
Passing a multidimensional array as an argument requires all but the first dimension
    int a[10][3][2];
    void examine(int a[][3][2]) { /* ... */ }
Address for an access such as a[i][j][k] is
    a + k + 2*(j + 3*i)
counted in int-sized elements from the start of the array
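The address formula can be checked against what the compiler actually computes. The sketch below measures the byte distance from the start of the array to a[i][j][k] and compares it with k + 2*(j + 3*i); the function names are hypothetical.

```c
#include <stddef.h>

/* Offset of a[i][j][k] from the start of the array, in ints,
   as computed by the compiler's own indexing. */
size_t offset_in_ints(int a[][3][2], int i, int j, int k)
{
    return (size_t)((char *)&a[i][j][k] - (char *)a) / sizeof(int);
}

/* Offset predicted by the formula for an int [..][3][2] array. */
size_t formula_offset(int i, int j, int k)
{
    return (size_t)(k + 2 * (j + 3 * i));
}

/* Returns 1 if the formula agrees for every element of int [10][3][2]. */
int formula_matches(void)
{
    static int a[10][3][2];
    int i, j, k;

    for (i = 0; i < 10; i++)
        for (j = 0; j < 3; j++)
            for (k = 0; k < 2; k++)
                if (offset_in_ints(a, i, j, k) != formula_offset(i, j, k))
                    return 0;
    return 1;
}
```

The first dimension (10) never appears in the formula, which is why the parameter declaration may omit it.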
Multidimensional Arrays
Use arrays of pointers for variable-sized multidimensional arrays
You need to allocate space for and initialize the arrays of pointers
    int ***a;
a[3][5][4] expands to *(*(*(a+3)+5)+4)
[Figure: int ***a points to an array of int **, whose elements point to arrays of int *, whose elements point to the ints holding the values]
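Allocating and initializing the arrays of pointers might look like the following sketch. The names xmalloc, alloc3, and free3 are hypothetical; error handling is reduced to abort() for brevity, and all dimensions are assumed to be nonzero.

```c
#include <stdlib.h>

/* Hypothetical helper: malloc that aborts on failure. */
static void *xmalloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL)
        abort();
    return p;
}

/* Builds a d0 x d1 x d2 array of ints out of arrays of pointers,
   with every element set to zero. */
int ***alloc3(size_t d0, size_t d1, size_t d2)
{
    size_t i, j, k;
    int ***a = xmalloc(d0 * sizeof *a);          /* array of int ** */

    for (i = 0; i < d0; i++) {
        a[i] = xmalloc(d1 * sizeof *a[i]);       /* array of int * */
        for (j = 0; j < d1; j++) {
            a[i][j] = xmalloc(d2 * sizeof *a[i][j]);   /* array of int */
            for (k = 0; k < d2; k++)
                a[i][j][k] = 0;
        }
    }
    return a;
}

/* Releases everything alloc3 allocated, innermost arrays first. */
void free3(int ***a, size_t d0, size_t d1)
{
    size_t i, j;

    for (i = 0; i < d0; i++) {
        for (j = 0; j < d1; j++)
            free(a[i][j]);
        free(a[i]);
    }
    free(a);
}
```

Unlike int a[10][3][2], each access here follows three pointers, and the innermost rows need not be contiguous or even the same length.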
C Expressions
Traditional mathematical expressions
    y = a*x*x + b*x + c;
Very rich set of expressions
Able to deal with arithmetic and bit manipulation
C Expression Classes
    arithmetic:             + - * / %
    comparison:             == != < <= > >=
    bitwise logical:        & | ^ ~
    shifting:               << >>
    lazy logical:           && || !
    conditional:            ? :
    assignment:             = += -=
    increment/decrement:    ++ --
    sequencing:             ,
    pointer:                * -> & []
Bitwise operators
and: &    or: |    xor: ^    not: ~    left shift: <<    right shift: >>
Useful for bit-field manipulations
    #define MASK 0x040
    if (a & MASK) { }       /* Check bits */
    c |= MASK;              /* Set bits */
    c &= ~MASK;             /* Clear bits */
    d = (a & MASK) >> 4;    /* Select field */
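The same check, set, clear, and select idioms can be wrapped in small functions. The layout below (a 5-bit baud field in bits 0 to 4, a div2 bit at bit 5, an external-clock bit at bit 6) is a hypothetical register format chosen for illustration, and all names are assumptions.

```c
/* Hypothetical register layout, for illustration only. */
#define BAUD_MASK    0x1Fu   /* bits 0-4: baud field */
#define DIV2_BIT     0x20u   /* bit 5 */
#define EXTCLK_BIT   0x40u   /* bit 6 */
#define EXTCLK_SHIFT 6

/* Check bits: nonzero if any bit in mask is set. */
int test_bits(unsigned flags, unsigned mask)
{
    return (flags & mask) != 0;
}

/* Set bits. */
unsigned set_bits(unsigned flags, unsigned mask)
{
    return flags | mask;
}

/* Clear bits. */
unsigned clear_bits(unsigned flags, unsigned mask)
{
    return flags & ~mask;
}

/* Replace the baud field, leaving the other bits untouched. */
unsigned set_baud(unsigned flags, unsigned baud)
{
    return (flags & ~BAUD_MASK) | (baud & BAUD_MASK);
}

/* Select field: extract the baud value. */
unsigned get_baud(unsigned flags)
{
    return flags & BAUD_MASK;
}

/* Select field: mask, then shift the bit down to position 0. */
unsigned get_extclk(unsigned flags)
{
    return (flags & EXTCLK_BIT) >> EXTCLK_SHIFT;
}
```

Unlike struct bit-fields, the layout here is fixed by the masks and shifts, so it does not depend on how a particular compiler packs fields.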
Conditional Operator
    c = a < b ? a + 1 : b - 1;
Evaluate first expression. If true, evaluate second, otherwise evaluate third.
Puts almost statement-like behavior in expressions.
BCPL allowed code in an expression:
    a := 5 + valof{ int i, s = 0; for (i = 0 ; i < 10 ; i++) s += a[i]; return s; }
Side-effects in expressions
Evaluating an expression often has side-effects
    a++          increment a afterwards
    a = 5        changes the value of a
    a = foo()    function foo may have side-effects
Pointer Arithmetic
From BCPL's view of the world
Pointer arithmetic is natural: everything's an integer
    int *p, *q;
*(p+5) equivalent to p[5]
If p and q point into same array, p - q is number of elements between p and q.
Accessing fields of a pointed-to structure has a shorthand:
    p->field means (*p).field
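These equivalences can be exercised in a short sketch. The struct point type and the function names are hypothetical.

```c
#include <stddef.h>

struct point { int x, y; };   /* hypothetical structure for the -> example */

/* Number of elements between two pointers into the same array. */
ptrdiff_t elements_between(const int *p, const int *q)
{
    return p - q;
}

/* Returns 1 if the pointer identities hold. */
int pointer_identities_hold(void)
{
    int a[10] = { 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 };
    int *p = a;
    int *q = a + 7;
    struct point pt = { 3, 4 };
    struct point *pp = &pt;

    return *(p + 5) == p[5]          /* both name the sixth element */
        && p[5] == 50
        && q - p == 7                /* measured in elements, not bytes */
        && pp->x == (*pp).x          /* -> is shorthand for (*). */
        && pp->y == 4;
}
```

The subtraction yields an element count because the compiler scales pointer arithmetic by the size of the pointed-to type.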
C Statements
Expression
Conditional
    if (expr) { } else { }
    switch (expr) { case c1: case c2: }
Iteration
    while (expr) { }                    zero or more iterations
    do { } while (expr);                at least one iteration
    for ( init ; valid ; next ) { }
Jump
    goto label;
    continue;
    break;
    return expr;
setjmp/longjmp
A way to exit from deeply nested functions
A hack now a formal part of the standard library
    #include <setjmp.h>

    jmp_buf jmpbuf;     /* Space for a return address and registers
                           (including stack pointer, frame pointer) */

    void top(void)
    {
        switch (setjmp(jmpbuf)) {   /* Stores context, returns 0 */
        case 0:
            child();
            break;
        case 1:                     /* longjmp called */
            break;
        }
    }
A later longjmp(jmpbuf, 1) returns to the saved context, making it appear setjmp() returned 1
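A complete version of this pattern might look like the sketch below, in which a recursive function stands in for the deeply nested calls. The names deeply_nested and depth_reached are hypothetical, and top is changed to return a value so the effect can be observed.

```c
#include <setjmp.h>

static jmp_buf jmpbuf;      /* saved context for the non-local exit */
static int depth_reached;   /* static, so its value survives the longjmp */

/* Recurses a few levels, then jumps straight back to top(). */
static void deeply_nested(int depth)
{
    depth_reached = depth;
    if (depth == 3)
        longjmp(jmpbuf, 1);     /* makes setjmp() appear to return 1 */
    deeply_nested(depth + 1);
}

/* Returns the recursion depth at which longjmp was called. */
int top(void)
{
    switch (setjmp(jmpbuf)) {   /* stores context, returns 0 the first time */
    case 0:
        deeply_nested(0);
        return 0;               /* not reached in this sketch */
    case 1:                     /* arrived here via longjmp */
        return depth_reached;
    }
    return -1;
}
```

None of the intermediate calls return normally, so any cleanup they would have performed is skipped; that is part of what makes this mechanism a hack.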
Macro versus function:
    Identical for min(5,x)
    Different when evaluating expression has side-effect: min(a++,b)
        min function increments a once
        min macro may increment a twice if a < b
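The difference can be demonstrated with a sketch. The definitions of the macro and function are not preserved in the text above, so MIN_MACRO and min_func below are assumed, conventional definitions, and the two counting functions are hypothetical.

```c
/* Assumed conventional definitions of a min macro and a min function. */
#define MIN_MACRO(x, y) ((x) < (y) ? (x) : (y))

static int min_func(int x, int y)
{
    return x < y ? x : y;
}

/* Returns how many times a was incremented by min_func(a++, b). */
int increments_with_function(void)
{
    int a = 1, b = 5;
    (void)min_func(a++, b);     /* argument evaluated exactly once */
    return a - 1;
}

/* Returns how many times a was incremented by MIN_MACRO(a++, b). */
int increments_with_macro(void)
{
    int a = 1, b = 5;
    /* Expands to ((a++) < (b) ? (a++) : (b)): because a < b,
       a++ is evaluated once in the test and again as the result. */
    (void)MIN_MACRO(a++, b);
    return a - 1;
}
```

The macro substitutes the argument text, so an argument with a side-effect is evaluated once per appearance in the expansion.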
Nondeterminism in C
Library routines
    malloc() returns a nondeterministically-chosen address
    Address used as a hash key produces nondeterministic results
Argument evaluation order
    myfunc( func1(), func2(), func3() )
    func1, func2, and func3 may be called in any order
Nondeterminism in C
Uninitialized variables
    Automatic variables may take values from stack
    Global variables left to the whims of the OS (standard C zero-initializes them, but only if the startup code or loader does its job)
Reading the wrong value from a union
    union { int a; float b; } u; u.a = 10; printf("%g", u.b);
Pointer dereference
    *a undefined unless it points within an allocated array and has been initialized
    Very easy to violate these rules
    Legal (accepted by the compiler, but the behavior is undefined): int a[10]; a[-1] = 3; a[10] = 2; a[11] = 5;
    int *a, *b; a - b only defined if a and b point into the same array
Nondeterminism in C
How to deal with nondeterminism?
Philosophy of C: get out of the programmer's way
"C treats you like a consenting adult"
    C: created by a systems programmer (Ritchie)
    Pascal: created by an educator (Wirth)
    Ada: created by the Department of Defense
Summary
C evolved from the typeless languages BCPL and B
Array-of-bytes model of memory permeates the language
Original weak type system strengthened over time
C programs built from types and variables, expressions, statements, and functions
Summary of C types
Built from primitive types that match processor types
    char, int, float, double, pointers
Struct and union aggregate heterogeneous objects
Arrays build sequences of identical objects
Alignment restrictions ensured by compiler
Multidimensional arrays
Three storage classes
    global, static (address fixed at compile time)
    automatic (on stack)
    heap (provided by malloc() and free() library calls)
Summary of C expressions
Wide variety of operators
    Arithmetic              + - * /
    Logical                 && || (lazy)
    Bitwise                 & |
    Comparison              < <=
    Assignment              = += *=
    Increment/decrement     ++ --
    Conditional             ? :
Summary of C statements
Expression Conditional
Iteration
Branching
Summary of C
Preprocessor
Sources of nondeterminism