C Language
Table of Contents
About
Remarks
Common Compilers
Libraries and APIs not covered by the C Standard (and therefore being off-topic here):
Versions
Examples
Hello World
hello.c
Examples
Introduction
Remarks
Examples
Effective type
Changing bytes
Chapter 4: Arrays
Introduction
Syntax
Remarks
Examples
Array length
Multi-dimensional arrays
See also
Chapter 5: Assertion
Introduction
Syntax
Parameters
Remarks
Examples
Simple Assertion
Static Assertion
Chapter 6: Atomics
Syntax
Remarks
Examples
Chapter 7: Bit-fields
Introduction
Syntax
Parameters
Remarks
Examples
Bit-fields
Bit-field alignment
Chapter 8: Boolean
Remarks
Examples
Using stdbool.h
Using #define
Syntax
Parameters
Remarks
Examples
Introduction
Syntax
Examples
/* */ delimited comments
// delimited comments
Examples
Introduction
Examples
Introduction
Remarks
Examples
The Linker
File Types
The Preprocessor
The Compiler
Syntax
Remarks
Examples
Compound literal having length of initializer less than array size specified
Remarks
Examples
Examples
Introduction
Idempotence
Self-containment
Minimality
Cross-references
Remarks
Examples
Examples
Remarks
Examples
Remarks
Examples
Introduction
Typedef
Remarks
Examples
Example 1
Example 2
Syntax
Remarks
Examples
errno
strerror
perror
Syntax
Parameters
Remarks
Examples
fprintf
Output
fscanf()
Examples
Remarks
Examples
Example of function returning struct containing values with error codes
Introduction
Syntax
Examples
Example:
Example
Introduction
Usage
Syntax
Basics
Syntax
Parameters
Remarks
Examples
Examples
Remarks
Overview
General
Types
Evaluation
Preprocessor
General
Signals
Miscellaneous
Examples
Syntax
Remarks
Examples
Examples
Initialization of Variables in C
Remarks
Pros
Cons
Examples
Examples
main.c:
source1.c:
source2.c:
headerfile.h:
Introduction
Examples
Semaphores
Syntax
Remarks
Head-Controlled Iteration Statement/Loops
Examples
Syntax
Remarks
Examples
Remarks
Topoliges
Procedures
Bind
Insertion
Examples
Remarks
Examples
Introduction
Syntax
Parameters
Remarks
Examples
Recommendation
Remarks
Examples
Trigraphs
Digraphs
Introduction
Syntax
Remarks
Examples
Introduction
Syntax
Remarks
Examples
Logical OR
Address-of
Dereference
Indexing
_Alignof
Examples
Pass a 2D-array to a function
Introduction
Syntax
Remarks
Examples
Caution:
Caution:
Premise
Example
Conclusion
Introduction
Remarks
Examples
__cplusplus for using C externals in C++ code compiled with C++ - name mangling
Remarks
Examples
Examples
if () Statements
Remarks
Examples
Examples
Syntax
Parameters
Remarks
Examples
Syntax
Remarks
Examples
Single precision and long double precision floating-point remainder: fmodf(), fmodl()
Introduction
Syntax
Remarks
Examples
typedef
auto
static
extern
register
_Thread_local
Introduction
Syntax
Examples
strstr
strcpy()
snprintf()
strncat()
strncpy()
Convert Strings to Number: atoi(), atof() (dangerous, don't use them)
Introduction
Examples
Usage
Compatibility
Introduction
Remarks
Examples
Introduction
Remarks
Examples
CppUTest
Syntax
Remarks
Examples
Remarks
Examples
Warning
Introduction
Syntax
Remarks
Examples
Introduction
Remarks
Examples
Modifying any object more than once between two sequence points
Bit shifting using negative counts or beyond the width of the type
Modifying the string returned by getenv, strerror, and setlocale functions
Returning from a function that's declared with `_Noreturn` or `noreturn` function specifie
Examples
Syntax
Remarks
Examples
Introduction
Syntax
Parameters
Remarks
Examples
Using an explicit count argument to determine the length of the va_list
Introduction
Remarks
Examples
Here we use X-macros to declare an enum containing 4 commands and a map of their names as
Similarly, we can generate a jump table to call functions by the enum value.
Credits
About
You can share this PDF with anyone you feel could benefit from it; download the latest version
from: c-language
It is an unofficial and free C Language ebook created for educational purposes. All the content is
extracted from Stack Overflow Documentation, which is written by many hardworking individuals at
Stack Overflow. It is affiliated with neither Stack Overflow nor the official C Language.
The content is released under Creative Commons BY-SA, and the list of contributors to each
chapter is provided in the credits section at the end of this book. Images may be copyright of
their respective owners unless otherwise specified. All trademarks and registered trademarks are
the property of their respective company owners.
Use the content presented in this book at your own risk; it is not guaranteed to be correct or
accurate. Please send your feedback and corrections to [email protected]
https://riptutorial.com/ 1
Chapter 1: Getting started with C Language
Remarks
C is a general-purpose, imperative computer programming language, supporting structured
programming, lexical variable scope and recursion, while a static type system prevents many
unintended operations. By design, C provides constructs that map efficiently to typical machine
instructions, and therefore it has found lasting use in applications that had formerly been coded in
assembly language, including operating systems, as well as various application software for
computers ranging from supercomputers to embedded systems.
Despite its low-level capabilities, the language was designed to encourage cross-platform
programming. A standards-compliant and portably written C program can be compiled for a very
wide variety of computer platforms and operating systems with few changes to its source code.
The language has become available on a very wide range of platforms, from embedded
microcontrollers to supercomputers.
C was originally developed by Dennis Ritchie between 1969 and 1973 at Bell Labs and used to
re-implement the Unix operating system. It has since become one of the most widely used
programming languages of all time, with C compilers from various vendors available for the
majority of existing computer architectures and operating systems.
Common Compilers
The process to compile a C program differs between compilers and operating systems. Most
operating systems ship without a compiler, so you will have to install one. Some common
compiler choices are:
The following documents should give you a good overview on how to get started using a few of the
most common compilers:
Note that compilers have varying levels of support for standard C, with many still not completely
supporting C99. For example, as of the 2015 release, MSVC supports much of C99 but still has
some important exceptions, both in the language itself (e.g. the preprocessing seems
non-conformant) and in the C library (e.g. <tgmath.h>). Nor do compilers necessarily document
their "implementation dependent choices". Wikipedia has a table showing support offered by some
popular compilers.
Some compilers (notably GCC) have offered, or continue to offer, compiler extensions that
implement additional features that the compiler producers deem necessary, helpful or believe may
become part of a future C version, but that are not currently part of any C standard. As these
extensions are compiler specific they can be considered to not be cross-compatible and compiler
developers may remove or alter them in later compiler versions. The use of such extensions can
generally be controlled by compiler flags.
Additionally, many developers have compilers that support only specific versions of C imposed by
the environment or platform they are targeting.
When selecting a compiler, it is recommended to choose the one that has the best support for the
latest version of C allowed for the target environment.
Code style is not covered by the standard and is primarily opinion based (different people find
different styles easier to read); as such, it is generally considered off-topic on SO. The overriding
advice on style in one's own code is that consistency is paramount: pick, or make, a style and
stick to it. Suffice it to say that there are various named styles in common usage that
programmers often choose rather than creating their own.
Some common indent styles are: K & R style, Allman style, GNU style and so on. Some of these
styles have different variants. Allman, for example, is used as either regular Allman or the popular
variant, Allman-8. Information on some of the popular styles may be found on Wikipedia. Such
style names are taken from the standards the authors or organizations often publish for use by the
many people contributing to their code, so that everyone can easily read the code when they know
the style, such as the GNU formatting guide that makes up part of the GNU coding standards
document.
K & R style is generally recommended for use within SO documentation, whereas the more
esoteric styles, such as Pico, are discouraged.
Libraries and APIs not covered by the C Standard (and therefore being off-topic here):
• POSIX API (covering for example PThreads, Sockets, Signals)
Versions
Examples
Hello World
To create a simple C program which prints "Hello, World" on the screen, use a text editor to create
a new file (e.g. hello.c — the file extension must be .c) containing the following source code:
hello.c
#include <stdio.h>

int main(void)
{
    puts("Hello, World");
    return 0;
}
#include <stdio.h>
This line tells the compiler to include the contents of the standard library header file stdio.h in the
program. Headers are usually files containing function declarations, macros and data types, and
you must include the header file before you use them. This line includes stdio.h so that the
program can call the function puts().
See more about headers.
int main(void)
This line starts the definition of a function. It states the name of the function (main), the type and
number of arguments it expects (void, meaning none), and the type of value that this function
returns (int). Program execution starts in the main() function.
{
…
}
The curly braces are used in pairs to indicate where a block of code begins and ends. They can be
used in a lot of ways, but in this case they indicate where the function begins and ends.
puts("Hello, World");
This line calls the puts() function to output text to standard output (the screen, by default), followed
by a newline. The string to be output is included within the parentheses.
"Hello, World" is the string that will be written to the screen. In C, every string literal value must be
inside the double quotes "…".
return 0;
When we defined main(), we declared it as a function returning an int, meaning it needs to return
an integer. In this example, we are returning the integer value 0, which is used to indicate that the
program exited successfully. After the return 0; statement, the execution process will terminate.
The editor must create plain text files, not RTF or any other format.
See more about compiling
If no errors are found in the source code (hello.c), the compiler will create a binary file, the
name of which is given by the argument to the -o command line option (hello). This is the final
executable file.
We can also use the warning options -Wall -Wextra -Werror, which help to identify problems that can
cause the program to fail or produce unexpected results. They are not necessary for this simple
program, but this is the way to add them:
By design, the clang command line options are similar to those of GCC.
cl hello.c
The following is the original "Hello, World!" program from the book The C Programming Language
by Brian Kernighan and Dennis Ritchie (Ritchie was the original developer of the C programming
language at Bell Labs), referred to as "K&R":
K&R
#include <stdio.h>

main()
{
    printf("hello, world\n");
}
Notice that the C programming language was not standardized at the time of writing the first
edition of this book (1978), and that this program will probably not compile on most modern
compilers unless they are instructed to accept C90 code.
This very first example in the K&R book is now considered poor quality, in part because it lacks an
explicit return type for main() and in part because it lacks a return statement. The 2nd edition of
the book was written for the old C89 standard. In C89, the type of main would default to int, but the
K&R example does not return a defined value to the environment. In C99 and later standards, the
return type is required, but it is safe to leave out the return statement of main (and only main),
because of a special case introduced with C99 5.1.2.2.3 — it is equivalent to returning 0, which
indicates success.
The recommended and most portable form of main for hosted systems is int main(void) when the
program does not use any command line arguments, or int main(int argc, char **argv) when the
program does use the command line arguments.
A return from the initial call to the main function is equivalent to calling the exit function
with the value returned by the main function as its argument. If the main function
executes a return that specifies no value, the termination status returned to the host
environment is undefined.
If a return statement without an expression is executed, and the value of the function
call is used by the caller, the behavior is undefined. Reaching the } that terminates a
function is equivalent to executing a return statement without an expression.
If the return type of the main function is a type compatible with int, a return from the
initial call to the main function is equivalent to calling the exit function with the value
returned by the main function as its argument; reaching the } that terminates the main
function returns a value of 0. If the return type is not compatible with int, the
termination status returned to the host environment is unspecified.
Read Getting started with C Language online: https://riptutorial.com/c/topic/213/getting-started-with-c-language
Chapter 2: <ctype.h> - character classification & conversion
Examples
Classifying characters read from a stream
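#include <ctype.h>
#include <stdio.h>

typedef struct {
    size_t space;
    size_t alnum;
    size_t punct;
} chartypes;

chartypes classify(FILE *f) {
    chartypes types = { 0, 0, 0 };
    int ch;  /* int rather than char, so that EOF stays distinguishable */
    while ((ch = fgetc(f)) != EOF) {
        types.space += !!isspace(ch);
        types.alnum += !!isalnum(ch);
        types.punct += !!ispunct(ch);
    }
    return types;
}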
The classify function reads characters from a stream and counts the number of spaces,
alphanumeric and punctuation characters. It avoids several pitfalls.
• When reading a character from a stream, the result is saved as an int, since otherwise there
would be an ambiguity between reading EOF (the end-of-file marker) and a character that has
the same bit pattern.
• The character classification functions (e.g. isspace) expect their argument to be either
representable as an unsigned char, or the value of the EOF macro. Since this is exactly what
fgetc returns, there is no need for conversion here.
• The return value of the character classification functions only distinguishes between zero
(meaning false) and nonzero (meaning true). For counting the number of occurrences, this
value needs to be converted to a 1 or 0, which is done by the double negation, !!.
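Classifying characters from a string
#include <ctype.h>
#include <stddef.h>

typedef struct {
    size_t space;
    size_t alnum;
    size_t punct;
} chartypes;

chartypes classify(const char *s) {
    chartypes types = { 0, 0, 0 };
    for (const char *p = s; *p != '\0'; p++) {
        /* convert to unsigned char before calling the is* functions */
        types.space += !!isspace((unsigned char)*p);
        types.alnum += !!isalnum((unsigned char)*p);
        types.punct += !!ispunct((unsigned char)*p);
    }
    return types;
}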
The classify function examines all characters from a string and counts the number of spaces,
alphanumeric and punctuation characters. It avoids several pitfalls.
• The character classification functions (e.g. isspace) expect their argument to be either
representable as an unsigned char, or the value of the EOF macro.
• The expression *p is of type char and must therefore be converted to match the above
wording.
• The char type is defined to be equivalent to either signed char or unsigned char.
• When char is equivalent to unsigned char, there is no problem, since every possible value of
the char type is representable as unsigned char.
• When char is equivalent to signed char, it must be converted to unsigned char before being
passed to the character classification functions. And although the value of the character may
change because of this conversion, this is exactly what these functions expect.
• The return value of the character classification functions only distinguishes between zero
(meaning false) and nonzero (meaning true). For counting the number of occurrences, this
value needs to be converted to a 1 or 0, which is done by the double negation, !!.
Introduction
The header ctype.h is a part of the standard C library. It provides functions for classifying and
converting characters.
All of these functions take one parameter, an int that must be either EOF or representable as an
unsigned char.
The names of the classifying functions are prefixed with 'is'. Each returns an integer non-zero
value (TRUE) if the character passed to it satisfies the related condition. If the condition is not
satisfied then the function returns a zero value (FALSE).
int a;
int c = 'A';
a = isalpha(c);  /* Checks if c is alphabetic (A-Z, a-z), returns non-zero here. */
a = isalnum(c);  /* Checks if c is alphanumeric (A-Z, a-z, 0-9), returns non-zero here. */
a = iscntrl(c);  /* Checks if c is a control character (0x00-0x1F, 0x7F), returns zero here. */
a = isdigit(c);  /* Checks if c is a digit (0-9), returns zero here. */
a = isgraph(c);  /* Checks if c has a graphical representation (any printing character except
                    space), returns non-zero here. */
a = islower(c);  /* Checks if c is a lower-case letter (a-z), returns zero here. */
a = isprint(c);  /* Checks if c is any printable character (including space), returns non-zero
                    here. */
a = ispunct(c);  /* Checks if c is a punctuation character, returns zero here. */
a = isspace(c);  /* Checks if c is a white-space character, returns zero here. */
a = isupper(c);  /* Checks if c is an upper-case letter (A-Z), returns non-zero here. */
a = isxdigit(c); /* Checks if c is a hexadecimal digit (A-F, a-f, 0-9), returns non-zero here. */
C99
There are two conversion functions. These are named using the prefix 'to'. These functions take
the same argument as those above. However the return value is not a simple zero or non-zero but
the passed argument changed in some manner.
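int a;
int c = 'A';
a = tolower(c); /* Converts c to lower case; a is now 'a'. */
a = toupper(c); /* Converts c to upper case; a is 'A', since c already is upper case. */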
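As a minimal sketch of how the conversion functions are typically applied to a whole string, the hypothetical helper upcase() below (the name is an assumption, not part of the standard library) converts a string to upper case in place. It casts each char to unsigned char first, because these functions expect a value representable as unsigned char or EOF.

```c
#include <ctype.h>

/* Hypothetical helper: upper-case a NUL-terminated string in place. */
void upcase(char *s)
{
    for (; *s != '\0'; ++s) {
        /* Cast to unsigned char first: a negative char value (other than
           EOF) must not be passed to toupper(). */
        *s = (char)toupper((unsigned char)*s);
    }
}
```

Characters that have no upper-case counterpart, such as digits and punctuation, are returned unchanged by toupper().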
The information below is quoted from cplusplus.com, mapping how the original 127-character
ASCII set is considered by each of the classifying type functions (a • indicates that the function
returns non-zero for that character):

ASCII values   characters                    iscntrl  isblank  isspace  isupper  islower  isalpha  isdigit
0x00 .. 0x08   NUL, (other control codes)    •
0x09           tab ('\t')                    •        •        •
0x0A .. 0x0D   (white-space control codes)   •                 •
0x0E .. 0x1F   (other control codes)         •
0x20           space (' ')                            •        •
0x21 .. 0x2F   !"#$%&'()*+,-./
0x30 .. 0x39   0123456789                                                                          •
0x3A .. 0x40   :;<=>?@
0x41 .. 0x46   ABCDEF                                                   •                 •
0x47 .. 0x5A   GHIJKLMNOPQRSTUVWXYZ                                     •                 •
0x5B .. 0x60   [\]^_`
0x61 .. 0x66   abcdef                                                            •        •
0x67 .. 0x7A   ghijklmnopqrstuvwxyz                                              •        •
0x7B .. 0x7E   {|}~
0x7F           (DEL)                         •
Chapter 3: Aliasing and effective type
Remarks
Violations of the aliasing rules and violations of the effective type of an object are two different
things and should not be confused.
• Aliasing is the property of two pointers a and b that refer to the same object, that is, a ==
b.
• The effective type of a data object is used by C to determine which operations can be done
on that object. In particular the effective type is used to determine if two pointers can alias
each other.
Aliasing can be a problem for optimization, because changing the object through one pointer, say a,
can change the object that is visible through the other pointer, b. If your C compiler had to
assume that pointers could always alias each other, regardless of their type and
provenance, many optimization opportunities would be lost, and many programs would run slower.
C's strict aliasing rules describe the cases in which the compiler may assume that objects do (or
do not) alias each other. There are two rules of thumb that you should always keep in mind for data
pointers.
Unless said otherwise, two pointers with the same base type may alias.
Two pointers with different base type cannot alias, unless at least one of the two types
is a character type.
Here base type means that we put aside type qualifications such as const. For example, if a is
double* and b is const double*, the compiler must generally assume that a change of *a may change *b.
Violating the second rule can have catastrophic results. Here violating the strict aliasing rule
means that you present two pointers a and b of different type to the compiler which in reality point
to the same object. The compiler then may always assume that the two point to different objects,
and will not update its idea of *b if you changed the object through *a.
If you do so, the behavior of your program becomes undefined. Therefore, C puts quite severe
restrictions on pointer conversions to help you avoid such situations occurring
accidentally.
Unless the source or target type is void, all pointer conversions between pointers with
different base type must be explicit.
Or in other words, they need a cast, unless you do a conversion that just adds a qualifier such as
const to the target type.
Avoiding pointer conversions in general and casts in particular protects you from aliasing
problems. Unless you really need them, and these cases are very special, you should avoid them
as much as you can.
Examples
Character types cannot be accessed through non-character types.
If an object is defined with static, thread, or automatic storage duration, and it has a character
type, either: char, unsigned char, or signed char, it may not be accessed by a non-character type. In
the below example a char array is reinterpreted as the type int, and the behavior is undefined on
every dereference of the int pointer b.
This is undefined because it violates the "effective type" rule: no data object that has an effective
type may be accessed through another type that is not a character type. Since the other type here
is int, this is not allowed.
Even if alignment and pointer sizes were known to fit, that would not exempt the code from this
rule; the behavior would still be undefined.
This means in particular that there is no way in standard C to reserve a buffer object of character
type that can be used through pointers with different types, as you would use a buffer that was
received by malloc or similar function.
A correct way to achieve the same goal as in the above example would be to use a union.
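typedef union {
    char c[sizeof(int[25])];  /* byte view; the length 25 is an assumed, illustrative size */
    int i[25];                /* int view of the same storage */
} bufType;

int main(void)
{
    static bufType a = { .c = { 0 } };
    int* b = a.i;  /* no cast necessary */
    *b = 2;
    static _Thread_local bufType t = { .c = { 0 } };  /* block scope requires static here */
    int* d = t.i;
    *d = 3;
}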
Here, the union ensures that the compiler knows from the start that the buffer could be accessed
through different views. This also has the advantage that now the buffer has a "view" a.i that
already is of type int and no pointer conversion is needed.
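When only the bytes themselves need to be moved in or out of a character buffer, copying them with memcpy() is another commonly used option that stays within the rules described above. The following is a minimal sketch, not part of the original example; the helper names store_u32 and load_u32 are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bytes of value into a byte buffer. */
void store_u32(unsigned char *buf, uint32_t value)
{
    memcpy(buf, &value, sizeof value);
}

/* Hypothetical helper: read a uint32_t back out of a byte buffer.
   The bytes are copied into a real uint32_t object, so the buffer is
   never accessed through a uint32_t pointer. */
uint32_t load_u32(const unsigned char *buf)
{
    uint32_t value;
    memcpy(&value, buf, sizeof value);
    return value;
}
```

The buffer must hold at least sizeof(uint32_t) bytes; the sketch does not check this.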
Effective type
The effective type of a data object is the last type information that was associated with it, if any.
// a normal variable, effective type uint32_t, and this type never changes
uint32_t a = 0.0;
Observe that for the latter, it was not necessary that we even have a uint32_t* pointer to that
object. The fact that we have copied another uint32_t object is sufficient.
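The situation described above, where an allocated object acquires its effective type by copying, can be sketched as follows. This is an illustrative reconstruction under assumptions, not the original listing; the helper name copy_u32 is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of *src, or NULL on failure. */
uint32_t *copy_u32(const uint32_t *src)
{
    void *p = malloc(sizeof *src);  /* allocated storage has no effective type yet */
    if (p == NULL) {
        return NULL;
    }
    /* Copying a uint32_t object into the allocation gives it the
       effective type uint32_t, without any uint32_t* pointing to it. */
    memcpy(p, src, sizeof *src);
    return p;  /* void* converts implicitly to uint32_t* */
}
```

The caller owns the returned memory and is expected to release it with free().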
In the following code let us assume for simplicity that float and uint32_t have the same size.
void fun(uint32_t* u, float* f) {
    float a = *f;
    *u = 22;
    float b = *f;
    printf("%g should equal %g\n", a, b);
}
u and f have different base type, and thus the compiler can assume that they point to different
objects. There is no possibility that *f could have changed between the two initializations of a and
b, and so the compiler may optimize the code to something equivalent to
That is, the second load operation of *f can be optimized out completely.
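A sketch of what such an optimized form could look like is given below. The name fun_optimized is hypothetical and only used here to distinguish it from fun; an actual compiler performs this transformation internally rather than in source code.

```c
#include <stdint.h>
#include <stdio.h>

/* Sketch of the optimized form: *f is loaded once and the value reused,
   because the store through u is assumed not to affect *f. */
void fun_optimized(uint32_t* u, float* f) {
    float a = *f;
    *u = 22;
    printf("%g should equal %g\n", a, a);
}
```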
float fval = 4;
uint32_t uval = 77;
fun(&uval, &fval);
4 should equal 4
is printed. But if we cheat and pass the same pointer, after converting it,
float fval = 4;
uint32_t* up = (uint32_t*)&fval;
fun(up, &fval);
we violate the strict aliasing rule. Then the behavior becomes undefined. The output could be as
above, if the compiler had optimized the second access, or something completely different, and so
your program ends up in a completely unreliable state.
restrict qualification
If we have two pointer arguments of the same type, the compiler can't make any assumption and
will always have to assume that the change to *e may change *f:
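A function of that shape could be sketched as follows. This is a reconstruction consistent with the calls and the printed output shown in this section, not necessarily the original listing.

```c
#include <stdio.h>

/* Both parameters have the same base type, so they may alias. */
void fun(float* e, float* f) {
    float a = *f;
    *e = 22;
    float b = *f;  /* must be loaded again: the store through e may have changed *f */
    printf("is %g equal to %g?\n", a, b);
}
```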
float fval = 4;
float eval = 77;
fun(&eval, &fval);
is 4 equal to 4?
is printed. If we pass the same pointer, the program will still do the right thing and print
is 4 equal to 22?
This can turn out to be inefficient, if we know by some outside information that e and f will never
point to the same data object. We can reflect that knowledge by adding restrict qualifiers to the
pointer parameters:
Then the compiler may always suppose that e and f point to different objects.
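A restrict-qualified version might look like the sketch below. The name fun_restrict is hypothetical, chosen only to keep it apart from the unqualified function.

```c
#include <stdio.h>

/* With restrict, the caller promises that e and f do not refer to the
   same object for the duration of the call. */
void fun_restrict(float* restrict e, float* restrict f) {
    float a = *f;
    *e = 22;
    float b = *f;  /* the compiler may reuse the value of a instead of reloading */
    printf("is %g equal to %g?\n", a, b);
}
```

Calling such a function with two pointers to the same object breaks that promise, and the behavior is then undefined.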
Changing bytes
Once an object has an effective type, you should not attempt to modify it through a pointer of
another type, unless that other type is a character type, char, signed char or unsigned char.
#include <inttypes.h>
#include <stdio.h>
int main(void) {
    uint32_t a = 57;
    // conversion from incompatible types needs a cast !
    unsigned char* ap = (unsigned char*)&a;
    for (size_t i = 0; i < sizeof a; ++i) {
        /* set each byte of a to 42 */
        ap[i] = 42;
    }
    printf("a now has value %" PRIu32 "\n", a);
}
• The access is made to the individual bytes seen with type unsigned char so each modification
is well defined.
• The two views to the object, through a and through *ap, alias, but since ap is a pointer to a
character type, the strict aliasing rule does not apply. Thus the compiler has to assume that
the value of a may have been changed in the for loop. The modified value of a must be
constructed from the bytes that have been changed.
• The type of a, uint32_t, has no padding bits. All bits of its representation contribute to the
value, here 707406378, and there can be no trap representation.
Chapter 4: Arrays
Introduction
Arrays are derived data types, representing an ordered collection of values ("elements") of another
type. Most arrays in C have a fixed number of elements of a single type, and their representation
stores the elements contiguously in memory without gaps or padding. C allows multidimensional
arrays whose elements are other arrays, and also arrays of pointers.
C supports dynamically allocated arrays whose size is determined at run time. C99 and later
support variable length arrays or VLAs.
Syntax
• type name[length]; /* Define array of 'type' with name 'name' and length 'length'. */
• int arr[10] = {0}; /* Define an array and initialize ALL elements to 0. */
• int arr[10] = {42}; /* Define an array and initialize the 1st element to 42 and the rest to 0. */
• int arr[] = {4, 2, 3, 1}; /* Define and initialize an array of length 4. */
• arr[n] = value; /* Set value at index n. */
• value = arr[n]; /* Get value at index n. */
Remarks
Why do we need arrays?
Arrays provide a way to organize objects into an aggregate with its own significance. For example,
C strings are arrays of characters (chars), and a string such as "Hello, World!" has meaning as an
aggregate that is not inherent in the characters individually. Similarly, arrays are commonly used
to represent mathematical vectors and matrices, as well as lists of many kinds. Moreover, without
some way to group the elements, one would need to address each individually, such as via
separate variables. Not only is that unwieldy, it does not easily accommodate collections of
different lengths.
Except when appearing as the operand of the sizeof operator, the _Alignof operator (C2011), or
the unary & (address-of) operator, or as a string literal used to initialize an(other) array, an array is
implicitly converted into ("decays to") a pointer to its first element. This implicit conversion is tightly
coupled to the definition of the array subscripting operator ([]): the expression arr[idx] is defined
to be equivalent to *(arr + idx). Furthermore, since pointer arithmetic is commutative, *(arr +
idx) is also equivalent to *(idx + arr), which in turn is equivalent to idx[arr]. All of those
expressions are valid and evaluate to the same value, provided that either idx or arr is a pointer
(or an array, which decays to a pointer), the other is an integer, and the integer is a valid index into
the array to which the pointer points.
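The equivalences above can be checked with a small sketch. The function name subscript_forms_agree is hypothetical; it simply compares the four spellings for one element.

```c
#include <stddef.h>

/* Returns 1 if all four spellings of "element idx of arr" yield the
   same value. idx must be a valid index into the array arr points to. */
int subscript_forms_agree(const int *arr, size_t idx)
{
    return arr[idx] == *(arr + idx)
        && *(arr + idx) == *(idx + arr)
        && *(idx + arr) == idx[arr];
}
```

The form idx[arr] is valid C but is rarely written outside of demonstrations like this one.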
As a special case, observe that &(arr[0]) is equivalent to &*(arr + 0), which simplifies to arr. All of
those expressions are interchangeable wherever the last decays to a pointer. This simply
expresses again that an array decays to a pointer to its first element.
In contrast, if the address-of operator is applied to an array of type T[N] (i.e. &arr) then the result
has type T (*)[N] and points to the whole array. This is distinct from a pointer to the first array
element at least with respect to pointer arithmetic, which is defined in terms of the size of the
pointed-to type.
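The difference in pointer arithmetic can be made visible by measuring how many bytes a "+ 1" step covers for each pointer type. The sketch below assumes an array of 5 int; both function names are hypothetical.

```c
#include <stddef.h>

/* Byte distance covered by "+ 1" on a pointer to the first element. */
size_t step_of_element_pointer(void)
{
    int arr[5] = {0};
    int *first = arr;        /* arr decays to int* */
    return (size_t)((char *)(first + 1) - (char *)first);
}

/* Byte distance covered by "+ 1" on a pointer to the whole array. */
size_t step_of_array_pointer(void)
{
    int arr[5] = {0};
    int (*whole)[5] = &arr;  /* type int (*)[5] */
    return (size_t)((char *)(whole + 1) - (char *)whole);
}
```

The first function returns sizeof(int), the second returns sizeof(int[5]), although both pointers start at the same address.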
Although the first declaration of foo uses array-like syntax for parameter a, such syntax, when used
to declare a function parameter, declares that parameter as a pointer to the array's element type.
Thus, the second signature for foo() is semantically identical to the first. This corresponds to the
decay of array values to pointers where they appear as arguments to a function call, such that if a
variable and a function parameter are declared with the same array type then that variable's value
is suitable for use in a function call as the argument associated with the parameter.
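To illustrate the point about parameters, the sketch below declares one function twice, once with array syntax and once with pointer syntax, and then defines it. The function name sum and its length parameter n are assumptions chosen for this illustration; they are not the foo from the text.

```c
#include <stddef.h>

/* Both declarations are compatible and name the same function:
   the parameter a has type int* in each case. */
long sum(int a[], size_t n);
long sum(int *a, size_t n);

long sum(int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += a[i];
    }
    return total;
}
```

Because a is a pointer inside the function, the element count has to be passed separately; sizeof a would give the size of a pointer, not of the caller's array.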
Examples
Declaring and initializing an array
type arrName[size];
where type could be any built-in type or user-defined types such as structures, arrName is a user-
defined identifier, and size is an integer constant.
Declaring an array (an array of 10 int variables in this case) is done like this:
int array[10];
it now holds indeterminate values. To ensure it holds zero values while declaring, you can do this:
Arrays can also have initializers. This example declares an array of 10 ints, where the first 3 ints
contain the values 1, 2, 3 and all other values are zero:
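Both forms of initialization mentioned above can be sketched in one small function. The function name sum_partial_init and the array name zeros are hypothetical and only serve the illustration.

```c
#include <stddef.h>

/* Adds up all elements of a zero-initialized array and a partially
   initialized array; elements without an initializer are zero. */
int sum_partial_init(void)
{
    int zeros[10] = {0};        /* every element is 0 */
    int array[10] = {1, 2, 3};  /* 1, 2, 3, followed by seven zeros */
    int total = 0;
    for (size_t i = 0; i < 10; ++i) {
        total += zeros[i] + array[i];
    }
    return total;
}
```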
In the above method of initialization, the first value in the list is assigned to the first member of
the array, the second value to the second member, and so on. If the
list is shorter than the array, then, as in the above example, the remaining members of
the array are initialized to zero. With designated initializers (ISO C99), individual
array members can be initialized explicitly. For example,
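A small sketch of designated initializers, assuming a C99 or later compiler; the index values and the function name are chosen only for illustration.

```c
#include <stddef.h>

/* Returns element i of an array built with designated initializers. */
int designated_at(size_t i)
{
    /* Elements 1, 2 and 4 are named explicitly; 0 and 3 default to zero. */
    int array[5] = {[2] = 5, [1] = 2, [4] = 9};   /* {0, 2, 5, 0, 9} */
    return array[i];
}
```

The designators may appear in any order, as the example shows.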
In most cases, the compiler can deduce the length of the array for you. This is achieved by
leaving the square brackets empty:
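For instance, the following sketch lets the compiler count the initializers (the function name is hypothetical):

```c
#include <stddef.h>

/* Returns the number of elements the compiler deduced for the array. */
size_t deduced_length(void)
{
    int array[] = {1, 2, 3};   /* the compiler sizes this as int[3] */
    return sizeof array / sizeof array[0];
}
```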
C99C11
Variable Length Arrays (VLAs for short) were added in C99 and made optional in C11. They behave
like normal arrays, with one important difference: the length doesn't have to be known at
compile time. VLAs have automatic storage duration. Only pointers to VLAs can have static
storage duration.
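A minimal VLA sketch, assuming a compiler that supports VLAs; the function name and the way the array is filled are illustrative only.

```c
#include <stddef.h>

/* Sums 0 .. n-1 using a variable length array; n must be greater than zero. */
int sum_with_vla(size_t n)
{
    int vla[n];        /* the length is only known at run time */
    int sum = 0;
    size_t i;

    for (i = 0; i < n; ++i)
        vla[i] = (int)i;
    for (i = 0; i < n; ++i)
        sum += vla[i];
    return sum;
}
```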
Important:
VLAs are potentially dangerous. If the array vla in the example above requires more space on the
stack than is available, the stack will overflow. Usage of VLAs is therefore often discouraged in style
guides and by books and exercises.
Sometimes it's necessary to set an array to zero, after the initialization has been done.
#include <stdlib.h> /* for EXIT_SUCCESS and size_t */
#define ARRLEN (10)
int main(void)
{
    int array[ARRLEN]; /* Allocated but not initialised, as not defined static or global. */
    size_t i;
    for(i = 0; i < ARRLEN; ++i)
    {
        array[i] = 0;
    }
    return EXIT_SUCCESS;
}
A common shortcut for the above loop is to use memset() from <string.h>. Passing array as shown
below makes it decay to a pointer to its 1st element.
memset(array, 0, ARRLEN * sizeof (int)); /* Use size explicitly provided type (int here). */
or
memset(array, 0, ARRLEN * sizeof *array); /* Use size of type the pointer is pointing to. */
Because in this example array is an actual array and not just a pointer to an array's 1st element (see Array
length for why this matters), a third option to zero out the array is possible:
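A sketch of that third option, which applies sizeof to the array itself. The function wrapper and the verification loop are additions made only so the result can be checked.

```c
#include <string.h>

#define ARRLEN (10)

/* Zeroes a local array with a single memset; returns 1 if every element is 0. */
int zero_whole_array(void)
{
    int array[ARRLEN];
    size_t i;

    memset(array, 0, sizeof array);   /* sizeof array covers all ARRLEN elements */

    for (i = 0; i < ARRLEN; ++i)
        if (array[i] != 0)
            return 0;
    return 1;
}
```

This only works where array is still an array; applied to a pointer, sizeof would yield the size of the pointer.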
Array length
Arrays have fixed lengths that are known within the scope of their declarations. Nevertheless, it is
possible and sometimes convenient to calculate array lengths. In particular, this can make code
more flexible when the array length is determined automatically from an initializer:
int array[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
For example, suppose we want to write a function to return the last element of an array of int.
Continuing from the above, we might call it like so:
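One way such a function and its call might look. The name get_last and the wrapper last_of_example are assumptions made for this sketch.

```c
#include <stddef.h>

/* Returns the last element; length is the element count and must be non-zero. */
int get_last(int input[], size_t length)
{
    return input[length - 1];
}

/* Computes the length where the array is still an array, then passes both. */
int last_of_example(void)
{
    int array[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    size_t length = sizeof array / sizeof array[0];

    /* array decays to a pointer here, so the length is passed separately */
    return get_last(array, length);
}
```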
Note in particular that although the declaration of parameter input resembles that of an array, it in
fact declares input as a pointer (to int). It is exactly equivalent to declaring input as int *input.
The same would be true even if a dimension were given. This is possible because arrays cannot
ever be actual arguments to functions (they decay to pointers when they appear in function call
expressions), so the array-like syntax can be viewed as a mnemonic.
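It is a very common error to attempt to determine array size from a pointer, which cannot work. DO
NOT DO THIS:
int BAD_get_last(int input[]) {
    int length = sizeof(input) / sizeof(input[0]); /* WRONG: input is a pointer here */
    return input[length - 1]; /* Oops -- not the droid we are looking for */
}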
In fact, that particular error is so common that some compilers recognize it and warn about it. clang,
for instance, will emit the following warning:
warning: sizeof on array function parameter will return size of 'int *' instead of 'int []' [-Wsizeof-array-argument]
    int length = sizeof(input) / sizeof(input[0]);
        ^
note: declared here
int BAD_get_last(int input[])
        ^
int val;
int array[10];
Because addition is commutative (the operands of the + operator can be exchanged), the
following statements are all equivalent:
*(array + 4) = 5;
*(4 + array) = 5;
array[4] = 5;
4[array] = 5; /* Weird but valid C ... */
and those two as well:
val = array[4];
val = 4[array]; /* Weird but valid C ... */
C doesn't perform any boundary checks; accessing contents outside of the declared array is
undefined behavior (see Accessing memory beyond allocated chunk):
int val;
int array[10];
array[4] = 5; /* ok */
val = array[4]; /* ok */
array[19] = 20; /* undefined behavior */
val = array[15]; /* undefined behavior */
#include <stdio.h>
return 0;
}
#include <stdio.h>
#include <stdlib.h>
{
int * pdata;
size_t n;
return EXIT_SUCCESS;
}
This program tries to scan an unsigned integer value from standard input and then allocates a block of
memory for an array of n elements of type int by calling the calloc() function, which also
initializes the memory to all zeros.
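The two steps described above can be sketched as a pair of helpers. Both function names are hypothetical, and the error handling is reduced to return values.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads an element count from in; returns 1 on success, 0 on failure. */
int read_count(FILE *in, size_t *n)
{
    return fscanf(in, "%zu", n) == 1;
}

/* Allocates n ints, all zero; returns NULL on failure. The caller must free(). */
int *alloc_zeroed_ints(size_t n)
{
    return calloc(n, sizeof(int));
}
```

A main function would typically call read_count(stdin, &n), then alloc_zeroed_ints(n), check the result against NULL, and finally free() the block.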
Arrays in C can be seen as a contiguous chunk of memory. More precisely, the last dimension of
the array is the contiguous part. We call this row-major order. On a cache miss, a complete
cache line is loaded into the cache so that subsequent accesses to neighboring data do not miss again.
Knowing this, we can see why accessing an array of dimension 10000x10000
with array[0][0] would potentially load array[0][1] into the cache, but accessing array[1][0] right after
would generate a second cache miss, since it is sizeof(type)*10000 bytes away from array[0][0],
and therefore certainly not on the same cache line. This is why iterating like this is inefficient:
#define ARRLEN 10000
int array[ARRLEN][ARRLEN];
size_t i, j;
for (i = 0; i < ARRLEN; ++i)
{
    for(j = 0; j < ARRLEN; ++j)
    {
        array[j][i] = 0;
    }
}
Iterating like this is more efficient:
#define ARRLEN 10000
int array[ARRLEN][ARRLEN];
size_t i, j;
for (i = 0; i < ARRLEN; ++i)
{
    for(j = 0; j < ARRLEN; ++j)
    {
        array[i][j] = 0;
    }
}
In the same vein, this is why when dealing with an array with one dimension and multiple indexes
(let's say 2 dimensions here for simplicity with indexes i and j) it is important to iterate through the
array like this:
#define DIM_X 10
#define DIM_Y 20
int array[DIM_X*DIM_Y];
size_t i, j;
for (i = 0; i < DIM_X; ++i)
{
    for(j = 0; j < DIM_Y; ++j)
    {
        array[i*DIM_Y+j] = 0;
    }
}
Or with 3 dimensions and indexes i, j and k:
#define DIM_X 10
#define DIM_Y 20
#define DIM_Z 30
int array[DIM_X*DIM_Y*DIM_Z];
size_t i, j, k;
for (i = 0; i < DIM_X; ++i)
{
    for(j = 0; j < DIM_Y; ++j)
    {
        for (k = 0; k < DIM_Z; ++k)
        {
            array[i*DIM_Y*DIM_Z+j*DIM_Z+k] = 0;
        }
    }
}
Or, more generally, for an array with d dimensions of sizes N1 x N2 x ... x Nd
and indices n1, n2, ..., nd, the offset is calculated like this:
offset = nd + Nd * (n(d-1) + N(d-1) * (n(d-2) + ... + N2 * n1))
Picture/formula taken from: https://en.wikipedia.org/wiki/Row-major_order
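The general row-major offset can be computed by accumulating one dimension at a time, multiplying the running offset by the size of each dimension before adding its index. The function below is a sketch with a hypothetical name and signature; it assumes every index is within its dimension.

```c
#include <stddef.h>

/*
 * Row-major offset of the element with indices idx[0..d-1] in an array
 * whose dimensions are dims[0..d-1]. Every idx[k] must be below dims[k].
 */
size_t row_major_offset(const size_t *dims, const size_t *idx, size_t d)
{
    size_t offset = 0;
    size_t k;

    for (k = 0; k < d; ++k)
        offset = offset * dims[k] + idx[k];   /* Horner-style accumulation */
    return offset;
}
```

For dimensions {DIM_X, DIM_Y, DIM_Z} and indices {i, j, k} this evaluates to i*DIM_Y*DIM_Z + j*DIM_Z + k, matching the three-dimensional loop above.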
Multi-dimensional arrays
The C programming language allows multidimensional arrays. Here is the general form of a
multidimensional array declaration:
type name[size1][size2]...[sizeN];
For example, the following declaration creates a three dimensional (5 x 10 x 4) integer array:
int arr[5][10][4];
Two-dimensional Arrays
The simplest form of multidimensional array is the two-dimensional array. A two-dimensional array
is, in essence, a list of one-dimensional arrays. To declare a two-dimensional integer array of
dimensions m x n, we can write as follows:
type arrayName[m][n];
Where type can be any valid C data type (int, float, etc.) and arrayName can be any valid C
identifier. A two-dimensional array can be visualized as a table with m rows and n columns. Note:
The order does matter in C. The array int a[4][3] is not the same as the array int a[3][4]. The
number of rows comes first as C is a row-major language.
A two-dimensional array a, which contains three rows and four columns can be shown as follows:
          Column 0   Column 1   Column 2   Column 3
Row 0     a[0][0]    a[0][1]    a[0][2]    a[0][3]
Row 1     a[1][0]    a[1][1]    a[1][2]    a[1][3]
Row 2     a[2][0]    a[2][1]    a[2][2]    a[2][3]
Thus, every element in the array a is identified by an element name of the form a[i][j], where a is
the name of the array, i represents which row, and j represents which column. Recall that rows
and columns are zero indexed. This is very similar to mathematical notation for subscripting 2-D
matrices.
Multidimensional arrays may be initialized by specifying bracketed values for each row. The
following defines an array with 3 rows where each row has 4 columns.
int a[3][4] = {
    {0, 1, 2, 3} , /* initializers for row indexed by 0 */
    {4, 5, 6, 7} , /* initializers for row indexed by 1 */
    {8, 9, 10, 11} /* initializers for row indexed by 2 */
};
The nested braces, which indicate the intended row, are optional. The following initialization is
equivalent to the previous example:
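A sketch that places the flat form next to the nested form and compares the results; the function and variable names are illustrative only. Some compilers warn about the missing braces in the flat form.

```c
#include <string.h>

/* Returns 1 if the nested and the flat initializer produce identical arrays. */
int flat_equals_nested(void)
{
    int nested[3][4] = {
        {0, 1, 2, 3},
        {4, 5, 6, 7},
        {8, 9, 10, 11}
    };
    int flat[3][4] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};

    return memcmp(nested, flat, sizeof nested) == 0;
}
```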
While the method of creating arrays with nested braces is optional, it is strongly encouraged as it
is more readable and clearer.
An element in a two-dimensional array is accessed by using the subscripts, i.e., row index and
column index of the array. For example:
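A minimal sketch of such an access, wrapped in a hypothetical function and reusing the 3 x 4 array from the initialization example:

```c
/* Returns the element in row index 2, column index 3. */
int fourth_of_third_row(void)
{
    int a[3][4] = {
        {0, 1, 2, 3},
        {4, 5, 6, 7},
        {8, 9, 10, 11}
    };
    int val = a[2][3];   /* row index 2, column index 3 */
    return val;
}
```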
The above statement will take the 4th element from the 3rd row of the array. Let us check the
following program where we have used a nested loop to handle a two-dimensional array:
#include <stdio.h>
int main () {
return 0;
}
When the above code is compiled and executed, it produces the following result:
a[0][0]: 0
a[0][1]: 0
a[1][0]: 1
a[1][1]: 2
a[2][0]: 2
a[2][1]: 4
a[3][0]: 3
a[3][1]: 6
a[4][0]: 4
a[4][1]: 8
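Code consistent with the output above might look like the following sketch. The array contents are inferred from that output, and the function name and the FILE * parameter are assumptions made for illustration.

```c
#include <stdio.h>

#define ROWS 5
#define COLS 2

/* Contents inferred from the printed result: a[i][0] = i, a[i][1] = 2 * i. */
static const int a[ROWS][COLS] = { {0, 0}, {1, 2}, {2, 4}, {3, 6}, {4, 8} };

/* Prints every element in row-major order; returns how many were printed. */
int print_elements(FILE *out)
{
    int i, j, count = 0;

    for (i = 0; i < ROWS; i++) {
        for (j = 0; j < COLS; j++) {
            fprintf(out, "a[%d][%d]: %d\n", i, j, a[i][j]);
            ++count;
        }
    }
    return count;
}
```

Calling print_elements(stdout) from main prints the ten lines in the same order as shown.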
Three-Dimensional array:
A 3D array is essentially an array of arrays of arrays: it's an array or collection of 2D arrays, and a
2D array is an array of 1D arrays.
Initializing a 3D Array:
double cprogram[3][2][4]={
{{-0.1, 0.22, 0.3, 4.3}, {2.3, 4.7, -0.9, 2}},
{{0.9, 3.6, 4.5, 4}, {1.2, 2.4, 0.22, -1}},
{{8.2, 3.12, 34.2, 0.1}, {2.1, 3.2, 4.3, -2.0}}
};
We can have arrays with any number of dimensions, although it is likely that most of the arrays
that are created will be of one or two dimensions.
#include <stdio.h>
#define SIZE (10)
int main()
{
size_t i = 0;
int *p = NULL;
int a[SIZE];
return 0;
}
Here, in the initialization of p in the first for loop condition, the array a decays to a pointer to its first
element, as it would in almost all places where such an array variable is used.
Then, the ++p performs pointer arithmetic on the pointer p and walks one by one through the
elements of the array, and refers to them by dereferencing them with *p.
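The pointer walk described here can be sketched as follows. The function name and the values stored in the array are assumptions; only the declarations of i, p, a and SIZE come from the listing.

```c
#include <stddef.h>

#define SIZE (10)

/* Fills a[i] = i by indexing, then sums the elements by walking a pointer. */
int sum_with_pointer(void)
{
    size_t i = 0;
    int *p = NULL;
    int a[SIZE];
    int sum = 0;

    for (i = 0; i < SIZE; ++i)
        a[i] = (int)i;

    /* a decays to &a[0]; a + SIZE points one past the last element */
    for (p = a; p < a + SIZE; ++p)
        sum += *p;

    return sum;
}
```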
Multidimensional arrays follow the same rules as single-dimensional arrays when passing them to
a function. However the combination of decay-to-pointer, operator precedence, and the two
different ways to declare a multidimensional array (array of arrays vs array of pointers) may make
the declaration of such functions non-intuitive. The following example shows the correct ways to
pass multidimensional arrays.
#include <assert.h>
#include <stdlib.h>
/* An array of pointers may be passed to this, since it'll decay into a pointer
to pointer, but an array of arrays may not. */
void h(int **x) {
assert(sizeof(*x) == sizeof(int*));
}
int main(void) {
int foo[2][4];
f(foo);
g(foo);
h(bar);
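A complete sketch of the three parameter forms might look like this. The bodies of f and g and the way bar is built are assumptions made for illustration; only h and the calls come from the listing.

```c
#include <assert.h>
#include <stdlib.h>

/* Only the outermost dimension decays: x is a pointer to an array of 4 int. */
void f(int x[][4]) {
    assert(sizeof(*x) == sizeof(int) * 4);
}

/* Same parameter type as f, spelled explicitly as a pointer to int[4]. */
void g(int (*x)[4]) {
    assert(sizeof(*x) == sizeof(int) * 4);
}

/* Accepts an array of pointers (decays to int **), but not an array of arrays. */
void h(int **x) {
    assert(sizeof(*x) == sizeof(int *));
}

/* Returns 0 on success, -1 if an allocation fails. */
int pass_arrays_demo(void) {
    int foo[2][4] = {{0}};
    int *bar[2];            /* array of pointers, one per row */
    size_t i;

    f(foo);
    g(foo);

    for (i = 0; i < 2; ++i) {
        bar[i] = calloc(4, sizeof *bar[i]);
        if (bar[i] == NULL) {
            while (i > 0)
                free(bar[--i]);
            return -1;
        }
    }
    h(bar);                 /* bar decays to int ** */

    for (i = 0; i < 2; ++i)
        free(bar[i]);
    return 0;
}
```

The parentheses in int (*x)[4] matter: without them, int *x[4] would declare an array of 4 pointers to int.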
See also
Passing in Arrays to Functions
Chapter 5: Assertion
Introduction
An assertion is a predicate that the presented condition must be true at the moment the assertion
is encountered by the software. Most common are simple assertions, which are validated at
execution time. However, static assertions are checked at compile time.
Syntax
• assert(expression)
• static_assert(expression, message)
• _Static_assert(expression, message)
Parameters
Parameter Details
Remarks
Both assert and static_assert are macros defined in assert.h.
The definition of assert depends on the macro NDEBUG which is not defined by the standard library.
If NDEBUG is defined, assert is a no-op:
#ifdef NDEBUG
# define assert(condition) ((void) 0)
#else
# define assert(condition) /* implementation defined */
#endif
Opinion varies about whether NDEBUG should always be used for production compilations.
• The pro-camp argues that assert calls abort and that assertion messages are not helpful for end
users. If you have fatal conditions to check in production
code you should use ordinary if/else conditions and exit or quick_exit to end the program.
In contrast to abort, these allow the program to do some cleanup (via functions registered
with atexit or at_quick_exit).
• The con-camp argues that assert calls should never fire in production code, but if they do,
the condition that is checked means there is something dramatically wrong and the program
will misbehave worse if execution continues. Therefore, it is better to have the assertions
active in production code because if they fire, hell has already broken loose.
• Another option is to use a home-brew system of assertions which always perform the check
but handle errors differently between development (where abort is appropriate) and
production (where an 'unexpected internal error - please contact Technical Support' may be
more appropriate).
Examples
Precondition and Postcondition
One use case for assertions is checking preconditions and postconditions. This can be very useful for maintaining
invariants and for design by contract. For example, a length is always zero or positive, so this function
must return a zero or positive value.
#include <stdio.h>
/* Uncomment to disable `assert()` */
/* #define NDEBUG */
#include <assert.h>
int length2 (int *a, int count)
{
    int i, result = 0;
    /* Precondition: */
    /* NULL is an invalid vector */
    assert (a != NULL);
    /* Number of dimensions can not be negative.*/
    assert (count >= 0);
    /* Calculation */
    for (i = 0; i < count; ++i)
    {
        result = result + (a[i] * a[i]);
    }
    /* Postcondition: */
    /* Resulting length can not be negative. */
    assert (result >= 0);
    return result;
}
#define COUNT 3
int main (void)
{
    int b[COUNT] = {1, 2, 3}; /* sample values, assumed for illustration */
    int r;
    r = length2 (b, COUNT);
    printf ("r = %i\n", r);
    return 0;
}
Simple Assertion
An assertion is a statement used to assert that a fact must be true when that line of code is
reached. Assertions are useful for ensuring that expected conditions are met. When the condition
passed to an assertion is true, there is no action. The behavior on false conditions depends on
compiler flags. When assertions are enabled, a false input causes an immediate program halt.
When they are disabled, no action is taken. It is common practice to enable assertions in internal
and debug builds, and disable them in release builds, though assertions are often enabled in
release. (Whether termination is better or worse than errors depends on the program.) Assertions
should be used only to catch internal programming errors, which usually means being passed bad
parameters.
#include <stdio.h>
/* Uncomment to disable `assert()` */
/* #define NDEBUG */
#include <assert.h>
int main(void)
{
    int x = -1;
    assert(x >= 0);
    printf("x = %d\n", x);
    return 0;
}
Possible output with NDEBUG defined:
x = -1
It's good practice to define NDEBUG globally, so that you can easily compile your code with all
assertions either on or off. An easy way to do this is to define NDEBUG as an option to the compiler, or
to define it in a shared configuration header (e.g. config.h).
Static Assertion
C11
Static assertions are checked at compile time, not at run time. If the condition is false, the
compiler is required to issue an error message and stop the compiling process.
The first argument, the condition that is checked, must be a constant expression, and the
second a string literal.
#include <assert.h>
enum {N = 5};
_Static_assert(N == 5, "N does not equal 5");
static_assert(N > 10, "N is not greater than 10"); /* compiler error */
C99
Prior to C11, there was no direct support for static assertions. However, in C99, static assertions
could be emulated with macros that would trigger a compilation failure if the compile time condition
was false. Unlike _Static_assert, the second parameter needs to be a proper token name so that a
variable name can be created with it. If the assertion fails, the variable name is seen in the
compiler error, since that variable was used in a syntactically incorrect array declaration.
enum { N = 5 };
STATIC_ASSERT(N == 5, N_must_equal_5);
STATIC_ASSERT(N > 5, N_must_be_greater_than_5); /* compile error */
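One possible definition of such a macro is sketched below. The helper macro names STATIC_MSG and STATIC_MSG2 and the on_line_ prefix are assumptions; the technique is an array declaration whose size becomes negative when the condition is false.

```c
/* Two-step expansion so that __LINE__ is expanded before token pasting. */
#define STATIC_MSG(msg, l) STATIC_MSG2(msg, l)
#define STATIC_MSG2(msg, l) on_line_##l##__##msg

/* The array size is 1 when cond holds and -1 (a compile error) otherwise. */
#define STATIC_ASSERT(cond, msg) \
    extern char STATIC_MSG(msg, __LINE__)[(cond) ? 1 : -1]

enum { N = 5 };
STATIC_ASSERT(N == 5, N_must_equal_5);
/* STATIC_ASSERT(N > 5, N_must_be_greater_than_5);   would fail to compile */
```

Because the declaration is extern and the name is never referenced, no storage is reserved for it.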
Before C99, you could not declare variables at arbitrary locations in a block, so you would have to
be extremely cautious about using this macro, ensuring that it only appears where a variable
declaration would be valid.
During development, when certain code paths must never be reached by control flow,
you may use assert(0) to indicate that such a condition is erroneous:
switch (color) {
    case COLOR_RED:
    case COLOR_GREEN:
    case COLOR_BLUE:
        break;
    default:
        assert(0);
}
Whenever the argument of the assert() macro evaluates false, the macro will write diagnostic
information to the standard error stream and then abort the program. This information includes the
file and line number of the assert() statement and can be very helpful in debugging. Asserts can
be disabled by defining the macro NDEBUG.
Another way to terminate a program when an error occurs is to use the standard library functions
exit, quick_exit or abort. exit and quick_exit take an argument that can be passed back to your
environment. abort() (and thus assert) can be a really severe termination of your program, and
certain cleanups that would otherwise be performed at the end of the execution may not be
performed.
The primary advantage of assert() is that it automatically prints debugging information. Calling
abort() has the advantage that it cannot be disabled like an assert, but it may not cause any
debugging information to be displayed. In some situations, using both constructs together may be
beneficial:
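A sketch of the combined form. The function and its error condition are hypothetical; the point is the assert() followed directly by abort().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: squares n and treats a negative n as a fatal internal error. */
int checked_square(int n)
{
    if (n < 0) {
        assert(n >= 0);   /* when enabled: prints file and line, then aborts */
        abort();          /* reached only when NDEBUG disables the assert */
    }
    return n * n;
}
```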
When asserts are enabled, the assert() call will print debug information and terminate the
program. Execution never reaches the abort() call. When asserts are disabled, the assert() call
does nothing and abort() is called. This ensures that the program always terminates for this error
condition; enabling and disabling asserts only affects whether or not debug output is printed.
You should never leave such an assert in production code, because the debug information is not
helpful for end users and because abort is generally a much too severe termination that prevents
cleanup handlers installed for exit or quick_exit from running.
A trick exists that can display an error message along with an assertion. Normally, you would write
code like this
However, you can use logical AND (&&) to give an error message as well
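Both forms are sketched below. The name f_plain is hypothetical and exists only so that the two variants can sit side by side.

```c
#include <assert.h>
#include <stddef.h>

/* Plain form: on failure only the expression p != NULL is reported. */
void f_plain(const void *p)
{
    assert(p != NULL);
    (void)p;   /* keeps p used when NDEBUG removes the assert */
}

/* With message: the string literal becomes part of the reported expression. */
void f(const void *p)
{
    assert(p != NULL && "function f: p cannot be NULL");
    (void)p;
}
```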
Now, if the assertion fails, an error message will read something like this
Assertion failed: p != NULL && "function f: p cannot be NULL", file main.c, line 5
The reason as to why this works is that a string literal always evaluates to non-zero (true). Adding
&& 1 to a Boolean expression has no effect. Thus, adding && "error message" has no effect either,
except that the compiler will display the entire expression that failed.
Chapter 6: Atomics
Syntax
• #ifdef __STDC_NO_ATOMICS__
• # error this implementation needs atomics
• #endif
• #include <stdatomic.h>
• unsigned _Atomic counter = ATOMIC_VAR_INIT(0);
Remarks
Atomics as part of the C language are an optional feature that is available since C11.
Their purpose is to ensure race-free access to variables that are shared between different threads.
Without atomic qualification, the state of a shared variable would be undefined if two threads
access it concurrently. For example, an increment operation (++) could be split into several assembler
instructions: a read, the addition itself, and a store. If another thread were doing the
same operation, the two instruction sequences could be interleaved and lead to an inconsistent
result.
• Types: All object types with the exception of array types can be qualified with _Atomic.
• Operations: There are some other operations that are specified as type-generic functions,
e.g. atomic_compare_exchange.
• Threads: Access to atomic objects is guaranteed not to produce a data race when they are accessed
by different threads.
• Signal handlers: Atomic types are called lock-free if all operations on them are stateless. In
that case they can also be used to deal with state changes between normal control flow and a
signal handler.
• There is only one data type that is guaranteed to be lock-free: atomic_flag. This is a minimal
type whose operations are intended to map to efficient test-and-set hardware instructions.
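As an illustration of atomic_flag, here is a minimal sketch of a try-lock built on test-and-set. The names busy, try_acquire and release are hypothetical, and the sketch assumes an implementation that provides <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag busy = ATOMIC_FLAG_INIT;

/* Tries to take the flag; returns true if this caller acquired it. */
bool try_acquire(void)
{
    /* test_and_set returns the previous state: false means it was free */
    return !atomic_flag_test_and_set(&busy);
}

/* Releases the flag so that another caller can acquire it. */
void release(void)
{
    atomic_flag_clear(&busy);
}
```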
Other means to avoid race conditions are available in C11's thread interface, in particular a mutex
type mtx_t to mutually exclude threads from accessing critical data or critical sections of code. If
atomics are not available, these must be used to prevent races.
Examples
atomics and operators
Atomic variables can be accessed concurrently between different threads without creating race
conditions.
#include <stdatomic.h>
/* assumed declaration of the shared counter */
unsigned _Atomic active = ATOMIC_VAR_INIT(0);
int myThread(void* a) {
    ++active; // increment active race free
    // do something
    --active; // decrement active race free
    return 0;
}
All lvalue operations (operations that modify the object) that are allowed for the base type are
allowed and will not lead to race conditions between different threads that access them.
• Operations on atomic objects are generally orders of magnitude slower than normal
arithmetic operations. This also includes simple load or store operations. So you should only
use them for critical tasks.
• Usual arithmetic operations and assignment such as a = a+1; are in fact three operations on
a: first a load, then an addition, and finally a store. This is not race free. Only the operations a +=
1; and a++; are.
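The race-free spellings can be sketched as follows. The counter and function names are hypothetical, and the example is single-threaded, so it only shows which forms are single atomic read-modify-write operations.

```c
#include <stdatomic.h>

/* Static storage duration: zero-initialized to a valid atomic state. */
static atomic_uint counter;

/* Increments the counter three times, each time as one atomic operation. */
unsigned bump_counter(void)
{
    counter += 1;                     /* one atomic read-modify-write */
    counter++;                        /* likewise */
    atomic_fetch_add(&counter, 1u);   /* the type-generic function spelling */
    return atomic_load(&counter);
}
```

Writing counter = counter + 1; instead would perform a separate atomic load and atomic store, and another thread could modify counter in between.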
Chapter 7: Bit-fields
Introduction
Most variables in C have a size that is an integral number of bytes. Bit-fields are a part of a
structure that don't necessarily occupy an integral number of bytes; they can be any number of bits.
Multiple bit-fields can be packed into a single storage unit. They are a part of standard C, but there
are many aspects that are implementation defined. They are one of the least portable parts of C.
Syntax
• type-specifier identifier : size;
Parameters
Parameter Description
Remarks
The only portable types for bit-fields are signed, unsigned or _Bool. The plain int type can be used,
but the standard says (§6.7.2¶5) … for bit-fields, it is implementation-defined whether the specifier
int designates the same type as signed int or the same type as unsigned int.
Other integer types may be allowed by a specific implementation, but using them is not portable.
Examples
Bit-fields
A simple bit-field can be used to describe things that may have a specific number of bits involved.
struct encoderPosition {
unsigned int encoderCounts : 23;
unsigned int encoderTurns : 4;
unsigned int _reserved : 5;
};
In this example we consider an encoder with 23 bits of single precision and 4 bits to describe
multi-turn. Bit-fields are often used when interfacing with hardware that outputs data associated
with a specific number of bits. Another example could be communication with an FPGA, where the
FPGA writes data into your memory in 32-bit sections, allowing for hardware reads:
struct FPGAInfo {
    union {
        struct {
            unsigned int bulb1On : 1;
            unsigned int bulb2On : 1;
            unsigned int bulb1Off : 1;
            unsigned int bulb2Off : 1;
            unsigned int jetOn : 1;
        } bits; /* named member, so that fInfo.bits.bulb1On is valid */
        unsigned int data;
    };
};
For this example we have shown a commonly used construct to be able to access the data in its
individual bits, or to write the data packet as a whole (emulating what the FPGA might do). We
could then access the bits like this:
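struct FPGAInfo fInfo;
fInfo.data = 0xFF34F;
if (fInfo.bits.bulb1On) {
    printf("Bulb 1 is on\n");
}
This is valid, but as per the C99 standard 6.7.2.1, item 10, the order of allocation of bit-fields
within a unit (high-order to low-order or low-order to high-order) is implementation-defined.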
You need to be aware of endianness when defining bit-fields in this way. As such it may be
necessary to use a preprocessor directive to check for the endianness of the machine. An
example of this follows:
typedef union {
    struct {
#if defined(WIN32) || defined(LITTLE_ENDIAN)
        uint8_t commFailure :1;
        uint8_t hardwareFailure :1;
        uint8_t _reserved :6;
#else
        uint8_t _reserved :6;
        uint8_t hardwareFailure :1;
        uint8_t commFailure :1;
#endif
    } bits;
    uint8_t data;
} hardwareStatus;
#include <stdio.h>
int main(void)
{
/* define a small bit-field that can hold values from 0 .. 7 */
struct
{
unsigned int uint3: 3;
} small;
return 0;
}
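How such a 3-bit field behaves as a small integer can be sketched with a hypothetical helper. It relies on the rule that a value assigned to an unsigned bit-field is reduced modulo 2 to the power of its width.

```c
/* Stores value in a 3-bit unsigned bit-field and returns what was kept. */
unsigned int keep_low_3_bits(unsigned int value)
{
    struct
    {
        unsigned int uint3 : 3;   /* can hold 0 .. 7 */
    } small;

    small.uint3 = value;          /* reduced modulo 8 on assignment */
    return small.uint3;
}
```

One consequence is that a loop condition such as small.uint3 < 8 can never become false, because the field wraps from 7 back to 0.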
Bit-field alignment
Bit-fields give an ability to declare structure fields that are smaller than the character width. Bit-
fields are implemented with byte-level or word-level mask. The following example results in a
structure of 8 bytes.
struct C
{
    short s;      /* 2 bytes */
    char c;       /* 1 byte */
    int bit1 : 1; /* 1 bit */
    int nib : 4;  /* 4 bits padded up to boundary of 8 bits. Thus 3 bits are padded */
    int sept : 7; /* 7 Bits septet, padded up to boundary of 32 bits. */
};
The comments describe one possible layout, but because the standard says the alignment of the
addressable storage unit is unspecified, other layouts are also possible.
An unnamed bit-field may be of any size, but they can't be initialized or referenced.
A zero-width bit-field cannot be given a name and aligns the next field to the boundary defined by
the datatype of the bit-field. This is achieved by padding bits between the bit-fields.
struct A
{
    unsigned char c1 : 3;
    unsigned char c2 : 4;
    unsigned char c3 : 1;
};
In structure B, the first unnamed bit-field skips 2 bits; the zero width bit-field after c2 causes c3 to
start from the char boundary (so 3 bits are skipped between c2 and c3). There are 3 padding bits
after c4. Thus the size of the structure is 2 bytes.
struct B
{
    unsigned char c1 : 1;
    unsigned char : 2;    /* Skips 2 bits in the layout */
    unsigned char c2 : 2;
    unsigned char : 0;    /* Causes padding up to next container boundary */
    unsigned char c3 : 4;
    unsigned char c4 : 1;
};
A bit-field is used to club together many variables into one object, similar to a structure. This
allows for reduced memory usage and is especially useful in an embedded environment.
e.g. consider the following variables having the ranges as given below.
a --> range 0 - 3
b --> range 0 - 1
c --> range 0 - 7
d --> range 0 - 1
e --> range 0 - 1
If we declare these variables separately, then each has to be at least an 8-bit integer and the total
space required will be 5 bytes. Moreover the variables will not use the entire range of an 8 bit
unsigned integer (0-255). Here we can use bit-fields.
typedef struct {
unsigned int a:2;
unsigned int b:1;
unsigned int c:3;
unsigned int d:1;
unsigned int e:1;
} bit_a;
The bit-fields in the structure are accessed the same way as any other structure member. The programmer
needs to take care that the values written are in range. An out-of-range value assigned to an unsigned
bit-field is reduced modulo 2^width; for a signed bit-field the result is implementation-defined.
#include <stdio.h>
int main(void)
{
    bit_a bita_var;
    bita_var.a = 2; // to write into element a
    printf ("%d",bita_var.a); // to read from element a.
    return 0;
}
Often the programmer wants to zero the set of bit-fields. This can be done element by element,
but there is a second method. Simply create a union of the structure above with an unsigned type
whose size is greater than, or equal to, the size of the structure. Then the entire set of bit-fields may be
zeroed by zeroing this unsigned integer.
#include <stdint.h> /* for uint8_t */
#include <stdio.h>
typedef union {
    struct {
        unsigned int a:2;
        unsigned int b:1;
        unsigned int c:3;
        unsigned int d:1;
        unsigned int e:1;
    };
    uint8_t data;
} union_bit;
Usage is as follows
int main(void)
{
    union_bit un_bit;
    un_bit.data = 0x00; // clear the whole bit-field
    un_bit.a = 2; // write into element a
    printf ("%d",un_bit.a); // read from element a.
    return 0;
}
In conclusion, bit-fields are commonly used in memory constrained situations where you have a lot
of variables which can take on limited ranges.
1. Arrays of bit-fields, pointers to bit-fields and functions returning bit-fields are not allowed.
2. The address operator (&) cannot be applied to bit-field members.
3. The data type of a bit-field must be wide enough to contain the size of the field.
4. The sizeof() operator cannot be applied to a bit-field.
5. There is no way to create a typedef for a bit-field in isolation (though you can certainly create
a typedef for a structure containing bit-fields).
int SomeFunction(void)
{
// Somewhere in the code
A a = { … };
printf("Address of a.c2 is %p\n", &a.c2); /* incorrect, see point 2 */
printf("Size of a.c2 is %zu\n", sizeof(a.c2)); /* incorrect, see point 4 */
}
Chapter 8: Boolean
Remarks
To use the predefined type _Bool and the header <stdbool.h>, you must be using C99 or a later
version of C.
To avoid compiler warnings and possibly errors, you should only use the typedef/define example if
you're using C89 or an earlier version of the language.
Examples
Using stdbool.h
C99
Using the system header file stdbool.h allows you to use bool as a Boolean data type. true
evaluates to 1 and false evaluates to 0.
#include <stdio.h>
#include <stdbool.h>
int main(void) {
    bool x = true; /* equivalent to bool x = 1; */
    bool y = false; /* equivalent to bool y = 0; */
    if (x) /* Functionally equivalent to if (x != 0) or if (x != false) */
    {
        puts("This will print!");
    }
    if (!y) /* Functionally equivalent to if (y == 0) or if (y == false) */
    {
        puts("This will also print!");
    }
}
bool is just a nice spelling for the data type _Bool. It has special rules when numbers or pointers
are converted to it.
Using #define
All versions of C effectively treat any integer value other than 0 as true in conditions
and the integer value 0 as false. If you don't have _Bool or bool (available as of C99), you
could simulate a Boolean data type in C using #define macros, and you might still find such things
in legacy code.
#include <stdio.h>
#define bool int
#define true 1
#define false 0
int main(void) {
    bool x = true; /* Equivalent to int x = 1; */
    bool y = false; /* Equivalent to int y = 0; */
    if (x) /* Functionally equivalent to if (x != 0) or if (x != false) */
    {
        puts("This will print!");
    }
    if (!y) /* Functionally equivalent to if (y == 0) or if (y == false) */
    {
        puts("This will also print!");
    }
}
Don't introduce this in new code since the definition of these macros might clash with modern uses
of <stdbool.h>.
C99
Added in the C standard version C99, _Bool is also a native C data type. It is capable of holding
the values 0 (for false) and 1 (for true).
#include <stdio.h>
int main(void) {
    _Bool x = 1;
    _Bool y = 0;
    if(x) /* Equivalent to if (x == 1) */
    {
        puts("This will print!");
    }
    if (!y) /* Equivalent to if (y == 0) */
    {
        puts("This will also print!");
    }
}
_Bool is an integer type but has special rules for conversions from other types. The result is
analogous to the usage of other types in if expressions. In the following
_Bool z = X;
z becomes 0 if X compares equal to 0 (for a pointer: if X is a null pointer) and 1 otherwise.
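The conversion rules can be sketched with three small helpers whose names are hypothetical. Any non-zero number and any non-null pointer convert to 1; zero and null pointers convert to 0.

```c
#include <stddef.h>

/* Each function converts its argument to _Bool and returns the result as int. */
int bool_from_double(double x)
{
    _Bool z = x;
    return z;
}

int bool_from_int(int x)
{
    _Bool z = x;
    return z;
}

int bool_from_pointer(const void *p)
{
    _Bool z = p;
    return z;
}
```

This differs from conversion to other integer types: converting 0.5 to int gives 0, while converting it to _Bool gives 1.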
To use nicer spellings bool, false and true you need to use <stdbool.h>.
All integers or pointers can be used in an expression that is interpreted as "truth value".
The expression argc % 4 is evaluated and leads to one of the values 0, 1, 2 or 3. The first, 0 is the
only value that is "false" and brings execution into the else part. All other values are "true" and go
into the if part.
Here the pointer A is evaluated and if it is a null pointer, an error is detected and the program exits.
Many people prefer to write something as A == NULL, instead, but if you have such pointer
comparisons as part of other complicated expressions, things become quickly difficult to read.
To check this, you'd have to scan complicated code in the expression and be sure about
operator precedence.
is relatively easy to capture: if the pointer is valid we check if the first character is non-zero and
then check if it is a letter.
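Both uses of truth values can be sketched as small functions. The names are hypothetical; the conditions follow the descriptions above.

```c
#include <ctype.h>

/* Returns 1 if n is not divisible by 4: any non-zero remainder counts as true. */
int not_divisible_by_4(int n)
{
    if (n % 4)
        return 1;
    return 0;
}

/* Returns 1 if s is a non-null, non-empty string that starts with a letter. */
int starts_with_letter(const char *s)
{
    /* the pointer and then the first character are used directly as truth values */
    if (s && s[0] && isalpha((unsigned char)s[0]))
        return 1;
    return 0;
}
```

Because && evaluates left to right and stops at the first false operand, s[0] is never read when s is a null pointer.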
Considering that most debuggers are not aware of #define macros, but can check enum constants,
it may be desirable to do something like this:
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
typedef enum { false, true } bool;
/* Modern C code might expect these to be macros. */
# ifndef true
# define true true
# endif
# ifndef false
# define false false
# endif
#else
# include <stdbool.h>
#endif
This allows compilers for historic versions of C to function, but remains forward compatible if the
code is compiled with a modern C compiler.
For more information on typedef, see Typedef; for more on enum, see Enumerations.
Chapter 9: Command-line arguments
Syntax
• int main(int argc, char *argv[])
Parameters
Parameter Details
Remarks
A C program running in a 'hosted environment' (the normal type, as opposed to a 'freestanding
environment') must have a main function. It is traditionally defined as:
int main(int argc, char *argv[])
Note that argv can also be, and very often is, defined as char **argv; the behavior is the same.
Also, the parameter names can be changed because they're just local variables within the
function, but argc and argv are conventional and you should use those names.
For main functions where the code does not use any arguments, use int main(void).
• argc is initialized to the number of space-separated arguments given to the program on the
command line, including the program name itself.
• argv is an array of char-pointers (strings) containing the arguments (and the program name)
that were given on the command line.
• Some systems expand command-line arguments "in the shell", others do not. On Unix, if the
user types myprogram *.txt the program will receive a list of text files; on Windows it will
receive the string "*.txt".
Note: Before using argv, you might need to check the value of argc. In theory, argc could be 0, and
if argc is zero, then there are no arguments and argv[0] (equivalent to argv[argc]) is a null pointer.
It would be an unusual system with a hosted environment if you ran into this problem. Similarly, it
is possible, though very unusual, for there to be no information about the program name. In that
case, argv[0][0] == '\0' — the program name may be empty.
Suppose we start the program like this:
./some_program abba banana mamajam
Then argc is equal to 4, and the command-line arguments are:
• argv[0] points to "./some_program" (the program name) if the program name is available from
the host environment. Otherwise it points to an empty string "".
• argv[1] points to "abba",
• argv[2] points to "banana",
• argv[3] points to "mamajam",
• argv[4] contains the value NULL.
See also What should main() return in C and C++ for complete quotes from the standard.
Examples
Printing the command line arguments
Notes
The following code will print the arguments to the program, and the code will attempt to convert
each argument into a number (to a long):
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <limits.h>
int main(int argc, char *argv[]) {
    for (int i = 1; i < argc; i++) {
        printf("Argument %d is: %s\n", i, argv[i]);
        errno = 0; /* strtol reports overflow only through errno */
        char *p;
        long argument_numValue = strtol(argv[i], &p, 10);
        if (p == argv[i]) {
            /* no characters were consumed, so nothing could be converted */
            fprintf(stderr, "Argument %d is not a number.\n", i);
        }
        else if ((argument_numValue == LONG_MIN || argument_numValue == LONG_MAX) && errno == ERANGE) {
            fprintf(stderr, "Argument %d is out of range.\n", i);
        }
        else {
            printf("Argument %d is a number, and the value is: %ld\n",
                   i, argument_numValue);
        }
    }
    return 0;
}
Command-line options for applications are not treated any differently from command-line
arguments by the C language. They are just arguments which, in a Linux or Unix environment,
traditionally begin with a dash (-).
With glibc in a Linux or Unix environment you can use the getopt tools to easily define, validate,
and parse command-line options from the rest of your arguments.
These tools expect your options to be formatted according to the GNU Coding Standards, which is
an extension of what POSIX specifies for the format of command-line options.
The example below demonstrates handling command-line options with the GNU getopt tools.
#include <stdio.h>
#include <getopt.h>
#include <string.h>
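/* print a description of all supported options */
void usage (FILE *fp, const char *path)
{
    /* take only the last portion of the path */
    const char *basename = strrchr(path, '/');
    basename = basename ? basename + 1 : path;
    fprintf (fp, "usage: %s [OPTION]\n", basename);
    fprintf (fp, "  -h, --help\t\tPrint this help and exit.\n");
    fprintf (fp, "  -f, --file[=FILENAME]\tWrite all output to a file (defaults to out.txt).\n");
    fprintf (fp, "  -m, --msg=STRING\tOutput a particular message.\n");
}
/* parse command-line options and print message */
int main (int argc, char *argv[])
{
    /* for brevity this example uses fixed buffer sizes (256 is an arbitrary choice) */
    char filename[256] = { 0 };
    char message[256] = "Hello world"; /* illustrative default message */
    FILE *fp;
    int help_flag = 0;
    int opt;
    /* table of supported long options; fields: name, has_arg, flag, val */
    struct option longopts[] = {
        { "help", no_argument, &help_flag, 1 },
        { "file", optional_argument, NULL, 'f' },
        { "msg", required_argument, NULL, 'm' },
        { 0 }
    };
    /* loop until getopt_long reports that no options remain */
    while (1) {
        /* one colon after a short option means it takes an argument,
         * two colons mean the argument is optional */
        opt = getopt_long (argc, argv, "hf::m:", longopts, 0);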
if (opt == -1) {
/* a return value of -1 indicates that there are no more options */
break;
}
switch (opt) {
case 'h':
/* the help_flag and value are specified in the longopts table,
* which means that when the --help option is specified (in its long
* form), the help_flag variable will be automatically set.
* however, the parser for short-form options does not support the
* automatic setting of flags, so we still need this code to set the
* help_flag manually when the -h option is specified.
*/
help_flag = 1;
break;
case 'f':
/* optarg is a global variable in getopt.h. it contains the argument
* for this option. it is null if there was no argument.
*/
printf ("outarg: '%s'\n", optarg ? optarg : "(none)"); /* never pass NULL to %s */
strncpy (filename, optarg ? optarg : "out.txt", sizeof (filename));
/* strncpy does not fully guarantee null-termination */
filename[sizeof (filename) - 1] = '\0';
break;
case 'm':
/* since the argument for this option is required, getopt guarantees
* that optarg is non-null.
*/
strncpy (message, optarg, sizeof (message));
message[sizeof (message) - 1] = '\0';
break;
case '?':
/* a return value of '?' indicates that an option was malformed.
* this could mean that an unrecognized option was given, or that an
* option which requires an argument did not include an argument.
*/
usage (stderr, argv[0]);
return 1;
default:
break;
}
}
if (help_flag) {
usage (stdout, argv[0]);
return 0;
}
if (filename[0]) {
fp = fopen (filename, "w");
} else {
fp = stdout;
}
if (!fp) {
fprintf(stderr, "Failed to open file.\n");
return 1;
}
fprintf (fp, "%s\n", message); /* write the message to the chosen stream */
fclose (fp);
return 0;
}
Assuming the source is saved as example.c, it can be compiled with gcc:
gcc example.c -o example
It supports three command-line options (--help, --file, and --msg). All have a "short form" as well
(-h, -f, and -m). The "file" and "msg" options both accept arguments. If you specify the "msg"
option, its argument is required.
Chapter 10: Comments
Introduction
Comments are used to indicate something to the person reading the code. Comments are treated
like a blank by the compiler and do not change anything in the code's actual meaning. There are
two syntaxes used for comments in C, the original /* */ and the slightly newer //. Some
documentation systems use specially formatted comments to help produce the documentation for
code.
Syntax
• /*...*/
• //... (C99 and later only)
Examples
/* */ delimited comments
A comment starts with a forward slash followed immediately by an asterisk (/*), and ends as soon
as an asterisk immediately followed by a forward slash (*/) is encountered. Everything in between
these character combinations is a comment and is treated as a blank (basically ignored) by the
compiler.
/* this is a comment */
The comment above is a single line comment. Comments of this /* type can span multiple lines,
like so:
/* this is a
multi-line
comment */
Though it is not strictly necessary, a common style convention with multi-line comments is to put
leading spaces and asterisks on the lines subsequent to the first, and the /* and */ on new lines,
such that they all line up:
/*
* this is a
* multi-line
* comment
*/
The extra asterisks do not have any functional effect on the comment as none of them have a
related forward slash.
These /* type of comments can be used on their own line, at the end of a code line, or even within
lines of code:
Comments cannot be nested. This is because any subsequent /* will be ignored (as part of the
comment) and the first */ reached will be treated as ending the comment. The comment in the
following example will not work:
/* outer comment, means this is ignored => /* attempted inner comment */ <= ends the comment,
not this one => */
To comment out blocks of code that contain comments of this type, which would otherwise be nested,
see the Commenting using the preprocessor example below.
// delimited comments
C99
C99 introduced the use of C++-style single-line comments. This type of comment starts with two
forward slashes and runs to the end of a line:
// this is a comment
This type of comment does not allow multi-line comments, though it is possible to make a
comment block by adding several single line comments one after the other:
This type of comment may be used on its own line or at the end of a code line. However, because
it runs to the end of the line, it may not be used within a code line.
Large chunks of code can also be "commented out" using the preprocessor directives #if 0 and
#endif. This is useful when the code contains multi-line comments that otherwise would not nest.
#if 0 /* Starts the "comment"; everything up to the matching #endif is removed by the preprocessor */
/* A function whose own comments could not be nested inside another comment */
int foo(void)
{
    /* lots of code */
    return 0;
}
#endif /* 0 */
C99
While writing // delimited comments, it is possible to make a typographical error that affects their
expected operation. Suppose a comment such as // Why did I do this??/ is typed at the end of a line.
The / at the end was a typo, but the sequence ??/ now gets interpreted as \, because ??/ forms a
trigraph.
The ??/ trigraph is a longhand notation for \, which is the line continuation symbol. This
means that the compiler treats the next line as a continuation of the current line, that is, as a
continuation of the comment, which may not be what is intended.
Chapter 11: Common C programming idioms
and developer practices
Examples
Comparing literal and variable
if ( i == 2) //Bad-way
{
doSomething;
}
Now suppose you mistakenly type = instead of ==. The code still compiles, the condition assigns 2 to i and is always true, and the bug can take a long time to track down.
if( 2 == i) //Good-way
{
doSomething;
}
Then, if an equal sign is accidentally left out, the compiler will complain about an “attempted
assignment to literal.” This won’t protect you when comparing two variables, but every little bit
helps.
Suppose you are creating a function that requires no arguments when it is called and you are
faced with the dilemma of how you should define the parameter list in the function prototype and
the function definition.
• You have the choice of keeping the parameter list empty for both prototype and definition.
Thereby, they look just like the function call statement you will need.
• You read somewhere that one of the uses of keyword void (there are only a few of them), is
to define the parameter list of functions that do not accept any arguments in their call. So,
this is also a choice.
GENERAL ADVICE: If a language provides a feature intended for a specific purpose, you are
better off using it in your code. For example, use enums instead of #define macros (that's for
another example).
The C11 standard (6.7.6.3 Function declarators, paragraph 10) states:
The special case of an unnamed parameter of type void as the only item in the list
specifies that the function has no parameters.
A simplified explanation provided by K&R (pgs- 72-73) for the above stuff:
int foo(void);
int foo(void)
{
...
<statements>
...
return 1;
}
One advantage of using the above, over an int foo() type of declaration (i.e. without the
keyword void), is that the compiler can detect the error if you call your function with an erroneous
statement like foo(42). Such a call would not cause any diagnostics if you
left the parameter list blank: the error would pass silently and undetected, and the code would still
execute.
This also means that you should define the main() function like this:
int main(void)
{
...
<statements>
...
return 0;
}
Note that even though a function defined with an empty parameter list takes no arguments, it does
not provide a prototype for the function, so the compiler will not complain if the function is
subsequently called with arguments. For example:
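#include <stdio.h>
/* Empty parameter list: a definition, but not a prototype */
static void parameterless()
{
    printf("%s called\n", __func__);
}
int main(void)
{
    parameterless(3, "arguments", "provided");
    return 0;
}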
If that code is saved in the file proto79.c, it can be compiled on Unix with GCC (version 7.1.0 on
macOS Sierra 10.12.5 used for demonstration) like this:
If you give the function the formal prototype static void parameterless(void), then the compilation
gives errors:
Moral — always make sure you have prototypes, and make sure your compiler tells you when you
are not obeying the rules.
Chapter 12: Common pitfalls
Introduction
This section discusses some of the common mistakes that a C programmer should be aware of
and should avoid making. For more on some unexpected problems and their causes, please see
Undefined behavior
Examples
Mixing signed and unsigned integers in arithmetic operations
It is usually not a good idea to mix signed and unsigned integers in arithmetic operations. For
example, what will be output of following example?
#include <stdio.h>
int main(void)
{
    unsigned int a = 1000;
    signed int b = -1;
    if (a > b)
        puts("a is more than b");
    else
        puts("a is less than or equal to b");
    return 0;
}
Since 1000 is more than -1 you would expect the output to be a is more than b, however that will
not be the case.
Arithmetic operations between different integral types are performed within a common type
defined by the so called usual arithmetic conversions (see the language specification, 6.3.1.8).
In this case the "common type" is unsigned int because, as stated in the usual arithmetic conversions:
Otherwise, if the operand that has unsigned integer type has rank greater or equal
to the rank of the type of the other operand, then the operand with signed integer type
is converted to the type of the operand with unsigned integer type.
This means that int operand b will get converted to unsigned int before the comparison.
When -1 is converted to an unsigned int the result is the maximal possible unsigned int value,
which is greater than 1000, meaning that a > b is false.
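The = operator is used for assignment; the == operator is used for comparison.
One should be careful not to mix the two. Sometimes one mistakenly writes the first of the following two forms when the second was intended: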
/* assign y to x */
if (x = y) {
/* logic */
}
/* compare if x is equal to y */
if (x == y) {
/* logic */
}
The former assigns the value of y to x and checks whether that value is non-zero, instead of
comparing the two. It is equivalent to:
if ((x = y) != 0) {
/* logic */
}
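There are times when testing the result of an assignment is intended and is commonly used,
because it avoids having to duplicate code and having to treat the first iteration specially. A typical
case is a loop whose condition both stores and tests the value returned by an input function, as
opposed to calling that function once before the loop and again at the end of the loop body.
Modern compilers recognise this pattern and do not warn when the assignment is wrapped in an
extra pair of parentheses, as in if ((x = y) != 0) above, but may warn for other usages. For example: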
if (x = y) /* warning */
Some programmers use the strategy of putting the constant to the left of the operator (commonly
called Yoda conditions). Because constants are rvalues, this style of condition will cause the
compiler to throw an error if the wrong operator was used.
if (5 = y) /* Error */
if (5 == y) /* No error */
However, this severely reduces the readability of the code and is not considered necessary if the
programmer follows good C coding practices, and doesn't help when comparing two variables so it
isn't a universal solution. Furthermore, many modern compilers may give warnings when code is
written with Yoda conditions.
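A stray semicolon directly after a condition is easy to overlook. The code
if (x > a);
a = x;
actually means:
if (x > a) {}
a = x;
which means x will be assigned to a in any case, which might not be what you wanted originally.
A missing semicolon can be just as hard to spot:
if (i < 0)
return
day = date[0];
hour = date[1];
minute = date[2];
Because the statement does not end at the line break, this is parsed as return day = date[0]; so the assignment becomes the return expression and is only executed when i < 0.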
One technique to avoid this and similar problems is to always use braces on multi-line conditionals
and loops. For example:
if (x > a) {
a = x;
}
When you are copying a string into a malloced buffer, always remember to add 1 to strlen.
char *dest = malloc(strlen(src)); /* WRONG */
char *dest = malloc(strlen(src) + 1); /* RIGHT */
strcpy(dest, src);
This is because strlen does not include the trailing \0 in the length. If you take the WRONG (as shown
above) approach, upon calling strcpy, your program would invoke undefined behaviour.
It also applies to situations when you are reading a string of known maximum length from stdin or
some other source. For example
#define MAX_INPUT_LEN 42
A programming best practice is to free any memory that has been allocated directly by your own
code, or implicitly by calling an internal or external function, such as a library API like strdup().
Failing to free memory can introduce a memory leak, which could accumulate into a substantial
amount of wasted memory that is unavailable to your program (or the system), possibly leading to
crashes or undefined behavior. Problems are more likely to occur if the leak is incurred repeatedly
in a loop or recursive function. The risk of program failure increases the longer a leaking program
runs. Sometimes problems appear instantly; other times problems won't be seen for hours or even
years of constant operation. Memory exhaustion failures can be catastrophic, depending on the
circumstances.
The following infinite loop is an example of a leak that will eventually exhaust available memory
by calling getline(), a function that implicitly allocates new memory, without freeing that
memory.
#include <stdlib.h>
#include <stdio.h>
int main(void)
{
char *line = NULL;
size_t size = 0;
for(;;) {
getline(&line, &size, stdin); /* New memory implicitly allocated */
/* <do whatever> */
line = NULL;
}
return 0;
}
In contrast, the code below also uses the getline() function, but this time, the allocated memory is
correctly freed, avoiding a leak.
#include <stdlib.h>
#include <stdio.h>
int main(void)
{
    char *line = NULL;
    size_t size = 0;
    for(;;) {
        if (getline(&line, &size, stdin) < 0) {
            free(line);
            line = NULL;
            /* Handle failure such as setting flag, breaking out of loop and/or exiting */
            break;
        }
        /* <do whatever> */
        free(line);
        line = NULL;
    }
    return 0;
}
Leaking memory doesn't always have tangible consequences and isn't necessarily a functional
problem. While "best practice" dictates rigorously freeing memory at strategic points and
conditions, to reduce memory footprint and lower risk of memory exhaustion, there can be
exceptions. For example, if a program is bounded in duration and scope, the risk of allocation
failure might be considered too small to worry about. In that case, bypassing explicit deallocation
might be considered acceptable. For example, most modern operating systems automatically free
all memory consumed by a program when it terminates, whether it is due to program failure, a
system call to exit(), process termination, or reaching end of main(). Explicitly freeing memory at
the point of imminent program termination could actually be redundant or introduce a performance
penalty.
Allocation can fail if insufficient memory is available, and handling failures should be accounted for
at appropriate levels of the call stack. getline(), shown above is an interesting use-case because
it is a library function that not only allocates memory it leaves to the caller to free, but can fail for a
number of reasons, all of which must be taken into account. Therefore, it is essential when using a
C API, to read the documentation (man page) and pay particular attention to error conditions and
memory usage, and be aware which software layer bears the burden of freeing returned memory.
Another common memory handling practice is to consistently set memory pointers to NULL
immediately after the memory referenced by those pointers is freed, so those pointers can be
tested for validity at any time (e.g. checked for NULL / non-NULL), because accessing freed
memory can lead to severe problems such as getting garbage data (read operation), or data
corruption (write operation) and/or a program crash. Passing a null pointer to free() is a
no-op (i.e. it is harmless), as required by the C standard, so by
setting a pointer to NULL, there is no risk of double-freeing memory if the pointer is passed to
free() again. Keep in mind that double-freeing memory can lead to very time-consuming, confusing, and
difficult-to-diagnose failures.
If the user enters a string longer than 7 characters (the 8-byte buffer minus 1 for the null terminator), memory behind the
buffer buf will be overwritten. This results in undefined behavior. Attackers often exploit
this in order to overwrite the return address, and change it to the address of their own malicious
code.
If realloc fails, it returns NULL and leaves the original block untouched. If you assign realloc's return value
directly to the pointer that holds the original buffer and realloc returns NULL, then the old pointer is lost, resulting in a memory leak.
The solution is to store the result in a temporary pointer, and only if that temporary is not NULL, copy it into
the real buffer pointer.
buf = malloc(...);
...
/* WRONG */
if ((buf = realloc(buf, 16)) == NULL)
perror("realloc");
/* RIGHT */
if ((tmp = realloc(buf, 16)) != NULL)
buf = tmp;
else
perror("realloc");
Floating point types (float, double and long double) cannot precisely represent some numbers
because they have finite precision and represent the values in a binary format. Just like we have
repeating decimals in base 10 for fractions such as 1/3, there are fractions that cannot be
represented finitely in binary too (such as 1/3, but also, more importantly, 1/10). Do not directly
compare floating point values; use a delta instead.
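#include <math.h> // for fabs()
#include <stdio.h>
int main(void)
{
    double a = 0.1; // imprecise: (binary) 0.000110...
    double sum = a + a + a + a + a + a + a + a + a + a;
    if (sum == 1.0) // may be true or false
        puts("10 * 0.1 is exactly 1.0 (not guaranteed in general)");
    if (fabs(sum - 1.0) < 0.000001) // compare using a small delta
        puts("10 * 0.1 is almost 1.0");
    return 0;
}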
Another example:
int main(void)
{
double d1 = 3.14159265358979;
double d2 = 355.0 / 113.0;
Output:
8:3.1415926536 <=> 3.1415929204 out of tolerance 0.0000000100 (rel diff 8.4914E-08)
9:3.1415926536 <=> 3.1415929204 out of tolerance 0.0000000010 (rel diff 8.4914E-08)
In pointer arithmetic, the integer added to or subtracted from a pointer is interpreted not as a change
of address in bytes but as the number of elements to move.
#include <stdio.h>
int main(void) {
int array[] = {1, 2, 3, 4, 5};
int *ptr = &array[0];
int *ptr2 = ptr + sizeof(int) * 2; /* wrong */
printf("%d %d\n", *ptr, *ptr2);
return 0;
}
This code applies extra scaling when calculating the pointer assigned to ptr2. If sizeof(int) is 4, which is
typical on modern 32-bit and 64-bit platforms, the expression stands for "8 elements after array[0]", which
is out-of-range, and it invokes undefined behavior.
To have ptr2 point at what is 2 elements after array[0], you should simply add 2.
#include <stdio.h>
int main(void) {
int array[] = {1, 2, 3, 4, 5};
int *ptr = &array[0];
int *ptr2 = ptr + 2;
printf("%d %d\n", *ptr, *ptr2); /* "1 3" will be printed */
return 0;
}
Explicit pointer arithmetic using additive operators may be confusing, so using array subscripting
may be better.
#include <stdio.h>
int main(void) {
int array[] = {1, 2, 3, 4, 5};
int *ptr = &array[0];
int *ptr2 = &ptr[2];
printf("%d %d\n", *ptr, *ptr2); /* "1 3" will be printed */
return 0;
}
E1[E2] is identical to (*((E1)+(E2))) (N1570 6.5.2.1, paragraph 2), and &(E1[E2]) is equivalent to
((E1)+(E2)) (N1570 6.5.3.2, footnote 102).
Alternatively, if pointer arithmetic is preferred, casting the pointer to address a different data type
can allow byte addressing. Be careful though: endianness can become an issue, and casting to
types other than 'pointer to character' leads to strict aliasing problems.
#include <stdio.h>
int main(void) {
int array[3] = {1,2,3}; // 4 bytes * 3 allocated
unsigned char *ptr = (unsigned char *) array; // unsigned chars only take 1 byte
/*
* Now any pointer arithmetic on ptr will match
* bytes in memory. ptr can be treated like it
* was declared as: unsigned char ptr[12];
*/
return 0;
}
Macros are simple string replacements. (Strictly speaking, they work with preprocessing tokens,
not arbitrary strings.)
#include <stdio.h>
#define SQUARE(x) x*x
int main(void) {
    printf("%d\n", SQUARE(1+2));
    return 0;
}
You may expect this code to print 9 (3*3), but actually 5 will be printed because the macro will be
expanded to 1+2*1+2.
You should wrap the arguments and the whole macro expression in parentheses to avoid this
problem.
#include <stdio.h>
#define SQUARE(x) ((x)*(x))
int main(void) {
    printf("%d\n", SQUARE(1+2));
    return 0;
}
Another problem is that the arguments of a macro are not guaranteed to be evaluated once; they
may not be evaluated at all, or may be evaluated multiple times.
#include <stdio.h>
#define MIN(x, y) ((x) <= (y) ? (x) : (y))
int main(void) {
    int a = 0;
    printf("%d\n", MIN(a++, 10));
    printf("a = %d\n", a);
    return 0;
}
In this code, the macro will be expanded to ((a++) <= (10) ? (a++) : (10)). Since a++ (0) is smaller
than 10, a++ will be evaluated twice, which makes both the value of a and the value returned from MIN
differ from what you may expect.
This can be avoided by using functions, but note that the types will be fixed by the function
definition, whereas macros can be (too) flexible with types.
#include <stdio.h>
int min(int x, int y)
{
    return x <= y ? x : y;
}
int main(void) {
    int a = 0;
    printf("%d\n", min(a++, 10));
    printf("a = %d\n", a);
    return 0;
}
Now the problem of double-evaluation is fixed, but this min function cannot deal with double data
without truncating, for example.
Macros come in two kinds: object-like and function-like. What distinguishes these two types of macros is the character that follows the identifier after
#define: if it is a left parenthesis, it is a function-like macro; otherwise, it is an object-like macro. If the
intention is to write a function-like macro, there must not be any white space between the end of
the name of the macro and (.
C99
In C99 or later, you could use static inline int min(int x, int y) { … }.
C11
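#include <stdio.h>
/* Select the helper that matches the type of the first argument */
#define min(x, y) _Generic((x), \
    long double: min_ld, \
    unsigned long long: min_ull, \
    default: min_i \
    )(x, y)
/* Generate one typed helper function per invocation */
#define gen_min(suffix, type) \
    static inline type min_##suffix(type x, type y) { return (x < y) ? x : y; }
gen_min(ld, long double)
gen_min(ull, unsigned long long)
gen_min(i, int)
int main(void)
{
    unsigned long long ull1 = 50ULL;
    unsigned long long ull2 = 37ULL;
    printf("min(%llu, %llu) = %llu\n", ull1, ull2, min(ull1, ull2));
    long double ld1 = 3.141592653L;
    long double ld2 = 3.141592652L;
    printf("min(%.10Lf, %.10Lf) = %.10Lf\n", ld1, ld2, min(ld1, ld2));
    int i1 = 3141653;
    int i2 = 3141652;
    printf("min(%d, %d) = %d\n", i1, i2, min(i1, i2));
    return 0;
}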
The generic expression could be extended with more types such as double, float, long long,
unsigned long, long, unsigned — and appropriate gen_min macro invocations written.
One of the most common errors in compilation happens during the linking stage. The error looks
similar to this:
$ gcc undefined_reference.c
/tmp/ccoXhwF0.o: In function `main':
undefined_reference.c:(.text+0x15): undefined reference to `foo'
collect2: error: ld returned 1 exit status
$
int foo(void);
We see here a declaration of foo (int foo(void);) but no definition of it (the actual function body). So we
provided the compiler with the function header, but no such function was defined anywhere,
so the compilation stage passes but the linker exits with an undefined reference error.
To fix this error in our small program we would only have to add a definition for foo:
/* Declaration of foo */
int foo(void);
/* Definition of foo */
int foo(void)
{
    return 5;
}
int main(void)
{
    return foo(); /* resolved at link time because foo is now defined */
}
Now this code will compile. An alternative situation arises where the source for foo() is in a
separate source file foo.c (and there's a header foo.h to declare foo() that is included in both foo.c
and undefined_reference.c). Then the fix is to link both the object file from foo.c and
undefined_reference.c, or to compile both the source files:
$ gcc -c undefined_reference.c
$ gcc -c foo.c
$ gcc -o working_program undefined_reference.o foo.o
$
Or:
$ gcc -o working_program undefined_reference.c foo.c
$
A more complex case is where libraries are involved, like in the code:
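#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main(int argc, char **argv)
{
    double first;
    double second;
    if (argc != 3)
    {
        fprintf(stderr, "Usage: %s <base> <exponent>\n", argv[0]);
        return EXIT_FAILURE;
    }
    /* Translate user input to numbers; extra error checking should be done here. */
    first = strtod(argv[1], NULL);
    second = strtod(argv[2], NULL);
    /* pow() lives in libm: this causes a link error unless libm is linked in */
    printf("%f to the power of %f = %f\n", first, second, pow(first, second));
    return EXIT_SUCCESS;
}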
The code is syntactically correct, declaration for pow() exists from #include <math.h>, so we try to
compile and link but get an error like this:
This happens because the definition for pow() wasn't found during the linking stage. To fix this we
have to specify we want to link against the math library called libm by specifying the -lm flag. (Note
that there are platforms such as macOS where -lm is not needed, but when you get the undefined
reference, the library is needed.)
So we run the compilation stage again, this time specifying the library (after the source or object
files):
And it works!
A common problem in code that uses multidimensional arrays, arrays of pointers, etc. is the fact
that Type** and Type[M][N] are fundamentally different types:
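#include <stdio.h>
void print_strings(char **strings, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        puts(strings[i]);
}
int main(void)
{
    char s[4][20] = {"Example 1", "Example 2", "Example 3", "Example 4"};
    print_strings(s, 4);
    return 0;
}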
print_strings(strings, 4);
^
file1.c:3:10: note: expected 'char **' but argument is of type 'char (*)[20]'
void print_strings(char **strings, size_t n)
The error states that the s array in the main function is passed to the function print_strings, which
expects a different pointer type than it received. It also includes a note expressing the type that is
expected by print_strings and the type that was passed to it from main.
The problem is due to something called array decay. What happens when s with its type
char[4][20] (array of 4 arrays of 20 chars) is passed to the function is it turns into a pointer to its
first element as if you had written &s[0], which has the type char (*)[20] (pointer to 1 array of 20
chars). This occurs for any array, including an array of pointers, an array of arrays of arrays (3-D
arrays), and an array of pointers to an array. Below is a table illustrating what happens when an
array decays. Changes in the type description are highlighted to illustrate what happens:
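Before Decay | After Decay
char [20] (array of 20 chars) | char * (pointer to 1 char)
char [4][20] (array of 4 arrays of 20 chars) | char (*)[20] (pointer to 1 array of 20 chars)
char *[4] (array of 4 pointers to 1 char) | char ** (pointer to 1 pointer to 1 char)
char [3][4][20] (array of 3 arrays of 4 arrays of 20 chars) | char (*)[4][20] (pointer to 1 array of 4 arrays of 20 chars)
char (*[4])[20] (array of 4 pointers to 1 array of 20 chars) | char (**)[20] (pointer to 1 pointer to 1 array of 20 chars)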
If an array can decay to a pointer, then it can be said that a pointer may be considered an array of
at least 1 element. An exception to this is a null pointer, which points to nothing and is
consequently not an array.
Array decay only happens once. If an array has decayed to a pointer, it is now a pointer, not an
array. Even if you have a pointer to an array, remember that the pointer might be considered an
array of at least one element, so array decay has already occurred.
In other words, a pointer to an array (char (*)[20]) will never become a pointer to a pointer (char
**). To fix the print_strings function, simply make it receive the correct type:
A problem arises when you want the print_strings function to be generic for any array of chars:
what if there are 30 chars instead of 20? Or 50? The answer is to add another parameter before
the array parameter:
#include <stdio.h>
/*
* Note the rearranged parameters and the change in the parameter name
* from the previous definitions:
* n (number of strings)
* => scount (string count)
*
* Of course, you could also use one of the following highly recommended forms
* for the `strings` parameter instead:
*
* char strings[scount][ccount]
* char strings[][ccount]
*/
void print_strings(size_t scount, size_t ccount, char (*strings)[ccount])
{
size_t i;
for (i = 0; i < scount; i++)
puts(strings[i]);
}
int main(void)
{
char s[4][20] = {"Example 1", "Example 2", "Example 3", "Example 4"};
print_strings(4, 20, s);
return 0;
}
Example 1
Example 2
Example 3
Example 4
When allocating multidimensional arrays with malloc, calloc, and realloc, a common pattern is to
allocate the inner arrays with multiple calls (even if the call only appears once, it may be in a loop):
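/* Could also be `int **` with malloc used to allocate outer array. */
int *array[4];
int i;
/* Allocate 4 arrays of 16 ints, one malloc call per inner array. */
for (i = 0; i < 4; i++)
    array[i] = malloc(16 * sizeof(*array[i]));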
The difference in bytes between the last element of one of the inner arrays and the first element of
the next inner array may not be 0 as they would be with a "real" multidimensional array (e.g. int
array[4][16];):
/* 0x40003c, 0x402000 */
printf("%p, %p\n", (void *)(array[0] + 15), (void *)array[1]);
Taking into account the size of int, you get a difference of 8128 bytes (8132-4), which is 2032
int-sized array elements, and that is the problem: a "real" multidimensional array has no gaps
between elements.
If you need to use a dynamically allocated array with a function expecting a "real" multidimensional
array, you should allocate an object of type int * and use arithmetic to perform calculations:
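A minimal sketch of that approach, using two hypothetical helpers (alloc_matrix and matrix_at are illustrative names, not part of any library). It assumes row-major storage, so element (i, j) lives at index i * cols + j:

```c
#include <stdlib.h>

/* Hypothetical helper: one zero-initialized block of rows * cols ints. */
int *alloc_matrix(size_t rows, size_t cols)
{
    return calloc(rows, cols * sizeof(int));
}

/* Hypothetical helper: address of element (i, j), stored row by row. */
int *matrix_at(int *matrix, size_t cols, size_t i, size_t j)
{
    return &matrix[i * cols + j];
}
```

Because the block is contiguous, the last element of one row is immediately followed by the first element of the next row, just as in `int array[4][16];`.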
If N is a macro or an integer literal rather than a variable, the code can simply use the more natural
2-D array notation after allocating a pointer to an array:
int M = 4;
int (*array)[N];
array = calloc(M, sizeof(*array));
array[i][j] = 1;
/* Cast to `int *` works here because `array` is a single block of M*N ints with no gaps,
just like `int array2[M * N];` and `int array3[M][N];` would be. */
func(M, N, (int *)array);
func_N(M, array);
C99
If N is not a macro or an integer literal, then array will point to a variable-length array (VLA). This
can still be used with func by casting to int * and a new function func_vla would replace func_N:
int M = 4, N = 16;
int (*array)[N];
array = calloc(M, sizeof(*array));
array[i][j] = 1;
func(M, N, (int *)array);
func_vla(M, N, array);
C11
Note: VLAs are optional as of C11. If your implementation supports C11 and defines the macro
__STDC_NO_VLA__ to 1, you are stuck with the pre-C99 methods.
A character surrounded by single quotes like 'a' is a character constant. A character constant is
an integer whose value is the character code that stands for the character. How to interpret
character constants with multiple characters like 'abc' is implementation-defined.
Zero or more characters surrounded by double quotes like "abc" is a string literal. A string literal is
an unmodifiable array whose elements have type char. The string in the double quotes plus the
terminating null character are the contents, so "abc" has 4 elements ({'a', 'b', 'c', '\0'}).
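The difference in type and size can be checked with sizeof. This is a small sketch with hypothetical function names:

```c
#include <string.h>

/* sizeof counts the terminating null character; strlen does not. */
size_t abc_array_size(void) { return sizeof "abc"; }
size_t abc_length(void)     { return strlen("abc"); }

/* In C, a character constant such as 'a' has type int, not char. */
size_t char_constant_size(void) { return sizeof 'a'; }
```

Here abc_array_size() yields 4, abc_length() yields 3, and char_constant_size() equals sizeof(int).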
In this example, a character constant is used where a string literal should be used. This character
constant will be converted to a pointer in an implementation-defined manner and there is little
chance for the converted pointer to be valid, so this example will invoke undefined behavior.
#include <stdio.h>
int main(void) {
const char *hello = 'hello, world'; /* bad */
puts(hello);
return 0;
}
In this example, a string literal is used where a character constant should be used. The pointer
converted from the string literal will be converted to an integer in an implementation-defined
manner, and it will be converted to char in an implementation-defined manner. (How to convert an
integer to a signed type which cannot represent the value to convert is implementation-defined,
and whether char is signed is also implementation-defined.) The output will be meaningless.
#include <stdio.h>
int main(void) {
char c = "a"; /* bad */
printf("%c\n", c);
return 0;
}
In almost all cases, the compiler will complain about these mix-ups. If it doesn't, you need to enable
more compiler warning options or use a better compiler.
Almost every function in the C standard library returns something on success, and something else on
error. For example, malloc will return a pointer to the memory block allocated by the function on
success, and, if the function failed to allocate the requested block of memory, a null pointer. So
you should always check the return value for easier debugging.
This is bad:
This is good:
#include <stdlib.h>
#include <stdio.h>
int main(void)
{
    char* x = malloc(100000000000UL * sizeof *x);
    if (x == NULL) {
        perror("malloc() failed");
        exit(EXIT_FAILURE);
    }
    if (scanf("%s", x) != 1) {
        fprintf(stderr, "could not read string\n");
        free(x);
        exit(EXIT_FAILURE);
    }
    /* Do stuff with x. */
    /* Clean up. */
    free(x);
    return EXIT_SUCCESS;
}
This way you know the cause of the error right away; otherwise you might spend hours looking for a
bug in a completely wrong place.
#include <stdio.h>
#include <string.h>
int main(void) {
    int num = 0;
    char str[128], *lf;
    scanf("%d", &num);
    fgets(str, sizeof(str), stdin);
    if ((lf = strchr(str, '\n')) != NULL) *lf = '\0'; /* strip the trailing newline */
    printf("%d \"%s\"\n", num, str);
    return 0;
}
Given this input:
42
life
the program prints 42 "" instead of the expected 42 "life". This is because the newline character
after 42 is not consumed by the call to scanf(), so fgets() consumes it before it gets to life and
stops reading there.
To avoid this problem, one way that is useful when the maximum length of a line is known (when
solving problems in an online judge system, for example) is to avoid using scanf() directly and
read all lines via fgets(). You can use sscanf() to parse the lines read.
#include <stdio.h>
#include <string.h>
int main(void) {
int num = 0;
char line_buffer[128] = "", str[128], *lf;
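A complete sketch of this approach, wrapped in a hypothetical helper read_number_and_line() so that it works on any stream. It assumes the first line is shorter than 128 bytes:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads a number from the first line and the text of
 * the second line (without its newline). Returns 1 on success, 0 otherwise. */
int read_number_and_line(FILE *in, int *num, char *str, size_t size)
{
    char line_buffer[128];
    char *lf;

    /* Read the whole first line, then parse the number out of it. */
    if (fgets(line_buffer, sizeof(line_buffer), in) == NULL)
        return 0;
    if (sscanf(line_buffer, "%d", num) != 1)
        return 0;

    /* The newline after the number is already consumed, so this reads line two. */
    if (fgets(str, (int)size, in) == NULL)
        return 0;
    if ((lf = strchr(str, '\n')) != NULL)
        *lf = '\0';
    return 1;
}
```

Called as read_number_and_line(stdin, &num, str, sizeof(str)), it leaves no stray newline between the two reads.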
Another way is to read until you hit a newline character after using scanf() and before using
fgets().
#include <stdio.h>
#include <string.h>
int main(void) {
    int num = 0;
    char str[128], *lf;
    int c;
    scanf("%d", &num);
    while ((c = getchar()) != '\n' && c != EOF); /* discard the rest of the line */
    fgets(str, sizeof(str), stdin);
    if ((lf = strchr(str, '\n')) != NULL) *lf = '\0';
    printf("%d \"%s\"\n", num, str);
    return 0;
}
It is easy to get confused by the C preprocessor and treat it as part of C itself, but that is a mistake
because the preprocessor is just a text substitution mechanism. For example, if you write
/* WRONG */
#define MAX 100;
int arr[MAX];
the code expands to
int arr[100;];
which is a syntax error. The remedy is to remove the semicolon from the #define line. It is almost
invariably a mistake to end a #define with a semicolon.
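For comparison, a corrected sketch. The enum alternative and the names MAX_ITEMS and items are additions for illustration, not part of the original example:

```c
#define MAX 100          /* no trailing semicolon */

int arr[MAX];            /* expands to: int arr[100]; */

/* An enumeration constant is handled by the compiler proper rather than by
 * text substitution, and it is also usable as an array size. */
enum { MAX_ITEMS = 100 };
int items[MAX_ITEMS];
```

Both arrays have 100 elements; only the first depends on textual replacement.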
/*
* max(): Finds the largest integer in an array and returns it.
* If the array length is less than 1, the result is undefined.
* arr: The array of integers to search.
* num: The number of integers in arr.
*/
int max(int arr[], int num)
{
int max = arr[0];
for (int i = 0; i < num; i++)
if (arr[i] > max)
max = arr[i];
return max;
}
/*
* max(): Finds the largest integer in an array and returns it.
* If the array length is less than 1, the result is undefined.
* arr: The array of integers to search.
* num: The number of integers in arr.
*/
int max(int arr[], int num)
{
int max = arr[0];
for (int i = 0; i < num; i++)
if (arr[i] > max)
max = arr[i];
return max;
}
//Causes an error on the line below...
*/
/*
*/
Another solution is to avoid disabling code using comment syntax, using #ifdef or #ifndef
preprocessor directives instead. These directives do nest, leaving you free to comment your code
in the style you prefer.
#define DISABLE_MAX /* Remove or comment this line to enable max() code block */
#ifndef DISABLE_MAX
/*
 * max(): Finds the largest integer in an array and returns it.
 * If the array length is less than 1, the result is undefined.
 * arr: The array of integers to search.
 * num: The number of integers in arr.
 */
int max(int arr[], int num)
{
    int max = arr[0];
    for (int i = 0; i < num; i++)
        if (arr[i] > max)
            max = arr[i];
    return max;
}
#endif
Some guides go so far as to recommend that code sections must never be commented out and that, if
code is to be temporarily disabled, one should resort to using an #if 0 directive.
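A sketch of the #if 0 technique. The MAX_AVAILABLE flag is a hypothetical addition so that other code can test whether the block is enabled:

```c
#if 0   /* change 0 to 1 to re-enable the block */
/*
 * Block comments inside the disabled region cause no trouble:
 * the preprocessor skips everything up to the matching #endif.
 */
int max(int arr[], int num)
{
    int max = arr[0];
    for (int i = 0; i < num; i++)
        if (arr[i] > max)
            max = arr[i];
    return max;
}
#define MAX_AVAILABLE 1
#endif

#ifndef MAX_AVAILABLE
#define MAX_AVAILABLE 0   /* hypothetical flag: max() is currently disabled */
#endif
```

With the condition set to 0, neither max() nor the first definition of MAX_AVAILABLE reaches the compiler.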
Arrays are zero-based: the index always starts at 0 and ends at the array length minus 1. Thus the
following code will not output the first element of the array and will output garbage for the final
value that it prints (strictly speaking, reading past the end of the array is undefined behavior).
#include <stdio.h>
int main(void)
{
    int x = 0;
    int myArray[5] = {1, 2, 3, 4, 5}; //Declaring 5 elements
    for (x = 1; x <= 5; x++) //Looping from 1 to 5: skips myArray[0], reads myArray[5]
        printf("%d\t", myArray[x]);
    printf("\n");
    return 0;
}
Output: 2 3 4 5 GarbageValue
The following demonstrates the correct way to achieve the desired output:
#include <stdio.h>
int main(void)
{
    int x = 0;
    int myArray[5] = {1, 2, 3, 4, 5}; //Declaring 5 elements
    for (x = 0; x < 5; x++) //Looping from 0 to 4
        printf("%d\t", myArray[x]);
    printf("\n");
    return 0;
}
Output: 1 2 3 4 5
It is important to know the length of an array before working with it as otherwise you may corrupt
the buffer or cause a segmentation fault by accessing memory locations that are out of bounds.
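One way to keep the loop bound tied to the array length is to compute it with sizeof. This is a sketch; ARRAY_LEN and sum_of_example are hypothetical names, and the macro is only valid for true arrays, not for pointers or array parameters:

```c
#include <stddef.h>

/* Number of elements of a true array. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

int sum_of_example(void)
{
    int myArray[5] = {1, 2, 3, 4, 5};
    int sum = 0;
    /* Valid indices run from 0 to ARRAY_LEN(myArray) - 1. */
    for (size_t x = 0; x < ARRAY_LEN(myArray); x++)
        sum += myArray[x];
    return sum;
}
```

If the declared size of myArray changes, the loop bound follows automatically.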
Calculating the factorial of a number is a classic example of a recursive function.
#include <stdio.h>
int factorial(int n)
{
return n * factorial(n - 1);
}
int main()
{
printf("Factorial %d = %d\n", 3, factorial(3));
return 0;
}
The problem with this function is that it would recurse without end, typically exhausting the stack
and causing a segmentation fault. It needs a base condition to stop the recursion.
#include <stdio.h>
int factorial(int n)
{
if (n == 1) // Base Condition, very crucial in designing the recursive functions.
{
return 1;
}
else
{
return n * factorial(n - 1);
}
}
int main()
{
printf("Factorial %d = %d\n", 3, factorial(3));
return 0;
}
Sample output
Factorial 3 = 6
This function will terminate as soon as it hits the condition n is equal to 1 (provided the initial value
of n is at least 1 and small enough; the upper bound is 12 when int is a 32-bit quantity).
Rules to be followed:
1. Initialize the algorithm. Recursive programs often need a seed value to start with. This is
accomplished either by using a parameter passed to the function or by providing a gateway
function that is non-recursive but that sets up the seed values for the recursive calculation.
2. Check to see whether the current value(s) being processed match the base case. If so,
process and return the value.
3. Redefine the answer in terms of a smaller or simpler sub-problem or sub-problems.
4. Run the algorithm on the sub-problem.
5. Combine the results in the formulation of the answer.
6. Return the results.
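The rules above can be sketched with a recursive array sum. The function names are hypothetical; sum_array is the non-recursive gateway that supplies the seed value, and sum_from is the recursive worker:

```c
#include <stddef.h>

/* Recursive worker: sums values[index] .. values[count - 1]. */
static long sum_from(const int *values, size_t index, size_t count)
{
    if (index == count)          /* rule 2: base case, nothing left to add */
        return 0;
    /* rules 3 to 5: solve the smaller sub-problem and combine the results */
    return values[index] + sum_from(values, index + 1, count);
}

/* Gateway (rule 1): validates input and seeds the recursion with index 0. */
long sum_array(const int *values, size_t count)
{
    if (values == NULL)
        return 0;
    return sum_from(values, 0, count);   /* rule 6: return the result */
}
```

Callers only see sum_array(); the extra index parameter stays an implementation detail of the worker.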
The original C standard had no intrinsic Boolean type, so bool, true and false had no inherent
meaning and were often defined by programmers. Typically true would be defined as 1 and false
would be defined as 0.
C99
C99 adds the built-in type _Bool and the header <stdbool.h> which defines bool (expanding to _Bool
), false and true. It also allows you to redefine bool, true and false, but notes that this is an
obsolescent feature.
More importantly, logical expressions treat anything that evaluates to zero as false and any non-
zero evaluation as true. For example:
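A minimal sketch of such a function; the name is_upper_bit_set_buggy is hypothetical and the code assumes C99's <stdbool.h>:

```c
#include <stdbool.h>
#include <stdint.h>

/* Intended to return true if the most significant bit is set. */
bool is_upper_bit_set_buggy(uint8_t bitfield)
{
    /* (bitfield & 0x80) is either 0 or 0x80, while true is 1,
     * so this comparison never succeeds. */
    if ((bitfield & 0x80) == true)
        return true;
    else
        return false;
}
```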
In the above example, the function is trying to check if the upper bit is set and return true if it is.
However, by explicitly checking against true, the if statement will only succeed if (bitfield &
0x80) evaluates to whatever true is defined as, which is typically 1 and very seldom 0x80. Either
explicitly check against the case you expect:
{
return false;
}
}
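Two correct ways of writing the check can be sketched as follows (function names are hypothetical): compare against the exact value expected, or rely on any non-zero result counting as true.

```c
#include <stdbool.h>
#include <stdint.h>

/* Explicitly test for the value we expect. */
bool is_upper_bit_set(uint8_t bitfield)
{
    return (bitfield & 0x80) == 0x80;
}

/* Treat any non-zero result as true. */
bool is_upper_bit_set_nonzero(uint8_t bitfield)
{
    return (bitfield & 0x80) != 0;
}
```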
Care must be taken when initializing variables of type float to literal values or comparing them
with literal values, because regular floating point literals like 0.1 are of type double. This may lead
to surprises:
#include <stdio.h>
int main() {
    float n;
    n = 0.1;
    if (n > 0.1) printf("Weird\n");
    return 0;
}
// Prints "Weird" when n is float
Here, n gets initialized and rounded to single precision, resulting in the value 0.10000000149011612.
Then, n is converted back to double precision to be compared with the 0.1 literal (which equals
0.10000000000000001), resulting in a mismatch.
Besides rounding errors, mixing float variables with double literals will result in poor performance
on platforms which don't have hardware support for double precision.
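One way to avoid the mismatch is to use a float literal (suffix f) so that both sides of the comparison have the same type. A sketch with hypothetical function names:

```c
/* Compares n with the float nearest to 0.1: both operands are float. */
int exceeds_tenth_float(float n)
{
    const float tenth = 0.1f;
    return n > tenth;
}

/* Compares n with the double nearest to 0.1: n is converted to double first. */
int exceeds_tenth_double(float n)
{
    const double tenth = 0.1;
    return n > tenth;
}
```

Passing 0.1f to the first function gives 0; passing it to the second reproduces the surprising result of the example above.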
Chapter 13: Compilation
Introduction
The C language is traditionally a compiled language (as opposed to interpreted). The C Standard
defines translation phases, and the product of applying them is a program image (or compiled
program). In C11, the phases are listed in §5.1.1.2.
Remarks
Filename extension    Description
.exe, .com            Windows executable file. Formed by linking object files and library files. In
                      Unix-like systems, there is no special file name extension for executable
                      files.
Compilers on POSIX platforms (Linux, mainframes, Mac) usually accept these options, even if
they are not called c99.
• See also c99 - compile standard C programs
-Wimplicit-function-declaration    Warn about implicit function declaration.
Examples
The Linker
The job of the linker is to link together a bunch of object files (.o files) into a binary executable.
The process of linking mainly involves resolving symbolic addresses to numerical addresses. The
result of the link process is normally an executable program.
During the link process, the linker will pick up all the object modules specified on the command
line, add some system-specific startup code in front and try to resolve all external references in the
object module with external definitions in other object files (object files can be specified directly on
the command line or may implicitly be added through libraries). It will then assign load addresses
for the object files, that is, it specifies where the code and data will end up in the address space of
the finished program. Once it's got the load addresses, it can replace all the symbolic addresses in
the object code with "real", numerical addresses in the target's address space. The program is
ready to be executed now.
This includes both the object files that the compiler created from your source code files as well as
object files that have been pre-compiled for you and collected into library files. These files have
names which end in .a or .so, and you normally don't need to know about them, as the linker
knows where most of them are located and will link them in automatically as needed.
Like the pre-processor, the linker is a separate program, often called ld (GCC on Linux, for
example, runs it through a wrapper called collect2). Also like the pre-processor, the linker is
invoked automatically for you when you use the compiler. Thus, the normal way of using the linker
is as follows:
% gcc foo.o bar.o baz.o -o myprog
This line tells the compiler to link together three object files (foo.o, bar.o, and baz.o) into a binary
executable file named myprog. Now you have a file called myprog that you can run and which will
hopefully do something cool and/or useful.
It is possible to invoke the linker directly, but this is seldom advisable, and is typically very
platform-specific. That is, options that work on Linux won't necessarily work on Solaris, AIX,
macOS, Windows, and similarly for any other platform. If you work with GCC, you can use gcc -v
to see what is executed on your behalf.
The linker also takes some arguments to modify its behavior. The following command would tell
gcc to link foo.o and bar.o, but also include the ncurses library.
% gcc foo.o bar.o -lncurses
The -lncurses option makes the linker search for libncurses.so (or libncurses.a, which is just an
archive created with ar). Note that you should list the libraries (either by pathname or via -lname
options) after the object files. With static libraries, the order that they are specified matters;
often, with shared libraries, the order doesn't matter.
Note that on many systems, if you are using mathematical functions (from <math.h>), you need to
specify -lm to load the mathematics library, but macOS (formerly Mac OS X) does not require this.
There are other libraries that are separate libraries on Linux and other Unix systems, but not on
macOS: POSIX threads, POSIX realtime, and networking libraries are examples.
Consequently, the linking process varies between platforms.
This is all you need to know to begin compiling your own C programs. Generally, we also
recommend that you use the -Wall command-line option:
% gcc -Wall -c foo.c
The -Wall option causes the compiler to warn you about legal but dubious code constructs, and
will help you catch a lot of bugs very early.
If you want the compiler to throw more warnings at you (including variables that are declared but
not used, forgetting to return a value etc.), you can add further options, as -Wall, despite the
name, doesn't turn all of the possible warnings on; -Wextra is a common addition.
Note that clang has an option -Weverything which really does turn on all warnings in clang.
File Types
1. Source files: These files contain function definitions, and have names which end in .c by
convention. Note: .cc and .cpp are C++ files; not C files.
e.g., foo.c
2. Header files: These files contain function prototypes and various pre-processor statements
(see below). They are used to allow source code files to access externally-defined functions.
Header files end in .h by convention.
e.g., foo.h
3. Object files: These files are produced as the output of the compiler. They consist of function
definitions in binary form, but they are not executable by themselves. Object files end in .o
by convention, although on some operating systems (e.g. Windows, MS-DOS), they often
end in .obj.
e.g., foo.o foo.obj
4. Binary executables: These are produced as the output of a program called a "linker". The
linker links together a number of object files to produce a binary file which can be directly
executed. Binary executables have no special suffix on Unix operating systems, although
they generally end in .exe on Windows.
e.g., foo foo.exe
5. Libraries: A library is a compiled binary but is not in itself an executable (i.e., there is no
main() function in a library). A library contains functions that may be used by more than one
program. A library should ship with header files which contain prototypes for all functions in
the library; these header files should be referenced (e.g., #include <library.h>) in any source
file that uses the library. The linker then needs to be referred to the library so the program
can be successfully linked. There are two types of libraries: static and dynamic.
• Static library: A static library (.a files for POSIX systems and .lib files for Windows —
not to be confused with DLL import library files, which also use the .lib extension) is
statically built into the program. Static libraries have the advantage that the program
knows exactly which version of a library is used. On the other hand, the sizes of
executables are bigger as all used library functions are included.
e.g., libfoo.a foo.lib
• Dynamic library: A dynamic library (.so files for most POSIX systems, .dylib for macOS
and .dll files for Windows) is dynamically linked at runtime by the program. These are
also sometimes referred to as shared libraries because one library image can be
shared by many programs. Dynamic libraries have the advantage of taking up less disk
space if more than one application is using the library. Also, they allow library updates
(bug fixes) without having to rebuild executables.
e.g., foo.so foo.dylib foo.dll
The Preprocessor
Before the C compiler starts compiling a source code file, the file is processed in a preprocessing
phase. This phase can be done by a separate program or be completely integrated in one
executable. In any case, it is invoked automatically by the compiler before compilation proper
begins. The preprocessing phase converts your source code into another source code or
translation unit by applying textual replacements. You can think of it as a "modified" or "expanded"
source code. That expanded source may exist as a real file in the file system, or it may only be
stored in memory for a short time before being processed further.
Preprocessor commands start with the pound sign ("#"). There are several preprocessor
commands; the most important are:
1. Defines:
With a macro defined to expand to 1000000, a declaration that initializes a with that macro
becomes
int a = 1000000;
#define is used in this way so as to avoid having to explicitly write out some constant value in
many different places in a source code file. This is important in case you need to change the
constant value later on; it's much less bug-prone to change it once, in the #define, than to
have to change it in multiple places scattered all over the code.
Because #define just does advanced search and replace, you can also declare macros. For
instance, a function-like macro whose body is a do { ... } while(0) block that normalizes its
argument to 1 or 0, invoked on a after the assignment shown, becomes:
// in the function:
a = x;
do {
    a = a ? 1 : 0;
} while(0);
At first approximation, this effect is roughly the same as with inline functions, but the
preprocessor doesn't provide type checking for #define macros. This is well known to be
error-prone and their use necessitates great caution.
Also note here that the preprocessor would also replace comments with blanks, as
explained below.
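The difference between a macro and an inline function can be sketched as follows. All names are hypothetical, and the helper next_value() exists only to make double evaluation visible:

```c
/* Function-like macro: pure text substitution, so an argument with a side
 * effect may be evaluated more than once. */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* Inline function: arguments are type-checked and evaluated exactly once. */
static inline int max_inline(int a, int b)
{
    return a > b ? a : b;
}

/* Hypothetical helper with a visible side effect: counts its own calls. */
static int calls = 0;
static int next_value(void)
{
    return ++calls;
}
```

Starting from calls == 0, MAX_MACRO(next_value(), 0) calls the helper twice and yields 2, whereas max_inline(next_value(), 0) calls it once and yields 1.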
2. Includes:
#include is used to access declarations of functions that are defined outside of a source code
file. For instance:
#include <stdio.h>
causes the preprocessor to paste the contents of <stdio.h> into the source code file at the
location of the #include statement before it gets compiled. #include is almost always used to
include header files, which are files which mainly contain function declarations and #define
statements. In this case, we use #include in order to be able to use functions such as printf
and scanf, whose declarations are located in the file stdio.h. C compilers do not allow you to
use a function unless it has previously been declared or defined in that file; #include
statements are thus the way to re-use previously-written code in your C programs.
3. Logic operations:
#if defined A || defined B
variable = another_variable + 1;
#else
variable = another_variable * 2;
#endif
will be changed to:
variable = another_variable + 1;
if A or B were defined somewhere in the project before. If this is not the case, of course the
preprocessor will do this:
variable = another_variable * 2;
This is often used for code that runs on different systems or compiles with different compilers.
Since there are global defines that are compiler- or system-specific, you can test for those
defines and let the compiler see only the code that is valid for that platform.
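A sketch of such a platform test. PATH_SEPARATOR and path_separator() are hypothetical names; _WIN32 is a macro predefined by compilers that target Windows:

```c
/* Select a platform-specific constant at preprocessing time. */
#if defined(_WIN32)
#define PATH_SEPARATOR '\\'
#else
#define PATH_SEPARATOR '/'
#endif

/* Only one of the two definitions above ever reaches the compiler. */
char path_separator(void)
{
    return PATH_SEPARATOR;
}
```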
4. Comments
The Preprocessor replaces all comments in the source file by single spaces. Comments are
indicated by // up to the end of the line, or a combination of opening /* and closing */
comment brackets.
The Compiler
After the C pre-processor has included all the header files and expanded all macros, the compiler
can compile the program. It does this by turning the C source code into an object code file, which
is a file ending in .o which contains the binary version of the source code. Object code is not
directly executable, though. In order to make an executable, you also have to add code for all of
the library functions that were #included into the file (this is not the same as including the
declarations, which is what #include does). This is the job of the linker.
In general, the exact sequence how to invoke a C compiler depends much on the system that you
are using. Here we are using the GCC compiler, though it should be noted that many more
compilers exist:
% gcc -Wall -c foo.c
% is the OS' command prompt. This tells the compiler to run the pre-processor on the file foo.c and
then compile it into the object code file foo.o. The -c option means to compile the source code file
into an object file but not to invoke the linker. This option -c is available on POSIX systems, such
as Linux or macOS; other systems may use different syntax.
If your entire program is in one source code file, you can instead do this:
% gcc -Wall foo.c -o foo
This tells the compiler to run the pre-processor on foo.c, compile it and then link it to create an
executable called foo. The -o option states that the next word on the line is the name of the binary
executable file (program). If you don't specify the -o option (if you just type gcc foo.c), the
executable will be named a.out for historical reasons.
In general the compiler takes four steps when converting a .c file into an executable:
1. pre-processing - textually expands #include directives and #define macros in your .c file
2. compilation - converts the program into assembly (you can stop the compiler at this step by
adding the -S option)
3. assembly - converts the assembly into machine code
4. linkage - links the object code to external libraries to create an executable
Note also that the name of the compiler we are using is GCC, which stands for both "GNU C
compiler" and "GNU compiler collection", depending on context. Other C compilers exist. For
Unix-like operating systems, many of them have the name cc, for "C compiler", which is often a
symbolic link to some other compiler. On Linux systems, cc is often an alias for GCC. On macOS
(formerly OS X), it points to clang.
The POSIX standard currently mandates c99 as the name of a C compiler; it supports the C99
standard by default. Earlier versions of POSIX mandated c89 as the compiler. POSIX also
mandates that this compiler understands the options -c and -o that we used above.
Note: The -Wall option present in both gcc examples tells the compiler to print warnings about
questionable constructions, which is strongly recommended. It is also a good idea to add other
warning options, e.g. -Wextra.
The C 2011 Standard (§5.1.1.2, Translation Phases) specifies that the translation of source code to
a program image (e.g., the executable) occurs in 8 ordered steps.
1. The source file input is mapped to the source character set (if necessary). Trigraphs are
replaced in this step.
2. Continuation lines (lines that end with \) are spliced with the next line.
3. The source code is parsed into whitespace and preprocessing tokens.
4. The preprocessor is applied, which executes directives, expands macros, and applies
pragmas. Each source file pulled in by #include undergoes translation phases 1 through 4
(recursively if necessary). All preprocessor related directives are then deleted.
5. Source character set values in character constants and string literals are mapped to the
execution character set.
6. String literals adjacent to each other are concatenated.
7. The source code is parsed into tokens, which comprise the translation unit.
8. External references are resolved, and the program image is formed.
An implementation of a C compiler may combine several steps together, but the resulting image
must still behave as if the above steps had occurred separately in the order listed above.
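Two of these phases are easy to observe from source code. The sketch below uses hypothetical names: SUM_OF_THREE relies on line splicing (phase 2) and greeting relies on string literal concatenation (phase 6).

```c
#include <string.h>

/* Phase 2: each backslash-newline pair is removed, so this is one logical line. */
#define SUM_OF_THREE 1 + \
                     2 + \
                     3

/* Phase 6: adjacent string literals are concatenated into one literal. */
static const char greeting[] = "hello, " "world";

size_t greeting_length(void)
{
    return strlen(greeting);
}
```

Here (SUM_OF_THREE) evaluates to 6 and greeting holds the 12 characters of "hello, world" plus the terminating null character.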
Chapter 14: Compound Literals
Syntax
• (type){ initializer-list }
Remarks
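C standard says in C11-§6.5.2.5/3:
A postfix expression that consists of a parenthesized type name followed by a brace-enclosed
list of initializers is a compound literal. It provides an unnamed object whose value is given by
the initializer list.99)
99) Note that this differs from a cast expression. For example, a cast specifies a conversion to scalar types
or void only, and the result of a cast expression is not an lvalue.
Note that:
String literals, and compound literals with const-qualified types, need not designate
distinct objects.101)
101) This allows implementations to share storage for string literals and constant compound literals with
the same or overlapping representations.
Like string literals, const-qualified compound literals can be placed into read-only
memory and can even be shared. For example,
(const char []){"abc"} == "abc"
might yield 1 if the literals' storage is shared.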
Examples
Definition/Initialisation of Compound Literals
A compound literal is an unnamed object which is created in the scope where it is defined. The
concept was first introduced in the C99 standard. An example of a compound literal is
int *p = (int [2]){ 2, 4 };
p is initialized to the address of the first element of an unnamed array of two ints.
The compound literal is an lvalue. The storage duration of the unnamed object is either static (if
the literal appears at file scope) or automatic (if the literal appears at block scope), and in the latter
case the object's lifetime ends when control leaves the enclosing block.
void f(void)
{
int *p;
/*...*/
p = (int [2]){ *p };
/*...*/
}
p is assigned the address of the first element of an array of two ints, the first having the
value previously pointed to by p and the second, zero.[...]
struct point {
unsigned x;
unsigned y;
};
A fictive function drawline receives two arguments of type struct point. The first has coordinate
values x == 1 and y == 1, whereas the second has x == 3 and y == 4
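Such a call can be sketched as follows. Since drawline is not defined here, the hypothetical stand-in squared_length() takes the same two struct point arguments and returns a number instead of drawing:

```c
struct point {
    unsigned x;
    unsigned y;
};

/* Hypothetical stand-in for drawline(): squared distance between a and b. */
unsigned squared_length(struct point a, struct point b)
{
    unsigned dx = b.x - a.x;
    unsigned dy = b.y - a.y;
    return dx * dx + dy * dy;
}

unsigned example_call(void)
{
    /* Two unnamed struct point objects are created in the argument list. */
    return squared_length((struct point){ .x = 1, .y = 1 },
                          (struct point){ .x = 3, .y = 4 });
}
```

No named temporaries are needed; each compound literal lives until the enclosing block ends.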
If the size of the array is not specified, as in a compound literal of type int [], it is determined by
the length of the initializer.
Compound literal having length of initializer less than array size specified
int *p = (int [10]){1, 2, 3};
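As with ordinary array initializers, elements without an explicit initializer are set to zero. A small sketch (the function name is hypothetical) that sums all ten elements:

```c
/* Sums the ten elements of the unnamed array; only the first three are nonzero. */
int sum_partial_literal(void)
{
    int *p = (int [10]){1, 2, 3};
    int sum = 0;
    for (int i = 0; i < 10; i++)
        sum += p[i];          /* p[3] through p[9] are 0 */
    return sum;
}
```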
void foo(void)
{
    int *p;
    int i = 2, j = 5;
    /*...*/
    p = (int [2]){ i+j, i*j };
    /*...*/
}
Chapter 15: Constraints
Remarks
Constraint is a term used in all of the existing C specifications (most recently ISO/IEC 9899:2011).
Constraints are one of the three parts of the language described in clause 6 of the standard
(alongside syntax and semantics).
(Please also note that, in terms of the C standard, a "runtime-constraint" is not a kind of constraint
and is governed by substantially different rules.)
In other words, a constraint describes a rule of the language which would make an otherwise
syntactically valid program illegal. In this respect constraints are somewhat like undefined
behavior: any program which does not follow them is not defined in terms of the C language.
On the other hand, constraints differ from undefined behavior in one very significant way:
an implementation is required to provide a diagnostic message during the translation
phase (part of compilation) if a constraint is breached. This message may be a warning or may halt
the compilation.
Examples
Duplicate variable names in the same scope
An example of a constraint as expressed in the C standard is the prohibition on having two variables
of the same name declared in a scope (see footnote 1), for example declaring one identifier twice,
with two different types, inside the same function body.
Such code breaches the constraint and must produce a diagnostic message at compile time. This
is very useful as compared to undefined behavior, as the developer will be informed of the issue
before the program is run, when it could potentially do anything.
Constraints thus tend to be errors which are easily detectable at compile time, such as this one.
Issues which result in undefined behavior but would be difficult or impossible to detect at compile
time are thus not constraints.
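The duplicate-declaration rule can be sketched as follows. The function name is hypothetical, and the offending declaration is left commented out so that the sketch compiles; the inner block shows the legal alternative of redeclaring the name in a new scope.

```c
int scoped_declarations(void)
{
    int var = 1;
    /* double var;   would be a constraint violation: a second declaration
                     of var in the same scope and name space */
    int result = var;
    {
        int var = 2;     /* allowed: a new inner scope that hides the outer var */
        result += var;
    }
    return result;
}
```

Enabling the commented-out line obliges the compiler to issue a diagnostic, whereas the inner redeclaration is accepted.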
1) exact wording:
C99
If an identifier has no linkage, there shall be no more than one declaration of the identifier (in a
declarator or type specifier) with the same scope and in the same name space, except for tags as
specified in 6.7.2.3.
The unary + and - operators are only usable on arithmetic types. Therefore if, for example, one tries
to use them on a struct, the program will produce a diagnostic, e.g.:
#include <stdbool.h>
struct foo
{
    bool bar;
};
void baz(void)
{
    struct foo testStruct;
    -testStruct; /* This breaks the constraint so must produce a diagnostic */
}
Chapter 16: Create and include header files
Introduction
In modern C, header files are crucial tools that must be designed and used correctly. They allow
the compiler to cross-check independently compiled parts of a program.
Headers declare types, functions, macros etc. that are needed by the consumers of a set of
facilities. All the code that uses any of those facilities includes the header. All the code that defines
those facilities includes the header. This allows the compiler to check that the uses and definitions
match.
Examples
Introduction
There are a number of guidelines to follow when creating and using header files in a C project:
• Idempotence
If a header file is included multiple times in a translation unit (TU), it should not break builds.
• Self-containment
If you need the facilities declared in a header file, you should not have to include any other
headers explicitly.
• Minimality
You should not be able to remove any information from a header without causing builds to
fail.
• Include What You Use (IWYU)
Of more concern to C++ than C, but nevertheless important in C too. If the code in a TU (call
it code.c) directly uses the features declared by a header (call it "headerA.h"), then code.c
should #include "headerA.h" directly, even if the TU includes another header (call it
"headerB.h") that happens, at the moment, to include "headerA.h".
Occasionally, there might be good enough reasons to break one or more of these guidelines, but
you should both be aware that you are breaking the rule and be aware of the consequences of
doing so before you break it.
Idempotence
If a particular header file is included more than once in a translation unit (TU), there should not be
any compilation problems. This is termed 'idempotence'; your headers should be idempotent.
Think how difficult life would be if you had to ensure that #include <stdio.h> was only included
once.
There are two ways to achieve idempotence: header guards and the #pragma once directive.
Header guards
Header guards are simple and reliable and conform to the C standard. The first non-comment
lines in a header file should be of the form:
#ifndef UNIQUE_ID_FOR_HEADER
#define UNIQUE_ID_FOR_HEADER
The last non-comment line should be #endif, optionally with a comment after it:
#endif /* UNIQUE_ID_FOR_HEADER */
All the operational code, including other #include directives, should be between these lines.
Each name must be unique. Often, a name scheme such as HEADER_H_INCLUDED is used. Some
older code uses a symbol defined as the header guard (e.g. #ifndef BUFSIZ in <stdio.h>), but it is
not as reliable as a unique name.
One option would be to use a generated MD5 (or other) hash for the header guard name. You
should avoid emulating the schemes used by system headers which frequently use names
reserved to the implementation — names starting with an underscore followed by either another
underscore or an upper-case letter.
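The effect of a header guard can be sketched in a single file. The snippet below assumes a hypothetical header point.h and pastes its contents twice to mimic two #include directives in one TU; the names POINT_H_INCLUDED, struct point, point_sum and guard_demo are invented for illustration.

```c
#include <assert.h>

/* ---- first "#include" of the hypothetical point.h ---- */
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED

struct point { int x; int y; };

static inline int point_sum(struct point p)
{
    return p.x + p.y;
}

#endif /* POINT_H_INCLUDED */

/* ---- second "#include" of the same header ---- */
#ifndef POINT_H_INCLUDED   /* already defined, so everything up to #endif is skipped */
#define POINT_H_INCLUDED

struct point { int x; int y; };   /* without the guard: a redefinition error */

static inline int point_sum(struct point p)
{
    return p.x + p.y;
}

#endif /* POINT_H_INCLUDED */

/* Uses the declarations, which the compiler saw exactly once. */
void guard_demo(void)
{
    struct point p = { 3, 4 };
    assert(point_sum(p) == 7);
}
```

Removing the #ifndef/#define/#endif lines from this sketch should make the compiler reject the second struct point definition, which is exactly the problem the guard prevents.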
#pragma once
Compilers that support #pragma once include MS Visual Studio, GCC, and Clang.
However, if portability is a concern, it is better to use header guards, or use both. Modern
compilers (those supporting C89 or later) are required to ignore, without comment, pragmas that
they do not recognize ('Any such pragma that is not recognized by the implementation is ignored')
but old versions of GCC were not so indulgent.
Self-containment
Modern headers should be self-contained, which means that a program that needs to use the
facilities defined by header.h can include that header (#include "header.h") and not worry about
whether other headers need to be included first.
Recommendation: Header files should be self-contained.
Historical rules
Once upon another millennium, the AT&T Indian Hill C Style and Coding Standards stated:
Header files should not be nested. The prologue for a header file should, therefore,
describe what other headers need to be #included for the header to be functional. In
extreme cases, where a large number of header files are to be included in several
different source files, it is acceptable to put all common #includes in one include file.
Modern rules
However, since then, opinion has tended in the opposite direction. If a source file needs to use the
facilities declared by a header header.h, the programmer should be able to write:
#include "header.h"
and (subject only to having the correct search paths set on the command line), any necessary pre-
requisite headers will be included by header.h without needing any further headers added to the
source file.
This provides better modularity for the source code. It also protects the source from the "guess
why this header was added" conundrum that arises after the code has been modified and hacked
for a decade or two.
The NASA Goddard Space Flight Center (GSFC) coding standards for C is one of the more
modern standards — but is now a little hard to track down. It states that headers should be self-
contained. It also provides a simple way to ensure that headers are self-contained: the
implementation file for the header should include the header as the first header. If it is not self-
contained, that code will not compile.
This standard requires a unit’s header to contain #include statements for all other
headers required by the unit header. Placing #include for the unit header first in the unit
body allows the compiler to verify that the header contains all required #include
statements.
#ifdef statements that check that the required headers are included in the proper order.
One advantage of the alternate design is that the #include list in the body file is exactly
the dependency list needed in a makefile, and this list is checked by the compiler. With
the standard design, a tool must be used to generate the dependency list. However, all
of the branch recommended development environments provide such a tool.
A major disadvantage of the alternate design is that if a unit’s required header list
changes, each file that uses that unit must be edited to update the #include statement
list. Also, the required header list for a compiler library unit may be different on different
targets.
Another disadvantage of the alternate design is that compiler library header files, and
other third party files, must be modified to add the required #ifdef statements.
• If a header header.h needs a new nested header extra.h, you do not have to check every
source file that uses header.h to see whether you need to add extra.h.
• If a header header.h no longer needs to include a specific header notneeded.h, you do not
have to check every source file that uses header.h to see whether you can safely remove
notneeded.h (but see Include what you use).
• You do not have to establish the correct sequence for including the pre-requisite headers
(which requires a topological sort to do the job properly).
Checking self-containment
See Linking against a static library for a script chkhdr that can be used to test idempotence and
self-containment of a header file.
Minimality
Headers are a crucial consistency checking mechanism, but they should be as small as possible.
In particular, that means that a header should not include other headers just because the
implementation file will need the other headers. A header should contain only those headers
necessary for a consumer of the services described.
For example, a project header should not include <stdio.h> unless one of the function interfaces
uses the type FILE * (or one of the other types defined solely in <stdio.h>). If an interface uses
size_t, the smallest header that suffices is <stddef.h>. Obviously, if another header that defines
size_t is included, there is no need to include <stddef.h> too.
If the headers are minimal, then it keeps the compilation time to a minimum too.
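As a rough single-file sketch of a header that is both self-contained and minimal: the hypothetical buffer.h below uses size_t in its interface, so it includes <stddef.h> itself and nothing else. The file names and the function buffer_count_nonzero are assumptions made up for this example.

```c
#include <assert.h>

/* ---- hypothetical buffer.h ---- */
#ifndef BUFFER_H_INCLUDED
#define BUFFER_H_INCLUDED

#include <stddef.h>   /* size_t appears in the interface, so the header pulls it in itself */

size_t buffer_count_nonzero(const unsigned char *data, size_t len);

#endif /* BUFFER_H_INCLUDED */

/* ---- hypothetical buffer.c: would start with #include "buffer.h" ---- */
size_t buffer_count_nonzero(const unsigned char *data, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (data[i] != 0) {
            count++;
        }
    }
    return count;
}

void buffer_demo(void)
{
    const unsigned char bytes[] = { 0, 7, 0, 9, 1 };
    assert(buffer_count_nonzero(bytes, sizeof bytes) == 3);
}
```

The implementation needs no further headers here; if it needed <stdio.h> for logging, that include would belong in buffer.c rather than in buffer.h.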
It is possible to devise headers whose sole purpose is to include a lot of other headers. These
seldom turn out to be a good idea in the long run because few source files will actually need all the
facilities described by all the headers. For example, a <standard-c.h> could be devised that
includes all the standard C headers — with care since some headers are not always present.
However, very few programs actually use the facilities of <locale.h> or <tgmath.h>.
Google's Include What You Use project, or IWYU, ensures source files include all headers used in
the code.
Suppose a source file source.c includes a header arbitrary.h which in turn coincidentally includes
freeloader.h, but the source file also explicitly and independently uses the facilities from
freeloader.h. All is well to start with. Then one day arbitrary.h is changed so its clients no longer
need the facilities of freeloader.h. Suddenly, source.c stops compiling — because it didn't meet the
IWYU criteria. Because the code in source.c explicitly used the facilities of freeloader.h, it should
have included what it uses — there should have been an explicit #include "freeloader.h" in the
source too. (Idempotency would have ensured there wasn't a problem.)
The IWYU philosophy maximizes the probability that code continues to compile even with
reasonable changes made to interfaces. Clearly, if your code calls a function that is subsequently
removed from the published interface, no amount of preparation can prevent changes becoming
necessary. This is why changes to APIs are avoided when possible, and why there are
deprecation cycles over multiple releases, etc.
This is a particular problem in C++ because standard headers are allowed to include each other.
Source file file.cpp could include one header header1.h that on one platform includes another
header header2.h. file.cpp might turn out to use the facilities of header2.h as well. This wouldn't be
a problem initially - the code would compile because header1.h includes header2.h. On another
platform, or an upgrade of the current platform, header1.h could be revised so it no longer includes
header2.h, and then file.cpp would stop compiling as a result.
IWYU would spot the problem and recommend that header2.h be included directly in file.cpp. This
would ensure it continues to compile. Analogous considerations apply to C code too.
The C standard says that there is very little difference between the #include <header.h> and
#include "header.h" notations.
[#include "header.h"] causes the replacement of that directive by the entire contents of
the source file identified by the specified sequence between the "…" delimiters. The
named source file is searched for in an implementation-defined manner. If this search
is not supported, or if the search fails, the directive is reprocessed as if it read [#include
<header.h>] …
So, the double quoted form may look in more places than the angle-bracketed form. The standard
specifies by example that the standard headers should be included in angle-brackets, even though
the compilation works if you use double quotes instead. Similarly, standards such as POSIX use
the angle-bracketed format — and you should too. Reserve double-quoted headers for headers
defined by the project. For externally-defined headers (including headers from other projects your
project relies on), the angle-bracket notation is most appropriate.
Note that there should be a space between #include and the header, even though the compilers
will accept no space there. Spaces are cheap.
#include <openssl/ssl.h>
#include <sys/stat.h>
#include <linux/kernel.h>
You should consider whether to use that namespace control in your project (it is quite probably a
good idea). You should steer clear of the names used by existing projects (in particular, both sys
and linux would be bad choices).
If you use this, your code should be careful and consistent in the use of the notation.
Header files should seldom if ever define variables. Although you will keep global variables to a
minimum, if you need a global variable, you will declare it in a header, and define it in one suitable
source file, and that source file will include the header to cross-check the declaration and
definition, and all source files that use the variable will use the header to declare it.
Corollary: you will not declare global variables in a source file — a source file will only contain
definitions.
Header files should seldom declare static functions, with the notable exception of static inline
functions which will be defined in headers if the function is needed in more than one source file.
Cross-references
• Where to document functions in C?
• List of standard header files in C and C++
• Is inline without static or extern ever useful in C99?
• How do I use extern to share variables between source files?
• What are the benefits of a relative path such as "../include/header.h" for a header?
• Header inclusion optimization
• Should I include every header?
Chapter 17: Data Types
Remarks
• While char is required to be 1 byte, 1 byte is not required to be 8 bits (often also called an
octet), even though most modern computer platforms define it as 8 bits. The
implementation's number of bits per char is provided by the CHAR_BIT macro, defined in
<limits.h>. POSIX does require 1 byte to be 8 bits.
• Fixed width integer types should be used sparingly. C's built-in types are designed to be
natural on every architecture; the fixed width types should only be used if you explicitly need
a specifically sized integer (for example for networking).
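The first remark can be checked with a few lines of code. This is a minimal sketch; the helper bits_in and the function char_bit_demo are hypothetical names, and the asserted minimums are the ones the C standard guarantees.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */

/* Number of bits occupied by an object of the given size in bytes
 * (this counts padding bits too, if the type has any). */
size_t bits_in(size_t size_in_bytes)
{
    return size_in_bytes * CHAR_BIT;
}

void char_bit_demo(void)
{
    assert(sizeof(char) == 1);            /* always true, by definition */
    assert(CHAR_BIT >= 8);                /* a byte has at least 8 bits */
    assert(bits_in(sizeof(int)) >= 16);   /* int is at least 16 bits wide */
}
```

Code that multiplies by CHAR_BIT instead of a hard-coded 8 keeps working on the rare platforms where a byte is wider than an octet.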
Examples
Integer types and constants
Signed integers can be of these types (the int after short, or long is optional):
signed char c = 127; /* required to be 1 byte, see remarks for further information. */
signed short int si = 32767; /* required to be at least 16 bits. */
signed int i = 32767; /* required to be at least 16 bits */
signed long int li = 2147483647; /* required to be at least 32 bits. */
C99
For all types but char the signed version is assumed if the signed or unsigned part is omitted. The
type char constitutes a third character type, different from signed char and unsigned char and the
signedness (or not) depends on the platform.
Different types of integer constants (often called literals, although the C standard itself calls
them constants) can be written in different bases, and with different widths, based on their prefix or suffix.
Decimal constants are always signed. Hexadecimal constants start with 0x or 0X and octal
constants start just with a 0. The latter two are signed or unsigned depending on whether the value
fits into the signed type or not.
Without a suffix the constant has the first type that fits its value, that is a decimal constant that is
larger than INT_MAX is of type long if possible, or long long otherwise.
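The rules above can be observed with the C11 _Generic selection, which picks a result based on the type of an expression. The following is only a sketch: the enum type_id, the macro TYPE_ID and the function constant_type_demo are hypothetical helpers, and a C11 (or later) compiler is assumed.

```c
#include <assert.h>

enum type_id { T_INT, T_UINT, T_LONG, T_ULONG, T_LLONG, T_ULLONG, T_OTHER };

/* Maps the type of an expression to a tag, using C11 _Generic. */
#define TYPE_ID(x) _Generic((x),            \
    int:                T_INT,              \
    unsigned int:       T_UINT,             \
    long:               T_LONG,             \
    unsigned long:      T_ULONG,            \
    long long:          T_LLONG,            \
    unsigned long long: T_ULLONG,           \
    default:            T_OTHER)

void constant_type_demo(void)
{
    assert(TYPE_ID(42)    == T_INT);     /* small decimal constant: int */
    assert(TYPE_ID(42u)   == T_UINT);    /* u suffix: unsigned int */
    assert(TYPE_ID(42L)   == T_LONG);    /* L suffix: long */
    assert(TYPE_ID(42ULL) == T_ULLONG);  /* ULL suffix: unsigned long long */
    assert(TYPE_ID(0x2A)  == T_INT);     /* small hexadecimal constant fits in int */
    assert(TYPE_ID(052)   == T_INT);     /* small octal constant fits in int */
}
```

Larger constants are deliberately left out of the assertions, because the type they receive depends on the widths of int and long on the platform at hand.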
The header file <limits.h> describes the limits of integers as follows. Their implementation-defined
values shall be equal or greater in magnitude (absolute value) to those shown below, with the
same sign.
Macro Type Value
If the value of an object of type char sign-extends when used in an expression, the value of
CHAR_MIN shall be the same as that of SCHAR_MIN and the value of CHAR_MAX shall be the same as that
of SCHAR_MAX . If the value of an object of type char does not sign-extend when used in an
expression, the value of CHAR_MIN shall be 0 and the value of CHAR_MAX shall be the same as that of
UCHAR_MAX.
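The relationship between CHAR_MIN, CHAR_MAX and the signed and unsigned limits can be written down as assertions. This is a minimal sketch with hypothetical function names; the numeric minimums used are the magnitudes guaranteed by the standard.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if plain char behaves as a signed type on this implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

void char_limits_demo(void)
{
    if (plain_char_is_signed()) {
        assert(CHAR_MIN == SCHAR_MIN);
        assert(CHAR_MAX == SCHAR_MAX);
    } else {
        assert(CHAR_MIN == 0);
        assert(CHAR_MAX == UCHAR_MAX);
    }
    assert(SCHAR_MAX >= 127);    /* guaranteed minimum magnitudes */
    assert(UCHAR_MAX >= 255);
    assert(INT_MAX >= 32767);
}
```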
C99
The C99 standard added a new header, <stdint.h>, which contains definitions for fixed width
integers. See the fixed width integer example for a more in-depth explanation.
String Literals
String literals are not modifiable (and in fact may be placed in read-only memory such as
.rodata). Attempting to alter their values results in undefined behaviour.
char* s = "foobar";
s[0] = 'F'; /* undefined behaviour */
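When the text really has to be modified, one common approach is to initialise a char array from the literal, which gives the program its own writable copy. A minimal sketch, with the function name string_copy_demo chosen just for this example:

```c
#include <assert.h>
#include <string.h>

void string_copy_demo(void)
{
    const char *literal = "foobar";  /* const documents that the literal must not be written to */
    char buffer[] = "foobar";        /* array initialised with a copy of the literal */

    buffer[0] = 'F';                 /* fine: modifies the array, not the literal */

    assert(strcmp(buffer, "Foobar") == 0);
    assert(strcmp(literal, "foobar") == 0);
}
```

Declaring pointers to string literals as const char * lets the compiler reject accidental writes instead of leaving them as undefined behaviour at run time.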
Multiple string literals are concatenated at compile time, which means you can write constructs like
these.
C99
C99
/* common usages are concatenations of format strings */
char* fmt = "%" PRId16; /* PRId16 macro since C99 */
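A short sketch of literal concatenation in a format string follows. The helper format_i16 and the "value=...;" layout are made up for illustration; PRId16 and int16_t come from <inttypes.h> and assume the platform provides a 16-bit integer type.

```c
#include <assert.h>
#include <inttypes.h>  /* PRId16, int16_t */
#include <stdio.h>
#include <string.h>

/* Formats v into buf using a format string built from adjacent literals. */
int format_i16(char *buf, size_t size, int16_t v)
{
    /* "value=" "%" PRId16 ";" becomes one literal at compile time */
    return snprintf(buf, size, "value=" "%" PRId16 ";", v);
}

void concat_demo(void)
{
    char buf[32];

    format_i16(buf, sizeof buf, (int16_t)-42);
    assert(strcmp(buf, "value=-42;") == 0);
    assert(strcmp("foo" "bar", "foobar") == 0);   /* adjacent literals are joined */
}
```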
C11
C99
The header <stdint.h> provides several fixed-width integer type definitions. These types are
optional and only provided if the platform has an integer type of the corresponding width, and if the
corresponding signed type has a two's complement representation of negative values.
See the remarks section for usage hints of fixed width types.
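Networking code is the typical place where an exact width matters. The sketch below assembles a 32-bit value from four bytes in big-endian (network) order; pack_be32 and fixed_width_demo are hypothetical names, and the example assumes the optional types uint8_t and uint32_t exist on the platform.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Packs four bytes, most significant first, into one 32-bit value. */
uint32_t pack_be32(const uint8_t bytes[4])
{
    return ((uint32_t)bytes[0] << 24) |
           ((uint32_t)bytes[1] << 16) |
           ((uint32_t)bytes[2] << 8)  |
            (uint32_t)bytes[3];
}

void fixed_width_demo(void)
{
    const uint8_t wire[4] = { 0x12, 0x34, 0x56, 0x78 };

    assert(sizeof(uint32_t) * CHAR_BIT == 32);   /* exactly 32 bits, no padding */
    assert(pack_be32(wire) == UINT32_C(0x12345678));
}
```

The casts to uint32_t before shifting matter: without them the bytes would be promoted to int, and shifting into the sign bit of an int is not well defined.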
The C language has three mandatory real floating point types, float, double, and long double.
The header <float.h> defines various limits for floating point operations.
Floating point arithmetic is implementation defined. However, most modern platforms (arm, x86,
x86_64, MIPS) use IEEE 754 floating point operations.
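A few of the limits from <float.h> can be checked directly. This is a minimal sketch with hypothetical function names; the asserted values are the minimums the standard requires, and real platforms usually exceed them.

```c
#include <assert.h>
#include <float.h>

/* Returns 1 if adding eps to 1.0 produces a value different from 1.0. */
int changes_one(double eps)
{
    volatile double sum = 1.0 + eps;   /* volatile forces rounding to double */
    return sum != 1.0;
}

void float_limits_demo(void)
{
    assert(FLT_DIG >= 6);       /* decimal digits a float can round-trip */
    assert(DBL_DIG >= 10);
    assert(LDBL_DIG >= 10);
    assert(DBL_MAX >= 1e37);
    assert(changes_one(DBL_EPSILON));   /* 1.0 + DBL_EPSILON is the next double after 1.0 */
}
```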
C also has three optional complex floating point types that are derived from the above.
Interpreting Declarations
A distinctive syntactic peculiarity of C is that declarations mirror the use of the declared object as it
would be in a normal expression.
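The following operators are reused in declarators, with the same precedence and associativity
that they have in expressions, namely:
• the unary * "dereference" operator, which denotes a pointer;
• the binary [] "array subscription" operator, which denotes an array;
• the () "function call" operator, which denotes a function.
The above three operators have the following precedence and associativity:
Operator                  Relative precedence   Associativity
[] (array subscription)   1                     Left-to-right
() (function call)        1                     Left-to-right
* (dereference)           2                     Right-to-left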
When interpreting declarations, one has to start from the identifier outwards and apply the
adjacent operators in the correct order as per the above table. Each application of an operator can
be substituted with the following English words:
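Expression          Interpretation
thing[X]            an array of size X of...
thing(t1, t2, t3)   a function taking t1, t2, t3 and returning...
*thing              a pointer to...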
It follows that the beginning of the English interpretation will always start with the identifier and will
end with the type that stands on the left-hand side of the declaration.
Examples
char *names[20];
[] takes precedence over *, so the interpretation is: names is an array of size 20 of a pointer to char.
char (*place)[10];
In case of using parentheses to override the precedence, the * is applied first: place is a pointer to
an array of size 10 of char.
int fn(long, short);
There is no precedence to worry about here: fn is a function taking long, short and returning int.
int *fn(void);
The () is applied first: fn is a function taking void and returning a pointer to int.
int (*fp)(void);
Overriding the precedence of (): fp is a pointer to a function taking void and returning int.
int arr[5][8];
Multidimensional arrays are not an exception to the rule; the [] operators are applied in left-to-right
order according to the associativity in the table: arr is an array of size 5 of an array of size 8 of int.
int **ptr;
The two dereference operators have equal precedence, so the associativity takes effect. The
operators are applied in right-to-left order: ptr is a pointer to a pointer to an int.
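The readings above can be confirmed by using the declared objects. In this sketch the helper forty_two and the function declarator_demo are hypothetical; the sizeof checks follow directly from the English interpretations.

```c
#include <assert.h>

static int forty_two(void) { return 42; }

void declarator_demo(void)
{
    char *names[20];        /* array of size 20 of pointer to char */
    char (*place)[10];      /* pointer to an array of size 10 of char */
    int (*fp)(void);        /* pointer to a function taking void and returning int */
    int arr[5][8];          /* array of size 5 of array of size 8 of int */

    char row[10] = "abc";
    place = &row;
    fp = forty_two;
    names[0] = row;

    assert(sizeof names == 20 * sizeof(char *));
    assert(sizeof *place == 10);
    assert(sizeof arr[0] == 8 * sizeof(int));
    assert((*place)[1] == 'b');
    assert(names[0][2] == 'c');
    assert(fp() == 42);
}
```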
Multiple Declarations
The comma can be used as a separator (*not* acting like the comma operator) in order to delimit
multiple declarations within a single statement. The following statement contains five declarations:
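A statement of that shape, with purely illustrative names, might read as follows. The helper twice and the function multi_decl_demo exist only to show that each declarator introduced a usable identifier.

```c
#include <assert.h>

/* One statement, five declarators: a function, a pointer to int, a pointer
 * to function, a two-dimensional array and a plain int. */
int fn(void), *ptr, (*fp)(int), arr[10][20], num;

int fn(void) { return 5; }               /* definition of the function declared above */
static int twice(int x) { return 2 * x; }

void multi_decl_demo(void)
{
    num = fn();
    ptr = &num;
    fp = twice;
    arr[9][19] = fp(*ptr);

    assert(num == 5);
    assert(arr[9][19] == 10);
    assert(sizeof arr == 10 * 20 * sizeof(int));
}
```

Packing this many different declarators into one statement is legal but hard to read; one declaration per line is usually kinder to the next reader.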
Alternative Interpretation
Because declarations mirror use, a declaration can also be interpreted in terms of the operators
that could be applied over the object and the final resulting type of that expression. The type that
stands on the left-hand side is the final result that is yielded after applying all operators.
/*
* Subscripting "arr" and dereferencing it yields a "char" result.
* Particularly: *arr[5] is of type "char".
*/
char *arr[20];
/*
* Calling "fn" yields an "int" result.
* Particularly: fn('b') is of type "int".
*/
int fn(char);
/*
* Dereferencing "fp" and then calling it yields an "int" result.
* Particularly: (*fp)() is of type "int".
*/
int (*fp)(void);
/*
* Subscripting "strings" twice and dereferencing it yields a "char" result.
* Particularly: *strings[5][15] is of type "char"
*/
char *strings[10][20];
Chapter 18: Declaration vs Definition
Remarks
Source: What is the difference between a definition and a declaration?
Examples
Understanding Declaration and Definition
A declaration introduces an identifier and describes its type, be it a type, object, or function. A
declaration is what the compiler needs to accept references to that identifier. These are
declarations:
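A minimal sketch of the idea, reusing the names bar, g and f from the definitions listed further down: the function use_declared compiles with nothing but the declarations in view, and the definitions only follow later in the file. The functions use_declared and decl_demo are hypothetical additions.

```c
#include <assert.h>

/* Declarations: enough for the compiler to accept the uses below. */
extern int bar;
extern int g(int, int);
double f(int, double);    /* extern may be omitted on function declarations */

/* Uses only the declarations above; the definitions come later. */
int use_declared(void)
{
    bar = 3;
    return g(bar, 4) + (int)f(1, 1.0);
}

/* Definitions: what the linker needs to resolve the references. */
int bar;
int g(int lhs, int rhs) { return lhs * rhs; }
double f(int i, double d) { return i + d; }

void decl_demo(void)
{
    assert(use_declared() == 14);   /* 3 * 4 + (int)(1 + 1.0) */
}
```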
A definition actually instantiates/implements this identifier. It's what the linker needs in order to link
references to those entities. These are definitions corresponding to the above declarations:
int bar;
int g(int lhs, int rhs) {return lhs*rhs;}
double f(int i, double d) {return i+d;}
double h1(int a, int b) {return -1.5;}
double h2() {} /* prototype is implied in definition, same as double h2(void) */
An identifier may be declared as often as you like. However, it must be defined exactly once. If you
forget to define something that has been declared and referenced somewhere, the linker doesn't know
what to link the references to and complains about missing symbols. If you define something more than
once, the linker doesn't know which of the definitions to link references to and complains about duplicated symbols.
Exception:
This exception can be explained using concepts of "Strong symbols vs Weak symbols" (from a
linker's perspective) . Please look here ( Slide 22 ) for more explanation.
/* All are definitions. */
struct S { int a; int b; }; /* defines S */
struct X { /* defines X */
int x; /* defines non-static data member x */
};
struct X anX; /* defines anX */
Chapter 19: Declarations
Remarks
A declaration of an identifier that refers to an object or function is often called, for short, simply a
declaration of that object or function.
Examples
Calling a function from another C file
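foo.h
#ifndef FOO_DOT_H /* include guard: prevents problems if the file is included twice */
#define FOO_DOT_H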
/**
* This is a function declaration.
* It tells the compiler that the function exists somewhere.
*/
void foo(int id, char *name);
#endif /* FOO_DOT_H */
foo.c
#include "foo.h" /* Always include the header file that declares something
* in the C file that defines it. This makes sure that the
* declaration and definition are always in-sync. Put this
* header first in foo.c to ensure the header is self-contained.
*/
#include <stdio.h>
/**
* This is the function definition.
* It is the actual body of the function which was declared elsewhere.
*/
void foo(int id, char *name)
{
fprintf(stderr, "foo(%d, \"%s\");\n", id, name);
/* This will print how foo was called to stderr - standard error.
* e.g., foo(42, "Hi!") will print `foo(42, "Hi!")`
*/
}
main.c
#include "foo.h"
int main(void)
{
foo(42, "bar");
return 0;
}
First, we compile both foo.c and main.c to object files. Here we use the gcc compiler; your compiler
may have a different name and need other options.
Use of global variables is generally discouraged. It makes your program more difficult to
understand, and harder to debug. But sometimes using a global variable is acceptable.
global.h
#ifndef GLOBAL_DOT_H /* include guard */
#define GLOBAL_DOT_H
/**
* This tells the compiler that g_myglobal exists somewhere.
* Without "extern", this would create a new variable named
* g_myglobal in _every file_ that included it. Don't miss this!
*/
extern int g_myglobal; /* _Declare_ g_myglobal, that is promise it will be _defined_ by
* some module. */
#endif /* GLOBAL_DOT_H */
global.c
#include "global.h" /* Always include the header file that declares something
* in the C file that defines it. This makes sure that the
* declaration and definition are always in-sync.
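*/
int g_myglobal; /* _Define_ g_myglobal. Living at file scope, it is initialised to 0
* at program start-up. */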
main.c
#include "global.h"
int main(void)
{
g_myglobal = 42;
return 0;
}
See also How do I use extern to share variables between source files?
Headers may be used to declare globally used read-only resources, like string tables for example.
Declare those in a separate header which gets included by any file ("Translation Unit") which
wants to make use of them. It's handy to use the same header to declare a related enumeration to
identify all string-resources:
resources.h:
#ifndef RESOURCES_H
#define RESOURCES_H
typedef enum { /* Define a type describing the possible valid resource IDs. */
RESOURCE_UNDEFINED = -1, /* To be used to initialise any EnumResourceID typed variable to be
extern const char * const resources[RESOURCE_MAX]; /* Declare, promise to anybody who includes
To actually define the resources, create a related .c file, that is, another translation unit holding the
actual instances of what was declared in the related header (.h) file:
resources.c:
#include "resources.h" /* To make sure clashes between declaration and definition are
recognised by the compiler include the declaring header into
the implementing, defining translation unit (.c file).
main.c:
#include <stdlib.h> /* for EXIT_SUCCESS */
#include "resources.h"
int main(void)
{
EnumResourceID resource_id = RESOURCE_UNDEFINED;
return EXIT_SUCCESS;
}
Compile the three files above using GCC, and link them to become the program file main, for
example using this:
(use -Wall -Wextra -pedantic -Wconversion to make the compiler really picky, so that you don't
miss anything before posting the code to SO, showing it to the world, or deploying it into
production)
$ ./main
And get:
Introduction
The above declaration declares a single identifier named a, which refers to some object of type int.
The second declaration declares two identifiers named a1 and b1, which refer to two other objects,
also of type int.
Basically, it works like this: first you write some type, then you write one or more
expressions separated by commas (,) (which will not be evaluated at this point, and
which are properly called declarators in this context). In writing such
expressions, you may apply only the indirection (*), function call (( )) or subscript (or
array indexing, [ ]) operators to some identifier (you may also use no operators at all). The
identifier used is not required to be visible in the current scope. Some examples:
# Description
3 We have a comma indicating that one more expression will follow in the same declaration.
6 End of declaration.
Note that none of the above identifiers were visible prior to this declaration and so the expressions
used would not be valid before it.
After each such expression, the identifier used in it is introduced into the current scope. (If the
identifier has linkage, it may also be re-declared with the same kind of linkage so
that both declarations refer to the same object or function.)
Additionally, the equals sign (=) may be used for initialization. If an unevaluated expression
(declarator) is followed by = inside the declaration, we say that the identifier being introduced is
also being initialized. After the = sign we can once again put some expression, but this time it will be
evaluated and its value will be used as the initial value of the object declared.
Examples:
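A few initialised declarations, as a minimal sketch (the identifiers and the function init_demo are illustrative only):

```c
#include <assert.h>

void init_demo(void)
{
    int l = 90;              /* declarator l, initialised with the value 90 */
    int *p = &l;             /* the initialiser is evaluated: the address of l */
    int a[3] = { 1, 2, 3 };  /* brace-enclosed list for an array */
    int sum = a[0] + a[1] + a[2] + *p;   /* an arbitrary expression after = */

    assert(l == 90);
    assert(*p == 90);
    assert(sum == 96);
}
```

Note that the initialiser of sum is evaluated when the declaration is reached, which is why it can refer to the objects declared just above it.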
Later in your code, you are allowed to write the exact same expression from the declaration part of
the newly introduced identifier, giving you an object of the type specified at the beginning of the
declaration, assuming that you've assigned valid values to all accessed objects along the way.
Examples:
void f()
{
int b2; /* you should be able to write later in your code b2
which will directly refer to the integer object
that b2 identifies */
b2 = 2; /* assign a value to b2 */
int *b3; /* you should be able to write later in your code *b3 */
b3 = &b2; /* assign a valid pointer value to b3 */
int **b4; /* you should be able to write later in your code **b4 */
b4 = &b3;
void (*p)(); /* you should be able to write later in your code (*p)() */
}
The declaration of b3 specifies that you can potentially use the value of b3 as a means to access some
integer object.
Of course, in order to apply indirection (*) to b3, you should also have a proper value stored in it
(see pointers for more info). You should also first store some value into an object before trying to
retrieve it (you can see more about this problem here). We've done all of this in the above
examples.
This one tells the compiler that you'll attempt to call a3. In this case a3 refers to a function instead of
an object. One difference between objects and functions is that functions always have some sort
of linkage. Examples:
void f1()
{
{
int f2(); /* 1 refers to some function f2 */
}
{
int f2(); /* refers to the exact same function f2 as (1) */
}
}
In the above example, the 2 declarations refer to the same function f2, whereas if they were
declaring objects then in this context (having 2 different block scopes), they would be 2
distinct objects.
int (*a3)(); /* you should be able to apply indirection to `a3` and then call it */
Now it may seem to be getting complicated, but if you know operator precedence you'll have no
problems reading the above declaration. The parentheses are needed because the * operator has
lower precedence than the ( ) one.
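A pointer-to-function declaration of this kind can be exercised as in the sketch below. It uses a (void) parameter list instead of the empty parentheses shown in the text, and the helper seven and the function fn_pointer_demo are hypothetical.

```c
#include <assert.h>

static int seven(void) { return 7; }

void fn_pointer_demo(void)
{
    int (*a3)(void);        /* pointer to function returning int */

    a3 = seven;             /* a function name converts to a pointer to the function */
    assert((*a3)() == 7);   /* apply indirection, then call */
    assert(a3() == 7);      /* calling through the pointer directly also works */
}
```

Without the parentheses, int *a3(void) would instead declare a function returning a pointer to int, because the call operator binds first.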
In the case of using the subscript operator, the resulting expression wouldn't be actually valid after
the declaration because the index used in it (the value inside [ and ]) will always be 1 above the
maximum allowed value for this object/function.
Accessing a4[5] results in undefined behavior (UB). More information about arrays can be found here.
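For an array declared with size 5, the valid indices run from 0 to 4. A minimal sketch of staying inside those bounds (the function array_bounds_demo and the element values are made up for this example):

```c
#include <assert.h>
#include <stddef.h>

void array_bounds_demo(void)
{
    int a4[5] = { 10, 20, 30, 40, 50 };        /* valid indices are 0 through 4 */
    size_t count = sizeof a4 / sizeof a4[0];   /* number of elements: 5 */
    int sum = 0;

    for (size_t i = 0; i < count; i++) {       /* stops before the out-of-range index 5 */
        sum += a4[i];
    }

    assert(count == 5);
    assert(a4[4] == 50);   /* last valid element */
    assert(sum == 150);
}
```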
Unfortunately for us, although syntactically possible, the declaration of a5 is forbidden by the
current standard.
Typedef
Typedefs are declarations which have the keyword typedef in front and before the type. E.g.:
(you can technically put the typedef after the type too - like this int typedef (*(*t0)())[5]; but this
is discouraged)
The above declaration declares an identifier for a typedef name. You can use it like this
afterwards:
t0 pf;
int (*(*pf)())[5];
As you can see, the typedef name "saves" the declaration as a type to use later for other
declarations. This way you can save some keystrokes. Also, since a declaration using a typedef is still a
declaration, you are not limited to the above example:
t0 (*pf1);
int (*(**pf1)())[5];
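The equivalences above can be tried out with a concrete function. In this sketch the typedef is written with a (void) parameter list rather than empty parentheses, and table, get_table and typedef_demo are hypothetical names.

```c
#include <assert.h>

static int table[5] = { 1, 2, 3, 4, 5 };

/* get_table is a function returning a pointer to an array of 5 int. */
static int (*get_table(void))[5]
{
    return &table;
}

/* t0: pointer to function returning pointer to array of 5 int. */
typedef int (*(*t0)(void))[5];

void typedef_demo(void)
{
    t0 pf = get_table;               /* same type as: int (*(*pf)(void))[5] */
    int (*(*raw)(void))[5] = pf;     /* the spelled-out form is compatible */
    t0 (*pf1) = &pf;                 /* same type as: int (*(**pf1)(void))[5] */

    assert((*pf())[2] == 3);
    assert((*raw())[4] == 5);
    assert((*(*pf1)())[0] == 1);
}
```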
The "right-left" rule is a completely regular rule for deciphering C declarations. It can also be useful
in creating them.
First, the symbols, and how to read them:
*    "pointer to"           (always on the left side)
[]   "array of"             (always on the right side)
()   "function returning"   (always on the right side)
STEP 1
Find the identifier. This is your starting point. Then say to yourself, "identifier is." You've started
your declaration.
STEP 2
Look at the symbols on the right of the identifier. If, say, you find () there, then you know that this
is the declaration for a function. So you would then have "identifier is function returning". Or if you
found a [] there, you would say "identifier is array of". Continue right until you run out of symbols
OR hit a right parenthesis ). (If you hit a left parenthesis (, that's the beginning of a () symbol,
even if there is stuff in between the parentheses. More on that below.)
STEP 3
Look at the symbols to the left of the identifier. If it is not one of our symbols above (say,
something like "int"), just say it. Otherwise, translate it into English using that table above. Keep
going left until you run out of symbols OR hit a left parenthesis (.
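int *p[];
1) Find the identifier.
int *p[];
     ^
"p is"
2) Move right until out of symbols or a right parenthesis is hit.
int *p[];
      ^^
"p is array of"
3) Can't move right anymore (out of symbols), so move left and find:
int *p[];
    ^
"p is array of pointer to"
4) Keep going left and find:
int *p[];
^^^
"p is array of pointer to int"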
Another example:
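int *(*func())();
Find the identifier.
int *(*func())();
       ^^^^
"func is"
Move right.
int *(*func())();
           ^^
"func is function returning"
Can't move right anymore because of the right parenthesis, so move left.
int *(*func())();
      ^
"func is function returning pointer to"
Can't move left anymore because of the left parenthesis, so keep going right.
int *(*func())();
              ^^
"func is function returning pointer to function returning"
Can't move right anymore because we're out of symbols, so go left.
int *(*func())();
    ^
"func is function returning pointer to function returning pointer to"
And finally, keep going left, because there's nothing left on the right.
int *(*func())();
^^^
"func is function returning pointer to function returning pointer to int"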
As you can see, this rule can be quite useful. You can also use it to sanity check yourself while
you are creating declarations, and to give you a hint about where to put the next symbol and
whether parentheses are required.
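To see that the reading "func is function returning pointer to function returning pointer to int" matches what the compiler accepts, here is a small sketch. It spells the parameter lists as (void), and value, get_value and right_left_demo are hypothetical names.

```c
#include <assert.h>

static int value = 7;

/* get_value is a function returning pointer to int. */
static int *get_value(void)
{
    return &value;
}

/* func is function returning pointer to function returning pointer to int. */
int *(*func(void))(void)
{
    return get_value;
}

void right_left_demo(void)
{
    int *(*inner)(void) = func();   /* pointer to function returning pointer to int */

    assert(*inner() == 7);
    assert(*func()() == 7);         /* the same steps in a single expression */
}
```

The expression *func()() mirrors the declaration: call func, call the result, then dereference the returned pointer to reach the int.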
Some declarations look much more complicated than they are due to array sizes and argument
lists in prototype form. If you see [3], that's read as "array (size 3) of...". If you see (char *,int),
that's read as "function expecting (char *,int) and returning...".
Here's a fun one:
int (*(*fun_one)(char *,double))[9][20];
"fun_one is pointer to function expecting (char *,double) and returning pointer to array (size 9) of
array (size 20) of int."
As you can see, it's not as complicated if you get rid of the array sizes and argument lists:
int (*(*fun_one)())[][];
You can decipher it that way, and then put in the array sizes and argument lists later.
It is quite possible to make illegal declarations using this rule, so some knowledge of what's legal
in C is necessary. For instance, if the above had been:
int *((*fun_one)())[][];
it would have read "fun_one is pointer to function returning array of array of pointer to int". Since a
function cannot return an array, but only a pointer to an array, that declaration is illegal.
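Illegal combinations include:
[]() - cannot have an array of functions
()() - cannot have a function that returns a function
()[] - cannot have a function that returns an array
In all the above cases, you would need a set of parentheses to bind a * symbol on the left
between these () and [] right-side symbols in order for the declaration to be legal.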
Legal
int i; an int
int *p; an int pointer (ptr to an int)
int a[]; an array of ints
int f(); a function returning an int
int **pp; a pointer to an int pointer (ptr to a ptr to an int)
int (*pa)[]; a pointer to an array of ints
int (*pf)(); a pointer to a function returning an int
int *ap[]; an array of int pointers (array of ptrs to ints)
int aa[][]; an array of arrays of ints
int *fp(); a function returning an int pointer
int ***ppp; a pointer to a pointer to an int pointer
int (**ppa)[]; a pointer to a pointer to an array of ints
int (**ppf)(); a pointer to a pointer to a function returning an int
int *(*pap)[]; a pointer to an array of int pointers
int (*paa)[][]; a pointer to an array of arrays of ints
int *(*pfp)(); a pointer to a function returning an int pointer
int **app[]; an array of pointers to int pointers
int (*apa[])[]; an array of pointers to arrays of ints
int (*apf[])(); an array of pointers to functions returning an int
int *aap[][]; an array of arrays of int pointers
int aaa[][][]; an array of arrays of arrays of int
int **fpp(); a function returning a pointer to an int pointer
int (*fpa())[]; a function returning a pointer to an array of ints
int (*fpf())(); a function returning a pointer to a function returning an int
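Illegal
int af[](); an array of functions returning an int
int fa()[]; a function returning an array of ints
int ff()(); a function returning a function returning an int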
Source: http://ieng9.ucsd.edu/~cs30x/rt_lt.rule.html
Chapter 20: Enumerations
Remarks
Enumerations consist of the enum keyword and an optional identifier followed by an enumerator-list
enclosed by braces.
Using multiple "assignments" can lead to different enumerators of the same enumeration carrying
the same value.
Examples
Simple Enumeration
An enumeration is a user-defined data type that consists of integral constants, each of which is
given a name. The keyword enum is used to define an enumerated data type.
If you use enum instead of int or string/char*, you increase compile-time checking and avoid
errors from passing in invalid constants, and you document which values are legal to use.
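Example 1
#include <stdio.h>

enum color { RED, GREEN, BLUE };

void printColor(enum color chosenColor)
{
    const char *color_name = "Invalid color";
    switch (chosenColor)
    {
        case RED:
            color_name = "RED";
            break;
        case GREEN:
            color_name = "GREEN";
            break;
        case BLUE:
            color_name = "BLUE";
            break;
    }
    printf("%s\n", color_name);
}
int main(void){
    enum color chosenColor;
    int input = -1;
    printf("Enter a number between 0 and 2: ");
    /* Read into an int first: the integer type underlying an enum is implementation-defined */
    if (scanf("%d", &input) != 1)
        return 1;
    chosenColor = (enum color)input;
    printColor(chosenColor);
    return 0;
}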
C99
Example 2
(This example uses designated initializers which are standardized since C99.)
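A minimal sketch of this idea, assuming an enumeration of week days: designated initializers tie each string in a lookup table to its enumerator. The names week, dow and dayName are illustrative assumptions.

```c
enum week { MON, TUE, WED, THU, FRI, SAT, SUN };

/* Designated initializers (C99) bind each string to its enumerator,
   so the table stays correct even if the enumerators are reordered. */
static const char *const dow[] = {
    [MON] = "Mon", [TUE] = "Tue", [WED] = "Wed", [THU] = "Thu",
    [FRI] = "Fri", [SAT] = "Sat", [SUN] = "Sun"
};

/* Return the name for a day, or a fallback for out-of-range values. */
const char *dayName(enum week day)
{
    if ((int)day < MON || (int)day > SUN)
        return "Invalid day";
    return dow[day];
}
```

Because the array index is written next to each string, adding or moving an enumerator cannot silently shift the names.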
Typedef enum
There are several possibilities and conventions to name an enumeration. The first is to use a tag
name just after the enum keyword.
enum color
{
RED,
GREEN,
BLUE
};
This enumeration must then always be used with the keyword and the tag, as in enum color chosenColor = RED;
If we use typedef directly when declaring the enum, we can omit the tag name and then use the type
without the enum keyword:
typedef enum
{
RED,
GREEN,
BLUE
} color;
But in this latter case we cannot use it as enum color, because we didn't use the tag name in the
definition. One common convention is to use both, such that the same name can be used with or
without the enum keyword. This has the particular advantage of being compatible with C++.
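A minimal sketch of the "use both" convention follows. The helper functions nextColor and firstColor are illustrative assumptions added only to show both spellings of the type in use.

```c
/* Tag and typedef share the name "color", so both spellings work. */
typedef enum color { RED, GREEN, BLUE } color;

/* Declared with the typedef name. */
color nextColor(color c)
{
    return (c == BLUE) ? RED : (color)(c + 1);
}

/* Declared with the enum keyword and the tag; this is the same type. */
enum color firstColor(void)
{
    return RED;
}
```

Either spelling can be mixed freely, for example color c = firstColor(); or enum color d = nextColor(c);.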
Function:
void printColor(color chosenColor)
{
    if (chosenColor == RED)
    {
        printf("RED\n");
    }
    else if (chosenColor == GREEN)
    {
        printf("GREEN\n");
    }
    else if (chosenColor == BLUE)
    {
        printf("BLUE\n");
    }
}
Enumerators of the same enumeration do not need to have unique values:
#include <stdlib.h> /* for EXIT_SUCCESS */
#include <stdio.h> /* for printf() */

enum Dupes
{
    Base, /* Takes 0 */
    One, /* Takes Base + 1 */
    Two, /* Takes One + 1 */
    Negative = -1,
    AnotherZero /* Takes Negative + 1 == 0, sigh */
};

int main(void)
{
    printf("Base = %d\n", Base);
    printf("One = %d\n", One);
    printf("Two = %d\n", Two);
    printf("Negative = %d\n", Negative);
    printf("AnotherZero = %d\n", AnotherZero);
    return EXIT_SUCCESS;
}
The sample prints:
Base = 0
One = 1
Two = 2
Negative = -1
AnotherZero = 0
This enables us to define compile time constants of type int that can as in this example be used
as array length.
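The sentence above describes enumeration constants declared without a type name. A minimal sketch, assuming an illustrative constant buffersize and array buffer (both names are assumptions):

```c
#include <stddef.h>

/* An unnamed enumeration: buffersize is a compile-time constant of type int. */
enum { buffersize = 256 };

/* Usable as an array length at file scope, unlike a "const int" object in C. */
static unsigned char buffer[buffersize];

/* Number of elements in buffer, computed from its type. */
size_t bufferLength(void)
{
    return sizeof buffer / sizeof buffer[0];
}
```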
Chapter 21: Error handling
Syntax
• #include <errno.h>
• int errno; /* implementation defined */
• #include <string.h>
• char *strerror(int errnum);
• #include <stdio.h>
• void perror(const char *s);
Remarks
Keep in mind that errno is not necessarily a variable; the syntax above only indicates how it
might be declared. On many modern systems with thread interfaces, errno is a macro that
resolves to an object that is local to the current thread.
Examples
errno
When a standard library function fails, it often sets errno to the appropriate error code. The C
standard requires at least 3 values for errno to be defined:
Value    Meaning
EDOM     Domain error
ERANGE   Range error
EILSEQ   Illegal multi-byte character sequence
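As a sketch of how errno is typically checked, the following helper wraps strtol(), which sets errno to ERANGE when the result does not fit. The function name parseLong and its return codes are assumptions made for this example.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a decimal long.
   Returns 0 on success, -1 on a range error (errno == ERANGE),
   and -2 if the text contains no digits at all. */
int parseLong(const char *text, long *out)
{
    char *end;
    long value;

    errno = 0;                   /* clear any stale value first */
    value = strtol(text, &end, 10);

    if (errno == ERANGE)         /* result did not fit in a long */
        return -1;
    if (end == text)             /* no characters were converted */
        return -2;

    *out = value;
    return 0;
}
```

Setting errno to 0 before the call matters: library functions never clear errno on success, so a leftover value from an earlier failure could otherwise be misread.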
strerror
If perror is not flexible enough, you may obtain a user-readable error description by calling
strerror from <string.h>.
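#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
    int last_error = 0;
    FILE *fout = (argc > 1) ? fopen(argv[1], "w") : NULL;
    if (fout == NULL)
        last_error = errno; /* save errno before a later call overwrites it */
    if (last_error) {
        fprintf(stderr, "fopen: Could not open %s for writing: %s\n",
                argv[1], strerror(last_error));
        fputs("Cross fingers and continue\n", stderr);
    }
    return EXIT_SUCCESS;
}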
perror
This will print an error message concerning the current value of errno.
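A minimal sketch of perror() in use: the helper below tries to open a file and reports a failure on stderr. The function name tryOpen and its return convention are assumptions for this example.

```c
#include <stdio.h>

/* Try to open a file for reading and report failure via perror().
   Returns 0 on success, -1 on failure. */
int tryOpen(const char *path)
{
    FILE *fin = fopen(path, "r");
    if (fin == NULL) {
        /* Writes "<path>: <description of the current errno>" to stderr. */
        perror(path);
        return -1;
    }
    fclose(fin);
    return 0;
}
```

The string passed to perror() is printed as a prefix, so passing the file name makes the message say which file could not be opened.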
Chapter 22: Files and I/O streams
Syntax
• #include <stdio.h> /* Include this to use any of the following sections */
• FILE *fopen(const char *path, const char *mode); /* Open a stream on the file at path with
the specified mode */
• FILE *freopen(const char *path, const char *mode, FILE *stream); /* Re-open an existing
stream on the file at path with the specified mode */
• int fclose(FILE *stream); /* Close an opened stream */
• size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream); /* Read at most nmemb
elements of size bytes each from the stream and write them in ptr. Returns the number of
read elements. */
• size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream); /* Write nmemb
elements of size bytes each from ptr to the stream. Returns the number of written elements.
*/
• int fseek(FILE *stream, long offset, int whence); /* Set the cursor of the stream to offset,
relative to the offset told by whence, and returns 0 if it succeeded. */
• long ftell(FILE *stream); /* Return the offset of the current cursor position from the beginning
of the stream. */
• void rewind(FILE *stream); /* Set the cursor position to the beginning of the file. */
• int fprintf(FILE *fout, const char *fmt, ...); /* Writes printf format string on fout */
• FILE *stdin; /* Standard input stream */
• FILE *stdout; /* Standard output stream */
• FILE *stderr; /* Standard error stream */
Parameters
Parameter          Details
const char *mode   A string describing the opening mode of the file-backed stream. See remarks for possible values.
int whence         Can be SEEK_SET to set from the beginning of the file, SEEK_END to set from its end, or SEEK_CUR to set relative to the current cursor value. Note: SEEK_END is non-portable.
Remarks
Mode strings:
Mode strings in fopen() and freopen() can be one of these values:
• "r": Open the file in read-only mode, with the cursor set to the beginning of the file.
• "r+": Open the file in read-write mode, with the cursor set to the beginning of the file.
• "w": Open or create the file in write-only mode, with its content truncated to 0 bytes. The
cursor is set to the beginning of the file.
• "w+": Open or create the file in read-write mode, with its content truncated to 0 bytes. The
cursor is set to the beginning of the file.
• "a": Open or create the file in write-only mode, with the cursor set to the end of the file.
• "a+": Open or create the file in read-write mode, with the read-cursor set to the beginning of
the file. The output, however, will always be appended to the end of the file.
Each of these file modes may have a b added after the initial letter (e.g. "rb" or "a+b" or "ab+"). The
b means that the file should be treated as a binary file instead of a text file on those systems where
there is a difference. It doesn't make a difference on Unix-like systems; it is important on Windows
systems. (Additionally, Windows fopen allows an explicit t instead of b to indicate 'text file' — and
numerous other platform-specific options.)
C11
• "wx": Create a text file in write-only mode. The file must not already exist; otherwise the call fails.
• "wbx": Create a binary file in write-only mode. The file must not already exist; otherwise the call fails.
Examples
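Open and write to file
#include <stdio.h>  /* for perror(), fopen(), fputs() and fclose() */
#include <stdlib.h> /* for the EXIT_* macros */

int main(int argc, char **argv)
{
    int e = EXIT_SUCCESS;

    /* Get path from argument to main else default to output.txt */
    char *path = (argc > 1) ? argv[1] : "output.txt";

    /* Open file for writing and obtain file pointer */
    FILE *file = fopen(path, "w");

    /* Print error message and exit if fopen() failed */
    if (!file)
    {
        perror(path);
        return EXIT_FAILURE;
    }

    /* Writes text to file. Unlike puts(), fputs() does not add a new-line. */
    if (fputs("Output in file.\n", file) == EOF)
    {
        perror(path);
        e = EXIT_FAILURE;
    }

    /* Close file */
    if (fclose(file))
    {
        perror(path);
        return EXIT_FAILURE;
    }
    return e;
}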
This program opens the file with the name given in the argument to main, defaulting to output.txt if
no argument is given. If a file with the same name already exists, its contents are discarded and
the file is treated as a new empty file. If the file does not already exist, the fopen() call creates it.
If the fopen() call fails for some reason, it returns NULL and typically sets the global errno variable
(POSIX requires this; the C standard does not). This means that the program can test the returned
value after the fopen() call and use perror() if fopen() fails.
If the fopen() call succeeds, it returns a valid FILE pointer. This pointer can then be used to
reference this file until fclose() is called on it.
The fputs() function writes the given text to the opened file, replacing any previous contents of the
file. Similarly to fopen(), the fputs() function also sets the errno value if it fails, though in this case
the function returns EOF to indicate the failure (it otherwise returns a non-negative value).
The fclose() function flushes any buffers, closes the file and frees the memory pointed to by FILE
*. The return value indicates completion just as fputs() does (though it returns 0 if successful),
again also setting the errno value in the case of a failure.
fprintf
You can use fprintf on a file just like you might on a console with printf. For example to keep
track of game wins, losses and ties you might write
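A minimal sketch of such a function, assuming the name savewlt and an output format chosen for illustration:

```c
#include <stdio.h>

/* Save wins, losses and ties to the given stream.
   Returns the fprintf() result: a negative value signals an output error. */
int savewlt(FILE *fout, int wins, int losses, int ties)
{
    return fprintf(fout, "Wins: %d\nTies: %d\nLosses: %d\n", wins, ties, losses);
}
```

Because the stream is a parameter, the same function works for a file opened with fopen() and for stdout, for example savewlt(stdout, 3, 1, 2);.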
A side note: Some systems (infamously, Windows) do not use what most programmers would call
"normal" line endings. While UNIX-like systems use \n to terminate lines, Windows uses a pair of
characters: \r (carriage return) and \n (line feed). This sequence is commonly called CRLF.
However, when using C you rarely need to worry about these highly platform-dependent
details. For a stream opened in text mode, the C library translates \n to and from the platform's
line ending. So on Windows, writing \n to a text stream produces \r\n in the file, while on
UNIX-like systems it is written as-is. Streams opened in binary mode (with b) are not translated.
Run process
#include <stdio.h>

void print_all(FILE *stream)
{
    int c;
    while ((c = getc(stream)) != EOF)
        putchar(c);
}
int main(void)
{
    FILE *stream;
    /* Start the process and open a stream that reads its standard output */
    if ((stream = popen("netstat", "r")) == NULL)
        return 1;
    print_all(stream);
    pclose(stream);
    return 0;
}
This program runs a process (netstat) via popen() and reads all the standard output from the
process and echoes that to standard output.
Note: popen() does not exist in the standard C library; it is part of POSIX.
The POSIX C library defines the getline() function. This function allocates a buffer to hold the line
contents and returns the new line, the number of characters in the line, and the size of the buffer.
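#define _POSIX_C_SOURCE 200809L /* make getline() visible on strict compilers */
#include <stdlib.h>
#include <stdio.h>

#define FILENAME "example.txt" /* file name assumed for this example */

int main(void)
{
    /* Open the file for reading */
    char *line_buf = NULL;
    size_t line_buf_size = 0;
    int line_count = 0;
    ssize_t line_size;
    FILE *fp = fopen(FILENAME, "r");
    if (!fp)
    {
        fprintf(stderr, "Error opening file '%s'\n", FILENAME);
        return EXIT_FAILURE;
    }

    /* Get the first line of the file. */
    line_size = getline(&line_buf, &line_buf_size, fp);

    /* Loop through until we are done with the file. */
    while (line_size >= 0)
    {
        /* Increment our line count */
        line_count++;

        /* Show the line details */
        printf("line[%06d]: chars=%06zd, buf size=%06zu, contents: %s", line_count,
               line_size, line_buf_size, line_buf);

        /* Get the next line */
        line_size = getline(&line_buf, &line_buf_size, fp);
    }

    /* Free the allocated line buffer */
    free(line_buf);
    line_buf = NULL;

    /* Close the file now that we are done with it */
    fclose(fp);
    return EXIT_SUCCESS;
}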
This is a file
which has
multiple lines
with various indentation,
blank lines
a really long line to show that getline() will reallocate the line buffer if the length of a
line is too long to fit in the buffer it has been given,
and punctuation at the end of the lines.
Output
In the example, getline() is initially called with no buffer allocated. During this first call, getline()
allocates a buffer, reads the first line and places the line's contents in the new buffer. On
subsequent calls, getline() updates the same buffer and only reallocates the buffer when it is no
longer large enough to fit the whole line. The temporary buffer is then freed when we are done
with the file.
Another option is getdelim(). This is the same as getline() except you specify the line ending
character. This is only necessary if the last character of the line for your file type is not '\n'.
getline() works even with Windows text files because with the multibyte line ending ("\r\n"), '\n' is
still the last character on the line.
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <stdint.h>
/* Only include our version of getline() if the POSIX version isn't available. */
/* Step through the file, pulling characters until either a newline or EOF. */
{
int c;
while (EOF != (c = getc(fin)))
{
/* Note we read a character. */
num_read++;
#endif
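Since only fragments of the implementation survive above, here is a simplified, self-contained sketch of the same idea. It is not the original code: the name my_getline, the long return type (used instead of ssize_t to stay within standard C) and the initial buffer size of 128 are assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for POSIX getline(): reads one line (including the
   '\n', if present) into *lineptr, growing the buffer as needed.
   Returns the number of characters read, or -1 on EOF or error. */
long my_getline(char **lineptr, size_t *n, FILE *fin)
{
    size_t num_read = 0;
    int c;

    if (lineptr == NULL || n == NULL || fin == NULL)
        return -1;

    /* Allocate an initial buffer if the caller did not supply one. */
    if (*lineptr == NULL || *n == 0) {
        char *fresh = realloc(*lineptr, 128);
        if (fresh == NULL)
            return -1;
        *lineptr = fresh;
        *n = 128;
    }

    /* Step through the file, pulling characters until a newline or EOF. */
    while ((c = getc(fin)) != EOF) {
        if (num_read + 2 > *n) {          /* need room for c and '\0' */
            size_t new_size = *n * 2;
            char *bigger = realloc(*lineptr, new_size);
            if (bigger == NULL)
                return -1;
            *lineptr = bigger;
            *n = new_size;
        }
        (*lineptr)[num_read++] = (char)c;
        if (c == '\n')
            break;
    }

    if (num_read == 0)                    /* EOF before any character */
        return -1;

    (*lineptr)[num_read] = '\0';
    return (long)num_read;
}
```

As with the POSIX function, the caller passes the same buffer and size on every call and frees the buffer once at the end.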
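#include <stdlib.h>
#include <stdio.h>
int main(void)
{
    int result = EXIT_FAILURE;
    const char str[] = "This is a binary file example";
    FILE *fp = fopen("output.bin", "wb");
    if (fp != NULL)
    {
        /* Write str, including its terminating null character */
        if (fwrite(str, sizeof *str, sizeof str, fp) == sizeof str)
            result = EXIT_SUCCESS;
        fclose(fp);
    }
    return result;
}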
This program creates and writes text in the binary form through the fwrite function to the file
output.bin.
If a file with the same name already exists, its contents are discarded and the file is treated as a
new empty file.
A binary stream is an ordered sequence of characters that can transparently record internal data.
In this mode, bytes are written between the program and the file without any interpretation.
To write integers portably, it must be known whether the file format expects them in big- or
little-endian format, and their size (usually 16, 32 or 64 bits). Bit shifting and masking may then
be used to write out the bytes in the correct order. Integers in C are not guaranteed to have two's
complement representation (though almost all implementations do). Fortunately, conversion to
an unsigned type is fully defined (the value is reduced modulo 2^N), which yields the two's
complement bit pattern. The code for writing a signed integer to a binary file is therefore a little
surprising.
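/* write a 16-bit little endian integer */
int fput16le(int x, FILE *fp)
{
    unsigned int rep = x;
    int e1, e2;
    e1 = fputc(rep & 0xFF, fp);        /* low byte first */
    e2 = fputc((rep >> 8) & 0xFF, fp); /* then the high byte */
    return (e1 == EOF || e2 == EOF) ? EOF : 0;
}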
The other functions follow the same pattern with minor modifications for size and byte order.
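As one illustration of that pattern, here is a sketch of a 32-bit big-endian writer. The name fput32be and the use of long for the value are assumptions for this example.

```c
#include <stdio.h>

/* Write a 32-bit big-endian integer: most significant byte first.
   Returns 0 on success, EOF on failure. */
int fput32be(long x, FILE *fp)
{
    unsigned long rep = (unsigned long)x;  /* well-defined conversion */
    int shift;

    for (shift = 24; shift >= 0; shift -= 8) {
        if (fputc((int)((rep >> shift) & 0xFF), fp) == EOF)
            return EOF;
    }
    return 0;
}
```

Only the loop bounds and direction change between sizes and byte orders; a little-endian variant would run shift from 0 up to 24.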
fscanf()
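Let's say we have a text file and we want to read all the words in that file in order to process
them.
file.txt:
This is just
a test file
to be used by fscanf()
This is the main function:
#include <stdlib.h>
#include <stdio.h>

void printAllWords(FILE *);

int main(void)
{
    FILE *fp;

    if ((fp = fopen("file.txt", "r")) == NULL) {
        perror("Error opening file");
        exit(EXIT_FAILURE);
    }
    printAllWords(fp);
    fclose(fp);
    return EXIT_SUCCESS;
}

void printAllWords(FILE *fp)
{
    char tmp[20];
    int i = 1;

    while (fscanf(fp, "%19s", tmp) == 1) { /* %19s leaves room for the '\0' */
        printf("Word %d: %s\n", i, tmp);
        i++;
    }
}
The output will be: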
Word 1: This
Word 2: is
Word 3: just
Word 4: a
Word 5: test
Word 6: file
Word 7: to
Word 8: be
Word 9: used
Word 10: by
Word 11: fscanf()
The stdio.h header defines the fgets() function. This function reads a line from a stream and
stores it in a specified string. The function stops reading text from the stream when either n - 1
characters are read, the newline character ('\n') is read or the end of file (EOF) is reached.
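#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE_LENGTH 80

int main(int argc, char **argv)
{
    char *path;
    char line[MAX_LINE_LENGTH] = {0};
    unsigned int line_count = 0;

    if (argc < 2)
        return EXIT_FAILURE;
    path = argv[1];

    /* Open file */
    FILE *file = fopen(path, "r");
    if (!file)
    {
        perror(path);
        return EXIT_FAILURE;
    }

    /* Get each line until there are none left */
    while (fgets(line, MAX_LINE_LENGTH, file))
    {
        /* Print each line */
        printf("line[%06u]: %s", ++line_count, line);

        /* Add a trailing newline to lines that don't already have one */
        if (line[strlen(line) - 1] != '\n')
            printf("\n");
    }

    /* Close file */
    if (fclose(file))
    {
        perror(path);
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}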
Calling the program with an argument that is a path to a file containing the following text:
This is a file
which has
multiple lines
with various indentation,
blank lines
a really long line to show that the line will be counted as two lines if the length of a line
is too long to fit in the buffer it has been given,
and punctuation at the end of the lines.
This very simple example allows a fixed maximum line length, such that longer lines will effectively
be counted as two lines. The fgets() function requires that the calling code provide the memory to
be used as the destination for the line that is read.
POSIX makes the getline() function available which instead internally allocates memory to
enlarge the buffer as necessary for a line of any length (as long as there is sufficient memory).
Chapter 23: Formatted Input/Output
Examples
Printing the Value of a Pointer to an Object
To print the value of a pointer to an object (as opposed to a function pointer) use the p conversion
specifier. It is defined to print void-pointers only, so to print the value of a non-void pointer it
needs to be explicitly converted ("cast") to void*.
#include <stdlib.h> /* for EXIT_SUCCESS */
#include <stdio.h> /* for printf() */

int main(void)
{
    int i;
    int * p = &i;

    printf("The address of i is %p.\n", (void *) p);

    return EXIT_SUCCESS;
}
C99
Another way to print pointers in C99 or later uses the uintptr_t type and the macros from
<inttypes.h>:
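#include <inttypes.h> /* for uintptr_t and PRIXPTR */
#include <stdio.h> /* for printf() */

int main(void)
{
    int i;
    int *p = &i;

    printf("The address of i is 0x%" PRIXPTR ".\n", (uintptr_t)(void *)p);

    return 0;
}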
In theory, there might not be an integer type that can hold any pointer converted to an integer (so
the type uintptr_t might not exist). In practice, it does exist. Pointers to functions need not be
convertible to the uintptr_t type — though again they most often are convertible.
If the uintptr_t type exists, so does the intptr_t type. It is not clear why you'd ever want to treat
addresses as signed integers, though.
K&RC89
Pre-Standard History:
Prior to C89 during K&R-C times there was no type void* (nor header <stdlib.h>, nor prototypes,
and hence no int main(void) notation), so the pointer was cast to long unsigned int and printed
using the lx length modifier/conversion specifier.
The example below is just for informational purpose. Nowadays this is invalid code, which
very well might provoke the infamous Undefined Behaviour.
int main()
{
    int i;
    int *p = &i;

    printf("The address of i is 0x%lx.\n", (long unsigned) p);

    return 0;
}
Subtracting the values of two pointers to an object results in a signed integer *1. So it would be
printed using at least the d conversion specifier.
To make sure there is a type wide enough to hold such a "pointer-difference", <stddef.h> defines
the type ptrdiff_t. To print a ptrdiff_t, use the t length modifier (available since C99).
C99
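#include <stdlib.h> /* for EXIT_SUCCESS */
#include <stdio.h> /* for printf() */
#include <stddef.h> /* for ptrdiff_t */

int main(void)
{
    int a[2];
    int * p1 = &a[0], * p2 = &a[1];
    ptrdiff_t pd = p2 - p1;

    printf("p1 = %p\n", (void *) p1);
    printf("p2 = %p\n", (void *) p2);
    printf("p2 - p1 = %td\n", pd);

    return EXIT_SUCCESS;
}
The result might look like this:
p1 = 0x7fff6679f430
p2 = 0x7fff6679f434
p2 - p1 = 1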
Please note that the resulting value of the difference is scaled by the size of the type the pointers
subtracted point to, an int here. The size of an int for this example is 4.
*1 If the two pointers to be subtracted do not point into the same array object (or one past its
end), the behaviour is undefined.
Conversion Specifier   Type of Argument   Description
n                      int *              write the number of bytes printed so far into the int pointed at.
Note that length modifiers can be applied to %n (e.g. %hhn indicates that a following n conversion
specifier applies to a pointer to a signed char argument, according to the ISO/IEC 9899:2011
§7.21.6.1 ¶7).
Note that the floating point conversions apply to types float and double because of the default
promotion rules (§6.5.2.2 Function calls, ¶7: "The ellipsis notation in a function prototype
declarator causes argument type conversion to stop after the last declared parameter. The default
argument promotions are performed on trailing arguments."). Thus, functions such as printf() are
only ever passed double values, even if the variable referenced is of type float.
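With the g and G formats, the choice between e and f (or E and F) notation is documented in the C
standard and in the POSIX specification for printf():
The double argument representing a floating-point number shall be converted in the
style f or e (or in the style F or E in the case of a G conversion specifier), depending on
the value converted and the precision. Let P equal the precision if non-zero, 6 if the
precision is omitted, or 1 if the precision is zero. Then, if a conversion with style E
would have an exponent of X:
• If P > X >= -4, the conversion shall be with style f (or F) and precision P - (X+1).
• Otherwise, the conversion shall be with style e (or E) and precision P - 1.
Finally, unless the '#' flag is used, any trailing zeros shall be removed from the
fractional portion of the result and the decimal-point character shall be removed if there
is no fractional portion remaining.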
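The rule is easier to follow with concrete values. The sketch below formats a value with a plain %g (precision omitted, so P = 6); the helper name formatG is an assumption for this example.

```c
#include <stdio.h>

/* Format value with "%g" (default precision, so P = 6) into buf. */
void formatG(double value, char *buf, size_t size)
{
    snprintf(buf, size, "%g", value);
}
```

With P = 6, the value 100000.0 has exponent X = 5, so style f is used and the result is 100000. The value 1000000.0 has X = 6, which is not less than P, so style e is used and the result is 1e+06 after the trailing zeros are removed. At the small end, 0.0001 has X = -4 and prints as 0.0001, while 0.00001 has X = -5 and prints as 1e-05.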
Accessed through including <stdio.h>, the function printf() is the primary tool used for printing
text to the console in C.
printf("Hello world!");
// Hello world!
Normal, unformatted character arrays can be printed by themselves by placing them directly in
between the parentheses.
int x = 3;
char y = 'Z';
char* z = "Example";
printf("Int: %d, Char: %c, String: %s", x, y, z);
// Int: 3, Char: Z, String: Example
Alternatively, integers, floating-point numbers, characters, and more can be printed using the
escape character %, followed by a character or sequence of characters denoting the format, known
as the format specifier.
All additional arguments to the function printf() are separated by commas, and these arguments
should be in the same order as the format specifiers. Additional arguments are ignored, while
incorrectly typed arguments or a lack of arguments will cause errors or undefined behavior. Each
argument can be either a literal value or a variable.
After successful execution, the number of characters printed is returned with type int. Otherwise,
a failure returns a negative value.
Length modifiers
The C99 and C11 standards specify the following length modifiers for printf(); their meanings are:
Modifier   Modifies                      Applies to
l          a, A, e, E, f, F, g, or G     double (for compatibility with scanf(); undefined in C90)
j          d, i, o, u, x, or X           intmax_t or uintmax_t
L          a, A, e, E, f, F, g, or G     long double
If a length modifier appears with any conversion specifier other than as specified above, the
behavior is undefined.
Microsoft specifies some different length modifiers, and explicitly does not support hh, j, z, or t.
Modifier   Modifies            Applies to
I32        d, i, o, x, or X    __int32
I64        d, i, o, x, or X    __int64
l or w     c or C              Wide character with printf and wprintf functions. (An lc, lC, wc or wC type specifier is synonymous with C in printf functions and with c in wprintf functions.)
l or w     s, S, or Z          Wide-character string with printf and wprintf functions. (An ls, lS, ws or wS type specifier is synonymous with S in printf functions and with s in wprintf functions.)
Note that the C, S, and Z conversion specifiers and the I, I32, I64, and w length modifiers are
Microsoft extensions. Treating l as a modifier for long double rather than double is different from
the standard, though you'll be hard-pressed to spot the difference unless long double has a
different representation from double.
The C standard (C11, and C99 too) defines the following flags for printf():
Flag   Conversions      Meaning
+      signed numeric   The result of a signed conversion shall always begin with a sign ('+' or '-'). The conversion shall begin with a sign only when a negative value is converted if this flag is not specified.
These flags are also supported by Microsoft with the same meanings.
Chapter 24: Function Parameters
Remarks
In C, it is common to use return values to denote errors that occur; and to return data through the
use of passed in pointers. This can be done for multiple reasons; including not having to allocate
memory on the heap or using static allocation at the point where the function is called.
Examples
Using pointer parameters to return multiple values
A common pattern in C, to easily imitate returning multiple values from a function, is to use
pointers.
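#include <stdio.h>

/* Return two values through the pointers passed in */
void Get(int *c, double *d)
{
    *c = 72;
    *d = 175.0;
}

int main(void)
{
    int a = 0;
    double b = 0.0;

    Get(&a, &b);
    printf("a: %d, b: %f\n", a, b);
    return 0;
}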
C99C11
/* The "static" array-parameter syntax and VLAs ("int friend_indexes[static size]")
   require C99 at least. In C11 VLAs are optional. */
void getListOfFriends(size_t size, int friend_indexes[static size]) {
    size_t i = 0;
    for (; i < size; i++) {
        friend_indexes[i] = 1;
    }
}
Here the static inside the [] of the function parameter requests that the argument array have
at least as many elements as are specified (i.e. size elements). To be able to use that feature we
have to ensure that the size parameter comes before the array parameter in the list.
#define LIST_SIZE 50 /* value assumed for this example */

int main(void) {
    size_t size_of_list = LIST_SIZE;
    int friends_indexes[size_of_list];

    /* friends_indexes decays to a pointer to its first element */
    getListOfFriends(size_of_list, friends_indexes);
    return 0;
}
See also
Passing multidimensional arrays to a function
In C, all function parameters are passed by value, so modifying what is passed in callee functions
won't affect caller functions' local variables.
#include <stdio.h>

void modify(int v) {
    printf("modify 1: %d\n", v); /* 0 is printed */
    v = 42;
    printf("modify 2: %d\n", v); /* 42 is printed */
}

int main(void) {
    int v = 0;
    printf("main 1: %d\n", v); /* 0 is printed */
    modify(v);
    printf("main 2: %d\n", v); /* 0 is printed, not 42 */
    return 0;
}
You can use pointers to let callee functions modify caller functions' local variables. Note that this is
not pass by reference but the pointer values pointing at the local variables are passed.
#include <stdio.h>

void modify(int* v) {
    printf("modify 1: %d\n", *v); /* 0 is printed */
    *v = 42;
    printf("modify 2: %d\n", *v); /* 42 is printed */
}

int main(void) {
    int v = 0;
    printf("main 1: %d\n", v); /* 0 is printed */
    modify(&v);
    printf("main 2: %d\n", v); /* 42 is printed */
    return 0;
}
However, returning the address of a local variable to the caller and then using it results in
undefined behaviour, because the variable's lifetime ends when the function returns. See
Dereferencing a pointer to variable beyond its lifetime.
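The order of evaluation of function arguments is unspecified in C: they may be evaluated from
left to right, from right to left, or in any other order, depending on the implementation. The
example below goes further: it modifies a twice without an intervening sequence point, so its
behaviour is undefined and its output cannot be relied upon.
#include <stdio.h>

void function(int a, int b)
{
    printf("%d %d\n", a, b);
}

int main(void)
{
    int a = 1;
    function(a++, ++a); /* undefined behaviour: a is modified twice */
    return 0;
}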
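To observe the evaluation order without modifying one variable twice, each argument can be a function call that records when it ran. This is a sketch only; the names mark, sum, order and evaluationOrder are assumptions for this example.

```c
#include <string.h>

static char order[3];   /* records which argument was evaluated first */

/* Append tag to the record and return it as the argument value. */
static int mark(char tag)
{
    size_t len = strlen(order);
    order[len] = tag;
    order[len + 1] = '\0';
    return tag;
}

static int sum(int x, int y) { return x + y; }

/* Calls sum() with two side-effecting arguments and reports the order seen. */
const char *evaluationOrder(void)
{
    order[0] = '\0';
    (void)sum(mark('A'), mark('B'));
    return order;   /* either "AB" or "BA", depending on the implementation */
}
```

Both results are valid C; portable code should compute such values in separate statements before the call if the order matters.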
Most examples of a function returning a value involve providing a pointer as one of the arguments
to allow the function to modify the value pointed to, similar to the following. The actual return value
of the function is usually some type such as an int to indicate the status of the result, whether it
worked or not.
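A minimal sketch of that pattern follows; the function name scaleIfLarge and the arithmetic it performs are illustrative assumptions.

```c
/* Returns 1 if the value was changed, 0 otherwise (the status code).
   The result itself is delivered through the pointer parameter. */
int scaleIfLarge(int *pValue)
{
    int status = 0;              /* default status: no change */

    if (*pValue > 10) {
        *pValue = *pValue * 45;  /* modify the value pointed to */
        status = 1;              /* report that it was changed */
    }
    return status;
}
```

The caller passes the address of its variable, for example scaleIfLarge(&n), and inspects the return value to learn whether n was modified.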
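However, you can also use a struct as a return value, which allows you to return both an error
status and other values. For instance:
typedef struct {
    int iStat; /* Return status */
    int iValue; /* Return value */
} RetValue;

RetValue func(int iValue)
{
    RetValue iRetStatus = {0, iValue};

    if (iValue > 10) {
        iRetStatus.iValue = iValue * 45;
        iRetStatus.iStat = 1;
    }
    return iRetStatus;
}
This function could then be used like the following sample:
int usingFunc(int iValue)
{
    RetValue iRet = func(iValue);

    if (iRet.iStat == 1) {
        /* do things with iRet.iValue, the returned value */
    }
    return 0;
}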
Chapter 25: Function Pointers
Introduction
Function pointers are pointers that point to functions instead of data objects. They can be used
to vary, at run-time, which function is called.
Syntax
• returnType (*name)(parameters)
Examples
Assigning a Function Pointer
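#include <stdio.h>

int increment(int i) { return i + 1; }
int decrement(int i) { return i - 1; }

int main(void)
{
    int num = 0; /* declare number to increment */
    int (*fp)(int); /* declare a function pointer */

    fp = &increment; /* set function pointer to increment function */
    num = (*fp)(num); /* num is now 1 */
    fp = &decrement; /* set function pointer to decrement function */
    num = (*fp)(num); /* num is now 0 */
    printf("num is now: %d\n", num);
    return 0;
}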
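#include <stdio.h>

enum Op
{
    ADD = '+',
    SUB = '-',
};

/* add: add a and b, return result */
int add(int a, int b) { return a + b; }

/* sub: subtract b from a, return result */
int sub(int a, int b) { return a - b; }

/* getmath: return the appropriate math function */
int (*getmath(enum Op op))(int, int)
{
    switch (op)
    {
        case ADD: return &add;
        case SUB: return &sub;
        default:  return NULL;
    }
}

int main(void)
{
    int a, b, c;
    int (*fp)(int,int);
    fp = getmath(ADD);
    a = 1, b = 2;
    c = (*fp)(a, b);
    printf("%d + %d = %d\n", a, b, c);
    return 0;
}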
Best Practices
Using typedef
It might be handy to use a typedef instead of declaring the function pointer each time by hand.
Example:
Posit that we have a function, sort, that expects a function pointer to a function compare such that:
"compare" is expected to return 0 if the two elements are deemed equal, a positive
value if the first element passed is "larger" in some sense than the latter element and
otherwise the function returns a negative value (meaning that the first element is
"lesser" than the latter).
Without a typedef we would pass a function pointer as an argument to a function in the following
manner:
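The two forms can be sketched side by side. This is an illustration under assumptions: the names sort_plain, sort_typedef, compare_func and compare_ints are invented here, and a simple insertion sort over int stands in for whatever sort actually does.

```c
#include <stddef.h>

/* Without a typedef: the pointer type is spelled out in the parameter list. */
void sort_plain(int *data, size_t count,
                int (*compare)(const void *elem1, const void *elem2))
{
    size_t i, j;
    for (i = 1; i < count; i++)            /* simple insertion sort */
        for (j = i; j > 0 && compare(&data[j - 1], &data[j]) > 0; j--) {
            int tmp = data[j];
            data[j] = data[j - 1];
            data[j - 1] = tmp;
        }
}

/* With a typedef: the same pointer type, named once and reused. */
typedef int (*compare_func)(const void *, const void *);

void sort_typedef(int *data, size_t count, compare_func compare)
{
    sort_plain(data, count, compare);
}

/* A comparison function that follows the contract described above. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}
```

Both functions accept exactly the same arguments; the typedef only changes how the parameter is written.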
Function pointers are the only place where you should include the pointer property of the type in
a typedef, e.g. do not try to define types like typedef struct something_struct *something_type.
This applies even for a structure with members which are not supposed to be accessed directly by
API callers, for example the stdio.h FILE type (which, as you will now notice, is not a pointer).
Taking context pointers.
A function pointer should almost always take a user-supplied void * as a context pointer.
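Example
void caller()
{
    /* context, the coefficients of the cubics */
    double coeffs[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    double min;

    /* findminimum() and cubics() are defined elsewhere; coeffs is passed through as the context */
    min = findminimum(cubics, coeffs);
}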
Using the context pointer means that the extra parameters do not need to be hard-coded into the
function pointed to, or require the use of globals.
The library function qsort() does not follow this rule, and one can often get away without context
for trivial comparison functions. But for anything more complicated, the context pointer becomes
essential.
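A smaller, self-contained sketch of the same idea: a callback receives its extra data through a void * context instead of a global. The names predicate_fn, count_if and greater_than are assumptions for this example.

```c
#include <stddef.h>

/* Callback type: receives the element plus a caller-supplied context. */
typedef int (*predicate_fn)(int value, void *ctx);

/* Count the elements for which pred returns non-zero. */
size_t count_if(const int *data, size_t count, predicate_fn pred, void *ctx)
{
    size_t i, hits = 0;
    for (i = 0; i < count; i++)
        if (pred(data[i], ctx))
            hits++;
    return hits;
}

/* The threshold arrives through ctx instead of a global variable. */
int greater_than(int value, void *ctx)
{
    const int *threshold = ctx;
    return value > *threshold;
}
```

count_if never looks inside ctx; it only forwards it, so the same function works with any callback and any kind of context data.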
See also
Functions pointers
Introduction
Just like char and int, a function is a fundamental feature of C. As such, you can declare a pointer
to one: which means that you can pass which function to call to another function to help it do its
job. For example, if you had a graph() function that displayed a graph, you could pass which
function to graph into graph().
Usage
So the above code will graph whatever function you passed into it - as long as that function meets
certain criteria: namely, that you pass a double in and get a double out. There are many functions
like that - sin(), cos(), tan(), exp() etc. - but there are many that aren't, such as graph() itself!
Syntax
So how do you specify which functions you can pass into graph() and which ones you can't? The
conventional way is by using a syntax that may not be easy to read or understand:
The problem above is that two things are being defined at the same time: the structure of the
function, and the fact that it's a pointer. So, split the two definitions! By using typedef, a
better syntax (easier to read and understand) can be achieved.
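A sketch of the split, under assumptions: the typedef name mathFunc, the extra parameters of graph() and the helper square are invented for this example, and graph() merely samples the function instead of drawing anything.

```c
#include <stddef.h>

/* The "structure" of the function: takes a double, returns a double. */
typedef double mathFunc(double x);

/* graph() takes a pointer to such a function. Instead of drawing, this
   sketch samples fn at count evenly spaced points starting at start. */
void graph(mathFunc *fn, double start, double step, double *out, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        out[i] = fn(start + step * (double)i);
}

/* Any function with the matching signature can be passed in. */
double square(double x) { return x * x; }
```

Here the typedef describes the function type and the * in the parameter list adds the "pointer" part, so each of the two facts is stated in one place.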
A function name used in an expression is converted to a pointer to the spot in program memory where the function's code exists.
The main use of a function pointer is to provide a "callback" to other functions (or to simulate
classes and objects).
returnType (*name)(parameters)
Basics
Just like you can have a pointer to an int, char, float, array/string, struct, etc. - you can have a
pointer to a function.
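Declaring the pointer takes the return type of the function, a name for the pointer, and the
types of the arguments/parameters the function receives.
void Print(void){
    printf("look ma' - no hands, only pointers!\n");
}
/* ptr is a pointer to a function taking no arguments and returning void */
void (*ptr)(void) = Print;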
As seen in more advanced examples in this document, declaring a pointer to a function could get
messy if the function is passed more than a few parameters. If you have a few pointers to
functions that have identical "structure" (same type of return value, and same type of parameters)
it's best to use typedef to save you some typing, and to make the code clearer:
#include <stdio.h>
typedef int (*ptrInt)(int, int);
int Add(int i, int j) { return i + j; }
int Multiply(int i, int j) { return i * j; }
int main()
{
    ptrInt ptr1 = Add;
    ptrInt ptr2 = Multiply;
    printf("%d\n", (*ptr1)(2, 3)); /* prints 5 */
    printf("%d\n", (*ptr2)(2, 3)); /* prints 6 */
    return 0;
}
You can also create an array of function pointers, provided all the pointers have the same "structure".
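A minimal sketch of such an array, reusing the ptrInt typedef from above. The array name operations and the helper apply are assumptions for this example.

```c
typedef int (*ptrInt)(int, int);

int Add(int i, int j)      { return i + j; }
int Multiply(int i, int j) { return i * j; }

/* An array whose elements all share the same function-pointer type. */
ptrInt operations[] = { Add, Multiply };

/* Call the function stored at the given index. */
int apply(unsigned index, int a, int b)
{
    return operations[index](a, b);
}
```

Indexing the array selects the function at run time, which is a common way to build dispatch tables.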
It is also possible to define an array of function pointers of different types, though that would
require casting whenever you want to access the specific function. You can learn more here.
Chapter 26: Generic selection
Syntax
• _Generic ( assignment-expression , generic-assoc-list )
Parameters
Parameter Details
Remarks
1. All type qualifiers will be dropped during the evaluation of _Generic primary expression.
2. _Generic primary expression is evaluated at translation phase 7. So phases like string
concatenation have been finished before its evaluation.
Examples
Check whether a variable is of a certain qualified type
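#include <stdio.h>
#define is_const_int(x) _Generic((&x), \
        const int *: "a const int", \
        int *: "a non-const int", \
        default: "of other type")
int main(void)
{
    const int i = 1;
    int j = 1;
    double k = 1.0;
    printf("i is %s\n", is_const_int(i));
    printf("j is %s\n", is_const_int(j));
    printf("k is %s\n", is_const_int(k));
}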
Output:
i is a const int
j is a non-const int
k is of other type
However, if the type generic macro is implemented like this:
#define is_const_int(x) _Generic((x), \
        const int: "a const int", \
        int: "a non-const int", \
        default: "of other type")
Output:
i is a non-const int
j is a non-const int
k is of other type
This is because all type qualifiers are dropped for the evaluation of the controlling expression of a
_Generic primary expression.
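#include <stdio.h>
void print_int(int x) { printf("int: %d\n", x); }
void print_dbl(double x) { printf("double: %g\n", x); }
void print_default() { puts("unknown argument"); }
#define print(X) _Generic((X), \
        int: print_int, double: print_dbl, \
        default: print_default)(X)
int main(void) {
    print(42);
    print(3.14);
    print("hello, world");
}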
Output:
int: 42
double: 3.14
unknown argument
Note that if the type is neither int nor double, a warning would be generated. To eliminate the
warning, you can add that type to the print(X) macro.
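Adding the missing type, as suggested above, could look like the following sketch. The helper names print_int, print_dbl and print_str are assumptions; they return the result of printf() (the number of characters written) only so that the dispatch can be checked:

```c
#include <stdio.h>

/* Hypothetical helpers; each returns printf()'s result, i.e. the number
   of characters written. */
static int print_int(int x)         { return printf("int: %d\n", x); }
static int print_dbl(double x)      { return printf("double: %g\n", x); }
static int print_str(const char *x) { return printf("string: %s\n", x); }

/* A string literal decays to char * in the controlling expression,
   so both pointer spellings are listed. No default branch is needed
   as long as only these types are passed. */
#define print(X) _Generic((X),          \
        int:          print_int,        \
        double:       print_dbl,        \
        char *:       print_str,        \
        const char *: print_str)(X)
```

With no default association, passing an unsupported type becomes a compile-time error rather than a warning.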
If a selection on multiple arguments for a type generic expression is wanted, and all types in
question are arithmetic types, an easy way to avoid nested _Generic expressions is to use addition
of the parameters in the controlling expression:
double max_double(double, double);
Here, the controlling expression (X)+(Y) is only inspected according to its type and not evaluated.
The usual conversions for arithmetic operands are performed to determine the selected type.
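A sketch of this technique, assuming hypothetical functions max_int(), max_unsigned() and max_double() and a macro named MAX; the function bodies are illustrative only:

```c
/* Hypothetical implementations of the functions being selected. */
int max_int(int a, int b)                     { return a > b ? a : b; }
unsigned max_unsigned(unsigned a, unsigned b) { return a > b ? a : b; }
double max_double(double a, double b)         { return a > b ? a : b; }

/* (X)+(Y) is never evaluated; only its type, after the usual arithmetic
   conversions, picks the function. The arguments are evaluated once,
   in the final call. */
#define MAX(X, Y) _Generic((X)+(Y),       \
        int:      max_int,                \
        unsigned: max_unsigned,           \
        default:  max_double)((X), (Y))
```

Mixing an int with a double therefore selects max_double(), because the sum of the two has type double.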
For more complex situations, a selection can be made based on more than one argument to the
operator, by nesting generic selections.
This example selects between four externally implemented functions, that take combinations of
two int and/or string arguments, and return their sum.
#define AddStr(y) \
_Generic((y), int: AddStrInt, \
char*: AddStrStr, \
const char*: AddStrStr )
#define AddInt(y) \
_Generic((y), int: AddIntInt, \
char*: AddIntStr, \
const char*: AddIntStr )
#define Add(x, y) \
_Generic((x) , int: AddInt(y) , \
char*: AddStr(y) , \
const char*: AddStr(y)) \
((x), (y))
int main(void)
{
    int result = 0;
    int c = 1;
    const char d[] = "0";
    result = Add( d , ++c );
}
Even though it appears as if argument y is evaluated more than once, it isn't [1]. Both arguments
are evaluated only once, at the end of macro Add: ( x , y ), just like in an ordinary function call.
[1] (Quoted from: ISO/IEC 9899:201X 6.5.1.1 Generic selection, paragraph 3)
The controlling expression of a generic selection is not evaluated.
Chapter 27: Identifier Scope
Examples
Block Scope
An identifier has block scope if its corresponding declaration appears inside a block (parameter
declarations in a function definition count as well). The scope ends at the end of the corresponding block.
No different entities with the same identifier can have the same scope, but scopes may overlap. In
case of overlapping scopes the only visible one is the one declared in the innermost scope.
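#include <stdio.h>
void test(int bar)                       // bar has scope test function block
{
    int foo = 5;                         // foo has scope test function block
    {
        int bar = 10;                    // inner bar hides the parameter bar
        printf("%d %d\n", foo, bar);     // 5 10
    }                                    // end of scope for inner bar
    printf("%d %d\n", foo, bar);         // 5 5, bar is the parameter again
}
int main(void)
{
    int foo = 3;                         // foo has scope main function block
    printf("%d\n", foo);                 // 3
    test(5);
    printf("%d\n", foo);                 // 3
    return 0;
}                                        // end of scope for main:foo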
Function Prototype Scope
#include <stdio.h>
/* The parameter name, apple, has function prototype scope. These names
   are not significant outside the prototype itself. This is demonstrated
   below. */
int test_function(int apple);
int main(void)
{
    int orange = 5;
    orange = test_function(orange);
    printf("%d\r\n", orange); //orange = 6
    return 0;
}
int test_function(int fruit)
{
    fruit += 1;
    return fruit;
}
Note that you get puzzling error messages if you introduce a type name in a prototype:
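int function(struct whatever *arg); /* introduces a new, unusable struct whatever */
struct whatever
{
    int a;
    // ...
};
int function(struct whatever *arg)  /* conflicts with the prototype above */
{
    return arg->a;
}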
Place the structure definition before the function declaration, or add struct whatever; as a line
before the function declaration, and there is no problem. You should not introduce new type
names in a function prototype because there's no way to use that type, and hence no way to
define or use that function.
File Scope
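#include <stdio.h>
/* The identifier, foo, is declared outside all blocks.
   It can be used anywhere after the declaration until the end of
   the translation unit. */
static int foo;
void test_function(void)
{
    foo += 2;
}
int main(void)
{
    foo = 1;
    test_function();
    printf("%d\r\n", foo); //foo = 3;
    return 0;
}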
Function scope
Function scope is the special scope for labels. This is due to their unusual property. A label is
visible through the entire function in which it is defined, and one can jump to it (using the
instruction goto label) from any point in the same function. While not useful, the following
example illustrates the point:
#include <stdio.h>
INSIDE may seem to be defined inside the if block, as is the case for i, whose scope is the block,
but it is not. It is visible in the whole function, as the instruction goto INSIDE; illustrates. Thus
there can't be two labels with the same identifier in a single function.
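The behavior described here can be reproduced with a small sketch. The function visit_inside() and the labels' placement are assumptions made for illustration, not the original listing:

```c
#include <stdio.h>

/* Returns how many times the labeled statement ran. */
int visit_inside(void)
{
    int visits = 0;
    goto INSIDE;            /* legal: INSIDE is visible in the whole function */
OUTSIDE:
    if (visits == 0) {      /* false after the first visit, so the body is skipped */
        int i;              /* i has block scope only */
INSIDE:
        i = 1;              /* assigned here, because the jump bypassed the declaration */
        visits += i;
        printf("visits=%d\n", visits);
        goto OUTSIDE;
    }
    return visits;
}
```

The jump enters the block directly at the label, runs the body once, and the if condition then prevents a second pass.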
A possible usage is the following pattern to realize correct complex cleanups of allocated
resources:
#include <stdlib.h>
#include <stdio.h>
void a_function(void) {
double* a = malloc(sizeof(double[34]));
if (!a) {
fprintf(stderr,"can't allocate\n");
return; /* No point in freeing a if it is null */
}
FILE* b = fopen("some_file","r");
if (!b) {
fprintf(stderr,"can't open\n");
goto CLEANUP1; /* Free a; no point in closing b */
}
/* do something reasonable */
int error = 0; /* assumed status flag; the work above would set it */
if (error) {
fprintf(stderr,"something's wrong\n");
goto CLEANUP2; /* Free a and close b to prevent leaks */
}
/* do yet something else */
CLEANUP2:
fclose(b);
CLEANUP1:
free(a);
}
Labels such as CLEANUP1 and CLEANUP2 are special identifiers that behave differently from all other
identifiers. They are visible from everywhere inside the function, even in places that are executed
before the labeled statement, or even in places that could never be reached if none of the goto is
executed. Labels are often written in lower-case rather than upper-case.
Chapter 28: Implementation-defined
behaviour
Remarks
Overview
The C standard describes the language syntax, the functions provided by the standard library, and
the behavior of conforming C processors (roughly speaking, compilers) and conforming C
programs. With respect to behavior, the standard for the most part specifies particular behaviors
for programs and processors. On the other hand, some operations have explicit or implicit
undefined behavior -- such operations are always to be avoided, as you cannot rely on anything
about them. In between, there are a variety of implementation defined behaviors. These behaviors
may vary between C processors, runtimes, and standard libraries (collectively, implementations),
but they are consistent and reliable for any given implementation, and conforming implementations
document their behavior in each of these areas.
The balance of these remarks constitutes a list of all the implementation-defined behaviors and
characteristics specified in the C2011 standard, with references to the standard. Many of them use
the terminology of the standard. Some others rely more generally on the context of the standard,
such as the eight stages of translating source code into a program, or the difference between
hosted and freestanding implementations. Some that may be particularly surprising or notable are
presented in bold typeface. Not all the behaviors described are supported by earlier C standards,
but generally speaking, they have implementation-defined behavior in all versions of the standard
that support them.
• The number of bits in one byte (3.6/3). At least 8, the actual value can be queried with the
macro CHAR_BIT.
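Querying the value could look like this minimal sketch; bits_per_byte() and int_width_in_bits() are hypothetical helper names:

```c
#include <limits.h>

/* Number of bits in a byte on this implementation (at least 8). */
int bits_per_byte(void)
{
    return CHAR_BIT;
}

/* Size in bits of the object representation of int; this count
   includes any padding bits the implementation may use. */
int int_width_in_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}
```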
Source translation
• The manner in which physical source file multibyte characters are mapped to the source
character set (5.1.1.2/1).
• The execution-set character(s) to which character literals and characters in string constants
are converted (during translation phase 5) when there is otherwise no corresponding
character (5.1.1.2/1).
Operating environment
• The manner in which the diagnostic messages to be emitted are identified (5.1.1.3/1).
• The name and type of the function called at startup in a freestanding implementation (
5.1.2.1/1).
• In a hosted environment, any allowed signatures for the main() function other than int
main(int argc, char *argv[]) and int main(void) (5.1.2.2.1/1).
• The manner in which a hosted implementation defines the strings pointed to by the second
argument to main() (5.1.2.2.1/2).
• What constitutes an "interactive device" for the purpose of sections 5.1.2.3 (Program
Execution) and 7.21.3 (Files) (5.1.2.3/7).
• The char values corresponding to the defined alphabetic escape sequences (5.2.2/3).
• The accuracy of floating-point arithmetic operations and of the standard library's conversions
from internal floating point representations to string representations (5.2.4.2.2/6).
• The value of macro FLT_ROUNDS, which encodes the default floating-point rounding mode (
5.2.4.2.2/8).
• The rounding behaviors characterized by supported values of FLT_ROUNDS greater than 3 or
less than -1 (5.2.4.2.2/8).
Types
• The result of attempting to (indirectly) access an object with thread storage duration from a
thread other than the one with which the object is associated (6.2.4/4)
• The value of a char to which a character outside the basic execution set has been assigned (
6.2.5/3).
• The supported extended signed integer types, if any, (6.2.5/4), and any extension keywords
used to identify them.
• Whether char has the same representation and behavior as signed char or as unsigned
char (6.2.5/15). Can be queried with CHAR_MIN, which is either 0 or SCHAR_MIN if char is unsigned
or signed, respectively.
• The number, order, and encoding of bytes in the representations of objects, except
where explicitly specified by the standard (6.2.6.1/2).
• Which of the three recognized forms of integer representation applies in any given
situation, and whether certain bit patterns of integer objects are trap representations (
6.2.6.2/2).
• Whether and in what contexts any extended alignments are supported (6.2.8/3).
• The integer conversion ranks of any extended signed integer types relative to each other (
6.3.1.1/1).
• When an in-range but unrepresentable value is assigned to a floating-point object, how the
representable value stored in the object is chosen from between the two nearest
representable values (6.3.1.4/2; 6.3.1.5/1; 6.4.4.2/3).
• The result of converting an integer to a pointer type, except for integer constant
expressions with value 0 (6.3.2.3/5).
Source form
• The locations within #pragma directives where header name tokens are recognized (6.4/4).
• The characters, including multibyte characters, other than underscore, unaccented Latin
letters, universal character names, and decimal digits that may appear in identifiers (
6.4.2.1/1).
• With some exceptions, the manner in which the source characters in an integer character
constant are mapped to execution-set characters (6.4.4.4/2; 6.4.4.4/10).
• The current locale used for computing the value of a wide character constant, and most
other aspects of the conversion for many such constants (6.4.4.4/11).
• Whether differently-prefixed wide string literal tokens can be concatenated and, if so, the
treatment of the resulting multibyte character sequence (6.4.5/5)
• The locale used during translation phase 7 to convert wide string literals to multibyte
character sequences, and their value when the result is not representable in the execution
character set (6.4.5/6).
• The manner in which header names are mapped to file names (6.4.7/2).
Evaluation
• Whether and how floating-point expressions are contracted when FP_CONTRACT is not used (
6.5/8).
• The values of the results of the sizeof and _Alignof operators (6.5.3.4/5).
Runtime behavior
• Whether the type of a bitfield declared as int is the same type as unsigned int or as signed
int (6.7.2/5).
• What types bitfields may take, other than optionally-qualified _Bool, signed int, and unsigned
int; whether bitfields may have atomic types (6.7.2.1/5).
• Aspects of how implementations lay out the storage for bitfields (6.7.2.1/11).
• The alignment of non-bitfield members of structures and unions (6.7.2.1/14).
Preprocessor
• Whether character constants are converted to integer values the same way in preprocessor
conditionals as in ordinary expressions, and whether a single-character constant may have a
negative value (6.10.1/4).
• The manner in which a header name is formed from the tokens of a multi-token #include
directive (6.10.2/4).
• Whether a \ character is inserted before the \ introducing a universal character name in the
result of the preprocessor's # operator (6.10.3.2/2).
• The behavior of the #pragma preprocessing directive for pragmas other than STDC (6.10.6/1).
• The value of the __DATE__ and __TIME__ macros if no translation date or time, respectively, is
available (6.10.8.1/1).
• The internal character encoding used for wchar_t if macro __STDC_ISO_10646__ is not defined (
6.10.8.2/1).
• The internal character encoding used for char32_t if macro __STDC_UTF_32__ is not defined (
6.10.8.2/1).
Standard Library
General
• Any additional floating-point exceptions beyond those defined by the standard (7.6/6).
• Any additional floating-point rounding modes beyond those defined by the standard (7.6/8).
• Any additional floating-point environments beyond those defined by the standard (7.6/10).
• The default value of the floating-point environment access switch (7.6.1/2).
• Whether the feraiseexcept() function additionally raises the "inexact" floating-point exception
whenever it raises the "overflow" or "underflow" floating-point exception (7.6.2.3/2).
Locale-related functions
Math functions
• The types represented by float_t and double_t when the FLT_EVAL_METHOD macro has a value
different from 0, 1, and 2 (7.12/2).
• Any supported floating-point classifications beyond those defined by the standard (7.12/6).
• The value returned by the math.h functions in the event of a domain error (7.12.1/2).
• The value returned by the math.h functions in the event of a pole error (7.12.1/3).
• The value returned by the math.h functions when the result underflows, and aspects of
whether errno is set to ERANGE and whether a floating-point exception is raised under those
circumstances (7.12.1/6).
• Whether the fmod() functions return 0 or raise a domain error when their second argument is
0 (7.12.10.1/3).
• Whether the remainder() functions return 0 or raise a domain error when their second
argument is 0 (7.12.10.2/3).
• The number of significant bits in the quotient moduli computed by the remquo() functions (
7.12.10.3/2).
• Whether the remquo() functions return 0 or raise a domain error when their second argument
is 0 (7.12.10.3/3).
Signals
• The complete set of supported signals, their semantics, and their default handling (7.14/4).
• When a signal is raised and there is a custom handler associated with that signal, which
signals, if any, are blocked for the duration of the execution of the handler (7.14.1.1/3).
• Which signals other than SIGFPE, SIGILL, and SIGSEGV cause the behavior upon returning from
a custom signal handler to be undefined (7.14.1.1/3).
• Which signals are initially configured to be ignored (regardless of their default handling;
7.14.1.1/6).
Miscellaneous
• The specific null pointer constant to which macro NULL expands (7.19/3).
File-handling functions
• Whether the last line of a text stream requires a terminating newline (7.21.2/2).
• Whether the same file can simultaneously be open multiple times (7.21.3/8).
• The behavior of the remove() function when the target file is open (7.21.4.1/2).
• The behavior of the rename() function when the target file already exists (7.21.4.2/2).
• Whether files created via the tmpfile() function are removed in the event that the program
terminates abnormally (7.21.4.3/2).
• Which mode changes under which circumstances are permitted via freopen() (7.21.5.4/3).
I/O functions
• Which of the permitted representations of infinite and not-a-number FP values are produced
by the printf()-family functions (7.21.6.1/8).
• The manner in which pointers are formatted by the printf()-family functions (7.21.6.1/8).
• The behavior of scanf()-family functions when the - character appears in an internal position
of the scanlist of a [ field (7.21.6.2/12).
• The errno value set by ftell() on failure (7.21.9.4/3).
• The meaning to the strtod()-family functions of some supported aspects of NaN formatting
(7.22.1.3/4).
• Whether the strtod()-family functions set errno to ERANGE when the result underflows (
7.22.1.3/10).
• The behavior of the memory-allocation functions when the number of bytes requested is 0 (
7.22.3/1).
• What cleanups, if any, are performed and what status is returned to the host OS when the
abort() function is called (7.22.4.1/2).
• What status is returned to the host environment when exit() is called (7.22.4.4/5).
• The handling of open streams and what status is returned to the host environment when
_Exit() is called (7.22.4.5/2).
• The set of environment names accessible via getenv() and the method for altering the
environment (7.22.4.6/2).
• The range and precision of times representable via types clock_t and time_t (7.27.1/4).
• The beginning of the era that serves as the reference for the times returned by the clock()
function (7.27.2.1/3).
• The beginning of the epoch that serves as the reference for the times returned by the
timespec_get() function (when the time base is TIME_UTC; 7.27.2.5/3).
• The strftime() replacement for the %Z conversion specifier in the "C" locale (7.27.3.5/7).
• Which of the permitted representations of infinite and not-a-number FP values are produced
by the wprintf()-family functions (7.29.2.1/8).
• The manner in which pointers are formatted by the wprintf()-family functions (7.29.2.1/8).
• The behavior of the wscanf()-family functions when the - character appears in an internal
position of the scanlist of a [ field (7.29.2.2/12).
• The meaning to the wcstod()-family functions of some supported aspects of NaN formatting
(7.29.4.1.1/4).
• Whether the wcstod()-family functions set errno to ERANGE when the result underflows (
7.29.4.1.1/10).
Examples
Right shift of a negative integer
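A minimal sketch of the behavior this heading refers to; shift_right_by_one() and floor_half() are hypothetical names, and the second function is only one possible portable alternative:

```c
/* Shifts value right by one bit. For negative values the result is
   implementation-defined: many compilers perform an arithmetic shift
   (the sign bit is copied in), but a logical shift is also allowed. */
int shift_right_by_one(int value)
{
    return value >> 1;
}

/* Portable alternative, assuming floor division by two is what is
   wanted: it never shifts a negative value and cannot overflow. */
int floor_half(int value)
{
    return value >= 0 ? value / 2 : -((-(value + 1)) / 2) - 1;
}
```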
// Supposing SCHAR_MAX, the maximum value that can be represented by a signed char, is
// 127, the behavior of this assignment is implementation-defined:
signed char integer;
integer = 128;
// The allocation functions have implementation-defined behavior when the requested size
// of the allocation is zero.
void *p = malloc(0);
Each signed integer type may be represented in any one of three formats; it is implementation-
defined which one is used. The implementation in use for any given signed integer type at least as
wide as int can be determined at runtime from the two lowest-order bits of the representation of
value -1 in that type, like so:
enum { sign_magnitude = 1, ones_compl = 2, twos_compl = 3, };
#define SIGN_REP(T) ((T)-1 & (T)3)
switch (SIGN_REP(long)) {
case sign_magnitude: { /* do something */ break; }
case ones_compl: { /* do otherwise */ break; }
case twos_compl: { /* do yet else */ break; }
case 0: { _Static_assert(SIGN_REP(long), "bogus sign representation"); }
}
The same pattern applies to the representation of narrower types, but they cannot be tested by
this technique because the operands of & are subject to "the usual arithmetic conversions" before
the result is computed.
Chapter 29: Implicit and Explicit Conversions
Syntax
• Explicit Conversion (aka "Casting"): (type) expression
Remarks
"Explicit conversion" is also commonly referred to as "casting".
Examples
Integer Conversions in Function Calls
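Given that the function has a proper prototype, integers are converted for calls to functions
according to the rules of integer conversion, C11 6.3.1.3:
When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.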
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
Usually you should not truncate a wide signed type to a narrower signed type, because obviously
the values can't fit and there is no clear meaning that this should have. The C standard cited
above defines these cases to be "implementation-defined", that is, they are not portable.
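The two quoted rules can be seen in a short sketch. The helpers to_u8() and to_i8_checked() are hypothetical; the second one shows one way to avoid the implementation-defined case by checking the range first:

```c
#include <stdint.h>

/* Converting to an unsigned type is fully defined: the value is
   reduced modulo 2^N, here modulo 256. */
uint8_t to_u8(int value)
{
    return (uint8_t)value;
}

/* Converting to a narrower signed type is only portable when the value
   fits. This helper reports failure instead of relying on
   implementation-defined behavior. */
int to_i8_checked(int value, int8_t *out)
{
    if (value < INT8_MIN || value > INT8_MAX)
        return 0;               /* would not fit */
    *out = (int8_t)value;       /* representable, so the value is unchanged */
    return 1;
}
```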
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h> /* PRIu64 */
void param_u64(uint64_t val) {
    printf("%s val is %" PRIu64 "\n", __func__, val); /* Fixed with format string */
}
int main(void) {
    uint8_t u8 = 127;
    param_u64(u8); /* implicit conversion: uint8_t is widened to uint64_t */
    return 0;
}
Pointer conversions to void* are implicit, but any other pointer conversion must be explicit. While
the compiler allows an explicit conversion from any pointer-to-data type to any other pointer-to-
data type, accessing an object through a wrongly typed pointer is erroneous and leads to
undefined behavior. The only cases in which such an access is allowed are when the types are
compatible or when the pointer with which you are looking at the object has a character type.
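#include <stdio.h>
void func_voidp(void* voidp) {
    printf("%s Address of ptr is %p\n", __func__, voidp);
}
void func_charp(char* charp) {
    printf("%s Address of ptr is %p\n", __func__, (void*)charp);
}
/* Structures have same shape, but not same type */
struct struct_a {
    int a;
    int b;
} data_a;
struct struct_b {
    int a;
    int b;
} data_b;
void func_struct_b(struct struct_b* bp) {
    printf("%s Address of ptr is %p\n", __func__, (void*)bp);
}
int main(void) {
    /* Explicit conversion to a character pointer is always allowed */
    func_charp((char*)&data_a);
    /* Implicit ptr conversion allowed for void* */
    func_voidp(&data_a);
    /*
     * Explicit ptr conversion for other types
     *
     * Note that although they have identical definitions,
     * the types are not compatible, and accessing data_a through
     * the converted pointer would lead to undefined behavior.
     */
    func_struct_b((struct struct_b*)&data_a);
    /* My output shows: */
    /* func_charp Address of ptr is 0x601030 */
    /* func_voidp Address of ptr is 0x601030 */
    /* func_struct_b Address of ptr is 0x601030 */
    return 0;
}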
Chapter 30: Initialization
Examples
Initialization of Variables in C
In the absence of explicit initialization, external and static variables are guaranteed to be
initialized to zero; automatic variables (including register variables) have indeterminate (i.e.,
garbage) initial values.
Scalar variables may be initialized when they are defined by following the name with an equals
sign and an expression:
int x = 1;
char squota = '\'';
long day = 1000L * 60L * 60L * 24L; /* milliseconds/day */
For external and static variables, the initializer must be a constant expression (see note 2); the
initialization is done once, conceptually before the program begins execution.
For automatic and register variables, the initializer is not restricted to being a constant: it may be
any expression involving previously defined values, even function calls.
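For example, the initializations in a binary search routine could be written as
int low = 0;
int high = n - 1;
instead of
int low, high;
low = 0;
high = n - 1;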
In effect, initializations of automatic variables are just shorthand for assignment statements. Which
form to prefer is largely a matter of taste. We generally use explicit assignments, because
initializers in declarations are harder to see and further away from the point of use. On the other
hand, whenever possible, variables should only be declared when they're about to be used.
Initializing an array:
An array may be initialized by following its declaration with a list of initializers enclosed in braces
and separated by commas.
For example, to initialize an array days_of_month with the number of days in each month:
int days_of_month[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
When the size of the array is omitted, the compiler will compute the length by counting the
initializers, of which there are 12 in this case.
If there are fewer initializers for an array than the specified size, the others will be zero for all types
of variables.
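A minimal sketch of this rule; the array names and the helper functions are hypothetical:

```c
#include <stddef.h>

/* Only the first two elements are given; the remaining three are zero. */
static const int partial[5] = { 1, 2 };

/* Hypothetical accessor so the values can be inspected. */
int partial_at(size_t index)
{
    return partial[index];
}

/* The same rule applies to an automatic array: { 0 } zeroes every element. */
int sum_of_zeroed(void)
{
    int local[8] = { 0 };
    int sum = 0;
    for (size_t i = 0; i < sizeof local / sizeof local[0]; i++)
        sum += local[i];
    return sum;
}
```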
It is an error to have too many initializers. There is no standard way to specify repetition of an
initializer — but GCC has an extension to do so.
C99
In C89/C90 or earlier versions of C, there was no way to initialize an element in the middle of an
array without supplying all the preceding values as well.
C99
With C99 and above, designated initializers allow you to initialize arbitrary elements of an array,
leaving any uninitialized values as zeros.
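For instance, a sketch with a hypothetical five-element array named sparse:

```c
#include <stddef.h>

/* Elements 1, 2 and 4 are named explicitly (in any order);
   elements 0 and 3 are not mentioned and therefore become 0. */
static const int sparse[5] = { [2] = 5, [1] = 2, [4] = 9 };

/* Hypothetical accessor so the values can be inspected. */
int sparse_at(size_t index)
{
    return sparse[index];
}
```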
Character arrays are a special case of initialization; a string may be used instead of the braces
and commas notation:
In this case, the array size is six (five characters plus the terminating '\0').
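The two notations can be compared in a short sketch. The five-character string "hello" and the array names are assumptions chosen to match the size mentioned above:

```c
#include <stddef.h>

/* Both arrays hold 'h', 'e', 'l', 'l', 'o', '\0'. */
static const char from_string[] = "hello";
static const char from_braces[] = { 'h', 'e', 'l', 'l', 'o', '\0' };

/* Size in bytes of each array, terminator included. */
size_t from_string_size(void) { return sizeof from_string; }
size_t from_braces_size(void) { return sizeof from_braces; }
```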
2 Note that a constant expression is defined as something that can be evaluated at compile-time.
So, int global_var = f(); is invalid. Another common misconception is thinking of a const qualified
variable as a constant expression. In C, const means "read-only", not "compile time constant". So,
global definitions like const int SIZE = 10; int global_arr[SIZE]; and const int SIZE = 10; int
global_var = SIZE; are not legal in C.
Structures and arrays of structures can be initialized by a series of values enclosed in braces, one
value per member of the structure.
struct Date
{
int year;
int month;
int day;
};
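Initializing a single structure and an array of structures could look like the sketch below. The structure is repeated so the block stands alone, and the variable names are assumptions:

```c
struct Date
{
    int year;
    int month;
    int day;
};

/* One value per member, in declaration order. */
static const struct Date us_independence_day = { 1776, 7, 4 };

/* Array of structures: one inner pair of braces per element. */
static const struct Date uk_battles[] =
{
    { 1066, 10, 14 },    /* Battle of Hastings */
    { 1815,  6, 18 },    /* Battle of Waterloo */
    { 1805, 10, 21 },    /* Battle of Trafalgar */
};
```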
Note that the array initialization could be written without the interior braces, and in times past
(before 1990, say) often would have been written without them:
Although this works, it is not good modern style — you should not attempt to use this notation in
new code and should fix the compiler warnings it usually yields.
C99
C99 introduced the concept of designated initializers. These allow you to specify which elements
of an array, structure or union are to be initialized by the values following.
int array[] = { [4] = 29, [5] = 31, [17] = 101, [18] = 103, [19] = 107, [20] = 109 };
The term in square brackets, which can be any constant integer expression, specifies which
element of the array is to be initialized by the value of the term after the = sign. Unspecified
elements are initialized to zero. The example shows the
designated initializers in order; they do not have to be in order. The example shows gaps; those
are legitimate. The example doesn't show two different initializations for the same element; that
too is allowed (ISO/IEC 9899:2011, §6.7.9 Initialization, ¶19 The initialization shall occur in
initializer list order, each initializer provided for a particular subobject overriding any previously
listed initializer for the same subobject).
In this example, the size of the array is not defined explicitly, so the maximum index specified in
the designated initializers dictates the size of the array — which would be 21 elements in the
example. If the size was defined, initializing an entry beyond the end of the array would be an
error, as usual.
You can specify which elements of a structure are initialized by using the .element notation:
struct Date
{
int year;
int month;
int day;
};
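A sketch of the .element notation applied to this structure; the structure is repeated so the block stands alone, and the variable names are assumptions:

```c
struct Date
{
    int year;
    int month;
    int day;
};

/* Members are named, so their order in the initializer is free. */
static const struct Date us_independence_day =
    { .day = 4, .month = 7, .year = 1776 };

/* Members that are not named (month and day here) become 0. */
static const struct Date year_only = { .year = 2000 };
```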
You can specify which element of a union is initialized with a designated initializer.
C89
Prior to the C standard, there was no way to initialize a union. The C89/C90 standard allows you to
initialize the first member of a union — so the choice of which member is listed first matters.
struct discriminated_union
{
enum { DU_INT, DU_DOUBLE } discriminant;
union
{
int du_int;
double du_double;
} du;
};
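Both styles of union initialization could be written as in this sketch. The structure is repeated so the block stands alone; du_first and du_second are hypothetical names:

```c
struct discriminated_union
{
    enum { DU_INT, DU_DOUBLE } discriminant;
    union
    {
        int du_int;
        double du_double;
    } du;
};

/* C89 style: only the first union member (du_int) can be initialized. */
static const struct discriminated_union du_first = { DU_INT, { 1 } };

/* C99 designated initializers can pick any member of the union. */
static const struct discriminated_union du_second =
    { .discriminant = DU_DOUBLE, .du = { .du_double = 3.14159 } };
```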
C11
Note that C11 allows you to use anonymous union members inside a structure, so that you don't
need the du name in the previous example:
struct discriminated_union
{
enum { DU_INT, DU_DOUBLE } discriminant;
union
{
int du_int;
double du_double;
};
};
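With the anonymous union, an initializer might look like the following sketch; du_anon and du_as_double() are hypothetical names:

```c
struct discriminated_union
{
    enum { DU_INT, DU_DOUBLE } discriminant;
    union
    {
        int du_int;
        double du_double;
    };
};

/* The union's members are named as if they belonged to the structure. */
static const struct discriminated_union du_anon =
    { .discriminant = DU_DOUBLE, .du_double = 3.14159 };

/* Hypothetical reader that honours the discriminant. */
double du_as_double(const struct discriminated_union *u)
{
    return u->discriminant == DU_INT ? (double)u->du_int : u->du_double;
}
```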
These constructs can be combined for arrays of structures containing elements that are arrays,
etc. Using full sets of braces ensures that the notation is unambiguous.
typedef struct Date Date; /* struct Date is defined earlier in this example */
struct date_range
{
Date dr_from;
Date dr_to;
char dr_what[80];
};
GCC provides an extension that allows you to specify a range of elements in an array that should
be given the same initializer:
The triple dots need to be separate from the numbers lest one of the dots be interpreted as part of
a floating point number (maximal munch rule).
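A sketch of the range notation, which is a GNU C extension and not ISO C, so it is only expected to compile with GCC-compatible compilers; the array name and the chosen values are assumptions:

```c
#include <stddef.h>

/* GNU C extension: elements 3 through 7 all get 29, and element 19
   gets 107, so the array has 20 elements. The spaces around "..."
   keep the dots from being read as part of a number. */
static const int ranged[] = { [3 ... 7] = 29, [19] = 107 };

/* Number of elements the compiler deduced for the array. */
size_t ranged_length(void)
{
    return sizeof ranged / sizeof ranged[0];
}
```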
Chapter 31: Inline assembly
Remarks
Inline assembly is the practice of adding assembly instructions in the middle of C source code. No
ISO C standard requires support of inline assembly. Since it is not required, the syntax for inline
assembly varies from compiler to compiler. Even though it is typically supported there are very few
reasons to use inline assembly and many reasons not to.
Pros
1. Performance By writing the specific assembly instructions for an operation, you can achieve
better performance than the assembly code generated by the compiler. Note that these
performance gains are rare. In most cases you can achieve better performance gains just by
rearranging your C code so the optimizer can do its job.
2. Hardware interface Device driver or processor startup code may need some assembly code
to access the correct registers and to guarantee certain operations occur in a specific order
with a specific delay between operations.
Cons
1. Compiler Portability Syntax for inline assembly is not guaranteed to be the same from one
compiler to another. If you are writing code with inline assembly that should be supported by
different compilers, use preprocessor macros (#ifdef) to check which compiler is being used.
Then, write a separate inline assembly section for each supported compiler.
2. Processor Portability You can't write inline assembly for an x86 processor and expect it to
work on an ARM processor. Inline assembly is intended to be written for a specific processor
or processor family. If you have inline assembly that you want supported on different
processors, use preprocessor macros to check which processor the code is being compiled
for and to select the appropriate assembly code section.
3. Future Performance Changes Inline assembly may be written expecting delays based
upon a certain processor clock speed. If the program is compiled for a processor with a
faster clock, the assembly code may not perform as expected.
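The portability advice above can be sketched as follows. The routine names swap32() and swap32_c() are hypothetical, the assembly uses GNU syntax, and only two processor families are covered; every other target falls back to plain C:

```c
#include <stdint.h>

/* Portable C version, used when no assembly variant is selected. */
uint32_t swap32_c(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Hypothetical byte-swap routine: the preprocessor picks one inline
   assembly variant per compiler/processor pair. */
uint32_t swap32(uint32_t x)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ ("bswap %0" : "+r" (x));               /* x86, GNU syntax */
    return x;
#elif defined(__GNUC__) && defined(__aarch64__)
    __asm__ ("rev %w0, %w1" : "=r" (x) : "r" (x)); /* 64-bit ARM */
    return x;
#else
    return swap32_c(x);                            /* any other target */
#endif
}
```

Keeping a plain C fallback means the code still builds on compilers and processors that were not anticipated, at the cost of maintaining several variants.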
Examples
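gcc Basic asm support
Basic assembly support with gcc has the following syntax:
asm [ volatile ] ( AssemblerInstructions )
where AssemblerInstructions is the direct assembly code for the given processor. The volatile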
keyword is optional and has no effect as gcc does not optimize code within a basic asm statement.
AssemblerInstructions can contain multiple assembly instructions. A basic asm statement is used if
you have an asm routine that must exist outside of a C function. The following example is from the
GCC manual:
/* Note that this code will not compile with -masm=intel */
#define DebugBreak() asm("int $3")
In this example, you could then use DebugBreak() in other places in your code and it will execute
the assembly instruction int $3. Note that even though gcc will not modify any code in a basic asm
statement, the optimizer may still move consecutive asm statements around. If you have multiple
assembly instructions that must occur in a specific order, include them in one asm statement.
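gcc Extended asm support
Extended asm support in gcc has the following syntax:
asm [volatile] ( AssemblerTemplate
: OutputOperands
[ : InputOperands
[ : Clobbers ] ])
asm [volatile] goto ( AssemblerTemplate
:
: InputOperands
: Clobbers
: GotoLabels)
where AssemblerTemplate is the template for the assembler instruction, OutputOperands are any C
variables that can be modified by the assembly code, InputOperands are any C variables used as
input parameters, Clobbers are a list of registers that are modified by the assembly code, and
GotoLabels are any goto statement labels that may be used in the assembly code.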
The extended format is used within C functions and is the more typical usage of inline assembly.
Below is an example from the Linux kernel for byte swapping 16-bit and 32-bit numbers for an
ARM processor:
#define __arch_swab32 __arch_swab32
#endif
Each asm section uses the variable x as its input and output parameter. The C function then
returns the manipulated result.
With the extended asm format, gcc may optimize the assembly instructions in an asm block
following the same rules it uses for optimizing C code. If you want your asm section to remain
untouched, use the volatile keyword for the asm section.
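As a minimal sketch of the extended format, the hypothetical helper add_asm() below adds two integers with a single x86 addl instruction. The function name and the portable fallback branch are assumptions made for illustration; on any processor other than x86 the preprocessor selects plain C instead, in line with the portability remarks above.

```c
/* Hypothetical helper: adds b to a, using x86 inline assembly when available. */
static int add_asm(int a, int b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* "+r" (a): a is both read and written, and lives in a register.
       "r" (b):  b is an input operand in a register.
       "cc":     the instruction modifies the condition flags. */
    __asm__ ("addl %1, %0" : "+r" (a) : "r" (b) : "cc");
    return a;
#else
    return a + b;   /* portable fallback for other processors and compilers */
#endif
}
```

Because the result is exposed through an output operand, the compiler remains free to optimize around the statement; adding volatile would only be needed if the instruction had side effects the compiler cannot see.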
We can put assembly instructions inside a macro and use the macro like you would call a function.
#define mov(x,y) \
{ \
__asm__ ("l.cmov %0,%1,%2" : "=r" (x) : "r" (y), "r" (0x0000000F)); \
}
/* Usage */
mov(state[0][1], sbox[si][sj]);
Using inline assembly instructions embedded in C code can improve the run time of a program.
This is very helpful in time-critical code such as cryptographic algorithms like AES. For
example, a simple shift operation needed in the AES algorithm, written in C with the shift
operator >>, can be replaced by a direct Rotate Right assembly instruction.
Three shift-and-assign expressions and one plain assignment in C can be replaced by a single
assembly Rotate Right operation.
__asm__ ("l.ror %0,%1,%2" : "=r" (* (unsigned int *) subkey) : "r" (w), "r" (0x10));
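For comparison, the rotation that the assembly instruction performs can be sketched in portable C. The helper name rotr32() is hypothetical and not part of the original example; many compilers recognize this shift-and-or pattern and emit a single rotate instruction on processors that have one.

```c
#include <stdint.h>

/* Hypothetical helper: rotates x right by n bit positions (n is taken modulo 32). */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31u;                                    /* keep the shift count in 0..31 */
    return (x >> n) | (x << ((32u - n) & 31u));  /* masking avoids a shift by 32 */
}
```

With this helper, rotr32(w, 0x10) swaps the upper and lower 16-bit halves of w, which is the effect of the rotate by 0x10 shown above.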
Read Inline assembly online: https://riptutorial.com/c/topic/4263/inline-assembly
Chapter 32: Inlining
Examples
Inlining functions used in more than one source file
For small functions that get called often, the overhead associated with the function call can be a
significant fraction of the total execution time of that function. One way of improving performance,
then, is to eliminate the overhead.
In this example we use four functions (plus main()) in three source files. Two of those (plusfive()
and timestwo()) each get called by the other two located in "source1.c" and "source2.c". The main()
is included so we have a working example.
main.c:
#include <stdio.h>
#include <stdlib.h>
#include "headerfile.h"
int main(void) {
int start = 3;
int intermediate = complicated1(start);
printf("First result is %d\n", intermediate);
intermediate = complicated2(start);
printf("Second result is %d\n", intermediate);
return 0;
}
source1.c:
#include <stdio.h>
#include <stdlib.h>
#include "headerfile.h"
source2.c:
#include <stdio.h>
#include <stdlib.h>
#include "headerfile.h"
return tmp;
}
headerfile.h:
#ifndef HEADERFILE_H
#define HEADERFILE_H
#endif
Functions timestwo and plusfive get called by both complicated1 and complicated2, which are in
different "translation units", or source files. In order to use them in this way, we have to define
them in the header.
We use the -O2 optimization option because some compilers don't inline without optimization
turned on.
The effect of the inline keyword is that the function symbol in question is not emitted into the
object file. Otherwise an error would occur in the last line, where we are linking the object files to
form the final executable. If we did not use inline, the same symbol would be defined in both
.o files, and a "multiply defined symbol" error would occur.
In situations where the symbol is actually needed, this has the disadvantage that the symbol is not
produced at all. There are two possibilities to deal with that. The first is to add an extra extern
declaration of the inlined functions in exactly one of the .c files. So add the following to source1.c:
The other possibility is to define the function with static inline instead of inline. This method has
the drawback that eventually a copy of the function in question may be produced in every object
file that is produced with this header.
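The arrangement described in this example can be condensed into one translation unit for experimentation. This is only a sketch: the function bodies are assumptions (doubling and adding five, as the names suggest), and in a real project the two inline definitions would live in headerfile.h while the extern declarations would appear in exactly one .c file.

```c
/* Would normally be in headerfile.h: inline definitions visible to every
   translation unit that includes the header. */
inline int timestwo(int input) { return input * 2; }
inline int plusfive(int input) { return input + 5; }

/* Would normally be in exactly one .c file: these declarations make this
   translation unit emit the external definitions of both functions. */
extern int timestwo(int input);
extern int plusfive(int input);

/* Assumed body of the function in source1.c. */
int complicated1(int input)
{
    int tmp = timestwo(input);
    tmp = plusfive(tmp);
    return tmp;
}

/* Assumed body of the function in source2.c. */
int complicated2(int input)
{
    int tmp = plusfive(input);
    tmp = timestwo(tmp);
    return tmp;
}
```

Without the two extern lines, a compiler that decides not to inline a call could leave an unresolved reference at link time, which is the situation the first possibility above addresses.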
Chapter 33: Interprocess Communication
(IPC)
Introduction
Inter-process communication (IPC) mechanisms allow different independent processes to
communicate with each other. Standard C does not provide any IPC mechanisms. Therefore, all
such mechanisms are defined by the host operating system. POSIX defines an extensive set of
IPC mechanisms; Windows defines another set; and other systems define their own variants.
Examples
Semaphores
Semaphores are used to synchronize operations between two or more processes. POSIX defines
two different sets of semaphore functions:
1. 'System V IPC' semaphores: semget(), semop(), semctl().
2. 'POSIX semaphores': sem_init(), sem_open(), sem_wait(), sem_post(), sem_close(), sem_unlink(), sem_destroy() and related functions.
This section describes the System V IPC semaphores, so called because they originated with Unix
System V.
First, you'll need to include the required headers. Old versions of POSIX required #include
<sys/types.h>; modern POSIX and most systems do not require it.
#include <sys/sem.h>
Then, you'll need to define a key in both the parent and the child.
#define KEY 0x1111 /* arbitrary example value */
This key needs to be the same in both programs or they will not refer to the same IPC structure.
There are ways to generate an agreed key without hard-coding its value.
Next, depending on your system's headers, you may or may not need to do this step: declare a union for
the purpose of semaphore operations.
union semun {
int val;
struct semid_ds *buf;
unsigned short *array;
};
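Next, define your try (semwait) and raise (semsignal) structures. The names P and V originate from
Dutch.
struct sembuf p = { .sem_num = 0, .sem_op = -1, .sem_flg = SEM_UNDO }; /* P: wait (decrement) */
struct sembuf v = { .sem_num = 0, .sem_op = +1, .sem_flg = SEM_UNDO }; /* V: signal (increment) */
Now, get the id of your IPC semaphore set and initialize the semaphore: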
int id;
// 2nd argument is number of semaphores
// 3rd argument is the mode (IPC_CREAT creates the semaphore set if needed)
if ((id = semget(KEY, 1, 0666 | IPC_CREAT)) < 0) {
/* error handling code */
}
union semun u;
u.val = 1;
/* SETVAL sets the value of the semaphore to the one given in the union u */
if (semctl(id, 0, SETVAL, u) < 0) {
/* error handling code */
}
Now, you can decrement or increment the semaphore as you need. At the start of your critical
section, you decrement the counter using the semop() function:
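Note that all of these functions return -1 on failure and set errno. On success, semop() returns 0,
semget() returns the semaphore set identifier, and the return value of semctl() depends on the command.
Not checking these return values can cause devastating problems.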
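The decrement and increment steps can be wrapped in small helpers. The sketch below is not the original example: the helper names (sem_create, sem_p, sem_v, sem_value, sem_remove) and the union tag sem_arg are hypothetical, and it uses IPC_PRIVATE instead of a shared key so that it runs on its own inside a single process.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Same layout as the semun union described above; a different tag avoids a
   clash on systems whose headers already define union semun. */
union sem_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Creates a private set with one semaphore set to 'initial'.
   Returns the set id, or -1 on failure. */
static int sem_create(int initial)
{
    int id = semget(IPC_PRIVATE, 1, 0600 | IPC_CREAT);
    if (id < 0)
        return -1;
    union sem_arg u;
    u.val = initial;
    if (semctl(id, 0, SETVAL, u) < 0) {
        semctl(id, 0, IPC_RMID);   /* do not leak the set on failure */
        return -1;
    }
    return id;
}

/* P: block until the semaphore is positive, then decrement it. */
static int sem_p(int id)
{
    struct sembuf op = { .sem_num = 0, .sem_op = -1, .sem_flg = SEM_UNDO };
    return semop(id, &op, 1);
}

/* V: increment the semaphore, waking a waiting process if there is one. */
static int sem_v(int id)
{
    struct sembuf op = { .sem_num = 0, .sem_op = 1, .sem_flg = SEM_UNDO };
    return semop(id, &op, 1);
}

/* Current value of the semaphore, or -1 on failure. */
static int sem_value(int id) { return semctl(id, 0, GETVAL); }

/* Removes the semaphore set from the system. */
static int sem_remove(int id) { return semctl(id, 0, IPC_RMID); }
```

A critical section is then bracketed by sem_p(id) and sem_v(id), checking both return values. System V semaphore sets outlive the process that created them, so the set should be removed once it is no longer needed.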
The below program will have a process fork a child and both parent and child attempt to print
characters onto the terminal without any synchronization.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
int main()
{
int pid;
pid = fork();
srand(pid);
if(pid < 0)
{
perror("fork"); exit(1);
}
else if(pid)
{
char *s = "abcdefgh";
int l = strlen(s);
for(int i = 0; i < l; ++i)
{
putchar(s[i]);
fflush(stdout);
sleep(rand() % 2);
putchar(s[i]);
fflush(stdout);
sleep(rand() % 2);
}
}
else
{
char *s = "ABCDEFGH";
int l = strlen(s);
for(int i = 0; i < l; ++i)
{
putchar(s[i]);
fflush(stdout);
sleep(rand() % 2);
putchar(s[i]);
fflush(stdout);
sleep(rand() % 2);
}
}
}
Output (1st run):
aAABaBCbCbDDcEEcddeFFGGHHeffgghh
(2nd run):
aabbccAABddBCeeCffgDDghEEhFFGGHH
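Compiling and running this program should give you a different output each time. The version below
adds a semaphore so that each process prints its pair of characters without being interrupted by the other: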
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
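#include <sys/sem.h>
#define KEY 0x1111 /* arbitrary key; every cooperating process must use the same value */
union semun {
int val;
struct semid_ds *buf;
unsigned short *array;
};
struct sembuf p = { .sem_num = 0, .sem_op = -1, .sem_flg = SEM_UNDO }; /* P: wait (decrement) */
struct sembuf v = { .sem_num = 0, .sem_op = +1, .sem_flg = SEM_UNDO }; /* V: signal (increment) */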
int main()
{
int id = semget(KEY, 1, 0666 | IPC_CREAT);
if(id < 0)
{
perror("semget"); exit(11);
}
union semun u;
u.val = 1;
if(semctl(id, 0, SETVAL, u) < 0)
{
perror("semctl"); exit(12);
}
int pid;
pid = fork();
srand(pid);
if(pid < 0)
{
perror("fork"); exit(1);
}
else if(pid)
{
char *s = "abcdefgh";
int l = strlen(s);
for(int i = 0; i < l; ++i)
{
if(semop(id, &p, 1) < 0)
{
perror("semop p"); exit(13);
}
putchar(s[i]);
fflush(stdout);
sleep(rand() % 2);
putchar(s[i]);
fflush(stdout);
if(semop(id, &v, 1) < 0)
{
perror("semop v"); exit(14);
}
sleep(rand() % 2);
}
}
else
{
char *s = "ABCDEFGH";
int l = strlen(s);
for(int i = 0; i < l; ++i)
{
if(semop(id, &p, 1) < 0)
{
perror("semop p"); exit(15);
}
putchar(s[i]);
fflush(stdout);
sleep(rand() % 2);
putchar(s[i]);
fflush(stdout);
if(semop(id, &v, 1) < 0)
{
perror("semop v"); exit(16);
}
sleep(rand() % 2);
}
}
}
Output:
aabbAABBCCccddeeDDffEEFFGGHHgghh
Compiling and running this program may still give you a different order each time, but the two
characters of each pair now always appear together.
Chapter 34: Iteration Statements/Loops: for,
while, do-while
Syntax
• /* all versions */
• for ([expression]; [expression]; [expression]) one_statement
• for ([expression]; [expression]; [expression]) { zero or several statements }
• while (expression) one_statement
• while (expression) { zero or several statements }
• do one_statement while (expression);
• do { one or more statements } while (expression);
• // since C99 in addition to the form above
• for (declaration; [expression]; [expression]) one_statement
• for (declaration; [expression]; [expression]) { zero or several statements }
Remarks
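Iteration statements (loops) fall into two categories: head-controlled loops (for and while), which
test their condition before each iteration, and foot-controlled loops (do-while), which test it after
each iteration.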
Examples
For loop
Loops are used to execute a block of code over and over again. The for loop
is to be used when a block of code is to be executed a fixed number of times. For example, in order
to fill an array of size n with user input, we need to execute scanf() n times.
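A minimal sketch of such a loop follows. The helper name read_values() and its FILE * parameter are assumptions made so the loop can be exercised on any stream; it uses fscanf(), which behaves like scanf() when the stream is stdin.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to n integers from 'in' into 'values'.
   Returns how many values were actually read. */
static size_t read_values(FILE *in, int *values, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (fscanf(in, "%d", &values[i]) != 1)
            break;          /* stop at end of input or on malformed data */
        count++;
    }
    return count;
}
```

For an array declared as int array[10], the call read_values(stdin, array, 10) runs the input call at most 10 times, although it is written only once.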
In this way the scanf() function call is executed n times (10 times in our example), but is written
only once.
Here, the variable i is the loop index, and it is best declared as presented. The type size_t (size
type) should be used for everything that counts or loops through data objects.
This way of declaring variables inside the for is only available for compilers that have been
updated to the C99 standard. If for some reason you are still stuck with an older compiler you can
declare the loop index before the for loop (for example, size_t i; followed by for (i = 0; i < n; i++) ...).
While loop
A while loop is used to execute a piece of code while a condition is true. The while loop is to be
used when a block of code is to be executed a variable number of times. For example the code
shown gets the user input, as long as the user inserts numbers which are not 0. If the user inserts
0, the while condition is not true anymore so execution will exit the loop and continue on to any
subsequent code:
int num = 1;
while (num != 0)
{
    if (scanf("%d", &num) != 1)
    {
        break; /* stop on end of input or on text that is not a number */
    }
}
Do-While loop
Unlike for and while loops, do-while loops check the truth of the condition at the end of the loop:
the do block executes once, and only then is the condition of the while checked at the
bottom of the block. This means that a do-while loop will always run at least once.
For example this do-while loop will get numbers from user, until the sum of these values is greater
than or equal to 50:
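int num = 0, sum = 0;
do
{
    if (scanf("%d", &num) != 1)
        break; /* stop on end of input or invalid input */
    sum += num;
} while (sum < 50);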
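In a for loop, the loop header has three expressions, all optional.
• The first expression, declaration-or-expression, initializes the loop. It is executed exactly
once, at the beginning of the loop. Since C99, it can be either a declaration and initialization
of a loop variable or a general expression; the scope of a variable declared here is restricted
to the for statement.
Historical versions of C only allowed an expression here, and the declaration of a loop variable
had to be placed before the for.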
• The second expression, expression2, is the test condition. It is first executed after the
initialization. If the condition is true, then the control enters the body of the loop. If not, it
shifts to outside the body of the loop at the end of the loop. Subsequently, this condition is
checked after each execution of the body as well as the update statement. When true, the
control moves back to the beginning of the body of the loop. The condition is usually
intended to be a check on the number of times the body of the loop executes. This is the
primary way of exiting a loop, the other way being using jump statements.
• The third expression, expression3, is the update statement. It is executed after each
execution of the body of the loop. It is often used to increment a variable keeping count of
the number of times the loop body has executed, and this variable is called an iterator.
Each instance of execution of the loop body is called an iteration.
Example:
C99
for (int i = 0; i < 10; i++)
{
    printf("%d", i);
}
Output:
0123456789
In the above example, first i = 0 is executed, initializing i. Then, the condition i < 10 is checked,
which evaluates to be true. The control enters the body of the loop and the value of i is printed.
Then, the control shifts to i++, updating the value of i from 0 to 1. Then, the condition is again
checked, and the process continues. This goes on till the value of i becomes 10. Then, the
condition i < 10 evaluates false, after which the control moves out of the loop.
Infinite Loops
A loop is said to be an infinite loop if the control enters but never leaves the body of the loop. This
happens when the test condition of the loop never evaluates to false.
Example:
C99
for (int i = 0; i >= 0; )
{
    /* body of the loop where i is not changed */
}
In the above example, the variable i, the iterator, is initialized to 0. The test condition is initially
true. However, i is not modified anywhere in the body and the update expression is empty.
Hence, i will remain 0, and the test condition will never evaluate to false, leading to an infinite
loop.
Assuming that there are no jump statements, another way an infinite loop might be formed is by
explicitly keeping the condition true:
while (true)
{
/* body of the loop */
}
In a for loop, the condition expression is optional. If it is omitted, the condition is treated as always true,
leading to an infinite loop.
for (;;)
{
/* body of the loop */
}
However, in certain cases, the condition might be kept true intentionally, with the intention of
exiting the loop using a jump statement such as break.
while (true)
{
/* statements */
if (condition)
{
/* more statements */
break;
}
}
Sometimes, the work of a loop cannot be entirely contained within the loop body, because
the loop needs to be primed by some statements B. The iteration then begins with some
statements A, which are followed by B again before looping.
do_B();
while (condition) {
do_A();
do_B();
}
To avoid potential cut/paste problems with repeating B twice in the code, Duff's Device could be
applied to start the loop from the middle of the while body, using a switch statement and fall
through behavior.
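The switch-based entry into the middle of the loop might be sketched as follows. Everything besides the switch/while construct is scaffolding invented for the demonstration: do_A() and do_B() only record that they were called, and the loop condition is a simple countdown.

```c
#include <stddef.h>

static char trace[64];      /* order in which do_A() and do_B() were called */
static size_t trace_len;
static int remaining;       /* the loop condition is "remaining > 0" */

static void record(char c)
{
    if (trace_len < sizeof trace - 1)
        trace[trace_len++] = c;
}

static void do_A(void) { record('A'); }
static void do_B(void) { record('B'); remaining--; }

/* Runs B once, then repeats "A, B" while the condition holds.
   Returns the recorded call order as a string. */
static const char *run(int n)
{
    trace_len = 0;
    remaining = n;
    switch (1) while (remaining > 0) {
    case 0:  do_A();   /* skipped on entry; FALL THROUGH on later iterations */
    default: do_B();   /* entry point: the switch jumps here first */
    }
    trace[trace_len] = '\0';
    return trace;
}
```

Here run(2) records the calls in the order B, A, B, the same order the primed version with a duplicated do_B() produces, while do_B() appears only once in the source.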
Duff's Device was actually invented to implement loop unrolling. Imagine applying a mask to a
block of memory, where n is a signed integral type with a positive value. A straightforward loop would be:
do {
*ptr++ ^= mask;
} while (--n > 0);
If n is known to be divisible by 4, the loop can be unrolled by hand:
do {
*ptr++ ^= mask;
*ptr++ ^= mask;
*ptr++ ^= mask;
*ptr++ ^= mask;
} while ((n -= 4) > 0);
But, with Duff's Device, the code can follow this unrolling idiom that jumps into the right place in
the middle of the loop if n is not divisible by 4.
switch (n % 4) do {
case 0: *ptr++ ^= mask; /* FALL THROUGH */
case 3: *ptr++ ^= mask; /* FALL THROUGH */
case 2: *ptr++ ^= mask; /* FALL THROUGH */
case 1: *ptr++ ^= mask; /* FALL THROUGH */
} while ((n -= 4) > 0);
This kind of manual unrolling is rarely required with modern compilers, since the compiler's
optimization engine can unroll loops on the programmer's behalf.
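To experiment with the unrolled form, the fragment can be placed in a complete function. The name xor_mask_duff() and the choice of unsigned char elements are assumptions made for this sketch; the switch and do-while structure is the one shown above.

```c
/* Hypothetical wrapper: XORs 'mask' into n bytes starting at ptr.
   n must be positive, as required in the text above. */
static void xor_mask_duff(unsigned char *ptr, int n, unsigned char mask)
{
    switch (n % 4) do {
    case 0: *ptr++ ^= mask; /* FALL THROUGH */
    case 3: *ptr++ ^= mask; /* FALL THROUGH */
    case 2: *ptr++ ^= mask; /* FALL THROUGH */
    case 1: *ptr++ ^= mask; /* FALL THROUGH */
    } while ((n -= 4) > 0);
}
```

For n = 7 the switch enters at case 3 and performs three operations, then one full pass of four follows, for seven in total; the bytes after the first seven are left untouched.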
Chapter 35: Jump Statements
Syntax
• return val; /* Returns from the current function. val can be a value of any type that converts
to the function's return type. */
• return; /* Returns from the current void-function. */
• break; /* Unconditionally jumps beyond the end ("breaks out") of an Iteration Statement
(loop) or out of the innermost switch statement. */
• continue; /* Unconditionally skips the rest of the current iteration of the innermost Iteration Statement (loop) and proceeds to its next iteration. */
• goto LBL; /* Jumps to label LBL. */
• LBL: statement /* any statement in the same function. */
Remarks
These are the jumps that are integrated into C by means of keywords.
C also has another jump construct, long jump, that is specified with a data type, jmp_buf, and C
library calls, setjmp and longjmp.
See also
Iteration Statements/Loops: for, while, do-while
Examples
Using goto to jump out of nested loops
Jumping out of nested loops would usually require use of a boolean variable with a check for this
variable in the loops. Supposing we are iterating over i and j, it could look like this
size_t i,j;
for (i = 0; i < myValue && !breakout_condition; ++i) {
for (j = 0; j < mySecondValue && !breakout_condition; ++j) {
... /* Do something, maybe modifying breakout_condition */
/* When breakout_condition == true the loops end */
}
}
But the C language offers the goto clause, which can be useful in this case. By using it with a label
declared after the loops, we can easily break out of the loops.
size_t i,j;
for (i = 0; i < myValue; ++i) {
for (j = 0; j < mySecondValue; ++j) {
...
if(breakout_condition)
goto final;
}
}
final:
However, often when this need comes up a return could be better used instead. This construct is
also considered "unstructured" in structural programming theory.
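Another situation where goto is useful is jumping to an error handler:
ptr = malloc(size);
if (!ptr)
goto out_of_memory;
/* normal processing */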
free(ptr);
return SUCCESS;
out_of_memory:
free(ptr); /* harmless, and necessary if we have further errors */
return FAILURE;
Use of goto keeps error flow separate from normal program control flow. It is however also
considered "unstructured" in the technical sense.
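A self-contained sketch of this error-handling pattern is shown below. The function copy_twice() and its behavior are invented for illustration; the point is the single cleanup label that every failure path jumps to.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: stores a newly allocated string containing 'text'
   twice in *out. Returns 0 on success, -1 if an allocation failed. */
static int copy_twice(const char *text, char **out)
{
    int status = -1;
    char *first = NULL;
    char *both = NULL;
    size_t len = strlen(text);

    first = malloc(len + 1);
    if (first == NULL)
        goto cleanup;
    memcpy(first, text, len + 1);

    both = malloc(2 * len + 1);
    if (both == NULL)
        goto cleanup;
    memcpy(both, first, len);
    memcpy(both + len, first, len + 1);

    *out = both;
    both = NULL;        /* ownership passed to the caller */
    status = 0;

cleanup:
    free(both);         /* free(NULL) is harmless */
    free(first);
    return status;
}
```

Each resource is released in exactly one place, so adding a further allocation only requires one more goto and one more free() at the label.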
Using return
Returning a value
One commonly used case: returning from main()
#include <stdlib.h> /* for EXIT_SUCCESS */
int main(void)
{
    /* Do stuff. */
    return EXIT_SUCCESS;
}
Additional notes:
1. For a function having a return type as void (not including void * or related types), the return
statement should not have any associated expression; i.e., the only allowed return statement
would be return;.
2. For a function having a non-void return type, the return statement shall not appear without
an expression.
3. For main() (and only for main()), an explicit return statement is not required (in C99 or later).
If the execution reaches the terminating }, an implicit value of 0 is returned. Some people
think omitting this return is bad practice; others actively suggest leaving it out.
Returning nothing
Returning from a void function
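As a minimal sketch, a void function can use a bare return; to leave early. The function log_message() and the counter lines_logged are hypothetical names chosen for this illustration.

```c
#include <stdio.h>

static int lines_logged = 0;    /* counts messages actually written */

/* Hypothetical logger: writes a message to stderr, ignoring NULL. */
static void log_message(const char *message)
{
    if (message == NULL)
        return;                 /* leave early: there is nothing to log */

    fprintf(stderr, "%s\n", message);
    lines_logged++;
    return;                     /* optional: falling off the end does the same */
}
```

The final return; is redundant, because reaching the closing brace of a void function returns to the caller as well.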
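Using break and continue
#include <stdlib.h> /* for EXIT_SUCCESS */
#include <stdio.h> /* for printf(), getchar() and fgetc() */
#include <ctype.h> /* for isdigit() */
void flush_input_stream(FILE *fp);
int main(void)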
{
int sum = 0;
printf("Enter digits to be summed up or 0 to exit:\n");
do
{
int c = getchar();
if (EOF == c)
{
printf("Read 'end-of-file', exiting!\n");
break;
}
if ('\n' != c)
{
flush_input_stream(stdin);
}
if (!isdigit(c))
{
printf("%c is not a digit! Start over!\n", c);
continue;
}
if ('0' == c)
{
printf("Exit requested.\n");
break;
}
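sum += c - '0';
printf("The current sum is %d.\n", sum);
} while (1);
return EXIT_SUCCESS;
}
void flush_input_stream(FILE *fp)
{
size_t i = 0;
int c;
/* Pull all characters up to and including the next new-line. */
while ((c = fgetc(fp)) != '\n' && c != EOF)
{
++i;
}
if (0 != i)
{
fprintf(stderr, "Flushed %zu characters from input.\n", i);
}
}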
Chapter 36: Linked lists
Remarks
The C language does not define a linked list data structure. If you are using C and need a linked
list, you either need to use a linked list from an existing library (such as GLib) or write your own
linked list interface. This topic shows examples for linked lists and doubly linked lists that can be
used as a starting point for writing your own linked lists.
Singly linked list
Data structure
struct singly_node
{
struct singly_node * next;
};
Doubly linked list
Data structure
struct doubly_node
{
struct doubly_node * prev;
struct doubly_node * next;
};
Topologies
Linear or open
Circular or ring
Procedures
Bind
Bind two nodes together.
void doubly_node_bind (struct doubly_node * prev, struct doubly_node * next)
{
prev->next = next;
next->prev = prev;
}
void doubly_node_make_empty_circularly_list (struct doubly_node * head)
{
doubly_node_bind (head, head);
}
Insertion
Let's assume an empty list always contains one node instead of NULL. Then insertion procedures
do not have to take NULL into consideration.
void doubly_node_insert_between
(struct doubly_node * prev, struct doubly_node * next, struct doubly_node * insertion)
{
doubly_node_bind (prev, insertion);
doubly_node_bind (insertion, next);
}
void doubly_node_insert_before
(struct doubly_node * tail, struct doubly_node * insertion)
{
doubly_node_insert_between (tail->prev, tail, insertion);
}
void doubly_node_insert_after
(struct doubly_node * head, struct doubly_node * insertion)
{
doubly_node_insert_between (head, head->next, insertion);
}
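The procedures above can be combined into one small, testable unit. This is a sketch under assumptions: doubly_node_bind() is written as the two-assignment linking step its callers imply, and doubly_node_count() is a hypothetical helper added only to inspect the result.

```c
#include <stddef.h>

struct doubly_node {
    struct doubly_node *prev;
    struct doubly_node *next;
};

/* Links two nodes so that 'next' directly follows 'prev'. */
static void doubly_node_bind(struct doubly_node *prev, struct doubly_node *next)
{
    prev->next = next;
    next->prev = prev;
}

/* An empty circular list is a head node bound to itself. */
static void doubly_node_make_empty_circularly_list(struct doubly_node *head)
{
    doubly_node_bind(head, head);
}

static void doubly_node_insert_between(struct doubly_node *prev,
                                       struct doubly_node *next,
                                       struct doubly_node *insertion)
{
    doubly_node_bind(prev, insertion);
    doubly_node_bind(insertion, next);
}

static void doubly_node_insert_before(struct doubly_node *tail,
                                      struct doubly_node *insertion)
{
    doubly_node_insert_between(tail->prev, tail, insertion);
}

static void doubly_node_insert_after(struct doubly_node *head,
                                     struct doubly_node *insertion)
{
    doubly_node_insert_between(head, head->next, insertion);
}

/* Hypothetical helper: number of nodes in the ring, not counting head. */
static size_t doubly_node_count(const struct doubly_node *head)
{
    size_t n = 0;
    for (const struct doubly_node *p = head->next; p != head; p = p->next)
        n++;
    return n;
}
```

Because the head node is always present, none of the insertion procedures needs a NULL check, which is the simplification the text describes.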
Examples
Inserting a node at the beginning of a singly linked list
The code below will prompt for numbers and continue to add them to the beginning of a linked list.
/* This program will demonstrate inserting a node at the beginning of a linked list */
#include <stdio.h>
#include <stdlib.h>
struct Node {
int data;
struct Node* next;
};
return 0;
}
void insert_node(struct Node **head, int nodeValue) {
struct Node *currentNode = malloc(sizeof *currentNode);
currentNode->data = nodeValue;
currentNode->next = (*head);
*head = currentNode;
}
1. The list is empty, so we need to add a new node. In which case, our memory looks like this
where HEAD is a pointer to the first node:
The line currentNode->next = *head; will set currentNode->next to NULL, since
head originally starts out with a value of NULL.
Now, we want to set our head node pointer to point to our current node.
----- -------------
|HEAD | --> |CURRENTNODE| --> NULL /* The head node points to the current node */
----- -------------
2. The list is already populated; we need to add a new node to the beginning. For the sake of
simplicity, let's start out with 1 node:
----- -----------
HEAD --> FIRST NODE --> NULL
----- -----------
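Both cases are handled by the same two assignments, as the following self-contained sketch shows. The int return value and the free_list() helper are additions for this illustration and are not part of the fragment above.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Inserts a new node holding 'nodeValue' at the beginning of the list.
   Returns 0 on success, -1 if the allocation failed. */
static int insert_node(struct Node **head, int nodeValue)
{
    struct Node *currentNode = malloc(sizeof *currentNode);
    if (currentNode == NULL)
        return -1;
    currentNode->data = nodeValue;
    currentNode->next = *head;   /* NULL for an empty list, else the old first node */
    *head = currentNode;         /* the new node becomes the first node */
    return 0;
}

/* Hypothetical helper: releases every node of the list. */
static void free_list(struct Node *head)
{
    while (head != NULL) {
        struct Node *next = head->next;  /* read the link before freeing the node */
        free(head);
        head = next;
    }
}
```

Inserting 1, 2 and 3 in that order yields the list 3, 2, 1, since each new node is placed in front of the previous ones.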
So far, we have looked at inserting a node at the beginning of a singly linked list. However, most of
the time you will want to be able to insert nodes elsewhere as well. The code written below
shows how it is possible to write an insert() function to insert nodes anywhere in a linked list.
#include <stdio.h>
#include <stdlib.h>
struct Node {
int data;
struct Node* next;
};
print_list(head);
return 0;
}
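struct Node* insert(struct Node* head, int value, size_t position)
{
size_t i = 0;
/* Create our node */
struct Node *currentNode = malloc(sizeof *currentNode);
if (currentNode == NULL) {
return head; /* allocation failed: leave the list unchanged */
}
/* Assign data */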
currentNode->data = value;
/* Holds a pointer to the 'next' field that we have to link to the new node.
By initializing it to &head we handle the case of insertion at the beginning. */
struct Node **nextForPosition = &head;
/* Iterate to get the 'next' field we are looking for.
Note: Insert at the end if position is larger than current number of elements. */
for (i = 0; i < position && *nextForPosition != NULL; i++) {
/* nextForPosition is pointing to the 'next' field of the node.
So *nextForPosition is a pointer to the next node.
Update it with a pointer to the 'next' field of the next node. */
nextForPosition = &(*nextForPosition)->next;
}
/* Here, we are taking the link to the next node (the one our newly inserted node should
point to) by dereferencing nextForPosition, which points to the 'next' field of the node
that is in the position we want to insert our node at.
We assign this link to our next value. */
currentNode->next = *nextForPosition;
/* Now, we want to correct the link of the node before the position of our
new node: it will be changed to be a pointer to our new node. */
*nextForPosition = currentNode;
return head;
}
Reversing a linked list
You can also reverse a list recursively, but this example uses an iterative
approach. Reversing is useful if you have inserted all of your nodes at the beginning of a linked list,
which leaves them in the reverse of insertion order.
Here is an example:
#include <stdio.h>
#include <stdlib.h>
#define NUM_ITEMS 10
struct Node {
int data;
struct Node *next;
};
int main(void) {
int i;
struct Node *head = NULL;
void insert_node(struct Node **headNode, int nodeValue, int position) {
int i;
struct Node *currentNode = (struct Node *)malloc(sizeof(struct Node));
struct Node *nodeBeforePosition = *headNode;
currentNode->data = nodeValue;
if(position == 1) {
currentNode->next = *headNode;
*headNode = currentNode;
return;
}
/* Walk to the node just before the requested (1-based) position */
for (i = 0; i < position - 2 && nodeBeforePosition->next != NULL; i++) {
nodeBeforePosition = nodeBeforePosition->next;
}
currentNode->next = nodeBeforePosition->next;
nodeBeforePosition->next = currentNode;
}
/* Iterator will be NULL by the end, so the last node will be stored in
previousNode. We will set the last node to be the headNode */
*headNode = previousNode;
}
Basically, the concept of reversing the linked list here is that we actually reverse the links
themselves. Each node's next member will become the node before it, like so:
Finally, the head should point to the 5th node instead, and each node should point to the node
before it.
Node 1 should point to NULL since there was nothing before it. Node 2 should point to node 1, node
3 should point to node 2, et cetera.
However, there is one small problem with this method. If we break the link to the next node and
change it to the previous node, we will not be able to traverse to the next node in the list since the
link to it is gone.
The solution to this problem is to simply store the next element in a variable (nextNode) before
changing the link.
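The idea of saving the next element before overwriting the link can be sketched as follows. The variable names follow the surrounding text (iterator, previousNode, nextNode); the exact body is an assumption consistent with the fragment shown earlier.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node *next;
};

/* Reverses the list in place by turning every link around. */
static void reverse_list(struct Node **headNode)
{
    struct Node *previousNode = NULL;
    struct Node *iterator = *headNode;

    while (iterator != NULL) {
        struct Node *nextNode = iterator->next;  /* save the link before it is lost */
        iterator->next = previousNode;           /* point back at the previous node */
        previousNode = iterator;
        iterator = nextNode;
    }
    /* iterator is NULL here, so previousNode is the old last node. */
    *headNode = previousNode;
}
```

The function visits each node once and needs no extra memory beyond three pointers; an empty list (a NULL head) is left unchanged.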
An example of code showing how nodes can be inserted into a doubly linked list, how the list can
easily be reversed, and how it can be printed in reverse.
#include <stdio.h>
#include <stdlib.h>
/* This data is not always stored in a structure, but it is sometimes for ease of use */
struct Node {
/* Sometimes a key is also stored and used in the functions */
int data;
struct Node* next;
struct Node* previous;
};
int main(void) {
/* Sometimes in a doubly linked list the last node is also stored */
struct Node *head = NULL;
printf("Insert a node at the beginning, and then print the list backwards\n");
insert_at_beginning(&head, 10);
print_list_backwards(head);
printf("Insert a node at the end, and then print the list forwards.\n");
insert_at_end(&head, 15);
print_list(head);
free_list(head);
return 0;
}
void print_list_backwards(struct Node *headNode) {
if (NULL == headNode)
{
return;
}
/*
Iterate through the list, and once we get to the end, iterate backwards to print
out the items in reverse order (this is done with the pointer to the previous node).
This can be done even more easily if a pointer to the last node is stored.
*/
struct Node *i = headNode;
while (i->next != NULL) {
i = i->next; /* Move to the end of the list */
}
while (i != NULL) {
printf("Value: %d\n", i->data);
i = i->previous;
}
}
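void insert_at_beginning(struct Node **pheadNode, int value)
{
    struct Node *currentNode;

    if (NULL == pheadNode)
    {
        return;
    }
    /*
    This is done similarly to how we insert a node at the beginning of a singly linked
    list; the difference is that we set the previous member of the structure as well.
    */
    currentNode = malloc(sizeof *currentNode);
    if (NULL == currentNode)
    {
        return; /* allocation failed: leave the list unchanged */
    }
    currentNode->next = NULL;
    currentNode->previous = NULL;
    currentNode->data = value;
    if (*pheadNode == NULL) /* The list is empty */
    {
        *pheadNode = currentNode;
        return;
    }
    currentNode->next = *pheadNode;
    (*pheadNode)->previous = currentNode;
    *pheadNode = currentNode;
}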
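void insert_at_end(struct Node **pheadNode, int value)
{
    struct Node *currentNode;
    struct Node *i;

    if (NULL == pheadNode)
    {
        return;
    }
    /*
    This can, again, be done easily because each node has a link to the previous element.
    It would be even more useful to have a pointer to the last node, which is commonly
    used.
    */
    currentNode = malloc(sizeof *currentNode);
    if (NULL == currentNode)
    {
        return; /* allocation failed: leave the list unchanged */
    }
    currentNode->data = value;
    currentNode->next = NULL;
    currentNode->previous = NULL;
    if (*pheadNode == NULL) {
        *pheadNode = currentNode;
        return;
    }
    i = *pheadNode;
    while (i->next != NULL) { /* Go to the end of the list */
        i = i->next;
    }
    i->next = currentNode;
    currentNode->previous = i;
}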
Note that sometimes, storing a pointer to the last node is useful (it is more efficient to simply be
able to jump straight to the end of the list than to need to iterate through to the end).
Sometimes, a key is also used to identify elements. It is simply a member of the Node structure:
struct Node {
int data;
int key;
struct Node* next;
struct Node* previous;
};
The key is then used when any tasks are performed on a specific element, like deleting elements.
Chapter 37: Literals for numbers, characters
and strings
Remarks
The term literal is commonly used to describe a sequence of characters in a C code that
designates a constant value such as a number (e.g. 0) or a string (e.g. "C"). Strictly speaking, the
standard uses the term constant for integer constants, floating constants, enumeration constants
and character constants, reserving the term 'literal' for string literals, but this is not common usage.
Literals can have prefixes or suffixes, which are extra characters that can start or
end a literal to change its default type or its representation. An integer literal can carry both, as in 0x10UL.
Examples
Integer literals
Integer literals are used to provide integral values. Three numerical bases are supported, indicated
by prefixes:
Base Prefix Example
Decimal None 5
Octal 0 0345
Hexadecimal 0x or 0X 0x12AB, 0X12AB, 0x12ab, 0x12Ab
Note that this notation doesn't include any sign, so integer literals are never negative. Something
like -1 is treated as an expression that has one integer literal (1) that is negated with a - operator.
The type of a decimal integer literal is the first data type that can fit the value from int and long.
Since C99, long long is also supported for very large literals.
The type of an octal or hexadecimal integer literal is the first data type that can fit the value from
int, unsigned, long, and unsigned long. Since C99, long long and unsigned long long are also
supported for very large literals.
Suffix Explanation
L, l long int
LL, ll (since C99) long long int
U, u unsigned
The U and L/LL suffixes can be combined in any order and case. It is an error to duplicate suffixes
(e.g. provide two U suffixes) even if they have different cases.
String literals
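String literals are used to specify arrays of characters. They are sequences of characters enclosed
within double quotes (e.g. "abcd") and have the type char[N] (an array of char that includes the
terminating null character), which converts to char * in most expressions.
The L prefix makes the literal a wide character array, with element type wchar_t. For example, L"abcd".
Since C11, other encoding prefixes are supported, similar to L:
Prefix Base type Encoding
None char platform dependent
L wchar_t platform dependent
u8 char UTF-8
u char16_t usually UTF-16
U char32_t usually UTF-32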
For the latter two, it can be queried with feature test macros if the encoding is effectively the
corresponding UTF encoding.
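Floating point literals
Floating point literals are used to represent signed real numbers. The following suffixes can be
used to specify type of a literal:
Suffix Type Examples
None double 3.1415926 3E6
f, F float 3.1415926f 2.1E-6F
l, L long double 3.1415926L 2.5E10L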
In order to use these suffixes, the literal must be a floating point literal. For example, 3f is an error,
since 3 is an integer literal, while 3.f or 3.0f are correct. For long double, the recommendation is to
always use capital L for the sake of readability.
Character literals
Character literals are a special type of integer literals that are used to represent one character.
They are enclosed in single quotes, e.g. 'a' and have the type int. The value of the literal is an
integer value according to the machine's character set. They do not allow suffixes.
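The types described in this chapter can be checked with the C11 _Generic selection. The enum literal_type, the macro LITERAL_TYPE and the function char_literal_is_int() below are hypothetical names introduced only for this sketch, which requires a C11 compiler.

```c
/* Hypothetical tags for the types a literal can have. */
enum literal_type {
    T_INT, T_UINT, T_LONG, T_ULONG, T_LLONG, T_ULLONG,
    T_FLOAT, T_DOUBLE, T_LDOUBLE, T_OTHER
};

/* Maps the type of an expression to one of the tags above (C11). */
#define LITERAL_TYPE(x) _Generic((x),   \
    int:                T_INT,          \
    unsigned int:       T_UINT,         \
    long:               T_LONG,         \
    unsigned long:      T_ULONG,        \
    long long:          T_LLONG,        \
    unsigned long long: T_ULLONG,       \
    float:              T_FLOAT,        \
    double:             T_DOUBLE,       \
    long double:        T_LDOUBLE,      \
    default:            T_OTHER)

/* Returns 1: in C, a character literal such as 'a' has type int, not char. */
static int char_literal_is_int(void)
{
    return LITERAL_TYPE('a') == T_INT;
}
```

For example, LITERAL_TYPE(5) selects T_INT, LITERAL_TYPE(5UL) selects T_ULONG, LITERAL_TYPE(3.0f) selects T_FLOAT and LITERAL_TYPE(3.0) selects T_DOUBLE.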
The L prefix before a character literal makes it a wide character of type wchar_t. Likewise since
C11 u and U prefixes make it wide characters of type char16_t and char32_t, respectively.
When intending to represent certain special characters, such as a character that is non-printing,
escape sequences are used. Escape sequences use a sequence of characters that are translated
into another character. All escape sequences consist of two or more characters, the first of which
is a backslash \. The characters immediately following the backslash determine what character
literal the sequence is interpreted as.
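Escape Sequence Represented Character
\a Alert (bell)
\b Backspace
\f Form feed
\n Line feed (newline)
\r Carriage return
\t Horizontal tab
\v Vertical tab
\\ Backslash
\' Single quotation mark
\" Double quotation mark
\? Question mark
\nnn Octal value
\xnn... Hexadecimal value
\unnnn Universal character name
\Unnnnnnnn Universal character name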
A universal character name is a Unicode code point. A universal character name may map to
more than one character. The digits n are interpreted as hexadecimal digits. Depending on the
UTF encoding in use, a universal character name sequence may result in a code point that
consists of multiple characters, instead of a single normal char character.
When using the line feed escape sequence in text mode I/O, it is converted to the OS-specific
newline byte or byte sequence.
The question mark escape sequence is used to avoid trigraphs. For example, ??/ is compiled as
the trigraph representing a backslash character '\', but using ?\?/ would result in the string "??/".
There may be one, two or three octal numerals n in the octal value escape sequence.
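As a rough illustration of the points above, the following sketch checks a few properties of character literals and escape sequences. The helper names are hypothetical and exist only for this example.

```c
#include <string.h>

/* Hypothetical helper: in C a character literal has type int,
 * so its size matches sizeof(int) rather than sizeof(char). */
int char_literal_has_int_size(void)
{
    return sizeof('a') == sizeof(int);
}

/* An octal and a hexadecimal escape that name the same value
 * (65 in both notations) compare equal. */
int octal_and_hex_escapes_agree(void)
{
    return '\101' == '\x41';
}

/* The escaped question mark keeps the two question marks from being
 * adjacent in the source, so no trigraph can be formed. The string
 * still holds three characters: '?', '?' and '/'. */
size_t question_mark_escape_length(void)
{
    return strlen("?\?/");
}
```

Each helper returns a plain value, so the claims can be checked by calling it and comparing the result (1, 1 and 3 respectively).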
Chapter 38: Memory management
Introduction
For managing dynamically allocated memory, the standard C library provides the functions
malloc(), calloc(), realloc() and free(). In C11 and later, there is also aligned_alloc(). Some
systems also provide alloca().
Syntax
• void *aligned_alloc(size_t alignment, size_t size); /* Only since C11 */
• void *calloc(size_t nelements, size_t size);
• void free(void *ptr);
• void *malloc(size_t size);
• void *realloc(void *ptr, size_t size);
• void *alloca(size_t size); /* from alloca.h, not standard, not portable, dangerous. */
Parameters
name description
size (malloc, realloc and aligned_alloc) total size of the memory in bytes. For aligned_alloc the size must be an integral multiple of alignment.
Remarks
C11
Systems such as those based on POSIX provide other ways of allocating aligned memory (e.g.
posix_memalign()), and also have other memory management options (e.g. mmap()).
Examples
Freeing Memory
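A minimal sketch of the situation discussed in this section: a block is allocated through a pointer p, released with free(), and p is then given a new, valid value. The function name free_then_repurpose is only illustrative.

```c
#include <stdlib.h>

/* Hypothetical demo function: returns 42 on success, -1 if malloc() fails. */
int free_then_repurpose(void)
{
    int *p = malloc(10 * sizeof *p); /* allocate room for ten ints */
    if (p == NULL)
    {
        return -1;
    }
    p[0] = 1;   /* the block may be used while it is allocated */

    free(p);    /* release the block */
    /* From here on, neither *p nor the value of p itself may be used
     * until a new value is stored into p. */

    int i = 42;
    p = &i;     /* re-purposing the pointer variable itself is fine */
    return *p;
}
```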
The memory pointed to by p is reclaimed (either by the libc implementation or by the underlying
OS) after the call to free(), so accessing that freed memory block via p will lead to undefined
behavior. Pointers that reference memory elements that have been freed are commonly called
dangling pointers, and present a security risk. Furthermore, the C standard states that even
accessing the value of a dangling pointer has undefined behavior. Note that the pointer p itself can
be re-purposed as shown above.
Please note that you can only call free() on pointers that have directly been returned from the
malloc(), calloc(), realloc() and aligned_alloc() functions, or where documentation tells you the
memory has been allocated that way (functions like strdup() are notable examples). Freeing any
other pointer, for example the address of a variable or a pointer into the middle of an allocated
block, is forbidden. Such an error will usually not be diagnosed by your compiler but will leave the
program execution in an undefined state.
There are two common strategies to prevent such instances of undefined behavior.
The first and preferable is simple - have p itself cease to exist when it is no longer needed, for
example:
if (something_is_needed())
{
    int *p = malloc(10 * sizeof *p); /* p exists only inside this block */
    /* ... check p for NULL, then use it ... */
    free(p);
}
By calling free() directly before the end of the containing block (i.e. the }), p itself ceases to exist.
The compiler will give a compilation error on any attempt to use p after that.
A second approach is to also invalidate the pointer itself after releasing the memory to which it
points:
free(p);
p = NULL; // you may also use 0 instead of NULL
• On many platforms, an attempt to dereference a null pointer will cause instant crash:
Segmentation fault. Here, we get at least a stack trace pointing to the variable that was used
after being freed.
Without setting the pointer to NULL we have a dangling pointer. The program will very likely still
crash, but later, because the memory to which the pointer points will silently be corrupted.
Such bugs are difficult to trace because they can result in a call stack that is completely
unrelated to the initial problem.
• It is safe to free a null pointer. The C Standard specifies that free(NULL) has no effect:
The free function causes the space pointed to by ptr to be deallocated, that is,
made available for further allocation. If ptr is a null pointer, no action occurs.
Otherwise, if the argument does not match a pointer earlier returned by the calloc
, malloc, or realloc function, or if the space has been deallocated by a call to free
or realloc, the behavior is undefined.
• Sometimes the first approach cannot be used (e.g. memory is allocated in one function, and
deallocated much later in a completely different function)
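The second approach can be wrapped in a small helper so that freeing and clearing always happen together. This is only a sketch: free_and_clear is a hypothetical name, and it is written for int pointers to keep the example short.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *pp and resets the caller's pointer to NULL,
 * so a later accidental use dereferences NULL instead of freed memory. */
void free_and_clear(int **pp)
{
    if (pp != NULL)
    {
        free(*pp);   /* free(NULL) is a no-op, so a second call is harmless */
        *pp = NULL;
    }
}
```

Because the helper receives the address of the pointer, it can overwrite the caller's variable, which a plain call to free(p) cannot do.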
Allocating Memory
Standard Allocation
The C dynamic memory allocation functions are defined in the <stdlib.h> header. If one wishes to
allocate memory space for an object dynamically, the following code can be used:
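A minimal sketch of such an allocation, wrapped in a hypothetical helper make_ten_ints for illustration (the fill loop is an addition of this example, not part of the allocation itself):

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocates room for ten ints and fills it with 0..9.
 * Returns NULL if the allocation fails; the caller must free() the result. */
int *make_ten_ints(void)
{
    int *p = malloc(10 * sizeof *p); /* ten objects of the type p points to */
    if (p == NULL)
    {
        perror("malloc() failed");
        return NULL;
    }
    for (int i = 0; i < 10; i++)
    {
        p[i] = i;
    }
    return p;
}
```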
This computes the number of bytes that ten ints occupy in memory, then requests that many
bytes from malloc and assigns the result (i.e., the starting address of the memory chunk that was
just created using malloc) to a pointer named p.
It is good practice to use sizeof to compute the amount of memory to request since the result of
sizeof is implementation defined (except for character types, which are char, signed char and
unsigned char, for which sizeof is defined to always give 1).
Because malloc might not be able to service the request, it might return a null pointer. It is
important to check for this to prevent later attempts to dereference the null pointer.
Memory dynamically allocated using malloc() may be resized using realloc() or, when no longer
needed, released using free().
Alternatively, declaring int array[10]; would allocate the same amount of memory. However, if it
is declared inside a function without the keyword static, it will only be usable within the function it
is declared in and the functions it calls (because the array will be allocated on the stack and the
space will be released for reuse when the function returns). Alternatively, if it is defined with static
inside a function, or if it is defined outside any function, then its lifetime is the lifetime of the
program. Pointers can also be returned from a function, however a function in C can not return an
array.
Zeroed Memory
The memory returned by malloc may not be initialized to a reasonable value, and care should be
taken to zero the memory with memset or to immediately copy a suitable value into it. Alternatively,
calloc returns a block of the desired size where all bits are initialized to 0. This need not be the
same as the representation of floating-point zero or a null pointer constant.
A note on calloc: Most (commonly used) implementations will optimise calloc() for performance, so it will be faster
than calling malloc(), then memset(), even though the net effect is identical.
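The two ways of obtaining zeroed memory can be sketched side by side. Both helper names are hypothetical; note that, unlike calloc(), the malloc() variant below does not check the multiplication for overflow.

```c
#include <stdlib.h>
#include <string.h>

/* Returns n zero-filled ints, or NULL on failure. */
int *zeroed_with_calloc(size_t n)
{
    return calloc(n, sizeof(int)); /* all bits of the block are set to 0 */
}

/* Same net effect, done by hand with malloc() followed by memset(). */
int *zeroed_with_memset(size_t n)
{
    int *p = malloc(n * sizeof *p);
    if (p != NULL)
    {
        memset(p, 0, n * sizeof *p);
    }
    return p;
}
```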
Aligned Memory
C11
C11 introduced a new function aligned_alloc() which allocates space with the given alignment. It
can be used if the memory to be allocated is needed to be aligned at certain boundaries which
can't be satisfied by malloc() or calloc(). malloc() and calloc() functions allocate memory that's
suitably aligned for any object type (i.e. the alignment is alignof(max_align_t)). But with
aligned_alloc() greater alignments can be requested.
The C11 standard imposes two restrictions: 1) the size (second argument) requested must be an
integral multiple of the alignment (first argument) and 2) the value of alignment should be a valid
alignment supported by the implementation. Failure to meet either of them results in undefined
behavior.
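A hedged sketch of a call that satisfies both restrictions. The helper names and the alignment value 64 are assumptions made for this example; whether 64 is a supported alignment depends on the implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: requests `count` blocks of 64 bytes, aligned to 64.
 * The size is a multiple of the alignment by construction. That 64 is a
 * supported alignment is an assumption, not a guarantee of the standard. */
void *alloc_cache_lines(size_t count)
{
    return aligned_alloc(64, count * 64);
}

/* Returns 1 if p is aligned to `alignment` bytes, 0 otherwise. The
 * conversion to uintptr_t is implementation-defined but widely usable. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

Memory obtained this way is released with free(), like any other allocation.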
Reallocating Memory
You may need to expand or shrink the storage a pointer refers to after you have allocated memory for
it. The void *realloc(void *ptr, size_t size) function deallocates the old object pointed to by ptr
and returns a pointer to an object that has the size specified by size. ptr is the pointer to a
memory block previously allocated with malloc, calloc or realloc (or a null pointer) to be
reallocated. As much of the original contents as fits in the new size is preserved. If the new size is
larger, any additional memory beyond the old size is uninitialized. If the new size is smaller, the
contents of the part that was cut off are lost. If ptr is NULL, a new block is allocated and a pointer to it is
returned by the function.
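#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    int *p = malloc(10 * sizeof *p);
    if (NULL == p)
    {
        perror("malloc() failed");
        return EXIT_FAILURE;
    }
    p[0] = 42;
    p[9] = 15;
    /* Reallocate to a larger size (the new element count is arbitrary),
     * storing the result in a temporary pointer in case realloc() fails. */
    {
        int *temporary = realloc(p, 1000 * sizeof *temporary);
        /* If realloc() failed, the original allocation is still valid. */
        if (NULL == temporary)
        {
            perror("realloc() failed");
            free(p); /* Clean up. */
            return EXIT_FAILURE;
        }
        p = temporary;
    }
    /* From here on, p can be used with the new size it was
     * realloc'ed to, until it is free'd. */
    free(p);
    return EXIT_SUCCESS;
}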
The reallocated object may or may not have the same address as *p. Therefore it is important to
capture the return value from realloc which contains the new address if the call is successful.
Make sure you assign the return value of realloc to a temporary instead of the original p. realloc
will return null in case of any failure, which would overwrite the pointer. This would lose your data
and create a memory leak.
C99
Since C99, C has variable length arrays (VLA), which model arrays with bounds that are only known
at initialization time. While you have to be careful not to allocate too large a VLA (it might smash
your stack), using pointers to VLA and using them in sizeof expressions is fine.
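A minimal sketch of this technique, assuming an implementation with VLA support. The helper fill_and_sum and the values it stores are made up for the example; only the declaration of matrix and the sizeof expression matter here.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates an n x m matrix in one block, stores
 * i * m + j in element [i][j], and returns the sum of all elements.
 * Returns -1.0 if the allocation fails. */
double fill_and_sum(size_t n, size_t m)
{
    double (*matrix)[m] = malloc(sizeof(double[n][m]));
    if (matrix == NULL)
    {
        return -1.0;
    }
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
    {
        for (size_t j = 0; j < m; j++)
        {
            matrix[i][j] = (double)(i * m + j);
            sum += matrix[i][j];
        }
    }
    free(matrix); /* one allocation, so one call to free() */
    return sum;
}
```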
Here matrix is a pointer to elements of type double[m], and the sizeof expression with double[n][m]
ensures that it contains space for n such elements.
All this space is allocated contiguously and can thus be deallocated by a single call to free.
The presence of VLA in the language also affects the possible declarations of arrays and pointers
in function headers. Now, a general integer expression is permitted inside the [] of array
parameters. For both functions (sumAll and main) the expressions in [] use parameters that have
been declared earlier in the parameter list. For sumAll these are the lengths that the user code
expects for the matrix. As for all array function parameters in C, the outermost (first) dimension is
rewritten to a pointer type, so a parameter with dimensions [n][m] is equivalent to a pointer to an
array of m elements.
That is, n is not really part of the function interface, but the information can be useful for
documentation and it could also be used by bounds checking compilers to warn about out-of-bounds
access.
Likewise, for main, the expression argc+1 is the minimal length that the C standard prescribes for the
argv argument.
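The declarations discussed here can be sketched as follows. Only the name sumAll comes from the text; the exact signature and the function body are assumptions made for this example.

```c
#include <stddef.h>

/* The lengths n and m are declared before the array parameter that uses them. */
double sumAll(size_t n, size_t m, double A[n][m]);

/* After adjustment of the array parameter, this declares the same function. */
double sumAll(size_t n, size_t m, double (*A)[m]);

/* Assumed body for illustration: adds up all n * m elements. */
double sumAll(size_t n, size_t m, double A[n][m])
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
    {
        for (size_t j = 0; j < m; j++)
        {
            total += A[i][j];
        }
    }
    return total;
}
```

In the same notation, main could be declared as int main(int argc, char *argv[argc+1]).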
Note that officially VLA support is optional in C11, but we know of no compiler that implements
C11 and that doesn't have them. You could test with the macro __STDC_NO_VLA__ if you must.
If the size of the space requested is zero, the behavior of realloc is implementation-defined. This
is similar for all memory allocation functions that receive a size parameter of value 0. Such
functions may in fact return a non-null pointer, but that must never be dereferenced.
This means realloc(ptr,0) may not really free/deallocate the memory, and thus it should never be
used as a replacement for free.
malloc() often calls underlying operating system functions to obtain pages of memory. But there is
nothing special about the function and it can be implemented in straight C by declaring a large
static array and allocating from it (there is a slight difficulty in ensuring correct alignment, in
practice aligning to 8 bytes is almost always adequate).
To implement a simple scheme, a control block is stored in the region of memory immediately
before the pointer to be returned from the call. This means that free() may be implemented by
subtracting from the returned pointer and reading off the control information, which is typically the
block size plus some information that allows it to be put back in the free list - a linked list of
unallocated blocks.
When the user requests an allocation, the free list is searched until a block of identical or larger
size to the amount requested is found, then if necessary it is split. This can lead to memory
fragmentation if the user is continually making many allocations and frees of unpredictable size
and at unpredictable intervals (not all real programs behave like that, the simple scheme is
often adequate for small programs).
Many programs require large numbers of allocations of small objects of the same size. This is very
easy to implement. Simply use a block with a next pointer. So if a block of 32 bytes is required:
union block
{
    union block *next;
    unsigned char payload[32];
};
static union block *head; /* first block of the free list */
void *block_alloc(void)
{
    void *answer = head;
    if (answer)
        head = head->next;
    return answer;
}
This scheme is extremely fast and efficient, and can be made generic with a certain loss of clarity.
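A fuller sketch of such a fixed-size pool, including the set-up of the free list and the matching release function. The names arena, block_init and block_free, and the pool size of 100 blocks, are assumptions made for this example.

```c
#include <stddef.h>

#define POOL_BLOCKS 100 /* hypothetical pool size */

union block
{
    union block *next;          /* used while the block is on the free list */
    unsigned char payload[32];  /* used while the block is handed out */
};

static union block arena[POOL_BLOCKS];
static union block *head;

/* Link every block of the arena into the free list. Call once before use. */
void block_init(void)
{
    for (size_t i = 0; i + 1 < POOL_BLOCKS; i++)
    {
        arena[i].next = &arena[i + 1];
    }
    arena[POOL_BLOCKS - 1].next = NULL;
    head = &arena[0];
}

/* Pop the first free block, or return NULL if the pool is exhausted. */
void *block_alloc(void)
{
    void *answer = head;
    if (answer)
    {
        head = head->next;
    }
    return answer;
}

/* Push a block obtained from block_alloc() back onto the free list. */
void block_free(void *ptr)
{
    union block *b = ptr;
    b->next = head;
    head = b;
}
```

Allocation and release are both constant-time list operations, and a block that was just released is the next one to be handed out.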
Caveat: alloca is only mentioned here for the sake of completeness. It is entirely non-portable (not
covered by any of the common standards) and has a number of potentially dangerous features
that make it un-safe for the unaware. Modern C code should replace it with Variable Length Arrays
(VLA).
Manual page
#include <alloca.h>
// the glibc version of stdlib.h includes alloca.h by default
alloca() allocates memory on the stack frame of the caller; the space referenced by the returned
pointer is automatically free'd when the caller function finishes.
While this function is convenient for automatic memory management, be aware that requesting
a large allocation could cause a stack overflow, and that you cannot use free with memory allocated
with alloca (which makes stack overflow even more of a concern).
For these reasons it is not recommended to use alloca inside a loop or a recursive function.
And because the memory is free'd upon function return you cannot return the pointer as a function
result (the behavior would be undefined).
Summary
Recommendation
C99
Modern alternative.
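A minimal sketch of a VLA used as a scratch buffer where alloca() might otherwise have been called. The function reverse_copy is a made-up example, and len is assumed to be greater than zero and small enough for the stack.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: copies `len` bytes of `src` into `dst` in reverse
 * order, using a scratch buffer whose size is only known at run time. */
void reverse_copy(char *dst, const char *src, size_t len)
{
    char scratch[len];           /* VLA: released when the block ends */
    memcpy(scratch, src, len);
    for (size_t i = 0; i < len; i++)
    {
        dst[i] = scratch[len - 1 - i];
    }
}
```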
This works where alloca() does, and works in places where alloca() doesn't (inside loops, for
example). It does assume either a C99 implementation or a C11 implementation that does not
define __STDC_NO_VLA__.
Chapter 39: Multi-Character Character Sequence
Remarks
Not all preprocessors support trigraph sequence processing. Some compilers give an extra option
or switch for processing them. Others use a separate program to convert trigraphs.
The GCC compiler does not recognize them unless you explicitly request it to do so (use
-trigraphs to enable them; use -Wtrigraphs, part of -Wall, to get warnings about trigraphs).
As most platforms in use today support the full range of single characters used in C, digraphs are
preferred over trigraphs but the use of any multi-character character sequences is generally
discouraged.
Examples
Trigraphs
The symbols [ ] { } ^ \ | ~ # are frequently used in C programs, but in the late 1980s, there
were code sets in use (ISO 646 variants, for example, in Scandinavian countries) where the ASCII
character positions for these were used for national language variant characters (e.g. £ for # in the
UK; Æ Å æ å ø Ø for [ ] { } | \ in Denmark; there was no ~ in EBCDIC). This meant that it was
hard to write C code on machines that used these sets.
To solve this problem, the C standard suggested the use of combinations of three characters to
produce a single character called a trigraph. A trigraph is a sequence of three characters, the first
two of which are question marks.
The following is a simple example that uses trigraph sequences instead of #, { and }:
??=include <stdio.h>
int main()
??<
printf("Hello World!\n");
??>
This will be changed by the C preprocessor by replacing the trigraphs with their single-character
equivalents as if the code had been written:
#include <stdio.h>
int main()
{
printf("Hello World!\n");
}
Trigraph Equivalent
??= #
??/ \
??' ^
??( [
??) ]
??! |
??< {
??> }
??- ~
Note that trigraphs are problematic because, for example, ??/ is a backslash and can affect the
meaning of continuation lines in comments, and have to be recognized inside strings and
character literals (e.g. '??/??/' is a single character, a backslash).
Digraphs
C99
In 1994 more readable alternatives to five of the trigraphs were supplied. These use only two
characters and are known as digraphs. Unlike trigraphs, digraphs are tokens. If a digraph occurs
in another token (e.g. string literals or character constants) then it will not be treated as a digraph,
but remain as it is.
The following shows the difference before and after processing the digraphs sequence.
#include <stdio.h>
int main()
<%
printf("Hello %> World!\n"); /* Note that the string contains a digraph */
%>
After the digraphs are processed, this is equivalent to:
#include <stdio.h>
int main()
{
printf("Hello %> World!\n"); /* Note the unchanged digraph within the string. */
}
Digraph Equivalent
<: [
:> ]
<% {
%> }
%: #
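A small sketch that uses the digraphs from the table in real declarations. The function names are hypothetical, and the example assumes a compiler that accepts digraphs in its default mode.

```c
/* Hypothetical function written with digraphs instead of [ ] { }. */
int second_element(void)
<%
    int values<:3:> = <% 10, 20, 30 %>;
    return values<:1:>;
%>

/* Inside a string literal the same characters are left untouched. */
const char *not_a_digraph(void)
{
    return "<%";
}
```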
Chapter 40: Multithreading
Introduction
C11 defines a standard thread library, <threads.h>, but it is an optional feature and not every
implementation provides it (one that defines __STDC_NO_THREADS__ does not). Where it is missing, you
must use platform specific implementations such as the POSIX threads library (often referred to as
pthreads) using the pthread.h header.
Syntax
• thrd_t // Implementation-defined complete object type identifying a thread
• int thrd_create( thrd_t *thr, thrd_start_t func, void *arg ); // Creates a thread
• int thrd_equal( thrd_t thr0, thrd_t thr1 ); // Checks if arguments refer to the same thread
• thrd_t thrd_current(void); // Returns identifier of the thread that calls it
• int thrd_sleep( const struct timespec *duration, struct timespec *remaining ); // Suspends execution of the calling thread for at least a given time
• void thrd_yield(void); // Permits other threads to run instead of the thread that calls it
• _Noreturn void thrd_exit( int res ); // Terminates the thread that calls it
• int thrd_detach( thrd_t thr ); // Detaches a given thread from the current environment
• int thrd_join( thrd_t thr, int *res ); // Blocks the current thread until the given thread finishes
Remarks
Using threads can introduce extra undefined behavior such as a data race
(http://www.riptutorial.com/c/example/2622/data-race). For race-free access to variables that are
shared between different threads C11 provides the mtx_lock() mutex functionality or the (optional)
atomic (http://www.riptutorial.com/c/topic/4924/atomics) data-types and associated functions in stdatomic.h.
Examples
C11 Threads simple example
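#include <threads.h>
#include <stdio.h>
int run(void *arg)
{
    printf("Hello world of C11 threads.\n"); /* thread body shown for illustration */
    return 0;
}
int main(void)
{
    thrd_t thread;
    int result;
    thrd_create(&thread, run, NULL);
    thrd_join(thread, &result); /* thrd_join() takes the thrd_t by value */
    return result;
}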
Chapter 41: Operators
Introduction
An operator in a programming language is a symbol that tells the compiler or interpreter to perform
a specific mathematical, relational or logical operation and produce a final result.
C has many powerful operators. Many C operators are binary operators, which means they have
two operands. For example, in a / b, / is a binary operator that accepts two operands (a, b). There
are some unary operators which take one operand (for example: ~, ++), and only one ternary
operator ? :.
Syntax
• expr1 operator
• operator expr2
• expr1 operator expr2
• expr1 ? expr2 : expr3
Remarks
Operators have an arity, a precedence and an associativity.
• Arity indicates the number of operands. In C, three different operator arities exist:
○ Unary (1 operand)
○ Binary (2 operands)
○ Ternary (3 operands)
• Precedence indicates which operators "bind" first to their operands. That is, which operator
has priority to operate on its operands. For instance, the C language obeys the convention
that multiplication and division have precedence over addition and subtraction:
a * b + c
gives the same result as
(a * b) + c
If this is not what was wanted, precedence can be forced using parentheses, because they
have the highest precedence of all operators.
a * (b + c)
This new expression will produce a result that differs from the previous two expressions.
The C language has many precedence levels; A table is given below of all operators, in
descending order of precedence.
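Precedence Table
Operators Associativity
() [] -> . ++ -- (postfix) left to right
! ~ ++ -- (prefix) + - (unary) * (dereference) & (address-of) (type) sizeof right to left
* (multiplication) / % left to right
+ - left to right
<< >> left to right
< <= > >= left to right
== != left to right
& (bitwise AND) left to right
^ left to right
| left to right
&& left to right
|| left to right
?: right to left
= += -= *= /= %= &= ^= |= <<= >>= right to left
, left to right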
• Associativity indicates how equal-precedence operators bind by default, and there are two
kinds: Left-to-Right and Right-to-Left. An example of Left-to-Right binding is the subtraction
operator (-). The expression
a - b - c - d
gives the same result as
((a - b) - c) - d
because the leftmost - binds first to its two operands.
An example of Right-to-Left binding involves the dereference * and post-increment ++
operators. If they are used in an expression such as
* ptr ++
, this is equivalent to
* (ptr ++)
because the rightmost, unary operator (++) binds first to its single operand. (Strictly, the C
grammar gives postfix ++ a higher precedence than the prefix operators, which leads to the same
grouping.)
Examples
Relational Operators
Relational operators check if a specific relation between two operands is true. The result is
evaluated to 1 (which means true) or 0 (which means false). This result is often used to affect
control flow (via if, while, for), but can also be stored in variables.
Equals "=="
Checks whether the supplied operands are equal.
1 == 0; /* evaluates to 0. */
1 == 1; /* evaluates to 1. */
int x = 5;
int y = 5;
int *xptr = &x, *yptr = &y;
xptr == yptr; /* evaluates to 0, the operands hold different location addresses. */
*xptr == *yptr; /* evaluates to 1, the operands point at locations that hold the same value.
*/
Attention: This operator should not be confused with the assignment operator (=)!
Not equals "!="
Checks whether the supplied operands are not equal.
1 != 0; /* evaluates to 1. */
1 != 1; /* evaluates to 0. */
int x = 5;
int y = 5;
int *xptr = &x, *yptr = &y;
xptr != yptr; /* evaluates to 1, the operands hold different location addresses. */
*xptr != *yptr; /* evaluates to 0, the operands point at locations that hold the same value.
*/
This operator effectively returns the opposite result to that of the equals (==) operator.
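Not "!"
Checks whether an object is equal to 0.
!someVal
This is effectively the same as:
someVal == 0
Greater than ">"
Checks whether the left hand operand has a greater value than the right hand operand.
5 > 4 /* evaluates to 1. */
4 > 5 /* evaluates to 0. */
4 > 4 /* evaluates to 0. */
Less than "<"
Checks whether the left hand operand has a smaller value than the right hand operand.
5 < 4 /* evaluates to 0. */
4 < 5 /* evaluates to 1. */
4 < 4 /* evaluates to 0. */
Greater than or equal ">="
Checks whether the left hand operand has a value greater than or equal to the right hand operand.
5 >= 4 /* evaluates to 1. */
4 >= 5 /* evaluates to 0. */
4 >= 4 /* evaluates to 1. */
Less than or equal "<="
Checks whether the left hand operand has a value smaller than or equal to the right hand operand.
5 <= 4 /* evaluates to 0. */
4 <= 5 /* evaluates to 1. */
4 <= 4 /* evaluates to 1. */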
Assignment Operators
Assigns the value of the right-hand operand to the storage location named by the left-hand
operand, and returns the value.
a += b /* equal to: a = a + b */
a -= b /* equal to: a = a - b */
a *= b /* equal to: a = a * b */
a /= b /* equal to: a = a / b */
a %= b /* equal to: a = a % b */
a &= b /* equal to: a = a & b */
a |= b /* equal to: a = a | b */
a ^= b /* equal to: a = a ^ b */
a <<= b /* equal to: a = a << b */
a >>= b /* equal to: a = a >> b */
One important feature of these compound assignments is that the expression on the left hand side
(a) is only evaluated once. E.g. if p is a pointer,
*p += 27;
dereferences p only once, whereas the following does so twice:
*p = *p + 27;
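The difference becomes visible when the left-hand expression has a side effect. In the following sketch all names (calls, slots, next_index and the two demo functions) are hypothetical; each demo function returns how often next_index() was called.

```c
/* Hypothetical state for illustration. */
static int calls = 0;
static int slots[4] = {0, 0, 0, 0};

/* Returns the next slot index and counts how often it was called. */
static int next_index(void)
{
    return calls++ % 4;
}

/* The operand slots[next_index()] is evaluated once: next_index() runs
 * a single time, and the same slot is both read and written. */
int calls_with_compound(void)
{
    calls = 0;
    slots[next_index()] += 27;
    return calls;
}

/* Spelling the operation out calls next_index() twice, so the slot that
 * is read and the slot that is written can differ. */
int calls_when_spelled_out(void)
{
    calls = 0;
    slots[next_index()] = slots[next_index()] + 27;
    return calls;
}
```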
It should also be noted that the result of an assignment such as a = b is what is known as an
rvalue. Thus, the assignment actually has a value which can then be assigned to another variable.
This allows the chaining of assignments to set multiple variables in a single statement.
This rvalue can be used in the controlling expressions of if statements (or loops or switch
statements) that guard some code on the result of another expression or function call. For
example:
char *buffer;
if ((buffer = malloc(1024)) != NULL)
{
/* do something with buffer */
free(buffer);
}
else
{
/* report allocation failure */
}
Because of this, care must be taken to avoid a common typo which can lead to mysterious bugs.
int a = 2;
/* ... */
if (a = 1)
/* Delete all files on my hard drive */
This will have disastrous results, as a = 1 will always evaluate to 1 and thus the controlling
expression of the if statement will always be true. The
author almost certainly meant to use the equality operator (==) as shown below:
int a = 2;
/* ... */
if (a == 1)
/* Delete all files on my hard drive */
Operator Associativity
int a, b = 1, c = 2;
a = b = c;
This assigns c to b, which returns b, which is then assigned to a. This happens because all
assignment operators have right associativity, which means the rightmost operation in the
expression is evaluated first, and evaluation proceeds from right to left.
Arithmetic Operators
Basic Arithmetic
Return a value that is the result of applying the left hand operand to the right hand operand, using
the associated mathematical operation. Normal mathematical rules of commutation apply (i.e.
addition and multiplication are commutative, subtraction, division and modulus are not).
Addition Operator
The addition operator (+) is used to add two operands together. Example:
#include <stdio.h>
int main(void)
{
    int a = 5;
    int b = 7;
    int c = a + b; /* c now holds the value 12 */
    printf("%d + %d = %d\n", a, b, c); /* Will output "5 + 7 = 12" */
    return 0;
}
Subtraction Operator
The subtraction operator (-) is used to subtract the second operand from the first. Example:
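#include <stdio.h>
int main(void)
{
    int a = 10;
    int b = 7;
    int c = a - b; /* c now holds the value 3 */
    printf("%d - %d = %d\n", a, b, c); /* Will output "10 - 7 = 3" */
    return 0;
}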
Multiplication Operator
The multiplication operator (*) is used to multiply both operands. Example:
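#include <stdio.h>
int main(void)
{
    int a = 5;
    int b = 7;
    int c = a * b; /* c now holds the value 35 */
    printf("%d * %d = %d\n", a, b, c); /* Will output "5 * 7 = 35" */
    return 0;
}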
Division Operator
The division operator (/) divides the first operand by the second. If both operands of the division
are integers, it will return an integer value and discard the remainder (use the modulo operator %
for calculating and acquiring the remainder).
If one of the operands is a floating point value, the result is an approximation of the fraction.
Example:
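#include <stdio.h>
int main(void)
{
    printf("%d\n", 19 / 2);   /* integer division, will output "9" */
    printf("%f\n", 19.0 / 2); /* floating point division, will output "9.500000" */
    return 0;
}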
Modulo Operator
The modulo operator (%) receives integer operands only, and is used to calculate the remainder
after the first operand is divided by the second. Example:
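#include <stdio.h>
int main(void)
{
    printf("%d\n", 19 % 2);  /* will output "1" */
    printf("%d\n", -19 % 2); /* will output "-1": since C99 the result has the sign of the first operand */
    return 0;
}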
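Increment and Decrement Operators
The increment (a++) and decrement (a--) operators change the value of the variable they are applied to without an assignment operator.
#include <stdio.h>
int main(void)
{
    int a = 1;
    int b = 4;
    int c = 1;
    int d = 4;
    a++;
    printf("a = %d\n",a); /* Will output "a = 2" */
    b--;
    printf("b = %d\n",b); /* Will output "b = 3" */
    printf("++c = %d\n", ++c); /* Prefix: c is incremented first; will output "++c = 2" */
    printf("d-- = %d\n", d--); /* Postfix: the old value is returned; will output "d-- = 4" */
    return 0;
}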
As the example for c and d shows, both operators have two forms, as prefix notation and postfix
notation. Both have the same effect in incrementing (++) or decrementing (--) the variable, but
differ by the value they return: prefix operations do the operation first and then return the value,
whereas postfix operations first determine the value that is to be returned, and then do the
operation.
Logical Operators
Logical AND
Performs a logical boolean AND-ing of the two operands, returning 1 if both of the operands are
non-zero and 0 otherwise. The result of the logical AND operator is of type int.
0 && 0 /* Returns 0. */
0 && 1 /* Returns 0. */
2 && 0 /* Returns 0. */
2 && 3 /* Returns 1. */
Logical OR
Performs a logical boolean OR-ing of the two operands, returning 1 if any of the operands is
non-zero and 0 otherwise. The result of the logical OR operator is of type int.
0 || 0 /* Returns 0. */
0 || 1 /* Returns 1. */
2 || 0 /* Returns 1. */
2 || 3 /* Returns 1. */
Logical NOT
Performs a logical negation. The result of the logical NOT operator is of type int. The NOT operator
yields 0 if its operand compares unequal to 0, and 1 otherwise.
!1 /* Returns 0. */
!5 /* Returns 0. */
!0 /* Returns 1. */
Short-Circuit Evaluation
There are some crucial properties common to both && and ||:
• the left-hand operand (LHS) is fully evaluated before the right-hand operand (RHS) is
evaluated at all,
• there is a sequence point between the evaluation of the left-hand operand and the right-hand
operand,
• and, most importantly, the right-hand operand is not evaluated at all if the result of the left-
hand operand determines the overall result.
This means that:
• if the LHS evaluates to 'true' (non-zero), the RHS of || will not be evaluated (because the
result of 'true OR anything' is 'true'),
• if the LHS evaluates to 'false' (zero), the RHS of && will not be evaluated (because the result
of 'false AND anything' is 'false').
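A minimal sketch of a function that relies on this behaviour to guard an array access. The lookup table and the fallback string "unknown" are made up for the example; only the shape of the condition matters.

```c
/* Hypothetical lookup table, used only for illustration. */
static const char *const names[] = { "zero", "one", "two" };
enum { NUM_NAMES = sizeof names / sizeof names[0] };

/* The range tests are evaluated left to right; names[value] is only
 * reached when both of them have succeeded. */
const char *name_for_value(int value)
{
    if (value >= 0 && value < NUM_NAMES)
    {
        return names[value];
    }
    return "unknown";
}
```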
If a negative value is passed to the function, the value >= 0 term evaluates to false and the value <
NUM_NAMES term is not evaluated.
Increment / Decrement
The increment and decrement operators exist in prefix and postfix form.
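int a = 1;
int b = 1;
int tmp = 0;
tmp = ++a; /* increments a by one and returns the new value; a == 2, tmp == 2 */
tmp = a++; /* increments a by one but returns the old value; a == 3, tmp == 2 */
tmp = --b; /* decrements b by one and returns the new value; b == 0, tmp == 0 */
tmp = b--; /* decrements b by one but returns the old value; b == -1, tmp == 0 */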
Note that arithmetic operations do not introduce sequence points, so certain expressions with ++ or
-- operators may introduce undefined behaviour.
Conditional Operator
Evaluates its first operand, and, if the resulting value is not equal to zero, evaluates its second
operand. Otherwise, it evaluates its third operand, as shown in the following example:
a = b ? c : d;
is equivalent to:
if (b)
a = c;
else
a = d;
int x = 5;
int y = 42;
printf("%i, %i\n", 1 ? x : y, 0 ? x : y); /* Outputs "5, 42" */
The conditional operator can be nested. For example, the following code determines the bigger of
three numbers:
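A minimal sketch of such a nested expression (the function name max_of_three is only illustrative):

```c
/* Hypothetical helper: returns the largest of a, b and c. */
int max_of_three(int a, int b, int c)
{
    return a > b ? (a > c ? a : c)
                 : (b > c ? b : c);
}
```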
The following example writes even integers to one file and odd integers to another file:
#include <stdio.h>
int main(void)
{
    FILE *even, *odds;
    int n = 10;
    int k = 0;
    /* The file names are illustrative. */
    even = fopen("even.txt", "w");
    odds = fopen("odds.txt", "w");
    if (even == NULL || odds == NULL)
    {
        perror("fopen() failed");
        return 1;
    }
    for (k = 1; k < n + 1; k++)
    {
        k % 2 == 0 ? fprintf(even, "\t%5d\n", k)
                   : fprintf(odds, "\t%5d\n", k);
    }
    fclose(even);
    fclose(odds);
    return 0;
}
The conditional operator associates from right to left. Consider the following:
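As a hedged illustration, the hypothetical helper sign_of below chains two conditional operators without parentheses:

```c
/* Hypothetical helper: returns 1, -1 or 0 according to the sign of x.
 * Without parentheses the expression groups from the right, i.e. as
 * x > 0 ? 1 : (x < 0 ? -1 : 0). */
int sign_of(int x)
{
    return x > 0 ? 1 : x < 0 ? -1 : 0;
}
```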
Comma Operator
Evaluates its left operand, discards the resulting value, and then evaluates its right operand;
the result is the value of the right operand.
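A short sketch of this behaviour; the helper double_then_yield is a made-up name for the example.

```c
/* Hypothetical helper: doubles *x as a side effect and yields y.
 * The left operand (*x *= 2) is evaluated first and its value discarded;
 * the value of the whole parenthesised expression is that of y. */
int double_then_yield(int *x, int y)
{
    return (*x *= 2, y);
}
```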
Note that the comma used in functions calls that separate arguments is NOT the comma operator,
rather it's called a separator which is different from the comma operator. Hence, it doesn't have
the properties of the comma operator.
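A call such as printf("%i\n", (x *= 2, y)), with int variables x and y, contains both: the comma after the format string is a separator, while the comma inside the inner parentheses is the comma operator.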
The comma operator is often used in the initialization section as well as in the updating section of
a for loop. For example:
for(k = 1; k < 10; printf("%d\n", k), k += 2); /* outputs the odd numbers below 10 */
Cast Operator
Performs an explicit conversion into the given type from the value resulting from evaluating the
given expression.
int x = 3;
int y = 4;
printf("%f\n", (double)x / y); /* Outputs "0.750000". */
Here the value of x is converted to a double, the division promotes the value of y to double, too, and
the result of the division, a double is passed to printf for printing.
sizeof Operator
printf("%zu\n", sizeof(int)); /* Valid, outputs the size of an int object, which is platform-
dependent. */
printf("%zu\n", sizeof int); /* Invalid, types as arguments need to be surrounded by
parentheses! */
char ch = 'a';
printf("%zu\n", sizeof(ch)); /* Valid, will output the size of a char object, which is always
1 for all platforms. */
printf("%zu\n", sizeof ch); /* Valid, will output the size of a char object, which is always
1 for all platforms. */
Pointer Arithmetic
Pointer addition
Given a pointer and an integer N, evaluates into a pointer to the Nth element of the pointed-to
type that directly succeeds the pointed-to object in memory.
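A minimal sketch of pointer addition; element_at is a hypothetical helper, and the caller is assumed to pass an index that lies inside the array.

```c
/* Hypothetical helper: reads element n of an int array via pointer
 * arithmetic; the + operator accepts its operands in either order. */
int element_at(const int *base, int n)
{
    const int *a = base + n;  /* pointer + integer */
    const int *b = n + base;  /* integer + pointer: same result */
    return (a == b) ? *a : -1;
}
```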
It does not matter which operand is the pointer and which is the integer. This means that
things such as 3 + arr are valid. If arr[k] is the (k+1)th member of an array, then arr+k is a pointer to
arr[k]. In other words, arr or arr+0 is a pointer to arr[0], arr+1 is a pointer to arr[1], and so on. In
general, *(arr+k) is the same as arr[k].
Unlike the usual arithmetic, addition of 1 to a pointer to an int will add sizeof(int) bytes (4 on many
platforms) to the current address value. As an array name cannot be modified, + is the only operator
we can use to access the members of an array via pointer notation using the array name. However, by
defining a pointer to the first element of the array, we can get more flexibility to process the data in
an array. For example, we can print the members of an array as follows:
#include<stdio.h>
static const size_t N = 5;
int main()
{
size_t k = 0;
int arr[] = {1, 2, 3, 4, 5};
for(k = 0; k < N; k++)
{
printf("\n\t%d", *(arr + k));
}
return 0;
}
By defining a pointer to the first element of the array, the above program is equivalent to the following:
#include <stdio.h>
static const size_t N = 5;
int main(void)
{
size_t k = 0;
int arr[] = {1, 2, 3, 4, 5};
int *ptr = arr; /* or int *ptr = &arr[0]; */
for(k = 0; k < N; k++)
{
printf("\n\t%d", ptr[k]);
/* or printf("\n\t%d", *(ptr + k)); */
/* or printf("\n\t%d", *ptr++); */
}
return 0;
}
See that the members of the array arr are accessed using the operators + and ++. The other
operators that can be used with the pointer ptr are - and --.
Pointer subtraction
Given two pointers to the same type, evaluates into an object of type ptrdiff_t that holds the
scalar value that must be added to the second pointer in order to obtain the value of the first
pointer.
printf("q - p = %ti\n", diff); /* Outputs "q - p = 1". */
printf("*(p + (q - p)) = %d\n", *(p + diff)); /* Outputs "*(p + (q - p)) = 4". */
Access Operators
The member access operators (dot . and arrow ->) are used to access a member of a struct.
Member of object
Evaluates into the lvalue denoting the object that is a member of the accessed object.
struct MyStruct
{
int x;
int y;
};
struct MyStruct
{
int x;
int y;
};
p->x = 42;
p->y = 123;
Address-of
The unary & operator is the address of operator. It evaluates the given expression, where the
resulting object must be an lvalue. Then, it evaluates into an object whose type is a pointer to the
resulting object's type, and contains the address of the resulting object.
int x = 3;
int *p = &x;
printf("%p = %p\n", (void *)&x, (void *)p); /* Outputs "A = A", for some
implementation-defined A. */
Dereference
The unary * operator dereferences a pointer. It evaluates into the lvalue resulting from
dereferencing the pointer that results from evaluating the given expression.
int x = 42;
int *p = &x;
printf("x = %d, *p = %d\n", x, *p); /* Outputs "x = 42, *p = 42". */
*p = 123;
printf("x = %d, *p = %d\n", x, *p); /* Outputs "x = 123, *p = 123". */
Indexing
Indexing is syntactic sugar for pointer addition followed by dereferencing. Effectively, an
expression of the form a[i] is equivalent to *(a + i) — but the explicit subscript notation is
preferred.
int arr[] = { 1, 2, 3, 4, 5 };
printf("arr[2] = %i\n", arr[2]); /* Outputs "arr[2] = 3". */
Interchangeability of indexing
Adding a pointer to an integer is a commutative operation (i.e. the order of the operands does not
change the result) so pointer + integer == integer + pointer.
Usage of an expression 3[arr] instead of arr[3] is generally not recommended, as it hurts code
readability. It tends to be popular in obfuscated programming contests.
The first operand must be a function pointer (a function designator is also acceptable because it
will be converted to a pointer to the function), identifying the function to call, and all other
operands, if any, are collectively known as the function call's arguments. Evaluates into the return
value resulting from calling the appropriate function with the respective arguments.
return x * 2 + y;
}
Bitwise Operators
Symbol Operator
| bitwise inclusive OR
#include <stdio.h>
int main(void)
{
unsigned int a = 29; /* 29 = 0001 1101 */
unsigned int b = 48; /* 48 = 0011 0000 */
unsigned int c = 0;
c = a | b; /* 61 = 0011 1101 */
printf("%u | %u = %u\n", a, b, c );
c = a ^ b; /* 45 = 0010 1101 */
printf("%u ^ %u = %u\n", a, b, c );
c = a >> 2; /* 7 = 0000 0111 */
printf("%u >> 2 = %u\n", a, c );
return 0;
}
Bitwise operations with signed types should be avoided because the sign bit of such a bit
representation has a particular meaning. Particular restrictions apply to the shift operators:
• Left shifting a 1 bit into the sign bit is erroneous and leads to undefined behavior.
• Right shifting a negative value (with sign bit 1) is implementation-defined and therefore not
portable.
• If the value of the right operand of a shift operator is negative or is greater than or equal to
the width of the promoted left operand, the behavior is undefined.
Masking:
Masking refers to the process of extracting the desired bits from (or transforming the desired bits
in) a variable by using logical bitwise operations. The operand (a constant or variable) that is used
to perform masking is called a mask.
The following function uses a mask to display the bit pattern of a variable:
#include <limits.h>
#include <stdio.h>
void bit_pattern(int u)
{
int i, x, word;
unsigned mask = 1;
word = CHAR_BIT * sizeof(int);
mask = mask << (word - 1); /* shift 1 to the leftmost position */
for(i = 1; i <= word; i++)
{
x = (u & mask) ? 1 : 0; /* identify the bit */
printf("%d", x); /* print bit value */
mask >>= 1; /* shift mask to the right by 1 bit */
}
}
_Alignof
C11
Queries the alignment requirement for the specified type. The alignment requirement is a positive
integral power of 2 representing the number of bytes between successive addresses at which objects
of the type may be allocated. The result has type size_t.
The type name may not be an incomplete type nor a function type. If an array is used as the type,
the type of the array element is used.
This operator is often accessed through the convenience macro alignof from <stdalign.h>.
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>
int main(void)
{
printf("Alignment of char = %zu\n", alignof(char));
printf("Alignment of max_align_t = %zu\n", alignof(max_align_t));
printf("alignof(float[10]) = %zu\n", alignof(float[10]));
printf("alignof(struct{char c; int n;}) = %zu\n",
alignof(struct {char c; int n;}));
}
Possible Output:
Alignment of char = 1
Alignment of max_align_t = 16
alignof(float[10]) = 4
alignof(struct{char c; int n;}) = 4
http://en.cppreference.com/w/c/language/_Alignof
Short circuiting skips evaluating parts of a condition (in if, while, ...) when the result is
already known. In a logical operation on two operands, the first operand is evaluated (to true or
false) and if that settles the result (i.e. the first operand is false when using &&, or the first
operand is true when using ||) the second operand is not evaluated.
Example:
#include <stdio.h>
int main(void) {
int a = 20;
int b = -5;
return 0;
}
#include <stdio.h>
int print(int i) {
printf("print function %d\n", i);
return i;
}
int main(void) {
int a = 20;
return 0;
}
Output:
$ ./a.out
print function 20
I will be printed!
Short circuiting is important when you want to avoid evaluating terms that are (computationally)
costly. Moreover, it can heavily affect the flow of your program, as in this case: Why does this
program print "forked!" 4 times?
Chapter 42: Pass 2D-arrays to functions
Examples
Pass a 2D-array to a function
Passing a 2D array to a function seems simple and obvious and we happily write:
#include <stdio.h>
#include <stdlib.h>
#define ROWS 3
#define COLS 2
int main()
{
int array_2D[ROWS][COLS] = { {1, 2}, {3, 4}, {5, 6} };
int n = ROWS;
int m = COLS;
fun1(array_2D, n, m);
return EXIT_SUCCESS;
}
But the compiler, here GCC in version 4.9.4, does not appreciate it well.
The reasons for this are twofold: the main problem is that arrays are not pointers and the second
inconvenience is the so-called pointer decay. Passing an array to a function will decay the array to
a pointer to the first element of the array; in the case of a 2D array it decays to a pointer to the
first row, because in C arrays are stored in row-major order.
#include <stdio.h>
#include <stdlib.h>
#define ROWS 3
#define COLS 2
int main()
{
int array_2D[ROWS][COLS] = { {1, 2}, {3, 4}, {5, 6} };
int n = ROWS;
int m = COLS;
fun1(array_2D, n, m);
return EXIT_SUCCESS;
}
#include <stdio.h>
#include <stdlib.h>
#define ROWS 3
#define COLS 2
int main()
{
int array_2D[ROWS][COLS] = { {1, 2}, {3, 4}, {5, 6} };
int rows = ROWS;
fun1(array_2D, rows);
return EXIT_SUCCESS;
}
n = rows;
/* Works, because that information is passed (as "COLS").
It is also redundant because that value is known at compile time (in "COLS"). */
m = (int) (sizeof(a[0])/sizeof(a[0][0]));
/* Does not work here because the "decay" in "pointer decay" is meant
literally--information is lost. */
printf("FUN1: %zu\n",sizeof(a)/sizeof(a[0]));
C99
The number of columns is predefined and hence fixed at compile time, but C99 (ISO/IEC 9899:1999)
introduced variable-length arrays (VLAs), and although C11 (ISO/IEC 9899:2011) made them optional,
most modern C compilers support them.
#include <stdio.h>
#include <stdlib.h>
if(argc != 3){
fprintf(stderr,"Usage: %s rows cols\n",argv[0]);
exit(EXIT_FAILURE);
}
rows = atoi(argv[1]);
cols = atoi(argv[2]);
int array_2D[rows][cols];
exit(EXIT_SUCCESS);
}
n = rows;
/* Does not work anymore, no sizes are specified anymore
m = (int) (sizeof(a[0])/sizeof(a[0][0])); */
m = cols;
It becomes a bit clearer if we intentionally make an error in the call of the function by changing the
declaration to void fun1(int **a, int rows, int cols). That causes the compiler to complain in a
different, but equally nebulous way.
We can react in several ways; one of them is to ignore all of it and do some illegible pointer juggling:
#include <stdio.h>
#include <stdlib.h>
if(argc != 3){
fprintf(stderr,"Usage: %s rows cols\n",argv[0]);
exit(EXIT_FAILURE);
}
rows = atoi(argv[1]);
cols = atoi(argv[2]);
int array_2D[rows][cols];
printf("Make array with %d rows and %d columns\n", rows, cols);
for (i = 0; i < rows; i++) {
for (j = 0; j < cols; j++) {
array_2D[i][j] = i * cols + j;
printf("array[%d][%d]=%d\n", i, j, array_2D[i][j]);
}
}
exit(EXIT_SUCCESS);
}
n = rows;
m = cols;
Or we do it right and pass the needed information to fun1. To do so we need to rearrange the
arguments to fun1: the number of columns must come before the declaration of the array parameter.
To keep it more readable the variable holding the number of rows has changed its place, too, and
is first now.
#include <stdio.h>
#include <stdlib.h>
if(argc != 3){
fprintf(stderr,"Usage: %s rows cols\n",argv[0]);
exit(EXIT_FAILURE);
}
rows = atoi(argv[1]);
cols = atoi(argv[2]);
int array_2D[rows][cols];
printf("Make array with %d rows and %d columns\n", rows, cols);
for (i = 0; i < rows; i++) {
for (j = 0; j < cols; j++) {
array_2D[i][j] = i * cols + j;
printf("array[%d][%d]=%d\n", i, j, array_2D[i][j]);
}
}
exit(EXIT_SUCCESS);
}
n = rows;
m = cols;
This looks awkward to some people, who hold the opinion that the order of variables should not
matter. That is not much of a problem, just declare a pointer and let it point to the array.
#include <stdio.h>
#include <stdlib.h>
if(argc != 3){
fprintf(stderr,"Usage: %s rows cols\n",argv[0]);
exit(EXIT_FAILURE);
}
rows = atoi(argv[1]);
cols = atoi(argv[2]);
int array_2D[rows][cols];
printf("Make array with %d rows and %d columns\n", rows, cols);
for (i = 0; i < rows; i++) {
for (j = 0; j < cols; j++) {
array_2D[i][j] = i * cols + j;
printf("array[%d][%d]=%d\n", i, j, array_2D[i][j]);
}
}
// a "rows" number of pointers to "int". Again a VLA
int *a[rows];
// initialize them to point to the individual rows
for (i = 0; i < rows; i++) {
a[i] = array_2D[i];
}
exit(EXIT_SUCCESS);
}
n = rows;
m = cols;
Often the easiest solution is simply to pass 2D and higher arrays around as flat memory.
/* pass it to a subroutine */
manipulate_matrix(matrix, width, height);
Chapter 43: Pointers
Introduction
A pointer is a type of variable which can store the address of another object or a function.
Syntax
• <Data type> *<Variable name>;
• int *ptrToInt;
• void *ptrToVoid; /* C89+ */
• struct someStruct *ptrToStruct;
• int **ptrToPtrToInt;
• int arr[length]; int *ptrToFirstElem = arr; /* For <C99 'length' needs to be a compile time
constant, for >=C11 it might need to be one. */
• int *arrayOfPtrsToInt[length]; /* For <C99 'length' needs to be a compile time constant, for
>=C11 it might need to be one. */
Remarks
The position of the asterisk does not affect the meaning of the definition:
/* The * operator binds to the right and therefore these are all equivalent. */
int *i;
int * i;
int* i;
However, when defining multiple pointers at once, each requires its own asterisk:
An array of pointers is also possible, where an asterisk is given before the array variable's name:
int *foo[2]; /* foo is an array of pointers, can be accessed as *foo[0] and *foo[1] */
Examples
Common errors
Improper use of pointers is frequently a source of bugs that can include security bugs or program
crashes, most often due to segmentation faults.
Not checking for allocation failures
Memory allocation is not guaranteed to succeed, and may instead return a NULL pointer. Using the
returned value, without checking if the allocation is successful, invokes undefined behavior. This
usually leads to a crash, but there is no guarantee that a crash will happen so relying on that can
also lead to problems.
Safe way:
Non-portable allocation:
Portable allocation:
Memory leaks
Failure to de-allocate memory using free leads to a buildup of non-reusable memory, which is no
longer used by the program; this is called a memory leak. Memory leaks waste memory resources
and can lead to allocation failures.
Logical errors
All allocations must follow the same pattern: allocate the memory (for example with malloc), use
it, and then de-allocate it exactly once with free.
Failure to adhere to this pattern, such as using memory after a call to free (dangling pointer) or
before a call to malloc (wild pointer), calling free twice ("double free"), etc., invokes undefined
behavior; it usually causes a segmentation fault and results in a crash of the program.
These errors can be transient and hard to debug – for example, freed memory is usually not
immediately reclaimed by the OS, and thus dangling pointers may persist for a while and appear
to work.
On systems where it works, Valgrind is an invaluable tool for identifying what memory is leaked
and where it was originally allocated.
int* myFunction()
{
int x = 10;
return &x;
}
Here, x has automatic storage duration (commonly known as stack allocation). Because it is
allocated on the stack, its lifetime is only as long as myFunction is executing; after myFunction has
exited, the variable x is destroyed. This function gets the address of x (using &x), and returns it to
the caller, leaving the caller with a pointer to a non-existent variable. Attempting to access this
variable will then invoke undefined behavior.
Most compilers don't actually clear a stack frame after the function exits, thus dereferencing the
returned pointer often gives you the expected data. When another function is called however, the
memory being pointed to may be overwritten, and it appears that the data being pointed to has
been corrupted.
To resolve this, either malloc the storage for the variable to be returned, and return a pointer to the
newly created storage, or require that a valid pointer is passed in to the function instead of
returning one, for example:
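#include <stdlib.h>
#include <stdio.h>
int *solution1(void)
{
    int *x = malloc(sizeof *x);
    if (x == NULL)
    {
        /* Something went wrong */
        return NULL;
    }
    *x = 10;
    return x;
}
void solution2(int *x)
{
    /* The caller must pass a valid, non-null pointer. */
    *x = 10;
}
int main(void)
{
    {
        /* Use solution1() */
        int *foo = solution1();
        if (foo == NULL)
        {
            /* Something went wrong */
            return 1;
        }
        printf("The value set by solution1() is %i\n", *foo);
        free(foo); /* Tidy up */
    }
    {
        /* Use solution2() */
        int bar;
        solution2(&bar);
        printf("The value set by solution2() is %i\n", bar);
    }
    return 0;
}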
Post incrementing / decrementing binds more tightly than dereferencing. Therefore, the expression
*p++ will increment the pointer p itself and return what was pointed to by p before incrementing,
without changing it.
This rule also applies to post decrementing: *p-- will decrement the pointer p itself, not what is
pointed to by p.
Dereferencing a Pointer
int a = 1;
int *a_pointer = &a;
To dereference a_pointer and change the value of a, we use the following operation:
*a_pointer = 2;
However, one would be mistaken to dereference a NULL or otherwise invalid pointer. This
int *p1, *p2;
p1 = (int *) 0xbad;
p2 = NULL;
*p1 = 42;
*p2 = *p1 + 1;
is usually undefined behavior. p1 may not be dereferenced because it points to an address 0xbad
which may not be a valid address. Who knows what's there? It might be operating system
memory, or another program's memory. The only time code like this is used, is in embedded
development, which stores particular information at hard-coded addresses. p2 cannot be
dereferenced because it is NULL, which is invalid.
struct MY_STRUCT
{
int my_int;
float my_float;
};
We can use typedef to define MY_STRUCT as an alias (typedef struct MY_STRUCT MY_STRUCT;) so that
we don't have to type struct MY_STRUCT each time we use it. This, however, is optional.
MY_STRUCT *instance;
If this statement appears at file scope, instance will be initialized with a null pointer when the
program starts. If this statement appears inside a function, its value is indeterminate. The variable
must be initialized to point to a valid MY_STRUCT variable, or to dynamically allocated space, before it
can be dereferenced. For example:
When the pointer is valid, we can dereference it to access its members using one of two different
notations:
int a = (*instance).my_int;
float b = instance->my_float;
While both these methods work, it is better practice to use the arrow -> operator rather than the
combination of parentheses, the dereference * operator and the dot . operator because it is easier
to read and understand, especially with nested uses.
In this case, copy contains a copy of the contents of instance. Changing my_int of copy will not
change it in instance.
In this case, ref is a reference to instance. Changing my_int using the reference will change it in
instance.
It is common practice to use pointers to structs as parameters in functions, rather than the structs
themselves. Using the structs as function parameters could cause the stack to overflow if the
struct is large. Using a pointer to a struct only uses enough stack space for the pointer, but can
cause side effects if the function changes the struct which is passed into the function.
Function pointers
Let's take a basic function:
my_pointer = &my_function;
...
printf("%d\n", result);
}
Although this syntax seems more natural and coherent with basic types, assigning and
dereferencing function pointers don't require the usage of & and * operators. So the following
snippet is equally valid:
int a = 4;
callback(a);
}
Another readability trick is that the C standard allows one to simplify a function pointer in
arguments like above (but not in variable declaration) to something that looks like a function
prototype; thus the following can be equivalently used for function definitions and declarations:
See also
Function Pointers
Initializing Pointers
Pointer initialization is a good way to avoid wild pointers. The initialization is simple and is no
different from initialization of a variable.
#include <stddef.h>
int main()
{
int *p1 = NULL;
char *p2 = NULL;
float *p3 = NULL;
...
}
In most operating systems, inadvertently using a pointer that has been initialized to NULL will often
result in the program crashing immediately, making it easy to identify the cause of the problem.
Using an uninitialized pointer can often cause hard-to-diagnose bugs.
Caution:
The result of dereferencing a NULL pointer is undefined, so it will not necessarily cause a crash
even if that is the natural behaviour of the operating system the program is running on. Compiler
optimizations may mask the crash, cause the crash to occur before or after the point in the source
code at which the null pointer dereference occurred, or cause parts of the code that contains the
null pointer dereference to be unexpectedly removed from the program. Debug builds will not
usually exhibit these behaviours, but this is not guaranteed by the language standard. Other
unexpected and/or undesirable behaviour is also allowed.
Because NULL never points to a variable, to allocated memory, or to a function, it is safe to use as a
guard value.
Caution:
Usually NULL is defined as (void *)0. But this does not imply that the assigned memory address is
0x0. For more clarification refer to C-faq for NULL pointers
Note that you can also initialize pointers to contain values other than NULL.
#include <stdlib.h>
int i1;
int main()
{
int *p1 = &i1;
const char *p2 = "A constant string to point to";
float *p3 = malloc(10 * sizeof(float));
}
For any object (i.e, variable, array, union, struct, pointer or function) the unary address operator
can be used to access the address of that object.
Suppose that
int i = 1;
int *p = NULL;
Then the statement p = &i; copies the address of the variable i to the pointer p.
Pointer Arithmetic
K&R
void* is a catch all type for pointers to object types. An example of this in use is with the malloc
function, which is declared as
void* malloc(size_t);
The pointer-to-void return type means that it is possible to assign the return value from malloc to a
pointer to any other type of object:
It is generally considered good practice to not explicitly cast the values into and out of void
pointers. In the specific case of malloc() this is because an explicit cast can suppress the warning
the compiler would otherwise give when you forget to include stdlib.h and it therefore assumes an
incorrect return type for malloc(). It is also a case of using the correct behavior of void pointers
to better conform to the DRY (don't repeat yourself) principle; compare the above to the following,
wherein the following code contains several needless additional places where a typo could cause issues:
Similarly, functions such as
void* memcpy(void *restrict target, void const *restrict source, size_t size);
have their arguments specified as void * because the address of any object, regardless of its
type, can be passed in. Here also, a call should not use a cast.
Const Pointers
Single Pointers
• Pointer to an int
The pointer can point to different integers and the ints can be changed through the pointer.
This sample of code assigns p to point to int b, then changes b's value to 100.
int b;
int* p;
p = &b; /* OK */
*p = 100; /* OK */
• Pointer to a const int
The pointer can point to different integers but the int's value can't be changed through the
pointer.
int b;
const int* p;
p = &b; /* OK */
*p = 100; /* Compiler Error */
• const pointer to int
The pointer can only point to one int but the int's value can be changed through the pointer.
int a, b;
int* const p = &b; /* OK as initialisation, no assignment */
*p = 100; /* OK */
p = &a; /* Compiler Error */
• const pointer to const int
The pointer can only point to one int and the int cannot be changed through the pointer.
int a, b;
const int* const p = &b; /* OK as initialisation, no assignment */
p = &a; /* Compiler Error */
*p = 100; /* Compiler Error */
Pointer to Pointer
• Pointer to a pointer to an int
This code assigns the address of p1 to the double pointer p (which then points to int* p1,
which in turn points to an int).
void f1(void)
{
int a, b;
int *p1;
int **p;
p1 = &b; /* OK */
p = &p1; /* OK */
*p = &a; /* OK */
**p = 100; /* OK */
}
void f2(void)
{
int b;
const int *p1;
const int **p;
p = &p1; /* OK */
*p = &b; /* OK */
**p = 100; /* error: assignment of read-only location ‘**p’ */
}
void f3(void)
{
int b;
int *p1;
int * const *p;
p = &p1; /* OK */
*p = &b; /* error: assignment of read-only location ‘*p’ */
**p = 100; /* OK */
}
void f4(void)
{
int b;
int *p1;
int ** const p = &p1; /* OK as initialisation, not assignment */
p = &p1; /* error: assignment of read-only variable ‘p’ */
*p = &b; /* OK */
**p = 100; /* OK */
}
void f5(void)
{
int b;
const int *p1;
const int * const *p;
p = &p1; /* OK */
*p = &b; /* error: assignment of read-only location ‘*p’ */
**p = 100; /* error: assignment of read-only location ‘**p’ */
}
void f6(void)
{
int b;
const int *p1;
const int ** const p = &p1; /* OK as initialisation, not assignment */
p = &p1; /* error: assignment of read-only variable ‘p’ */
*p = &b; /* OK */
**p = 100; /* error: assignment of read-only location ‘**p’ */
}
void f7(void)
{
int b;
int *p1;
int * const * const p = &p1; /* OK as initialisation, not assignment */
p = &p1; /* error: assignment of read-only variable ‘p’ */
*p = &b; /* error: assignment of read-only location ‘*p’ */
**p = 100; /* OK */
}
Premise
The most confusing thing surrounding pointer syntax in C and C++ is that there are actually two
different meanings that apply when the pointer symbol, the asterisk (*), is used with a variable.
Example
Firstly, you use * to declare a pointer variable.
int i = 5;
/* 'p' is a pointer to an integer, initialized as NULL */
int *p = NULL;
/* '&i' evaluates into address of 'i', which then assigned to 'p' */
p = &i;
/* 'p' is now holding the address of 'i' */
When you're not declaring (or multiplying), * is used to dereference a pointer variable:
*p = 123;
/* 'p' was pointing to 'i', so this changes value of 'i' to 123 */
When you want an existing pointer variable to hold address of other variable, you don't use *, but
do it like this:
p = &another_variable;
A common confusion among C-programming newbies arises when they declare and initialize a
pointer variable at the same time.
int *p = &i;
Since int i = 5; and int i; i = 5; give the same result, some of them might think int *p = &i;
and int *p; *p = &i; give the same result too. The fact is, no: int *p; *p = &i; will attempt to
dereference an uninitialized pointer, which results in UB. Never use * when you're neither declaring
nor dereferencing a pointer.
Conclusion
The asterisk (*) has two distinct meanings within C in relation to pointers, depending on where it's
used. When used within a variable declaration, the value on the right hand side of the equals
sign should be a pointer value to an address in memory. When used with an already declared
variable, the asterisk will dereference the pointer value, following it to the pointed-to place in
memory, and allowing the value stored there to be assigned or retrieved.
Takeaway
It is important to mind your P's and Q's, so to speak, when dealing with pointers. Be mindful of
when you're using the asterisk, and what it means when you use it there. Overlooking this tiny
detail could result in buggy and/or undefined behavior that you really don't want to have to deal
with.
Pointer to Pointer
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int A = 42;
int* pA = &A;
int** ppA = &pA;
int*** pppA = &ppA;
return EXIT_SUCCESS;
}
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int A = 42;
int* pA = &A;
int** ppA = &&A; /* Compilation error here! */
int*** pppA = &&&A; /* Compilation error here! */
...
Introduction
A pointer is declared much like any other variable, except an asterisk (*) is placed between the
type and the name of the variable to denote it is a pointer.
int *pointer; /* inside a function, pointer is uninitialized and doesn't point to any valid
object yet */
To declare two pointer variables of the same type in the same declaration, use the asterisk
symbol before each identifier. For example, int *iptr1, *iptr2; declares two pointers to int,
whereas int *iptr3, iptr4; declares iptr3 as a pointer and iptr4 as a plain int.
The address-of or reference operator denoted by an ampersand (&) gives the address of a given
variable which can be placed in a pointer of appropriate type.
int value = 1;
pointer = &value;
The indirection or dereference operator denoted by an asterisk (*) gets the contents of an object
pointed to by a pointer.
If the pointer points to a structure or union type then you can dereference it and access its
members directly using the -> operator:
SomeStruct *s = &someObject;
s->someMember = 5; /* Equivalent to (*s).someMember = 5 */
In C, a pointer is a distinct value type which can be reassigned and otherwise is treated as a
variable in its own right. For example the following example prints the value of the pointer
(variable) itself.
Because a pointer is a mutable variable, it is possible for it to not point to a valid object, either by
being set to null
pointer = 0; /* or alternatively */
pointer = NULL;
or simply by containing an arbitrary bit pattern that isn't a valid address. The latter is a very bad
situation, because it cannot be detected before the pointer is dereferenced; there is only a test
for the case where a pointer is null:
if (!pointer) exit(EXIT_FAILURE);
A pointer may only be dereferenced if it points to a valid object, otherwise the behavior is
undefined. Many modern implementations may help you by raising some kind of error such as a
segmentation fault and terminate execution, but others may just leave your program in an invalid
state.
The value returned by the dereference operator is a mutable alias to the original variable, so it can
be changed, modifying the original variable.
*pointer += 1;
printf("Value of pointed to variable after change: %d\n", *pointer);
/* Value of pointed to variable after change: 2 */
Pointers are also re-assignable. This means that a pointer pointing to an object can later be used
to point to another object of the same type.
Like any other variable, pointers have a specific type. You can't assign the address of a short int
to a pointer to a long int, for instance. Such behavior is referred to as type punning and is
forbidden in C, though there are a few exceptions.
Although a pointer must be of a specific type, the memory allocated for each type of pointer is equal
to the memory used by the environment to store addresses, rather than the size of the type that is
pointed to.
#include <stdio.h>
int main(void) {
printf("Size of int pointer: %zu\n", sizeof (int*)); /* size 4 bytes */
printf("Size of int variable: %zu\n", sizeof (int)); /* size 4 bytes */
printf("Size of char pointer: %zu\n", sizeof (char*)); /* size 4 bytes */
printf("Size of char variable: %zu\n", sizeof (char)); /* size 1 bytes */
printf("Size of short pointer: %zu\n", sizeof (short*)); /* size 4 bytes */
printf("Size of short variable: %zu\n", sizeof (short)); /* size 2 bytes */
return 0;
}
(NB: if you are using an older version of Microsoft Visual Studio that does not support the C99 or
C11 standards, you must use %Iu instead of %zu in the above sample; see footnote 1.)
Note that the numbers above can vary from environment to environment. On most common platforms
all object pointer types have the same size, although the C standard does not guarantee this.
double point[3] = {0.0, 1.0, 2.0};
double *ptr = point;
So in most expressions ptr and the array name are interchangeable (sizeof and unary & are notable
exceptions). This rule also means that an array decays to a pointer when passed to a subroutine.
A pointer may point to any element in an array, or one past the last element. It is
however an error to set a pointer to any other value, including the position before the first
element of the array. (The reason is that on segmented architectures the address before the first
element may cross a segment boundary; the compiler ensures that this does not happen for the last
element plus one.)
Footnote 1: Microsoft format information can be found via printf() and format specification syntax.
The qsort() standard library function is a good example of how one can use void pointers to make
a single function operate on a large variety of different types.
void qsort (
void *base, /* Array to be sorted */
size_t num, /* Number of elements in array */
size_t size, /* Size in bytes of each element */
int (*compar)(const void *, const void *)); /* Comparison function for two elements */
The array to be sorted is passed as a void pointer, so an array of any type of element can be
operated on. The next two arguments tell qsort() how many elements it should expect in the array,
and how large, in bytes, each element is.
The last argument is a function pointer to a comparison function which itself takes two void
pointers. By making the caller provide this function, qsort() can effectively sort elements of any
type.
Here's an example of such a comparison function, for comparing floats. Note that any comparison
function passed to qsort() needs to have this type signature. The way it is made polymorphic is by
casting the void pointer arguments to pointers of the type of element we wish to compare.
/* The function name is illustrative; only the signature is fixed by qsort(). */
int compare_floats(const void *a, const void *b)
{
float fa = *((const float *)a);
float fb = *((const float *)b);
if (fa < fb)
return -1;
if (fa > fb)
return 1;
return 0;
}
Since we know that qsort will use this function to compare floats, we cast the void pointer
arguments back to float pointers before dereferencing them.
Now, the usage of the polymorphic function qsort on an array "array" with length "len" is very
simple:
Chapter 44: Preprocessor and Macros
Introduction
All preprocessor commands begin with the hash (pound) symbol #. A C macro is just a
preprocessor command that is defined using the #define preprocessor directive. During the
preprocessing stage, the C preprocessor (a part of the C compiler) simply substitutes the body of
the macro wherever its name appears.
Remarks
When a compiler encounters a macro in the code, it performs simple textual (token) replacement; no
additional operations are performed. Because of this, changes made by the preprocessor do not
respect the scope of C programs. For example, a macro definition is not limited to the block in
which it appears, so it is unaffected by a '}' that ends a block statement.
The preprocessor is, by design, not Turing complete: there are several types of computation that
cannot be done by the preprocessor alone.
Usually compilers have a command line flag (or configuration setting) that allows us to stop
compilation after the preprocessing phase and to inspect the result. On POSIX platforms this flag
is -E. So, running gcc with this flag prints the expanded source to stdout:
$ gcc -E cprog.c
Often the preprocessor is implemented as a separate program that is invoked by the compiler; a
common name for that program is cpp. A number of preprocessors emit supporting information,
such as line numbers, which subsequent phases of compilation use to generate debugging
information. If the preprocessor is based on gcc, the -P option suppresses such information.
$ cpp -P cprog.c
Examples
Conditional inclusion and conditional function signature modification
To conditionally include a block of code, the preprocessor has several directives (e.g. #if, #ifdef,
#else, #endif, etc.).
#ifdef DEBUG
/* debug build: print the message (illustrative definition) */
#define DLOG(x) (printf("%s\n", (x)))
#else
#define DLOG(x)
#endif
The #if directive behaves similarly to the C if statement, but its condition may only contain
integer constant expressions and no casts. It supports one additional unary operator,
defined( identifier ), which yields 1 if the identifier is defined as a macro, and 0 otherwise.
In most cases a release build of an application is expected to have as little overhead as possible.
However during testing of an interim build, additional logs and information about problems found
can be helpful.
For example, assume there is some function SHORT SerOpPluAllRead(PLUIF *pPif, USHORT usLockHnd)
that should generate a log about its use when doing a test build. However, this function is used
in multiple places, and the log should include where the function is being called from.
So using conditional compilation you can have something like the following in the include file
declaring the function. This replaces the standard version of the function with a debug version of
the function. The preprocessor is used to replace calls to the function SerOpPluAllRead() with calls
to the function SerOpPluAllRead_Debug() with two additional arguments, the name of the file and the
line number of where the function is used.
Conditional compilation is used to choose whether to override the standard function with a debug
version or not.
#if 0
// function declaration and prototype for our debug version of the function.
SHORT SerOpPluAllRead_Debug(PLUIF *pPif, USHORT usLockHnd, char *aszFilePath, int nLineNo);
// macro definition to replace function call using old name with debug function with additional arguments.
#define SerOpPluAllRead(pPif,usLock) SerOpPluAllRead_Debug(pPif,usLock,__FILE__,__LINE__)
#else
// standard function declaration that is normally used with builds.
SHORT SerOpPluAllRead(PLUIF *pPif, USHORT usLockHnd);
#endif
This allows you to override the standard version of the function SerOpPluAllRead() with a version
that will provide the name of the file and line number in the file of where the function is called.
There is one important consideration: any file using this function must include the header file
where this approach is used in order for the preprocessor to modify the function. Otherwise you
will see a linker error.
The definition of the function would look something like the following. What this source does is to
request that the preprocessor rename the function SerOpPluAllRead() to be SerOpPluAllRead_Debug()
and to modify the argument list to include two additional arguments, a pointer to the name of the
file where the function was called and the line number in the file at which the function is used.
#if defined(SerOpPluAllRead)
// forward declare the replacement function which we will call once we create our log.
SHORT SerOpPluAllRead_Special(PLUIF *pPif, USHORT usLockHnd);
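SHORT SerOpPluAllRead_Debug(PLUIF *pPif, USHORT usLockHnd, char *aszFilePath, int nLineNo)
{
int iLen = 0;
char xBuffer[256]; /* buffer size assumed for illustration */
// only print the last 30 characters of the file name to shorten the logs.
iLen = strlen (aszFilePath);
if (iLen > 30) {
iLen = iLen - 30;
}
else {
iLen = 0;
}
sprintf (xBuffer, "SerOpPluAllRead_Debug(): husHandle = %d, File %s, lineno = %d", pPif->husHandle, aszFilePath + iLen, nLineNo);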
IssueDebugLog(xBuffer);
// now that we have issued the log, continue with standard processing.
return SerOpPluAllRead_Special(pPif, usLockHnd);
}
// our special replacement function name for when we are generating logs.
SHORT SerOpPluAllRead_Special(PLUIF *pPif, USHORT usLockHnd)
#else
// standard, normal function name (signature) that is replaced with our debug version.
SHORT SerOpPluAllRead(PLUIF *pPif, USHORT usLockHnd)
#endif
{
if (STUB_SELF == SstReadAsMaster()) {
return OpPluAllRead(pPif, usLockHnd);
}
return OP_NOT_MASTER;
}
The most common uses of #include preprocessing directives are as in the following:
#include <stdio.h>
#include "myheader.h"
#include replaces the directive with the contents of the file referred to. Angle brackets (<>) refer
to header files installed on the system, while quotation marks ("") are for user-supplied files.
The file name can itself be produced by macro expansion, as this example illustrates:
#if VERSION == 1
#define INCFILE "vers1.h"
#elif VERSION == 2
#define INCFILE "vers2.h"
/* and so on */
#else
#define INCFILE "versN.h"
#endif
/* ... */
#include INCFILE
Macro Replacement
This defines a function-like macro that multiplies a variable by 10 and stores the new value:
double b = 34;
int c = 23;
The replacement is done before any other interpretation of the program text. In the first call to
TIMES10 the name A from the definition is replaced by b, and the expanded text is then put in
place of the call. Note that this definition of TIMES10 is not equivalent to one that expands to an
assignment such as ((A) = (A) * 10), because that could evaluate the replacement of A twice, which
can have unwanted side effects.
The following defines a function-like macro whose value is the maximum of its arguments. It has
the advantages of working for any compatible types of the arguments and of generating in-line
code without the overhead of a function call. It has the disadvantages of evaluating one or the
other of its arguments a second time (including side effects) and of generating more code than a
function if invoked several times.
Because of this, macros that evaluate their arguments multiple times are usually avoided in
production code. Since C11 there is the _Generic feature, which makes it possible to avoid such
multiple evaluations.
The abundant parentheses in the macro expansions (right hand side of the definition) ensure that
the arguments and the resulting expression are bound properly and fit well into the context in
which the macro is called.
Error directive
If the preprocessor encounters an #error directive, compilation is halted and the diagnostic
message included is printed.
#define DEBUG
#ifdef DEBUG
#error "Debug Builds Not Supported"
#endif
int main(void) {
return 0;
}
Possible output:
$ gcc error.c
error.c: error: #error "Debug Builds Not Supported"
If there are sections of code that you are considering removing or want to temporarily disable, you
can comment them out with a block comment.
However, if the source code you have surrounded with a block comment has block style
comments in the source, the ending */ of the existing block comments can cause your new block
comment to be invalid and cause compilation problems.
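/* Block comment around the whole function to keep it from being used.
int myUnusedFunction(void)
{
int i = 5;
/* Return 5 */
return i;
}
*/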
In the previous example, the last two lines of the function and the last '*/' are seen by the compiler,
so it would compile with errors. A safer method is to use an #if 0 directive around the code you
want to block out.
#if 0
/* #if 0 evaluates to false, so everything between here and the #endif is
* removed by the preprocessor. */
int myUnusedFunction(void)
{
int i = 5;
return i;
}
#endif
A benefit with this is that when you want to go back and find the code, it's much easier to do a
search for "#if 0" than searching all your comments.
Another very important benefit is that you can nest commenting out code with #if 0. This cannot
be done with comments.
An alternative to using #if 0 is to use a name that will not be #defined but is more descriptive of
why the code is being blocked out. For instance if there is a function that seems to be useless
dead code you might use #if defined(POSSIBLE_DEAD_CODE) or #if defined(FUTURE_CODE_REL_020201)
for code needed once other functionality is in place or something similar. Then when going back
through to remove or enable that source, those sections of source are easy to find.
Token pasting
Token pasting allows one to glue together two tokens in a macro replacement. For example,
front##back yields frontback. A famous example is Win32's <TCHAR.H> header. In standard C, one
can write L"string" to declare a wide character string. However, the Windows API allows one to
switch between wide character strings and narrow character strings simply by defining UNICODE. In
order to implement the string literals, TCHAR.H uses this:
#ifdef UNICODE
#define TEXT(x) L##x
#endif
Whenever a user writes TEXT("hello, world"), and UNICODE is defined, the C preprocessor
concatenates L and the macro argument. L concatenated with "hello, world" gives L"hello, world".
Predefined Macros
A predefined macro is a macro that is already understood by the C preprocessor without the
program needing to define it. Examples include __FILE__, __LINE__, __DATE__, __TIME__ and __STDC__.
There's also a related predefined identifier, __func__ (ISO/IEC 9899:2011 §6.4.2.2), which is not a
macro:
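The identifier __func__ shall be implicitly declared by the translator as if, immediately
following the opening brace of each function definition, the declaration
static const char __func__[] = "function-name";
appeared, where function-name is the name of the lexically-enclosing function.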
__FILE__, __LINE__ and __func__ are especially useful for debugging purposes. For example:
Pre-C99 compilers may or may not support __func__, or may have a differently named macro that
acts the same. For example, gcc used __FUNCTION__ in C89 mode.
• __STDC_ISO_10646__ An integer constant of the form yyyymmL (for example,
199712L). If this symbol is defined, then every character in the Unicode required
set, when stored in an object of type wchar_t, has the same value as the short
identifier of that character. The Unicode required set consists of all the characters
that are defined by ISO/IEC 10646, along with all amendments and technical
corrigenda, as of the specified year and month. If some other encoding is used,
the macro shall not be defined and the actual encoding used is implementation-
defined.
Pretty much every header file should follow the include guard idiom:
my-header-file.h
#ifndef MY_HEADER_FILE_H
#define MY_HEADER_FILE_H
#endif
This ensures that when you #include "my-header-file.h" in multiple places, you don't get duplicate
declarations of functions, variables, etc. Imagine the following hierarchy of files:
header-1.h
typedef struct {
…
} MyStruct;
header-2.h
#include "header-1.h"
main.c
#include "header-1.h"
#include "header-2.h"
int main() {
// do something
}
This code has a serious problem: the detailed contents of MyStruct are defined twice, which is not
allowed. This would result in a compilation error that can be difficult to track down, since one
header file includes another. If you instead did it with header guards:
header-1.h
#ifndef HEADER_1_H
#define HEADER_1_H
typedef struct {
…
} MyStruct;
#endif
header-2.h
#ifndef HEADER_2_H
#define HEADER_2_H
#include "header-1.h"
#endif
main.c
#include "header-1.h"
#include "header-2.h"
int main() {
// do something
}
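After preprocessing, main.c would expand to:
#ifndef HEADER_1_H
#define HEADER_1_H
typedef struct {
…
} MyStruct;
#endif
#ifndef HEADER_2_H
#define HEADER_2_H
#ifndef HEADER_1_H // false here: HEADER_1_H was already defined above
#define HEADER_1_H
typedef struct {
…
} MyStruct;
#endif
#endif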
int main() {
// do something
}
When the compiler reaches the second inclusion of header-1.h, HEADER_1_H was already defined by
the previous inclusion. Ergo, it boils down to the following:
#define HEADER_1_H
typedef struct {
…
} MyStruct;
#define HEADER_2_H
int main() {
// do something
}
Note: There are multiple different conventions for naming the header guards. Some people like to
name it HEADER_2_H_, some include the project name like MY_PROJECT_HEADER_2_H. The important thing
is to ensure that the convention you follow makes it so that each file in your project has a unique
header guard.
If the structure details were not included in the header, the type declared would be incomplete or
an opaque type. Such types can be useful, hiding implementation details from users of the
functions. For many purposes, the FILE type in the standard C library can be regarded as an
opaque type (though it usually isn't opaque so that macro implementations of the standard I/O
functions can make use of the internals of the structure). In that case, the header-1.h could contain:
#ifndef HEADER_1_H
#define HEADER_1_H
typedef struct MyStruct MyStruct;
#endif
Note that the structure must have a tag name (here MyStruct — that's in the tags namespace,
separate from the ordinary identifiers namespace of the typedef name MyStruct), and that the { … }
is omitted. This says "there is a structure type struct MyStruct and there is an alias for it MyStruct".
In the implementation file, the details of the structure can be defined to make the type complete:
struct MyStruct {
…
};
If you are using C11, you could repeat the typedef struct MyStruct MyStruct; declaration without
causing a compilation error, but earlier versions of C would complain. Consequently, it is still best
to use the include guard idiom, even though in this example, it would be optional if the code was
only ever compiled with compilers that supported C11.
Many compilers support the #pragma once directive, which has the same results:
my-header-file.h
#pragma once
However, #pragma once is not part of the C standard, so the code is less portable if you use it.
A few headers do not use the include guard idiom. One specific example is the standard
<assert.h> header. It may be included multiple times in a single translation unit, and the effect of
doing so depends on whether the macro NDEBUG is defined each time the header is included. You
may occasionally have an analogous requirement; such cases will be few and far between.
Ordinarily, your headers should be protected by the include guard idiom.
FOREACH implementation
We can also use macros for making code easier to read and write. For example we can implement
macros for implementing the foreach construct in C for some data structures like singly- and
doubly-linked lists, queues, etc.
#include <stdio.h>
#include <stdlib.h>
struct LinkedListNode
{
int data;
struct LinkedListNode *next;
};
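/* Iterate over a NULL-terminated singly-linked list (macro name is illustrative) */
#define FOREACH_LIST(node, list) \
for ((node) = (list); (node) != NULL; (node) = (node)->next)
/* Usage */
int main(void)
{
struct LinkedListNode *list = NULL, **plist = &list, *node;
int i;
for (i = 0; i < 10; i++)
{
*plist = malloc(sizeof **plist);
if (*plist == NULL) return EXIT_FAILURE;
(*plist)->data = i;
(*plist)->next = NULL;
plist = &(*plist)->next;
}
FOREACH_LIST(node, list)
{
printf("%d\n", node->data);
}
}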
You can make a standard interface for such data-structures and write a generic implementation of
FOREACH as:
#include <stdio.h>
#include <stdlib.h>
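/* struct tags and the data member are illustrative */
typedef struct CollectionItem_
{
int data;
struct CollectionItem_ *next;
} CollectionItem;
typedef struct Collection_
{
/* interface functions */
void *(*first)(void *coll);
void *(*last)(void *coll);
void *(*next)(void *coll, CollectionItem *currItem);
CollectionItem *collectionHead;
/* Other fields */
} Collection;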
/* must implement */
void *first(void *coll)
{
return ((Collection*)coll)->collectionHead;
}
/* must implement */
void *last(void *coll)
{
return NULL;
}
/* must implement */
void *next(void *coll, CollectionItem *curr)
{
return curr->next;
}
Collection *new_Collection(void)
{
Collection *nc = malloc(sizeof(Collection));
if (nc == NULL)
return NULL;
nc->first = first;
nc->last = last;
nc->next = next;
nc->collectionHead = NULL; /* start with an empty collection */
return nc;
}
/* generic implementation */
#define FOREACH(node, collection) \
for (node = (collection)->first(collection); \
node != (collection)->last(collection); \
node = (collection)->next(collection, node))
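int main(void)
{
Collection *coll = new_Collection();
CollectionItem *node;
if (coll == NULL)
return EXIT_FAILURE;
coll->collectionHead = NULL; /* start empty; fill elements here */
FOREACH(node, coll)
{
/* process node here */
}
free(coll);
return 0;
}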
To use this generic implementation just implement these functions for your data structure.
__cplusplus for using C externals in C++ code compiled with C++ - name
mangling
There are times when an include file has to generate different output from the preprocessor
depending on whether the compiler is a C compiler or a C++ compiler due to language
differences.
For example a function or other external is defined in a C source file but is used in a C++ source
file. Since C++ uses name mangling (or name decoration) in order to generate unique function
names based on function argument types, a C function declaration used in a C++ source file will
cause link errors. The C++ compiler will modify the specified external name for the compiler output
using the name mangling rules for C++. The result is link errors due to externals not found when
the C++ compiler output is linked with the C compiler output.
Since C compilers do not do name mangling but C++ compilers do for all external labels (function
names or variable names) generated by the C++ compiler, a predefined preprocessor macro,
__cplusplus, was introduced to allow for compiler detection.
In order to work around this problem of incompatible compiler output for external names between
C and C++, the macro __cplusplus is defined in the C++ Preprocessor and is not defined in the C
Preprocessor. This macro name can be used with the conditional preprocessor #ifdef directive or
#if with the defined() operator to tell whether a source code or include file is being compiled as
C++ or C.
#ifdef __cplusplus
printf("C++\n");
#else
printf("C\n");
#endif
#if defined(__cplusplus)
printf("C++\n");
#else
printf("C\n");
#endif
In order to specify the correct function name of a function from a C source file compiled with the C
compiler that is being used in a C++ source file, you can check whether the __cplusplus macro is
defined, so that extern "C" { /* ... */ }; is used to declare C externals when
the header file is included in a C++ source file. However, when compiled with a C compiler, the
extern "C" { /* ... */ }; is not used. This conditional compilation is needed because extern "C" {
/* ... */ }; is valid in C++ but not in C.
#ifdef __cplusplus
// if we are being compiled with a C++ compiler then declare the
// following functions as C functions to prevent name mangling.
extern "C" {
#endif
#ifdef __cplusplus
// if this is a C++ compiler, we need to close off the extern declaration.
};
#endif
Function-like macros
Function-like macros are similar to inline functions. They are useful in some cases, such as a
temporary debug log:
#ifdef DEBUG
# define LOGFILENAME "/tmp/logfile.log"
# define LOG(str) do { \
FILE *fp = fopen(LOGFILENAME, "a"); \
if (fp) { \
fprintf(fp, "%s:%d %s\n", __FILE__, __LINE__, \
/* don't print null pointer */ \
str ?str :"<null>"); \
fclose(fp); \
} \
else { \
perror("Opening '" LOGFILENAME "' failed"); \
} \
} while (0)
#else
/* Make it a NOOP if DEBUG is not defined. */
# define LOG(LINE) (void)0
#endif
#include <stdio.h>
Here in both cases (with DEBUG or not) the call behaves the same way as a function with void return
type. This ensures that the if/else conditionals are interpreted as expected.
In the DEBUG case this is implemented through a do { ... } while(0) construct. In the other case,
(void)0 is a statement with no side effect that is just ignored.
If you use GCC, you can also implement a function-like macro that returns result using a non-
standard GNU extension — statement expressions. For example:
#include <stdio.h>
#define POW(X, Y) \
({ \
int i, r = 1; \
for (i = 0; i < (Y); ++i) \
r *= (X); \
r; /* returned value is the result of the last expression */ \
})
int main(void)
{
int result;
result = POW(2, 3);
printf("Result: %d\n", result);
}
C99
Let's say you want to create some print-macro for debugging your code, let's take this macro as an
example:
The function somefunc() returns -1 if failed and 0 if succeeded, and it is called from many different
places within the code:
int retVal = somefunc();
if(retVal == -1)
{
debug_print("somefunc() has failed");
}
/* some other code */
retVal = somefunc();
if(retVal == -1)
{
debug_print("somefunc() has failed");
}
What happens if the implementation of somefunc() changes, and it now returns different values
matching different possible error types? You still want to use the debug macro and print the error
value.
To solve this problem the __VA_ARGS__ macro was introduced. It allows a function-like macro to
accept a variable number of arguments:
Example:
Usage:
This macro allows you to pass multiple parameters and print them, but now it forbids you from
passing no extra parameters at all.
debug_print("Hey");
This would raise a syntax error, as the macro expects at least one more argument and the
preprocessor would not remove the trailing comma in the expansion of debug_print(). Also,
debug_print("Hey",); would raise a syntax error, as you can't leave the argument passed to the
macro empty.
To solve this, the ##__VA_ARGS__ form is available as a GNU extension (supported by GCC and
Clang): if no variable arguments are given, the preprocessor deletes the preceding comma.
Example:
Usage:
Chapter 45: Random Number Generation
Remarks
Due to the flaws of rand(), many other default implementations have emerged over the years.
Among those are:
Examples
Basic Random Number Generation
The function rand() can be used to generate a pseudo-random integer value between 0 and
RAND_MAX (0 and RAND_MAX included).
srand(unsigned int) is used to seed the pseudo-random number generator. Each time the generator
is seeded with the same seed, rand() must produce the same sequence of values. The generator
should be seeded only once, before the first call to rand(). It should not be repeatedly seeded, or
reseeded every time you wish to generate a new batch of pseudo-random numbers.
Standard practice is to use the result of time(NULL) as a seed. If you need a deterministic
sequence, you can seed the generator with the same value on each program start. This is generally
not required for release code, but is useful in debug runs to make bugs reproducible.
It is advised to always seed the generator; if not seeded, it behaves as if it was seeded with
srand(1).
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(void) {
int i;
srand((unsigned int)time(NULL));
i = rand();
printf("Random value between [0, %d]: %d\n", RAND_MAX, i);
return 0;
}
Possible output:
Notes:
The C Standard does not guarantee the quality of the random sequence produced. In the past,
some implementations of rand() had serious issues in distribution and randomness of the
generated numbers. The usage of rand() is not recommended for serious random number
generation needs, like cryptography.
Here's a standalone random number generator that doesn't rely on rand() or similar library
functions.
Why would you want such a thing? Maybe you don't trust your platform's builtin random number
generator, or maybe you want a reproducible source of randomness independent of any particular
library implementation.
This code is PCG32 from pcg-random.org, a modern, fast, general-purpose RNG with excellent
statistical properties. It's not cryptographically secure, so don't use it for cryptography.
#include <stdint.h>
#include <stdio.h>
int main(void) {
pcg32_random_t rng; /* RNG state */
int i;
for (i = 0; i < 6; i++)
printf("0x%08x\n", pcg32_random_r(&rng));
return 0;
}
Usually when generating random numbers it is useful to generate integers within a range, or a p
value between 0.0 and 1.0. Whilst the modulus operation can be used to reduce the result to a small
integer, this uses the low bits, which often go through a short cycle, and it results in a slight
skewing of the distribution if N is large in proportion to RAND_MAX.
The macro
#define uniform() (rand() / (RAND_MAX + 1.0))
produces a p value in the range 0.0 to 1.0 - epsilon, so
i = (int)(uniform() * N)
will set i to a uniform random number in the range 0 to N - 1.
Unfortunately there is a technical flaw, in that RAND_MAX is permitted to be larger than a variable
of type double can accurately represent. This means that RAND_MAX + 1.0 evaluates to RAND_MAX
and the function occasionally returns unity. This is unlikely however.
Xorshift Generation
A good and easy alternative to the flawed rand() procedures is xorshift, a class of pseudo-random
number generators discovered by George Marsaglia. The xorshift generator is among the fastest
non-cryptographically-secure random number generators. More information and other example
implementations are available on the xorshift Wikipedia page.
Example implementation
#include <stdint.h>
/* These state variables must be initialised so that they are not all zero. */
uint32_t w, x, y, z;
uint32_t xorshift128(void)
{
uint32_t t = x;
t ^= t << 11U;
t ^= t >> 8U;
x = y; y = z; z = w;
w ^= w >> 19U;
w ^= t;
return w;
}
Chapter 46: Selection Statements
Examples
if () Statements
One of the simplest ways to control program flow is by using if selection statements. Whether a
block of code is to be executed or not to be executed can be decided by this statement.
if(cond)
{
statement(s); /*to be executed, on condition being true*/
}
For example,
if (a > 1) {
puts("a is larger than 1");
}
Where a > 1 is a condition that has to evaluate to true in order to execute the statements inside
the if block. In this example "a is larger than 1" is only printed if a > 1 is true.
if selection statements can omit the wrapping braces { and } if there is only one statement within
the block. The above example can be rewritten to
if (a > 1)
puts("a is larger than 1");
However, to execute multiple statements within the block, the braces have to be used.
The condition for if can combine multiple expressions. if will only perform the action if the end
result of the whole expression is true.
For example, an if statement whose condition is (a > 1) && (b > 1) and whose block contains a printf
call and a++ will only execute the printf and a++ if both a and b are greater than 1.
While if performs an action only when its condition evaluates to true, if / else allows you to
specify different actions for when the condition is true and when the condition is false.
Example:
if (a > 1)
puts("a is larger than 1");
else
puts("a is not larger than 1");
Just like the if statement, when the block within if or else consists of only one statement,
the braces can be omitted (but doing so is not recommended, as it can easily introduce
problems unintentionally). However, if there is more than one statement within the if or else
block, the braces have to be used on that particular block.
if (a > 1)
{
puts("a is larger than 1");
a--;
}
else
{
puts("a is not larger than 1");
a++;
}
switch () Statements
switch statements are useful when you want to have your program do many different things
according to the value of a particular test variable.
int a = 1;
switch (a) {
case 1:
puts("a is 1");
break;
case 2:
puts("a is 2");
break;
default:
puts("a is neither 1 nor 2");
break;
}
is equivalent to:
int a = 1;
if (a == 1) {
puts("a is 1");
} else if (a == 2) {
puts("a is 2");
} else {
puts("a is neither 1 nor 2");
}
If the value of a is 1 when the switch statement is used, a is 1 will be printed. If the value of a is 2
then, a is 2 will be printed. Otherwise, a is neither 1 nor 2 will be printed.
case n: is used to describe where the execution flow will jump in when the value passed to the
switch statement is n. n must be a compile-time integer constant, and the same n can exist at most
once in one switch statement.
default: is used to describe where execution continues when the value didn't match any of the
case n: labels. It is good practice to include a default case in every switch statement to catch
unexpected behavior.
Note: If you accidentally forget to add a break after the end of a case, the compiler will assume that
you intend to "fall through" and all the subsequent case statements, if any, will be executed
(unless a break statement is found in any of the subsequent cases), regardless of whether the
subsequent case statement(s) match or not. This particular property is used to implement Duff's
Device. This behavior is often considered a flaw in the C language specification.
int a = 1;
switch (a) {
case 1:
case 2:
puts("a is 1 or 2");
case 3:
puts("a is 1, 2 or 3");
break;
default:
puts("a is neither 1, 2 nor 3");
break;
}
Note that the default case is not necessary, especially when the set of values you get in the switch
is finite and known at compile time.
enum msg_type { PING, ERROR }; /* type name is illustrative */
void f(enum msg_type t)
{
switch (t) {
case PING:
// do something
break;
case ERROR:
// do something else
break;
}
}
Leaving out the default case has several advantages:
• Most compilers will report a warning if you don't handle a value (this would not be reported if
a default case were present).
• For the same reason, if you add a new value to the enum, you will be notified of all the places
where you forgot to handle the new value (with a default case, you would need to manually
explore your code searching for such cases).
• The reader does not need to figure out "what is hidden by the default:", whether there are other
enum values or whether it is a protection for "just in case". And if there are other enum values,
did the coder intentionally use the default case for them, or is there a bug that was
introduced when a value was added?
• Handling each enum value makes the code self-explanatory, as you can't hide behind a wild
card; you must explicitly handle each of them.
Thus you may add an extra check before your switch to detect unexpected values, if you really need it.
switch(t) {
// Same code as before
}
}
While the if ()... else statement allows you to define only one (default) behaviour, which occurs
when the condition within the if () is not met, chaining two or more if () ... else statements allows
you to define a couple more behaviours before going to the last else branch acting as a "default", if any.
Example:
if (a >= 1)
{
printf("a is greater than or equal to 1.\n");
}
else if (a == 0) /* we already know that a is smaller than 1 */
{
printf("a equals 0.\n");
}
else /* a is smaller than 1 and not equal to 0, hence: */
{
printf("a is negative.\n");
}
Nested if()...else statements are usually harder to follow than an if()...else ladder, because the
reader has to keep every enclosing condition in mind. In both forms only the conditions on the path
actually taken are evaluated: a nested if()...else checks its inner conditions once the outer if()
condition is satisfied, whereas an if()...else ladder stops condition testing as soon as one of the
if() or else if() conditions is true.
An if()...else ladder:
#include <stdio.h>
Is, in the general case, considered to be better than the equivalent nested if()...else:
#include <stdio.h>
if (a < b)
{
if (a < c)
{
printf("\na = %d is the smallest.", a);
}
else
{
printf("\nc = %d is the smallest.", c);
}
}
else
{
if(b < c)
{
printf("\nb = %d is the smallest.", b);
}
else
{
printf("\nc = %d is the smallest.", c);
}
}
return 0;
}
Chapter 47: Sequence points
Remarks
International Standard ISO/IEC 9899:201x Programming languages — C
Here is the complete list of sequence points from Annex C of the online 2011 pre-publication draft
of the C language standard:
Sequence points
Examples
Sequenced expressions
a && b
a || b
a , b
a ? b : c
for ( a ; b ; c ) { ... }
In all cases, the expression a is fully evaluated and all side effects are applied before either b or c
is evaluated. In the fourth case, only one of b or c will be evaluated. In the last case, b is fully
evaluated and all side effects are applied before c is evaluated.
Unsequenced expressions
C11
a + b;
a - b;
a * b;
a / b;
a % b;
a & b;
a | b;
In the above examples, the expression a may be evaluated before or after the expression b, or the
two evaluations may even be intermixed if they correspond to several instructions.
f(a, b);
Here not only a and b are unsequenced (i.e. the , operator in a function call does not produce a
sequence point) but also f, the expression that determines the function that is to be called.
Side effects may be applied immediately after evaluation or deferred until a later point.
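Expressions like
x++ & x;
f(x++, x);
x++ * x;
a[i++] = i;
have undefined behavior, because the modification of an object and another access to the same
object are unsequenced.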
A function call such as f(a) always implies a sequence point between the evaluation of the
arguments and the designator (here f and a) and the actual call. If two such calls are unsequenced,
the two function calls are indeterminately sequenced, that is, one is executed before the other, and
the order is unspecified.
#include <stdio.h>
unsigned counter = 0;
unsigned account(void) {
return counter++;
}
int main(void) {
printf("the order is %u %u\n", account(), account());
}
This implicit twofold modification of counter during the evaluation of the printf arguments is valid,
we just don't know which of the calls comes first. As the order is unspecified, it may vary and
cannot be depended on. So the printout could be:
the order is 0 1
or
the order is 1 0
By contrast, the analogous statement without the intermediate function calls
printf("the order is %u %u\n", counter++, counter++); // undefined behavior
has undefined behavior because there is no sequence point between the two modifications of
counter.
Chapter 48: Side Effects
Examples
Pre/Post Increment/Decrement operators
In C, there are two unary operators, '++' and '--', that are a very common source of confusion. The
operator ++ is called the increment operator and the operator -- is called the decrement operator.
Both of them can be used in either prefix form or postfix form. The syntax for the prefix form of the
++ operator is ++operand and the syntax for the postfix form is operand++. When used in the prefix
form, the operand is incremented first by 1 and the resultant value of the operand is used in the
evaluation of the expression. Consider the following example:
int n, x = 5;
n = ++x; /* x is incremented by 1(x=6), and result is assigned to n(6) */
/* this is a short form for two statements: */
/* x = x + 1; */
/* n = x ; */
When used in the postfix form, the operand's current value is used in the expression and then the
value of the operand is incremented by 1. Consider the following example:
int n, x = 5;
n = x++; /* value of x(5) is assigned first to n(5), and then x is incremented by 1; x(6) */
/* this is a short form for two statements: */
/* n = x; */
/* x = x + 1; */
int main()
{
int a, b, x = 42;
a = ++x; /* a and x are 43 */
b = x++; /* b is 43, x is 44 */
a = x--; /* a is 44, x is 43 */
b = --x; /* b and x are 42 */
return 0;
}
From the above it is clear that post operators return the current value of a variable and then modify
it, but pre operators modify the variable and then return the modified value.
In all versions of C, the order of evaluation of pre and post operators relative to other uses of
the same variable within one expression is not defined, hence the lines marked wrong in the
following code have undefined behavior and can produce unexpected results:
int main()
{
int a, x = 42;
a = x++ + x; /* wrong */
a = x + x; /* right */
++x;
int ar[10];
x = 0;
ar[x] = x++; /* wrong */
ar[x++] = x; /* wrong */
ar[x] = x; /* right */
++x;
return 0;
}
Note that it is also good practice to use pre over post operators when used alone in a statement.
Look at the above code for this.
Note also, that when a function is called, all side effects on arguments must take place before the
function runs.
int foo(int x)
{
return x;
}
int main()
{
int a = 42;
int b = foo(a++); /* foo receives and returns 42; a is already 43 while foo runs */
return 0;
}
Chapter 49: Signal handling
Syntax
• void (*signal(int sig, void (*func)(int)))(int);
Parameters
Parameter Details
sig The signal to set the signal handler to, one of SIGABRT, SIGFPE, SIGILL, SIGTERM, SIGINT, SIGSEGV or some implementation-defined value
func The signal handler, which is either of the following: SIG_DFL, for the default handler, SIG_IGN to ignore the signal, or a function pointer with the signature void foo(int sig);.
Remarks
The usage of signal handlers with only the guarantees from the C standard imposes various
limitations on what can and can't be done in the user-defined signal handler.
• If the user-defined function returns while handling SIGSEGV, SIGFPE, SIGILL or any other
implementation-defined hardware interrupt, the behavior is undefined by the C standard.
This is because C's interface gives no means to change the faulty state (e.g. after a
division by 0), and thus when returning the program is in exactly the same erroneous state
as before the hardware interrupt occurred.
• If the user-defined function was called as the result of a call to abort or raise, the signal
handler is not allowed to call raise again.
• Signals can arrive in the middle of any operation, and therefore the indivisibility of operations
can in general not be guaranteed, nor does signal handling work well with optimization.
Therefore all modifications to data in a signal handler must be to variables
○ of type sig_atomic_t (all versions) or a lock-free atomic type (since C11, optional)
○ that are volatile qualified.
• Other functions from the C standard library will usually not respect these restrictions,
because they may change variables in the global state of the program. The C standard only
makes guarantees for abort, _Exit (since C99), quick_exit (since C11), signal (for the same
signal number), and some atomic operations (since C11).
Behavior is undefined by the C standard if any of the rules above are violated. Platforms may have
specific extensions, but these are generally not portable beyond that platform.
• Usually systems have their own list of functions that are asynchronous signal safe, that is, of
C library functions that can be used from a signal handler. E.g. on POSIX systems write is among
these functions, whereas printf is not.
• In particular the C standard doesn't define much about the interaction with its threads
interface (since C11) or any platform specific thread libraries such as POSIX threads. Such
platforms have to specify the interaction of such thread libraries with signals by themselves.
Examples
Signal Handling with “signal()”
Signals can be synchronous (like SIGSEGV - segmentation fault) when they are triggered by
a malfunction of the program itself, or asynchronous (like SIGINT - interactive attention) when
they are initiated from outside the program, e.g. by a keypress such as Ctrl-C.
The signal() function is part of the ISO C standard and can be used to assign a function to handle
a specific signal.
C11
C11
default:
/* Reset the signal to the default handler,
so we will not be called again if things go
wrong on return. */
signal(sig, SIG_DFL);
/* let everybody know that we are finished */
finished = sig;
return;
}
}
int main(void)
{
/* Catch the SIGSEGV signal, raised on segmentation faults (e.g. NULL pointer access) */
if (signal(SIGSEGV, &handler) == SIG_ERR) {
perror("could not establish handler for SIGSEGV");
return EXIT_FAILURE;
}
/* Then: */
if (finished) {
fprintf(stderr, "we have been terminated by signal %d\n", (int)finished);
return EXIT_FAILURE;
}
Using signal() imposes important limitations on what you are allowed to do inside the signal
handlers; see the remarks for further information.
POSIX recommends the usage of sigaction() instead of signal(), because the behavior of signal() is
underspecified and varies significantly between implementations. POSIX also defines many more
signals than the ISO C standard, including SIGUSR1 and SIGUSR2, which can be used freely by the
programmer for any purpose.
Chapter 50: Standard Math
Syntax
• #include <math.h>
• double pow(double x, double y);
• float powf(float x, float y);
• long double powl(long double x, long double y);
Remarks
1. To link with the math library, pass -lm to gcc.
2. A portable program that needs to check for an error from a mathematical function should set
errno to zero and call feclearexcept(FE_ALL_EXCEPT); before calling the mathematical function.
Upon return from the mathematical function, if errno is nonzero, or the call
fetestexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW); returns nonzero, then an
error occurred in the mathematical function. See the math_error man page for more information.
Examples
Double precision floating-point remainder: fmod()
This function returns the floating-point remainder of the division of x/y. The returned value has the
same sign as x.
int main(void)
{
double x = 10.0;
double y = 5.1;
return 0;
}
Output:
4.90000
Important: Use this function with care, as it can return unexpected values due to the way
floating-point values are represented.
#include <math.h>
#include <stdio.h>
int main(void)
{
printf("%f\n", fmod(1, 0.1));
printf("%19.17f\n", fmod(1, 0.1));
return 0;
}
Output:
0.100000
0.09999999999999995
C99
These functions return the floating-point remainder of the division of x/y. The returned value has
the same sign as x.
Single Precision:
int main(void)
{
float x = 10.0;
float y = 5.1;
Output:
4.90000
int main(void)
{
long double x = 10.0;
long double y = 5.1;
printf("%Lf\n", modulus); /* Lf is for long double. */
}
Output:
4.900000
The following example code computes the sum of the series 1+4(3+3^2+3^3+3^4+...+3^N) using the
pow() family of the standard math library.
#include <stdio.h>
#include <math.h>
#include <errno.h>
#include <fenv.h>
int main()
{
double pwr, sum=0;
int i, n;
printf("\n1+4(3+3^2+3^3+3^4+...+3^N)=?\nEnter N:");
scanf("%d",&n);
if (n<=0) {
printf("Invalid power N=%d", n);
return -1;
}
return 0;
}
Example Output:
1+4(3+3^2+3^3+3^4+...+3^N)=?
Enter N:10
N= 0 S= 1
N= 1 S= 13
N= 2 S= 49
N= 3 S= 157
N= 4 S= 481
N= 5 S= 1453
N= 6 S= 4369
N= 7 S= 13117
N= 8 S= 39361
N= 9 S= 118093
N= 10 S= 354289
Chapter 51: Storage Classes
Introduction
A storage class specifier is used to set the storage duration and linkage of a variable or function.
By knowing the storage class of a variable, we can determine the lifetime of that variable during
the run-time of the program.
Syntax
• [auto|register|static|extern] <Data type> <Variable name>[ = <Value>];
• Examples:
Remarks
Storage class specifiers are the keywords which can appear next to the top-level type of a
declaration. The use of these keywords affects the storage duration and linkage of the declared
object, depending on whether it is declared at file scope or at block scope:
Keyword Storage Duration Linkage Remarks
Keyword Storage Duration Linkage Remarks
Every object has an associated storage duration (regardless of scope) and linkage (relevant to
declarations at file scope only), even when these keywords are omitted.
The ordering of storage class specifiers with respect to top-level type specifiers (int, unsigned,
short, etc.) and top-level type qualifiers (const, volatile) is not enforced, so both of these
declarations are valid:
It is, however, considered a good practice to put storage class specifiers first, then any type
qualifiers, then the type specifier (void, char, int, signed long, unsigned long long, long double...).
extern int b = 5; /* legal and redundant at file scope, illegal at block scope */
Storage Duration
Storage duration can be static, automatic or, since C11, thread. For a declared object, it is
determined by its scope and its storage class specifiers.
Static Storage Duration
Variables with static storage duration live throughout the whole execution of the program and can
be declared both at file scope (with or without static) and at block scope (by putting static
explicitly). They are usually allocated and initialized by the operating system at program startup
and reclaimed when the process terminates. In practice, executable formats have dedicated
sections for such variables (data, bss and rodata) and these whole sections from the file are
mapped into memory at certain ranges.
This storage duration was introduced in C11. This wasn't available in earlier C standards. Some
compilers provide a non-standard extension with similar semantics. For example, gcc supports
__thread specifier which can be used in earlier C standards which didn't have _Thread_local.
Variables with thread storage duration can be declared at both file scope and block scope. If
declared at block scope, the declaration shall also use the static or extern storage specifier. The
lifetime of such a variable is the entire execution of the thread in which it's created.
_Thread_local is the only storage specifier that can appear alongside another storage specifier.
In typical implementations, automatic variables are located at certain offsets in the stack frame of
a function or in registers.
Examples
typedef
Defines a new type based on an existing type. Its syntax mirrors that of a variable declaration.
/* NodeRef is a type used for pointers to a structure type with the tag "node" */
typedef struct node *NodeRef;
/* SigHandler is the function pointer type that gets passed to the signal function. */
typedef void (*SigHandler)(int);
While not technically a storage class, a compiler will treat it as one since none of the other storage
classes are allowed if the typedef keyword is used.
The typedefs are important and should not be substituted with #define macro.
However,
auto
This storage class denotes that an identifier has automatic storage duration. This means once the
scope in which the identifier was defined ends, the object denoted by the identifier is no longer
valid.
Since all objects, not living in global scope or being declared static, have automatic storage
duration by default when defined, this keyword is mostly of historical interest and should not be
used:
int foo(void)
{
/* An integer with automatic storage duration. */
auto int i = 3;
/* Same */
int j = 5;
return 0;
} /* The values of i and j are no longer able to be used. */
static
The static storage class serves different purposes, depending on the location of the declaration in
the file:
/* Same; static is attached to the function type of f, not the return type int. */
static int f(int n);
2. To save data for use with the next call of a function (scope=block):
void foo()
{
static int a = 0; /* has static storage duration and its lifetime is the
* entire execution of the program; initialized to 0 on
* first function call */
int b = 0; /* b has block scope and has automatic storage duration and
* only "exists" within function */
a += 10;
b += 10;
int main(void)
{
int i;
for (i = 0; i < 5; i++)
{
foo();
}
return 0;
}
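A static variable is a single object that retains its value between calls, even when the function is
called from multiple different threads; accesses from different threads are not synchronized
automatically.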
C99
size_t i;
for (i = 0; i < 512; ++i)
printf("%d\n", a[i]);
}
The required number of items (or even a non-null pointer) is not necessarily checked by the
compiler, and compilers are not required to notify you in any way if you don't have enough
elements. If a programmer passes fewer than 512 elements or a null pointer, undefined
behavior is the result. Since it is impossible to enforce this, extra care must be used when
passing a value for that parameter to such a function.
extern
Used to declare an object or function that is defined elsewhere (and that has external linkage).
In general, it is used to declare an object or function to be used in a module that is not the one in
which the corresponding object or function is defined:
/* file1.c */
int foo = 2; /* Has external linkage since it is declared at file scope. */
/* file2.c */
#include <stdio.h>
int main(void)
{
/* `extern` keyword refers to external definition of `foo`. */
extern int foo;
printf("%d\n", foo);
return 0;
}
C99
Things get slightly more interesting with the introduction of the inline keyword in C99:
/* Should usually be placed in a header file such that all users see the definition */
/* Hints to the compiler that the function `bar` might be inlined */
/* and suppresses the generation of an external symbol, unless stated otherwise. */
inline void bar(int drink)
{
printf("You ordered drink no.%d\n", drink);
}
register
Hints to the compiler that access to an object should be as fast as possible. Whether the compiler
actually uses the hint is implementation-defined; it may simply treat it as equivalent to auto.
The only property that is definitively different for all objects that are declared with register is that
they cannot have their address computed. Thereby register can be a good tool to ensure certain
optimizations:
is an object that can never alias because no code can pass its address to another function where
it might be changed unexpectedly.
cannot decay into a pointer to its first element (i.e. array turning into &array[0]). This means that
the elements of such an array cannot be accessed and the array itself cannot be passed to a
function.
In fact, the only legal usage of an array declared with a register storage class is the sizeof
operator; any other operator would require the address of the first element of the array. For that
reason, arrays generally should not be declared with the register keyword since it makes them
useless for anything other than size computation of the entire array, which can be done just as
easily without the register keyword.
The register storage class is more appropriate for variables that are defined inside a block and
are accessed with high frequency. For example,
C11
_Thread_local
C11
This was a new storage specifier introduced in C11 along with multi-threading. This isn't available
in earlier C standards.
Denotes thread storage duration. A variable declared with the _Thread_local storage specifier is
local to each thread, and its lifetime is the entire execution of the thread in which it's created.
The specifier can also appear along with static or extern.
#include <threads.h>
#include <stdio.h>
#define SIZE 5
return 0;
}
int main(void)
{
thrd_t id[SIZE];
int arr[SIZE] = {1, 2, 3, 4, 5};
/* create 5 threads. */
for(int i = 0; i < SIZE; i++) {
thrd_create(&id[i], thread_func, &arr[i]);
}
Chapter 52: Strings
Introduction
In C, a string is not an intrinsic type. A C-string is, by convention, a one-dimensional array
of characters which is terminated by a null-character, '\0'.
This means that a C-string with a content of "abc" will have four characters 'a', 'b', 'c' and '\0'.
Syntax
• char str1[] = "Hello, world!"; /* Modifiable */
• char str2[14] = "Hello, world!"; /* Modifiable */
• char* str3 = "Hello, world!"; /* Non-modifiable*/
Examples
Calculate the Length: strlen()
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
return EXIT_SUCCESS;
}
This program computes the length of its second input argument and stores the result in len. It then
prints that length to the terminal. For example, when run with the parameters program_name "Hello,
world!", the program will output The length of the second argument is 13. because the string Hello,
world! is 13 characters long.
strlen counts all the bytes from the beginning of the string up to, but not including, the terminating
NUL character, '\0'. As such, it can only be used when the string is guaranteed to be
NUL-terminated.
Also keep in mind that if the string contains any Unicode characters, strlen will not tell you how
many characters are in the string (since some characters may be multiple bytes long). In such
cases, you need to count the characters (i.e., code points) yourself. Consider the output of the
following example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char asciiString[50] = "Hello world!";
char utf8String[50] = "Γειά σου Κόσμε!"; /* "Hello World!" in Greek */
Output:
#include <stdio.h>
#include <string.h>
int main(void)
{
/* Always ensure that your string is large enough to contain the characters
* and a terminating NUL character ('\0')!
*/
char mystring[10];
strcpy(mystring, "bar");
printf("%s\n", mystring);
return 0;
}
Outputs:
foo
foobar
bar
The strcmp function lexicographically compares two null-terminated character arrays. The function
returns a negative value if the first argument appears before the second in lexicographical order,
zero if they compare equal, or a positive value if the first argument appears after the second in
lexicographical order.
#include <stdio.h>
#include <string.h>
int main(void)
{
compare("BBB", "BBB");
compare("BBB", "CCCCC");
compare("BBB", "AAAAAA");
return 0;
}
Outputs:
Like strcmp, the strcasecmp function compares its arguments lexicographically, but only after
translating each character to its lowercase equivalent. It is specified by POSIX (declared in
<strings.h>) rather than by ISO C:
#include <stdio.h>
#include <string.h>
int main(void)
{
compare("BBB", "bBB");
compare("BBB", "ccCCC");
compare("BBB", "aaaaaa");
return 0;
}
Outputs:
#include <stdio.h>
#include <string.h>
int main(void)
{
compare("BBB", "Bb", 1);
compare("BBB", "Bb", 2);
compare("BBB", "Bb", 3);
return 0;
}
Outputs:
BBB equals Bb
BBB comes before Bb
BBB comes before Bb
The function strtok breaks a string into smaller strings, or tokens, using a set of delimiters.
#include <stdio.h>
#include <string.h>
int main(void)
{
int toknum = 0;
char src[] = "Hello,, world!";
const char delimiters[] = ", !";
char *token = strtok(src, delimiters);
while (token != NULL)
{
printf("%d: [%s]\n", ++toknum, token);
token = strtok(NULL, delimiters);
}
/* src is now "Hello\0, world\0\0" */
}
Output:
1: [Hello]
2: [world]
The string of delimiters may contain one or more delimiters and different delimiter strings may be
used with each call to strtok.
Calls to strtok to continue tokenizing the same source string should not pass the source string
again, but instead pass NULL as the first argument. If the same source string is passed then the first
token will instead be re-tokenized. That is, given the same delimiters, strtok would simply return
the first token again.
Note that as strtok does not allocate new memory for the tokens, it modifies the source string.
That is, in the above example, the string src will be manipulated to produce the tokens that are
referenced by the pointer returned by the calls to strtok. This means that the source string cannot
be const (so it can't be a string literal). It also means that the identity of the delimiting byte is lost
(i.e. in the example the "," and "!" are effectively deleted from the source string and you cannot tell
which delimiter character matched).
Note also that multiple consecutive delimiters in the source string are treated as one; in the
example, the second comma is ignored.
strtok is neither thread safe nor re-entrant because it uses a static buffer while parsing. This
means that if a function calls strtok, no function that it calls while it is using strtok can also use
strtok, and it cannot be called by any function that is itself using strtok.
An example that demonstrates the problems caused by the fact that strtok is not re-entrant is as
follows:
do
{
char *part;
/* Nested calls to strtok do not work as desired */
printf("[%s]\n", first);
part = strtok(first, ".");
while (part != NULL)
{
printf(" [%s]\n", part);
part = strtok(NULL, ".");
}
} while ((first = strtok(NULL, ",")) != NULL);
Output:
[1.2]
[1]
[2]
The expected operation is that the outer do while loop should create three tokens consisting of
each decimal number string ("1.2", "3.5", "4.2"), for each of which the strtok calls for the inner
loop should split it into separate digit strings ("1", "2", "3", "5", "4", "2").
However, because strtok is not re-entrant, this does not occur. Instead the first strtok correctly
creates the "1.2\0" token, and the inner loop correctly creates the tokens "1" and "2". But then the
strtok in the outer loop is at the end of the string used by the inner loop, and returns NULL
immediately. The second and third substrings of the src array are not analyzed at all.
C11
The standard C libraries do not contain a thread-safe or re-entrant version but some others do,
such as POSIX' strtok_r. Note that on MSVC the strtok equivalent, strtok_s is thread-safe.
C11
C11 has an optional part, Annex K, that offers a thread-safe and re-entrant version named
strtok_s. You can test for the feature with __STDC_LIB_EXT1__. This optional part is not widely
supported.
The strtok_s function differs from the POSIX strtok_r function by guarding against storing outside
of the string being tokenized, and by checking runtime constraints. On correctly written programs,
though, the strtok_s and strtok_r behave the same.
Using strtok_s with the example now yields the correct response, like so:
/* you have to announce that you want to use Annex K */
#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>
#ifndef __STDC_LIB_EXT1__
# error "we need strtok_s from Annex K"
#endif
do
{
char *part;
char *posn;
printf("[%s]\n", first);
part = strtok_s(first, ".", &posn);
while (part != NULL)
{
printf(" [%s]\n", part);
part = strtok_s(NULL, ".", &posn);
}
}
while ((first = strtok_s(NULL, ",", &next)) != NULL);
[1.2]
[1]
[2]
[3.5]
[3]
[5]
[4.2]
[4]
[2]
The strchr and strrchr functions find a character in a string, that is, in a NUL-terminated character
array. strchr returns a pointer to the first occurrence and strrchr to the last one.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv)
{
char toSearchFor = 'A';
}
{
char *firstOcc = strchr(argv[1], toSearchFor);
if (firstOcc != NULL)
{
printf("First position of %c in %s is %td.\n",
toSearchFor, argv[1], firstOcc-argv[1]); /* A pointer difference's result
is a signed integer and uses the length modifier 't'. */
}
else
{
printf("%c is not in %s.\n", toSearchFor, argv[1]);
}
}
{
char *lastOcc = strrchr(argv[1], toSearchFor);
if (lastOcc != NULL)
{
printf("Last position of %c in %s is %td.\n",
toSearchFor, argv[1], lastOcc-argv[1]);
}
}
return EXIT_SUCCESS;
}
$ ./pos AAAAAAA
First position of A in AAAAAAA is 0.
Last position of A in AAAAAAA is 6.
$ ./pos BAbbbbbAccccAAAAzzz
First position of A in BAbbbbbAccccAAAAzzz is 1.
Last position of A in BAbbbbbAccccAAAAzzz is 15.
$ ./pos qwerty
A is not in qwerty.
One common use for strrchr is to extract a file name from a path. For example to extract
myfile.txt from C:\Users\eak\myfile.txt:
return NULL;
}
If we know the length of the string, we can use a for loop to iterate over its characters:
char * string = "hello world"; /* This is 11 chars long, excluding the 0-terminator. */
size_t i = 0;
for (; i < 11; i++) {
printf("%c\n", string[i]); /* Print each character of the string. */
}
Alternatively, we can use the standard function strlen() to get the length of a string if we don't
know what the string is:
Finally, we can take advantage of the fact that strings in C are guaranteed to be null-terminated
(which we already did when passing it to strlen() in the previous example ;-)). We can iterate over
the array regardless of its size and stop iterating once we reach a null-character:
size_t i = 0;
while (string[i] != '\0') { /* Stop looping when we reach the null-character. */
printf("%c\n", string[i]); /* Print each character of the string. */
i++;
}
We can create strings using string literals, which are sequences of characters surrounded by
double quotation marks; for example, take the string literal "hello world". String literals are
automatically null-terminated.
We can create strings using several methods. For instance, we can declare a char * and initialize
it to point to the first character of a string:
When initializing a char * to a string constant as above, the string itself is usually allocated in read-
only data; string is a pointer to the first element of the array, which is the character 'h'.
Since the string literal is allocated in read-only memory, it is non-modifiable1. Any attempt to
modify it will lead to undefined behaviour, so it's better to add const to get a compile-time error like
this
To create a modifiable string, you can declare a character array and initialize its contents using a
string literal, like so:
char modifiable_string[] = {'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '\0'};
Since the second version uses brace-enclosed initializer, the string is not automatically null-
terminated unless a '\0' character is included explicitly in the character array usually as its last
element.
1 Non-modifiable implies that the characters in the string literal can't be modified, but remember that the pointer
string can be modified (can point somewhere else or can be incremented or decremented).
2 Both strings have similar effect in a sense that characters of both strings can't be modified. It should be noted that
string is a pointer to char and it is a modifiable l-value so it can be incremented or point to some other location while
the array string_arr is a non-modifiable l-value, it can't be modified.
char * string_array[] = {
"foo",
"bar",
"baz"
};
Remember: when we assign string literals to char *, the strings themselves are allocated in read-
only memory. However, the array string_array is allocated in read/write memory. This means that
we can modify the pointers in the array, but we cannot modify the strings they point to.
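As a sketch of that distinction, the hypothetical helper below exchanges two pointers in the array, which is allowed, and marks in a comment the kind of write that is not:

```c
#include <stddef.h>

/* The array itself lives in writable storage; each element points
   to a string literal that must not be modified. */
static char *string_array[] = { "foo", "bar", "baz" };

/* Hypothetical helper: exchanges two of the pointers in the array. */
static void swap_entries(size_t i, size_t j)
{
    char *tmp = string_array[i];

    string_array[i] = string_array[j];
    string_array[j] = tmp;
    /* string_array[i][0] = 'x';   undefined behavior: writes to a literal */
}
```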
In C, the argv parameter of main (the array of command-line arguments passed when the program
was run) is an array of char *: char * argv[].
We can also create arrays of character arrays. Since strings are arrays of characters, an array of
strings is simply an array whose elements are arrays of characters:
char modifiable_string_array_literals[][4] = {
"foo",
"bar",
"baz"
};
This is equivalent to:
char modifiable_string_array[][4] = {
{'f', 'o', 'o', '\0'},
{'b', 'a', 'r', '\0'},
{'b', 'a', 'z', '\0'}
};
Note that we specify 4 as the size of the second dimension of the array; each of the strings in our
array is actually 4 bytes since we must include the null-terminating character.
strstr
return -1;
}
strstr searches the haystack (first) argument for the string pointed to by needle. If found, strstr
returns the address of the occurrence. If it could not find needle, it returns NULL. We use zbpos so
that we don't keep finding the same needle over and over again. In order to skip the first instance,
we add an offset of zbpos. A Notepad clone might call findnext like this, in order to implement its
"Find Next" dialogue:
/*
Called when the user clicks "Find Next"
doc: The text of the document to search
findwhat: The string to find
*/
void onfindnext(const char *doc, const char *findwhat)
{
static int i;
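The definition of findnext is not reproduced in full here. A minimal sketch consistent with the description above follows; the parameter order (haystack, needle, zbpos) is an assumption and may differ from the original:

```c
#include <string.h>

/* Returns the zero-based position of the next match of needle in
   haystack at or after zbpos, or -1 if there is none.
   Assumes zbpos <= strlen(haystack). */
static int findnext(const char *haystack, const char *needle, size_t zbpos)
{
    const char *p = strstr(haystack + zbpos, needle);

    if (p != NULL)
        return (int)(p - haystack);
    return -1;
}
```

A caller that wants the match after the current one would pass the previous result plus one as zbpos.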
String literals
String literals represent null-terminated, static-duration arrays of char. Because they have static
storage duration, a string literal or a pointer to the same underlying array can safely be used in
several ways that a pointer to an automatic array cannot. For example, returning a string literal
from a function has well-defined behavior:
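A small sketch of such a function; the name get_hello and its parameter are hypothetical:

```c
/* The literals returned here have static storage duration, so the
   pointers stay valid after the function returns. */
static const char *get_hello(int formal)
{
    return formal ? "Good day" : "hello";
}
```

Returning the address of a local char array instead would leave the caller with a dangling pointer.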
For historical reasons, the elements of the array corresponding to a string literal are not formally
const. Nevertheless, any attempt to modify them has undefined behavior. Typically, a program that
attempts to modify the array corresponding to a string literal will crash or otherwise malfunction.
On the other hand, a pointer to or into the underlying array of a string literal is not itself inherently
special; its value can freely be modified to point to something else:
Furthermore, although initializers for char arrays can have the same form as string literals, use of
such an initializer does not confer the characteristics of a string literal on the initialized array. The
initializer simply designates the length and initial contents of the array. In particular, the elements
are modifiable if not explicitly declared const:
You can call memset to zero out a string (or any other memory block). In the example below, str is
the string to zero out and n is the number of bytes to clear.
#include <stdlib.h> /* For EXIT_SUCCESS */
#include <stdio.h>
#include <string.h>
int main(void)
{
    char str[42] = "fortytwo";
    size_t n = sizeof str; /* Take the size not the length. */
    printf("'%s'\n", str);
    memset(str, '\0', n); /* Zero out the whole array. */
    printf("'%s'\n", str);
    return EXIT_SUCCESS;
}
Prints:
'fortytwo'
''
Another example:
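#include <stdlib.h> /* For EXIT_SUCCESS */
#include <stdio.h>
#include <string.h>
#define FORTY_STR "forty"
#define TWO_STR "two"
int main(void)
{
    char str[42] = FORTY_STR TWO_STR;
    size_t n = sizeof str; /* Take the size not the length. */
    char * point_to_two = strstr(str, TWO_STR);
    printf("'%s'\n", str);
    /* Zero out everything from "two" to the end of the array. */
    memset(point_to_two, '\0', n - (size_t)(point_to_two - str));
    printf("'%s'\n", str);
    memset(str, '\0', n); /* Zero out the whole array. */
    printf("'%s'\n", str);
    return EXIT_SUCCESS;
}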
Prints:
'fortytwo'
'forty'
''
Given a string, strspn calculates the length of the initial substring (span) consisting solely of a
specific list of characters. strcspn is similar, except it calculates the length of the initial substring
consisting of any characters except those listed:
/*
Provided a string of "tokens" delimited by "separators", print the tokens along
with the token separators that get skipped.
*/
#include <stdio.h>
#include <string.h>
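int main(void)
{
    const char sepchars[] = ",.;!?";
    char foo[] = ";ball call,.fall gall hall!?.,";
    char *s;
    int n;
    for (s = foo; *s != '\0'; /* empty */) {
        /* Get the number of token separator characters. */
        n = (int)strspn(s, sepchars);
        if (n > 0)
            printf("skipping separators: << %.*s >> (length=%d)\n", n, s, n);
        /* Actually skip the separators now. */
        s += n;
        /* Get the number of token (non-separator) characters. */
        n = (int)strcspn(s, sepchars);
        if (n > 0)
            printf("token found: << %.*s >> (length=%d)\n", n, s, n);
        /* Skip the token now. */
        s += n;
    }
    return 0;
}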
Analogous functions using wide-character strings are wcsspn and wcscspn; they're used the same
way.
Copying strings
#include <stdio.h>
int main(void) {
int a = 10, b;
char c[] = "abc", *d;
b = a; /* Integer is copied */
a = 20; /* Modifying a leaves b unchanged - b is a 'deep copy' of a */
printf("%d %d\n", a, b); /* "20 10" will be printed */
d = c;
/* Only copies the address of the string -
there is still only one string stored in memory */
c[1] = 'x';
/* Modifies the original string - d[1] = 'x' will do exactly the same thing */
return 0;
}
The above example compiled because we used char *d rather than char d[3]. Using the latter
would cause a compiler error. You cannot assign to arrays in C.
#include <stdio.h>
int main(void) {
char a[] = "abc";
char b[8];
b = a; /* compile error */
printf("%s\n", b);
return 0;
}
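To actually copy strings, the strcpy() function is available in string.h. Enough space must be
allocated for the destination before copying.
#include <stdio.h>
#include <string.h>
int main(void) {
    char a[] = "abc";
    char b[8];
    strcpy(b, a); /* b has room for all 4 bytes of a, including the '\0' */
    printf("%s\n", b); /* "abc" will be printed */
    return 0;
}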
C99
snprintf()
To avoid buffer overruns, snprintf() may be used. It is not the best solution performance-wise, since
it has to parse the format string, but it is the only bounds-checked function for copying strings
readily available in the standard library that can be used without any extra steps.
#include <stdio.h>
#include <string.h>
int main(void) {
    char a[] = "012345678901234567890";
    char b[8];
#if 0
    strcpy(b, a); /* causes buffer overrun (undefined behavior), so do not execute this here! */
#endif
    snprintf(b, sizeof(b), "%s", a); /* writes at most sizeof(b) - 1 characters plus the '\0' */
    printf("%s\n", b); /* "0123456" will be printed */
    return 0;
}
strncat()
A second option, with better performance, is to use strncat() (a buffer overflow checking version
of strcat()) - it takes a third argument that tells it the maximum number of bytes to copy:
char dest[32];
dest[0] = '\0';
strncat(dest, source, sizeof(dest) - 1);
/* copies up to the first (sizeof(dest) - 1) elements of source into dest,
then puts a \0 on the end of dest */
Note that this formulation uses sizeof(dest) - 1; this is crucial because strncat() always adds a
null byte (good), but doesn't count that in the size of the string (a cause of confusion and buffer
overwrites).
Also note that the alternative — concatenating after a non-empty string — is even more fraught.
Consider:
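A sketch of that situation, wrapped in a hypothetical helper so that the length arithmetic is visible. The buffer contents in the usage note are chosen purely for illustration:

```c
#include <string.h>

/* Appends src to the string already in dst without writing past
   dst_size bytes. Assumes dst holds a null-terminated string
   shorter than dst_size. */
static void append_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(dst);

    /* Space left in dst, not counting the terminating null byte. */
    strncat(dst, src, dst_size - len - 1);
}
```

With char dst[24] = "Clownfish: " and the source "Marvin and Nemo", only 12 more characters fit, so dst ends up holding the 23-character string "Clownfish: Marvin and N".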
Note, though, that the size specified as the length was not the size of the destination array, but the
amount of space left in it, not counting the terminal null byte. This can cause big overwriting
problems. It is also a bit wasteful; to specify the length argument correctly, you know the length of
the data in the destination, so you could instead specify the address of the null byte at the end of
the existing content, saving strncat() from rescanning it:
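A sketch of that variant, again as a hypothetical helper with the same assumptions about dst:

```c
#include <string.h>

/* Same contract as a bounded append, but hands strncat() the address
   of the existing null byte instead of the start of the buffer. */
static void append_at_end(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(dst);

    /* dst + len points at the current '\0', so strncat() starts
       appending there without rescanning dst. */
    strncat(dst + len, src, dst_size - len - 1);
}
```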
This produces the same output as before, but strncat() doesn't have to scan over the existing
content of dst before it starts copying.
strncpy()
The last option is the strncpy() function. Although you might think it should come first, it is a rather
deceptive function that has two main gotchas:
1. If copying via strncpy() hits the buffer limit, a terminating null-character won't be written.
2. strncpy() always completely fills the destination, with null bytes if necessary.
(This quirky behavior is historical: the function was originally intended for handling UNIX file names.)
Even then, if you have a big buffer it becomes very inefficient to use strncpy() because of
additional null padding.
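If strncpy() is used anyway, the first gotcha can be handled by terminating the destination by hand. A minimal sketch, with a helper name chosen here for illustration:

```c
#include <string.h>

/* Copies at most dst_size - 1 characters and always terminates dst.
   Assumes dst_size > 0. */
static void copy_truncating(char *dst, size_t dst_size, const char *src)
{
    strncpy(dst, src, dst_size - 1); /* pads with '\0' if src is short */
    dst[dst_size - 1] = '\0';        /* needed when src is too long */
}
```

The second gotcha remains: a short source still causes the rest of the buffer to be filled with null bytes.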
Warning: The functions atoi, atol, atoll and atof are inherently unsafe, because: If the value of
the result cannot be represented, the behavior is undefined. (7.20.1p1)
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
    int val;
    if (argc < 2)
        return EXIT_FAILURE; /* no argument given */
    val = atoi(argv[1]);
    printf("String value = %s, Int value = %d\n", argv[1], val);
    return 0;
}
When the string to be converted is a valid decimal integer that is in range, the function works:
$ ./atoi 100
String value = 100, Int value = 100
$ ./atoi 200
String value = 200, Int value = 200
For strings that start with a number, followed by something else, only the initial number is parsed:
$ ./atoi 0x200
0
$ ./atoi 0123x300
123
$ ./atoi hello
Formatting the hard disk...
Because of the ambiguities above and this undefined behavior, the atoi family of functions should
never be used.
C99
#include <stdio.h>
int main ()
{
char buffer [50];
double PI = 3.1415926;
sprintf (buffer, "PI = %.7f", PI);
printf ("%s\n",buffer);
return 0;
}
Read formatted data from string
#include <stdio.h>
int main ()
{
char sentence []="date : 06-06-2012";
char str [50];
int year;
int month;
int day;
sscanf (sentence,"%49s : %2d-%2d-%4d", str, &day, &month, &year); /* %49s keeps str from overflowing */
printf ("%s -> %02d-%02d-%4d\n",str, day, month, year);
return 0;
}
C99
Since C99 the C library has a set of safe conversion functions that interpret a string as a number.
Their names are of the form strtoX, where X is one of l, ul, d, etc. to determine the target type of
the conversion.
/* At this point we know that everything went fine so ret may be used */
If the string in fact contains no number at all, this usage of strtod returns 0.0.
If this is not satisfactory, the additional parameter endptr can be used. It is a pointer to pointer that
will be pointed to the end of the detected number in the string. If it is set to 0, as above, or NULL, it
is simply ignored.
The endptr parameter indicates whether a conversion was successful and, if so,
where the number ended:
char *check = 0;
double ret = strtod(argv[1], &check); /* attempt conversion */
/* At this point we know that everything went fine so ret may be used */
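The checks between the conversion and the final comment are not shown above. A minimal sketch of what they might look like follows; the helper name parse_double and the decision to reject trailing characters are assumptions:

```c
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/* Returns 1 and stores the value in *out on success, 0 otherwise. */
static int parse_double(const char *s, double *out)
{
    char *check = 0;
    double ret;

    errno = 0;
    ret = strtod(s, &check);       /* attempt conversion */

    if (check == s)
        return 0;                  /* no number found at all */
    if ((ret == HUGE_VAL || ret == -HUGE_VAL) && errno == ERANGE)
        return 0;                  /* numeric overflow */
    if (*check != '\0')
        return 0;                  /* trailing characters after the number */

    *out = ret;                    /* everything went fine */
    return 1;
}
```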
These functions have a third parameter nbase that holds the number base in which the number is
written.
long a = strtol("101", 0, 2 ); /* a = 5L */
long b = strtol("101", 0, 8 ); /* b = 65L */
long c = strtol("101", 0, 10); /* c = 101L */
long d = strtol("101", 0, 16); /* d = 257L */
long e = strtol("101", 0, 0 ); /* e = 101L */
long f = strtol("0101", 0, 0 ); /* f = 65L */
long g = strtol("0x101", 0, 0 ); /* g = 257L */
The special value 0 for nbase means the string is interpreted in the same way as number literals
are interpreted in a C program: a prefix of 0x corresponds to a hexadecimal representation,
otherwise a leading 0 is octal and all other numbers are seen as decimal.
Thus the most practical way to interpret a command-line argument as a number would be
...
return EXIT_SUCCESS;
}
This means that the program can be called with a parameter in octal, decimal or hexadecimal.
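The conversion step of such a program can be sketched as a helper that would be called with argv[1]. The name parse_long and the error policy are assumptions made for illustration:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Returns 1 and stores the value in *out on success, 0 otherwise. */
static int parse_long(const char *s, long *out)
{
    char *end = 0;
    long val;

    errno = 0;
    val = strtol(s, &end, 0);      /* base 0: accepts 0x.., 0.. and decimal */

    if (end == s || *end != '\0')
        return 0;                  /* nothing converted, or trailing junk */
    if ((val == LONG_MAX || val == LONG_MIN) && errno == ERANGE)
        return 0;                  /* out of range for long */

    *out = val;
    return 1;
}
```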
Chapter 53: Structs
Introduction
Structures provide a way to group a set of related variables of diverse types into a single unit of
memory. The structure as a whole can be referenced by a single name or pointer; the structure
members can be accessed individually too. Structures can be passed to functions and returned
from functions. They are defined using the keyword struct.
Examples
Simple data structures
Structure data types are a useful way to package related data and have them behave like a single
variable.
struct point
{
int x;
int y;
};
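A short sketch of how such a type might be used. The helper make_point is hypothetical; it shows that a structure can be filled member by member and then returned or assigned as a single value:

```c
struct point
{
    int x;
    int y;
};

/* Hypothetical helper: builds a point and returns it by value. */
static struct point make_point(int x, int y)
{
    struct point p;

    p.x = x;   /* members are accessed with the dot operator */
    p.y = y;
    return p;  /* the whole structure is copied to the caller */
}
```

Assigning one struct point to another copies both members, so later changes to the copy do not affect the original.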
Typedef Structs
Combining typedef with struct can make code clearer. For example:
typedef struct
{
int x, y;
} Point;
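as opposed to:
struct Point
{
int x, y;
};
With the typedef, a variable can be declared as:
Point point;
instead of:
struct Point point;
Even better is to declare both a tag and a typedef of the same name:
typedef struct Point Point;
struct Point
{
int x, y;
};
to have the advantage of both possible definitions of point. Such a declaration is most convenient if
you learned C++ first, where you may omit the struct keyword if the name is not ambiguous.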
typedef names for structs could conflict with other identifiers in other parts of the program.
Some consider this a disadvantage, but for most people having a struct and another identifier with the
same name is quite disturbing. A notorious example is POSIX's stat:
int stat(const char *pathname, struct stat *buf);
where you see a function stat that has one argument that is a pointer to struct stat.
typedef'd structs without a tag name always require that the whole struct declaration is visible to
code that uses it. The entire struct declaration must then be placed in a header file.
Consider:
#include "bar.h"
struct foo
{
bar *aBar;
};
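So with a typedef'd struct that has no tag name, the bar.h file always has to include the whole
definition of bar. If we use a struct with a tag name instead, bar.h only needs a forward declaration
and the details of the bar structure can be hidden.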
See Typedef
Pointers to structs
When you have a variable containing a struct, you can access its fields using the dot operator (.).
However, if you have a pointer to a struct, this will not work. You have to use the arrow operator (->)
to access its fields. Here's an example of a terribly simple (some might say "terrible and simple")
implementation of a stack that uses pointers to structs and demonstrates the arrow operator.
#include <stdlib.h>
#include <stdio.h>
/* structs */
struct stack
{
struct node *top;
int size;
};
struct node
{
int data;
struct node *next;
};
/* function declarations */
int push(int, struct stack*);
int pop(struct stack*);
void destroy(struct stack*);
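int main(void)
{
int result = EXIT_SUCCESS;
size_t i;
/* allocate memory for the stack */
struct stack *stack = malloc(sizeof *stack);
if (NULL == stack)
{
perror("malloc() failed");
return EXIT_FAILURE;
}
/* initialize stack */
stack->top = NULL;
stack->size = 0;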
/* push 10 ints */
{
int data = 0;
for(i = 0; i < 10; i++)
{
printf("Pushing: %d\n", data);
if (-1 == push(data, stack))
{
perror("push() failed");
result = EXIT_FAILURE;
break;
}
++data;
}
}
if (EXIT_SUCCESS == result)
{
/* pop 5 ints */
for(i = 0; i < 5; i++)
{
printf("Popped: %i\n", pop(stack));
}
}
/* destroy stack */
destroy(stack);
return result;
}
return result;
}
stack->size--;
free(top);
return data;
}
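The definitions of push(), pop() and destroy() survive only as fragments here. A minimal sketch of how they might be written follows, assuming the stack object itself was obtained from malloc (so that destroy() may free it) and that pop() is never called on an empty stack:

```c
#include <stdlib.h>

struct node
{
    int data;
    struct node *next;
};

struct stack
{
    struct node *top;
    int size;
};

/* Pushes data; returns 0 on success, -1 if allocation fails. */
int push(int data, struct stack *stack)
{
    struct node *new_node = malloc(sizeof *new_node);

    if (new_node == NULL)
        return -1;
    new_node->data = data;
    new_node->next = stack->top;   /* arrow operator: stack is a pointer */
    stack->top = new_node;
    stack->size++;
    return 0;
}

/* Pops the top value. Assumes the stack is not empty. */
int pop(struct stack *stack)
{
    struct node *top = stack->top;
    int data = top->data;

    stack->top = top->next;
    stack->size--;
    free(top);
    return data;
}

/* Frees all remaining nodes and then the stack itself. */
void destroy(struct stack *stack)
{
    while (stack->top != NULL)
        pop(stack);
    free(stack);
}
```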
C99
Type Declaration
A structure with at least one member may additionally contain a single array member of
unspecified length at the end of the structure. This is called a flexible array member:
struct ex1
{
size_t foo;
int flex[];
};
struct ex2_header
{
int foo;
char bar;
};
struct ex2
{
struct ex2_header hdr;
int flex[];
};
/* Prints "8,8" on my machine, so there is no padding. */
printf("%zu,%zu\n", sizeof(size_t), sizeof(struct ex1));
/* Also prints "8,8" on my machine, so there is no padding in the ex2 structure itself. */
printf("%zu,%zu\n", sizeof(struct ex2_header), sizeof(struct ex2));
The flexible array member is considered to have an incomplete array type, so its size cannot be
calculated using sizeof.
Usage
You can declare and initialize an object with a structure type containing a flexible array member,
but you must not attempt to initialize the flexible array member since it is treated as if it does not
exist. It is forbidden to try to do this, and compile errors will result.
Similarly, you should not attempt to assign a value to any element of a flexible array member when
declaring a structure in this way since there may not be enough padding at the end of the structure
to allow for any objects required by the flexible array member. The compiler will not necessarily
prevent you from doing this, however, so this can lead to undefined behavior.
You may instead choose to use malloc, calloc, or realloc to allocate the structure with extra
storage and later free it, which allows you to use the flexible array member as you wish:
/* valid: allocate an object of structure type `ex1` along with an array of 2 ints */
struct ex1 *pe1 = malloc(sizeof(*pe1) + 2 * sizeof(pe1->flex[0]));
/* valid: allocate an object of structure type ex2 along with an array of 4 ints */
struct ex2 *pe2 = malloc(sizeof(struct ex2) + sizeof(int[4]));
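/* allocates room for 5 ex3 objects plus 3 ints each; note that pe3[i] advances by
sizeof(*pe3) only, so only pe3[0] can safely use its flexible array member this way */
struct ex3 *pe3 = malloc(5 * (sizeof(*pe3) + sizeof(int[3])));
pe1->flex[0] = 3; /* valid */
pe3[0].flex[0] = pe1->flex[0]; /* valid; pe3[0] is a structure, so use '.' rather than '->' */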
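The allocation pattern can be wrapped in a small function. This is a sketch under the assumption that foo is used to record the element count, which the text above does not require; the name ex1_new is hypothetical:

```c
#include <stdlib.h>

struct ex1
{
    size_t foo;
    int flex[];
};

/* Allocates an ex1 with room for n ints in flex; returns NULL on failure.
   Here foo records n so callers know how much of flex is usable. */
static struct ex1 *ex1_new(size_t n)
{
    struct ex1 *p = malloc(sizeof *p + n * sizeof p->flex[0]);

    if (p != NULL) {
        size_t i;

        p->foo = n;
        for (i = 0; i < n; i++)
            p->flex[i] = 0;
    }
    return p;
}
```

The object is released with a single free(), since the structure and its array share one allocation.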
C99
Flexible array members did not exist prior to C99 and are treated as errors. A common
workaround is to declare an array of length 1, a technique called the 'struct hack':
struct ex1
{
size_t foo;
int flex[1];
};
This will affect the size of the structure, however, unlike a true flexible array member:
To use the flex member as a flexible array member, you'd allocate it with malloc as shown above,
except that sizeof(*pe1) (or the equivalent sizeof(struct ex1)) would be replaced with
offsetof(struct ex1, flex) or the longer, type-agnostic expression sizeof(*pe1)-sizeof(pe1->flex).
Alternatively, you might subtract 1 from the desired length of the "flexible" array since it's already
included in the structure size, assuming the desired length is greater than 0. The same logic may
be applied to the other usage examples.
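A sketch of the offsetof form of the struct hack. The type is renamed ex1_hack here to keep the snippet self-contained, and the lower bound on the allocation size is a precaution added in this sketch, not something the text above prescribes. Strictly speaking, indexing flex beyond its declared single element is a long-standing idiom rather than something the standard sanctions:

```c
#include <stddef.h>
#include <stdlib.h>

struct ex1_hack
{
    size_t foo;
    int flex[1];   /* pre-C99 "struct hack": one element is always present */
};

/* Allocates room for n ints in flex; returns NULL on failure. */
static struct ex1_hack *ex1_hack_new(size_t n)
{
    /* offsetof gives the size of everything before flex, so the
       built-in element is not counted twice. */
    size_t bytes = offsetof(struct ex1_hack, flex) + n * sizeof(int);

    /* Never allocate less than one complete structure. */
    if (bytes < sizeof(struct ex1_hack))
        bytes = sizeof(struct ex1_hack);
    return malloc(bytes);
}
```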
Compatibility
If compatibility with compilers that do not support flexible array members is desired, you may use a
macro defined like FLEXMEMB_SIZE below:
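#if __STDC_VERSION__ < 199901L
#define FLEXMEMB_SIZE 1
#else
#define FLEXMEMB_SIZE /* nothing */
#endif
struct ex1
{
size_t foo;
int flex[FLEXMEMB_SIZE];
};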
When allocating objects, you should use the offsetof(struct ex1, flex) form to refer to the
structure size (excluding the flexible array member) since it is the only expression that will remain
consistent between compilers that support flexible array members and compilers that do not:
The alternative is to use the preprocessor to conditionally subtract 1 from the specified length. Due
to the increased potential for inconsistency and general human error in this form, the logic is moved
into a separate function:
struct ex1 *ex1_alloc(size_t n) /* function name chosen for illustration */
{
    struct ex1 tmp;
#if __STDC_VERSION__ < 199901L
    if (n != 0)
        n--;
#endif
    return malloc(sizeof(tmp) + n * sizeof(tmp.flex[0]));
}
...
In C, all arguments are passed to functions by value, including structs. For small structs, this is a
good thing as it means there is no overhead from accessing the data through a pointer. However,
it also makes it very easy to accidentally pass a huge struct resulting in poor performance,
particularly if the programmer is used to other languages where arguments are passed by
reference.
struct coordinates
{
int x;
int y;
int z;
};
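The difference between passing the structure itself and passing its address can be sketched with two hypothetical functions:

```c
struct coordinates
{
    int x;
    int y;
    int z;
};

/* Receives a copy of the whole structure: changes made here are
   not visible to the caller. */
static void move_copy(struct coordinates c)
{
    c.x += 10;
    (void)c;
}

/* Receives only the address: nothing but a pointer is copied, and
   the caller's object is modified. */
static void move_in_place(struct coordinates *c)
{
    c->x += 10;
}
```

If the callee should not modify the object, the parameter can be declared as const struct coordinates * to keep the cheap call while documenting the intent.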
Object-based programming using structs
Structs may be used to implement code in an object-oriented manner. A struct is similar to a class
but lacks the functions that normally also form part of a class; we can add these as function
pointer member variables. To stay with our coordinates example:
/* coordinates.h */
/* Constructor */
coordinate *coordinate_create(void);
/* Destructor */
void coordinate_destroy(coordinate *this);
/* coordinates.c */
#include "coordinates.h"
#include <stdio.h>
#include <stdlib.h>
/* Constructor */
coordinate *coordinate_create(void)
{
coordinate *c = malloc(sizeof(*c));
if (c != 0)
{
c->setx = &coordinate_setx;
c->sety = &coordinate_sety;
c->print = &coordinate_print;
c->x = 0;
c->y = 0;
}
return c;
}
/* Destructor */
void coordinate_destroy(coordinate *this)
{
if (this != NULL)
{
free(this);
}
}
/* Methods */
static void coordinate_setx(coordinate *this, int x)
{
if (this != NULL)
{
this->x = x;
}
}
/* main.c */
#include "coordinates.h"
#include <stddef.h>
int main(void)
{
/* Create and initialize pointers to coordinate objects */
coordinate *c1 = coordinate_create();
coordinate *c2 = coordinate_create();
/* Now we can use our objects using our methods and passing the object as parameter */
c1->setx(c1, 1);
c1->sety(c1, 2);
c2->setx(c2, 3);
c2->sety(c2, 4);
c1->print(c1);
c2->print(c2);
/* After using our objects we destroy them using our "destructor" function */
coordinate_destroy(c1);
c1 = NULL;
coordinate_destroy(c2);
c2 = NULL;
return 0;
}
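The type definition and some of the methods are not reproduced above. A compact single-file sketch of the same pattern follows; the struct tag coordinate_s and the exact output format of print are assumptions:

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct coordinate_s coordinate;

struct coordinate_s
{
    /* "Methods": pointers to functions that take the object itself */
    void (*setx)(coordinate *this, int x);
    void (*sety)(coordinate *this, int y);
    void (*print)(coordinate *this);
    /* Data */
    int x;
    int y;
};

static void coordinate_setx(coordinate *this, int x)
{
    if (this != NULL)
        this->x = x;
}

static void coordinate_sety(coordinate *this, int y)
{
    if (this != NULL)
        this->y = y;
}

static void coordinate_print(coordinate *this)
{
    if (this != NULL)
        printf("Coordinate: x=%d, y=%d\n", this->x, this->y);
}

/* Constructor: allocates the object and wires up its methods. */
coordinate *coordinate_create(void)
{
    coordinate *c = malloc(sizeof *c);

    if (c != NULL) {
        c->setx = coordinate_setx;
        c->sety = coordinate_sety;
        c->print = coordinate_print;
        c->x = 0;
        c->y = 0;
    }
    return c;
}

/* Destructor: free(NULL) is a no-op, so no check is needed. */
void coordinate_destroy(coordinate *this)
{
    free(this);
}
```

Because C has no implicit this, every call passes the object explicitly, as in c1->setx(c1, 1).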
Read Structs online: https://riptutorial.com/c/topic/1119/structs
Chapter 54: Structure Padding and Packing
Introduction
By default, C compilers lay out structures so that each member can be accessed fast, without
incurring penalties for unaligned access, a problem with RISC machines such as the DEC Alpha
and some ARM CPUs.
Depending on the CPU architecture and the compiler, a structure may occupy more space in
memory than the sum of the sizes of its component members. The compiler can add padding
between members or at the end of the structure, but not at the beginning.
Remarks
Eric Raymond has an article on The Lost Art of C Structure Packing which is useful reading.
Examples
Packing structures
By default structures are padded in C. If you want to avoid this behaviour, you have to explicitly
request it. Under GCC it's __attribute__((__packed__)). Consider this example on a 64-bit machine:
struct foo {
char *p; /* 8 bytes */
char c; /* 1 byte */
long x; /* 8 bytes */
};
The structure will be automatically padded to have 8-byte alignment and will look like this:
struct foo {
char *p; /* 8 bytes */
char c; /* 1 byte */
char pad[7]; /* 7 bytes added by compiler */
long x; /* 8 bytes */
};
So sizeof(struct foo) will give us 24 instead of 17. This happens because, on this platform, long x
must start at an address that is a multiple of 8. After the single byte of char c is written, the
next seven bytes are left as padding so that x begins on the following 8-byte boundary; those
padding bytes hold no data and are not used by any member.
Structure packing
But if you add the packed attribute, the compiler will not add padding, and sizeof(struct foo)
becomes 17 on this machine. Packed structures are generally used:
• To save space.
• To format a data structure to transmit over a network without depending on the architecture
alignment of each node of the network.
It must be taken into consideration that some processors, such as the ARM Cortex-M0, do not allow
unaligned memory access; in such cases, structure packing can lead to undefined behaviour and
can crash the CPU.
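A sketch of the packed declaration next to the default one, using the GCC/Clang attribute spelling mentioned earlier. The name foo_packed is chosen here so that both layouts can be compared in one file; other compilers use different syntax:

```c
#include <stddef.h>

/* Default layout: the compiler may insert padding before x. */
struct foo
{
    char *p;
    char c;
    long x;
};

/* GCC/Clang extension: no padding is inserted between members. */
struct __attribute__((__packed__)) foo_packed
{
    char *p;
    char c;
    long x;   /* starts right after c, so it may be misaligned */
};
```

With this attribute, sizeof(struct foo_packed) equals the sum of the member sizes, and offsetof(struct foo_packed, x) is sizeof(char *) + 1.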
Structure padding
struct test_32 {
int a; // 4 byte
short b; // 2 byte
int c; // 4 byte
} str_32;
We might expect this struct to occupy only 10 bytes of memory, but by printing sizeof(str_32) we
see it uses 12 bytes.
This happened because the compiler aligns variables for fast access. A common pattern is that
when the base type occupies N bytes (where N is a power of 2 such as 1, 2, 4, 8, 16 — and
seldom any bigger), the variable should be aligned on an N-byte boundary (a multiple of N bytes).
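For the structure shown with sizeof(int) == 4 and sizeof(short) == 2, a common layout is:
• int a; stored at offset 0; size 4.
• short b; stored at offset 4; size 2.
• unnamed padding at offset 6; size 2.
• int c; stored at offset 8; size 4.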
Thus struct test_32 occupies 12 bytes of memory. In this example, there is no trailing padding.
The compiler will ensure that any struct test_32 variables are stored starting on a 4-byte
boundary, so that the members within the structure will be properly aligned for fast access.
Memory allocation functions such as malloc(), calloc() and realloc() are required to ensure that
the pointer returned is sufficiently well aligned for use with any data type, so dynamically allocated
structures will be properly aligned too.
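The layout a particular compiler actually chose can be inspected with offsetof. A small sketch; the function name print_layout is hypothetical, and the numbers printed depend on the implementation:

```c
#include <stddef.h>
#include <stdio.h>

struct test_32
{
    int a;
    short b;
    int c;
};

/* Prints where each member starts and how large the structure is. */
static void print_layout(void)
{
    printf("a: offset %zu, size %zu\n",
           offsetof(struct test_32, a), sizeof(int));
    printf("b: offset %zu, size %zu\n",
           offsetof(struct test_32, b), sizeof(short));
    printf("c: offset %zu, size %zu\n",
           offsetof(struct test_32, c), sizeof(int));
    printf("total size: %zu\n", sizeof(struct test_32));
}
```

Any gap between the end of one member and the offset of the next is padding inserted by the compiler.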
You can end up with odd situations such as on a 64-bit Intel x86_64 processor (e.g. Intel Core i7
— a Mac running macOS Sierra or Mac OS X), where when compiling in 32-bit mode, the
compilers place double aligned on a 4-byte boundary; but, on the same hardware, when compiling
in 64-bit mode, the compilers place double aligned on an 8-byte boundary.
Chapter 55: Testing frameworks
Introduction
Many developers use unit tests to check that their software works as expected. Unit tests check
small units of larger pieces of software, and ensure that the outputs match expectations. Testing
frameworks make unit testing easier by providing set-up/tear-down services and coordinating the
tests.
There are many unit testing frameworks available for C. For example, Unity is a pure C framework.
People quite often use C++ testing frameworks to test C code; there are many C++ test
frameworks too.
Remarks
Test Harness:
1. Link-time substitution
2. Function pointer substitution
3. Preprocessor substitution
4. Combined link-time and function pointer substitution
Note on C++ testing frameworks used in C: Using C++ frameworks for testing a C program is quite
a common practice as explained here.
Examples
CppUTest
CppUTest is an xUnit-style framework for unit testing C and C++. It is written in C++ and aims for
portability and simplicity in design. It has support for memory leak detection, building mocks, and
running its tests alongside Google Test. It comes with helper scripts and sample projects for
Visual Studio and Eclipse CDT.
#include <CppUTest/CommandLineTestRunner.h>
#include <CppUTest/TestHarness.h>
TEST_GROUP(Foo_Group) {};
TEST(Foo_Group, Foo_TestOne) {}
/* Test runner may be provided options, such
as to enable colored output, to run only a
specific test or a group of tests, etc. This
will return the number of failed tests. */
int main(int argc, char **argv)
{
    return CommandLineTestRunner::RunAllTests(argc, argv);
}
A test group may have a setup() and a teardown() method. The setup method is called prior to
each test and the teardown() method is called after. Both are optional and either may be omitted
independently. Other methods and variables may also be declared inside a group and will be
available to all tests of that group.
TEST_GROUP(Foo_Group)
{
size_t data_bytes = 128;
void * data;
void setup()
{
data = malloc(data_bytes);
}
void teardown()
{
free(data);
}
void clear()
{
memset(data, 0, data_bytes);
}
};
Unity is an xUnit-style test framework for unit testing C. It is written completely in C and is portable,
quick, simple, expressive and extensible. It is designed to be especially useful for unit testing
embedded systems.
A simple test case that checks the return value of a function might look as follows:
void test_FunctionUnderTest_should_ReturnFive(void)
{
TEST_ASSERT_EQUAL_INT( 5, FunctionUnderTest() );
}
A full test file might look like:
#include "unity.h"
#include "UnitUnderTest.h" /* The unit to be tested. */
void setUp (void) {} /* Is run before every test, put unit init calls here. */
void tearDown (void) {} /* Is run after every test, put unit clean-up calls here. */
void test_TheFirst(void)
{
    TEST_IGNORE_MESSAGE("Hello world!"); /* Ignore this test but print a message. */
}
int main (void)
{
    UNITY_BEGIN();
    RUN_TEST(test_TheFirst); /* Run the test. */
    return UNITY_END();
}
Unity comes with some example projects, makefiles and some Ruby rake scripts that help make
creating longer test files a bit easier.
CMocka
CMocka is an elegant unit testing framework for C with support for mock objects. It only requires
the standard C library, works on a range of computing platforms (including embedded) and with
different compilers. It has a tutorial on testing with mocks, API documentation, and a variety of
examples.
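#include <stdarg.h>
#include <stddef.h>
#include <setjmp.h>
#include <cmocka.h>
static void null_test_success(void **state) { (void)state; }
static int setup(void **state) { (void)state; return 0; }
static int teardown(void **state) { (void)state; return 0; }
int main(void)
{
    const struct CMUnitTest tests[] = { cmocka_unit_test(null_test_success) };
    /* If setup and teardown functions are not
       needed, then NULL may be passed instead */
    int count_fail_tests =
        cmocka_run_group_tests (tests, setup, teardown);
    return count_fail_tests;
}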
Chapter 56: Threads (native)
Syntax
• #ifndef __STDC_NO_THREADS__
• # include <threads.h>
• #endif
• void call_once(once_flag *flag, void (*func)(void));
• int cnd_broadcast(cnd_t *cond);
• void cnd_destroy(cnd_t *cond);
• int cnd_init(cnd_t *cond);
• int cnd_signal(cnd_t *cond);
• int cnd_timedwait(cnd_t *restrict cond, mtx_t *restrict mtx, const struct timespec
*restrict ts);
• int cnd_wait(cnd_t *cond, mtx_t *mtx);
• void mtx_destroy(mtx_t *mtx);
• int mtx_init(mtx_t *mtx, int type);
• int mtx_lock(mtx_t *mtx);
• int mtx_timedlock(mtx_t *restrict mtx, const struct timespec *restrict ts);
• int mtx_trylock(mtx_t *mtx);
• int mtx_unlock(mtx_t *mtx);
• int thrd_create(thrd_t *thr, thrd_start_t func, void *arg);
• thrd_t thrd_current(void);
• int thrd_detach(thrd_t thr);
• int thrd_equal(thrd_t thr0, thrd_t thr1);
• _Noreturn void thrd_exit(int res);
• int thrd_join(thrd_t thr, int *res);
• int thrd_sleep(const struct timespec *duration, struct timespec* remaining);
• void thrd_yield(void);
• int tss_create(tss_t *key, tss_dtor_t dtor);
• void tss_delete(tss_t key);
• void *tss_get(tss_t key);
• int tss_set(tss_t key, void *val);
Remarks
C11 threads are an optional feature. Their absence can be tested with __STDC_NO_THREADS__.
Currently (Jul 2016) this feature is not yet implemented by all C libraries that otherwise support
C11.
Examples
Start several threads
#include <stdio.h>
#include <threads.h>
#include <stdlib.h>
struct my_thread_data {
double factor;
};
int my_thread_func(void* a) {
struct my_thread_data* d = a;
// do something with d
printf("we found %g\n", d->factor);
// return an success or error code
return d->factor > 1.0;
}
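The code that starts and joins the threads is not shown above. A minimal sketch using thrd_create and thrd_join follows; the thread count, the factor values and the helper name run_threads are chosen here for illustration, and a C library that provides <threads.h> is assumed:

```c
#include <stdio.h>
#include <threads.h>

struct my_thread_data {
    double factor;
};

static int my_thread_func(void *a)
{
    struct my_thread_data *d = a;

    printf("we found %g\n", d->factor);
    return d->factor > 1.0;
}

enum { THREAD_COUNT = 4 };   /* count chosen for illustration */

/* Starts the threads, waits for all of them, and returns how many
   returned a nonzero result; -1 if creating or joining failed. */
static int run_threads(void)
{
    thrd_t id[THREAD_COUNT];
    struct my_thread_data data[THREAD_COUNT];
    int i, started = 0, nonzero = 0, failed = 0;

    for (i = 0; i < THREAD_COUNT; i++) {
        data[i].factor = 0.5 * i;   /* set up before the thread starts */
        if (thrd_create(&id[i], my_thread_func, &data[i]) != thrd_success) {
            failed = 1;
            break;
        }
        started++;
    }
    for (i = 0; i < started; i++) {
        int res = 0;

        if (thrd_join(id[i], &res) != thrd_success)
            failed = 1;
        else
            nonzero += res;
    }
    return failed ? -1 : nonzero;
}
```

Each thread receives a pointer to its own element of data, and the array outlives the threads because they are all joined before run_threads returns.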
In most cases all data that is accessed by several threads should be initialized before the threads
are created. This ensures that all threads start with a clear state and no race condition occurs.
#include <threads.h>
#include <stdio.h>
#include <stdlib.h>
// the user data for this example
double const* Big = 0;
// the flag to protect big, must be global and/or static
static once_flag onceBig = ONCE_INIT;
void destroyBig(void) {
free((void*)Big);
}
void initBig(void) {
// assign to temporary with no const qualification
double* b = malloc(largeNum);
if (!b) {
perror("allocation failed for Big");
exit(EXIT_FAILURE);
}
// now initialize and store Big
initializeBigWithSophisticatedValues(largeNum, b);
Big = b;
// ensure that the space is freed on exit or quick_exit
atexit(destroyBig);
at_quick_exit(destroyBig);
}
The once_flag is used to coordinate different threads that might want to initialize the same data Big.
The call to call_once guarantees that
Chapter 57: Type Qualifiers
Remarks
Type qualifiers are the keywords which describe additional semantics about a type. They are an
integral part of type signatures. They can appear both at the topmost level of a declaration (directly
affecting the identifier) or at sub-levels (relevant to pointers only, affecting the pointed-to values):
Keyword    Remarks
const      Prevents the mutation of the declared object (by appearing at the topmost level)
           or prevents the mutation of the pointed-to value (by appearing next to a pointer
           subtype).
volatile   Informs the compiler that the declared object (at topmost level) or the pointed-to
           value (in pointer subtypes) may change its value as a result of external
           conditions, not only as a result of program control flow.
restrict   An optimization hint, relevant to pointers only. Declares intent that for the lifetime
           of the pointer, no other pointers will be used to access the same pointed-to
           object.
The ordering of type qualifiers with respect to storage class specifiers (static, extern, auto,
register), type modifiers (signed, unsigned, short, long) and type specifiers (int, char, double, etc.) is
not enforced, but the good practice is to put them in the aforementioned order:
Top-level qualifications
/* "a" cannot be mutated by the program but can change as a result of external conditions */
const volatile int a = 5;
/* for the lifetime of "ptr", no other pointer could point to the same "int" object */
int *restrict ptr;
/* neither "s2" (because of top-level const) nor "*s2" can be mutated */
const char *const s2 = "World";
/* "*p" may change its value as a result of external conditions, "**p" and "p" cannot */
char *volatile *p;
/* "q", "*q" and "**q" may change their values as a result of external conditions */
volatile char *volatile *volatile q;
Examples
Unmodifiable (const) variables
The const qualification only means that we don't have the right to change the data. It doesn't mean
that the value cannot change behind our back.
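A minimal sketch of how this can happen. The names doIt, touch and shared are hypothetical; touch stands in for any other call that can reach the same object through a non-const path:

```c
#include <stdbool.h>

static double shared = 1.0;   /* hypothetical object, also reachable elsewhere */

/* Stand-in for "other calls" that modify the object without going through a. */
static void touch(void)
{
    shared += 1.0;
}

/* a points to const double: this function may not write through a,
 * but nothing prevents the pointed-to object from changing meanwhile. */
bool doIt(const double *a)
{
    double rememberA = *a;
    touch();                  /* may or may not modify *a */
    return rememberA == *a;
}
```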
During the execution of the other calls *a might have changed, and so this function may return
either false or true.
Warning
const int a = 0;
int *a_ptr = (int*)&a; /* This conversion must be explicitly done with a cast */
*a_ptr += 10; /* This has undefined behavior */
But doing so is an error that leads to undefined behavior. The difficulty here is that this may
behave as expected in simple examples as this, but then go wrong when the code grows.
Volatile variables
The volatile keyword tells the compiler that the value of the variable may change at any time as a
result of external conditions, not only as a result of program control flow.
The compiler must therefore perform every read and write of a volatile object as written; it may not
cache the value, drop an access, or merge several accesses into one.
#include <stdbool.h>
bool quit = false; /* not volatile: this is the problem discussed below */
int main(void)
{
    ...
    while (!quit) {
        // Do something that does not modify the quit variable
    }
    ...
}
void interrupt_handler(void)
{
    quit = true;
}
The compiler is allowed to notice that the while loop does not modify the quit variable and to convert
the loop to an endless while (true) loop. Even if the quit variable is set in the signal handler for
SIGINT and SIGTERM, the compiler does not know that.
Declaring quit as volatile will tell the compiler to not optimize the loop and the problem will be
solved.
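A hedged sketch of the signal-handler case. The function names are hypothetical, and the flag uses volatile sig_atomic_t rather than bool, since that is the object type the C standard allows a signal handler to write portably:

```c
#include <signal.h>
#include <stdbool.h>

/* volatile: every test of the flag must really re-read the object.
 * sig_atomic_t: can be written from a signal handler as a single access. */
static volatile sig_atomic_t quit = 0;

static void on_signal(int sig)
{
    (void)sig;
    quit = 1;
}

/* Installs the handler for SIGINT and SIGTERM; returns false on failure. */
bool install_quit_handler(void)
{
    return signal(SIGINT, on_signal) != SIG_ERR
        && signal(SIGTERM, on_signal) != SIG_ERR;
}

bool quit_requested(void)
{
    return quit != 0;
}
```

The main loop would then read `while (!quit_requested()) { ... }`, and the compiler may not hoist the read of quit out of the loop.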
The same problem happens when accessing hardware, as we see in this example:
The optimizer may read the variable's value only once: from its point of view there is no need to
reread it, since nothing in the program changes the value. So we end up with an infinite loop. To
force the compiler to do what we want, we modify the declaration to:
uint8_t volatile * pReg = (uint8_t volatile *) 0x1717;
Chapter 58: Typedef
Introduction
The typedef mechanism allows the creation of aliases for other types. It does not create new
types. People often use typedef to improve the portability of code, to give aliases to structure or
union types, or to create aliases for function (or function pointer) types.
In the C standard, typedef is classified as a 'storage class' for convenience; it occurs syntactically
where storage classes such as static or extern could appear.
Syntax
• typedef existing_name alias_name;
Remarks
Disadvantages of Typedef
Also, typedef'd structs without a tag name are a major cause of needless imposition of ordering
relationships among header files.
Consider:
#ifndef FOO_H
#define FOO_H 1
struct foo {
struct bar *bar;
};
#endif
With such a definition, not using typedefs, it is possible for a compilation unit to include foo.h to get
at the FOO_DEF definition. If it doesn't attempt to dereference the bar member of the foo struct then
there will be no need to include the bar.h file.
Typedef vs #define
#define is a C preprocessor directive that can also be used to define aliases for data types, similar
to typedef, but with the following differences:
• typedef is limited to giving symbolic names to types only, whereas #define can be used to
define aliases for values as well.
• Note that #define cptr char * followed by cptr a, b; does not do the same as typedef char
*cptr; followed by cptr a, b;. With the #define, b is a plain char variable, but with the
typedef it is also a pointer.
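The second difference can be observed with sizeof. In this sketch the names CPTR_MACRO, cptr_alias and the two functions are hypothetical:

```c
#include <stddef.h>

#define CPTR_MACRO char *        /* textual substitution */
typedef char *cptr_alias;        /* a real type alias */

/* Expands to "char *a, b;": only a is a pointer, b is a plain char. */
size_t macro_second_size(void)
{
    CPTR_MACRO a, b;
    (void)sizeof a;
    return sizeof b;             /* sizeof(char), which is 1 */
}

/* Both a and b have type char *. */
size_t typedef_second_size(void)
{
    cptr_alias a, b;
    (void)sizeof a;
    return sizeof b;             /* sizeof(char *) */
}
```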
Examples
Typedef for Structures and Unions
Person person;
Compared to the traditional way of declaring structs, programmers no longer need to write struct
every time they declare an instance of that struct.
Note that the name Person (as opposed to struct Person) is not defined until the final semicolon.
Thus for linked lists and tree structures which need to contain a pointer to the same structure type,
you must use either:
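One form that fits here, sketched as an assumption about what the text intends, refers to the type by its struct tag inside its own definition. The helper person_count is hypothetical and only shows the list being walked:

```c
#include <stddef.h>

typedef struct Person {
    char name[32];
    int age;
    struct Person *next;   /* the typedef name Person is not visible yet */
} Person;

/* Hypothetical helper: counts the nodes in a list of Person. */
size_t person_count(const Person *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        ++n;
    return n;
}
```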
or:
typedef struct Person Person; /* makes the name Person usable inside the struct */
struct Person {
    char name[32];
    int age;
    Person *next;
};
typedef union Float Float;
union Float
{
float f;
char b[sizeof(float)];
};
A structure similar to this can be used to analyze the bytes that make up a float value.
Instead of:
/* write once */
typedef long long ll;
typedef struct mystructure mystruct;
This reduces the amount of typing needed if the type is used many times in the program.
Improving portability
The attributes of data types vary across different architectures. For example, an int may be a
2-byte type in one implementation and a 4-byte type in another. Suppose a program needs to use a
4-byte type to run correctly.
In one implementation, let the size of int be 2 bytes and that of long be 4 bytes. In another, let the
size of int be 4 bytes and that of long be 8 bytes. If the program is written using the second
implementation,
For the program to run in the first implementation, all the int declarations will have to be changed
to long.
To avoid this, one can use typedef
Then, only the typedef statement would need to be changed each time, instead of examining the
whole program.
C99
The <stdint.h> header and the related <inttypes.h> header define standard type names (using
typedef) for integers of various sizes, and these names are often the best choice in modern code
that needs fixed size integers. For example, uint8_t is an unsigned 8-bit integer type; int64_t is a
signed 64-bit integer type. The type uintptr_t is an unsigned integer type big enough to hold any
pointer to object. These types are theoretically optional — but it is rare for them not to be
available. There are variants like uint_least16_t (the smallest unsigned integer type with at least
16 bits) and int_fast32_t (the fastest signed integer type with at least 32 bits). Also, intmax_t and
uintmax_t are the largest integer types supported by the implementation. These types are
mandatory.
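A short sketch of combining both ideas: an application-level alias (the name sample_t is hypothetical) defined in terms of a <stdint.h> type, so that only one line changes if the underlying type ever has to:

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical application-level alias for a 32-bit signed quantity. */
typedef int32_t sample_t;

/* Width of sample_t in bits (exact-width types have no padding bits). */
unsigned sample_bits(void)
{
    return (unsigned)(sizeof(sample_t) * CHAR_BIT);
}

/* uint_least16_t is mandatory and has at least 16 bits. */
unsigned least16_bits(void)
{
    return (unsigned)(sizeof(uint_least16_t) * CHAR_BIT);
}
```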
If a set of data has a particular purpose, one can use typedef to give it a meaningful name.
Moreover, if the property of the data changes such that the base type must change, only the
typedef statement would have to be changed, instead of examining the whole program.
We can use typedef to simplify the usage of function pointers. Imagine we have some functions, all
having the same signature, that use their argument to print out something in different ways:
#include <stdio.h>
void print_to_n(int n)
{
for (int i = 1; i <= n; ++i)
printf("%d\n", i);
}
void print_n(int n)
{
    printf("%d\n", n);
}
Now we can use a typedef to create a named function pointer type called printer_t:
This creates a type, named printer_t for a pointer to a function that takes a single int argument
and returns nothing, which matches the signature of the functions we have above. To use it we
create a variable of the created type and assign it a pointer to one of the functions in question:
printer_t p = &print_to_n;
void (*p)(int) = &print_to_n; // This would be required without the type
Thus the typedef allows a simpler syntax when dealing with function pointers. This becomes more
apparent when function pointers are used in more complex situations, such as arguments to
functions.
If a function takes a function pointer as a parameter and no function pointer type has been
defined, the function definition would be:
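A sketch of both spellings side by side. The typedef matches the printer_t type described in the text; foo_raw, foo and the callback remember are hypothetical names used for illustration:

```c
typedef void (*printer_t)(int);

/* Hypothetical callback that records its argument. */
static int last_seen = 0;
static void remember(int n)
{
    last_seen = n;
}

/* Without the typedef, the parameter must spell out the pointer type. */
void foo_raw(void (*printer)(int), int y)
{
    printer(y);
}

/* With the typedef, the same parameter reads like any other. */
void foo(printer_t printer, int y)
{
    printer(y);
}
```

Both functions have the same type; only the readability of the declaration differs.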
Likewise functions can return function pointers and again, the use of a typedef can make the
syntax simpler when doing so.
A classic example is the signal function from <signal.h>. The declaration for it (from the C
standard) is:
That's a function that takes two arguments — an int and a pointer to a function which takes an int
as an argument and returns nothing — and which returns a pointer to function like its second
argument.
then we could declare signal() using:
On the whole, this is easier to understand (even though the C standard did not elect to define a
type to do the job). The signal function takes two arguments, an int and a SigCatcher, and it
returns a SigCatcher — where a SigCatcher is a pointer to a function that takes an int argument
and returns nothing.
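To illustrate, here is a sketch with a hypothetical function install_catcher that has the same shape as signal(). It is declared once in the raw style and once with a SigCatcher typedef; the two declarations are compatible, so both can appear in one translation unit:

```c
#include <stddef.h>

typedef void (*SigCatcher)(int);

/* The same hypothetical function, declared without and with the typedef. */
void (*install_catcher(int sig, void (*func)(int)))(int);
SigCatcher install_catcher(int sig, SigCatcher func);

static SigCatcher current = NULL;    /* hypothetical single-slot table */

/* Two do-nothing catchers, only there so that there is something to install. */
static void catcher_a(int sig) { (void)sig; }
static void catcher_b(int sig) { (void)sig; }

/* Stores func and returns the previously installed catcher. */
SigCatcher install_catcher(int sig, SigCatcher func)
{
    (void)sig;
    SigCatcher previous = current;
    current = func;
    return previous;
}
```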
Although using typedef names for pointer to function types makes life easier, it can also lead to
confusion for others who will maintain your code later on, so use with caution and proper
documentation. See also Function Pointers.
Chapter 59: Undefined behavior
Introduction
In C, some expressions yield undefined behavior. The standard explicitly chooses to not define
how a compiler should behave if it encounters such an expression. As a result, a compiler is free
to do whatever it sees fit and may produce useful results, unexpected results, or even crash.
Code that invokes UB may work as intended on a specific system with a specific compiler, but will
likely not work on another system, or with a different compiler, compiler version or compiler
settings.
Remarks
What is Undefined Behavior (UB)?
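Undefined behavior is a term used in the C standard. The C11 standard (ISO/IEC 9899:2011)
defines the term undefined behavior as
behavior, upon use of a nonportable or erroneous program construct or of erroneous
data, for which this International Standard imposes no requirements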
These are the results which can happen due to undefined behavior according to standard:
NOTE Possible undefined behavior ranges from ignoring the situation completely with
unpredictable results, to behaving during translation or program execution in a
documented manner characteristic of the environment (with or without the issuance of
a diagnostic message), to terminating a translation or execution (with the issuance of a
diagnostic message).
The following quote is often used to describe (less formally though) results happening from
undefined behavior:
“When the compiler encounters [a given undefined construct] it is legal for it to make
demons fly out of your nose” (the implication is that the compiler may choose any
arbitrarily bizarre way to interpret the code without violating the ANSI C standard)
Undefined behavior allows more opportunities for optimization: the compiler can justifiably
assume that the code does not contain undefined behaviour, which can allow it to avoid run-time
checks and perform optimizations whose validity would be costly or impossible to prove otherwise.
Why is UB hard to track down?
There are at least two reasons why undefined behavior creates bugs that are difficult to detect:
• The compiler is not required to, and generally cannot reliably, warn you about undefined
behavior. In fact, requiring it to do so would go directly against the reason for the existence of
undefined behaviour.
• The unpredictable results might not start unfolding at the exact point of the operation where
the construct whose behavior is undefined occurs. Undefined behaviour taints the whole
execution, and its effects may happen at any time: during, after, or even before the undefined
construct.
In the case of null-pointer dereference, C language differs from managed languages such as Java
or C#, where the behavior of null-pointer dereference is defined: an exception is thrown, at the
exact time (NullPointerException in Java, NullReferenceException in C#), thus those coming from
Java or C# might incorrectly believe that in such a case, a C program must crash, with or without
the issuance of a diagnostic message.
Additional information
• Explicitly undefined behavior, that is where the C standard explicitly tells you that you are off
limits.
• Implicitly undefined behavior, where there is simply no text in the standard that foresees a
behavior for the situation you brought your program in.
Also keep in mind that in many places the behavior of certain constructs is deliberately left
undefined by the C standard to leave room for compiler and library implementors to come up with
their own definitions. Good examples are signals and signal handlers, where extensions to C, such
as the POSIX operating system standard, define much more elaborate rules. In such cases you just
have to check the documentation of your platform; the C standard can't tell you anything.
Also note that if undefined behavior occurs in a program, it is not just the point where the
undefined behavior occurred that is problematic; rather, the entire program becomes meaningless.
Because of such concerns it is important (especially since compilers don't always warn us about
UB) for anyone programming in C to be at least familiar with the kinds of things that trigger
undefined behavior.
It should be noted there are some tools (e.g. static analysis tools such as PC-Lint) which aid in
detecting undefined behavior, but again, they can't detect all occurrences of undefined behavior.
Examples
Dereferencing a null pointer
A NULL pointer is guaranteed by the C standard to compare unequal to any pointer to a valid object,
and dereferencing it invokes undefined behavior.
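Because the comparison against NULL is well defined, a pointer can be checked before it is dereferenced. A minimal sketch, with the hypothetical helper read_or:

```c
#include <stddef.h>

/* Returns *p, or fallback when p is a null pointer. */
int read_or(const int *p, int fallback)
{
    return p != NULL ? *p : fallback;
}
```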
Modifying any object more than once between two sequence points
int i = 42;
i = i++; /* Assignment changes variable, post-increment as well */
int a = i++ + i--;
Code like this often leads to speculations about the "resulting value" of i. Rather than specifying
an outcome, however, the C standards specify that evaluating such an expression produces
undefined behavior. Prior to C2011, the standard formalized these rules in terms of so-called
sequence points:
Between the previous and next sequence point a scalar object shall have its stored
value modified at most once by the evaluation of an expression. Furthermore, the prior
value shall be read only to determine the value to be stored.
That scheme proved to be a little too coarse, resulting in some expressions exhibiting undefined
behavior with respect to C99 that plausibly should not. C2011 retains sequence points, but
introduces a more nuanced approach to this area based on sequencing and a relationship it calls
"sequenced before":
If a side effect on a scalar object is unsequenced relative to either a different side effect
on the same scalar object or a value computation using the value of the same scalar
object, the behavior is undefined. If there are multiple allowable orderings of the
subexpressions of an expression, the behavior is undefined if such an unsequenced
side effect occurs in any of the orderings.
The full details of the "sequenced before" relation are too long to describe here, but they
supplement sequence points rather than supplanting them, so they have the effect of defining
behavior for some evaluations whose behavior previously was undefined. In particular, if there is a
sequence point between two evaluations, then the one before the sequence point is "sequenced
before" the one after.
The following example has well-defined behaviour:
int i = 42;
i = (i++, i+42); /* The comma-operator creates a sequence point */
The following example has undefined behaviour:
int i = 42;
printf("%d %d\n", i++, i++); /* commas separating function arguments are not comma operators */
As with any form of undefined behavior, observing the actual behavior of evaluating expressions
that violate the sequencing rules is not informative, except in a retrospective sense. The language
standard provides no basis for expecting such observations to be predictive even of the future
behavior of the same program.
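The usual remedy is to split the expression so that each side effect sits in its own full expression. A sketch with hypothetical function names:

```c
/* Instead of "i = i++;": decide which effect is wanted and write only that one. */
int increment(int i)
{
    i++;
    return i;
}

/* Instead of printf("%d %d\n", i++, i++): read and modify in separate
 * statements, so each side effect is sequenced before the next read. */
void take_two(int *i, int *first, int *second)
{
    *first = (*i)++;
    *second = (*i)++;
}
```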
int foo(void) {
/* do stuff */
/* no return here */
}
int main(void) {
/* Trying to use the (not) returned value causes UB */
int value = foo();
return 0;
}
When a function is declared to return a value then it has to do so on every possible code path
through it. Undefined behavior occurs as soon as the caller (which is expecting a return value)
tries to use the return value¹.
Note that the undefined behaviour happens only if the caller attempts to use/access the value from
the function. For example,
int foo(void) {
/* do stuff */
/* no return here */
}
int main(void) {
/* The value (not) returned from foo() is unused. So, this program
* doesn't cause *undefined behaviour*. */
foo();
return 0;
}
C99
The main() function is an exception to this rule in that it is possible for it to be terminated without a
return statement because an assumed return value of 0 will automatically be used in this case².
¹ (ISO/IEC 9899:201x, 6.9.1/12)
If the } that terminates a function is reached, and the value of the function call is used
by the caller, the behavior is undefined.
Per paragraph 6.5/5 of both C99 and C11, evaluation of an expression produces undefined
behavior if the result is not a representable value of the expression's type. For arithmetic types,
that's called an overflow. Unsigned integer arithmetic does not overflow because paragraph
6.2.5/9 applies, causing any unsigned result that otherwise would be out of range to be reduced to
an in-range value. There is no analogous provision for signed integer types, however; these can
and do overflow, producing undefined behavior. For example,
#include <limits.h>
int main(void) {
    int i = INT_MAX + 1; /* Overflow happens here */
    return 0;
}
Most instances of this type of undefined behavior are more difficult to recognize or predict.
Overflow can in principle arise from any addition, subtraction, or multiplication operation on signed
integers (subject to the usual arithmetic conversions) where there are not effective bounds on or a
relationship between the operands to prevent it. For example, this function:
int square(int x) {
return x * x; /* overflows for some values of x */
}
is reasonable, and it does the right thing for small enough argument values, but its behavior is
undefined for larger argument values. You cannot judge from the function alone whether programs
that call it exhibit undefined behavior as a result. It depends on what arguments they pass to it.
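One way for a caller to stay on the safe side is to test the operand before multiplying. The following is a sketch only; checked_square is a hypothetical helper, not a standard function:

```c
#include <limits.h>
#include <stdbool.h>

/* Stores x * x in *result and returns true, or returns false without
 * multiplying when the product would not fit in an int. */
bool checked_square(int x, int *result)
{
    if (x == INT_MIN)
        return false;                 /* on two's complement even -x would overflow */
    int ax = x < 0 ? -x : x;
    if (ax != 0 && ax > INT_MAX / ax)
        return false;                 /* x * x would exceed INT_MAX */
    *result = x * x;
    return true;
}
```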
On the other hand, consider this trivial example of overflow-safe signed integer arithmetic:
int zero(int x) {
return x - x; /* Cannot overflow */
}
The relationship between the operands of the subtraction operator ensures that the subtraction
never overflows. Or consider this somewhat more practical example:
while (fgetc(f1) != EOF) count1++; /* might overflow */
while (fgetc(f2) != EOF) count2++; /* might overflow */
As long as the counters do not overflow individually, the operands of the final subtraction will
both be non-negative. All differences between any two such values are representable as int.
int a;
printf("%d", a);
The variable a is an int with automatic storage duration. The example code above is trying to print
the value of an uninitialized variable (a was never initialized). Automatic variables which are not
initialized have indeterminate values; accessing these can lead to undefined behavior.
Note: Variables with static or thread local storage, including global variables without the static
keyword, are initialized either to zero or to the value of their initializer. Hence the following is legal.
static int b;
printf("%d", b);
A very common mistake is to not initialize the variables that serve as counters to 0. You add
values to them, but since the initial value is garbage, you will invoke Undefined Behavior, such
as in the question Compilation on terminal gives off pointer warning and strange symbols.
Example:
#include <stdio.h>
int main(void) {
int i, counter;
for(i = 0; i < 10; ++i)
counter += i;
printf("%d\n", counter);
return 0;
}
Output:
32812
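The fix is a one-word change: give the counter an initializer. A sketch, wrapped in a hypothetical function sum_below so that the result can be checked:

```c
/* Sums the integers 0 .. n-1. */
int sum_below(int n)
{
    int counter = 0;          /* explicit initializer: the first += reads a defined value */
    for (int i = 0; i < n; ++i)
        counter += i;
    return counter;
}
```

With n equal to 10, as in the loop above, the result is 45 on every conforming implementation.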
The above rules are applicable for pointers as well. For example, the following results in undefined
behavior
int main(void)
{
int *p;
p++; // Trying to increment an uninitialized pointer.
}
Note that the above code on its own might not cause an error or segmentation fault, but trying to
dereference this pointer later would cause the undefined behavior.
return 0;
}
Some compilers helpfully point this out. For example, clang warns with:
warning: address of stack memory associated with local variable 'baz' returned
[-Wreturn-stack-address]
for the above code. But compilers may not be able to help in complex code.
(1) Returning reference to variable declared static is defined behaviour, as the variable is not
destroyed after leaving current scope.
(2) According to ISO/IEC 9899:2011 6.2.4 §2, "The value of a pointer becomes indeterminate
when the object it points to reaches the end of its lifetime."
(3) Dereferencing the pointer returned by the function foo is undefined behaviour as the memory it
references holds an indeterminate value.
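Two common ways to hand an object back to the caller without returning the address of an automatic variable are sketched below (the function names are hypothetical): static storage, as in note (1), or storage supplied by the caller.

```c
#include <stddef.h>

/* Static storage duration: the object outlives the call. */
int *static_slot(void)
{
    static int slot = 42;
    return &slot;
}

/* Caller-provided storage: the caller controls the lifetime. */
int *fill_slot(int *out)
{
    if (out != NULL)
        *out = 42;
    return out;
}
```

The static variant is shared by all callers, so the caller-provided form is usually the safer default.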
Division by zero
int x = 0;
int y = 5 / x; /* integer division */
or
double x = 0.0;
double y = 5.0 / x; /* floating point division */
or
int x = 0;
int y = 5 % x; /* modulo operation */
For the second line in each example, where the value of the second operand (x) is zero, the
behaviour is undefined.
Note that most implementations of floating point math will follow a standard (e.g. IEEE 754), in
which case operations like divide-by-zero will have consistent results (e.g., INFINITY) even though
the C standard says the operation is undefined.
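For integer division, the check has to happen before the operator is applied. A sketch with a hypothetical helper checked_div; it also rejects INT_MIN / -1, which is a separate case (the quotient is not representable on two's complement) that the text above does not discuss:

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a / b in *quotient and returns true, or returns false when the
 * division would have undefined behavior. */
bool checked_div(int a, int b, int *quotient)
{
    if (b == 0)
        return false;                 /* division by zero */
    if (a == INT_MIN && b == -1)
        return false;                 /* quotient not representable */
    *quotient = a / b;
    return true;
}
```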
int array[3];
int *beyond_array = array + 3;
*beyond_array = 0; /* Accesses memory that has not been allocated. */
The third line accesses the 4th element in an array that is only 3 elements long, leading to
undefined behavior. Similarly, the behavior of the second line in the following code fragment is
also not well defined:
int array[3];
array[3] = 0;
Note that pointing past the last element of an array is not undefined behavior (beyond_array = array
+ 3 is well defined here), but dereferencing it is (*beyond_array is undefined behavior). This rule
also holds for dynamically allocated memory (such as buffers created through malloc).
A wide variety of standard library functions have among their effects copying byte sequences from
one memory region to another. Most of these functions have undefined behavior when the source
and destination regions overlap.
... attempts to copy 10 bytes where the source and destination memory areas overlap by three
bytes. To visualize:
                overlapping area
                |
               _ _
              |   |
              v   v
T h i s   i s   a n   e x a m p l e \0
^             ^
|             |
|             destination
|
source
Among the standard library functions with a limitation of this kind are memcpy(), strcpy(), strcat(),
sprintf(), and sscanf(). The standard says of these and several other functions:
If copying takes place between objects that overlap, the behavior is undefined.
The memmove() function is the principal exception to this rule. Its definition specifies that the function
behaves as if the source data were first copied into a temporary buffer and then written to the
destination address. There is no exception for overlapping source and destination regions, nor any
need for one, so memmove() has well-defined behavior in such cases.
The distinction reflects an efficiency vs. generality tradeoff. Copying such as these functions
perform usually occurs between disjoint regions of memory, and often it is possible to know at
development time whether a particular instance of memory copying will be in that category.
Assuming non-overlap affords comparatively more efficient implementations that do not reliably
produce correct results when the assumption does not hold. Most C library functions are allowed
the more efficient implementations, and memmove() fills in the gaps, serving the cases where the
source and destination may or do overlap. To produce the correct effect in all cases, however, it
must perform additional tests and / or employ a comparatively less efficient implementation.
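A sketch of the overlapping copy done with memmove(). The wrapper name copy_overlapping is hypothetical; in the usage note, the buffer contents are an assumption chosen to match the diagram above:

```c
#include <string.h>

/* Copies n bytes from src to dst; the two regions may overlap. */
void copy_overlapping(char *dst, const char *src, size_t n)
{
    memmove(dst, src, n);   /* memcpy(dst, src, n) would be undefined on overlap */
}
```

For a buffer `char str[19] = "This is an example";`, the call `copy_overlapping(str + 7, str, 10)` is well defined and leaves the first ten original characters at offset 7.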
C11
• uninitialized
• defined with automatic storage duration
• its address is never taken
¹ (Quoted from: ISO/IEC 9899:201X 6.3.2.1 Lvalues, arrays, and function designators 2)
If the lvalue designates an object of automatic storage duration that could have been declared with
the register storage class (never had its address taken), and that object is uninitialized (not
declared with an initializer and no assignment to it has been performed prior to use), the behavior
is undefined.
Data race
C11
C11 introduced support for multiple threads of execution, which affords the possibility of data
races. A program contains a data race if an object in it is accessed¹ by two different threads,
where at least one of the accesses is non-atomic, at least one modifies the object, and program
semantics fail to ensure that the two accesses cannot overlap temporally.² Note well that actual
concurrency of the accesses involved is not a condition for a data race; data races cover a
broader class of issues arising from (allowed) inconsistencies in different threads' views of
memory.
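#include <threads.h>
int a = 0;
int Function(void *ignore)
{
    a = 1; /* unsynchronized write in the second thread */
    return 0;
}
int main(void)
{
    thrd_t id;
    thrd_create(&id, Function, NULL);
    int b = a; /* unsynchronized read in the main thread */
    thrd_join(id, NULL);
}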
The main thread calls thrd_create to start a new thread running function Function. The second
thread modifies a, and the main thread reads a. Neither of those accesses is atomic, and the two
threads do nothing either individually or jointly to ensure that they do not overlap, so there is a
data race.
Among the ways this program could avoid the data race are
• the main thread could perform its read of a before starting the other thread;
• the main thread could perform its read of a after ensuring via thrd_join that the other has
terminated;
• the threads could synchronize their accesses via a mutex, each one locking that mutex
before accessing a and unlocking it afterward.
As the mutex option demonstrates, avoiding a data race does not require ensuring a specific order
of operations, such as the child thread modifying a before the main thread reads it; it is sufficient
(for avoiding a data race) to ensure that for a given execution, one access will happen before the
other.
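The mutex option could look roughly like the sketch below. It assumes an implementation that provides the optional <threads.h> header; the names a_lock, writer and read_with_mutex are hypothetical:

```c
#include <threads.h>

static int a = 0;
static mtx_t a_lock;

static int writer(void *ignore)
{
    (void)ignore;
    mtx_lock(&a_lock);
    a = 1;
    mtx_unlock(&a_lock);
    return 0;
}

/* Returns the value of a seen by the calling thread (0 or 1, depending on
 * which thread takes the lock first), or -1 if setup fails. */
int read_with_mutex(void)
{
    thrd_t id;
    if (mtx_init(&a_lock, mtx_plain) != thrd_success)
        return -1;
    if (thrd_create(&id, writer, NULL) != thrd_success) {
        mtx_destroy(&a_lock);
        return -1;
    }
    mtx_lock(&a_lock);
    int b = a;
    mtx_unlock(&a_lock);
    thrd_join(id, NULL);
    mtx_destroy(&a_lock);
    return b;
}
```

The result is still unpredictable (either 0 or 1), but there is no longer a data race, so the behavior is defined.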
² (Quoted from ISO/IEC 9899:201x, section 5.1.2.4 "Multi-threaded executions and data races")
The execution of a program contains a data race if it contains two conflicting actions in different
threads, at least one of which is not atomic, and neither happens before the other. Any such data
race results in undefined behavior.
Even just reading the value of a pointer that was freed (i.e. without trying to dereference the
pointer) is undefined behavior (UB), e.g.
char *p = malloc(5);
free(p);
if (p == NULL) /* NOTE: even without dereferencing, this may have UB */
{
[…] The value of a pointer becomes indeterminate when the object it points to (or just
past) reaches the end of its lifetime.
The use of indeterminate memory for anything, including apparently harmless comparison or
arithmetic, can have undefined behavior if the value can be a trap representation for the type.
In this code example, the char pointer p is initialized to the address of a string literal. Attempting to
modify the string literal has undefined behavior.
char *p = "hello world";
p[0] = 'H'; // Undefined behavior
However, modifying a mutable array of char directly, or through a pointer is naturally not undefined
behavior, even if its initializer is a literal string. The following is fine:
char a[] = "hello, world"; /* the array is a modifiable copy of the literal */
char *p = a;
a[0] = 'H';
p[7] = 'W';
That's because the string literal is effectively copied to the array each time the array is initialized
(once for variables with static duration, each time the array is created for variables with automatic
or thread duration — variables with allocated duration aren't initialized), and it is fine to modify
array contents.
int * x = malloc(sizeof(int));
*x = 9;
free(x);
free(x);
Otherwise, if the argument does not match a pointer earlier returned by the calloc,
malloc, or realloc function, or if the space has been deallocated by a call to free or
realloc, the behavior is undefined.
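A common defensive idiom, sketched here with a hypothetical helper release, is to reset the pointer to NULL right after freeing it. A second release is then harmless because free(NULL) does nothing, and a later comparison against NULL reads a well-defined value. It does not protect other copies of the same pointer.

```c
#include <stdlib.h>

/* Frees *pp and resets it to a null pointer. */
void release(int **pp)
{
    free(*pp);     /* free(NULL) is a no-op */
    *pp = NULL;
}
```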
Using an incorrect format specifier in the first argument to printf invokes undefined behavior. For
example, the code below invokes undefined behavior:
long z = 'B';
printf("%c\n", z);
printf("%f\n",0);
The line above has undefined behavior: %f expects a double, but 0 is of type int.
Note that your compiler usually can help you avoid cases like these, if you turn on the proper flags
during compiling (-Wformat in clang and gcc). From the last example:
warning: format specifies type 'double' but the argument has type
'int' [-Wformat]
printf("%f\n",0);
~~ ^
%d
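Corrected versions of the two calls, sketched with snprintf so that the result can be inspected (the function names are hypothetical): the specifier is chosen to match the argument type, or the argument is written with the type the specifier expects.

```c
#include <stdio.h>

/* Writes z in decimal; %ld matches the type long. */
int format_long(char *buf, size_t size, long z)
{
    return snprintf(buf, size, "%ld", z);
}

/* Writes d; %f matches double, so callers pass 0.0 rather than the int 0. */
int format_double(char *buf, size_t size, double d)
{
    return snprintf(buf, size, "%f", d);
}
```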
The following might have undefined behavior due to incorrect pointer alignment:
The undefined behavior happens as the pointer is converted. According to C11, if a conversion
between two pointer types produces a result that is incorrectly aligned (6.3.2.3), the behavior is
undefined. Here a uint32_t could require alignment of 2 or 4, for example.
calloc, on the other hand, is required to return a pointer that is suitably aligned for any object type;
thus memory_block is properly aligned to contain a uint32_t in its initial part. Then, on a system
where uint32_t has a required alignment of 2 or 4, memory_block + 1 will be an odd address and thus
not properly aligned.
Observe that the C standard already makes the conversion itself undefined, not just a later access
through the resulting pointer. This is imposed because on platforms where addresses are segmented,
the byte address memory_block + 1 may not even have a proper representation as an integer pointer.
Casting char * to pointers to other types without any concern to alignment requirements is
sometimes incorrectly used for decoding packed structures such as file headers or network
packets.
You can avoid the undefined behavior arising from misaligned pointer conversion by using memcpy:
Here no pointer conversion to uint32_t* takes place and the bytes are copied one by one.
This copy operation for our example only leads to valid value of mvalue because:
• We used calloc, so the bytes are properly initialized. In our case all bytes have value 0, but
any other proper initialization would do.
• uint32_t is an exact width type and has no padding bits
• Any arbitrary bit pattern is a valid representation for any unsigned type.
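The memcpy approach can be packaged as a small helper. This is a sketch; read_u32_unaligned is a hypothetical name, and the value is interpreted in the host's byte order:

```c
#include <stdint.h>
#include <string.h>

/* Reads a uint32_t from an address with no alignment guarantee. */
uint32_t read_u32_unaligned(const void *src)
{
    uint32_t mvalue;
    memcpy(&mvalue, src, sizeof mvalue);   /* byte-wise copy, no uint32_t* conversion */
    return mvalue;
}
```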
char buffer[6] = "hello";
char *ptr1 = buffer - 1; /* undefined behavior */
char *ptr2 = buffer + 5; /* OK, pointing to the '\0' inside the array */
char *ptr3 = buffer + 6; /* OK, pointing to just beyond */
char *ptr4 = buffer + 7; /* undefined behavior */
According to C11, if addition or subtraction of a pointer into, or just beyond, an array object and an
integer type produces a result that does not point into, or just beyond, the same array object, the
behavior is undefined (6.5.6).
Additionally it is naturally undefined behavior to dereference a pointer that points to just beyond
the array:
foo_ptr = (int *)&foo_readonly; /* (1) This casts away the const qualifier */
*foo_ptr = 20; /* This is undefined behavior */
return 0;
}
(1) In GCC this can throw the following warning: warning: assignment discards ‘const’ qualifier
from pointer target type [-Wdiscarded-qualifiers]
The %s conversion of printf requires that the corresponding argument be a pointer to the initial
element of an array of character type. A null pointer does not point to the initial element of any
array of character type, and thus the behavior of the following is undefined:
However, the undefined behavior does not always mean that the program crashes — some
systems take steps to avoid the crash that normally happens when a null pointer is dereferenced.
For example Glibc is known to print
(null)
for the code above. However, add (just) a newline to the format string and you will get a crash:
char *foo = 0;
printf("%s\n", foo); /* undefined behavior */
In this case, it happens because GCC has an optimization that turns printf("%s\n", argument); into
a call to puts with puts(argument), and puts in Glibc does not handle null pointers. All this behavior
is standard conforming.
Note that a null pointer is different from an empty string. The following is therefore valid, has
no undefined behavior, and just prints a newline:
If, within a translation unit, the same identifier appears with both internal and external
linkage, the behavior is undefined.
Note that if a prior declaration of an identifier is visible, the identifier takes on the prior
declaration's linkage. C11, §6.2.2, 4 allows it:
For an identifier declared with the storage-class specifier extern in a scope in which a
prior declaration of that identifier is visible, if the prior declaration specifies internal
or external linkage, the linkage of the identifier at the later declaration is the same as
the linkage specified at the prior declaration. If no prior declaration is visible, or if the
prior declaration specifies no linkage, then the identifier has external linkage.
Using fflush on an input stream
The C standard explicitly states that using fflush on an input stream is undefined behavior;
fflush is defined only for output streams. (POSIX additionally defines it for seekable input
files, as noted below.)
#include <stdio.h>

int main(void)
{
    int i;
    char input[4096];

    scanf("%i", &i);
    fflush(stdin); // <-- undefined behavior
    fgets(input, sizeof input, stdin); /* gets() was removed in C11 */
    return 0;
}
There is no standard way to discard unread characters from an input stream. Some implementations
do use fflush to clear the stdin buffer; Microsoft, for example, defines the behavior of fflush
on an input stream: "If the stream is open for input, fflush clears the contents of the buffer."
According to POSIX.1-2008, the behavior of fflush on an input stream is undefined unless the
input file is seekable.
Bit shifting using negative counts or beyond the width of the type
If the shift count is negative, both left shift and right shift operations are undefined:
If a left shift is performed on a positive value and the mathematical result is not
representable in the type, the behavior is undefined:
/* Assuming an int is 32-bits wide, the value '5 * 2^72' doesn't fit
 * in an int. So, this is undefined. */
int x = 5 << 72;
Note that right shift on a negative value (e.g. -5 >> 3) is not undefined but
implementation-defined.
If the value of the right operand is negative or is greater than or equal to the width of
the promoted left operand, the behavior is undefined.
Modifying the strings returned by the standard functions getenv(), strerror() and setlocale() is
undefined behavior; implementations may therefore use static storage for these strings.
The getenv function returns a pointer to a string associated with the matched list
member. The string pointed to shall not be modified by the program, but may be
overwritten by a subsequent call to the getenv function.
The strerror function returns a pointer to the string, the contents of which are
locale-specific. The array pointed to shall not be modified by the program, but may be
overwritten by a subsequent call to the strerror function.
The pointer to string returned by the setlocale function is such that a subsequent call
with that string value and its associated category will restore that part of the program’s
locale. The string pointed to shall not be modified by the program, but may be
overwritten by a subsequent call to the setlocale function.
Similarly the localeconv() function returns a pointer to struct lconv which shall not be modified.
The localeconv function returns a pointer to the filled-in object. The structure pointed to
by the return value shall not be modified by the program, but may be overwritten by a
subsequent call to the localeconv function.
C11
The function specifier _Noreturn was introduced in C11. The header <stdnoreturn.h> provides a
macro noreturn which expands to _Noreturn. So using _Noreturn or noreturn from <stdnoreturn.h> is
fine and equivalent.
A function that's declared with _Noreturn (or noreturn) is not allowed to return to its caller. If such a
function does return to its caller, the behavior is undefined.
In the following example, func() is declared with noreturn specifier but it returns to its caller.
#include <stdio.h>
#include <stdlib.h>
#include <stdnoreturn.h>

noreturn void func(void);
void func(void)
{
    printf("In func()...\n");
} /* Undefined behavior as func() returns */

int main(void)
{
    func();
    return 0;
}
$ gcc test.c
test.c: In function ‘func’:
test.c:9:1: warning: ‘noreturn’ function does return
}
^
$ clang test.c
test.c:9:1: warning: function declared 'noreturn' should not return [-Winvalid-noreturn]
}
^
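A function that never returns to its caller, for example because it calls exit(), honors the
specifier:
#include <stdio.h>
#include <stdlib.h>
#include <stdnoreturn.h>

noreturn void my_exit(void);

/* Calls exit() and therefore never returns to its caller. */
void my_exit(void)
{
    printf("Exiting...\n");
    exit(0);
}

int main(void)
{
    my_exit();
    return 0;
}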
Chapter 60: Unions
Examples
Difference between struct and union
This illustrates that union members share memory, whereas struct members do not.
#include <stdio.h>
#include <string.h>

union My_Union
{
    int variable_1;
    int variable_2;
};

struct My_Struct
{
    int variable_1;
    int variable_2;
};
Some C implementations permit code to write to one member of a union type and then read from
another in order to perform a sort of reinterpreting cast (reading the bit representation of the
old type as a value of the new type).
Note, however, that the standard's treatment of this has changed over time: C89 left the result
implementation-defined, whereas C99 (as amended by TC3) and C11 explain in a footnote to §6.5.2.3
that the stored bytes are reinterpreted as an object of the new type, which may yield a trap
representation. The result therefore depends on the representation of both types and is not
portable, so check your compiler documentation if you plan to do this.
One real-life example of this technique is the "Fast Inverse Square Root" algorithm, which relies
on implementation details of IEEE 754 floating point numbers to compute an inverse square root
more quickly than ordinary floating point operations would. The algorithm can be implemented
either through pointer casting (which is very dangerous and breaks the strict aliasing rule) or
through a union (which avoids the aliasing violation but still depends on the floating point
representation):
#include <stdint.h> /* int32_t */

union floatToInt
{
    int32_t intMember;
    float floatMember; /* Float must be 32 bits IEEE 754 for this to work */
};
This technique was widely used in computer graphics and games in the past due to its greater
speed compared to using floating point operations, and is very much a compromise, losing some
accuracy and being very non portable in exchange for speed.
The members of a union share the same space in memory. This means that writing to one
member overwrites the data in all other members and that reading from one member results in the
same data as reading from all other members. However, because union members can have
differing types and sizes, the data that is read can be interpreted differently; see
http://www.riptutorial.com/c/example/9399/using-unions-to-reinterpret-values
The simple example below demonstrates a union with two members, both of the same type. It
shows that writing to member m_1 results in the written value being read from member m_2 and
writing to member m_2 results in the written value being read from member m_1.
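#include <stdio.h>

int main(void)
{
    union { int m_1; int m_2; } u; /* both members share the same storage */

    u.m_1 = 1; /* Write to m_1 */
    printf("u.m_2: %i\n", u.m_2); /* Read from m_2 */
    u.m_2 = 2; /* Write to m_2 */
    printf("u.m_1: %i\n", u.m_1); /* Read from m_1 */
    return 0;
}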
Result
u.m_2: 1
u.m_1: 2
Chapter 61: Valgrind
Syntax
• valgrind program-name optional-arguments < test input
Remarks
Valgrind is a debugging tool that can be used to diagnose errors regarding memory management
in C programs. Valgrind can be used to detect errors like invalid pointer usage, including writing or
reading past the allocated space, or making an invalid call to free(). It can also be used for
improving applications through functions that conduct memory profiling.
Examples
Running Valgrind
Invoking valgrind followed by the program name and its arguments, as in the syntax above, runs
your program and produces a report of any allocations and de-allocations it did. It will also
warn you about common errors like using uninitialized memory, dereferencing pointers to invalid
addresses, writing off the end of blocks allocated using malloc, or failing to free blocks.
Adding flags
See valgrind --help for more information about the (many) options, or look at the documentation at
http://valgrind.org/ for detailed information about what the output means.
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *s = malloc(26); /* illustrative allocation that is never freed */

    (void)s;
    return 0;
}
With no extra arguments, valgrind will not look for this error.
But if we turn on --leak-check=yes or --tool=memcheck, it will complain and display the lines
responsible for those memory leaks if the program was compiled in debug mode (for example with
the -g flag in GCC):
If the program is not compiled in debug mode, it will still show us where the leak happened in
terms of the relevant function, but not the lines.
This lets us go back and look at what block was allocated in that line and try to trace forward to
see why it wasn't freed.
Valgrind provides you with the lines at which the error occurred at the end of each line in the
format (file.c:line_no). Errors in valgrind are summarised in the following way:
This happens when the code starts to access memory which does not belong to the program. The
size of the memory accessed also gives you an indication of what variable was used.
According to the error, at line 7 of the main of valg.c, the call to printf() passed an uninitialized
variable to printf.
According to valgrind, the code freed the memory illegally (a second time) at line 10 of valg.c,
whereas it was already freed at line 9, and the block itself was allocated memory at line 7.
Chapter 62: Variable arguments
Introduction
Variable arguments are used by functions in the printf family (printf, fprintf, etc) and others to
allow a function to be called with a different number of arguments each time, hence the name
varargs.
To implement functions using the variable arguments feature, use #include <stdarg.h>.
To call functions which take a variable number of arguments, ensure there is a full prototype with
the trailing ellipsis in scope: void err_exit(const char *format, ...); for example.
Syntax
• void va_start(va_list ap, last); /* Start variadic argument processing; last is the last function
parameter before the ellipsis (“...”) */
• type va_arg(va_list ap, type); /* Get next variadic argument in list; be sure to pass the correct
promoted type */
• void va_end(va_list ap); /* End argument processing */
• void va_copy(va_list dst, va_list src); /* C99 or later: copy argument list, i.e. current position
in argument processing, into another list (e.g. to pass over arguments multiple times) */
Parameters
Parameter    Details
last         name of last non-variadic function argument, so the compiler finds the correct
             place to start processing variadic arguments; may not be declared as a register
             variable, a function, or an array type
type         promoted type of the variadic argument to read (e.g. int for a short int
             argument)
Remarks
The va_start, va_arg, va_end, and va_copy functions are actually macros.
Be sure to always call va_start first, and only once, and to call va_end last, and only once, and on
every exit point of the function. Not doing so may work on your system but surely is not portable
and thus invites bugs.
Take care to declare your function correctly, i.e. with a prototype, and mind the restrictions on the
last non-variadic argument (not register, not a function or array type). It is not possible to declare
a function that takes only variadic arguments, as at least one non-variadic argument is needed to
be able to start argument processing.
When calling va_arg, you must request the promoted argument type, that is:
• short is promoted to int (and unsigned short is also promoted to int unless sizeof(unsigned
short) == sizeof(int), in which case it is promoted to unsigned int).
• float is promoted to double.
• signed char is promoted to int; unsigned char is also promoted to int unless sizeof(unsigned
char) == sizeof(int), which is seldom the case.
• char is usually promoted to int.
• C99 types like uint8_t or int16_t are similarly promoted.
Historic (i.e. K&R) variadic argument processing is declared in <varargs.h> but should not be used
as it’s obsolete. Standard variadic argument processing (the one described here and declared in
<stdarg.h>) was introduced in C89; the va_copy macro was introduced in C99 but provided by many
compilers prior to that.
Examples
Using an explicit count argument to determine the length of the va_list
With any variadic function, the function must know how to interpret the variable arguments list.
With the printf() or scanf() functions, the format string tells the function what to expect.
The simplest technique is to pass an explicit count of the other arguments (which are normally all
the same type). This is demonstrated in the variadic function in the code below which calculates
the sum of a series of integers, where there may be any number of integers but that count is
specified as an argument prior to the variable argument list.
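#include <stdio.h>
#include <stdarg.h>

/* The first argument is the number of int arguments that follow. */
int sum(int n, ...)
{
    int sum = 0;
    va_list ap; /* holds the state of the variadic argument list */

    va_start(ap, n); /* start variadic argument processing */
    while (n-- > 0)
        sum += va_arg(ap, int); /* fetch and add the next variadic argument */
    va_end(ap); /* end variadic argument processing */
    return sum;
}

int main(void)
{
    printf("%d\n", sum(5, 1, 2, 3, 4, 5)); /* prints 15 */
    printf("%d\n", sum(10, 5, 9, 2, 5, 111, 6666, 42, 1, 43, -6218)); /* prints 666 */
    return 0;
}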
With any variadic function, the function must know how to interpret the variable arguments list.
The “traditional” approach (exemplified by printf) is to specify the number of arguments up
front. However, this is not always a good idea:
/* First argument specifies the number of parameters; the remainder are also int */
extern int sum(int n, ...);
Sometimes it's more robust to add an explicit terminator, exemplified by the POSIX execlp()
function. Here's another function to calculate the sum of a series of double numbers:
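#include <stdarg.h>
#include <stdio.h>
#include <math.h>

/* Sums its double arguments up to, but not including, the terminating NAN
 * (the name sum_until_nan is illustrative). */
double sum_until_nan(double x, ...)
{
    double sum = 0;
    va_list va;

    va_start(va, x);
    for (; !isnan(x); x = va_arg(va, double)) {
        sum += x;
    }
    va_end(va);
    return sum;
}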
Implementing functions with a `printf()`-like interface
One common use of variable-length argument lists is to implement functions that are a thin
wrapper around the printf() family of functions. One such example is a set of error reporting
functions.
errmsg.h
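#ifndef ERRMSG_H_INCLUDED
#define ERRMSG_H_INCLUDED

#include <stdarg.h>
#include <stdnoreturn.h> // C11

void verrmsg(int errnum, const char *fmt, va_list ap);
noreturn void errmsg(int exitcode, int errnum, const char *fmt, ...);
void warnmsg(int errnum, const char *fmt, ...);

#endif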
This is a bare-bones example; such packages can be much more elaborate. Normally, programmers will
use either errmsg() or warnmsg(), which themselves use verrmsg() internally. If someone comes up
with a need to do more, though, then the exposed verrmsg() function will be useful. You could
avoid exposing it until you have a need for it (YAGNI — you aren't gonna need it), but the need will
arise eventually (you are gonna need it — YAGNI).
errmsg.c
This code only needs to forward the variadic arguments to the vfprintf() function for outputting to
standard error. It also reports the system error message corresponding to the system error
number (errno) passed to the functions.
#include "errmsg.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void
verrmsg(int errnum, const char *fmt, va_list ap)
{
    if (fmt)
        vfprintf(stderr, fmt, ap);
    if (errnum != 0)
        fprintf(stderr, ": %s", strerror(errnum));
    putc('\n', stderr);
}

void
errmsg(int exitcode, int errnum, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    verrmsg(errnum, fmt, ap);
    va_end(ap);
    exit(exitcode);
}

void
warnmsg(int errnum, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    verrmsg(errnum, fmt, ap);
    va_end(ap);
}
Using errmsg.h
#include "errmsg.h"
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
If either the open() or read() system calls fails, the error is written to standard error and the
program exits with exit code 1. If the close() system call fails, the error is merely printed as a
warning message, and the program continues.
If you are using GCC (the GNU C Compiler, which is part of the GNU Compiler Collection), or
using Clang, then you can have the compiler check that the arguments you pass to the error
message functions match what printf() expects. Since not all compilers support the extension, it
needs to be compiled conditionally, which is a little bit fiddly. However, the protection it gives is
worth the effort.
First, we need to know how to detect that the compiler is GCC or Clang emulating GCC. The
answer is that GCC defines __GNUC__ to indicate that.
See common function attributes for information about the attributes — specifically the format
attribute.
Rewritten errmsg.h
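#ifndef ERRMSG_H_INCLUDED
#define ERRMSG_H_INCLUDED

#include <stdarg.h>
#include <stdnoreturn.h> // C11

#if !defined(PRINTFLIKE)
#if defined(__GNUC__)
#define PRINTFLIKE(n,m) __attribute__((format(printf,n,m)))
#else
#define PRINTFLIKE(n,m) /* If only */
#endif /* __GNUC__ */
#endif /* PRINTFLIKE */

/* The second number is 0 where the arguments arrive as a va_list. */
void verrmsg(int errnum, const char *fmt, va_list ap) PRINTFLIKE(2, 0);
noreturn void errmsg(int exitcode, int errnum, const char *fmt, ...) PRINTFLIKE(3, 4);
void warnmsg(int errnum, const char *fmt, ...) PRINTFLIKE(2, 3);

#endif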
Using a format string provides information about the expected number and type of the subsequent
variadic arguments in such a way as to avoid the need for an explicit count argument or a
terminator value.
The example below shows a function that wraps the standard printf() function, only allowing for
the use of variadic arguments of the type char, int and double (in decimal floating point format).
Here, like with printf(), the first argument to the wrapping function is the format string. As the
format string is parsed the function is able to determine if there is another variadic argument
expected and what its type should be.
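#include <stdio.h>
#include <stdarg.h>

int simple_printf(const char *format, ...)
{
    va_list ap; /* holds the state of the variadic argument list */
    int printed = 0; /* count of printed characters */

    va_start(ap, format); /* start variadic argument processing */
    while (*format != '\0') /* read the format string up to its terminator */
    {
        int f = 0;

        if (*format == '%')
        {
            ++format;
            switch(*format)
            {
            case 'c' :
                f = printf("%c", va_arg(ap, int)); /* print next variadic argument, note
                                                      type promotion from char to int */
                break;
            case 'd' :
                f = printf("%d", va_arg(ap, int)); /* print next variadic argument */
                break;
            case 'f' :
                f = printf("%f", va_arg(ap, double)); /* print next variadic argument */
                break;
            default :
                f = -1; /* invalid format specifier */
                break;
            }
        }
        else
        {
            f = printf("%c", *format); /* print any other characters */
        }

        if (f < 0) /* stop on an invalid specifier or an output error */
        {
            printed = f;
            break;
        }
        printed += f;
        ++format;
    }
    va_end(ap); /* end variadic argument processing */
    return printed;
}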
Chapter 63: X-macros
Introduction
X-macros are a preprocessor-based technique for minimizing repetitious code and maintaining
data / code correspondences. Multiple distinct macro expansions based on a common set of data
are supported by representing the whole group of expansions via a single master macro, with that
macro's replacement text consisting of a sequence of expansions of an inner macro, one for each
datum. The inner macro is traditionally named X(), hence the name of the technique.
Remarks
The user of an X-macro-style master macro is expected to provide his own definition for the inner
X() macro, and within its scope to expand the master macro. The master's inner macro references
are thus expanded according to the user's definition of X(). In this way, the amount of repetitive
boilerplate code in the source file can be reduced (appearing only once, in the replacement text of
X()), as is favored by adherents to the "Don't Repeat Yourself" (DRY) philosophy.
Additionally, by redefining X() and expanding the master macro one or more additional times, X
macros can facilitate maintaining corresponding data and code -- one expansion of the macro
declares the data (as array elements or enum members, for example), and the other expansions
produce corresponding code.
Although the "X-macro" name comes from the traditional name of the inner macro, the technique
does not depend on that particular name. Any valid macro name can be used in its place.
Criticisms include
A good explanation of X macros can be found in Randy Meyers' article [X-Macros] in Dr. Dobbs
(http://www.drdobbs.com/the-new-c-x-macros/184401387).
Examples
Trivial use of X-macros for printfs
/* define a list of preprocessor tokens on which to call X */
#define X_123 X(1) X(2) X(3)
/* define X to use */
#define X(val) printf("X(%d) made this print\n", val);
X_123
#undef X
/* good practice to undef X to facilitate reuse later on */
This example will result in the preprocessor generating the following code:
Next you can use the enumerated value in your code and easily print its identifier using:
printf("%s\n", enum2string(MyEnum_item2));
The X-macro approach can be generalized a bit by making the name of the "X" macro an
argument of the master macro. This has the advantages of helping to avoid macro name collisions
and of allowing use of a general-purpose macro as the "X" macro.
As always with X macros, the master macro represents a list of items whose significance is
specific to that macro. In this variation, such a macro might be defined like so:
One might then generate code to print the item names like so:
In contrast to standard X macros, where the "X" name is a built-in characteristic of the master
macro, with this style it may be unnecessary or even undesirable to afterward undefine the macro
used as the argument (PRINTSTRING in this example).
Code generation
X-macros can be used for code generation, producing repetitive code that iterates over a list to
perform some task or that declares a set of constants, objects or functions.
This requires all functions to have the same signature. If they take no arguments and return an int,
we would put this in a header with the enum definition:
All of the following can be in different compilation units assuming the part above is included as a
header:
An example of this technique being used in real code is for GPU command dispatching in
Chromium.
Credits
S. No    Chapters    Contributors
— character
2 classification & Alejandro Caro, Jonathan Leffler, Roland Illig, Toby
conversion
arguments McLean, Shog9, syb0rg, Toby, Woodrow Barlow, Yotam
Salmon
11  Common C programming idioms and developer practices
    Chandrahas Aroori, Jonathan Leffler, Nityesh Agarwal, Shubham Agrawal
18  Declaration vs Definition
    Ashish Ahuja, foxtrot9, Kerrek SB, Toby
23  Formatted Input/Output
    alk, fluter, Jonathan Leffler, Jossi, lardenn, MikeCAT, polarysekt, StardustGogeta
28  Implementation-defined behaviour
    Jens Gustedt, John Bollinger, P.P.
29  Implicit and Explicit Conversions
    alk, Firas Moalla, Jens Gustedt, Jeremy Thien, kdopen, Lundin, Toby
33  Interprocess Communication (IPC)
    cʟᴅsᴇᴇᴅ, EsmaeelE, Jonathan Leffler, Toby
34  Iteration Statements/Loops: for, while, do-while
    alk, GoodDeeds, Jens Gustedt, jxh, L.V.Rao, Malcolm McLean, Nagaraj, RamenChef, reshad, Toby
juleslasne, Luiz Berti, madD7, Malcolm McLean, Mark Yisri,
Matthieu, Neui, P.P., Paul Campbell, Paul V, reflective_mind,
Seth, Srikar, stackptr, syb0rg, Tamarous, tbodt, the sudhakar,
Toby, tofro, Vivek S, vuko_zrno, Wyzard
39  Multi-Character Character Sequence
    Jonathan Leffler, PassionInfinite, Toby
42  Pass 2D-arrays to functions
    deamentiaemundi, Malcolm McLean, Shrinivas Patgar, Toby
gsamaras, jxh, L.V.Rao, lordjohncena, MikeCAT, NeoR,
noamgot, OznOg, P.P., Toby, tofro
54  Structure Padding and Packing
    EsmaeelE, Jarrod Dixon, Jedi, Jesferman, Jonathan Leffler, Liju Thomas, MayeulC, tilz0R
Community, cshu, DaBler, Daniel Jour, DarkDust, FedeWar,
Firas Moalla, Giorgi Moniava, gsamaras, haccks, hmijail, honk,
Jacob H, Jean-Baptiste Yunès, Jens Gustedt, John, John
Bollinger, Jonathan Leffler, Kamiccolo, Leandros, Lundin,
Magisch, Mark Yisri, Martin, MikeCAT, Nemanja Boric, P.P.,
Peter, Roland Illig, TimF, Toby, tversteeg, user45891, Vasfed,
void