Lab1 Linux and Program
Lab 1
Program and Linux Intro
Course: Operating Systems
Phuong-Duy Nguyen
March 17, 2024
OBJECTIVES:
Contents
1. PREREQUISITES
1.1. Linux Environment
1.2. GNU C Compiler
4. Practices
5. Exercise
6. References
C. GNU DEBUGGER
C.1. Introduction to GDB
C.2. Debugging C++ program with GDB
D. Signal Programming
1. PREREQUISITES
1.1. Linux Environment
To practice in this class, students must install a Linux operating system on their
personal computers (Windows/macOS) in one of the following 3 ways (Note: Use only 1
of the 3 methods):
If GCC is not installed in your environment, please install it by following this guide:
https://linuxize.com/post/how-to-install-gcc-compiler-on-ubuntu-18-04/
2. Introduction to Linux Command Line Interface
2.1. Command Line Interface
Introduction: CLI stands for Command Line Interface. The CLI is simply a
program that allows users to type text commands to instruct the computer to perform
specific tasks.
For example, the cal command below tells the computer to display the calendar of the
current month:
$ cal
   September 2022
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30
In the early generations of computers, the CLI was the basic method of communication
between humans and machines, with a keyboard as the input device and a computer
monitor that could only display information as plain text.
After entering a command (such as the cal command above), the user receives either
textual information (here, the calendar of the current month) or a specific action
performed by the computer, such as deleting or editing files or shutting down the machine.
The invention of the computer mouse marked the beginning of the point-and-click
method, a new way to interact with the computer through a graphical user interface
(GUI). Nowadays, the GUI has become the common way of interacting with a computer.
However, most operating systems still offer a combination of CLI and GUI. Although
the CLI is less intuitive than the GUI, it is still favored by programmers
for the following reasons:
Start CLI: To use the Linux CLI, we will use the Terminal app. Terminal is a
text-based interface/CLI that allows users to interact directly with the UNIX system.
Terminal will accept the command lines we enter via the keyboard and execute them,
then return the results on the screen.
• Method 1 (Via GUI): Press the Windows key and type Terminal. When the Terminal
icon appears, click it to open the Terminal.
• Method 2 (Via Shortcut): Use the keyboard shortcut Ctrl+Alt+T to open the
Terminal
For WSL, after we have successfully installed Ubuntu from the Windows Store by fol-
lowing the instructions, we can access the CLI by launching the Ubuntu application on
our computer (via Search / Start Menu)
Bash (Bourne-Again SHell) is the default shell on most Linux distributions.
Question: List some other popular Linux shells and describe their highlighted features.
All characters to the left of each command line are called the shell prompt, displayed
in the following structure:
<username>@<hostname>:<location>$
Where:
• username: The name of the logged-in user (student01)
• hostname: The name of the computer (lab-computer)
• location: The working directory (here, student01’s HOME directory, ~)
• $: The character that ends the shell prompt
When typing a command on the keyboard, the typed characters are displayed after
the $ sign. We then press the Enter key to execute the command. In the example
below, we type the whoami command to display the user who is currently logged in to
our system:
student01@lab-computer:~$ whoami
student01
We can also use the up arrow key to return to previously executed commands.
Command structure: Linux commands can run on their own or accept options and
parameters that change their behavior. The typical syntax of a command can be
summarized as follows:
command [options] [parameters]
Where options are usually written in one of two forms:
• Short form: Starts with a dash (-) followed by a single character. Example: -l
• Long form: Starts with 2 dashes (--) followed by a word. Example: --list
Many options are available in both forms. For example, -a and --all both select
the show-all option.
Consider the following example when we run the ls command to list the files and
subdirectories of a directory:
student01@lab-computer:~$ ls
When we do not provide any options or parameters, the command uses its default
settings, as shown here, and displays a simple list of the files and directories in the
current working directory (~). To list more detailed information, we use the -l option
and can add the directory we want to inspect as a parameter, such as /etc:
student01@lab-computer:~$ ls -l /etc
total 1376
drwxr-xr-x 2 root root 4096 Jun 29 2017 ImageMagick-6
drwxr-xr-x 8 root root 4096 May 18 2017 NetworkManager
drwxr-xr-x 2 root root 4096 Nov 13 2016 Upower
...
Exercise: Try more options with the ls command and analyze its output. You can
also try to combine multiple options into one, e.g. -la.
If we want detailed information about a command, we use the syntax:
man <command to look up>
"man" is short for "manual": the documentation in Linux that describes the
supported commands. Ex: man ls
Alternatively, we can use the --help or -h option to view the help text
provided by the command itself (if the program supports it).
Linux files and directories In Linux, everything is considered a file; even a directory
is a type of file with its own characteristics. The Linux directory structure is
not divided by drive (C, D, ...) as in Windows; instead it has a single root directory
(written as /). From this root, the Linux filesystem is split into many directories with
different purposes. Some commonly used directories are:
• /home: contains the home directories of the users
• /bin, /usr/bin: contain executable programs, including most system commands,
for example the ls command used in the examples above
• . : the current working directory
The path represents how we can refer to the directory structure. The / is used in the
path to separate each level of this structure. There are 2 types of paths we will encounter:
• Absolute path: This path represents the location of the file relative to the root
directory, so it always begins with /. Example: /usr/bin
• Relative path: This path represents the location of the file from the current
directory, thus starting from the current directory. For example: ../../usr/bin
or ./test
• Change working directory: To change the working directory, we use the com-
mand: cd <path of desired directory>
• Delete folder or file: rm <name of folder/file to delete>. We can combine it
with these options: -f to delete immediately without confirmation and -r to delete
a directory together with its entire contents.
• View file contents: The cat command shows the contents of a file. If
we want to display line numbers, we can use the -n option.
In addition, we can combine it with more or less to view large files by piping,
which will be introduced later.
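The path types and file commands above can be tried safely in a throwaway directory. The following is a minimal sketch; the directory and file names (demo, notes.txt) are made up for illustration.

```shell
#!/bin/bash
# Work inside a temporary directory so nothing important is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir demo
printf 'first line\nsecond line\n' > demo/notes.txt

# Relative path: resolved from the current working directory.
cd ./demo || exit 1
# pwd prints the absolute path, which always starts at the root (/).
abs_path=$(pwd)
echo "$abs_path"

# cat -n numbers every output line; the option takes no argument.
numbered=$(cat -n notes.txt)
echo "$numbered"

# Go back up one level and delete the directory with its contents.
cd .. || exit 1
rm -rf demo
```

Because the sketch only works inside the directory created by mktemp, the rm -rf at the end cannot remove anything else.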
2.4. Command redirection
Data Flow The basic workflow of any command is that it takes input and returns
output. A command has 3 data streams, including:
• Standard error (stderr): The error messages returned when something goes wrong
while executing the command. Stderr is usually output to the screen, but can also be
redirected to a file or another process
For example, when you type ls, what you type is stdin, and stdout is the output you see
on the screen:
• > : Overwrite the entire file content
Command input redirection Use < to redirect input (stdin). The example below
first saves the result of the cal command in the file mycal; that file is then passed as
input to a lowercase-to-uppercase conversion command, so that the calendar is printed
to the screen in uppercase:
student01@lab-computer:~$ cal > mycal
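A minimal sketch of the input-redirection step follows. The conversion command is not shown in this handout, so the use of tr here is an assumption, and a fixed line of text stands in for the output of cal.

```shell
#!/bin/bash
# Save some text in a file named mycal (a stand-in for the output of cal).
workdir=$(mktemp -d)
echo "September 2022" > "$workdir/mycal"

# "<" feeds the file to tr on stdin; tr converts lowercase to uppercase.
upper=$(tr 'a-z' 'A-Z' < "$workdir/mycal")
echo "$upper"
```

The command on the left of < never sees the file name; it simply reads its standard input.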
Piping We can use | to redirect (pipe) the output of one command to another
command instead of to a file as above. This is useful when many commands need to be
chained together or when a command produces too many results. For example, when we
type ls /etc | more, the output of the ls command becomes the input of the more
command, which shows the result on the screen one page at a time.
Question: Compare the Output Redirection (>/>>) with the Piping (|) technique.
Filter output with grep: Grep is an acronym for Global Regular Expression Print.
The grep command in Linux searches its input (a specified file or piped text) for a
string of characters, which is very convenient when searching large log files. Example:
filtering the entries in the /etc directory whose names contain the string "sys":
student01@lab-computer:~$ ls -a /etc | grep sys
rsyslog.conf
rsyslog.d
sysctl.conf
...
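The difference between redirecting to a file and piping to a command can be sketched as follows. The file names are hypothetical, and a temporary directory stands in for /etc.

```shell
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch rsyslog.conf sysctl.conf hosts

# ">" sends the output of ls to a FILE (overwriting it).
ls > listing.txt

# "|" sends the output of ls to another COMMAND; grep keeps matching lines.
matches=$(ls | grep sys)
echo "$matches"
```

With >, the result ends up on disk and nothing is filtered; with |, the result is consumed directly by the next command.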
2.5. Sudo
Sudo stands for "substitute user do" or "super user do". It is a Linux program
that allows permitted users to run programs with the privileges of another user
(usually the root user).
Some programs need privileges to access protected directories/files; in that case we
must use sudo to be able to execute programs or actions on that directory/file. For
example:
student01@lab-computer:~$ mkdir /opt/sample
mkdir: cannot create directory ’/opt/sample’: Permission denied
student01@lab-computer:~$ sudo mkdir /opt/sample
student01@lab-computer:~$ ls /opt
sample
• Others: Any system user who is not the owner and does not belong to the same
group
Each user falls into one of the three classes owner, group, and other, and each class is
assigned its own permissions:
• read: the ability to open and view the contents of the file
To view the permission detail, we can use the -l option when calling the ls command:
Figure 2.5: File detail using the ls command
Linux file permissions can be displayed in two formats. The first format, as shown
above, is called symbolic notation, which is a string of 10 characters: one character
representing the file type and 9 characters representing the read (r), write (w) and
execute (x) permissions of the file, in the order owner, group, and other users. If a
permission is not granted, the dash (-) symbol is used.
The second format is called numeric notation, which is a string of three digits that
correspond to the owner, group, and other permissions respectively. Each digit is between
0 and 7, and its value is obtained by summing the permissions of that class.
The way to convert between symbols and numbers is shown in Figure 2.6
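As a worked example of the numeric notation: read counts as 4, write as 2 and execute as 1, and each digit is the sum for one class. The sketch below computes the digits for rwxr--r-- and applies them to a scratch file; the variable names are illustrative only.

```shell
#!/bin/bash
# Permission weights: read = 4, write = 2, execute = 1.
r=4; w=2; x=1

owner=$((r + w + x))   # rwx
group=$((r))           # r--
other=$((r))           # r--
mode="$owner$group$other"
echo "rwxr--r-- = $mode"

# Apply the mode to a scratch file and show the symbolic form again.
scratch=$(mktemp)
chmod "$mode" "$scratch"
symbolic=$(ls -l "$scratch" | cut -c1-10)
echo "$symbolic"
```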
To change the permissions for files and directories, we will use the chmod command:
chmod <options> <permission index> <file/directory name>
The example below shows how to grant the execute permission to a bash script a.sh:
student01@lab-computer:~$ echo 'echo "Hello World"' > a.sh
student01@lab-computer:~$ ls
a.sh
student01@lab-computer:~$ ./a.sh
-bash: ./a.sh: Permission denied
student01@lab-computer:~$ chmod 744 a.sh
student01@lab-computer:~$ ./a.sh
Hello World
In addition, we can also change the ownership of a file. To make this change, we need the
sudo privilege:
sudo chown <username> <filename>
Figure 2.6: Linux file permission presentation
Question: Discuss the risks of using the 777 permission on critical services (web hosting,
sensitive databases, ...)
$ mkdir newfolder
# if we want to create a file with some simple text content
$ echo "text content abc..." > text1
The common commands for directories and files are cp (copy), mv (move) and rm (remove).
We use cp to copy a directory or file. If the directory has sub-directories,
we run cp recursively with the option "-r"
$ cp file1 clonefile1
# if we need to do it recursively
$ cp -r folder1 clonefolder
We use mv to move a directory or file to a new place. If the source and the
destination are in the same directory, mv effectively performs a rename.
$ mv folder1/file1 destinationfolder/
# This is actually a rename operation
$ mv file1 newfilename
We use rm to delete a directory or file. If the directory has sub-directories,
we run rm recursively with the option "-r"
$ rm file1
# If the folder has subfolders, we need to do it recursively
$ rm -r folder1
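The commands above can be chained into one short session. This is a sketch that runs in a temporary directory, reusing the example names from the text.

```shell
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir -p folder1/sub
echo "text content abc..." > folder1/file1

cp folder1/file1 clonefile1        # copy one file
cp -r folder1 clonefolder          # copy a directory recursively
mv clonefile1 newfilename          # same directory: a rename
mv newfilename clonefolder/        # different directory: a move
rm folder1/file1                   # delete one file
rm -r folder1                      # delete a directory recursively

remaining=$(ls)
echo "$remaining"
```

Only clonefolder is left at the end, and it contains file1, sub and newfilename.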
Put all bash commands to a file called script file To reuse and organize
commands, we can put them into a file with the file extension ".sh"
#!/bin/bash
command1
command2
We can use any of the commands listed above, e.g. mkdir, touch, cp, mv, rm. Besides, bash
scripts support some control structures, such as: Variable declaration
$ v i bashvar . sh
It is important to note that the language requires the assignment symbol (=) to have
no space before or after it. After declaring a variable, we can refer to it later
with the prefix "$".
By default, the script file is not executable; we need to grant it the execute
permission:
$ chmod +x bashvar.sh
$ ./bashvar.sh
I will do some math!: 1 + 1 = 2
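The content of bashvar.sh is not reproduced here. A minimal sketch of a script that would print the line above, with made-up variable names, could look like this:

```shell
#!/bin/bash
# No spaces are allowed around "=" in an assignment.
a=1
b=1
# $(( ... )) evaluates integer arithmetic.
sum=$((a + b))
# Refer to a variable by prefixing its name with "$".
message="I will do some math!: $a + $b = $sum"
echo "$message"
```

Writing a = 1 with spaces would make the shell look for a command named a, which is why the rule about spaces matters.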
We can name the loop variable and later refer to it with the prefix "$"
for VARIABLE in 1 2 3 4 5 ... N
do
    command1 on $VARIABLE
    command2
    commandN
done
To compile the source file and link it, mapping each function to its real address, we
use the following command
$ gcc -o hello hello.c
3.4. Makefile rule
Makefile is the input file of the make program and is named "Makefile" (with this exact
capitalization). It contains a list of rules, and each rule has the syntax
<target>: <dependency1> <dependency2>
<tab> command1
<tab> command2
A rule is processed as follows. The make program verifies that all
dependencies are present. If all of them exist, it creates the target by executing
all of the commands. An example:
out: text1 text2
	cat text1 text2 > out
This rule checks that the files text1 and text2 exist. If both exist,
the out file is created by executing the command.
To try this example, we create the 2 files text1 and text2 with any content, then we run
the make rule
$ vi text1
# type in any text content to file text1
$ vi text2
# type in any text content to file text2
$ make out
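The same experiment can be scripted without an editor. The sketch below is one possible way to do it, not part of the original lab: it writes the two text files and the Makefile with echo and printf, then runs make only if it is installed. The file contents are arbitrary.

```shell
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "first part" > text1
echo "second part" > text2

# printf turns \t into a real tab, which make requires before a command.
printf 'out: text1 text2\n\tcat text1 text2 > out\n' > Makefile

if command -v make >/dev/null 2>&1; then
    make out      # runs: cat text1 text2 > out
    cat out
else
    echo "make is not installed; skipping the build"
fi
```

If the recipe line started with spaces instead of a tab, make would stop with a "missing separator" error.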
4. Practices
In this section, we will work together to complete a code management project using
shell scripts, a compiler and an automation tool. In detail, the project includes the
following steps:
Remote login using SSH The ssh command is used to log in to a remote Linux host.
Some systems do not come with the SSH service installed, so you need to install it
$ sudo apt-get install openssh-server
$ ifconfig
# or an alternative command, depending on what your environment supports
$ ip addr
$ ssh -l user_account IP_address
You can also use the GUI application on Windows named PuTTY.
Organize folder and file using shell command The commands ls, cd, cp and
mv help you to organize folders and files in your working folder
$ mkdir single-source-file-ws
$ cd single-source-file-ws
# verify our current directory path
$ pwd
/home/user/single-source-file-ws
Compose program source code using vi Use the vi text editor to compose the
source code file
$ vi hello.c
Clone the current workspace folder "single-source-file-ws" to "multiple-source-file-ws"
$ cp -r single-source-file-ws multiple-source-file-ws
We modify our project from a single source file, where we call the printf I/O function
directly, to multiple source files, and move the printf call to a separate source file
$ vi hello.c
#include <stdio.h>
$ vi in.c
#include <stdio.h>
int inChoTui()
{
    printf("Xin chao!\n");
    return 0;
}
$ ls
hello.c in.c
echo $SRC_FILE
rm output/*
for fsrc in $SRC_FILE
do
    nfile="${fsrc%.c}.o"
    gcc -c -o output/$nfile $fsrc
done
cd output
OBJ_FILE=`ls -R | grep '\.o$'`
gcc -o myapp $OBJ_FILE
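The ${fsrc%.c} expansion in the loop removes the shortest trailing match of .c from the value, which is how each source name is turned into an object name. A small sketch that runs without a compiler follows; the SRC_FILE value is an assumption, since its definition is not shown above.

```shell
#!/bin/bash
# Assumed list of source files.
SRC_FILE="hello.c in.c"

OBJ_FILE=""
for fsrc in $SRC_FILE
do
    # Strip the ".c" suffix, then append ".o".
    nfile="${fsrc%.c}.o"
    OBJ_FILE="$OBJ_FILE $nfile"
done
OBJ_FILE=${OBJ_FILE# }   # drop the leading space
echo "$OBJ_FILE"
```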
By default, the script file might not have the execute permission. The following
command grants it
$ chmod +x build-ws.sh
$ ./build-ws.sh
An alternative automation tool: Makefile For Makefile rules, you must place all of the
rule definitions in a file named exactly "Makefile", with this capitalization.
$ cd ..
$ pwd
/home/user
$ cp -r multiple-source-file-ws make-ws
$ cd make-ws
$ ls
build-ws.sh hello.c in.c output/
$ vi Makefile
myapp: hello.o in.o
	gcc -o myapp in.o hello.o
hello.o: hello.c
	gcc -c -o hello.o hello.c
in.o: in.c
	gcc -c -o in.o in.c
clean:
	rm -f *.o
	rm -f myapp
all: myapp
$ make clean
$ make hello.o
$ ls
Makefile build-ws.sh hello.c hello.o in.c myapp output/
5. Exercise
In the previous section, we built a Shell Script version of a calculator. Following
this idea, we will re-implement it in C.
Requirements:
• Students must create a Makefile to build the program with at least these 2 targets:
all and clean
• The main program is implemented in the calc.c source file, while the calculation
logic is kept in other source files.
• Input and output requirements are the same as for the Shell Script version
6. References
• Coding style by GNU: http://www.gnu.org/prep/standards/standards.html
• C programming
– Brian Kernighan and Dennis Ritchie, "The C Programming Language", Second Edition
– Randal E. Bryant and David R. O’Hallaron, "Computer Systems: A Programmer’s
Perspective", Second Edition
• Makefile:
– A simple Makefile tutorial: http://www.cs.colby.edu/maxwell/courses/tutorials/maketutor/
– GNU Make Manual: https://www.gnu.org/software/make/manual/make.html
– How to Debug a C or C++ Program on Linux Using gdb:
https://www.maketecheasier.com/debug-program-using-gdb-linux/
– GCC and Make: Compiling, Linking and Building C/C++ Applications:
https://www3.ntu.edu.sg/home/ehchua/programming/cpp/gcc_make.html
*Appendix
We can do the same sequence of steps by saving the commands in a bash script and
running it whenever we want, instead of having to retype all commands by hand.
By naming convention, bash scripts end with .sh. However, bash scripts can run per-
fectly fine without the sh extension.
A Bash script also starts with a shebang. The shebang is the first line of the script
and tells the system which interpreter should execute it: bash, or another shell we want,
such as zsh. The shebang is simply #! followed by the absolute path to the interpreter,
like the example below:
#!/bin/bash
Note that the user needs to have the execute permission to be able to run a bash script.
Then we identify the executable path of our system bash to include it in the shebang:
student01@lab-computer:~$ which bash
/bin/bash
Use nano to edit the file hello_world.sh that we just created above with the following
content:
#!/bin/bash
echo "Hello World"
student01@lab-computer:~$ ./hello_world.sh
Hello World
student01@lab-computer:~$ bash hello_world.sh
Hello World
student01@lab-computer:~$ ./hello.sh
Hello Tux
var=$((1+2))
echo $var
Passing arguments to Bash Script When launching a Bash script, we can pass
arguments and use them in the script. Example: ./sum.sh 1 2
Here is the list of arguments and other system variables:
• $? - Exit status of the last executed command (0 -> success, non-zero -> failure)
#!/bin/bash
# file name.sh
echo $0
echo Your name is $1
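The sum.sh script mentioned above is not listed in this handout. The sketch below shows how such a script might use its arguments; set -- is used only to simulate the call ./sum.sh 1 2 so that the example is self-contained.

```shell
#!/bin/bash
# "set --" assigns the positional parameters, as if the script had been
# started with: ./sum.sh 1 2
set -- 1 2

count=$#            # number of arguments
sum=$(($1 + $2))    # first argument plus second argument
echo "Got $count arguments, sum = $sum"

# $? holds the exit status of the last command: 0 means success.
true
ok_status=$?
# "false" always fails; its non-zero status is captured after "||".
false || fail_status=$?
echo "true -> $ok_status, false -> $fail_status"
```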
Read in from the keyboard: Sometimes we need to collect user input and perform
relevant operations. In bash, we can get user input with the read command.
#!/bin/bash
# file input.sh
read -p "Enter a number: " a
read -p "Enter a number: " b
var=$((a+b))
echo $var
student01@lab-computer:~$ ./input.sh
Enter a number: 1
Enter a number: 2
3
• >= : $a -ge $b
• > : $a -gt $b
• <= : $a -le $b
• < : $a -lt $b
• != : $a -ne $b
read x
read y
if [ $x -gt $y ]
then
    echo X is greater than Y
elif [ $x -lt $y ]
then
    echo X is less than Y
elif [ $x -eq $y ]
then
    echo X is equal to Y
fi
If the comparison structure is more complex, we can use the following conditional forms:
if...then...else...fi
if..elif..else..fi
if..then..else..if..then..fi..fi.. (Nested condition)
with logical operators AND (&&) and OR (||) to combine multiple conditions
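A short sketch of combining conditions with && and || follows; the value of x and the messages are arbitrary.

```shell
#!/bin/bash
x=7

# AND: both tests must succeed.
if [ "$x" -gt 0 ] && [ "$x" -lt 10 ]
then
    range="single digit"
else
    range="out of range"
fi

# OR: one successful test is enough.
if [ "$x" -eq 0 ] || [ "$x" -eq 7 ]
then
    special="yes"
else
    special="no"
fi
echo "$x: $range, special=$special"
```

Quoting "$x" inside the brackets keeps the test well-formed even if the variable is empty.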
A.4. Loop
The for loop allows us to execute statements a specific number of times.
#!/bin/bash
for i in {1..5}
do
    echo $i
done
student01@lab-computer:~$ ./script.sh
1
2
3
4
5
student01@lab-computer:~$ ./script.sh
1
2
3
4
5
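The same count from 1 to 5 can also be written as a while loop. This is a sketch of an alternative form, not a script taken from this handout; the output variable exists only to collect the result.

```shell
#!/bin/bash
i=1
output=""
# Repeat while the condition holds; stop once i exceeds 5.
while [ "$i" -le 5 ]
do
    echo "$i"
    output="$output$i"
    i=$((i + 1))
done
```

Unlike the for loop, the while loop needs the counter to be updated explicitly; forgetting the increment would loop forever.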
Saving results from a Bash command In case we need to save the output of
a complex command in a Bash script, we can write the command inside backticks,
`<command>`, or use the dollar-sign form, $(<command>):
#!/bin/bash
var=`df -h | grep tmpfs`
echo $var
student01@lab-computer:~$ ./script.sh
tmpfs 201M 22M 179M 11% /run tmpfs 1001M 192K 1000M 1% /dev/shm
tmpfs 5.0M 4.0K 5.0M 1% /run/lock tmpfs 1001M 0 1001M 0%
/sys/fs/cgroup tmpfs 201M 0 201M 0% /run/user/1001 tmpfs 201M
24K 201M 1% /run/user/112 tmpfs 201M 0 201M 0% /run/user/1000
tmpfs 201M 0 201M 0% /run/user/1003
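The output above runs together on one line because the variable is expanded without quotes. The sketch below illustrates the effect; a made-up three-line text stands in for the output of df -h.

```shell
#!/bin/bash
# $(...) captures the standard output of a command.
lines=$(printf 'tmpfs /run\ntmpfs /dev/shm\next4 /\n' | grep tmpfs)

# Unquoted: the shell splits on whitespace, so newlines collapse to spaces.
flat=$(echo $lines)
# Quoted: the line breaks are preserved.
kept=$(echo "$lines")

echo "$flat"
echo "$kept"
```

Writing echo "$var" instead of echo $var in the script would therefore print one filesystem per line.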
function_name () {
    first command
    second command
    ...
}
For example:
#!/bin/bash
hello_world () {
    echo 'hello everyone'
}
hello_world
student01@lab-computer:~$ ./script.sh
hello everyone
A.5. Example
In this exercise, we will create a simple calculator application in Bash that performs a
basic arithmetic operation (addition, subtraction, multiplication, or division) depending
on the number the user enters. For example:
Enter two numbers :
5.6
3.4
Enter Choice :
1. Addition
2. Subtraction
3. Multiplication
4. Division
3
5.6 * 3.4 = 19.0
3. Output: res
Complete program Here is the sample code of our program. You can use another
approach to complete this task.
#!/bin/bash
# Take user input
echo "Enter two numbers : "
read a
read b
# Input type of operation
echo "Enter Choice :"
echo "1. Addition"
echo "2. Subtraction"
echo "3. Multiplication"
echo "4. Division"
read ch
# Switch case to perform
# calculator operations
case $ch in
    1) res=`echo $a + $b | bc`
    ;;
    2) res=`echo $a - $b | bc`
    ;;
    3) res=`echo $a \* $b | bc`
    ;;
    4) res=`echo "scale=2; $a / $b" | bc`
    ;;
esac
echo "Result : $res"
Input Requirements
Example: 2 + 5
2. Supports 5 basic calculations: Add (+), Subtract (-), Multiply (*), Divide (/),
Integer Divide (%). Store the result of the last calculation in the variable ANS,
so that it can be used again in the next calculation. For example:
~$ ./calc.sh
>> 2 + 5
7
~$ ./calc.sh
>> ANS + 3
10
3. The ANS variable is initialized with a value of 0 and must not be lost after restarting
the program
4. Store the history of the last 5 successful calculations and use the HIST input
command to review them:
~$ ./calc.sh
>> HIST
1 + 2 = 3
ANS - 4 = -1
3 * 5 = 15
9 % 6 = 1
25 / 6 = 4.17
5. When waiting for a math operation to be entered, only >> is displayed (Note: There is
one trailing space):
~$ ./calc.sh
>>
6. After entering a valid calculation (with a space between the operands and the
operator), press ENTER to calculate the result (Note: Print the result with a carriage
return):
~$ ./calc.sh
>> 1 + 2
3
Display requirements:
1. For any non-integer (decimal) results, round to 2 decimal digits:
~$ ./calc.sh
>> 5 / 3
1.67
2. After completing a calculation, press any key to start a new calculation with the
display cleared:
>>
Submission Submit only one executable file, calc.sh, as instructed by the Instructor
B. INTRODUCTION TO GNU C COMPILER
B.1. GCC History
The GNU Compiler Collection, commonly known as GCC, is a set of compilers and
development tools available for Linux, Windows, various BSDs, and a wide assortment
of other operating systems.
As its name suggests, GCC is a collection of compilers: it primarily supports C and C++
and also includes other languages such as Objective-C, Ada, Go, and Fortran. The Free
Software Foundation (FSF) wrote GCC and released it as free software.
GCC is a key component of the so-called "GNU Toolchain", for developing applications
and writing operating systems. The GNU Toolchain includes:
• GNU Compiler Collection (GCC): a compiler suite that supports many languages,
such as C/C++ and Objective-C/C++.
• GNU Binutils: a suite of binary utility tools, including linker and assembler.
GCC is portable and runs on many operating platforms. GCC (and GNU Toolchain)
is currently available on all Unixes. They are also ported to Windows (by Cygwin,
MinGW, and MinGW-W64).
Microsoft Visual C++, GNU Compiler Collection (GCC), and Clang/Low-Level Vir-
tual Machine (LLVM) are three mainstream C/C++ compilers in the industry. In this
course, we will use the standard GCC as it is the default compiler on Linux.
First of all, what is an executable? An executable is a special type of file that
contains machine instructions (ones and zeros); running this file causes the computer
to perform those instructions. Compiling is the process of turning our C program
files into an executable.
When we run GCC on a source code file, there are four steps before an executable file
is made:
1. Preprocessing is the first pass of any C compilation. It removes comments, ex-
pands #include files and macros, and processes conditional compilation instruc-
tions. This can be output as a .i file.
2. Compilation is the second pass. It takes the preprocessor’s output and generates
assembly source code. Assembly language is a low-level programming language (even
lower than C) that is still human-readable but consists of mnemonic instructions that
have strong correspondence to machine instructions.
3. Assembly is the third stage of compilation. It takes the assembly source code
and produces an object file, which contains basic machine instructions and symbols
(e.g., function names) that are no longer human-readable since they are in bits.
4. Linking is the final stage of compilation. It takes one or more object files or
libraries as input and combines them to produce a single (usually executable) file.
In doing so, it resolves references to external symbols and assigns final addresses
to procedures/functions and variables, and revises code and data to reflect new
addresses (a process called relocation).
Let’s consider the following basic Hello World C program:
#include <stdio.h>
#define YEAR 2022
// File helloworld.c
// This is the main program
int main() {
    printf("Hello %d\n", YEAR);
    return 0;
}
Normally, we will compile and run this file directly using the following commands:
~$ gcc -o helloworld helloworld.c
~$ ./helloworld
Hello 2022
Moreover, we can stop at any of the four compilation stages above using the corresponding
options. Let’s stop at the first step, preprocessing, and discuss its output.
~$ gcc -E helloworld.c -o helloworld.i
~$ cat helloworld.i
# 1 "helloworld.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 370 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
...
int main() {
    printf("Hello %d\n", 2022);
    return 0;
}
We can see that the preprocessor handles all the preprocessor directives, like #include
and #define. It is agnostic of the syntax of C, which is why it must be used with care.
Next, let’s check the output of the compilation step, which is the assembly format:
~$ gcc -S helloworld.c
~$ cat helloworld.s
    .section __TEXT,__text,regular,pure_instructions
    .build_version macos, 12, 0 sdk_version 12, 3
    .globl _main                    ## -- Begin function main
    .p2align 4, 0x90
_main:                              ## @main
    .cfi_startproc
## %bb.0:
    pushq %rbp
    .cfi_def_cfa_offset 16
    ...
Then we will use the assembler (as) to convert the assembly code into machine code:
~$ as helloworld.s -o helloworld.o
~$ file helloworld.o
helloworld.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
We can also see the detailed compilation process by enabling the -v (verbose) option:
~$ gcc -v -o helloworld helloworld.c
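The four stages can also be run one after another from a script. The sketch below is an illustration under two assumptions: the #define YEAR 2022 line is inferred from the preprocessed output, and gcc is driven through its -E, -S and -c options rather than calling as directly. The build is skipped if gcc is not installed.

```shell
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write the example program (YEAR is an assumed macro definition).
cat > helloworld.c <<'EOF'
#include <stdio.h>
#define YEAR 2022
int main() {
    printf("Hello %d\n", YEAR);
    return 0;
}
EOF

if command -v gcc >/dev/null 2>&1; then
    gcc -E helloworld.c -o helloworld.i    # 1. preprocess
    gcc -S helloworld.i -o helloworld.s    # 2. compile to assembly
    gcc -c helloworld.s -o helloworld.o    # 3. assemble to an object file
    gcc helloworld.o -o helloworld         # 4. link into an executable
    ./helloworld
else
    echo "gcc is not installed; skipping the build"
fi
```

Each intermediate file (.i, .s, .o) stays in the temporary directory and can be inspected with cat or file.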
Since building large C/C++ programs usually involves multiple steps, a tool like Make
is needed to ensure all source files are compiled and linked. Make also lets the developer
control how ancillary files like documentation, man pages, systemd profiles, init scripts,
and configuration templates are packaged and installed.
Make is not limited to languages like C/C++. Web developers can use GNU Make to
do repetitive tasks like minifying CSS and JS, and system administrators can automate
maintenance tasks. Additionally, end-users can use Make to compile and install software
without being programmers or experts on the software they are installing.
Here’s an example dependency graph that you might build with Make. If any file’s de-
pendencies change, then the file will get recompiled.
Figure B.2: Source File Dependency Graph in Make
Make was initially developed by Stuart Feldman in April of 1976, but GNU released its
free software version of the tool in the late 1980s. Version 3.56 is still available (via diffs)
from the GNU FTP server, dating back to September 23, 1989.
GNU Make is a valuable tool for compiling software projects, especially those in the
GNU/Linux landscape. Its simple Makefile syntax and intelligent processing of target
files make it an excellent choice for development.
Makefile Syntax A makefile is simply a text file (commonly named Makefile) that
consists of a set of rules. A rule generally looks like this:
targets: prerequisites
	command
	command
	command
• The targets are file names, separated by spaces. Typically, there is only one per
rule.
• The commands are a series of steps typically used to make the target(s). These
need to start with a tab character, not spaces.
• The prerequisites are also filenames, separated by spaces. These files need to
exist before the commands for the target are run. These are also called
dependencies.
Getting Started with Makefile Let’s start with a formal Hello World example.
Use nano or another of your favorite text editors to compose a file named Makefile with
the following content:
hello:
	echo "Hello, World"
	echo "This line will always print because the file hello does not exist."
It’s important to realize that hello is both a target and a file. That’s because the two
are directly tied together. Typically, when a target is run (aka when the commands of
a target are run), the commands will create a file with the same name as the target. In
this case, the hello target does not create the hello file.
Let’s create a more typical Makefile - one that compiles our previous C file:
hello : helloworld . c
g c c −o h e l l o h e l l o w o r l d . c
When we run make again and again, the following set of steps happens:
1. The first target hello is selected because the first target is the default target
2. This has a prerequisite of helloworld.c
3. Make decides if it should run the hello target. It will only run if hello doesn’t
exist, or helloworld.c is newer than the previous build.
This last step is critical and is the essence of Make. What it’s attempting to do is
decide if the prerequisites of hello have changed since hello was last compiled.
That is, if helloworld.c is modified, running make should recompile the file. And
conversely, if helloworld.c has not changed, then it should not be recompiled.
To make this happen, it uses the filesystem timestamps as a proxy to determine if something
has changed. This is a reasonable heuristic because file timestamps typically will
only change if the files are modified. But it’s important to realize that this isn’t always
the case.
Question:
2. Compiling a program for the first time usually takes longer than recompiling it
afterwards. What is the reason?
3. Is there any mechanism similar to Makefile for other programming languages? If so,
give an example.
Make clean The name clean is often used for a target that removes the output of other
targets, but it is not a special word in Make. With the Makefile below, you can run make
and make clean to create and delete the hello file.
Note that clean is doing two new things here:
• It’s a target that is not first (the default) and not a prerequisite. That means it
should never run unless you explicitly call make clean
• It’s not intended to be a filename and should be marked as PHONY. Without that mark,
if you happen to have a file named clean, this target won’t run, which is not what we want.
A phony target is one that is not really the name of a file. It is just a name for some
commands to be executed when you make an explicit request. There are two reasons
to use a phony target: to avoid a conflict with a file of the same name and to improve
performance.
hello: helloworld.cpp
	g++ -o hello helloworld.cpp

.PHONY: clean
clean:
	rm -f hello
Makefile Variables We can define variables to reuse them across the Makefile.
However, variables can only be strings. We can reference variables using either ${} or $():
input := helloworld.cpp
output := helloworld

${output}: $(input)
	g++ -o ${output} ${input}

clean:
	rm -f ${output}
Please notice that single or double quotes have no meaning to Make. They are simply
characters that are assigned to the variable. Quotes are only useful to shell/bash
commands.
Targets You can define an all target. Since this is the first rule listed, it will run by
default if make is called without specifying a target.
all: one two three

one:
	touch one
two:
	touch two
three:
	touch three

clean:
	rm -f one two three
When there are multiple targets for a rule, the commands will be run for each target.
We will use $@ which is an automatic variable that contains the target name.
all: f1.o f2.o

f1.o f2.o:
	echo $@
# Equivalent to:
# f1.o:
#	echo f1.o
# f2.o:
#	echo f2.o
Automatic variables They are variables that have values computed afresh for each
rule that is executed, based on the target and prerequisites of the rule:
hey: one two
	# Outputs "hey", since this is the target name
	echo $@

	# Outputs all prerequisites
	echo $^

	touch hey

one:
	touch one

two:
	touch two

clean:
	rm -f hey one two
Implicit Rules Let’s see how we can now build a C program without ever explicitly
telling Make how to do the compilation, by using the implicit rules:
• LDFLAGS: Extra flags to give to compilers when they are supposed to invoke the
linker
C. GNU DEBUGGER
C.1. Introduction to GDB
The normal process for developing computer programs goes something like this: write
some code, compile the code, and run the program. If the program does not work as expected,
then you go back to the code to look for errors (bugs) and repeat the cycle again.
Depending on the complexity of the program and the nature of the bugs, there are times
when you can do with some additional help in tracking down the errors. This is what a
“debugger” does. It allows you to examine a computer program while it is running. You
can see the values of the different variables, you can examine the contents of memory
and you can halt the program at a specified point and step through the code one line at
a time.
The primary debugger on Linux is the GNU debugger (gdb). It might already be installed
on your system (or a slimmed-down version called gdb-minimal), but to be sure type
the following gdb command in a terminal:
~$ gdb --version
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-120.el7
#include <stdio.h>

int main() {
    int i;
    for (i = 0; i < 10; i++) {
        printf("Hello %d\n", i);
    }
    return 0;
}
At this point if you just start running the program (using the run command), then the
program will execute and finish before you get a chance to do anything. To stop that,
you need to create a "breakpoint" that will halt the program at a specified point. The
easiest way to do this is to tell the debugger to stop in the function main():
break main
The debugger will stop at the first executable line of the main function, here the for
loop. To step to the next line, type next or n for short. Keep using next to repeat the
loop a couple of times.
To inspect the value of a variable, we use the print command. In our example program,
we can examine the contents of the variable i:
print i
Go around the loop a few more times and see how i changes:
next
next
next
next
print i
The loop will continue while i is less than 10. You can change the value of a variable
using set var. Type the following in gdb to set i to 10.
set var i = 10
print i
next
You may need to do another next (depending on where the program was halted when
you set i to 10), but when the for loop line is next executed, the loop will exit, because
i is no longer less than 10.
The next command doesn’t drill down into functions but rather the function is executed,
and the debugger stops again at the next line after the function. If you want to step
into a function, use the step command, or s for short.
Another way to debug your program is to set a watch on a variable. What this does is
halt the program whenever the variable changes. Restart the program again by typing
run. Since the program is already running, the debugger will ask if you want to start it
again from the beginning. The program will stop in main (as we didn’t remove the
breakpoint). Now set a watch on i:
watch i
continue
D. Signal Programming
// signal.c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* set by the handler once SIGUSR1 has arrived */
static volatile sig_atomic_t usr_interrupt = 0;

static void parentdone(int signum)
{
    (void)signum;
    usr_interrupt = 1;
}

int main(void)
{
    pid_t child_pid;
    sigset_t mask, oldmask;

    /* set up the signal handler and block SIGUSR1 before fork(),
       so the signal cannot reach the child before it is ready */
    signal(SIGUSR1, parentdone);
    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);
    sigprocmask(SIG_BLOCK, &mask, &oldmask);

    /* let's fork off a child process... */
    child_pid = fork();
    /* check what the fork() call actually did */
    if (child_pid == -1) {
        perror("fork");   /* print a system-defined error message */
        exit(1);
    }
    if (child_pid == 0) {
        /* fork() succeeded, we're inside the child process */
        /* wait for a signal to arrive */
        while (!usr_interrupt)
            sigsuspend(&oldmask);
        sigprocmask(SIG_UNBLOCK, &mask, NULL);
        printf("World!\n");
        fflush(stdout);
        return 0;
    }
    else {
        /* fork() succeeded, we're inside the parent process */
        printf("Hello, ");
        fflush(stdout);
        kill(child_pid, SIGUSR1);
        wait(NULL);
        return 0;
    }
}
E. Makefile full directives example
TARGET_EXEC := final_program
BUILD_DIR := ./build
SRC_DIRS := ./src

# Find all the C++ source files to compile. This definition is an assumption
# added for completeness: OBJS below needs SRCS, which this excerpt never sets.
SRCS := $(shell find $(SRC_DIRS) -name '*.cpp')

# String substitution for every C/C++ file.
# As an example, hello.cpp turns into ./build/hello.cpp.o
OBJS := $(SRCS:%=$(BUILD_DIR)/%.o)

# String substitution (suffix version without %).
# As an example, ./build/hello.cpp.o turns into ./build/hello.cpp.d
DEPS := $(OBJS:.o=.d)

# Assumption: -MMD and -MP make the compiler write the .d files included below.
CPPFLAGS := -MMD -MP

# The final build step.
$(BUILD_DIR)/$(TARGET_EXEC): $(OBJS)
	$(CXX) $(OBJS) -o $@ $(LDFLAGS)

# Build step for C++ source
$(BUILD_DIR)/%.cpp.o: %.cpp
	mkdir -p $(dir $@)
	$(CXX) $(CPPFLAGS) $(CXXFLAGS) -c $< -o $@

.PHONY: clean
clean:
	rm -r $(BUILD_DIR)

# Include the .d makefiles. The - at the front suppresses the errors of missing
# Makefiles. Initially, all the .d files will be missing, and we don't want those
# errors to show up.
-include $(DEPS)
Revision History