All About Linux Signals
o From user space, when another process calls a function like kill(2).
o From the process itself, when it calls a function like abort(3).
o When a child process exits, the operating system sends the SIGCHLD signal.
o When the parent process dies or a hangup is detected on the controlling
terminal, SIGHUP is sent.
o When the user interrupts the program from the keyboard, SIGINT is sent.
o When the program behaves incorrectly, one of SIGILL, SIGFPE or SIGSEGV is delivered.
o When a program accesses memory that is mapped using mmap(2) but is not available
(for example when the file was truncated by another process), SIGBUS is delivered. This is a
really nasty situation when using mmap() to access files. There is no good way to handle this case.
o When a profiler like gprof is used, the program occasionally receives SIGPROF. This is
sometimes problematic if you forgot to properly handle interruption of system functions
like read(2) (errno == EINTR).
o When you use write(2) or a similar data-sending function and there is nobody to
receive your data, SIGPIPE is delivered. This is a very common case, and you must remember
that those functions may not only return an error and set the errno variable but also cause
SIGPIPE to be delivered to the program. An example is when you write to the
standard output and the user uses a pipeline to redirect your output to another
program. If that program exits while you are trying to send data, SIGPIPE is sent to your
process. A signal is used in addition to the normal error return because this event
is asynchronous and you can't actually tell how much data has been successfully sent. This can
also happen when you are sending data to a socket: the data are buffered and/or
sent over a wire, so they are not delivered to the target immediately, and the OS may only
discover that they can't be delivered after the sending function has returned.
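To illustrate the last point, here is a minimal sketch (not taken from the original article) of what a writer sees when SIGPIPE is ignored: write(2) then simply fails with EPIPE instead of the process being terminated. The helper name write_to_closed_pipe() is hypothetical, and ignoring SIGPIPE is only one possible policy.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes to a pipe that has no reader while SIGPIPE is
 * ignored. Returns the errno reported by write(2), 0 if the write
 * unexpectedly succeeded, or -1 if the setup failed. */
int write_to_closed_pipe (void)
{
    struct sigaction ign, old;
    int fds[2];
    int err = 0;

    memset (&ign, 0, sizeof (ign));
    ign.sa_handler = SIG_IGN;
    sigemptyset (&ign.sa_mask);
    if (sigaction (SIGPIPE, &ign, &old) < 0)
        return -1;

    if (pipe (fds) < 0) {
        sigaction (SIGPIPE, &old, NULL);
        return -1;
    }
    close (fds[0]);                      /* nobody will ever read */

    if (write (fds[1], "x", 1) < 0)
        err = errno;                     /* EPIPE is expected here */

    close (fds[1]);
    sigaction (SIGPIPE, &old, NULL);     /* restore the previous disposition */
    return err;
}
```

With the default disposition the same write(2) would terminate the process, which is why daemons often either ignore SIGPIPE or install a handler for it.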
Signal handlers
Traditional signal() is deprecated
The signal(2) function is the oldest and simplest way to install a signal handler, but it's
deprecated. There are a few reasons, and the most important is that the original Unix implementation
would reset the signal handler to its default value after a signal was received. If you need to handle
every signal delivered to your program separately, like handling SIGCHLD to catch a dying
process, there is a race here. You would need to set the signal handler again in the signal
handler itself, and another signal may arrive before you call the signal(2) function. This behavior
varies across different systems. Moreover, signal(2) lacks features present in sigaction(2) that you
will sometimes need.
The sigaction(2) function is a better way to set the signal action. It has the prototype:
int sigaction (int signum, const struct sigaction *act, struct sigaction *oldact);
As you can see you don't pass the pointer to the signal handler directly, but instead
a struct sigaction object. It's defined as:
struct sigaction
{
    void (*sa_handler)(int);
    void (*sa_sigaction)(int, siginfo_t *, void *);
    sigset_t sa_mask;
    int sa_flags;
    void (*sa_restorer)(void);
};
For a detailed description of this structure's fields see the sigaction(2) manual page. The most
important fields are:
o sa_handler - This is the pointer to your handler function, which has the same prototype
as a handler for signal(2).
o sa_sigaction - This is an alternative way to run the signal handler. It has two
additional arguments besides the signal number, of which the siginfo_t * is the more interesting.
It provides more information about the received signal; I will describe it later.
o sa_mask allows you to explicitly set signals that are blocked during the execution of the
handler. In addition, if you don't use the SA_NODEFER flag, the signal which triggered the handler
will also be blocked.
o sa_flags allows you to modify the behavior of the signal handling process. For a detailed
description of this field, see the manual page. To use the sa_sigaction handler you must
set the SA_SIGINFO flag here.
What is the difference between signal(2) and sigaction(2) if you don't use any additional feature
the latter provides? The answer is: portability and no race conditions. The issue with
resetting the signal handler after it's called doesn't affect sigaction(2), because the default
behavior is not to reset the handler and to block the signal during its execution. So there is no
race, and this behavior is documented in the POSIX specification. Another difference is that
with signal(2) some system calls are automatically restarted and with sigaction(2) they're not by
default.
Example use of sigaction()
In the signal handler we read two fields from the siginfo_t *siginfo parameter to read the
sender's PID and UID. This structure has more fields, I'll describe them later.
The sleep(3) function is used in a loop because it's interrupted when the signal arrives and must
be called again.
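A minimal sketch of such a program's core, assuming SIGTERM is the signal of interest. The function names install_sigterm_handler() and wait_for_sigterm() are made up for this illustration, and the sender's IDs are stored in globals rather than printed, because printf(3) is not safe to call from a handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Filled in by the handler. Assumption: pid_t and uid_t values fit in
 * sig_atomic_t, which holds on Linux where all three are int-sized. */
static volatile sig_atomic_t got_signal = 0;
static volatile sig_atomic_t sender_pid = 0;
static volatile sig_atomic_t sender_uid = 0;

static void hdl (int sig, siginfo_t *siginfo, void *context)
{
    (void) sig;
    (void) context;
    sender_pid = (sig_atomic_t) siginfo->si_pid;
    sender_uid = (sig_atomic_t) siginfo->si_uid;
    got_signal = 1;
}

int install_sigterm_handler (void)
{
    struct sigaction act;

    memset (&act, 0, sizeof (act));
    act.sa_sigaction = hdl;          /* three-argument handler ...          */
    act.sa_flags = SA_SIGINFO;       /* ... which requires this flag        */
    sigemptyset (&act.sa_mask);
    return sigaction (SIGTERM, &act, NULL);
}

/* sleep(3) returns early when a signal arrives, so it is called in a loop. */
void wait_for_sigterm (void)
{
    while (!got_signal)
        sleep (1);
}
```

A caller would run install_sigterm_handler(), then wait_for_sigterm(), and afterwards print sender_pid and sender_uid from normal (non-handler) code.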
SA_SIGINFO handler
In the previous example SA_SIGINFO is used to pass more information to the signal handler as
arguments. We've seen that the siginfo_t structure contains si_pid and si_uid fields (PID
and UID of the process that sends the signal), but there are many more. They are all described
in sigaction(2) manual page. On Linux only si_signo (signal number) and si_code (signal
code) are available for all signals. Presence of other fields depends on the signal type. Some
other fields are:
o si_code - Reason why the signal was sent. It may be SI_USER if it was delivered due
to kill(2) or raise(3), SI_KERNEL if the kernel sent it, and a few more. For some signals there are
special values like ILL_ILLADR, telling you that SIGILL was sent due to an illegal addressing
mode.
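#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <string.h>

/* volatile sig_atomic_t: with a plain int the compiler may keep the flag
   in a register and the loop below may never see the handler's write. */
static volatile sig_atomic_t exit_flag = 0;

static void hdl (int sig)
{
    (void) sig;
    exit_flag = 1;
}

int main (int argc, char *argv[])
{
    struct sigaction act;

    memset (&act, '\0', sizeof(act));
    act.sa_handler = &hdl;
    if (sigaction(SIGTERM, &act, NULL) < 0) {
        perror ("sigaction");
        return 1;
    }
    /* Busy-wait until the handler sets the flag. */
    while (!exit_flag)
        ;
    return 0;
}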
Atomic Type
There is one data type defined that is guaranteed to be atomically read and written both in signal
handlers and in the code that uses it: sig_atomic_t. The size of this type is unspecified, but it's an
integer type. In theory this is the only type you can safely assign and read if it's also accessed in
signal handlers. Keep in mind that:
o It doesn't work like a mutex: it's guaranteed that a read or write of this type translates into
an uninterruptible operation, but code such as:
    sig_atomic_t i = 0;
    void sig_handler (int sig)
    {
        if (i++ == 5) {
            // ...
        }
    }
isn't safe: there is a read and an update in the if condition, but only single reads and single writes
are atomic.
o Don't try to use this type in a multi-threaded program as a type that can be used without
a mutex. It's only intended for signal handlers and has nothing to do with mutexes!
o You don't need to worry if data that are modified or read in a signal handler are also modified
or read in the program, as long as that happens only in parts where the signal is blocked. Later I'll
show how to block signals. But you will still need the volatile keyword.
Signal-safe functions
You can't just do anything in a signal handler. Remember that your program is interrupted, you
don't know at which point, which data objects are in the middle of being modified. It may be not
only your code, but a library you are using or the standard C library. In fact there is a quite short
list of function you can safely call from a signal handler in signal(7). You can for example open a
file with open(2), remove a file with unlink(2), call _exit(2) (but not exit(3)!) and more. In
practice this list is so limited that the best you can do is to just set some global flag to notify the
process to do something like exiting. On the other hand the wait(2) and waitpid(2) functions can
be used, so you can cleanup dead processes in SIGCHLD, unlink(2) is available, so you can
delete a pid file etc.
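As an illustration, here is a small sketch of a handler that restricts itself to write(2), which is on the signal-safe list, instead of printf(3), which is not. The names on_usr1() and install_usr1_handler() and the choice of SIGUSR1 are assumptions made for this example only.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t usr1_seen = 0;

static void on_usr1 (int sig)
{
    static const char msg[] = "got SIGUSR1\n";
    int saved_errno = errno;         /* write(2) may change errno */

    (void) sig;
    /* write(2) is signal-safe; printf(3) and friends are not. */
    if (write (STDERR_FILENO, msg, sizeof (msg) - 1) < 0) {
        /* nothing safe to do about a failed diagnostic */
    }
    usr1_seen = 1;
    errno = saved_errno;             /* don't disturb the interrupted code */
}

int install_usr1_handler (void)
{
    struct sigaction act;

    memset (&act, 0, sizeof (act));
    act.sa_handler = on_usr1;
    sigemptyset (&act.sa_mask);
    return sigaction (SIGUSR1, &act, NULL);
}
```

Saving and restoring errno matters because the handler may interrupt code that is about to inspect errno after a failed call.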
signalfd(2) is a fairly new Linux-specific call, available since the 2.6.22 kernel, that allows you to
receive signals using a file descriptor. This makes it possible to handle signals in a synchronous
way, without providing handler functions. Let's see an example of signalfd() use.
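The following is a minimal sketch rather than a complete program; the helper names open_signal_fd() and read_signal() are hypothetical, and SIGTERM and SIGINT are chosen arbitrarily. The signals must be blocked first, otherwise they would still be delivered the usual asynchronous way.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Blocks SIGTERM and SIGINT and returns a descriptor from which they can
 * be read, or -1 on error. */
int open_signal_fd (void)
{
    sigset_t mask;

    sigemptyset (&mask);
    sigaddset (&mask, SIGTERM);
    sigaddset (&mask, SIGINT);
    if (sigprocmask (SIG_BLOCK, &mask, NULL) < 0)
        return -1;
    return signalfd (-1, &mask, 0);      /* -1: create a new descriptor */
}

/* Reads one signal from sfd, blocking until one is pending.
 * Returns the signal number, or -1 on error. */
int read_signal (int sfd)
{
    struct signalfd_siginfo si;
    ssize_t n = read (sfd, &si, sizeof (si));

    if (n != (ssize_t) sizeof (si))
        return -1;
    return (int) si.ssi_signo;           /* ssi_pid, ssi_uid etc. are also filled in */
}
```

In a server the descriptor returned by open_signal_fd() would be registered with poll(2) next to the sockets, and read_signal() called only when poll(2) reports it readable.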
This is perfect for a single-process server whose main loop executes a function
like poll(2) to handle many connections. It simplifies signal handling because the signal
descriptor can be added to poll's array of descriptors and handled like any other of them,
without asynchronous actions. You handle the signal when you are ready for it, because your
program is not interrupted.
Handling Specific Signals
Handling SIGCHLD
If you create new processes in your program and don't really want to wait until they exit, and
possibly their exit status doesn't matter, and you just want to clean up zombie processes, you can
create a SIGCHLD handler that does just that and forget about the processes you've created. This
handler can look like this one:
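A minimal sketch of such a handler follows. The names sigchld_hdl(), install_sigchld_handler() and spawn_child() are hypothetical, and the child does nothing but exit.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void sigchld_hdl (int sig)
{
    int saved_errno = errno;         /* waitpid(2) may change errno */

    (void) sig;
    /* Several children may exit before the handler runs once, so reap
     * everything that is ready without blocking. */
    while (waitpid (-1, NULL, WNOHANG) > 0)
        ;
    errno = saved_errno;
}

int install_sigchld_handler (void)
{
    struct sigaction act;

    memset (&act, 0, sizeof (act));
    act.sa_handler = sigchld_hdl;
    sigemptyset (&act.sa_mask);
    return sigaction (SIGCHLD, &act, NULL);
}

/* Starts a child and immediately forgets about it.
 * Returns the child's PID in the parent, or -1 on error. */
pid_t spawn_child (void)
{
    pid_t pid = fork ();

    if (pid == 0)
        _exit (0);                   /* child: real work would go here */
    return pid;
}
```

The loop around waitpid(2) is there because standard signals are not queued: two children exiting close together may produce only one SIGCHLD.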
This way, whenever a child exits it will be cleaned up, but the information about which process it
was, why it exited and its exit status is forgotten. You could make the handler more intelligent, but
remember not to use any function that is not listed as signal-safe.
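You must remember that if you create child processes, SIGCHLD should have a handler. Historically
the behavior of explicitly ignoring this signal was unspecified (POSIX.1-2001 and Linux define it:
the children are then reaped automatically), so for portability at least a handler that doesn't do
anything is required.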
Handling SIGBUS
The SIGBUS signal is sent to the process when you access mapped memory (with mmap(2)) that
doesn't correspond to a file. A common example is that the file you've mapped was later
truncated (possibly by another program) and you read past its current end. Accessing files this
way doesn't require any system function that could return an error; you just read from memory
as if it were on the heap or stack. This is a really bad situation when you don't want your
program to terminate after a file read error. Unfortunately handling SIGBUS isn't simple or clean,
but it's possible. If you want to continue running your program you have to use longjmp(3). It's
something like goto but worse! If we receive SIGBUS, we have to jump to some other place in the
program where the mmap()ed memory is not accessed. If you install an empty handler for this
signal, then in case of a read error the program will be interrupted, the signal handler executed, and
control returns to the same place that caused the error. So we need to jump to another place
from the signal handler. This sounds low-level, but it's possible using standard POSIX functions.
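A minimal sketch of the technique, using sigsetjmp(3) and siglongjmp(3) so that the signal mask is restored as well. All names here (safe_read_byte(), demo_truncated_mapping(), the temporary file path) are made up for the illustration, and it assumes /tmp is writable and that the file system raises SIGBUS for accesses past the end of a truncated file.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf bus_jmp;
static volatile sig_atomic_t bus_jmp_armed = 0;

static void sigbus_hdl (int sig)
{
    (void) sig;
    if (!bus_jmp_armed)
        abort ();                        /* SIGBUS from an unexpected place */
    siglongjmp (bus_jmp, 1);             /* leave the handler, never return */
}

/* Returns 0 and stores mem[i] in *out, or -1 if the access raised SIGBUS. */
static int safe_read_byte (const volatile char *mem, size_t i, char *out)
{
    if (sigsetjmp (bus_jmp, 1) != 0) {   /* 1: also save the signal mask */
        bus_jmp_armed = 0;
        return -1;                       /* we arrived here from the handler */
    }
    bus_jmp_armed = 1;
    *out = mem[i];                       /* may raise SIGBUS */
    bus_jmp_armed = 0;
    return 0;
}

/* Hypothetical demo: maps a temporary file, truncates it and reads the
 * mapping. Returns 1 if SIGBUS was caught, 0 if not, -1 on setup errors. */
int demo_truncated_mapping (void)
{
    char path[] = "/tmp/sigbus_demo_XXXXXX";
    struct sigaction act;
    char *mem, c = 0;
    int fd, caught;

    memset (&act, 0, sizeof (act));
    act.sa_handler = sigbus_hdl;
    sigemptyset (&act.sa_mask);
    if (sigaction (SIGBUS, &act, NULL) < 0)
        return -1;

    fd = mkstemp (path);
    if (fd < 0)
        return -1;
    unlink (path);
    if (write (fd, "d", 1) != 1)
        return -1;
    mem = mmap (NULL, 1, PROT_READ, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED)
        return -1;

    if (safe_read_byte (mem, 0, &c) != 0 || c != 'd')   /* file still has data */
        return -1;
    if (ftruncate (fd, 0) < 0)
        return -1;
    caught = (safe_read_byte (mem, 0, &c) == -1);       /* the page is gone now */

    munmap (mem, 1);
    close (fd);
    return caught;
}
```

Keeping the guarded access inside one small function limits how much program state can be left inconsistent when the jump happens.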
You must keep in mind the list of signal-safe functions. In this example we never actually return
from the signal handler. The stack is cleaned up, but the program is restarted in a completely different
place, so if you had, for example, a mutex locked during the operation like:
    pthread_mutex_lock (&m);
    for (l = 0; l < 1000; l++)
        if (mem[l] == 'd') // BANG! SIGBUS here!
            j++;
    pthread_mutex_unlock (&m);
then after longjmp(3) the mutex is still held, although in every other situation the mutex is released.
So handling SIGBUS is possible but very tricky and can introduce bugs that are very hard to
debug. The program's code also becomes ugly.
Handling SIGSEGV
Handling the SIGSEGV (segmentation fault) signal is also possible. In most cases returning from
the signal handler makes no sense, since the program will be restarted from the instruction that
caused the segmentation fault. So if you have no way to fix the state of the program to
let it continue running properly from the moment it crashed, you must end the program. One
example of when you may resume the program is when you have memory obtained
using mmap(2) that is read-only: you may check in the signal handler whether the cause of the
segmentation fault was a write to this memory (using data from siginfo_t) and
use mprotect(2) to change the protection of this memory. How practical is it? I don't know.
Exhausting stack space is one of the causes of a segmentation fault. In this case running a signal
handler is not possible because it requires space on the stack. To allow handling SIGSEGV in
such a condition, the sigaltstack(2) function exists, which sets an alternative stack to be used by
signal handlers.
Handling SIGABRT
When handling this signal you should keep in mind how the abort(3) function works: it raises the
signal twice, but the second time the SIGABRT handler is restored to the default state, so the
program terminates even if you have a handler defined. So you actually have a chance to do
something in case of abort(3) before the program terminates. It's possible to avoid terminating the
program by not returning from the signal handler and using longjmp(3) instead, as described
earlier.
Default actions
With each signal there is an associated default action, which is taken when you don't provide a
signal handler and you don't block the signal. The actions are:
o Termination of a process. This is the most common action, not only
for SIGTERM or SIGINT but also for signals like SIGPIPE, SIGUSR1, SIGUSR2 and others.
o Termination with a core dump. This is common for signals that indicate a bug in the
program like SIGSEGV, SIGILL, SIGABRT and others; SIGQUIT also dumps core.
o A few signals are ignored by default, like SIGCHLD.
o SIGSTOP (and similar stop signals) cause the program to suspend, and SIGCONT causes it to
continue. The most common situation is when you use the CTRL-Z command in the shell.
If you set a signal handler in your program you must be prepared for some system calls to be
interrupted by signals. Even if you don't set any signal handler, signals could still be delivered
to your program, so it's best to be prepared for that. An example is compiling your
program with the -pg gcc option (enable profiling): when running, it occasionally
gets SIGPROF, handled without your knowledge, but causing syscalls to be interrupted.
What is interrupted?
Every system or standard library function that uses a system call can potentially be interrupted,
and you must consult its manual page to be sure. In general, functions that return immediately
(that don't wait for any I/O operation to complete or sleep) are not interruptible, like socket(2),
which just allocates a socket and doesn't wait for anything. On the other hand, functions that wait for
something (a network transfer, a pipe read, an explicit sleep, etc.) are interruptible,
like select(2), read(2) and connect(2), and you must be prepared for that. What exactly happens
when a signal arrives while such a function is waiting is described in its manual
page.
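#include <unistd.h>
#include <signal.h>

/* Empty handler: its only purpose is to make SIGTERM interrupt sleep(3). */
static void hdl (int sig)
{
    (void) sig;
}

/* When interrupted, sleep(3) returns the seconds left, so call it again. */
void my_sleep (int seconds)
{
    while (seconds > 0)
        seconds = sleep (seconds);
}

int main (int argc, char *argv[])
{
    signal (SIGTERM, hdl);
    my_sleep (10);
    return 0;
}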
This example works, but if you try it and send a few signals during the sleep you can see that it may
sleep for a different amount of time. This is because sleep(3) takes its argument and returns its
value with 1 s resolution, so it can't tell you precisely how long it still needs to sleep after an
interruption.
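One way around the coarse resolution, not used in the example above, is nanosleep(2), which reports the remaining time with nanosecond granularity. The following is a sketch under that substitution; the function name sleep_ms() is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Sleeps for ms milliseconds, restarting after every interruption with the
 * remaining time reported by nanosleep(2).
 * Returns 0 on success, -1 on any error other than EINTR. */
int sleep_ms (long ms)
{
    struct timespec req, rem;

    req.tv_sec = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;
    while (nanosleep (&req, &rem) < 0) {
        if (errno != EINTR)
            return -1;
        req = rem;                   /* continue with what is left */
    }
    return 0;
}
```

Each restart still adds a little rounding error, so a program that needs an exact deadline would rather sleep until an absolute time.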
A very important thing in daemon programs is proper handling of interrupted system functions.
One part of the problem is that common functions that transfer data, like recv(2) and write(2), and
similar ones like select(2), may be interrupted by a signal that is then handled by its handler, so you
need to continue receiving data, restart select(2), etc. We've just seen a simple example of how to
handle it in the case of sleep(3).
This program reads from its standard input and copies the data to the standard output.
Additionally, when SIGUSR1 is received it prints to stderr how many bytes have already been
read and written. It installs a signal handler which sets a global flag to 1 when called. Whatever the
program is doing at the moment it receives the signal, the numbers are printed immediately. It
works because the read(2) and write(2) functions are interrupted by signals even during operation.
In the case of those functions two things might happen:
o When read(2) waits for data or write(2) waits for stdout to accept some data, no data
have been transferred yet in the call, and SIGUSR1 arrives, those functions return -1.
You can distinguish this situation from other errors by reading the value of the errno variable. If
it's EINTR it means that the function was interrupted without any data transferred and we can
call the function again with the same parameters.
o Another case is that some data were transferred but the function was interrupted before it
finished. In this case the functions don't return an error but a value less than the supplied data
size (or buffer size). Neither the return value nor the errno variable tells us that the function
was interrupted by a signal; if we want to distinguish this case we need to set some flag in the
signal handler (as we do in this example). To continue after the interruption we need to call the
function again, keeping in mind that some data were consumed or read and we must restart from
the right point. In our example only the write(2) must be properly restarted: we use
the written variable to track how many bytes were actually written and properly
call write(2) again if there are data left in the buffer.
Remember that not all system calls behave exactly the same way; consult their manual pages to
make sure.
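The restart logic described above can be sketched as follows. This is an illustration only: the helper names write_all() and copy_fd() are hypothetical, and the SIGUSR1 statistics part of the program is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

/* Writes all len bytes of buf to fd, restarting after EINTR and after
 * short writes. Returns 0 on success, -1 on error. */
int write_all (int fd, const char *buf, size_t len)
{
    size_t written = 0;

    while (written < len) {
        ssize_t n = write (fd, buf + written, len - written);
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* nothing was transferred: retry as is */
            return -1;
        }
        written += (size_t) n;       /* short write: resume after the sent part */
    }
    return 0;
}

/* Copies in_fd to out_fd until end of file.
 * Returns the number of bytes copied, or -1 on error. */
long copy_fd (int in_fd, int out_fd)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t n = read (in_fd, buf, sizeof (buf));
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted before any data arrived */
            return -1;
        }
        if (n == 0)
            return total;            /* end of file */
        if (write_all (out_fd, buf, (size_t) n) < 0)
            return -1;
        total += n;
    }
}
```

A short read needs no special treatment here, because whatever was read is simply written out and the next read(2) continues from the right place.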
Reading the sigaction(2) manual page you might think that setting the SA_RESTART flag is simpler
than handling system call interruption. The documentation says that setting it will make certain
system calls automatically restartable across signals. It's not specified which calls are restarted.
This flag is mainly used for compatibility with older systems; don't use it.
Blocking signals
There is sometimes a need to block some signals rather than handle them. The traditional way is
to use the deprecated signal(2) function with the SIG_IGN constant as the signal handler. There is
also a newer, recommended function to do that: sigprocmask(2). Its usage is a bit more complex;
let's see an example of signal blocking with sigprocmask().
This program will sleep for 10 seconds and will not react to the SIGTERM signal during the sleep. It
works this way because we've blocked the signal with sigprocmask(2). The signal is not ignored,
it's blocked: it is held pending by the kernel and delivered when we unblock the signal.
This is different from ignoring the signal with signal(2). First, sigprocmask(2) is more
complicated: it operates on a set of signals represented by sigset_t, not on one signal.
The SIG_BLOCK parameter says that the signals in the set are to be blocked (in addition to the
already blocked signals). SIG_SETMASK says that the signals in the set are to be blocked, and
signals that are not present in the set are to be unblocked. The third parameter, if not NULL, is
filled with the previous signal mask. This allows you to restore the mask after modifying the process'
signal mask. We do it in this example. The first sleep(3) function is executed
with SIGTERM blocked; if the signal arrives at this moment, it stays pending. When we restore the
original signal mask, we unblock SIGTERM and it's delivered: the signal handler is called.
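The block and restore sequence described above can be sketched like this. The helper names install_term_handler(), block_sigterm() and restore_mask() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t term_seen = 0;

static void term_hdl (int sig)
{
    (void) sig;
    term_seen = 1;
}

int install_term_handler (void)
{
    struct sigaction act;

    memset (&act, 0, sizeof (act));
    act.sa_handler = term_hdl;
    sigemptyset (&act.sa_mask);
    return sigaction (SIGTERM, &act, NULL);
}

/* Adds SIGTERM to the blocked signals; the previous mask goes to *orig_mask. */
int block_sigterm (sigset_t *orig_mask)
{
    sigset_t mask;

    sigemptyset (&mask);
    sigaddset (&mask, SIGTERM);
    return sigprocmask (SIG_BLOCK, &mask, orig_mask);
}

/* Restores a mask saved by block_sigterm(); a pending SIGTERM is
 * delivered during this call. */
int restore_mask (const sigset_t *orig_mask)
{
    return sigprocmask (SIG_SETMASK, orig_mask, NULL);
}
```

A program would call block_sigterm(), then sleep(10), then restore_mask(); a SIGTERM sent during the sleep runs term_hdl() only at the restore step.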
In the previous example nothing really useful was presented, such use of sigprocmask(2) isn't
very interesting. Here is a bit more complex example of code that really needs sigprocmask(2):
while (!exit_request)
{
    fd_set fds;
    int res;
    /* BANG! we can get SIGTERM at this point. */
    FD_ZERO (&fds);
    FD_SET (lfd, &fds);
    res = select (lfd + 1, &fds, NULL, NULL, NULL);
    /* accept connection if listening socket lfd is ready */
}
Let's say it's an example of a network daemon that accepts connections
using select(2) and accept(2). It can use select(2) because it listens on multiple interfaces or
also waits for some events other than incoming connections. We want to be able to cleanly shut
it down with a signal like SIGTERM (remove the PID file, wait for pending connections to finish,
etc.). To do this we have a handler defined for the signal which sets a global flag, and we rely on the
fact that select(2) will be interrupted when the signal arrives at the moment we are just waiting
for some events. If the main loop in the program looks similar to the above code, everything
works... almost. There is a specific case in which the signal will not interrupt the program even if
it does nothing at all at the moment: when it arrives between checking the while condition and
executing select(2). The select(2) function will not be interrupted (because the signal has already
been handled) and will sleep until some file descriptor it monitors becomes ready.
This is where the sigprocmask(2) and other "new" functions are useful. Let's see an improved
version:
sigemptyset (&mask);
What's the difference between select(2) and pselect(2)? The most important one is that the latter
takes an additional argument of type sigset_t: the signal mask that is in effect during the
execution of the system call, so signals absent from it are unblocked while it waits. The idea is
that the signals are blocked, then global variables/flags
that are changed in signal handlers are read, and then pselect(2) runs. There is no race
because pselect(2) unblocks the signals atomically. See the example: the exit_request flag is
checked while the signal is blocked, so there is no race here that would lead to
executing pselect(2) just after the signal arrives. In fact, in this example we block the signal all
the time and the only place where it can be delivered to the program is the pselect(2) execution.
In the real world you may block the signals only for the part of the program that contains the flag
check and the pselect(2) call, to allow interruption in other places in the program.
Another difference not related to signals is that select(2)'s timeout parameter is of
type struct timeval * and pselect(2)'s is const struct timespec *. See
the pselect(2) manual page for more information.
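The pattern described above can be sketched as follows. This is a minimal illustration, not the original listing: setup_sigterm() and wait_for_connection() are hypothetical names, and lfd stands for any descriptor the daemon waits on.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

static volatile sig_atomic_t exit_request = 0;

static void term_hdl (int sig)
{
    (void) sig;
    exit_request = 1;
}

/* Installs the SIGTERM handler and blocks SIGTERM. The previous mask,
 * which is the one to hand to pselect(2), is stored in *orig_mask. */
int setup_sigterm (sigset_t *orig_mask)
{
    struct sigaction act;
    sigset_t mask;

    memset (&act, 0, sizeof (act));
    act.sa_handler = term_hdl;
    sigemptyset (&act.sa_mask);
    if (sigaction (SIGTERM, &act, NULL) < 0)
        return -1;
    sigemptyset (&mask);
    sigaddset (&mask, SIGTERM);
    return sigprocmask (SIG_BLOCK, &mask, orig_mask);
}

/* Waits until lfd is readable or SIGTERM has been handled.
 * Returns 1 if lfd is ready, 0 if exit was requested, -1 on error. */
int wait_for_connection (int lfd, const sigset_t *orig_mask)
{
    while (!exit_request) {          /* checked with SIGTERM blocked: no race */
        fd_set fds;
        int res;

        FD_ZERO (&fds);
        FD_SET (lfd, &fds);
        /* SIGTERM can be delivered only while pselect() is waiting. */
        res = pselect (lfd + 1, &fds, NULL, NULL, NULL, orig_mask);
        if (res < 0 && errno != EINTR)
            return -1;
        if (res > 0 && FD_ISSET (lfd, &fds))
            return 1;
    }
    return 0;
}
```

Because the flag is only ever changed while pselect(2) is waiting, a SIGTERM that arrives at any other time stays pending and interrupts the very next pselect(2) call.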
The proper solution is to use a dedicated function to wait for a signal: see an example of using
sigtimedwait().
This program creates a child process that sleeps a few seconds (in a real-world application this
process would do something like execve(2)) and waits for it to finish. We want to implement a
timeout after which the process is killed. The waitpid(2) function does not have a timeout
parameter, but we can use the SIGCHLD signal that is sent when the child process exits. One
solution would be to have a handler for this signal and a loop with sleep(3) in it. The sleep(3) will
either be interrupted by the SIGCHLD signal or sleep for the whole time, which means the timeout
occurred. Such a loop would have a race because the signal could arrive not during the sleep(3), but
somewhere else, like just before the sleep(3). To solve this we use the sigtimedwait(2) function,
which allows us to wait for a signal without any race. We can do this because we block
the SIGCHLD signal before fork(2) and then call sigtimedwait(2), which waits for the blocked
signal and, if it arrives, removes it from the pending set and returns without running a handler. It
also takes a timeout parameter so it will not sleep forever. So without any trick we can wait for the
signal safely.
One drawback is that if sigtimedwait(2) is interrupted by another signal it returns with an error
and doesn't tell us how much time elapsed, so we don't know how to properly restart it. The
proper solution is to wait for all signals we expect at this point in the program or to block other
signals. There is another small bug in the program: when we kill the process, SIGCHLD is sent and
we don't handle it anywhere. We should unblock the signal before waitpid(2) and have a handler
for it.
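A sketch of the approach described above, written as a function rather than a whole program. The name run_child_with_timeout() and its parameters are hypothetical, and the sketch shares the first drawback mentioned above: after an interruption by another signal it simply restarts with the full timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Forks a child that sleeps child_secs seconds and waits at most
 * timeout_secs for it. Returns 0 if the child exited in time, 1 if it
 * was killed after the timeout, -1 on error. */
int run_child_with_timeout (unsigned child_secs, long timeout_secs)
{
    sigset_t mask, orig_mask;
    struct timespec timeout;
    pid_t pid;
    int result = 0;

    sigemptyset (&mask);
    sigaddset (&mask, SIGCHLD);
    /* Block SIGCHLD before fork(2) so that it cannot be missed. */
    if (sigprocmask (SIG_BLOCK, &mask, &orig_mask) < 0)
        return -1;

    pid = fork ();
    if (pid < 0) {
        sigprocmask (SIG_SETMASK, &orig_mask, NULL);
        return -1;
    }
    if (pid == 0) {
        sleep (child_secs);          /* stand-in for real work or execve(2) */
        _exit (0);
    }

    timeout.tv_sec = timeout_secs;
    timeout.tv_nsec = 0;
    for (;;) {
        if (sigtimedwait (&mask, NULL, &timeout) >= 0)
            break;                   /* SIGCHLD arrived: the child is done */
        if (errno == EINTR)
            continue;                /* unrelated signal: wait again */
        if (errno == EAGAIN) {       /* timeout expired */
            kill (pid, SIGKILL);
            result = 1;
        } else {
            result = -1;
        }
        break;
    }

    if (result >= 0)
        while (waitpid (pid, NULL, 0) < 0 && errno == EINTR)
            ;                        /* reap the child */
    sigprocmask (SIG_SETMASK, &orig_mask, NULL);
    return result;
}
```

For example, run_child_with_timeout(1, 5) lets the child finish, while run_child_with_timeout(5, 1) kills it after about one second.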
There are also other functions that can be used to wait for a signal:
o sigsuspend(2) - waits for any signal. It takes a signal mask that atomically replaces the current
one for the duration of the call, so the signals missing from it are unblocked without introducing
race conditions.
o sigwaitinfo(2) - like sigtimedwait(2), but without the timeout parameter.
o pause(2) - a simple function taking no arguments. It just waits for any signal. Don't use it,
you will introduce a race condition similar to the one described previously; use sigsuspend(2).
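As an illustration of the sigsuspend(2) entry, here is a minimal sketch that waits for SIGUSR1 without a race. The names prepare_usr1() and wait_for_usr1() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t usr1_flag = 0;

static void usr1_hdl (int sig)
{
    (void) sig;
    usr1_flag = 1;
}

/* Installs the handler and blocks SIGUSR1; the previous mask is stored
 * in *orig_mask. */
int prepare_usr1 (sigset_t *orig_mask)
{
    struct sigaction act;
    sigset_t mask;

    memset (&act, 0, sizeof (act));
    act.sa_handler = usr1_hdl;
    sigemptyset (&act.sa_mask);
    if (sigaction (SIGUSR1, &act, NULL) < 0)
        return -1;
    sigemptyset (&mask);
    sigaddset (&mask, SIGUSR1);
    return sigprocmask (SIG_BLOCK, &mask, orig_mask);
}

/* Sleeps until SIGUSR1 has been handled. SIGUSR1 must be blocked by the
 * caller; it is unblocked only while sigsuspend(2) is waiting. */
void wait_for_usr1 (const sigset_t *orig_mask)
{
    sigset_t wait_mask = *orig_mask;

    sigdelset (&wait_mask, SIGUSR1);     /* make sure it can be delivered */
    while (!usr1_flag)
        sigsuspend (&wait_mask);         /* returns after a handler has run */
}
```

With pause(2) in place of sigsuspend(2) the signal would have to be unblocked beforehand, and one arriving between the flag check and the pause(2) call would be missed.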
Sending signals
Sending signal from keyboard
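There are two special key combinations that can be used in a terminal to send a signal to the
running application:
o CTRL-C - sends SIGINT, whose default action is to terminate the application.
o CTRL-\ - sends SIGQUIT, whose default action is to terminate the application and dump core.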
kill()
The simplest way to send a signal to a process is to use kill(2). It takes two
arguments: pid (PID of the process) and sig (the signal to send). Although the function has a
simple interface, it's worth reading the manual page because there are a few more things we can do
than just sending a signal to a process:
o The pid can be 0: the signal will be sent to all processes in the process group of the caller.
o The pid can be -1: the signal is sent to every process you have permission to send
signals to, except init and system processes (you won't kill system threads).
o The pid can be less than -1 to send the signal to all processes in the process group whose ID
is -pid.
o You can check if a process exists by sending signal 0. Nothing is really sent, but
the kill(2) return value will be as if it sent a signal, so if it's OK it means that the process exists.
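The existence check from the last item can be sketched as below. The function name process_exists() is hypothetical. Note one detail beyond the simple rule above: a failure with EPERM also means the process exists, only that we are not allowed to signal it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if a process with the given PID exists, 0 if it does not,
 * -1 on an unexpected error. */
int process_exists (pid_t pid)
{
    if (kill (pid, 0) == 0)
        return 1;                /* it exists and we may signal it */
    if (errno == EPERM)
        return 1;                /* it exists, but belongs to someone else */
    if (errno == ESRCH)
        return 0;                /* no such process */
    return -1;
}
```

For example, process_exists(getpid()) is always 1. Keep in mind that the answer can be out of date by the time it is used, because PIDs are recycled.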
There are two standard functions that will help you send signals to yourself:
o raise(3) - Just sends the specified signal to yourself, but in a multithreaded program it
sends the signal to the calling thread, not the process.
o abort(3) - Sends SIGABRT, but before that it unblocks this signal, so this function
always works; you don't need to bother about unblocking this signal. It will also terminate your
program even if you have a handler for SIGABRT, by restoring the default signal handler and
sending the signal again. You can prevent this as was mentioned in the signal handling chapter.
Real-time signals
The POSIX specification defines so-called real-time signals and Linux supports them. They are to be
used by the programmer and have no predefined meaning. Two macros are
available, SIGRTMIN and SIGRTMAX, that give the range of these signals. You can refer to one
using SIGRTMIN+n where n is some number. Never hard-code their numbers: real-time signals
are used by the threading library (both LinuxThreads and NPTL), so they adjust SIGRTMIN at run
time.
What's the difference between RT signals and standard signals? There are a couple:
o More than one RT signal can be queued for the process if it has the signal blocked while
someone sends it. With standard signals only one of a given type is queued; the rest are ignored.
o The order of delivery of RT signals is guaranteed to be the same as the sending order.
o The PID and UID of the sending process are written to the si_pid and si_uid fields of siginfo_t.
For more information see the section about real-time signals in signal(7).
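The queuing difference can be observed with a small experiment. The sketch below uses a hypothetical helper, count_queued(): it blocks a signal, sends it to the process several times and then counts how many instances can be taken from the pending set with sigtimedwait(2) and a zero timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <time.h>
#include <unistd.h>

/* Blocks sig, sends it to this process count times and returns how many
 * instances were actually queued, or -1 on error. */
int count_queued (int sig, int count)
{
    sigset_t mask, orig_mask;
    struct timespec zero = { 0, 0 };
    int i, queued = 0;

    sigemptyset (&mask);
    sigaddset (&mask, sig);
    if (sigprocmask (SIG_BLOCK, &mask, &orig_mask) < 0)
        return -1;

    for (i = 0; i < count; i++)
        kill (getpid (), sig);

    /* A zero timeout makes sigtimedwait(2) poll the pending set. */
    while (sigtimedwait (&mask, NULL, &zero) == sig)
        queued++;

    sigprocmask (SIG_SETMASK, &orig_mask, NULL);
    return queued;
}
```

On Linux, count_queued(SIGRTMIN, 3) is expected to report 3, while count_queued(SIGUSR1, 3) reports 1, because the repeated standard signal is merged into the one already pending.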
o Process-directed signals (sent to a PID using functions like kill(2)). Threads have their
own separate signal masks, which can be manipulated using pthread_sigmask(3) similarly
to sigprocmask(2), so such a signal is not delivered to a thread that has this signal blocked. It's
delivered to one of the threads in the process that has this signal unblocked. It's unspecified which
thread will get it. If all threads have the signal blocked, it's queued in the per-process queue. If
there is no signal handler defined for the signal and the default action is to terminate the process,
with or without dumping core, the whole process is terminated.
o Thread-directed signals. There is a special function to send a signal to a specific
thread: pthread_kill(3). It can be used to send a signal from one thread to another (or to itself).
This way the signal will be delivered to or queued for the specific thread. There are also
thread-directed signals generated by the operating system, like SIGSEGV. If there is no signal handler
defined for a signal whose default action is to terminate the process, a thread-directed signal
terminates the whole process.
As you can see, there is a process-wide signal queue and there are per-thread queues.
Signal handlers
Signal actions are set for the whole process. The behavior of signal(2) is unspecified for
multi-threaded applications; sigaction(2) must be used. Keep in mind that none of the pthreads-related
functions are described as signal-safe in signal(7). In particular, using mutexes in signal handlers
is a very bad idea.
Real-time signals
As previously said, both threading implementations (LinuxThreads and NPTL) internally use some
number of real-time signals, so it's another good reason to always refer to those signals
using SIGRTMIN+n notation.
The dnotify mechanism uses a similar technique: you are notified about file system actions using
signals related to the file descriptors of monitored directories or files. The recommended way of
monitoring files is now inotify.