CPU Affinity

On a multi-core/multi-processor system, the OS usually distributes different processes across all available processors (CPUs) in a way that allows the system to work most efficiently. However, for some reason, you might like to take charge and overrule the kernel's scheduling to bind your application to a processor/CPU of your choice. This is known as CPU affinity.

In this article, I will try to cover how a normal Linux user can set/retrieve a specific task's affinity from the command line, and then we will go further with the actual implementation of system calls.

So, what is CPU affinity?
On symmetric multi-processing (SMP) systems, the operating system's process scheduler not only decides when a process can run, but also where it should run. CPU (or processor) affinity is the term that describes this property of the scheduler to associate a particular process with a specific processor or CPU.

There are two types of CPU affinity that the Linux scheduler supports: soft affinity (or natural affinity), and hard affinity.

The soft affinity of a process is merely an attempt by the scheduler to run the process on the same processor on which it ran the last time. This way the scheduler tries to improve the performance with the 'locality of reference'. However, this is not always possible if, for instance, the preferred or ideal processor is busy. The scheduler then migrates the process to a different processor for execution.

On the other hand, hard affinity provides users/programmers the flexibility to override the natural affinity for their tasks/processes. In Linux, all processes are represented by the kernel data structure task_struct that contains fields related to the process attributes. Among these is the cpus_allowed bitmask field that specifies which CPU(s) may run the task. This bitmask consists of a series of n bits, one for each of the n logical processors in the system. So if a system has four processors (i.e., a multi-processor system), this bitmask will have four bits, and if each processor is a dual core, then it'll have an eight-bit bitmask.

The default state of a process in Linux for the cpus_allowed field is all 1s. It indicates that the process is allowed to run on any available CPU, and can migrate across processors as and when required.

Hard affinity allows you to alter this bit field. The scheduler then honours it and schedules your task on the processor of your choice. We will soon look at how you, as a user/programmer, can change the affinity. But let's first discuss the possible reasons that may lead you to design your application to override the natural (or soft) affinity of the process/thread.

Why overrule a natural affinity?
Well, the Linux scheduler does a fantastic job of scheduling. It tries to run a process on the same processor it ran the last time, assuming that some remnants of the process may be left (especially in the cache) and thus a better performance could be achieved. However, there are various other parameters that a scheduler considers while deciding which processor should run the process.

For instance, the preferred processor may be busy while the other processor(s) in the system are not. Under these circumstances, the scheduler will dispatch the process to an idle processor in order to maintain the load balance.

However, for whatever reason, a program might want to have control over the scheduling aspects for the application. Some of these reasons could be:

- A multi-processor system needs to keep the processors' caches valid. Data must be modified by only one processor, and all other processors that have cached the same data must invalidate their copy and fetch the most recent data again on a cache miss. This may come at a high cost in terms of performance. Now think of a situation when a process starts bouncing between different processors. This will constantly cause the cache to get invalidated. And the situation may even worsen if the threads of a process are scheduled on different processors and they are perpetually accessing and updating the same piece of data. This will lead to the frequent invalidation of the cached data. Here, hard affinity will rescue you from such performance degradation by letting you schedule your application on the processor(s) of your choice.

- In NUMA (Non-Uniform Memory Access) machines, processors have faster access to local memory than to memory shared between different processors. Therefore, forcing a process to the processor that has local access to the frequently used memory helps in boosting the performance.

- Sometimes real-time applications require a dedicated processor. With hard affinity you can ensure that a long-running and time-sensitive application runs on a specific processor.

Linux kernel 2.6 provides complete control to set and retrieve the CPU affinity of a process. However, a word of caution before we proceed: using hard affinity might cause the processors to have uneven loads.

CPU affinity: a user's perspective
To set or retrieve the CPU affinity of a running process from the shell prompt, you can use the taskset command, with which you can even launch a new task with a given affinity. Let's see how we can do that.

Let's suppose we have a running process, with PID (process ID) 21934. To bind this process to Processor #0 (the processor count starts from 0), let's issue the following command:

$ taskset -p 0x01 21934

Here, the -p flag indicates that taskset operates on an existing process. The hex value 0x01 specifies the new affinity mask of the process (i.e., CPU #0). Finally, the third parameter, 21934, is the PID of the task.

To test the above example, you first need to find the PID of an already running process using the ps command. In my case, the process I picked had a PID number of 21934. Here's the output after running the command:

pid 21934's current affinity mask: 2
pid 21934's new affinity mask: 1

From this output, we can conclude that for the process with PID 21934, the affinity mask is changed from 2 (CPU #1) to 1 (CPU #0). Now let's explore how the current affinity mask of a running process can be seen (in my case, I picked out a running process with PID 21934):

$ taskset -p 21934
pid 21934's current affinity mask: 1

You can also check the CPU affinity list (i.e., the list of all the processors that can run the process) of a process. This can be done with the -c flag. Let's have a look at it.

$ taskset -c -p 22139
pid 22139's current affinity list: 1

So far, we have seen how the affinity of an already-running process can be set/retrieved. You can also launch a new task with a given affinity. I will show you how you can do so.

$ taskset 0x01 ./a.out

The above command will launch a new task with Affinity Mask 1 and Affinity List 0.

From a program, the affinity is set and retrieved with two system calls:

int sched_setaffinity(pid_t pid, size_t cpusetsize, const cpu_set_t *mask);

int sched_getaffinity(pid_t pid, size_t cpusetsize, cpu_set_t *mask);

While sched_setaffinity() sets the CPU affinity, sched_getaffinity() retrieves it.

The man page reads, "sched_setaffinity() sets the CPU affinity mask of the process whose ID is pid to the value specified by mask. If pid is zero, then the calling process is used. The argument cpusetsize is the length (in bytes) of the data pointed to by mask. Normally, this argument would be specified as sizeof(cpu_set_t). If the process specified by pid is not currently running on one of the CPUs specified in mask, then that process is migrated to one of the CPUs specified in mask." On the other hand, "sched_getaffinity() writes the affinity mask of the process whose ID is pid into the cpu_set_t structure pointed to by mask. The cpusetsize argument specifies the size (in bytes) of mask. If pid is zero, then the mask of the calling process is returned."

On success, sched_setaffinity() and sched_getaffinity() return 0; on error, they return -1.

Let's write a simple program to change the CPU affinity of the calling process and then retrieve it.

#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1
#endif
#include <stdio.h>
#include <sched.h>

int main(void)
{
    cpu_set_t mymask;
    size_t len = sizeof(mymask);
    pid_t pid = 0; /* Current Process */

    CPU_ZERO(&mymask);

    /* The 1st processor in the system */
    CPU_SET(0, &mymask);

    /* The 2nd processor in the system */
    CPU_SET(1, &mymask);

    /* Set affinity mask of the process */
    if (sched_setaffinity(pid, len, &mymask) < 0) {
        perror("main: Error in sched_setaffinity() ");
        return 1;
    }

    CPU_ZERO(&mymask);

    /* Get affinity mask of the process */
    if (sched_getaffinity(pid, len, &mymask) < 0) {
        perror("main: Error in sched_getaffinity() ");
        return 1;
    }

    /* Print the CPUs present in the retrieved mask */
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &mymask))
            printf("CPU %d is in the affinity mask\n", cpu);

    return 0;
}

The above example binds the current process to the first two processors in the system, and fetches its affinity.

CPU affinity: leftover for developers
Finally, I would like to introduce you to thread affinity. Initially, it did not seem like it was within the scope of this article, but while researching CPU affinity, I could not find significant discussions on the Internet or elsewhere on affinity setting/retrieving at the thread level in Linux, apart from a synopsis of APIs in the Linux man page. So for my own purposes, I did some experiments with thread affinity and believe that it would be worth sharing my experience with you.

Note: I expect readers to be familiar with multi-threading concepts (with pthreads) in Linux.

By default, all threads within a process inherit the same affinity that the process has. We can override this process-level affinity with APIs: pthread_attr_setaffinity_np() and pthread_getattr_np(). Let's explore these APIs with our next example.

Note: The pthread functions with the _np suffix are non-standard extensions and are not portable.

#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1
#endif
#include <pthread.h>
#include <stdio.h>
#include <sched.h>

void * thread_aff(void *arg)
{
    cpu_set_t mymask;
    unsigned int len = sizeof(mymask);