Processes and Threads: Improve Application Responsiveness - Any Program in Which Many Activities Are Not Dependent
To understand threads, consider the two main characteristics of a process:

Unit of resource ownership -- A process is allocated:
- a virtual address space to hold the process image
- control of some resources (files, I/O devices...)

Unit of dispatching -- A process is an execution path through one or more programs:
- execution may be interleaved with other processes
- the process has an execution state and a dispatching priority

The unit of dispatching is usually referred to as a thread or a lightweight process. Thus a thread:
- has an execution state (running, ready, etc.)
- saves its thread context when not running
- has an execution stack and some per-thread static storage for local variables
- has access to the memory address space and resources of its process

All threads of a process share that address space and those resources:
- when one thread alters a (non-private) memory item, all other threads of the process see the change
- a file opened by one thread is available to the others
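The sharing described above can be sketched in a few lines of C. This is a minimal illustration, not a definitive implementation; the names shared_value, writer and run_shared_demo are made up for the example. The thread keeps one variable on its own stack and writes its result into a global that the creating thread reads after the join.

```c
#include <pthread.h>

/* Global data: one copy per process, visible to every thread in it. */
static int shared_value = 0;

static void *writer(void *arg)
{
    int local = 41;            /* lives on this thread's own stack */
    (void)arg;                 /* unused */
    shared_value = local + 1;  /* alters a shared (non-private) memory item */
    return NULL;
}

/* Starts one thread, waits for it, and returns the value the caller then sees.
 * Returns -1 if the thread could not be created or joined. */
int run_shared_demo(void)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, writer, NULL) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return shared_value;       /* the change made by the other thread */
}
```

Calling run_shared_demo() from main() returns the value written by the second thread. The join matters here: it is what guarantees that the write has happened before the read.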
Benefits of Threads vs Processes

If implemented correctly, threads have some advantages over multiple processes. It takes:
- less time to create a new thread than a process, because the newly created thread uses the current process address space
- less time to terminate a thread than a process
- less time to switch between two threads within the same process, partly because both threads use the same address space, so it does not have to be switched
- less communication overhead -- communicating between the threads of one process is simple because the threads share everything, the address space in particular. Data produced by one thread is immediately available to all the other threads.

Multithreading your code can have many benefits:

Improve application responsiveness -- Any program in which many activities are not dependent upon each other can be redesigned so that each activity is defined as a thread. For example, the user of a multithreaded GUI does not have to wait for one activity to complete before starting another; a spell checker in a word processing application can run while the user keeps typing.

Use multiprocessors more efficiently -- Typically, applications that express concurrency requirements with threads need not take into account the number of available processors. The performance of the application improves transparently with additional processors. Numerical algorithms and applications with a high degree of parallelism, such as matrix multiplications, can run much faster when implemented with threads on a multiprocessor.
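As a rough sketch of the matrix multiplication case, the rows of the result can be divided among threads, each thread writing only its own rows so that no locking is needed. The sizes N and NUM_THREADS, the helper fill_inputs and the function names are assumptions made for this example, and any speed-up depends on the machine.

```c
#include <pthread.h>

#define N 4            /* matrix dimension (kept tiny for illustration) */
#define NUM_THREADS 2  /* number of worker threads */

static double A[N][N], B[N][N], C[N][N];

struct row_range { int first; int last; };  /* rows [first, last) */

/* Each thread computes a disjoint block of rows of C = A * B. */
static void *multiply_rows(void *arg)
{
    const struct row_range *r = arg;

    for (int i = r->first; i < r->last; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }
    return NULL;
}

/* Example inputs: A[i][j] = i + j, B = identity, so C should equal A. */
void fill_inputs(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            A[i][j] = i + j;
            B[i][j] = (i == j) ? 1.0 : 0.0;
        }
}

/* Returns 0 on success, -1 if a thread could not be created or joined. */
int parallel_multiply(void)
{
    pthread_t tid[NUM_THREADS];
    struct row_range range[NUM_THREADS];
    int rows = N / NUM_THREADS;
    int started = 0, failed = 0;

    for (; started < NUM_THREADS; started++) {
        range[started].first = started * rows;
        range[started].last =
            (started == NUM_THREADS - 1) ? N : (started + 1) * rows;
        if (pthread_create(&tid[started], NULL, multiply_rows,
                           &range[started]) != 0) {
            failed = 1;
            break;
        }
    }
    /* Wait for every thread that was actually started. */
    for (int t = 0; t < started; t++)
        if (pthread_join(tid[t], NULL) != 0)
            failed = 1;
    return failed ? -1 : 0;
}
```

The code never asks how many processors exist; it only expresses the concurrency, which is the point made above about performance improving transparently.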
Improve program structure -- Many programs are more efficiently structured as multiple independent or semi-independent units of execution instead of as a single, monolithic thread. Multithreaded programs can be more adaptive to variations in user demands than single-threaded programs.

Use fewer system resources -- Programs that use two or more processes that access common data through shared memory are applying more than one thread of control. However, each process has a full address space and operating system state. The cost of creating and maintaining this large amount of state information makes each process much more expensive than a thread in both time and space. In addition, the inherent separation between processes can require a major effort by the programmer to communicate between the threads in different processes, or to synchronize their actions.

Application of Threads

Example: a file server on a LAN
- It needs to handle several file requests over a short period
- Hence it is more efficient to create (and destroy) a single thread for each request than a process
- Multiple threads can possibly be executing simultaneously on different processors

Thread Levels

There are two broad categories of thread implementation:
- User-Level Threads -- Thread Libraries.
- Kernel-Level Threads -- System Calls.

User-Level Threads (ULT)

At this level, the kernel is not aware of the existence of threads -- all thread management is done by the application by using a thread library.
Thread switching does not require kernel mode privileges (no mode switch) and scheduling is application specific.

Kernel activity for ULTs:
- The kernel is not aware of thread activity but it is still managing process activity
- When a thread makes a system call, the whole process will be blocked, but for the thread library that thread is still in the running state
- So thread states are independent of process states

Advantages and inconveniences of ULT

Advantages:
- Thread switching does not involve the kernel -- no mode switching
- Scheduling can be application specific -- choose the best algorithm
- ULTs can run on any OS -- only a thread library is needed

Disadvantages:
- Most system calls are blocking and the kernel blocks processes -- so all threads within the process will be blocked
- The kernel can only assign processes to processors -- two threads within the same process cannot run simultaneously on two processors

Thread libraries

A thread library contains code for:
- creating and destroying threads
- passing messages and data between threads
- scheduling thread execution
- saving and restoring thread contexts
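To make "saving and restoring thread contexts" concrete, here is a small sketch of a user-level switch built on the ucontext functions (getcontext, makecontext, swapcontext). It assumes a system such as Linux with glibc that still provides them; they were removed from POSIX.1-2008, and swapcontext may still enter the kernel to save the signal mask, so this only approximates what a real ULT library does. The names worker, trace and run_ult_demo are invented for the example.

```c
#include <ucontext.h>

static ucontext_t scheduler_ctx;  /* saved context of the "scheduler" */
static ucontext_t worker_ctx;     /* saved context of the user-level thread */
static int trace[8];              /* order in which the steps executed */
static int trace_len;

/* Body of the user-level thread: runs, yields once, then finishes. */
static void worker(void)
{
    trace[trace_len++] = 2;
    swapcontext(&worker_ctx, &scheduler_ctx);  /* save self, resume scheduler */
    trace[trace_len++] = 4;
    /* Returning here resumes the context named in uc_link. */
}

/* Runs the demo; returns the number of recorded steps, or -1 on error. */
int run_ult_demo(void)
{
    static char stack[64 * 1024];  /* execution stack for the worker */

    trace_len = 0;
    if (getcontext(&worker_ctx) == -1)
        return -1;
    worker_ctx.uc_stack.ss_sp = stack;
    worker_ctx.uc_stack.ss_size = sizeof stack;
    worker_ctx.uc_link = &scheduler_ctx;  /* where to go when worker returns */
    makecontext(&worker_ctx, worker, 0);

    trace[trace_len++] = 1;
    if (swapcontext(&scheduler_ctx, &worker_ctx) == -1)  /* dispatch worker */
        return -1;
    trace[trace_len++] = 3;
    if (swapcontext(&scheduler_ctx, &worker_ctx) == -1)  /* resume worker */
        return -1;
    trace[trace_len++] = 5;
    return trace_len;
}
```

The kernel sees a single flow of control throughout; the decision of which context runs next is made entirely by the application, which is why scheduling can be application specific.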
POSIX thread (pthread) libraries

A common reason to use the POSIX thread library is to make software run faster by doing work concurrently. Thread operations include thread creation, termination, synchronization (joins, blocking), scheduling, data management and process interaction.

Threads in the same process share:
- process instructions
- most data
- open files (descriptors)
- signals and signal handlers
- current working directory
- user and group id

Each thread has a unique:
- thread ID
- set of registers, stack pointer
- stack for local variables, return addresses
- signal mask
- priority
- return value: errno

pthread functions return "0" if OK.

A thread is created with:

    int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);

[C++ pitfalls]: The start routine must return void * and take a single void * argument. The function pointer representation below will work for C but not C++. Note the subtle differences and avoid the pitfall:

    void print_message_function( void *ptr );
    ...
    iret1 = pthread_create( &thread1, NULL, (void*)&print_message_function, (void*) message1);

Declaring the routine as void *print_message_function( void *ptr ) and passing it without a cast compiles with both the GNU C compiler and g++.

The threads library provides three synchronization mechanisms:
- mutexes -- mutual exclusion locks: block access to variables by other threads. This enforces exclusive access by a thread to a variable or set of variables.
- joins -- make a thread wait until others are complete (terminated).
- condition variables -- data type pthread_cond_t.
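Thread creation and the join mechanism can be sketched together as follows. The struct task type, MAX_TASKS and the function names are assumptions made for illustration; the start routine has the void *(*)(void *) shape that pthread_create() expects, so no cast is needed in C or C++.

```c
#include <pthread.h>

#define MAX_TASKS 8   /* arbitrary upper bound for this sketch */

struct task {
    int input;    /* set by the caller */
    int output;   /* filled in by the thread */
};

/* Start routine: takes a void * argument and returns a void * result. */
static void *square_task(void *arg)
{
    struct task *t = arg;

    t->output = t->input * t->input;
    return t;     /* handed back to whoever joins this thread */
}

/* Launches one thread per task, then joins them all.
 * Returns 0 on success, -1 on any failure. */
int run_tasks(struct task tasks[], int count)
{
    pthread_t tid[MAX_TASKS];
    int started = 0, failed = 0;

    if (count < 0 || count > MAX_TASKS)
        return -1;
    for (; started < count; started++)
        if (pthread_create(&tid[started], NULL, square_task,
                           &tasks[started]) != 0) {
            failed = 1;
            break;
        }
    /* A join blocks the caller until the named thread has terminated. */
    for (int i = 0; i < started; i++) {
        void *ret;
        if (pthread_join(tid[i], &ret) != 0 || ret != &tasks[i])
            failed = 1;
    }
    return failed ? -1 : 0;
}
```

A caller fills an array of struct task, calls run_tasks(), and reads each output field afterwards. Every pthread call is checked against 0, following the convention that pthread functions return 0 if OK.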
Mutexes:

Mutexes are used to prevent data inconsistencies due to operations by multiple threads upon the same memory area performed at the same time, or to prevent race conditions where an order of operation upon the memory is expected. A contention or race condition often occurs when two or more threads need to perform operations on the same memory area, but the results of computations depend on the order in which these operations are performed. Mutexes are used for serializing access to shared resources such as memory. Any time a global resource is accessed by more than one thread, the resource should have a mutex associated with it. One can apply a mutex to protect a segment of memory ("critical region") from other threads. By default, mutexes can be applied only to threads in a single process and do not work between processes as semaphores do.

    pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
    int counter = 0;

    /* Function C */
    void functionC()
    {
        pthread_mutex_lock( &mutex1 );
        counter++;
        pthread_mutex_unlock( &mutex1 );
    }

When a mutex lock is attempted against a mutex which is held by another thread, the calling thread is blocked until the mutex is unlocked. When a thread terminates, a mutex it holds is not released unless it was explicitly unlocked. Nothing happens by default.

Joins:

A join is performed when one wants to wait for a thread to finish. A calling routine may launch multiple threads and then wait for them to finish to get the results. One waits for the completion of the threads with a join.
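The mutex and join mechanisms can be combined in a short sketch. The constants NUM_THREADS and INCREMENTS and the function names are assumptions made for the example; without the lock, the increments from different threads could interleave and some updates would be lost.

```c
#include <pthread.h>

#define NUM_THREADS 4
#define INCREMENTS  100000

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;   /* shared by all threads, protected by counter_lock */

static void *add_many(void *arg)
{
    (void)arg;  /* unused */
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);    /* enter the critical region */
        counter++;
        pthread_mutex_unlock(&counter_lock);  /* leave the critical region */
    }
    return NULL;
}

/* Runs NUM_THREADS incrementing threads and returns the final count,
 * or -1 if a thread could not be created or joined. */
long run_counter_demo(void)
{
    pthread_t tid[NUM_THREADS];
    int started = 0, failed = 0;

    counter = 0;
    for (; started < NUM_THREADS; started++)
        if (pthread_create(&tid[started], NULL, add_many, NULL) != 0) {
            failed = 1;
            break;
        }
    for (int t = 0; t < started; t++)   /* wait for every started thread */
        if (pthread_join(tid[t], NULL) != 0)
            failed = 1;
    return failed ? -1 : counter;
}
```

Because every increment happens inside the critical region, the total is the same on every run, regardless of how the threads are scheduled.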
Create a Key for Thread-Specific Data

Single-threaded C programs have two basic classes of data: local data and global data. For multithreaded C programs a third class is added: thread-specific data (TSD). This is very much like global data, except that it is private to a thread. Thread-specific data is maintained on a per-thread basis. In the pthread API, TSD is the only way to define and refer to data that is private to a thread. Each thread-specific data item is associated with a key that is global to all threads in the process. Using the key, a thread can access a pointer (void *) that is maintained per-thread.

The function pthread_key_create() is used to allocate a key that identifies thread-specific data in a process. The key is global to all threads in the process, and all threads initially have the value NULL associated with the key when it is created. pthread_key_create() is called once for each key before the key is used. There is no implicit synchronization. Once a key has been created, each thread can bind a value to the key. The values are specific to the thread and are maintained for each thread independently. The per-thread binding is deallocated when a thread terminates if the key was created with a destructor function. pthread_key_create() is prototyped by:
int pthread_key_create(pthread_key_t *key, void (*destructor) (void *));
When pthread_key_create() returns successfully, the allocated key is stored in the location pointed to by key. The caller must ensure that the storage and access to this key are properly synchronized. An optional destructor function, destructor, can be used to free stale storage. When a key has a non-NULL destructor function and the thread has a non-NULL value associated with that key, the destructor function is called with the current associated value when the thread exits. The order in which the destructor functions are called is unspecified.

pthread_key_create() returns zero after completing successfully. Any other returned value indicates that an error occurred. When any of the following conditions occur, pthread_key_create() fails and returns an error value.
Once a key has been deleted, any reference to it with the pthread_setspecific() or pthread_getspecific() call results in the EINVAL error. It is the responsibility of the programmer to free any thread-specific resources before calling the delete function. This function does not invoke any of the destructors.

pthread_key_delete() returns zero after completing successfully. Any other returned value indicates that an error occurred. When the following condition occurs, pthread_key_delete() fails and returns the corresponding value.
pthread_setspecific() returns zero after completing successfully. Any other returned value indicates that an error occurred. When any of the following conditions occur, pthread_setspecific() fails and returns an error value. Note: pthread_setspecific() does not free its storage. If a new binding is set, the existing binding must be freed; otherwise, a memory leak can occur.
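The key operations described so far can be sketched as one small program. The names slot_key, rebind and run_tsd_demo are hypothetical; the sketch shows two threads binding different values to the same global key, and it frees the old binding before installing a new one, as the note above requires.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t slot_key;   /* one key, global to all threads */

/* Binds a freshly allocated int to the key for the calling thread and
 * frees the previous binding, since pthread_setspecific() will not. */
static int rebind(int value)
{
    int *old = pthread_getspecific(slot_key);  /* NULL before first bind */
    int *fresh = malloc(sizeof *fresh);
    int rc;

    if (fresh == NULL)
        return -1;
    *fresh = value;
    rc = pthread_setspecific(slot_key, fresh);
    if (rc != 0) {
        free(fresh);
        return rc;
    }
    free(old);                                 /* free(NULL) is a no-op */
    return 0;
}

static void *worker(void *arg)
{
    int *id = arg;
    int *mine;

    rebind(*id);        /* first binding */
    rebind(*id * 10);   /* second binding replaces (and frees) the first */
    mine = pthread_getspecific(slot_key);
    *id = mine ? *mine : -1;   /* report what this thread sees */
    return NULL;               /* destructor (free) runs on the last binding */
}

/* On success, out[0] and out[1] hold the values each thread read back. */
int run_tsd_demo(int out[2])
{
    pthread_t tid[2];

    if (pthread_key_create(&slot_key, free) != 0)
        return -1;
    out[0] = 1;
    out[1] = 2;
    if (pthread_create(&tid[0], NULL, worker, &out[0]) != 0)
        return -1;
    if (pthread_create(&tid[1], NULL, worker, &out[1]) != 0) {
        pthread_join(tid[0], NULL);
        return -1;
    }
    pthread_join(tid[0], NULL);
    pthread_join(tid[1], NULL);
    return pthread_key_delete(slot_key);  /* no bindings remain by now */
}
```

Each thread reads back only the value it bound itself, even though both use the same key. Passing free as the destructor releases the final binding when each thread exits.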
    body() {
        ...
        while (write(fd, buffer, size) == -1) {
            if (errno != EINTR) {
                fprintf(mywindow, "%s\n", strerror(errno));
                exit(1);
            }
        }
        ...
    }
This code may be executed by any number of threads, but it has references to two global variables, errno and mywindow, that really should be references to items private to each thread.

References to errno should get the system error code from the routine called by this thread, not by some other thread. So, references to errno by one thread refer to a different storage location than references to errno by other threads.

The mywindow variable is intended to refer to a stdio stream connected to a window that is private to the referring thread. So, as with errno, references to mywindow by one thread should refer to a different storage location (and, ultimately, a different window) than references to mywindow by other threads. The only difference here is that the threads library takes care of errno, but the programmer must somehow make this work for mywindow.

The next example shows how the references to mywindow work. The preprocessor converts references to mywindow into invocations of the _mywindow() procedure. This routine in turn invokes pthread_getspecific(), passing it the mywin_key global variable (it really is a global variable). The value returned is the pointer this thread bound to the key, which leads to this thread's window.

Turning Global References Into Private References

Now consider this code fragment:

    pthread_key_t mywin_key;

    FILE *_mywindow(void) {
        /* The value bound to the key is the address of this thread's FILE * slot. */
        FILE **win = pthread_getspecific(mywin_key);
        return(*win);
    }
    #define mywindow _mywindow()

    void routine_uses_win( FILE *win) {
        ...
    }

    void thread_start(...) {
        ...
        make_mywin();
        ...
        routine_uses_win( mywindow );
        ...
    }
The mywin_key variable identifies a class of variables for which each thread has its own private copy; that is, these variables are thread-specific data. Each thread calls make_mywin to initialize its window and to arrange for its instance of mywindow to refer to it. Once this routine is called, the thread can safely refer to mywindow; each reference calls _mywindow(), which returns the reference to the thread's private window. So, references to mywindow behave as if they were direct references to data private to the thread.

We can now set up our initial Thread-Specific Data:
    void mykeycreate(void);
    void free_key(void *win);

    void make_mywin(void) {
        FILE **win;
        static pthread_once_t mykeycreated = PTHREAD_ONCE_INIT;

        /* Create the key exactly once, whichever thread gets here first. */
        pthread_once(&mykeycreated, mykeycreate);
        win = malloc(sizeof(*win));
        create_window(win, ...);
        pthread_setspecific(mywin_key, win);
    }

    void mykeycreate(void) {
        pthread_key_create(&mywin_key, free_key);
    }

    /* Destructor: releases a terminating thread's storage. */
    void free_key(void *win) {
        free(win);
    }
First, get a unique value for the key, mywin_key. This key is used to identify the thread-specific class of data. So, the first thread to call make_mywin eventually calls pthread_key_create(), which assigns to its first argument a unique key. The second argument is a destructor function that is used to deallocate a thread's instance of this thread-specific data item once the thread terminates.

The next step is to allocate the storage for the caller's instance of this thread-specific data item. Having allocated the storage, a call is made to the create_window routine, which sets up a window for the thread and sets the storage pointed to by win to refer to it. Finally, a call is made to pthread_setspecific(), which associates the value contained in win (that is, the location of the storage containing the reference to the window) with the key.

After this, whenever this thread calls pthread_getspecific(), passing the global key, it gets the value that was associated with this key by this thread when it called pthread_setspecific(). When a thread terminates, calls are made to the destructor functions that were set up in pthread_key_create(). Each destructor function is called only if the terminating thread established a value for the key by calling pthread_setspecific().
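The whole sequence (one-time key creation, per-thread binding, lookup, and destructor calls at thread exit) can be put together in a self-contained sketch. It is not the code from the example above: create_window is replaced by a stand-in that opens /dev/null as the "window", and the names win_key, make_window, my_window, windows_freed and run_window_demo are assumptions made for illustration.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_THREADS 8

static pthread_key_t win_key;
static pthread_once_t win_key_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;
static int windows_freed;   /* how many times the destructor has run */

/* Destructor: called at thread exit with the thread's non-NULL binding. */
static void free_window(void *slot)
{
    FILE **win = slot;

    fclose(*win);
    free(win);
    pthread_mutex_lock(&stats_lock);
    windows_freed++;
    pthread_mutex_unlock(&stats_lock);
}

static void create_key(void)
{
    pthread_key_create(&win_key, free_window);
}

/* Gives the calling thread its own "window" (here: a stream on /dev/null). */
static int make_window(void)
{
    FILE **win;

    pthread_once(&win_key_once, create_key);  /* key is created only once */
    win = malloc(sizeof *win);
    if (win == NULL)
        return -1;
    *win = fopen("/dev/null", "w");           /* stand-in for a real window */
    if (*win == NULL || pthread_setspecific(win_key, win) != 0) {
        if (*win != NULL)
            fclose(*win);
        free(win);
        return -1;
    }
    return 0;
}

/* Returns the calling thread's private window, or NULL if it has none. */
static FILE *my_window(void)
{
    FILE **win = pthread_getspecific(win_key);

    return win ? *win : NULL;
}

static void *thread_start(void *arg)
{
    if (make_window() == 0)
        fprintf(my_window(), "hello from thread %d\n", *(int *)arg);
    return NULL;   /* free_window() runs now for this thread's binding */
}

/* Runs nthreads threads; returns how many destructor calls were made. */
int run_window_demo(int nthreads)
{
    pthread_t tid[MAX_THREADS];
    int id[MAX_THREADS];
    int started = 0;

    if (nthreads < 0 || nthreads > MAX_THREADS)
        return -1;
    windows_freed = 0;
    for (; started < nthreads; started++) {
        id[started] = started;
        if (pthread_create(&tid[started], NULL, thread_start,
                           &id[started]) != 0)
            break;
    }
    for (int t = 0; t < started; t++)
        pthread_join(tid[t], NULL);
    return windows_freed;
}
```

After all threads have been joined, the count of destructor calls matches the number of threads that bound a value, which reflects the rule that a destructor runs only for threads that called pthread_setspecific().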
What is a deadlock? How can you detect it? How can you avoid it?