Chapter 4: Threads
In this chapter our focus is on:
Overview
Multithreading Models
Thread Libraries
Threading Issues
Objectives
To introduce the notion of a thread — a fundamental unit
of CPU utilization that forms the basis of multithreaded
computer systems
To discuss the APIs for the Pthreads, Win32, and Java
thread libraries
To examine issues related to multithreaded programming
Thread Overview
Threads are mechanisms that permit an application to
perform multiple tasks concurrently.
A thread is a basic unit of CPU utilization; it comprises a
thread ID, a program counter, a register set, and a stack.
A traditional, or heavyweight, process has a single thread of
control.
A thread shares its code section, data section, and open files
with the other threads belonging to the same process.
Example: a busy web server runs one thread that listens for
client requests; when a request arrives, the server creates
another thread to service it.
Single and Multithreaded Processes
[Figure: a single-threaded (heavyweight) process and a multithreaded (lightweight) process]
Benefits
Benefits of multithreaded programming fall into four major categories:
Responsiveness: Multithreading an interactive application may allow a
program to continue running even if part of it is blocked or is performing a
lengthy operation, thereby increasing responsiveness to the user.
Resource Sharing: By default, threads share the memory and the
resources of the process to which they belong. Several different threads
can run within the same address space.
Economy: Allocating memory and resources for process creation is costly.
Because threads share the resources of their process, they are much
cheaper to create and to context-switch.
Utilization of Multiprocessor Architectures: The benefit of multithreading
can be greatly increased in a multiprocessor architecture, where threads
may run in parallel on different processors.
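The resource-sharing point can be illustrated with a minimal Pthreads sketch. The names `shared_counter`, `add_to_counter`, `run_shared_counter_demo`, and the two constants are hypothetical, chosen only for this illustration. Every thread updates the same global variable because all threads of a process live in one address space; the mutex is needed precisely because the data is shared.

```c
#include <pthread.h>

#define NUM_THREADS 4        /* hypothetical values, for illustration only */
#define INCREMENTS  100000

static long shared_counter = 0;   /* one copy, visible to every thread */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread adds INCREMENTS to the same global counter. */
static void *add_to_counter(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);    /* shared data needs protection */
        shared_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Returns the final counter value, or -1 if a thread could not be created. */
long run_shared_counter_demo(void)
{
    pthread_t tids[NUM_THREADS];
    int created = 0;

    shared_counter = 0;
    for (; created < NUM_THREADS; created++)
        if (pthread_create(&tids[created], NULL, add_to_counter, NULL) != 0)
            break;
    for (int i = 0; i < created; i++)         /* wait for the threads we started */
        pthread_join(tids[i], NULL);

    return created == NUM_THREADS ? shared_counter : -1;
}
```

Called from a program's main and built with a flag such as `cc -pthread`, `run_shared_counter_demo()` should return NUM_THREADS * INCREMENTS, since no update is lost while the lock is held.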
Multithreading Models
Support for threads may be provided at either of two levels:
User level -> user threads - Supported above the kernel
and managed without kernel support
Kernel level -> kernel threads - Supported and managed
directly by the operating system
Virtually all contemporary operating systems, including Windows,
Linux, Mac OS, and Solaris, support kernel threads.
Ultimately, there must exist a relationship between user
threads and kernel threads. There are three common ways
of establishing this relationship.
Multithreading Models
User Thread to Kernel Thread Mapping
Multithreading allows multiple parts of a program to
execute at the same time. These parts are known as
threads; they are lightweight units of execution within
a process. Multithreading therefore improves CPU
utilization, since another thread can run while one is blocked.
There are two types of threads: user threads and kernel threads.
Three models map user threads to kernel threads:
Many-to-One
One-to-One
Many-to-Many
Many-to-One
Many user-level threads are mapped to a single kernel thread.
Only one thread can access the kernel at a time, so multiple
threads are unable to run in parallel on multiprocessors.
One-to-One
Each user-level thread maps to its own kernel thread.
It provides more concurrency than the many-to-one model.
Drawback: creating a user thread requires creating the
corresponding kernel thread.
Many-to-Many Model
Allows many user-level threads to be mapped to a smaller or
equal number of kernel threads.
Allows the operating system to create a sufficient number of
kernel threads.
Example: Windows NT/2000 with the ThreadFiber package
Thread Libraries
A thread library provides the programmer with an API for creating
and managing threads.
Two primary ways of implementing a thread library:
Library entirely in user space with no kernel support
Kernel-level library supported by the OS
Three main thread libraries:
POSIX Pthreads
Win32
Java
Thread Libraries
Three main thread libraries in use today:
POSIX Pthreads
May be provided either as user-level or kernel-level
A POSIX standard (IEEE 1003.1c) API for thread creation and
synchronization
The API specifies the behavior of the thread library; the implementation
is up to the developers of the library
Win32
Kernel-level library on Windows systems
Java
Java threads are managed by the JVM
Typically implemented using the thread model provided by the
underlying OS
POSIX: Thread Creation
#include <pthread.h>
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg);
This function starts a new thread in the calling process.
thread points to a pthread_t that receives the ID of the new thread.
attr points to a structure whose contents are used at
thread creation time to determine attributes for the new thread;
if attr is NULL, default attributes are used.
The new thread starts execution by invoking start_routine.
arg is passed as the sole argument of start_routine.
Returns: 0 on success, an error number on failure.
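A minimal sketch of thread creation, assuming default attributes (attr is NULL). The helper names `square` and `square_in_thread` are hypothetical. It shows how arg reaches start_routine and how the routine's return value comes back through pthread_join.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Start routine: receives the arg that was given to pthread_create. */
static void *square(void *arg)
{
    int *value = arg;
    *value = (*value) * (*value);   /* write the result back through the pointer */
    return arg;                     /* becomes the value reported by pthread_join */
}

/* Computes n * n in a new thread. Returns the result, or -1 on error. */
int square_in_thread(int n)
{
    pthread_t tid;
    int value = n;                  /* stays valid until the join below */
    void *result = NULL;

    int rc = pthread_create(&tid, NULL, square, &value);
    if (rc != 0) {                  /* rc is the error number itself */
        fprintf(stderr, "pthread_create: %s\n", strerror(rc));
        return -1;
    }
    rc = pthread_join(tid, &result);
    if (rc != 0) {
        fprintf(stderr, "pthread_join: %s\n", strerror(rc));
        return -1;
    }
    return *(int *)result;
}
```

Passing the address of a local variable is safe here only because the creating function joins the thread before returning; a thread that outlives its creator's stack frame would need heap-allocated or static storage for arg.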
POSIX: Thread ID
#include <pthread.h>
pthread_t pthread_self(void);
A thread can get its own thread ID by calling this function.
Returns: the ID of the calling thread
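A short sketch of how a thread ID obtained this way might be used. Because pthread_t is an opaque type, IDs are compared with pthread_equal rather than with ==. The function names `differs_from_creator` and `self_id_demo` are hypothetical.

```c
#include <pthread.h>
#include <stdint.h>

/* Runs in the new thread: checks that its own ID differs from the creator's. */
static void *differs_from_creator(void *arg)
{
    const pthread_t *creator = arg;
    int differs = !pthread_equal(pthread_self(), *creator);
    return (void *)(intptr_t)differs;
}

/* Returns 1 if the IDs behave as expected, 0 if not, -1 on error. */
int self_id_demo(void)
{
    pthread_t creator = pthread_self();   /* ID of the calling thread */
    pthread_t worker;
    void *result = NULL;

    if (pthread_create(&worker, NULL, differs_from_creator, &creator) != 0)
        return -1;
    if (pthread_join(worker, &result) != 0)
        return -1;

    /* The caller still sees its own ID; the worker saw a different one. */
    int same_in_caller = pthread_equal(creator, pthread_self()) != 0;
    return same_in_caller && (int)(intptr_t)result;
}
```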
Threading Issues
In this section, we discuss issues to consider with
multithreaded programs. They are:
The fork() and exec() system calls
Thread cancellation.
Signal handling
Thread pools
Thread-specific data
Scheduler Activations
The fork() and exec() system calls
The fork() system call is used to create a separate, duplicate process.
With exec(), the calling program is replaced in memory. All threads except
the one calling exec() vanish immediately. No thread-specific data
destructors or cleanup handlers are executed.
Question: Does fork() duplicate only the calling thread or all threads?
That is, if one thread forks, is the entire process copied, or is the new
process single-threaded?
Answer 1: System dependent.
Answer 2: If the new process calls exec() right away, there is no need to copy
all the other threads. If it doesn't, then the entire process should be copied.
Answer 3: Many versions of UNIX provide multiple versions of the fork call for
this purpose.
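The fork-then-exec case from Answer 2 can be sketched as follows. On POSIX-conforming systems, fork() called from a multithreaded process creates a child containing only a replica of the calling thread, which is all that is needed when the child calls exec() immediately. The helper name `fork_and_exec` is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child that immediately replaces itself with the program named
 * by argv[0]. Returns the child's exit status, or -1 on error.
 */
int fork_and_exec(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                  /* fork failed */

    if (pid == 0) {                 /* child: only the calling thread exists here */
        execvp(argv[0], argv);      /* on success this never returns */
        _exit(127);                 /* reached only if exec failed */
    }

    int status;                     /* parent: wait for the child to finish */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling it with the argument vector `{ "sh", "-c", "exit 3", NULL }` should return 3 on a system where `sh` is on the PATH. The child uses `_exit` rather than `exit` so that it does not flush stdio buffers inherited from the parent.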
Thread Cancellation
Thread cancellation is the task of terminating a thread before it has
completed. The thread that is to be cancelled is referred to as the target
thread.
Threads that are no longer needed may be cancelled by another
thread in one of two ways:
Asynchronous cancellation: One thread terminates the target thread
immediately.
Deferred cancellation: The target thread periodically checks
whether it should terminate, allowing it an opportunity to
terminate itself in an orderly fashion.
The issue: cancellation is difficult when resources have been allocated
to a cancelled thread, or when a thread is cancelled while in the
midst of updating data it is sharing with other threads.
The OS will often reclaim system resources from a cancelled thread, but
it will not necessarily reclaim all of them.
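A minimal sketch of deferred cancellation with Pthreads, where deferred is the default cancellation type. The target thread reaches a cancellation point through pthread_testcancel, and a cleanup handler gives it the chance to release resources in an orderly way. The names `release_resources`, `worker`, `cancel_demo`, and `cleanup_ran` are hypothetical.

```c
#include <pthread.h>

static int cleanup_ran = 0;   /* set by the cleanup handler, read after the join */

/* Cleanup handler: runs when the thread is cancelled at a cancellation point. */
static void release_resources(void *arg)
{
    (void)arg;
    cleanup_ran = 1;          /* a real handler would free memory, unlock mutexes, ... */
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_cleanup_push(release_resources, NULL);
    for (;;) {
        /* ... one unit of work would go here ... */
        pthread_testcancel();  /* explicit cancellation point: safe place to stop */
    }
    pthread_cleanup_pop(0);    /* never reached, but must pair with the push */
    return NULL;
}

/* Returns 1 if the worker was cancelled and its handler ran, 0 if not, -1 on error. */
int cancel_demo(void)
{
    pthread_t tid;
    void *result = NULL;

    cleanup_ran = 0;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)          /* request cancellation of the target thread */
        return -1;
    if (pthread_join(tid, &result) != 0)   /* wait until it has actually terminated */
        return -1;

    return result == PTHREAD_CANCELED && cleanup_ran;
}
```

pthread_cancel only posts a request; the target thread acts on it at its next cancellation point, which is what makes the termination orderly.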
Signal Handling
Signals are used in UNIX systems to notify a process that a particular
event has occurred.
A signal handler is used to process signals. All signals follow the same pattern:
1. Signal is generated by particular event
2. Signal is delivered to a process
3. Signal is handled
The issue: when a multithreaded process receives a signal, to
which thread should that signal be delivered?
Options:
Deliver the signal to the thread to which the signal applies
Deliver the signal to every thread in the process
Deliver the signal to certain threads in the process
Assign a specific thread to receive all signals for the process
Thread Pools
Example: When a web server receives a request, it creates a
separate thread to service the request. This raises two issues.
The first issue is the time required to create the thread before
servicing the request, together with the fact that the thread is
discarded once it has completed its work.
The second issue is more troublesome: if every concurrent
request is serviced in a new thread, there is no bound on the
number of threads. Unlimited threads could exhaust
system resources.
Solution: a thread pool. Create a number of threads at startup and
place them in a pool, where they sit and wait for work.
Advantages:
Servicing a request with an existing thread is usually slightly faster
than creating a new thread
The number of threads in the application(s) is bounded by
the size of the pool
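One possible shape for such a pool is sketched below, assuming a fixed number of workers and a bounded circular task queue protected by a mutex and two condition variables. All names and sizes (`pool_t`, `pool_submit`, `POOL_SIZE`, and so on) are hypothetical, and error handling is kept minimal; this is an illustration of the idea, not a production design.

```c
#include <pthread.h>
#include <stdbool.h>

#define POOL_SIZE 4      /* hypothetical sizes, for illustration only */
#define QUEUE_CAP 16

typedef struct { void (*fn)(void *); void *arg; } task_t;

typedef struct {
    pthread_t       workers[POOL_SIZE];
    task_t          queue[QUEUE_CAP];     /* circular buffer of pending tasks */
    int             head, count;
    bool            shutting_down;
    pthread_mutex_t lock;
    pthread_cond_t  not_empty, not_full;
} pool_t;

/* Each worker sits in the pool and waits for work. */
static void *worker_loop(void *arg)
{
    pool_t *p = arg;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->count == 0 && !p->shutting_down)
            pthread_cond_wait(&p->not_empty, &p->lock);
        if (p->count == 0) {              /* shutting down and queue drained */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        task_t t = p->queue[p->head];
        p->head = (p->head + 1) % QUEUE_CAP;
        p->count--;
        pthread_cond_signal(&p->not_full);
        pthread_mutex_unlock(&p->lock);
        t.fn(t.arg);                      /* run the task outside the lock */
    }
}

int pool_init(pool_t *p)
{
    p->head = p->count = 0;
    p->shutting_down = false;
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->not_empty, NULL);
    pthread_cond_init(&p->not_full, NULL);
    for (int i = 0; i < POOL_SIZE; i++)   /* threads are created once, up front */
        if (pthread_create(&p->workers[i], NULL, worker_loop, p) != 0)
            return -1;                    /* sketch: no cleanup of earlier workers */
    return 0;
}

/* Queues a task; blocks while the queue is full, which bounds pending work. */
void pool_submit(pool_t *p, void (*fn)(void *), void *arg)
{
    pthread_mutex_lock(&p->lock);
    while (p->count == QUEUE_CAP)
        pthread_cond_wait(&p->not_full, &p->lock);
    p->queue[(p->head + p->count) % QUEUE_CAP] = (task_t){ fn, arg };
    p->count++;
    pthread_cond_signal(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
}

/* Lets the workers finish the queued tasks, then joins them. */
void pool_shutdown(pool_t *p)
{
    pthread_mutex_lock(&p->lock);
    p->shutting_down = true;
    pthread_cond_broadcast(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
    for (int i = 0; i < POOL_SIZE; i++)
        pthread_join(p->workers[i], NULL);
    pthread_mutex_destroy(&p->lock);
    pthread_cond_destroy(&p->not_empty);
    pthread_cond_destroy(&p->not_full);
}

/* Demo: count how many "requests" the pool handled. */
static int handled = 0;
static pthread_mutex_t handled_lock = PTHREAD_MUTEX_INITIALIZER;

static void handle_request(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&handled_lock);
    handled++;
    pthread_mutex_unlock(&handled_lock);
}

int pool_demo(int n_requests)
{
    pool_t pool;
    handled = 0;
    if (pool_init(&pool) != 0)
        return -1;
    for (int i = 0; i < n_requests; i++)
        pool_submit(&pool, handle_request, NULL);
    pool_shutdown(&pool);
    return handled;
}
```

However many requests are submitted, at most POOL_SIZE threads ever exist, and no thread is created or destroyed per request; both advantages listed above follow from that.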
Thread Specific Data
Most data is shared among threads, and this is one
of the major benefits of using threads in the first
place.
However, sometimes threads also need their own
thread-specific data.
Most major thread libraries (Pthreads, Win32,
Java) provide support for thread-specific data,
known as thread-local storage or TLS. Note that
this is more like static data than local variables,
because it does not cease to exist when the function
returns.
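A sketch of thread-specific data using the Pthreads key API (pthread_key_create, pthread_setspecific, pthread_getspecific). Each thread gets its own counter that persists across calls to the same function, like static data, yet is never seen by other threads. The names `counter_key`, `bump_private_counter`, and `tls_demo` are hypothetical.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

static pthread_key_t  counter_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

/* Creates the key once; free() is the destructor run at thread exit. */
static void make_key(void)
{
    pthread_key_create(&counter_key, free);
}

/* Increments the calling thread's private counter and returns its new value. */
static int bump_private_counter(void)
{
    pthread_once(&key_once, make_key);
    int *counter = pthread_getspecific(counter_key);
    if (counter == NULL) {                    /* first call in this thread */
        counter = calloc(1, sizeof *counter);
        if (counter == NULL)
            return -1;
        pthread_setspecific(counter_key, counter);
    }
    return ++*counter;                        /* value survives between calls */
}

static void *bump_n_times(void *arg)
{
    int n = (int)(intptr_t)arg;
    int last = 0;
    for (int i = 0; i < n; i++)
        last = bump_private_counter();
    return (void *)(intptr_t)last;
}

/* Runs two threads; returns 1 if each saw only its own count, -1 on error. */
int tls_demo(int a, int b)
{
    pthread_t t1, t2;
    void *r1 = NULL, *r2 = NULL;

    if (pthread_create(&t1, NULL, bump_n_times, (void *)(intptr_t)a) != 0)
        return -1;
    if (pthread_create(&t2, NULL, bump_n_times, (void *)(intptr_t)b) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, &r1);
    pthread_join(t2, &r2);
    return (int)(intptr_t)r1 == a && (int)(intptr_t)r2 == b;
}
```

If the counter were an ordinary global, the two threads would see each other's increments; with a key, each thread's final count equals only the number of calls it made itself.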
Threads vs. Processes
Advantages of multithreading
Sharing data between threads is easy
Thread creation is faster than process creation
Disadvantages of multithreading
Thread safety must be ensured
A bug in one thread can damage the other threads, since they share the
same address space
Threads compete for the finite virtual address space of their process
Considerations
Dealing with signals in threads is tricky
All threads must run the same program
Threads also share other process attributes, such as open files and user IDs