14 Concurrency Threads


CS 550 Operating Systems

Spring 2019

“OSTEP Piece #2” – Concurrency


Threads

If you want to do one task
• Start one process

If you want to do two tasks “concurrently”
• Start two processes
  • Maybe P1 forks P2
  • and P3…PN etc. if more than two tasks
• Problem:
  • fork is expensive
  • cold-start penalty
If P1 and P2 want to talk to each other?
• E.g. access the same data or synchronize?
• Two different address spaces
• Need to use IPC
  • shared memory, pipes, sockets, signals
• Problem
  • kernel transitions are expensive
  • May need to copy data
    • user → kernel → user
  • Inter-process shared memory is a pain to set up.
Option 1: Event-driven programming
• Make one process do all the tasks
• Busy loop polls for events and executes tasks for each event
• No IPC needed
• Length of the busy loop determines response latency
• Stateful event responses complicate the code
  • What if the ith occurrence of event 1 affects the jth event processing?

    while(1)
    {
        if (event 1) do task 1;
        if (event 2) do task 2;

        if (event N) do task N;
    }
Option 2: Use threads
• Multiple threads of execution per process
• Each thread has its own
  • Program counter
  • Stack, stack pointer
  • Registers
• All threads share
  • one virtual address space
  • code, heap and static data
• Lower context switching overhead
• No IPC
  • Zero data transfer cost
  • Only need inter-thread synchronization

[Figure: process P1 with threads T1, T2, T3, T4 in one shared address space]
Other Shared and non-shared components
• Shared components
• Open descriptors (files, sockets etc)
• Signals and Signal handlers

• Not shared
• Thread ID
• errno
• Priority
Address space layout
Example: A word processor with three threads

• First thread handles keyboard input


• Second thread handles screen display
• Third thread handles saving the document to disk
Example: a multi-threaded web server

• A dispatcher thread waits for and accepts network connections


• Several worker threads
• Each worker processes one network connection concurrently
Advantages of threads
• Lower inter-thread context switching overhead than processes

• No Inter-process communication
• Zero data transfer cost between threads
• Only need inter-thread synchronization

• Threads can be pre-empted at any point


• Long-running threads are OK
• As opposed to event-driven tasks that must be short.

• Threads can exploit parallelism


• But it depends…more later

• Threads could block without blocking other threads


• But it depends…more later
Disadvantages of Threads
• Shared State!
• Global variables are shared between threads.
• Accidental data changes can cause errors.

• Threads and signals don’t mix well


• Common signal handler for all threads in a process
• Which thread to signal? Everybody!
• Royal pain to program correctly.

• Lack of robustness
• Crash in one thread will crash the entire process.

• Some library functions may not be thread-safe


• Library Functions that return pointers to static internal memory. E.g. gethostbyname()
• Less of a problem these days.
Two types of threads: user-level and kernel-level

User-level threads
• User-level libraries provide multiple threads
• OS kernel does not recognize user-level threads
• Threads execute when the process is scheduled

Kernel-level threads
• OS kernel provides multiple threads per process
• Each thread is scheduled independently by the kernel’s CPU scheduler
Hybrid Implementations

Multiplexing user-level threads within each kernel-level thread


Local Thread Scheduling
• Next thread is picked from among the threads belonging to the current process
  • Each process gets a timeslice from kernel.
  • Then the timeslice is divided up among the threads within the current process
• Local scheduling can be implemented with either
  • Kernel-level threads OR
  • User-level threads.
• Scheduling decision requires only local knowledge of threads within the current process.
• For example, the process timeslice may be 50 msec, and each thread within the process runs for 5 msec/CPU burst
Global Thread scheduling
• Next thread to be scheduled is picked from ANY process in the system.
  • Not just the current process
• Timeslice is allocated at the granularity of threads
  • No notion of per-process timeslice
• Global scheduling can be implemented only with kernel-level threads
  • Picking the next thread requires global knowledge of threads in all processes.
• For example, each thread runs for 10 msec per CPU burst
POSIX threads API: pthread
• Implementations of the API are available on many Unix-like POSIX-conformant operating systems.
• There are around 100 Pthreads procedures, all prefixed "pthread_" and they can be categorized into four groups:
  • Thread management - creating, joining threads etc.
  • Mutexes
  • Condition variables
  • Synchronization between threads using read/write locks and barriers
Pthread API: thread identification
• A thread ID is represented by the pthread_t data type.
• On many implementations the pthread_t type is represented using integers (e.g., unsigned long in Linux).
• Unlike the pid_t type, implementations are allowed to use a structure to represent the pthread_t data type.
• Therefore, portable applications can’t treat the pthread_t type as an integer → we need a function, pthread_equal(), to compare thread IDs, instead of using the "==" operator.

Pthread API: thread identification
• A thread can obtain its own thread ID by calling the pthread_self() function.
• Why do threads need to know their own thread IDs?
  • Various pthreads functions use thread IDs to identify the thread on which they are to act.
  • In some applications, it can be useful to tag dynamic data structures with the ID of a particular thread. This can serve to identify the thread that created or “owns” a data structure.
Pthread API: thread creation

• The traditional UNIX process model supports only one thread of control per process.
  • Conceptually, this is the same as a threads-based model whereby each process is made up of only one thread.
• With pthreads, when a program runs, it also starts out as a single process with a single thread of control.
  • As the program runs, its behavior should be indistinguishable from the traditional process, until it creates more threads of control.
Pthread API: thread creation

• The thread argument is set to the thread ID of the newly created thread before pthread_create() returns.
• The attr argument is a pointer to a pthread_attr_t object that specifies various attributes for the new thread.
  • If attr is specified as NULL, then the thread is created with various default attributes.
Pthread API: thread creation

• The new thread commences execution by calling the function identified by start, with the argument arg (i.e., run as start(arg)).
  • start is a pointer to a function that takes a void pointer as input, and returns a void pointer.
  • arg points to a global or heap variable, but it can also be specified as NULL.
  • Can we pass a pointer to a local variable to arg?
Pthread API: thread creation

• Why are the argument and the return value made void pointers?
  • To pass multiple arguments or return multiple values
  • For example, if we need to pass multiple arguments to start, then arg can be specified as a pointer to a structure containing the arguments as separate fields.
  • The return value of the thread start function is also a void pointer (void *). It can be captured by pthread_join().
Pthread API: thread creation

• When a thread is created, there is no guarantee which will run first: the newly created thread or the calling thread.
• Note the difference between the return values of process-related functions and thread-related functions.
  • Process functions → 0: success, -1: error, errno: failure reason
  • Thread functions → 0: success, positive number: failure reason
Printing a thread ID as an integer may not give the correct thread ID if the thread ID type is not implemented as an integer.

Pthread API: thread termination

• If any thread within a process calls exit() or _exit(), or the main thread performs a return in the main() function, then the entire process terminates.
• A single thread can exit in three ways, thereby stopping its flow of control, without terminating the entire process.
  • The thread can simply return from the start routine. The return value is the thread’s exit code.
  • The thread can be canceled by another thread in the same process.
  • The thread can call pthread_exit().
Pthread API: thread termination

• Calling pthread_exit() is equivalent to performing a return in the thread’s start function.
  • The difference is that pthread_exit() can be called from any function called by the thread’s start function.
• The rval_ptr argument is a void pointer.
  • This pointer is available to other threads in the process that call the pthread_join() function.
Pthread API: joining a terminated thread

• The pthread_join() function waits for the thread identified by thread to terminate. This operation is termed joining.
• The calling thread will block until the specified thread calls pthread_exit(), returns from its start routine, or is canceled.
  • If the thread calls pthread_exit(), the memory location specified by rval_ptr is set to the value passed to pthread_exit().
  • If the thread simply returned from its start routine, the memory location specified by rval_ptr is set to the return value of the start routine (which is also a void pointer).
  • If the thread was canceled, the memory location specified by rval_ptr is set to PTHREAD_CANCELED.
Pthread API: joining a terminated thread
• Why do we need to join a terminated thread?
  • If a thread is not detached, then we must join with it using pthread_join().
  • Otherwise, when the thread terminates, it produces the thread equivalent of a zombie process.
  • Aside from wasting system resources, if enough thread zombies accumulate, we won’t be able to create additional threads.

Pthread API: detaching a thread
• By default, a thread is joinable, meaning that when it terminates, another thread can obtain its return status using pthread_join().
• Sometimes we don’t care about the thread’s return status and simply want the system to automatically clean up and remove the thread when it terminates.
  • In this case, we can mark the thread as detached, by making a call to pthread_detach() specifying the thread’s identifier in thread.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int idata = 111;                 /* Allocated in data segment */

int main(int argc, char *argv[])
{
    int istack = 222;            /* Allocated in stack segment */
    pid_t childPid;

    childPid = fork();
    if (childPid == -1) {
        exit(EXIT_FAILURE);
    } else if (childPid == 0) {  /* Child process: modify its own copy of the data */
        idata *= 333;
        istack *= 666;
    } else {                     /* Parent process: give child a chance to execute */
        sleep(3);
    }

    /* Both parent and child come here */
    printf("PID=%ld %s idata=%d istack=%d\n", (long) getpid(),
           (childPid == 0) ? "(child) " : "(parent)", idata, istack);

    exit(0);
}

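#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int idata = 111;                 /* Allocated in data segment, shared by all threads */

void *mythread(void *arg)
{
    idata *= 333;
    /* istack *= 666;   Undefined variable here: istack is local to main()
                        (each thread has its own stack) */
    return (void *) 0;
}

int main(int argc, char *argv[])
{
    int istack = 222;            /* Allocated in the main thread's stack */
    pthread_t tid;

    pthread_create(&tid, NULL, mythread, NULL);

    sleep(1);                    /* Give the new thread a chance to execute */
    printf("idata=%d istack=%d\n", idata, istack);
    exit(0);
}

Output (assuming the new thread runs during the sleep): idata=36963 istack=222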
Threads race condition: sharing data

What we expect …

But actually …

Source of the problem: uncontrolled scheduling and unprotected shared data
