Lab 2
Process & Multithreaded Process
Course: Operating Systems
October 9, 2022
Goal: This lab helps students review the data segments of a process and distinguish
the differences between threads and processes.
Content: In detail, this lab requires students to identify the memory regions of a
process's data segment and to practice with examples such as creating a multithreaded
program and showing the memory regions of threads:
• View process memory regions: data segment, BSS segment, stack, and heap.
• Show the differences between a process and a thread in terms of memory regions.
Result: After doing this lab, students will understand the mechanism by which memory
regions are distributed to allocate the data segments of specific processes. In
addition, they will understand how to write a multithreaded program.
Requirement: Students need to review the theory of process memory and threads.
Contents
1. Introduction
   1.1. Process's memory
   1.2. Stack
   1.3. Interprocess Communication
   1.4. Introduction to thread
2. Practice
   2.1. Looking inside a process
   2.2. How to transfer data between processes?
        2.2.1. Shared Memory
        2.2.2. Synchronization Issues in Shared Memory
3. Pipe
   3.1. How to create multiple threads?
        3.1.1. Thread libraries
        3.1.2. Multithread programming
4. Exercise (Required)
   4.1. Problem 1
   4.2. Problem 2
   4.3. Problem 3
1. Introduction
1.1. Process's memory
Traditionally, a Unix process is divided into segments. The standard segments are code
segment, data segment, BSS (block started by symbol), and stack segment.
The code segment contains the binary code of the program which is running as the pro-
cess (a “process” is a program in execution). The data segment contains the initialized
global variables and data structures. The BSS segment contains the uninitialized global
data structures and finally, the stack segment contains the local variables, return ad-
dresses, etc. for the particular process.
Under Linux, a process can execute in two modes - user mode and kernel mode. A
process usually executes in user mode, but can switch to kernel mode by making sys-
tem calls. When a process makes a system call, the kernel takes control and does the
requested service on behalf of the process. The process is said to be running in kernel
mode during this time. When a process is running in user mode, it is said to be “in
userland” and when it is running in kernel mode it is said to be “in kernel space”. We
will first have a look at how the process segments are dealt with in userland and then
take a look at the bookkeeping on process segments done in kernel space.
In Figure 1.1, blue regions represent virtual addresses that are mapped to physical
memory, whereas white regions are unmapped. The distinct bands in the address space
correspond to memory segments like the heap, stack, and so on.
• The Code segment consists of the code - the actual executable program. The code
of all the functions we write in the program resides in this segment. The addresses
of the functions will give us an idea of where the code segment is. If we have a
function func() and let p be the address of func() (p = &func;), then we know that p
will point within the code segment.
• The Data segment consists of the initialized global variables of a program. The
operating system needs to know what values are used to initialize the global
variables, so the initialized variables are kept in the data segment. To get an
address inside the data segment, we declare a global variable and then print out its
address. This address must be inside the data segment.
• The BSS segment consists of the uninitialized global data structures. To get an
address which occurs inside the BSS, we declare an uninitialized global variable,
then print its address.
• The automatic variables (or local variables) will be allocated on the stack, so
printing out the addresses of local variables will provide us with the addresses
within the stack segment.
1.2. Stack
The stack is one of the most important memory regions of a process. It is used to store
temporary data used by the process (or thread). The name "stack" describes the way data
is put into and retrieved from this region, which is identical to the stack data
structure: the last item pushed onto the stack is the first one to be removed (popped).
Stack organization makes it suitable for handling function calls. Each time a function
is called, it gets a new stack frame. This is an area of memory which usually contains,
at a minimum, the address to return to when the function completes, the input arguments
to the function, and space for local variables.
In Linux, the stack starts at a high address in memory and grows down to increase its
size. Each time a new function is called, the process creates a new stack frame for
this function. This frame is placed right after that of its caller. When the function
returns, this frame is cleaned from memory by shrinking the stack (the stack pointer
goes up). The following program illustrates how to identify the relative locations of
stack frames created by nested function calls.
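Below is a minimal sketch of such a program (the function names are illustrative):
each function prints the address of a local variable, and since the stack grows down,
each nested call should print a lower address than its caller.

#include <stdio.h>

void level2(void) {
    int local2;
    printf("level2 local at %p\n", (void *) &local2);
}

void level1(void) {
    int local1;
    printf("level1 local at %p\n", (void *) &local1);
    level2();   /* level2's frame is placed right after level1's */
}

int main(void) {
    int local0;
    printf("main   local at %p\n", (void *) &local0);
    level1();
    return 0;
}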
Similar to the heap, the stack has a pointer named the stack pointer (as the heap has
the program break) which indicates the top of the stack. To change the stack size, we
must modify the value of this pointer. Usually, the value of the stack pointer is held
by the stack pointer register inside the processor. Stack space is limited: we cannot
extend the stack beyond a given size. If we do so, a stack overflow will occur and
crash our program. To identify the default stack size, use the following command:
ulimit -s
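The same limit can also be queried from within a program using the standard
getrlimit() call; a minimal sketch:

#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    struct rlimit rl;
    /* RLIMIT_STACK holds the soft/hard limits on the stack size */
    if (getrlimit(RLIMIT_STACK, &rl) == 0)
        printf("soft stack limit: %ld bytes\n", (long) rl.rlim_cur);
    return 0;
}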
Different from the heap, stack data is automatically allocated and cleaned up as
procedures are invoked and terminate. Therefore, in C programming, we do not need to
allocate and free local variables. In Linux, a process is permitted to have multiple
stack regions; each region belongs to a thread.
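A small sketch that makes this visible (using the Pthreads API covered in section
3.1): each thread prints the address of one of its local variables, and the addresses
fall in distinct stack regions.

#include <pthread.h>
#include <stdio.h>

void *show_stack(void *arg) {
    int local;
    /* each thread's local variable lives in that thread's own stack */
    printf("thread %ld: local at %p\n", (long) arg, (void *) &local);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, show_stack, (void *) 1L);
    pthread_create(&t2, NULL, show_stack, (void *) 2L);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}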
1.3. Interprocess Communication
Processes do not share memory by default, so the operating system provides two
fundamental models for exchanging data between processes: shared memory and message
passing, as illustrated in Figure 1.2.
Figure 1.2: Shared memory vs message passing model.
1.4. Introduction to thread
Figure 1.3 illustrates the difference between a traditional single-threaded process and a
multithreaded process. The benefits of multithreaded programming can be broken down
into four major categories:
• Responsiveness
• Resource sharing
• Economy
• Scalability
Question: What resources are used when a thread is created? How do they differ from
those used when a process is created?
On a system with multiple cores, however, concurrency means that the threads can run
in parallel, because the system can assign a separate thread to each core, as shown in
Figure 1.4.
Question: Is it possible to have concurrency but not parallelism? Explain.
2. Practice
2.1. Looking inside a process
Consider the following C program with basic statements:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int glo_init_data = 99;

int glo_noninit_data;

void print_func() {
    int local_data = 9;
    printf("Process ID = %d\n", getpid());
    printf("Addresses of the process:\n");
    printf("1. glo_init_data = %p\n", &glo_init_data);
    printf("2. glo_noninit_data = %p\n", &glo_noninit_data);
    printf("3. print_func() = %p\n", &print_func);
    printf("4. local_data = %p\n", &local_data);
}

int main(int argc, char **argv) {
    print_func();
    return 0;
}
Let's run this program several times and discuss the segments of the process. Where
are the data segment, BSS segment, stack, and code segment?
2.2. How to transfer data between processes?

2.2.1. Shared Memory

One way is System V shared memory: cooperating processes map the same memory segment
into their address spaces. A process creates or obtains a segment by calling shmget(),
whose prototype is:

int shmget(key_t key, size_t size, int shmflg);
• Its first parameter is an integer key that specifies which segment to create;
unrelated processes can access the same shared segment by specifying the same key
value. However, other processes may have also chosen the same fixed key, which could
lead to conflict, so you should be careful when generating keys for shared memory
regions. One solution is to use the special constant IPC_PRIVATE as the key value,
which guarantees that a brand new memory segment is created.
• Its second parameter specifies the number of bytes in the segment. Because seg-
ments are allocated using pages, the number of actually allocated bytes is rounded
up to an integral multiple of the page size.
• The third parameter is the bitwise OR of flag values that specify options to shmget.
The flag values include these:
– IPC_CREAT: This flag indicates that a new segment should be created. This
permits creating a new segment while specifying a key value.
– IPC_EXCL: This flag, which is always used with IPC_CREAT, causes shmget
to fail if a segment key is specified that already exists. If this flag is not given
and the key of an existing segment is used, shmget returns the existing seg-
ment instead of creating a new one.
– Mode flags: This value is made of 9 bits indicating permissions granted to
owner, group, and world to control access to the segment.
To make the shared memory segment available, a process must attach it by calling
shmat().
void *shmat(int shmid, const void *shmaddr, int shmflg);
• The first argument, shmid, is the shared memory segment identifier returned by
shmget.
• The second argument is a pointer that specifies where in your process's address
space you want to map the shared memory; if you specify NULL, Linux will choose an
available address.
• The third argument is a flag. You can read more details about this argument
in the Linux manual page: https://man7.org/linux/man-pages/man3/shmat.3p.
html.
When you're finished with a shared memory segment, the segment should be detached
using shmdt. Pass it the address returned by shmat. If the segment has been deallocated
and this was the last process using it, it is removed. Examples: Run the two following
processes in two terminals. In the writer process, you can type an input string and
observe the output from the reader process.
• writer.c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#define SHM_KEY 0x123

int main(int argc, char *argv[])
{
    int shmid;
    char *shm;
    /* create (or open) a 1000-byte shared segment */
    shmid = shmget(SHM_KEY, 1000, 0644 | IPC_CREAT);
    if (shmid < 0) {
        perror("shmget");
        return 1;
    } else {
        printf("shared memory: %d\n", shmid);
    }
    shm = shmat(shmid, NULL, 0);   /* attach the segment */
    fgets(shm, 1000, stdin);       /* type an input string into shared memory */
    if (shmdt(shm) == -1) {
        perror("shmdt");
        return 1;
    }
    return 0;
}
• reader.c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define SHM_KEY 0x123

int main(int argc, char *argv[])
{
    int shmid;
    char *shm;
    /* open the segment created by the writer */
    shmid = shmget(SHM_KEY, 1000, 0644 | IPC_CREAT);
    if (shmid < 0) {
        perror("shmget");
        return 1;
    } else {
        printf("shared memory: %d\n", shmid);
    }
    shm = shmat(shmid, NULL, 0);   /* attach the segment */
    sleep(10);                     /* give the writer time to type a string */
    printf("message from writer: %s\n", shm);
    if (shmdt(shm) == -1) {
        perror("shmdt");
        return 1;
    }
    return 0;
}
2.2.2. Synchronization Issues in Shared Memory

When several processes update the same shared segment at the same time, their accesses
must be synchronized. In the following two programs, two writers coordinate their
updates to a shared counter through a named POSIX semaphore.
• writer1.c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <semaphore.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#define SHM_KEY 0x123
#define SNAME "/mysem"

struct shared_data {
    int counter;
    int writerID;
};

int main(int argc, char *argv[]) {
    int shmid;
    struct shared_data *data;
    /* create (or open) the shared segment and attach it */
    shmid = shmget(SHM_KEY, sizeof(struct shared_data), 0644 | IPC_CREAT);
    if (shmid < 0) {
        perror("shmget");
        return 1;
    }
    data = shmat(shmid, NULL, 0);
    /* open the named semaphore, creating it with initial value 1;
       named semaphores do not need sem_init */
    sem_t *sem = sem_open(SNAME, O_CREAT, 0644, 1);
    if (sem == SEM_FAILED) {
        printf("Sem failed\n");
        return -1;
    }
    while (true) {
        sem_wait(sem);   /* enter the critical section */
        printf("Read from Writer ID: %d with counter: %d\n",
               data->writerID, data->counter);
        data->writerID = 1;
        data->counter++;
        sem_post(sem);   /* leave the critical section */
        sleep(1);
    }
    if (shmdt(data) == -1) {
        perror("shmdt");
        return 1;
    }
    return 0;
}
• writer2.c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <semaphore.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#define SHM_KEY 0x123
#define SNAME "/mysem"

struct shared_data {
    int counter;
    int writerID;
};

/* identical to writer1.c except that it stamps writerID = 2 */
int main(int argc, char *argv[]) {
    int shmid;
    struct shared_data *data;
    shmid = shmget(SHM_KEY, sizeof(struct shared_data), 0644 | IPC_CREAT);
    if (shmid < 0) {
        perror("shmget");
        return 1;
    } else {
        printf("shared memory: %d\n", shmid);
    }
    data = shmat(shmid, NULL, 0);
    sem_t *sem = sem_open(SNAME, O_CREAT, 0644, 1);
    if (sem == SEM_FAILED) {
        printf("Sem failed\n");
        return -1;
    }
    while (true) {
        sem_wait(sem);
        printf("Read from Writer ID: %d with counter: %d\n",
               data->writerID, data->counter);
        data->writerID = 2;
        data->counter++;
        sem_post(sem);
        sleep(1);
    }
    if (shmdt(data) == -1) {
        perror("shmdt");
        return 1;
    }
    return 0;
}
In the above programs, the semaphore named sem is used to lock the section of code
that changes the values of the shared variables. It is noteworthy that we use the
semaphore's name to identify it across different processes.
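One practical note, sketched below with the same SHM_KEY and SNAME values as above:
named semaphores and System V segments outlive the processes that use them, so it is
good practice to remove them once the experiment is finished.

#include <stdio.h>
#include <semaphore.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#define SHM_KEY 0x123
#define SNAME "/mysem"

int main(void) {
    /* remove the named semaphore */
    if (sem_unlink(SNAME) == -1)
        perror("sem_unlink");
    /* look up the existing segment and mark it for deletion */
    int shmid = shmget(SHM_KEY, 0, 0644);
    if (shmid == -1 || shmctl(shmid, IPC_RMID, NULL) == -1)
        perror("shm cleanup");
    return 0;
}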
3. Pipe
A pipe is actually a very common method to transfer data between processes. For
example, the pipe operator '|' can be used to transfer the output from one command to
another command, as in the following example:

# the output from "history" will be the input to the grep command
history | grep "a"
In terms of C programming, the standard header "unistd.h" declares the following
function to create a pipe. This function creates a pipe, a unidirectional data channel
that can be used for interprocess communication. The array pipefd is used to return two
file descriptors referring to the ends of the pipe: pipefd[0] refers to the read end of
the pipe, and pipefd[1] refers to the write end. Data written to the write end of the
pipe is buffered by the kernel until it is read from the read end of the pipe.

int pipe(int pipefd[2]);
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(int argc, char *argv[]) {
    int pipefds[2];
    char writemessages[20] = "Hi";
    char readmessage[20];
    pid_t pid;
    pipe(pipefds);   /* create the pipe before forking so both processes share it */
    pid = fork();
    // Child process
    if (pid == 0) {
        read(pipefds[0], readmessage, sizeof(readmessage));
        printf("Child Process: Reading, message is %s\n", readmessage);
        return 0;
    }
    // Parent process
    printf("Parent Process: Writing, message is %s\n", writemessages);
    write(pipefds[1], writemessages, sizeof(writemessages));
    return 0;
}
In the above program, the parent process first creates a pipeline and calls fork() to
create a child process. Then, the parent process writes a message to the pipeline
while the child process reads data from it. Noticeably, both write() and read() need
to know the size of the message.
3.1. How to create multiple threads?
3.1.1. Thread libraries
A thread library provides the programmer with an API for creating and managing
threads. There are two primary ways of implementing a thread library: in user space or
in kernel space. Three main thread libraries are in use today: POSIX Pthreads,
Windows, and Java. In this lab, we use POSIX Pthreads on Linux and macOS to practice
multithreaded programming.
Creating threads
pthread_create(thread, attr, start_routine, arg)
Initially, your main() program comprises a single, default thread. All other threads
must be explicitly created by the programmer.
• thread: An opaque, unique identifier for the new thread, returned by the subroutine.
• attr: An opaque attribute object that may be used to set thread attributes. You
can specify a thread attributes object, or NULL for the default values.
• start_routine: the C routine that the thread will execute once it is created.
• arg: a single argument that may be passed to start_routine, cast to (void *); use
NULL if no argument is passed.
Example:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#define NUM_THREADS 5

void *PrintHello(void *threadid) {
    long tid = (long) threadid;
    printf("Hello World! It's me, thread #%ld!\n", tid);
    pthread_exit(NULL);
}

int main(int argc, char *argv[]) {
    pthread_t threads[NUM_THREADS];
    int rc;
    long t;
    for (t = 0; t < NUM_THREADS; t++) {
        printf("In main: creating thread %ld\n", t);
        rc = pthread_create(&threads[t], NULL, PrintHello, (void *) t);
        if (rc) {
            printf("ERROR; return from pthread_create() is %d\n", rc);
            exit(-1);
        }
    }
    /* Last thing that main() should do */
    pthread_exit(NULL);
}
Passing arguments to threads We can pass a structure to each thread, as in the
example below, which builds on the previous example:
struct thread_data {
    int thread_id;
    int sum;
    char *message;
};

struct thread_data thread_data_array[NUM_THREADS];

void *PrintHello(void *thread_arg)
{
    struct thread_data *my_data;
    ...
    my_data = (struct thread_data *) thread_arg;
    taskid = my_data->thread_id;
    sum = my_data->sum;
    hello_msg = my_data->message;
    ...
}

int main(int argc, char *argv[])
{
    ...
    thread_data_array[t].thread_id = t;
    thread_data_array[t].sum = sum;
    thread_data_array[t].message = messages[t];
    rc = pthread_create(&threads[t], NULL, PrintHello,
                        (void *) &thread_data_array[t]);
    ...
}
Joining and Detaching Threads "Joining" is one way to accomplish synchronization
between threads. For example:
• The pthread_join() subroutine blocks the calling thread until the specified
thread terminates.
• The programmer is able to obtain the target thread's termination return status if
it was specified in the target thread's call to pthread_exit().
Example:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

int sum;   /* this data is shared by the thread(s) */
void *runner(void *param);

int main(int argc, char *argv[])
{
    pthread_t tid;         /* the thread identifier */
    pthread_attr_t attr;   /* set of thread attributes */

    if (argc != 2) {
        fprintf(stderr, "usage: a.out <integer value>\n");
        return -1;
    }
    if (atoi(argv[1]) < 0) {
        fprintf(stderr, "%d must be >= 0\n", atoi(argv[1]));
        return -1;
    }
    /* get the default attributes */
    pthread_attr_init(&attr);
    /* create the thread */
    pthread_create(&tid, &attr, runner, argv[1]);
    /* wait for the thread to exit */
    pthread_join(tid, NULL);

    printf("sum = %d\n", sum);
}

/* The thread will begin control in this function */
void *runner(void *param)
{
    int i, upper = atoi(param);
    sum = 0;
    for (i = 1; i <= upper; i++)
        sum += i;
    pthread_exit(0);
}
4. Exercise (Required)
4.1. Problem 1
First, download the two text files from the URL: https://drive.google.com/file/
d/1fgJqOeWbJC4ghMKHkuxfIP6dh2F911-E/view?usp=sharing. These files contain the
100,000 ratings of 943 users for 1,682 movies in the following format:

userID <tab> movieID <tab> rating <tab> timeStamp
userID <tab> movieID <tab> rating <tab> timeStamp
...

Second, write a program that spawns two child processes; each of them will read a file
and compute the average ratings of the movies in that file. Implement the program
using the shared memory method.
4.2. Problem 2
An interesting way of calculating pi is to use a technique known as Monte Carlo, which
involves randomization. This technique works as follows: Suppose you have a circle
inscribed within a square, as shown in Figure 4.1.
(Assume that the radius of this circle is 1.) First, generate a series of random points as
simple (x, y) coordinates. These points must fall within the Cartesian coordinates that
bound the square. Of the total number of random points that are generated, some will
occur within the circle. Next, estimate pi by performing the following calculation:
pi = 4 x (number of points in circle) / (total number of points)
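To make the calculation concrete, here is a minimal serial sketch of this estimate
(it samples the unit quarter square, which is equivalent by symmetry; the required
programs are specified below):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    long nPoints = 1000000, inCircle = 0;
    for (long i = 0; i < nPoints; i++) {
        double x = (double) rand() / RAND_MAX;   /* x, y in [0, 1] */
        double y = (double) rand() / RAND_MAX;
        if (x * x + y * y <= 1.0)                /* point falls inside the circle */
            inCircle++;
    }
    printf("pi ~= %f\n", 4.0 * inCircle / nPoints);
    return 0;
}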
As a general rule, the greater the number of points, the closer the approximation to
pi. However, if we generate too many points, the approximation will take a very long
time. The solution to this problem is to carry out point generation and calculation
concurrently. Suppose the number of points to be generated is nPoints. We create N
separate threads and have each thread create only nPoints / N points and count the
number of its points that fall into the circle. After all threads have done their job,
we get the total number of points inside the circle by combining the results from each
thread. Since the total number of points generated equals nPoints, the result of this
method is equivalent to that of a single-process program. Furthermore, as threads run
concurrently and the number of points each thread has to handle is much smaller than
in a single-process program, we can save a lot of time.
Write two programs implementing the algorithm described above: one serial
version and one multithreaded version.
The program takes the number of points to be generated from the user, then creates
multiple threads to approximate pi. Put all of your code in two files named
"pi_serial.c" and "pi_multi-thread.c". The number of points is passed to your program
as an input parameter. For example, to have your programs calculate pi by generating
one million points, we will use the following commands:

./pi_serial 1000000
./pi_multi-thread 1000000

Requirement: The multithreaded version must show some speed-up compared to the serial
version. The Makefile must contain at least two targets, pi_serial and pi_multi-thread,
to compile the two programs.
4.3. Problem 3
Conventionally, a pipe is a one-way communication method. (In the example in section
3, you can test this by adding a read() call after the write() call in the parent
process and a write() call after the read() call in the child process, and observing
what happens.) However, we can still use some tricks to adapt it for two-way
communication by using two pipes. In this exercise, you should implement the TODO
segments in the program below.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int pipefd1[2], pipefd2[2];

void INIT(void) {
    if (pipe(pipefd1) < 0 || pipe(pipefd2) < 0) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }
}
void WRITE_TO_PARENT(void) {
    /* send parent a message through pipe */
    // TO DO
    printf("Child send message to parent!\n");
}
void READ_FROM_PARENT(void) {
    /* read message sent by parent from pipe */
    // TO DO
    printf("Child receive message from parent!\n");
}
void WRITE_TO_CHILD(void) {
    /* send child a message through pipe */
    // TO DO
    printf("Parent send message to child!\n");
}
void READ_FROM_CHILD(void) {
    /* read the message sent by child from pipe */
    // TO DO
    printf("Parent receive message from child!\n");
}
int main(int argc, char *argv[]) {
    INIT();
    pid_t pid;
    pid = fork();
    // set a timer; both processes will end after 10 seconds
    alarm(10);
    if (pid == 0) {
        while (1) {
            sleep(rand() % 2 + 1);
            WRITE_TO_PARENT();
            READ_FROM_PARENT();
        }
    } else {
        while (1) {
            sleep(rand() % 2 + 1);
            READ_FROM_CHILD();
            WRITE_TO_CHILD();
        }
    }
    return 0;
}
A. Memory-related data structures in the kernel
In the Linux kernel, every process has an associated struct task_struct. The definition
of this struct is in the header file include/linux/sched.h.
struct task_struct {
    volatile long state;   /* -1 unrunnable, 0 runnable, >0 stopped */
    struct thread_info *thread_info;
    atomic_t usage;
    ...
    struct mm_struct *mm, *active_mm;
    ...
    pid_t pid;
    ...
    char comm[16];
    ...
};
• The mm_struct within the task_struct is the key to all memory management
activities related to the process.
Here the first member of importance is mmap. It contains the pointer to the list of
VMAs (Virtual Memory Areas) related to this process. Full usage of the process address
space occurs very rarely; the sparse regions in use are denoted by VMAs. The VMAs are
stored in struct vm_area_struct, defined in linux/mm.h:
struct vm_area_struct {
    struct mm_struct *vm_mm;   /* The address space we belong to. */
    unsigned long vm_start;    /* Our start address within vm_mm. */
    unsigned long vm_end;      /* The first byte after our end
                                  address within vm_mm. */
    ...
    /* linked list of VM areas per task, sorted by address */
    struct vm_area_struct *vm_next;
    ...
};
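From userland, these VMAs can be observed through the /proc filesystem: each line of
/proc/<pid>/maps corresponds to one vm_area_struct, showing its vm_start-vm_end range,
its permissions, and the backing file (if any). For example, to list the VMAs of the
current process:

cat /proc/self/maps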