Lab 2 Threads

The lab report covers various activities related to threading with Pthreads, including thread creation, passing arguments, and performing operations like summation and matrix multiplication using threads. Key concepts discussed include the importance of pthread_join(), memory allocation for thread IDs, and ensuring data independence to avoid race conditions. The report emphasizes the flexibility of parallel processing for various operations while maintaining safe access to shared data.


CSC3351: Threading with Pthreads

Lab Report
Oussama Agourram
February 24, 2025

1 Activity 1: Basic Thread Creation


1.1 Report Questions
1. What happens if you remove the pthread_join() call?
If the pthread_join() call is removed, the main thread will continue execution without
waiting for the child thread to complete. This could lead to the main thread terminating
before the child thread finishes, causing the entire process (including all threads)
to exit prematurely. The child thread might therefore never get a chance to complete
its execution or print its message.

2. How many threads (including the main thread) are running while the child
thread is active?
Two threads are running while the child thread is active:

• The main thread (which created the child thread)


• The child thread itself

3. What is the return type of a thread function, and how is it declared?


The return type of a thread function is void* (a void pointer), and it is declared with
a single parameter of void*. The signature is:
void *function_name(void *arg);

This allows the thread function to accept and return arbitrary data through pointers.

2 Activity 2: Creating Multiple Threads


2.1 Report Questions
1. Are the threads guaranteed to print in the order they were created?

No, the threads are not guaranteed to print in the order they were created. Thread
execution scheduling is determined by the operating system scheduler, which may run
threads in any order based on various factors such as system load, scheduling policy,
and thread priority. The output order can vary between different runs of the program.
2. How could you verify that the thread IDs are unique?
To verify that thread IDs are unique, you could:
• Store the thread IDs in an array as they’re printed
• Compare each new ID against previously stored IDs to check for duplicates
• Use a data structure like a hash set to track IDs and check for duplicates
Alternatively, you could simply observe the output of pthread_self() for each thread
and confirm they’re different values.
3. What changes when you define NUM_THREADS larger (e.g., 10, 20)?
When NUM_THREADS is increased:
• More threads will be created and executed
• There will be more interleaving of thread outputs, as more threads compete for
CPU time
• The overhead of thread creation and management increases
• The system might slow down if the number of threads exceeds the number of
available CPU cores
• The thread scheduling becomes more complex, potentially leading to more varied
execution patterns

3 Activity 3: Passing Arguments to Threads


3.1 Report Questions
1. Why do we use malloc() to pass the ID?
We use malloc() to allocate memory for the thread ID on the heap rather than the
stack because:
• If we used a stack variable inside the loop (e.g., int id = i;), its memory location
might be reused or go out of scope before the thread accesses it
• The loop that creates threads would continue and modify the variable before the
thread could read its value
• Heap-allocated memory persists until explicitly freed, ensuring the value remains
valid until the thread is done with it
• Each thread gets its own independent copy of the value that won’t be affected by
the main thread’s execution

2. Could you store the ID on the stack instead of the heap?
No, storing the ID on the stack would be unsafe. If we used a stack variable:
• The variable would be local to the loop iteration
• It would go out of scope or be reused for the next iteration before the thread
could access it
• Multiple threads could end up reading the same value, or reading undefined values
after the stack memory is reused
The only safe way to use stack variables would be to have a separate stack variable for
each thread (like an array of ints declared before the loop), but allocating on the heap
is a more flexible approach.
3. Why is pthread_join() needed before freeing tids?
The pthread_join() calls are needed before freeing the tids array because:
• The tids array contains the thread identifiers that are still in use by the threads
• If we freed the memory before joining, we’d be deallocating memory that’s still
in active use
• Joining ensures that all threads have completed execution before we free the
resources they were using
• This prevents potential memory access violations or undefined behavior

4 Activity 4: Summation of an Array Using Threads


4.1 Report Questions
1. How is data sharing avoided so that no additional synchronization primitives
are needed?
Data sharing is avoided through careful partitioning of the data and output:
• Each thread is assigned a distinct segment of the array to process (no overlapping)
• Each thread writes its result to a unique index in the partialSums array
• Since no memory location is written to by more than one thread, there’s no risk
of race conditions
• The main thread only reads from the partialSums array after all threads have
completed (via pthread_join())
This approach is called "data parallelism" and eliminates the need for mutexes or other
synchronization mechanisms.
2. What happens if ARRAY_SIZE is not divisible by NUM_THREADS?
The code handles this case with a conditional expression when calculating the end
value for each thread:

threadData[i].end = (i == NUM_THREADS - 1)
                        ? ARRAY_SIZE
                        : (i + 1) * segmentSize;

This ensures that:

• Most threads process exactly segmentSize elements


• The last thread processes any remaining elements (which may be more than
segmentSize)
• All elements in the array are processed exactly once
• No elements are missed or processed twice

3. Could you parallelize other operations in the same way?


Yes, this pattern can be applied to many other operations:

• Counting occurrences of a specific value in an array


• Finding minimum or maximum values
• Computing other statistics like average, median, standard deviation
• Filtering elements based on a condition
• Applying transformations to array elements
• Pattern matching or searching

Any operation that can be performed independently on segments of an array and then
combined is a good candidate for this parallelization approach.

5 Activity 5: Dot Product of Two Vectors


5.1 Report Questions
1. How does this approach compare to the array summation code in terms of
data partitioning?
This approach is very similar to the array summation code:

• Both divide the input data into contiguous segments for parallel processing
• Both assign each thread a specific range of indices to work with
• Both store partial results in thread-specific locations in a shared array
• Both combine partial results in the main thread after all worker threads complete

The main difference is that the dot product operates on two input arrays simultaneously
(multiplying corresponding elements) instead of just summing one array.

2. Which thread concept ensures that partialDots[i] is safely written before
the main thread reads it?
The pthread_join() function ensures safe access to partialDots[i]:

• When the main thread calls pthread_join(threads[i], NULL), it blocks until
the specified thread has completed
• This creates a happens-before relationship, guaranteeing that all memory
operations in the thread (including writing to partialDots[i]) complete before
the join returns
• Only after joining all threads does the main thread read from the partialDots
array
• This synchronization ensures memory visibility without requiring explicit
synchronization primitives like mutexes

3. Can you extend this method to do additional vector arithmetic in parallel?


Yes, this method can be extended to other vector operations:

• Vector addition: each thread adds corresponding elements of two vectors for its
segment
• Vector subtraction: similar to addition but subtracting elements
• Scalar multiplication: multiply each element by a scalar value
• Element-wise vector multiplication: multiply corresponding elements (without
summing)
• Vector normalization: compute the magnitude and then divide each element
• Computing vector metrics: like Euclidean distance, Manhattan distance, etc.

The basic pattern remains the same: divide the vectors into segments, process
independently in parallel, and combine results if needed.

6 Activity 6: Basic Multi-Threaded Matrix Multiplication

6.1 Report Questions
1. How is each thread assigned its subset of rows?
Each thread is assigned a subset of rows based on:

• Calculating the number of rows per thread: rowsPerThread = N / NUM_THREADS
• Assigning each thread a starting row: threadData[i].startRow = i * rowsPerThread
• Determining the ending row (exclusive):
– For most threads: threadData[i].endRow = (i + 1) * rowsPerThread

– For the last thread: threadData[i].endRow = N (to handle any remainder)

This ensures that all rows are covered and that the workload is divided as evenly as
possible among the threads.

2. Why do we not require additional synchronization in this matrix example?


Additional synchronization is not required because:

• Each thread writes to a completely separate portion of the output matrix C


• No thread modifies matrix elements that another thread reads from or writes to
• The input matrices A and B are only read, not modified
• The main thread only accesses the result matrix C after all worker threads have
completed (via pthread_join())

This data independence eliminates the potential for race conditions or data corruption.

3. How would you handle the case where N is not evenly divisible by NUM_THREADS?
The code already handles this case with the ternary expression:
threadData[i].endRow = (i == NUM_THREADS - 1)
                           ? N
                           : (i + 1) * rowsPerThread;

This ensures that:

• Most threads process exactly rowsPerThread rows


• The last thread processes any remaining rows if N is not evenly divisible
• All rows of the matrix are processed exactly once
• The workload distribution is as balanced as possible
