Lab 2 Threads
Lab Report
Oussama Agourram
February 24, 2025
2. How many threads (including the main thread) are running while the child
thread is active?
Two threads are running while the child thread is active: the main thread and the child thread it created with pthread_create().
The thread function's void * parameter and return type allow it to accept and return arbitrary data through pointers.
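To make this concrete, here is a minimal sketch (all names and values are my own, not necessarily the lab's code) of a main thread passing an int to a child thread and receiving a heap-allocated result back through the void * interface:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Child thread: receives an int through void *, returns its square
   as a heap-allocated int (freed by the joiner). */
static void *square(void *arg) {
    int n = *(int *)arg;
    int *result = malloc(sizeof *result);
    *result = n * n;
    return result;                                 /* retrieved via pthread_join */
}

int main(void) {
    pthread_t child;
    int input = 7;
    void *ret;

    pthread_create(&child, NULL, square, &input);  /* two threads are running */
    pthread_join(child, &ret);                     /* back to one thread */
    printf("square(%d) = %d\n", input, *(int *)ret);
    free(ret);
    return 0;
}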
1. Are the threads guaranteed to print in the order they were created?
No, the threads are not guaranteed to print in the order they were created. Thread
execution scheduling is determined by the operating system scheduler, which may run
threads in any order based on factors such as system load, scheduling policy, and
thread priority. The output order can therefore vary between runs of the program.
2. How could you verify that the thread IDs are unique?
To verify that thread IDs are unique, you could:
• Store the thread IDs in an array as they’re printed
• Compare each new ID against previously stored IDs to check for duplicates
• Use a data structure like a hash set to track IDs and check for duplicates
Alternatively, you could simply observe the output of pthread_self() for each thread
and confirm they’re different values.
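A minimal sketch of the array-based check (NUM_THREADS and all names are assumptions of mine). Because pthread_t is an opaque type, the portable way to compare two IDs is pthread_equal(), not ==:

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 5                  /* assumed value */

static void *worker(void *arg) { (void)arg; return NULL; }

int main(void) {
    pthread_t ids[NUM_THREADS];
    int duplicates = 0;

    for (int i = 0; i < NUM_THREADS; i++)
        pthread_create(&ids[i], NULL, worker, NULL);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(ids[i], NULL);

    /* pthread_t is opaque: compare with pthread_equal(), not ==. */
    for (int i = 0; i < NUM_THREADS; i++)
        for (int j = i + 1; j < NUM_THREADS; j++)
            if (pthread_equal(ids[i], ids[j]))
                duplicates++;
    printf("%d duplicate ID pair(s) found\n", duplicates);
    return 0;
}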
3. What changes when you define NUM_THREADS larger (e.g., 10, 20)?
When NUM_THREADS is increased:
• More threads will be created and executed
• There will be more interleaving of thread outputs, as more threads compete for
CPU time
• The overhead of thread creation and management increases
• Once the number of runnable threads exceeds the number of available CPU cores,
threads are time-sliced rather than running truly in parallel, adding context-switch
overhead
• The thread scheduling becomes more complex, potentially leading to more varied
execution patterns
2. Could you store the ID on the stack instead of the heap?
No, storing the ID on the stack would be unsafe. If we used a stack variable:
• The variable would be local to the loop iteration
• It would go out of scope or be reused for the next iteration before the thread
could access it
• Multiple threads could end up reading the same value, or reading undefined values
after the stack memory is reused
The only safe way to use stack variables would be to give each thread its own variable
that outlives the creation loop (like an array of ints declared before the loop, as
sketched below), but allocating on the heap is a more flexible approach.
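For contrast, a sketch of the safe stack-based variant described above (all names and values are mine): because the array outlives the creation loop, each thread reads a stable, distinct address.

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 5                  /* assumed value */

static void *printId(void *arg) {
    printf("thread %d\n", *(int *)arg);
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];
    int ids[NUM_THREADS];              /* one slot per thread, outlives the loop */

    for (int i = 0; i < NUM_THREADS; i++) {
        ids[i] = i;                    /* each thread gets its own stable address */
        pthread_create(&threads[i], NULL, printId, &ids[i]);
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    return 0;
}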
3. Why is pthread_join() needed before freeing tids?
The pthread_join() calls are needed before freeing the tids array because:
• The tids array contains the thread identifiers that are still in use by the threads
• If we freed the memory before joining, we’d be deallocating memory that’s still
in active use
• Joining ensures that all threads have completed execution before we free the
resources they were using
• This prevents potential memory access violations or undefined behavior
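A sketch of the safe ordering (names, sizes, and the int-per-thread payload are assumptions of mine): every thread that reads tids is joined before the array is freed.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_THREADS 5                  /* assumed value */

static void *printId(void *arg) {
    printf("thread %d\n", *(int *)arg);
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];
    int *tids = malloc(NUM_THREADS * sizeof *tids);

    for (int i = 0; i < NUM_THREADS; i++) {
        tids[i] = i;
        pthread_create(&threads[i], NULL, printId, &tids[i]);
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL); /* all readers of tids have finished */
    free(tids);                         /* only now is this safe */
    return 0;
}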
threadData[i].end = (i == NUM_THREADS - 1)
                        ? ARRAY_SIZE
                        : (i + 1) * segmentSize;
Any operation that can be performed independently on segments of an array and then
combined is a good candidate for this parallelization approach.
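A compact sketch of the whole pattern, using an array sum as the example (sizes and names are assumptions, not necessarily the lab's):

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4                  /* assumed values */
#define ARRAY_SIZE  1000

typedef struct {
    int start, end;                    /* segment [start, end) */
    long partial;                      /* thread-local result */
} ThreadData;

static int array[ARRAY_SIZE];

static void *sumSegment(void *arg) {
    ThreadData *d = arg;
    d->partial = 0;
    for (int i = d->start; i < d->end; i++)
        d->partial += array[i];
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];
    ThreadData threadData[NUM_THREADS];
    int segmentSize = ARRAY_SIZE / NUM_THREADS;
    long total = 0;

    for (int i = 0; i < ARRAY_SIZE; i++)
        array[i] = 1;

    for (int i = 0; i < NUM_THREADS; i++) {
        threadData[i].start = i * segmentSize;
        threadData[i].end = (i == NUM_THREADS - 1)       /* same remainder rule */
                                ? ARRAY_SIZE
                                : (i + 1) * segmentSize;
        pthread_create(&threads[i], NULL, sumSegment, &threadData[i]);
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);
        total += threadData[i].partial;                  /* combine in main */
    }
    printf("total = %ld (expected %d)\n", total, ARRAY_SIZE);
    return 0;
}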
1. How is the parallel dot product similar to the parallel array sum?
• Both divide the input data into contiguous segments for parallel processing
• Both assign each thread a specific range of indices to work with
• Both store partial results in thread-specific locations in a shared array
• Both combine partial results in the main thread after all worker threads complete
The main difference is that the dot product operates on two input arrays simultaneously
(multiplying corresponding elements) instead of just summing one array.
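Relative to the array-sum sketch above, only the worker's loop body changes. A drop-in replacement (the second array b is my assumption, declared and initialized alongside array):

/* Dot-product worker: multiplies corresponding elements of two
   arrays before accumulating into the thread-local partial. */
static int b[ARRAY_SIZE];

static void *dotSegment(void *arg) {
    ThreadData *d = arg;
    d->partial = 0;
    for (int i = d->start; i < d->end; i++)
        d->partial += (long)array[i] * b[i];
    return NULL;
}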
2. Which thread concept ensures that partialDots[i] is safely written before
the main thread reads it?
The pthread_join() function ensures safe access to partialDots[i]:
• When the main thread calls pthread_join(threads[i], NULL), it blocks until
the specified thread has completed
• This creates a happens-before relationship, guaranteeing that all memory
operations in the thread (including writing to partialDots[i]) complete before
the join returns
• Only after joining all threads does the main thread read from the partialDots
array
• This synchronization ensures memory visibility without requiring explicit
synchronization primitives like mutexes
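A fragment illustrating that ordering (threads[], partialDots[], and NUM_THREADS are assumed to be set up as in the lab code):

/* Each pthread_join creates a happens-before edge, so worker i's
   write to partialDots[i] is guaranteed visible afterwards. */
double dot = 0.0;
for (int i = 0; i < NUM_THREADS; i++) {
    pthread_join(threads[i], NULL);   /* wait: worker i's writes complete */
    dot += partialDots[i];            /* safe: read is ordered after the join */
}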
3. What other vector operations could be parallelized with the same approach?
• Vector addition: each thread adds corresponding elements of two vectors for its
segment
• Vector subtraction: similar to addition but subtracting elements
• Scalar multiplication: multiply each element by a scalar value
• Element-wise vector multiplication: multiply corresponding elements (without
summing)
• Vector normalization: compute the magnitude and then divide each element
• Computing vector metrics: like Euclidean distance, Manhattan distance, etc.
The basic pattern remains the same: divide the vectors into segments, process
independently in parallel, and combine results if needed.
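As one example from the list, a minimal vector-addition sketch (sizes and names are mine). Because each thread writes a disjoint slice of c, join is the only synchronization needed and there is no combine step:

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4                  /* assumed values */
#define N           1000

static double a[N], b[N], c[N];

typedef struct { int start, end; } Range;

/* Each thread writes its own slice of c: no combine step needed. */
static void *addSegment(void *arg) {
    Range *r = arg;
    for (int i = r->start; i < r->end; i++)
        c[i] = a[i] + b[i];
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];
    Range ranges[NUM_THREADS];
    int segmentSize = N / NUM_THREADS;

    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

    for (int i = 0; i < NUM_THREADS; i++) {
        ranges[i].start = i * segmentSize;
        ranges[i].end = (i == NUM_THREADS - 1) ? N : (i + 1) * segmentSize;
        pthread_create(&threads[i], NULL, addSegment, &ranges[i]);
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);

    printf("c[10] = %g (expected 30)\n", c[10]);
    return 0;
}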
1. How are the rows of the matrix divided among the threads?
The rows are partitioned by:
• Calculating the number of rows per thread: rowsPerThread = N / NUM_THREADS
• Assigning each thread a starting row: threadData[i].startRow = i * rowsPerThread
• Determining the ending row (exclusive):
– For most threads: threadData[i].endRow = (i + 1) * rowsPerThread
– For the last thread: threadData[i].endRow = N (to handle any remainder)
This ensures that all rows are covered and that the workload is divided as evenly as
possible among the threads.
Each thread reads the shared inputs but writes only to its own rows of the output, as
the sketch below illustrates. This data independence eliminates the potential for race
conditions or data corruption.
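Putting the partitioning and the data independence together, a sketch assuming (for illustration only) a matrix-vector product; all names and sizes are mine:

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4                  /* assumed values */
#define N           8

static double mat[N][N], vec[N], result[N];

typedef struct { int startRow, endRow; } ThreadData;

/* Each thread writes result[row] only for its own rows, so no two
   threads ever touch the same output element. */
static void *multiplyRows(void *arg) {
    ThreadData *d = arg;
    for (int row = d->startRow; row < d->endRow; row++) {
        result[row] = 0.0;
        for (int col = 0; col < N; col++)
            result[row] += mat[row][col] * vec[col];
    }
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];
    ThreadData threadData[NUM_THREADS];
    int rowsPerThread = N / NUM_THREADS;

    for (int r = 0; r < N; r++) {
        vec[r] = 1.0;
        for (int c = 0; c < N; c++)
            mat[r][c] = 1.0;
    }

    for (int i = 0; i < NUM_THREADS; i++) {
        threadData[i].startRow = i * rowsPerThread;
        threadData[i].endRow = (i == NUM_THREADS - 1)
                                   ? N
                                   : (i + 1) * rowsPerThread;
        pthread_create(&threads[i], NULL, multiplyRows, &threadData[i]);
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);

    printf("result[0] = %g (expected %d)\n", result[0], N);
    return 0;
}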
3. How would you handle the case where N is not evenly divisible by NUM_THREADS?
The code already handles this case with the ternary expression:
threadData[i].endRow = (i == NUM_THREADS - 1)
                           ? N
                           : (i + 1) * rowsPerThread;
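One caveat: the last thread absorbs the entire remainder, so its segment can be up to NUM_THREADS - 1 rows larger than the others. A more balanced alternative (a sketch of mine, not the lab's code; threadData, N, and NUM_THREADS assumed as above) spreads the remainder across the first threads:

/* The first N % NUM_THREADS threads each take one extra row,
   so segment sizes differ by at most one. */
int base  = N / NUM_THREADS;
int extra = N % NUM_THREADS;
int row   = 0;
for (int i = 0; i < NUM_THREADS; i++) {
    threadData[i].startRow = row;
    row += base + (i < extra ? 1 : 0);
    threadData[i].endRow = row;
}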