

A MINI PROJECT REPORT ON

“MATRIX MULTIPLICATION (DAA)”

SUBMITTED TOWARDS THE
PARTIAL FULFILLMENT OF THE REQUIREMENTS OF

BACHELOR OF ENGINEERING (BE Computer Engineering)

BY
STUDENT NAME          Roll no.

Kartikay Dhakad       BEA29
Ashish Auti           BEA12
Nidhi Bachuwar        BEA13

Under The Guidance of


Dr. Megha V. Kadam

“Towards Ubiquitous Computing Technology”


DEPARTMENT OF COMPUTER ENGINEERING
Marathwada Mitra Mandal’s Institute of
Technology (MMIT)
Lohgaon, Pune - 411047 (2022-2023)


“Techno-Social Excellence”

Marathwada Mitra Mandal’s Institute of Technology (MMIT)
Lohgaon, Pune- 411 047
Accredited ‘A’ Grade by NAAC

“Towards Ubiquitous Computing Technology”


DEPARTMENT OF COMPUTER ENGINEERING

CERTIFICATE
This is to certify that the Project Entitled

“MATRIX MULTIPLICATION (DAA)”

Submitted by
STUDENT NAME          Roll no.
Kartikay Dhakad       BEA29
Ashish Auti           BEA12
Nidhi Bachuwar        BEA13
is a bona fide work carried out by the students under the supervision of Dr. Megha V. Kadam, and it is
submitted towards the partial fulfillment of the requirement of the Mini Project of Laboratory Practice
III (410246) for Bachelor of Engineering (BE Computer Engineering).

Dr. Megha V. Kadam                        Prof. Subhash G. Rathod
Subject Teacher                           H.O.D.
Dept. of Computer Engg.                   Dept. of Computer Engg.


Abstract

Matrix multiplication is a fundamental operation in linear algebra and computer science, with applications in various fields such as image processing, scientific simulations, and machine learning. This mini-project aims to implement matrix multiplication using both single-threaded and multithreaded approaches and analyze their performance to understand the benefits and limitations of parallel processing.

For the multithreaded matrix multiplication, we will explore two parallelization strategies: one thread per row and one thread per cell. In the "one thread per row" approach, each row of the output matrix will be computed by a separate thread. In the "one thread per cell" approach, individual threads will calculate each cell of the output matrix in parallel. This will enable us to examine the trade-offs between thread management and computational parallelism.

By conducting this comparative analysis, we aim to provide insights into the advantages and challenges of multithreaded matrix multiplication. Understanding the performance characteristics of these approaches will aid in making informed decisions when dealing with large-scale matrix operations and optimizing software for modern multi-core processors.


INDEX

S. No    TOPIC NAME

1        INTRODUCTION
2        LITERATURE SURVEY
3        OBJECTIVES OF SYSTEM
4        PROBLEM STATEMENT
5        IMPLEMENTATION
6        ALGORITHMS USED
7        RESULTS
8        FUTURE SCOPE
9        CONCLUSION
10       REFERENCES


INTRODUCTION

Matrix multiplication is a computationally intensive operation with numerous applications in fields such as computer graphics, scientific computing, and machine learning. As the size of matrices increases, the time taken to perform matrix multiplication also increases, making it a prime candidate for optimization. Multithreading is a parallel computing technique that can significantly enhance the performance of matrix multiplication by utilizing multiple CPU cores or threads to perform computations concurrently. This project focuses on implementing matrix multiplication using multithreading and evaluates the efficiency of two different thread allocation strategies: one thread per row and one thread per cell.

MULTITHREADED MATRIX MULTIPLICATION:

1. One Thread Per Row (Row-wise Multithreading): In this strategy, each row of the resulting matrix is computed concurrently by a separate thread. The idea behind this approach is to distribute the workload evenly across the available threads, ensuring that each thread is responsible for multiplying a subset of rows. This method is advantageous when dealing with larger matrices, as it minimizes the overhead associated with thread creation and synchronization. However, it may introduce load imbalance issues if the rows have significantly different computational complexity.

2. One Thread Per Cell (Cell-wise Multithreading): Cell-wise multithreading takes a different approach by allocating one thread to each cell (element) in the resulting matrix. Each thread is responsible for calculating a single cell's value, which involves the summation of products from the corresponding row and column in the input matrices. This approach offers fine-grained parallelism and is beneficial when dealing with smaller matrices or in situations where load balance is a concern. However, it may introduce higher synchronization overhead due to the large number of threads involved.
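As a concrete comparison, for the 200 x 200 matrices used later in this report, the row-wise strategy creates 200 threads (one per output row), while the cell-wise strategy creates 200 x 200 = 40,000 threads (one per output cell). This difference in thread count is what drives the overhead trade-off examined in the Results section.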


LITERATURE SURVEY

Matrix multiplication is a fundamental operation in linear algebra, and its efficient implementation has garnered significant attention in both computer science and numerical mathematics. Various algorithms and techniques have been developed over the years to optimize the performance of matrix multiplication. The traditional approach for matrix multiplication involves a straightforward triple-nested loop, which results in a time complexity of O(n^3) for multiplying two n x n matrices. Strassen's algorithm, developed by Volker Strassen in 1969, was a breakthrough as it reduced the time complexity to O(n^2.81). This algorithm inspired further research into optimizing matrix multiplication for various hardware architectures.

In the context of multithreaded matrix multiplication, parallelism is a key factor to improve performance. Researchers have explored different strategies for parallelizing matrix multiplication. One common approach is to assign one thread per row of the resulting matrix. This allows for efficient partitioning of the computation, but it may not fully leverage the available hardware resources, especially in scenarios where the number of rows significantly exceeds the number of processor cores.

Another strategy is to assign one thread per cell or element of the resulting matrix,
distributing the workload more evenly. However, this approach introduces more
complex synchronization and thread management, potentially incurring additional
overhead. The performance trade-offs between these two multithreading strategies
have been a subject of investigation in the literature.

Notable research has also focused on the hardware-level optimizations for matrix
multiplication, such as cache-aware algorithms and techniques designed to exploit
the parallelism of modern processors. Matrix multiplication is a pivotal operation
in scientific and engineering applications, making the study of efficient
implementation techniques of utmost importance. In this report, we aim to explore
and compare the performance of both single-threaded and multithreaded matrix
multiplication using the one-thread-per-row and one-thread-per-cell approaches.


OBJECTIVES OF SYSTEM

The objective of this mini project is to implement matrix multiplication using two
different multithreading approaches: one thread per row and one thread per cell.
The performance of both approaches will be analyzed and compared to determine
which one is more efficient in terms of execution time for a given matrix size. The
primary goals are as follows:

• Implement matrix multiplication in a single-threaded environment.
• Implement multithreaded matrix multiplication using two approaches: one
thread per row and one thread per cell.
• Analyze and compare the performance of the single-threaded and
multithreaded implementations.
• Understand the impact of multithreading on matrix multiplication efficiency.
• Identify scenarios where multithreading offers significant performance
improvements.


PROBLEM STATEMENT

The objective of this project is to create a Python program to perform matrix multiplication and assess the efficiency of multithreading as a means to optimize the process.
The project will explore two distinct multithreading strategies: one in which each
thread computes the product of a row in the result matrix, and the other in which
each thread computes the product of a single cell. By comparing these two
approaches, we aim to gain insights into the performance improvements that
multithreading can offer in the context of matrix multiplication.


IMPLEMENTATION

# Python code for matrix multiplication with both single-threaded and
# multithreaded implementations.

import numpy as np
import time
import threading

# Generate random matrices
matrix_size = 200
matrix_A = np.random.rand(matrix_size, matrix_size)
matrix_B = np.random.rand(matrix_size, matrix_size)

# Single-threaded matrix multiplication
def single_threaded_matrix_multiply(A, B):
    start_time = time.time()
    result = np.dot(A, B)
    end_time = time.time()
    return result, end_time - start_time

# Multithreaded matrix multiplication (one thread per row)
def multithreaded_matrix_multiply_rowwise(A, B, num_threads):
    start_time = time.time()
    result = np.zeros((matrix_size, matrix_size))

    def multiply_rows(start, end):
        # Each thread fills a contiguous block of output rows.
        # NumPy's dot releases the GIL, so these calls can overlap.
        for i in range(start, end):
            result[i] = np.dot(A[i], B)

    threads = []
    chunk_size = matrix_size // num_threads
    for i in range(0, matrix_size, chunk_size):
        # min() guards the last chunk when matrix_size is not an exact
        # multiple of num_threads.
        thread = threading.Thread(target=multiply_rows,
                                  args=(i, min(i + chunk_size, matrix_size)))
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()
    end_time = time.time()
    return result, end_time - start_time

# Multithreaded matrix multiplication (one thread per cell)
def multithreaded_matrix_multiply_cellwise(A, B, num_threads):
    # Note: num_threads is kept for interface symmetry but is not used here;
    # one thread is created for every output cell.
    start_time = time.time()
    result = np.zeros((matrix_size, matrix_size))

    def multiply_cell(row, col):
        # Dot product of one row of A with one column of B.
        result[row, col] = np.dot(A[row], B[:, col])

    threads = []
    for i in range(matrix_size):
        for j in range(matrix_size):
            thread = threading.Thread(target=multiply_cell, args=(i, j))
            threads.append(thread)
            thread.start()
    for thread in threads:
        thread.join()
    end_time = time.time()
    return result, end_time - start_time

# Perform the matrix multiplications
single_result, single_time = single_threaded_matrix_multiply(matrix_A, matrix_B)

num_threads = 4  # You can adjust this based on your CPU cores
rowwise_result, rowwise_time = multithreaded_matrix_multiply_rowwise(
    matrix_A, matrix_B, num_threads)
cellwise_result, cellwise_time = multithreaded_matrix_multiply_cellwise(
    matrix_A, matrix_B, num_threads)

# Print the execution times
print("Single-threaded execution time:", single_time, "seconds")
print("Multithreaded (one thread per row) execution time:", rowwise_time, "seconds")
print("Multithreaded (one thread per cell) execution time:", cellwise_time, "seconds")

# Output: the measured execution times for the three approaches are reported
# in the Results section.


ALGORITHMS USED

Single-Threaded Matrix Multiplication:


1. Initialize two input matrices, A and B, with appropriate dimensions.
• You start by creating two matrices with the required dimensions, matrix
A and matrix B.
2. Initialize an empty result matrix, C, with the same dimensions.
• You create a result matrix C with the same dimensions as matrices A and
B, which will store the final multiplication result.
3. For each row i in matrix A and for each column j in matrix B:
• You loop through each row of matrix A (i) and each column of matrix B
(j).
• Compute the dot product of the i-th row of matrix A and the j-th column
of matrix B.
• In this step, you calculate the dot product by multiplying the
corresponding elements of the i-th row of A and the j-th column of
B and summing the results.
• Store the result in the corresponding cell (i, j) of matrix C.
• You save the computed dot product in the corresponding cell (i, j)
of the result matrix C.
4. Return the resulting matrix C.
• Finally, you return the matrix C as the result of the matrix multiplication.
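
As a point of reference for these steps, the following is a minimal sketch of the algorithm written as an explicit triple loop over plain Python lists (the Implementation section instead relies on NumPy's np.dot); the function name and the list-based representation are illustrative choices:

def single_threaded_multiply(A, B):
    # A is n x m and B is m x p, so the result C is n x p.
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]        # Step 2: empty result matrix
    for i in range(n):                       # Step 3: each row i of A ...
        for j in range(p):                   # ... paired with each column j of B
            total = 0.0
            for k in range(m):               # dot product of row i and column j
                total += A[i][k] * B[k][j]
            C[i][j] = total                  # store in cell (i, j)
    return C                                 # Step 4: return the result matrix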

Multithreaded Matrix Multiplication (One Thread Per Row):

1. Initialize two input matrices, A and B, with appropriate dimensions.
2. Initialize an empty result matrix, C, with the same dimensions.
3. Divide the rows of matrix A into equal-sized chunks for each thread.
• In this step, you divide the rows of matrix A into equal-sized chunks,
allocating each chunk to a separate thread. This allows parallel
computation of matrix multiplication.


4. Create multiple threads, one for each row chunk, and assign each thread to
compute the product of its assigned rows.
• You create multiple threads, one for each row chunk, and assign each
thread the task of computing the product of its assigned rows.
5. In each thread, compute the dot product of the assigned rows of matrix A with
the entire matrix B.
• Each thread performs the dot product calculation by multiplying the
assigned rows of A with the entire matrix B.
6. Store the results in the corresponding rows of matrix C.
• The results from each thread are stored in the corresponding rows of the
result matrix C.
7. Ensure proper synchronization to avoid race conditions.
• Proper synchronization mechanisms are implemented to ensure that
multiple threads do not access and modify the same memory locations
simultaneously, which helps prevent race conditions.
8. Return the resulting matrix C.
• Finally, you return the matrix C as the result of the parallel matrix
multiplication operation.
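
A minimal sketch of steps 3 to 8 using Python's threading module is shown below; it mirrors the row-wise function in the Implementation section, and the helper names and the explicit guard for a last, smaller chunk are illustrative choices rather than part of the original code:

import threading

def multiply_rows_chunk(A, B, C, start, end):
    # Steps 5-6: this thread fills output rows start..end-1 of C.
    m, p = len(B), len(B[0])
    for i in range(start, end):
        for j in range(p):
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(m))

def rowwise_threaded_multiply(A, B, num_threads=4):
    n, p = len(A), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    chunk = (n + num_threads - 1) // num_threads        # Step 3: equal-sized row chunks
    threads = []
    for start in range(0, n, chunk):                    # Step 4: one thread per chunk
        t = threading.Thread(target=multiply_rows_chunk,
                             args=(A, B, C, start, min(start + chunk, n)))
        threads.append(t)
        t.start()
    for t in threads:                                   # Step 7: join() is the only
        t.join()                                        # synchronization needed here
    return C                                            # Step 8

Because every thread writes a disjoint set of rows of C, no two threads touch the same cells, so waiting on join() is sufficient and no locks are required.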

Multithreaded Matrix Multiplication (One Thread Per Cell):


1. Initialize two input matrices, A and B, with appropriate dimensions.
2. Initialize an empty result matrix, C, with the same dimensions.
3. Create multiple threads, one for each cell in the resulting matrix C (i, j).
• Instead of dividing the work by rows, you create multiple threads, one
for each cell (i, j) in the resulting matrix C.
4. In each thread, compute the dot product of the i-th row of matrix A with the j-
th column of matrix B.
• Each thread calculates the dot product for its assigned cell by multiplying
the i-th row of A with the j-th column of B.
5. Store the result in the corresponding cell (i, j) of matrix C.
• The result of each dot product calculation is stored in the corresponding
cell (i, j) of the result matrix C.


6. Ensure proper synchronization to avoid race conditions.
• As in the previous parallel algorithm, you ensure proper synchronization
to prevent race conditions when multiple threads are writing to the same
result matrix.
7. Return the resulting matrix C.
• Finally, you return the matrix C as the result of the parallel matrix
multiplication operation.
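
The same steps can be sketched with one thread per output cell; this is a simplified, list-based variant of the cell-wise function in the Implementation section (names are illustrative), and it makes the total thread count explicit:

import threading

def cellwise_threaded_multiply(A, B):
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]

    def multiply_cell(i, j):
        # Step 4: dot product of row i of A with column j of B.
        C[i][j] = sum(A[i][k] * B[k][j] for k in range(m))

    # Step 3: one thread per output cell, i.e. n * p threads in total,
    # which is the source of the overhead discussed in the Results section.
    threads = [threading.Thread(target=multiply_cell, args=(i, j))
               for i in range(n) for j in range(p)]
    for t in threads:
        t.start()
    for t in threads:        # Step 6: join() acts as the synchronization point
        t.join()
    return C                 # Step 7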


RESULTS

1. Single-Threaded Execution:

• The single-threaded execution of matrix multiplication with a matrix size of 200x200 took approximately 0.0136 seconds.
• The resulting matrix C from the single-threaded computation is available for reference.

2. Multithreaded Execution (One Thread Per Row):

• The multithreaded execution with one thread per row for the same matrix size (200x200) reduced the execution time to approximately 0.0090 seconds.
• The resulting matrix C from the multithreaded computation (one thread per row) is available for reference.

3. Multithreaded Execution (One Thread Per Cell):

• The multithreaded execution with one thread per cell for the same matrix size (200x200) had a significantly longer execution time of approximately 7.8864 seconds.
• The resulting matrix C from the multithreaded computation (one thread per cell) is available for reference.

Observations:

• Of the three approaches, the one-thread-per-cell execution took the longest time, while the multithreaded approach with one thread per row gave the shortest execution time.
• The multithreaded approach with one thread per cell was notably slower compared to the other methods.
• The use of multithreading, specifically one thread per row, proved to be more efficient than the single-threaded approach, resulting in reduced execution times.
• One thread per cell, while conceptually parallel, led to significantly longer execution times due to the high overhead of thread management and synchronization.


• The scalability of the multithreaded approaches was limited, possibly due to the small matrix size and the overhead associated with creating and managing multiple threads.
• The results suggest that multithreading, when applied correctly (e.g., one thread per row), can improve the efficiency of matrix multiplication, reducing execution times.
• However, the choice of multithreading strategy is crucial, as one thread per cell led to a significant performance degradation.
• For small matrix sizes, the benefits of multithreading may be outweighed by the overhead of thread management.


FUTURE SCOPE

1. Parallelization Optimization: Investigate and implement more advanced parallelization techniques, such as SIMD (Single Instruction, Multiple Data) or GPU acceleration, to further improve the performance of matrix multiplication. This can lead to even faster computations for large matrices.

- Advanced parallelization methods like SIMD and GPU acceleration have the
potential to significantly enhance the performance of matrix multiplication.
Exploring and implementing these techniques can unlock the full potential of
modern hardware, enabling faster and more efficient matrix operations. Research
into optimizing algorithms for these platforms can yield substantial speedup,
especially for large-scale matrices.

2. Dynamic Thread Management: Develop a more intelligent thread management system that can dynamically adjust the number of threads based on the available CPU cores and matrix size. This can help optimize performance for different scenarios.

- Adapting the number of threads to the available hardware resources and problem
size is crucial for achieving optimal performance. Developing a dynamic thread
management system that intelligently allocates and deallocates threads can enhance
efficiency, ensuring that resources are utilized effectively. This would make the
algorithm more adaptable to varying computational environments.
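
As one possible direction (not part of the current implementation), Python's standard concurrent.futures module can size a worker pool from the detected CPU count; the sketch below is illustrative, and the function name and defaults are assumptions:

import os
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def rowwise_multiply_dynamic(A, B, num_threads=None):
    # Size the worker pool from the available CPU cores unless overridden.
    if num_threads is None:
        num_threads = os.cpu_count() or 1
    n = A.shape[0]
    result = np.zeros((n, B.shape[1]))

    def compute_row(i):
        result[i] = np.dot(A[i], B)

    # A fixed pool of workers is reused for all rows instead of creating
    # one thread per row or per cell.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # list() forces completion and surfaces any worker exceptions.
        list(pool.map(compute_row, range(n)))
    return result

Reusing a fixed pool of workers avoids the per-thread creation cost that dominated the one-thread-per-cell results.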

3. Cache Awareness: Explore techniques to make the algorithm cache-aware, ensuring that data is efficiently used from cache memory to minimize memory access times, which can be a significant bottleneck in matrix multiplication.

- Cache-aware optimization is essential for improving memory access patterns and reducing data transfer times. Investigating strategies to maximize cache utilization, such as tiling and data reordering, can significantly reduce memory-related bottlenecks, especially on modern processors with complex memory hierarchies. This can lead to substantial performance gains for both single-threaded and multithreaded implementations.
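
A minimal illustration of the tiling idea mentioned above; the block size and the use of NumPy slices are illustrative assumptions, not part of the current implementation:

import numpy as np

def tiled_multiply(A, B, block=64):
    # Multiply in block x block tiles so each tile of A, B and C can be
    # reused from cache before it is evicted.
    n, m = A.shape
    p = B.shape[1]
    C = np.zeros((n, p))
    for ii in range(0, n, block):
        for kk in range(0, m, block):
            for jj in range(0, p, block):
                C[ii:ii + block, jj:jj + block] += np.dot(
                    A[ii:ii + block, kk:kk + block],
                    B[kk:kk + block, jj:jj + block])
    return C

Each block of C is updated from small tiles of A and B that fit in cache, so the same data is reused many times before it is evicted.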

4. Benchmarking on Different Hardware: Conduct performance benchmarking on a variety of hardware configurations, including multi-core CPUs, GPUs, and distributed systems, to analyze how different architectures affect the efficiency of multithreaded matrix multiplication.

- Extensive benchmarking on diverse hardware configurations is crucial to understand the behavior and efficiency of matrix multiplication algorithms across various computing platforms. By assessing performance on different hardware, we can identify strengths and weaknesses and tailor optimization strategies for specific architectures, making the algorithm more versatile and adaptable for real-world applications.

Incorporating these future scope elements can lead to not only enhanced
performance and efficiency but also a more adaptable and versatile matrix
multiplication algorithm suitable for a wide range of computing environments and
hardware setups.


CONCLUSION

In this project, we explored the efficiency of multithreading in the context of matrix multiplication. Our results showed that the multithreaded approach with one thread per row significantly reduced the execution time compared to the single-threaded approach, demonstrating the potential benefits of parallel computing in matrix operations.
However, the use of one thread per cell exhibited considerable performance
degradation due to high thread management overhead. Furthermore, we found that
the scalability of multithreaded approaches was limited, likely due to the relatively
small matrix size used in our experiments.
In conclusion, the choice of multithreading strategy is crucial in optimizing matrix
multiplication. While multithreading can enhance performance, careful
consideration is required to balance the benefits against the overhead introduced
by managing multiple threads. Future work could focus on optimizing the
parallelization techniques further, considering memory efficiency and cache
awareness, and exploring the application of these techniques in real-time and
distributed computing scenarios.


REFERENCES

[1] Strassen, Volker. "Gaussian elimination is not optimal." Numerische Mathematik 13.4 (1969): 354-356.
[2] Solomonik, Edgar, and James Demmel. "Communication-optimal parallel algorithm for Strassen's matrix multiplication." ACM Transactions on Computer Systems (TOCS) 30.2 (2012): 7.
[3] Wilkinson, J. H. "The Algebraic Eigenvalue Problem." Oxford University Press, 1965.
[4] Reinders, James. "Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor Parallelism." O'Reilly Media, 2007.
