Assignment 2
Submission Instructions:
In this assignment, you will implement federated learning for a linear regression model with a
single independent variable using C/C++ client-server processes. You will learn how to handle
distributed machine learning, model parameter exchange between clients and server, and model
aggregation.
You will first implement a standard linear regression model in C/C++ using a single dataset. You
will combine the nine separate training subsets of the provided dataset and train the model using
gradient descent. After training the model, you will evaluate its performance on the test subset
by calculating the root mean squared error (RMSE).
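The centralized training and evaluation described above can be sketched as follows. This is a minimal illustration, not a required design: the names LinearModel, train_gd, and rmse are hypothetical, the update rule is plain batch gradient descent on mean squared error, and the learning rate and epoch count are placeholders that would need tuning on the real data.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical parameter container for the model y ≈ w * x + b.
struct LinearModel {
    double w;  // weight (slope) for the single independent variable
    double b;  // bias (intercept)
};

// Batch gradient descent on mean squared error.
// learning_rate and epochs are illustrative assumptions, not prescribed values.
LinearModel train_gd(const std::vector<double>& x, const std::vector<double>& y,
                     double learning_rate, int epochs) {
    assert(x.size() == y.size() && !x.empty());
    const double n = static_cast<double>(x.size());
    LinearModel m{0.0, 0.0};
    for (int e = 0; e < epochs; ++e) {
        double grad_w = 0.0;
        double grad_b = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            const double err = (m.w * x[i] + m.b) - y[i];  // prediction error
            grad_w += err * x[i];
            grad_b += err;
        }
        // Gradient of (1/n) * sum(err^2) with respect to w and b.
        m.w -= learning_rate * (2.0 / n) * grad_w;
        m.b -= learning_rate * (2.0 / n) * grad_b;
    }
    return m;
}

// Root mean squared error of the model on a set of (x, y) pairs.
double rmse(const LinearModel& m, const std::vector<double>& x,
            const std::vector<double>& y) {
    assert(x.size() == y.size() && !x.empty());
    double sum_sq = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double err = (m.w * x[i] + m.b) - y[i];
        sum_sq += err * err;
    }
    return std::sqrt(sum_sq / static_cast<double>(x.size()));
}
```

In this sketch, the nine training subsets would be appended into one pair of vectors before calling train_gd, and rmse would then be called with the test subset.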
Steps:
In this part of the assignment, you will design a federated learning framework. This involves
developing both client and server programs in C/C++. The clients will each train a linear
regression model locally on one of the nine training subsets. Once training is complete, the
clients will send the trained model parameters (weight and bias) to the server. The server will
compute a weighted average of the received parameters to create a global model, which will then
be used to make predictions on the test set. The final performance of the model will be evaluated
using RMSE.
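The server-side aggregation step can be sketched as below. The handout does not say what the averaging weights are, so this example assumes each client also reports its local sample count and that the server weights each client's parameters by that count. ClientUpdate, GlobalModel, and aggregate are hypothetical names, and the network transport between client and server is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of what one client reports to the server.
struct ClientUpdate {
    double w;       // locally trained weight (slope)
    double b;       // locally trained bias
    std::size_t n;  // local training sample count (assumed to be sent along)
};

// Hypothetical container for the aggregated global parameters.
struct GlobalModel {
    double w;
    double b;
};

// Sample-count-weighted average of the client parameters.
// Weighting by sample count is an assumption; other weightings are possible.
GlobalModel aggregate(const std::vector<ClientUpdate>& updates) {
    double total = 0.0;
    double w_sum = 0.0;
    double b_sum = 0.0;
    for (const ClientUpdate& u : updates) {
        const double n = static_cast<double>(u.n);
        total += n;
        w_sum += n * u.w;
        b_sum += n * u.b;
    }
    assert(total > 0.0);  // at least one client with data is required
    return GlobalModel{w_sum / total, b_sum / total};
}
```

If all nine subsets have the same number of records, this reduces to a plain average of the nine weights and the nine biases.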
Dataset Description
You are provided with a dataset that contains 10,000 records of students' performance and study
hours. It consists of ten subsets of data, of which nine will be used for training and the tenth
subset will be used as the test set to evaluate the performance of the trained model. Each subset
is provided as a text file containing multiple rows, with each row representing a single student
record. Each record consists of two values:
1. Study Hours: A floating-point number representing the number of hours the student
studied.
2. Performance Index: An integer representing the student's performance index, a score
ranging between 1 and 100 that reflects their overall academic performance.
Example:
SH   PI
3.5  78
5.0  85
2.2  60
7.1  92
4.3  75
The first row represents a student who studied for 3.5 hours and achieved a performance index of
78. The second row shows a student who studied for 5.0 hours and had a performance index of 85,
and so on.
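Rows in this format could be loaded with a small reader such as the one below. This is a sketch under assumptions: Dataset and read_records are hypothetical names, values are assumed to be whitespace-separated, and because it is not stated whether the files contain a header row, any line that does not begin with two numbers is simply skipped.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory form of one subset file.
struct Dataset {
    std::vector<double> hours;  // Study Hours column
    std::vector<double> score;  // Performance Index column (stored as double)
};

// Reads whitespace-separated "hours score" rows from any input stream.
// Lines that do not begin with two numbers (blank lines, a header) are skipped.
Dataset read_records(std::istream& in) {
    Dataset d;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream row(line);
        double h = 0.0;
        double s = 0.0;
        if (row >> h >> s) {
            d.hours.push_back(h);
            d.score.push_back(s);
        }
    }
    assert(d.hours.size() == d.score.size());
    return d;
}
```

To read a subset from disk, an std::ifstream opened on that subset's file can be passed to read_records. Taking a generic std::istream keeps the function easy to test with an std::istringstream.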
Deliverables
• [50 marks] C/C++ code for both the centralized model and the federated learning model.
• [25 marks] A brief report, in PDF format, explaining the design and implementation of
both models with screenshots of code and output. The report should also include a
comparison of the two models and their results.
• [25 marks] A 10-minute screen recording demonstrating the functionality of your
programs. Explain the code and provide a demo of the client-server federated learning
process, including the final RMSE results. Each group member should explain their
individual contribution. Upload the video to Google Drive and include its link in the report.
Note: You will receive zero marks for the entire assignment if any deliverable is missing.