This document provides instructions for a machine learning assignment to predict air quality using linear regression. Students are asked to:
1. Analyze a dataset containing air quality information to determine the number of features.
2. Train a linear regression model using gradient descent with various learning rates and stochastic gradient descent, reporting coefficients, intercept, and hypothesis function.
3. Implement a function to compute the coefficient of determination and report the score on the training dataset.
4. Make predictions on a test dataset and compute the score, then compare the training and test scores.
In this problem, you are given a dataset (https://github.com/coding-blocks-archives/machine-learning-online-2018/tree/master/assignment_datasets/Regression_Data) containing information about various features on which air quality depends. Download the dataset from the above link and train your linear regression model without using a library function for linear regression.
1. How many features do you observe in the dataset?
2. Use the gradient descent algorithm with various learning rates. Use the change in error as the convergence criterion. Repeat the same part with stochastic gradient descent (batch size = 1).
3. Report the values of the coefficients and the intercept, and the hypothesis function.
4. Implement a function to compute the coefficient of determination. Report the 'Score' on the training dataset.
5. Make predictions on the test dataset and compute the score. Compare your training and test scores. Which one is better?
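
The tasks above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a reference solution. All function names, the default learning rates, the tolerance `tol`, and the epoch limits are hypothetical choices. The demo at the end uses synthetic stand-in data, not the assignment dataset, so loading the real files and choosing learning rates are left to you. The error is taken to be the mean squared error, and the SGD variant checks convergence once per epoch; both are assumptions, since the assignment only says "change in error".

```python
import numpy as np

def add_bias(X):
    """Prepend a column of ones so that theta[0] acts as the intercept."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def mse(Xb, y, theta):
    """Mean squared error of the hypothesis h(x) = Xb @ theta."""
    residual = Xb @ theta - y
    return float(residual @ residual) / len(y)

def predict(X, theta):
    """Evaluate the hypothesis on raw (un-augmented) features."""
    return add_bias(X) @ theta

def batch_gradient_descent(X, y, lr=0.1, tol=1e-10, max_epochs=10000):
    """Full-batch gradient descent; stops when the change in error drops below tol."""
    Xb = add_bias(X)
    theta = np.zeros(Xb.shape[1])
    prev_error = mse(Xb, y, theta)
    for _ in range(max_epochs):
        gradient = (2.0 / len(y)) * (Xb.T @ (Xb @ theta - y))
        theta -= lr * gradient
        error = mse(Xb, y, theta)
        if abs(prev_error - error) < tol:
            break
        prev_error = error
    return theta

def stochastic_gradient_descent(X, y, lr=0.01, tol=1e-10, max_epochs=1000, seed=0):
    """SGD with batch size 1; the change in error is checked once per epoch."""
    rng = np.random.default_rng(seed)
    Xb = add_bias(X)
    theta = np.zeros(Xb.shape[1])
    prev_error = mse(Xb, y, theta)
    for _ in range(max_epochs):
        for i in rng.permutation(len(y)):  # visit samples in random order
            gradient = 2.0 * (Xb[i] @ theta - y[i]) * Xb[i]
            theta -= lr * gradient
        error = mse(Xb, y, theta)
        if abs(prev_error - error) < tol:
            break
        prev_error = error
    return theta

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (NOT the assignment dataset): y = 3 + 2*x1 - x2.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 2))
y_demo = 3.0 + 2.0 * X_demo[:, 0] - X_demo[:, 1]

theta_demo = batch_gradient_descent(X_demo, y_demo)
print("intercept:", theta_demo[0], "coefficients:", theta_demo[1:])
print("training R^2:", r2_score(y_demo, predict(X_demo, theta_demo)))
```

To compare learning rates, one option is to call `batch_gradient_descent` in a loop over several values of `lr` and record the final error for each. If the real features are on very different scales, standardizing them first may be needed for the larger learning rates to converge. The test score can be computed the same way as the training score, by passing the test features to `predict` with the `theta` learned on the training set.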