Assignment 1 (Fall 2024)

Uploaded by

20208046

Cairo University

Faculty of Computers and Artificial Intelligence

Machine Learning Course

Assignment 1: Linear and Logistic Regression

Most vehicles on the road today use petrol or diesel fuel engines. When these
fuels burn, they release energy that powers the vehicle. However, this
process also produces waste in the form of carbon dioxide gas (CO2). The
amount of CO2 emission is a basic indication of a vehicle's impact on the
environment and air quality. So, with climate change at the forefront of global
issues, it is crucial to understand and predict the vehicles’ CO2 emission
amounts.

In this assignment, you are required to build a linear regression model to
predict the CO2 emission amount and a logistic regression model to predict
the emission class, both based on the vehicles’ data.

Dataset:
The attached dataset “co2_emissions_data.csv” contains over 7000 records
of vehicles' data with 11 feature columns in addition to 2 target columns. The
features are: the vehicle’s make, model, size, engine size, number of cylinders,
transmission type, fuel type, and some fuel consumption ratings columns.
The targets are the CO2 emission amount (in g/km) and the emission class.
Note: This dataset is a modified version of the “CO2 Emission by Vehicles”
dataset. The original dataset was obtained from Kaggle.
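As a rough illustration of loading and inspecting a dataset like the one described above, here is a minimal sketch. The three-row CSV text, its column names, and the class labels are hypothetical stand-ins, since the real header of "co2_emissions_data.csv" is not shown in this document.

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for "co2_emissions_data.csv"; the column
# names and class labels are assumptions, not the real contents of the file.
csv_text = """Make,Engine Size(L),Cylinders,Fuel Type,CO2 Emissions(g/km),Emission Class
ACURA,2.0,4,Z,196,MODERATE
BMW,3.0,6,Z,255,HIGH
FORD,,4,X,180,MODERATE
"""

# With the real file this would be: df = pd.read_csv("co2_emissions_data.csv")
df = pd.read_csv(io.StringIO(csv_text))

# Count the missing values in each column.
missing_per_column = df.isnull().sum()

# Keep only numeric columns and summarize their ranges to compare scales.
numeric_df = df.select_dtypes(include="number")
scale_summary = numeric_df.describe().loc[["min", "max", "mean", "std"]]

# Pairwise correlations between numeric columns (the data behind a heatmap).
correlation = numeric_df.corr()

print(missing_per_column)
print(scale_summary)
print(correlation)
```

For the plots, one option is seaborn: `seaborn.pairplot(df, diag_kind="hist")` draws histograms on the diagonal, and `seaborn.heatmap(correlation, annot=True)` renders the correlation matrix.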
Requirements:
Write a Python program in which you do the following:
a) Load the "co2_emissions_data.csv" dataset.
b) Perform analysis on the dataset to:
i) check whether there are missing values
ii) check whether numeric features have the same scale
iii) visualize a pairplot in which diagonal subplots are histograms
iv) visualize a correlation heatmap between numeric columns
c) Preprocess the data such that:
i) the features and targets are separated
ii) categorical features and targets are encoded
iii) the data is shuffled and split into training and testing sets
iv) numeric features are scaled
d) Implement linear regression using gradient descent from scratch to
predict the CO2 emission amount.
-> Based on the correlation heatmap, select two features to be the
independent variables of your model. Those two features should have a
strong relationship with the target but not a strong relationship with
each other (i.e. they should not be redundant).
-> Calculate the cost in every iteration and illustrate (with a plot) how
the error of the hypothesis function improves with every iteration of
gradient descent.
-> Evaluate the model on the test set using Scikit-learn’s R2 score.
e) Fit a logistic regression model to the data to predict the emission class.
-> Use the two features that you previously used to predict the CO2
emission amount.
-> The logistic regression model should be a stochastic gradient
descent classifier.
-> Calculate the accuracy of the model using the test set.
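To make requirement (d) more concrete, the following is a minimal sketch of batch gradient descent for a two-feature linear regression, assuming a mean squared error cost with the usual 1/(2m) factor. The function name `gradient_descent`, the learning rate, the iteration count, and the synthetic data are all assumptions chosen for illustration. They are not part of the assignment.

```python
import numpy as np


def gradient_descent(X, y, learning_rate=0.1, n_iterations=500):
    """Fit y ~ X @ theta + bias by batch gradient descent on the MSE cost."""
    m, n = X.shape
    theta = np.zeros(n)
    bias = 0.0
    cost_history = []
    for _ in range(n_iterations):
        predictions = X @ theta + bias
        errors = predictions - y
        # Cost J = (1 / 2m) * sum(errors^2), recorded before the update.
        cost_history.append((errors @ errors) / (2 * m))
        # Step against the gradients of J with respect to theta and bias.
        theta -= learning_rate * (X.T @ errors) / m
        bias -= learning_rate * errors.mean()
    return theta, bias, cost_history


# Synthetic stand-in for two scaled features and a CO2 target (assumed data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(scale=0.1, size=200)

theta, bias, cost_history = gradient_descent(X, y)
print("theta:", theta, "bias:", bias)
print("first cost:", cost_history[0], "last cost:", cost_history[-1])
```

Plotting `cost_history` against the iteration index gives the curve that requirement (d) asks for. For requirement (e), scikit-learn's `SGDClassifier` configured with a logistic loss is one option that matches the wording "stochastic gradient descent classifier".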
Remarks:
● You can use functions from data analysis and computing libraries (e.g.
Pandas and NumPy) as you please throughout the entire code.
● You can use machine learning libraries such as Scikit-learn for
preprocessing and metrics but NOT for "from scratch" requirements.

● The train/test split has to be performed before the feature scaling step.
● The numeric features of the test set should be scaled using the same
statistics that were computed from the train set to scale it.
● You should use the R2 score to evaluate the linear regression model, as it
measures how well the model replicates the observed outcomes. The best
possible score is 1, but the score can be negative, because a model can be
arbitrarily worse than one that always predicts the mean of the target.
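The remarks above about split order, train-set statistics, and the R2 score can be sketched as follows. This is only an illustration on synthetic data. The feature values, the 80/20 split, and the choice of `StandardScaler` are assumptions, and other scalers would follow the same fit-on-train, transform-on-test pattern.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for two numeric features and a target (assumed data).
rng = np.random.default_rng(0)
X = rng.normal(loc=[3.0, 10.0], scale=[1.0, 4.0], size=(100, 2))
y = 40.0 * X[:, 0] + 5.0 * X[:, 1] + rng.normal(size=100)

# 1) Shuffle and split first, before any scaling.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=0
)

# 2) Learn the scaling statistics from the training set only...
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
# ...and reuse those same statistics on the test set.
X_test_scaled = scaler.transform(X_test)

# R2 is 1 for perfect predictions, 0 for always predicting the mean of the
# true values, and negative for predictions that are worse than that.
perfect = r2_score(y_test, y_test)
mean_only = r2_score(y_test, np.full_like(y_test, y_test.mean()))
worse = r2_score(y_test, -y_test)
print(perfect, mean_only, worse)
```

Fitting the scaler on the full dataset before splitting would let information from the test set leak into training, which is why the split comes first.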

Deliverables:
● You are required to submit ONE zip file containing the following:
○ Your code (.py) file.
If you have a (.ipynb) file, you have to save/download it as (.py)
before submitting.
○ A report (.pdf) containing the team members' names and IDs, and
the code of each requirement with screenshots of the output of
each part.

● The zip file MUST follow this naming convention:

Group_A1_ID1_ID2_ID3_ID4
Submission Remarks:
● The maximum number of students in a team is 5 and the minimum is 3.
● Team members must be from the same lab (or have the same TA).
● All team members must understand all parts of the code.
● No late submission is allowed.
● Stick to uploading ONLY the required files following the naming
convention: Group_A1_ID1_ID2_ID3_ID4
● A penalty will be imposed for violating any of the assignment rules.
● Cheaters will get ZERO and no excuses will be accepted as per the
“Plagiarism Scope” document.

Grading Criteria:
Both the code and the report must include:
● Analysis: 4 marks (1 mark per step)
● Preprocessing: 4 marks (1 mark per step)
● Linear Regression:
○ Selected features: 1 mark
○ Gradient descent: 3 marks
○ Cost function and plot: 1 mark
○ R2 score: 1 mark
● Logistic Regression:
○ Classifier: 3 marks
○ Accuracy: 1 mark
The total is 18 marks (will be scaled to 6 marks).
