CSC 603 - Final Project

The document outlines requirements for a machine learning project, including choosing a dataset, preprocessing data, building models, evaluating performance, and tuning hyperparameters. Deliverables are code files, a technical report with abstract, introduction, methods, results and conclusion sections, and a presentation.
University of Tabuk

Faculty of Computers and Information Technology


Department of Information Systems
First Semester 1443
CSC-603 Machine Learning/ Master of Science- AI Program
Machine Learning Project
Due date: 14/5/2022

Overview

Machine learning projects focus mainly on machine learning algorithms and their evaluation,
implementing or even modifying them to handle real-world problems. However, the
machine learning algorithm can be regarded as the final stage of a set of processes
starting from data collection, data preparation (including data wrangling), data visualization,
and prediction or forecasting, where the project's results can be seen. Our goal in this project
is to delve deeper into these steps, especially the machine learning algorithms, and to practice them
with a well-known dataset.

• The dataset is one of the key aspects of a machine learning project; therefore, the group can
choose one of the following datasets:
1- Network intrusion detection dataset
https://drive.google.com/drive/folders/1sVPshUvHkOBwm0gvLQJgFB4XxoyswwVq?usp=sharing

2- Customer Churn, predict the churn risk rate


https://www.kaggle.com/datasets/undersc0re/predict-the-churn-risk-rate

3- Inconsistent and consistent Amazon reviews: for detecting mismatches between a review's
text and its star rating
https://www.kaggle.com/datasets/yeshmesh/inconsistent-and-consistent-amazon-reviews

4- CUSTOMER_BANKANALYSIS_CLASSIFICATION:
Bank_Termdeposit_Customer_Analysis_Classificationmodel
https://www.kaggle.com/datasets/saikrishjalakam/customer-bankanalysis-classification

5- Amazon Reviews for SA fine-grained


https://www.kaggle.com/datasets/yacharki/amazon-reviews-for-sentianalysis-finegrained-csv

6- Document classification
https://www.kaggle.com/datasets/achrafbribiche/document-classification

7- Document classification2
https://www.kaggle.com/competitions/doc-class/data?select=sol_all1.csv

8- Document classification3
https://www.kaggle.com/datasets/haytemcharraj/document-classification

9- Energy aspects data of household communities


https://www.kaggle.com/datasets/neyyanjayesh/non-parametric-energy-aspects-data

• Projects can be done individually, or in teams of two students. For a two-person group,
group members are responsible for dividing up the work equally and making sure that each
member contributes.

The machine learning project consists of three main steps, as follows:

1. Data pre-processing (5 marks)


• This step includes removing duplicate rows, checking for null values, imputing
missing data, feature scaling (value standardization and/or normalization), encoding
character or string values as numeric values, and visualizing the dataset.

• Make 3 useful graphs that show different features of the data. Write a paragraph in your
report for each plot describing the interesting qualities that your visualization shows.
These must include the following:
o one line chart
o one scatter plot
o one bar chart or histogram
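
The pre-processing operations listed in this step (duplicate removal, null checks, imputation, scaling, and encoding) can be sketched as follows. This is only a minimal, dependency-free illustration: the toy rows, the column meanings, and every function name are hypothetical, and mean imputation is just one possible way to fill in missing data. In the actual project, a data-frame library would normally do this work.

```python
from statistics import mean, pstdev

# Hypothetical toy rows: (age, city, label). None marks a missing value.
rows = [
    (25.0, "Tabuk", 1),
    (25.0, "Tabuk", 1),    # exact duplicate of the first row
    (None, "Riyadh", 0),   # missing age
    (40.0, "Jeddah", 0),
    (31.0, "Riyadh", 1),
]

def drop_duplicates(rows):
    """Remove duplicate rows, keeping the first occurrence and the order."""
    return list(dict.fromkeys(rows))

def count_nulls(rows):
    """Count missing (None) values for each column index."""
    return [sum(r[i] is None for r in rows) for i in range(len(rows[0]))]

def impute_mean(values):
    """Replace None with the mean of the observed values (an assumed strategy)."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def standardize(values):
    """Z-score scaling: zero mean, unit population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max(values):
    """Min-max normalization to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def label_encode(values):
    """Map each distinct string to an integer code (sorted for determinism)."""
    codes = {name: i for i, name in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes

unique_rows = drop_duplicates(rows)
nulls = count_nulls(unique_rows)
ages = impute_mean([r[0] for r in unique_rows])
ages_std = standardize(ages)
ages_norm = min_max(ages)
city_codes, city_map = label_encode([r[1] for r in unique_rows])
```

Standardization and min-max normalization are shown side by side because the step allows either one; which is appropriate depends on the chosen models.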
2. Building Machine Learning Models (10 marks)
1) Selecting and Training Machine Learning Models: you have to choose and implement
at least 5 machine learning algorithms.
2) Evaluating the Models: use evaluation metrics. For classification: Accuracy, Precision,
Recall, F1, Confusion matrix, etc. For regression: Mean Squared Error (MSE), R-squared,
etc.
3) Hyperparameter Tuning: once you have created and evaluated your model, see whether its
accuracy can be improved. This is done by tuning the hyperparameters of
your model.
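
The evaluation and tuning steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a required implementation: the k-nearest-neighbours classifier stands in for any one of the five chosen algorithms, the toy points and labels are invented, the metrics assume binary labels with 1 as the positive class, and the grid of k values is arbitrary. A machine learning library would normally provide these metrics and the search.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, x, k):
    """Predict the majority label among the k nearest training points."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def classification_metrics(y_true, y_pred):
    """Binary metrics; the confusion matrix has true labels as rows."""
    pairs = Counter(zip(y_true, y_pred))
    tp, fp = pairs[(1, 1)], pairs[(0, 1)]
    fn, tn = pairs[(1, 0)], pairs[(0, 0)]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def mse(y_true, y_pred):
    """Mean Squared Error for regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical toy data: two clusters plus one noisy label at (2, 2).
train_X = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0),
           (6.0, 6.0), (6.0, 7.0), (7.0, 6.0)]
train_y = [0, 0, 0, 1, 1, 1, 1]
val_X = [(2.1, 2.1), (1.4, 1.4), (6.5, 6.5), (5.0, 5.0)]
val_y = [0, 0, 1, 1]

# Hyperparameter tuning as a small grid search over k, scored on validation data.
scores = {}
for k in (1, 3, 5):
    preds = [knn_predict(train_X, train_y, x, k) for x in val_X]
    scores[k] = classification_metrics(val_y, preds)["accuracy"]
best_k = max(scores, key=scores.get)  # first k with the highest accuracy
```

Selecting the hyperparameter on a validation set and reporting the final score on a separate test set keeps the test result from being biased by the tuning.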

Deliverables: (5 marks)
1. Code (‘.ipynb’ and ‘.html’ files):

1.1. Try your best to present the data clearly by showing it and its data
types using what we have learned, such as head(), tail(), dtypes, etc.

2. Technical report (‘.pdf’ file):


2.1. Abstract
2.2. Introduction. What problem are you tackling? State the problem and give some
context about the proposed solution.
2.3. Method. What machine learning techniques are you planning to apply?
Describe the training procedure.
2.4. Results and Evaluation. How do you plan to evaluate your machine learning
algorithms? Summarize the results on the validation set and the test set.
2.5. Conclusion
3. Presentation (‘.ppt’ or ‘.pdf’ file)
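
Item 2.4 asks for results on a validation set and a testing set, which means the data must be split before training. A minimal sketch of such a split is given below. The function name, the 60/20/20 proportions, and the fixed seed are assumptions chosen for illustration; the assignment does not prescribe them.

```python
import random

def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=42):
    """Shuffle a copy of rows and cut it into train, validation, and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Hypothetical dataset of 100 row identifiers.
data = list(range(100))
train, val, test = train_val_test_split(data)
```

The three parts are disjoint and together cover the whole dataset, so no row used for training leaks into the reported validation or test results.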
