Data Science and Machine Learning Syllabus V1.0
1. Course Description
This course provides a comprehensive introduction to Data Science and Machine Learning. It
aims to equip students with the essential knowledge and practical skills required to analyze
and interpret data, apply machine learning methods, and visualize results.
2. General Objectives
The course is designed with the following general objectives:
• To provide students with a foundational understanding of Data Science and Machine
Learning.
• To familiarize students with techniques for cleaning, transforming, and visualizing data to
uncover patterns and insights.
• To provide the mathematical knowledge, such as statistics and probability, needed for data
analysis and machine learning.
• To introduce supervised learning algorithms, such as linear regression, decision trees, and
support vector machines, and their applications.
4. Methods of Instruction
The course will utilize a mix of lectures, tutorials, case studies, and lab sessions to support
learning. Lectures will deliver core knowledge, while tutorials and case studies will enhance
comprehension. Lab sessions will provide hands-on experience, enabling students to apply
theory to practical, real-world situations. This integrated approach ensures a well-rounded
learning experience, fostering both theoretical insight and practical skills essential for success in
data science and analytics.
5. Case Studies
Students will complete the following case studies and submit their reports:
● Exploratory Data Analysis (Agricultural Commodities): Students will conduct a
comprehensive exploratory data analysis on a dataset related to agricultural commodities. This
will involve analyzing trends, patterns, and correlations to provide insights.
● Supervised Learning (Customer Churn Prediction in Telecommunications): Students will
build and evaluate a supervised learning model to predict customer churn in the
telecommunications industry. The case study will require them to preprocess data, select relevant
features, and apply classification algorithms to identify customers at risk of leaving.
● Anomaly Detection in Real-World Applications: Students will implement anomaly
detection techniques to identify unusual patterns or outliers in a real-world dataset. This case
study will involve applying various anomaly detection methods to solve practical problems such
as fraud detection or system monitoring.
Students are required to submit a detailed report documenting their approach, results, and
analysis.
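As a possible starting point for the anomaly detection case study, the sketch below flags outliers with the interquartile range (IQR) rule, one simple method among many. The readings and the conventional 1.5 multiplier are assumptions chosen for illustration; they are not part of the course material.

```python
import statistics

# Hypothetical sensor readings, invented for illustration only.
readings = [10, 12, 11, 13, 12, 11, 95]

# Quartiles of the data (statistics.quantiles returns the three cut points for n=4).
q1, q2, q3 = statistics.quantiles(readings, n=4)
iqr = q3 - q1

# IQR rule: values beyond 1.5 * IQR from the quartiles are flagged as anomalies.
# The 1.5 multiplier is a common convention, not a fixed requirement.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
anomalies = [x for x in readings if x < lower_fence or x > upper_fence]

print("fences:", lower_fence, upper_fence)
print("anomalies:", anomalies)
```

The IQR rule is used here because it is less sensitive to the extreme values themselves than a mean-and-standard-deviation threshold. For the case study, students would be expected to compare several methods on a real dataset.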
6. List of Tutorials
The following tutorial activities, totaling 15 hours per group of at most 24 students, should
be conducted to cover all the required contents of this course.
S.N. Tutorials
1 ● Using libraries in the programming language of your choice (e.g., pandas for
Python, or R) to manipulate datasets.
● Conducting exploratory data analysis (EDA) on real-world datasets.
● Cleaning and preprocessing data to prepare for modeling.
2 ● Solving problems related to descriptive statistics (mean, median,
mode, variance).
● Applying probability concepts to data science problems.
● Working with probability distributions and sampling techniques.
3 ● Solving problems involving matrix operations and vector calculus.
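The descriptive statistics named in tutorial 2 (mean, median, mode, variance) can be illustrated with a minimal sketch using Python's standard-library `statistics` module. The sample values are invented for illustration. In the tutorials themselves, a library such as pandas or a language such as R may be used instead.

```python
import statistics

# Hypothetical sample, invented for illustration only.
sample = [2, 4, 4, 5, 7, 8]

mean = statistics.mean(sample)               # arithmetic mean
median = statistics.median(sample)           # average of the two middle values here
mode = statistics.mode(sample)               # most frequent value
variance = statistics.variance(sample)       # sample variance (n - 1 denominator)
pop_variance = statistics.pvariance(sample)  # population variance (n denominator)

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("sample variance:", variance)
print("population variance:", pop_variance)
```

Note the two variance functions: the sample variance divides by n - 1 and the population variance divides by n, so the choice depends on whether the data is a sample or the whole population.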
7. Practical Works
S.N. Practical works
1 Conduct an exploratory data analysis (EDA) on a public dataset.
2 Perform data manipulation tasks such as filtering, grouping, and summarizing.
3 Implement and compare different statistical techniques to analyze sample data (e.g.,
hypothesis testing, regression analysis).
4 Clean and preprocess a messy dataset (e.g., handling missing data, encoding
categorical variables, feature scaling).
5 Implement different supervised learning algorithms (e.g., linear regression, decision
trees) on a dataset.
6 Apply clustering techniques (e.g., K-means, hierarchical clustering) on a dataset and
evaluate the clusters.
7 Implement a probabilistic model.
8 Apply anomaly detection methods to a real-world dataset.
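The preprocessing steps listed in practical work 4 (handling missing data, encoding categorical variables, feature scaling) can be sketched in plain Python as follows. The records, the column names `age` and `plan`, and the choice of mean imputation and min-max scaling are assumptions made for illustration. Other strategies may suit a given dataset better, and in practice a library such as pandas would typically handle these steps.

```python
from statistics import mean

# Hypothetical messy records, invented for illustration; None marks a missing value.
rows = [
    {"age": 20, "plan": "basic"},
    {"age": None, "plan": "premium"},
    {"age": 40, "plan": "basic"},
]

# 1. Handle missing data: impute missing ages with the mean of the observed ages.
observed = [r["age"] for r in rows if r["age"] is not None]
fill_value = mean(observed)
ages = [r["age"] if r["age"] is not None else fill_value for r in rows]

# 2. Feature scaling: min-max scale the ages to the range [0, 1].
lo, hi = min(ages), max(ages)
scaled_ages = [(a - lo) / (hi - lo) for a in ages]

# 3. Encode categorical variables: one-hot encode the plan column.
categories = sorted({r["plan"] for r in rows})
one_hot = [[1 if r["plan"] == c else 0 for c in categories] for r in rows]

print("scaled ages:", scaled_ages)
print("categories:", categories)
print("one-hot rows:", one_hot)
```

The order matters here: imputation comes before scaling so that the minimum and maximum are computed over a complete column.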
Reference Books
McKinney, W. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython,
Second Edition, O'Reilly Media.