Data Science and Machine Learning Syllabus V1.0

Pokhara University

Faculty of Science and Technology

Course Code: CMP 336                                       Full marks: 100
Course title: Data Science and Machine Learning (3-1-2)    Pass marks: 45
Nature of the course: Theory & Practical                   Time per period: 1 hour
Year, Semester:……………                                       Total periods: 45
Level: Bachelor                                            Program: BE Software

1. Course Description
This course provides a comprehensive introduction to the fields of Data Science and Machine
Learning, aimed at equipping students with the essential knowledge and practical skills required
to analyze and interpret data, apply machine learning methods, and visualize results.

The course:

● Covers a wide range of topics, including data pre-processing, statistical analysis, machine
learning algorithms, model evaluation, and the application of these techniques to solve
real-world problems.
● Is delivered through hands-on labs and case studies that build a deep understanding of how
to leverage data to make informed decisions.

2. General Objectives
The course is designed with the following general objectives:
• To provide students with a foundational understanding of Data Science and Machine
Learning.
• To familiarize students with techniques for cleaning, transforming, and visualizing data to
uncover patterns and insights.
• To provide the mathematical foundations, such as statistics and probability, needed for data
analysis and machine learning.
• To introduce supervised learning algorithms such as linear regression, decision trees,
and support vector machines, and their applications.
• To expose students to unsupervised learning techniques such as clustering and
dimensionality reduction, and their use in identifying patterns and simplifying data.

3. Contents in Detail
This section lists, for each unit, the specific objectives and the contents to be taught.

Unit I: Introduction to Data Science and Machine Learning (4 Hrs)
Specific Objectives:
● Intends to provide a brief introduction to the field of Data Science and Machine Learning.
● Learn about various domains within Data Science and how they interrelate.
● Helps students understand the significance of data in modern decision-making.
Contents:
1.1 Definition and Overview of Data Science and Machine Learning
1.2 Applications of Data Science in various industries
1.3 Types of Data: Clean Data and Dirty Data
1.4 Data Science, AI, and Machine Learning
Unit II: Data Collection and Preprocessing (7 Hrs)
Specific Objectives:
● Intends to get students well-acquainted with data collection methods and preprocessing
techniques.
● Able to apply various preprocessing techniques to clean and prepare data for analysis.
Contents:
2.1 Different Data Collection Methods for Machine Learning: Surveys, Sensors, Web Scraping,
APIs, Databases
2.2 Data Quality Issues: Missing Data, Noisy Data, Inconsistent Data, Data Transformation
2.3 Techniques for Handling Missing Data
2.4 Data Cleaning Techniques: Handling Outliers, Dealing with Categorical Data, Normalization,
and Standardization
2.5 Dependent and independent variables
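
As an illustration of topics 2.3 and 2.4, the following is a minimal sketch, using only the Python standard library, of mean imputation, min-max normalization, and standardization. The function names and the sample values are assumptions made for this example and are not prescribed by the course.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


raw = [2.0, None, 4.0, 6.0]             # hypothetical column with one missing value
filled = impute_mean(raw)               # None is replaced by the mean of 2, 4 and 6
normalized = min_max_normalize(filled)
standardized = standardize(filled)
```

Mean imputation is only one of several options; median imputation or dropping incomplete rows may suit skewed or sparse data better.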
Unit III: Exploratory Data Analysis (6 Hrs)
Specific Objectives:
● Intends to provide students with the skills to explore and understand data.
● Learn about various EDA techniques to identify patterns and insights in the data.
Contents:
3.1 Introduction to EDA
3.2 Descriptive Statistics: Mean, Median, Mode, Standard Deviation, Variance, Skewness, Kurtosis
3.3 Data Visualization Techniques: Histograms, Box Plots, Scatter Plots, Heatmaps
3.4 Identifying Trends: Mann–Kendall, Spearman's Rank, Sen’s Slope
3.5 Correlations
3.6 Introduction to Hypothesis Testing
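
To make topic 3.4 concrete, here is a minimal sketch of the Mann–Kendall S statistic and Sen's slope for an evenly spaced series, written with the Python standard library. It computes only the two statistics; the significance test that normally accompanies Mann–Kendall is omitted, and the function names and the sample series are assumptions for illustration.

```python
from statistics import median


def mann_kendall_s(series):
    """Mann-Kendall S statistic: sum of sign(x_j - x_i) over all pairs i < j."""
    s = 0
    n = len(series)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # +1, 0 or -1
    return s


def sens_slope(series):
    """Sen's slope: median of the pairwise slopes (x_j - x_i) / (j - i), i < j."""
    n = len(series)
    slopes = [(series[j] - series[i]) / (j - i)
              for i in range(n - 1) for j in range(i + 1, n)]
    return median(slopes)


prices = [10, 12, 11, 15, 17]   # hypothetical yearly commodity prices
s = mann_kendall_s(prices)      # positive S suggests an upward trend
slope = sens_slope(prices)      # robust estimate of change per time step
```

A positive S together with a positive slope points to an increasing trend; whether the trend is statistically significant requires the variance of S, which this sketch leaves out.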
Unit IV: Data Engineering (5 Hrs)
Specific Objectives:
● Intends to provide students with the skills to explore and understand data.
● Learn about various EDA techniques to identify patterns and insights in the data.
Contents:
4.1 Data Pipeline, Design and Monitoring
4.2 Extract, Transform and Load (ETL)
4.3 Feature Engineering
4.5 Feature Selection
4.6 Dimensionality Reduction: PCA, LDA

Unit V: Introduction to Machine Learning (9 Hrs)
Specific Objectives:
● Helps students learn how to implement basic machine learning models.
● Able to differentiate between various machine learning algorithms and their applications.
Contents:
5.1 Definition and Types of Machine Learning: Supervised, Unsupervised Learning, Reinforcement
Learning
5.2 Overview of the Machine Learning
5.3 Supervised Learning: Linear Regression, Logistic Regression, Decision Trees, Random Forest,
k-NN, Support Vector Machines (SVM)
5.4 Unsupervised Learning: k-Means Clustering, Hierarchical Clustering methods
Key Concepts: Training, Testing, Validation, Overfitting, Underfitting
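
The idea of training on one set of points and testing on another (topic 5.3 and the key concepts above) can be sketched with a small k-nearest neighbours classifier. This is a minimal standard-library illustration, not a replacement for a library implementation; the function name, the toy points, and the labels are assumptions for the example.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the majority label among the k training points nearest to query."""
    by_distance = sorted(zip(train_points, train_labels),
                         key=lambda pair: dist(pair[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training set: two well-separated groups in two dimensions.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["low", "low", "low", "high", "high", "high"]

# Held-out test set, never used when "fitting" (k-NN simply stores the data).
test_points = [(1.1, 1.0), (5.1, 5.0)]
test_labels = ["low", "high"]

predictions = [knn_predict(train_points, train_labels, p) for p in test_points]
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
```

Choosing k too small tends toward overfitting (the prediction follows individual noisy points), while a very large k tends toward underfitting.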
Unit VI: Anomaly Detection (4 Hrs)
Specific Objectives:
● Apply basic and machine learning methods for detecting anomalies.
Contents:
6.1 Definition
6.2 Types: point, contextual, collective
6.3 Applications
6.4 Techniques for Anomaly Detection
6.4.1 Statistical Methods
6.4.2 Distance-based Methods
6.4.3 Density-based Methods
6.4.4 Clustering-based Methods
6.4.5 Common Methods (one-class classification, isolation forest)
6.5 Anomaly Detection in High-Dimensional Data
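
A statistical method of the kind listed under 6.4.1 can be sketched as a z-score rule for point anomalies. The example below is a minimal illustration using the Python standard library; the function name, the threshold of 2.0, and the sample readings are assumptions chosen for this sketch.

```python
from statistics import mean, pstdev


def zscore_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose absolute z-score exceeds threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]


readings = [10, 11, 9, 10, 12, 10, 11, 50]   # hypothetical sensor readings
anomalies = zscore_anomalies(readings)       # flags the reading of 50
```

Because an extreme value also inflates the mean and standard deviation it is compared against, robust variants (for example, rules based on the median or the interquartile range) are often preferred on small samples.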
Unit VII: Model Evaluation and Optimization (6 Hrs)
Specific Objectives:
● Intends to get students well-acquainted with model evaluation techniques.
● Able to make use of various optimization techniques to improve model performance.
Contents:
7.1 Confusion Matrix
7.2 Evaluation Metrics
7.2.1 Supervised: Accuracy, Precision, Recall, F1 Score, ROC Curve, AUC, True Positive Rate,
False Positive Rate, MSE, MAE, RMSE
7.2.2 Unsupervised: Purity, Rand Index, Silhouette Coefficient, Dunn Index
7.3 Cross-Validation Techniques
7.4 Hyperparameter Tuning: Grid Search, Random Search
7.5 Model Selection Techniques: Bias-Variance Trade-off, Ensemble Methods (Bagging and Boosting)
7.6 SMOTE Technique to Handle Imbalance
7.7 Time & Space Complexity of Machine Learning Models
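
The relationship between the confusion matrix (7.1), the supervised metrics (7.2.1), and cross-validation splits (7.3) can be sketched as follows. This is a minimal standard-library illustration for binary classification; the function names, the contiguous (unshuffled) fold layout, and the sample labels are assumptions for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall (true positive rate) and F1 score."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical ground-truth labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # hypothetical model predictions
metrics = classification_metrics(y_true, y_pred)
folds = list(k_fold_indices(len(y_true), 3))
```

In practice the data are usually shuffled (or stratified by class) before splitting into folds; the contiguous layout here is kept only to make the indices easy to follow.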
Unit VIII: Ethical and Legal Considerations in Data Science (4 Hrs)
Specific Objectives:
● Helps students understand the ethical implications and legal considerations in data science.
Contents:
8.1 Data Privacy and Security
8.2 Ethical Issues in Data Science: Bias, Transparency, Accountability
8.3 Legal Considerations: Data Protection Laws, Intellectual Property

4. Methods of Instruction
The course will utilize a mix of lectures, tutorials, case studies, and lab sessions to support
learning. Lectures will deliver core knowledge, while tutorials and case studies will enhance
comprehension. Lab sessions will provide hands-on experience, enabling students to apply
theory to practical, real-world situations. This integrated approach ensures a well-rounded
learning experience, fostering both theoretical insight and practical skills essential for success in
data science and analytics.

5. Case Studies
Students will complete the following case studies and submit their reports:
● Exploratory Data Analysis (Agricultural Commodities): Students will conduct a
comprehensive exploratory data analysis on a dataset related to agricultural commodities. This
will involve analyzing trends, patterns, and correlations to provide insights.
● Supervised Learning (Customer Churn Prediction in Telecommunications): Students will
build and evaluate a supervised learning model to predict customer churn in the
telecommunications industry. The case study will require them to preprocess data, select relevant
features, and apply classification algorithms to identify customers at risk of leaving.
● Anomaly Detection in Real-World Applications: Students will implement anomaly
detection techniques to identify unusual patterns or outliers in a real-world dataset. This case
study will involve applying various anomaly detection methods to solve practical problems such
as fraud detection or system monitoring.
Students are required to submit a detailed report documenting their approach, results, and
analysis.

6. List of Tutorials

The following tutorial activities, totalling 15 hours per group of a maximum of 24 students,
should be conducted to cover all the required contents of this course.

S.N. Tutorials
1 ● Using libraries or tools of your choice (e.g., pandas, R) to
manipulate datasets.
● Conducting exploratory data analysis (EDA) on real-world datasets.
● Cleaning and preprocessing data to prepare for modeling.
2 ● Solving problems related to descriptive statistics (mean, median,
mode, variance).
● Applying probability concepts to data science problems.
● Working with probability distributions and sampling techniques.
3 ● Solving problems involving matrix operations and vector calculus.
4 ● Applying linear algebra concepts to data transformations.
● Implementing supervised models such as linear regression, decision trees,
and k-nearest neighbors.
● Implementing unsupervised models such as k-means and hierarchical clustering.
● Implementing anomaly detection for real-world data.
● Understanding the concept of overfitting and underfitting through
practical examples.
● Hyperparameter tuning and model evaluation techniques.
6 ● Creating visualizations using Matplotlib and Seaborn.
● Visualizing complex datasets and interpreting the results.
● Building dashboards using tools like Plotly or Dash.

7 ● Implementing a complete machine learning pipeline from data
collection to model deployment.
● Working on real-world datasets and competitions (e.g., Kaggle).
● Understanding the ethical implications and bias in machine learning.

7. Practical Works
S.N. Practical works
1 Conduct an exploratory data analysis (EDA) on a public dataset.
2 Perform data manipulation tasks such as filtering, grouping, and summarizing.
3 Implement and compare different statistical techniques to analyze sample data (e.g.,
hypothesis testing, regression analysis).
4 Clean and preprocess a messy dataset (e.g., handling missing data, encoding
categorical variables, feature scaling).
5 Implement different supervised learning algorithms (e.g., linear regression, decision
trees) on a dataset.
6 Apply clustering techniques (e.g., K-means, hierarchical clustering) on a dataset and
evaluate the clusters.
7 Implement a probabilistic model.
8 Apply anomaly detection methods to a real-world dataset.

8. Evaluation system and Students’ Responsibilities

Evaluation System
In addition to the formal exam(s) conducted by the Office of the Controller of Examination of
Pokhara University, the internal evaluation of a student may consist of class attendance, class
participation, quizzes, assignments, presentations, written exams, etc. The tabular presentation of
the evaluation system is as follows.

External Evaluation         Marks    Internal Evaluation                   Marks
Semester-End Examination    50       Class attendance and participation    5
                                     Lab, Case study and Viva              15
                                     Internal Term Exam                    30
Total External              50       Total Internal                        50
Full Marks: 50 + 50 = 100

Students’ Responsibilities:
Each student must secure at least 45% marks in the internal evaluation, with 80% attendance in
class, to appear in the Semester-End Examination. A student who fails to meet these requirements
will be given NOT QUALIFIED (NQ) and will not be eligible to appear in the Semester-End
Examination. Students are advised to attend all classes and complete all assignments within the
specified time period. If a student does not attend a class, it is the student's sole
responsibility to cover the topics taught during that period. If a student fails to attend a
formal exam, quiz, test, etc., there will be no provision for a re-exam.

9. Prescribed Books and References

Textbooks
Grus, J. Data Science from Scratch: First Principles with Python, Second Edition, O'Reilly
Media.
Géron, A. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, Second
Edition, O'Reilly Media.
James, G., Witten, D., Hastie, T., & Tibshirani, R. An Introduction to Statistical Learning,
Springer.
O'Neil, C. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens
Democracy, Crown Publishing Group.
Aggarwal, C. C. Outlier Analysis, Second Edition, Springer, 2017.

Reference Books
McKinney, W. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython,
Second Edition, O'Reilly Media.
