INT354 Syllabus
CO1 :: explain different types of Machine Learning and statistics used for risk minimization
CO2 :: examine the performance of Generative models based on Bayesian learning to solve
different classification problems
CO3 :: evaluate and optimize ensemble learning techniques and classifiers using relevant metrics
CO6 :: apply model evaluation strategies to fine-tune machine learning models effectively
Unit I
Introduction to machine learning : Well Posed Learning Problems, Designing a Learning System,
Statistical Learning Framework, Empirical Risk Minimization, Empirical Risk Minimization with
Inductive Bias, PAC Learning, How machines learn, Learning input-output functions
Building good training sets : Data Preprocessing, Handling Categorical Data, Partitioning a Dataset
in Training and Test Sets, Normalization, Handling imbalanced datasets, Feature selection and
dimensionality reduction
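A minimal Python sketch of the data-preparation steps above, assuming scikit-learn and pandas; the toy dataset, column names, and split ratio are illustrative only.

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

    # Toy dataset: one categorical feature, one numeric feature, a binary label
    df = pd.DataFrame({
        "colour":  ["red", "green", "red", "blue", "green", "red", "blue", "green"],
        "size_cm": [10.0, 12.5, 9.0, 14.0, 11.0, 8.5, 13.0, 12.0],
        "label":   [0, 1, 0, 1, 1, 0, 1, 0],
    })

    # Handling categorical data: one-hot encode the nominal 'colour' column
    colour_ohe = OneHotEncoder().fit_transform(df[["colour"]]).toarray()

    # Normalization: rescale the numeric column to the [0, 1] range
    size_scaled = MinMaxScaler().fit_transform(df[["size_cm"]])

    # Partitioning into training and test sets; stratification preserves the
    # class ratio, which matters for imbalanced datasets
    X = pd.DataFrame(colour_ohe).join(pd.DataFrame(size_scaled, columns=["size_cm"]))
    X_train, X_test, y_train, y_test = train_test_split(
        X, df["label"], test_size=0.25, stratify=df["label"], random_state=0)
    print(X_train.shape, X_test.shape)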
Unit II
Machine learning classifiers-1 : Overview of supervised learning, Difference between classification
and regression, Choosing a Classification Algorithm, No free lunch theorem, Perceptron, Logistic
Regression, Decision Tree, ID3, CART, and C4.5
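A minimal sketch of the Unit II classifiers on scikit-learn's bundled Iris data; the library and hyperparameter values are assumptions, not prescribed by the syllabus.

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression, Perceptron
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

    # Perceptron, logistic regression, and a decision tree; criterion="entropy"
    # gives ID3/C4.5-style information-gain splits, the default "gini" is CART-style
    for clf in (Perceptron(), LogisticRegression(max_iter=1000),
                DecisionTreeClassifier(criterion="entropy")):
        clf.fit(X_train, y_train)
        print(type(clf).__name__, clf.score(X_test, y_test))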
Unit III
Machine learning classifiers-2 : SVM, KNN, Naïve Bayes Classifier, Introduction to ensemble
learning, Bagging vs. boosting, Majority voting classifier, Random Forest, Gradient Boosting Machines
(GBM), XGBoost, Evaluation metrics for classifiers
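A minimal ensemble-learning sketch for Unit III using scikit-learn only; XGBoost follows the same fit/predict pattern but comes from a separate package, and the chosen dataset and base learners are illustrative.

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                                  VotingClassifier)
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

    # Majority-voting classifier over three dissimilar base learners
    vote = VotingClassifier([
        ("lr",  make_pipeline(StandardScaler(), LogisticRegression())),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
        ("nb",  GaussianNB()),
    ])

    # Bagging-style (Random Forest) and boosting-style (GBM) ensembles, each
    # reported with standard classification metrics
    for clf in (vote, RandomForestClassifier(random_state=1),
                GradientBoostingClassifier(random_state=1)):
        clf.fit(X_train, y_train)
        print(type(clf).__name__)
        print(classification_report(y_test, clf.predict(X_test)))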
Unit IV
Regression-1 : Introducing Linear Regression, Fitting a Robust Regression Model using RANSAC,
Exploring Relationships Using a Correlation Matrix, Exploratory Data Analysis, Regularized Methods for
Regression, Polynomial Regression
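A minimal regression sketch for Unit IV on scikit-learn's bundled diabetes data, used here only as a stand-in for any tabular dataset; the polynomial degree and regularization strength are illustrative.

    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import LinearRegression, RANSACRegressor, Ridge
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    data = load_diabetes(as_frame=True)
    X, y = data.data, data.target

    # Exploratory step: pairwise relationships via a correlation matrix
    print(X.corr().round(2))

    # Ordinary least squares, robust fitting with RANSAC, and ridge regularization
    for model in (LinearRegression(), RANSACRegressor(), Ridge(alpha=1.0)):
        model.fit(X, y)
        print(type(model).__name__, round(model.score(X, y), 3))

    # Polynomial regression: expand the features, then fit a linear model
    poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
    poly.fit(X, y)
    print("Polynomial (degree 2)", round(poly.score(X, y), 3))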
Unit V
Regression-2 : SVM regressor, Decision Tree regressor and Random Forest Regressor, ARIMA and
SARIMA, R2 Score, Mean Absolute Error, Mean Squared Error, Mean Squared Logarithmic Error, Mean
Absolute Percentage Error, Explained Variance Score, Visual Evaluation of Regression Models
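A minimal sketch of the Unit V regressors and error metrics on the same bundled diabetes data; ARIMA/SARIMA belong to time-series modelling (e.g. statsmodels) and are omitted, and mean squared logarithmic error is skipped because it requires non-negative predictions.

    from sklearn.datasets import load_diabetes
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import (explained_variance_score, mean_absolute_error,
                                 mean_absolute_percentage_error, mean_squared_error,
                                 r2_score)
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVR
    from sklearn.tree import DecisionTreeRegressor

    X, y = load_diabetes(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

    # SVM, decision tree, and random forest regressors, each scored with the
    # Unit V metrics on the held-out test split
    for reg in (SVR(), DecisionTreeRegressor(random_state=1),
                RandomForestRegressor(random_state=1)):
        pred = reg.fit(X_train, y_train).predict(X_test)
        print(type(reg).__name__,
              "R2:",   round(r2_score(y_test, pred), 3),
              "MAE:",  round(mean_absolute_error(y_test, pred), 1),
              "MSE:",  round(mean_squared_error(y_test, pred), 1),
              "MAPE:", round(mean_absolute_percentage_error(y_test, pred), 3),
              "EVS:",  round(explained_variance_score(y_test, pred), 3))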
Unit VI
Model evaluation and hyperparameter tuning : Streamlining Workflows with Pipelines, Using k-
fold Cross Validation to Assess Model Performance, Debugging Algorithms with Learning and
Validation Curves, Fine-Tuning Machine Learning Models via Grid Search
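A minimal Unit VI sketch combining a pipeline, k-fold cross-validation, and grid search; the parameter grid and fold counts are illustrative, not prescribed.

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import GridSearchCV, cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)

    # Streamlined workflow: scaling and the classifier chained as one estimator
    pipe = make_pipeline(StandardScaler(), SVC())

    # k-fold cross-validation to assess model performance
    print("10-fold CV accuracy:", cross_val_score(pipe, X, y, cv=10).mean())

    # Fine-tuning via grid search over C and the RBF kernel width gamma
    grid = GridSearchCV(pipe, {"svc__C": [0.1, 1, 10],
                               "svc__gamma": ["scale", 0.01, 0.001]}, cv=5)
    grid.fit(X, y)
    print("Best parameters:", grid.best_params_,
          "best CV score:", round(grid.best_score_, 3))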
Practicals
• Identify a Real-World Machine Learning Problem and Select a Suitable Dataset for the Project
• Load, Explore, and Visualize Dataset to Understand Data Structure, Trends, and Distribution for
Analysis
• Perform Data Cleaning: Handle Missing Values, Duplicates, and Data Inconsistencies for Accurate
Results
• Perform Feature Engineering: Create, Transform, and Extract Features to Enhance Dataset and Model
Performance
• Handle Categorical Variables: Apply Encoding Techniques Like One-Hot, Label, and Ordinal Encoding
Approaches
• Normalize and Standardize Features: Use Scaling Methods Like Min-Max, Standard, and Robust
Scalers
• Train and Evaluate Baseline Models to Establish Reference Metrics for Further Improvements and
Comparisons
• Evaluate Model Performance Using Appropriate Metrics Like Precision, Recall, F1-Score, R2-Score, or
MAE
• Perform Hyperparameter Tuning Using Grid Search, Random Search, and Bayesian Optimization for
Best Results
• Implement Ensemble Learning Techniques Like Bagging, Boosting, and Stacking for Robust Model
Performance
• Deploy the Machine Learning Model Using Flask, FastAPI, or Streamlit for Real-World Applications (see the sketch after this list)
• Prepare and Present a Comprehensive Project Report Including Problem Statement, Methods,
Results, and Challenges
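A minimal deployment sketch for the Flask option named in the practicals; the saved model file, feature count, and endpoint are hypothetical and would depend on the chosen project.

    import pickle

    from flask import Flask, jsonify, request

    app = Flask(__name__)

    # Hypothetical pickled scikit-learn model produced earlier in the project
    with open("model.pkl", "rb") as fh:
        model = pickle.load(fh)

    @app.route("/predict", methods=["POST"])
    def predict():
        # Expects a JSON body such as {"features": [5.1, 3.5, 1.4, 0.2]}
        features = request.get_json()["features"]
        prediction = model.predict([features])[0]
        # Convert NumPy scalars to plain Python types before serializing
        return jsonify({"prediction": prediction.item()
                        if hasattr(prediction, "item") else prediction})

    if __name__ == "__main__":
        app.run(debug=True)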
Text Books:
1. MACHINE LEARNING: A PRACTITIONER'S APPROACH by VINOD CHANDRA S.S. and ANAND HAREENDRAN S., PHI Learning
References:
1. MACHINE LEARNING WITH PYTHON: PRINCIPLES AND PRACTICAL TECHNIQUES by PARTEEK BHATIA, CAMBRIDGE UNIVERSITY PRESS