Data Mining Handout
In addition to Part I (the General Handout for all courses, appended to the timetable), this portion gives
further specific details regarding the course.
Course Objective:
To gain a comprehensive understanding of various data mining techniques (theoretical and practical
aspects) and the ability to compare their merits and demerits for solving real-world problems.
Course Description:
This course explores the concepts and techniques of data mining, a promising and flourishing frontier in
database systems. The scope of the course covers basic data mining tasks like data pre-processing,
exploratory data analysis, data quality measures, classification, clustering, and anomaly detection
techniques. This course is designed to provide students with a broad understanding of the design and
use of different data mining algorithms. The course also aims to provide a holistic view of data
mining from database, statistical, algorithmic and application perspectives. Furthermore, the course
aims to give students hands-on experience with data mining algorithms.
Text Book:
T1 Pang-Ning Tan, Michael Steinbach, Vipin Kumar, “Introduction to Data Mining”, Pearson,
2009
Reference Books:
R1 Han J & Kamber M, “Data Mining: Concepts and Techniques”, Morgan Kaufmann
Publishers, 2001
R2 Hand D, Mannila H, & Smyth P, “Principles of Data Mining”, MIT Press, 2001
R3 Pujari A K, “Data Mining Techniques”, University Press (India), 2001
R4 Kimball R, “The Data Warehouse Toolkit”, 2e, John Wiley, 2002
Learning Objectives
LO1 Students will gain an understanding of Data Mining as a whole and its components.
LO2 Students will know data pre-processing techniques (noise reduction, data reduction,
missing-value handling, etc.), their issues, and possible conventional solutions.
LO3 Students will have a detailed understanding of clustering and classification methods, their
limitations and applications.
LO4 Students will acquire knowledge about data warehousing, decision making, and association
rule mining algorithms.
LO5 After the course completion, students will be able to design and build real-world
applications using data mining algorithms.
Course Plan:
Lecture No.   Topics

1             Introduction, Motivation, Plan, Evaluation, Policies

2-3           Introduction to Data Mining
                What is Data mining
                Motivation & challenges
                Data Mining Tasks

4-5           Data
                Types of Data
                Data quality
                Data Preprocessing
                Measures of Similarity & Dissimilarity

6-7           Exploratory Data Analysis

8-13          Cluster Analysis: Basic concepts and algorithms
                Overview
                K-Means
                Agglomerative and Divisive hierarchical clustering
                DBSCAN
                Cluster evaluation

14-18         Cluster Analysis: Additional Issues and Algorithms
                Characteristics of Data, Clusters and Clustering Algorithms
                Prototype-based clustering
                Density-based Clustering
                Graph-based Clustering

19-21         Classification
                Basics
                General approach to solving a classification problem
                Decision Tree Introduction
                Model overfitting
                Evaluating the performance of a classifier
                Methods of comparing classifiers

22            Course Pre-Summary for Mid-Semester Exam

23-28         Classification: Alternative Techniques
                Rule-based classifiers
                Nearest-neighbour classifiers
                Bayesian Classifiers
                Support vector machines
                Ensemble methods

29-32         Neural Networks
                Introduction and motivation
                Biological Neural Network
                Artificial Neural Network
                Learning and Training in NN
                Perceptron, backpropagation and its variants

33-36         Adversarial Machine Learning
                Poisoning Attacks
                Evasion Attacks

37-40         Anomaly Detection
                Preliminaries
                Statistical Approaches
                Proximity-based outlier detection
                Density-based outlier detection
                Clustering based Techniques

41-42         Course Summary, Review for End-Semester exam
Evaluation:
Component            Nature        Examination Schedule                        Weightage
Quiz – I/II/III      Closed Book   TBA (as per the timetable)                  10%
Mid Semester         Closed Book   13/10/18, Saturday (4:00 PM - 5:30 PM)      30%
Lab/Assignment       Open Book     TBA (as per the timetable)                  20%
Comprehensive Test   TBA           07/12/18, Friday (FN)                       40%
Office Hours:
Hemant Rathore: Every Saturday 10:00am – 12:00pm
Announcements:
All notices concerning this course will be displayed on the course page of the Photon server.
o http://photon.bits-goa.ac.in/lms/
Follow the ID/ARC notices as well.
Make-up Policy:
Quiz / Assignment: No Makeup
Mid-Semester/Comprehensive Makeup:
o Only with prior permission (in writing)
o Given only on justifiable grounds
o Will not be given for attending a marriage, function, etc.