01ce0707 Data Mining and Information Retrieval
Computer Engineering
B.Tech. Year - IV
Objective: The course offers a study of data mining and information retrieval
methods. It covers how to discover significant data and extract important
patterns from it. The syllabus covers the fundamental theories and scientific
models of data mining and information retrieval.
Contents:
Unit Topics Contact Hours
1 Introduction 08
Introduction to Information Retrieval and Data Mining, including
Correlation, Association Rules, Knowledge Discovery from Databases,
Classification, and Clustering.
2 Indexing 10
Basic concepts of Indexing. Principles and theory of Indexing. Content
Analysis: Meaning, Purpose, Applications in real life. Characteristics of
Indexing, Languages used for Indexing, Types of Indexing, Criteria for
evaluation of Information Retrieval Systems.
3 Retrieval methods 08
Types of Information retrieval. Search processes, Strategies of Search
methods, Boolean Logic, Query Preparation.
4 Data warehousing 08
OLAP, Dimensional Modeling (facts, dimensions), Cube, Schema.
Defining schemas: Star Schema, Snowflake Schema, and Fact
Constellation. ETL Process.
5 Classification methods 12
Decision tree (ID3, C4.5, CART), Bayesian Classification, Rule-based,
Neural Network, Lazy and Eager Learners, Performance Parameters of
classification algorithms.
6 Prediction methods 08
Linear and nonlinear regression, Logistic Regression.
Use of open source data mining tools: WEKA, XLMiner, MOA.
Total Hours 54
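As an illustration of the Boolean Logic topic in Unit 3, the following is a minimal sketch of Boolean retrieval over an inverted index. It is not part of the prescribed material; the sample documents and the function names (build_inverted_index, boolean_and, boolean_or, boolean_not) are assumptions chosen for this example, and tokenization is simplified to lowercase whitespace splitting.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, a, b):
    """Documents containing both terms."""
    return index.get(a, set()) & index.get(b, set())


def boolean_or(index, a, b):
    """Documents containing at least one of the terms."""
    return index.get(a, set()) | index.get(b, set())


def boolean_not(index, a, all_ids):
    """Documents that do not contain the term."""
    return set(all_ids) - index.get(a, set())


# Hypothetical three-document collection, used only for illustration.
docs = {
    1: "data mining finds patterns in data",
    2: "information retrieval finds relevant documents",
    3: "data warehousing supports mining and retrieval",
}
index = build_inverted_index(docs)

print(boolean_and(index, "data", "mining"))
print(boolean_or(index, "retrieval", "patterns"))
print(boolean_not(index, "data", docs))
```

Set operations map directly onto the AND, OR, and NOT operators, which is why this sketch stores each posting list as a Python set rather than a sorted list.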
References:
Suggested Theory distribution:
The suggested theory distribution as per Bloom’s taxonomy is as follows. This
distribution serves as a guideline for teachers and students to achieve an effective
teaching-learning process.
Distribution of Theory for course delivery and evaluation
Remember Understand Apply Analyze Evaluate Create
0% 10% 50% 20% 0% 20%
Suggested List of Experiments:
1. Explore and compare various data mining tools.
2. Weka Installation.
3. Preprocessing on real and synthetic datasets.
4. Apply an association rule mining technique to find association rules.
5. Demonstration of various classification algorithms.
6. Performance measurement of various classification algorithms.
7. Apply the K-means method of clustering to discover similar objects in real-time datasets.
8. Demonstration of various IR techniques.
9. Performance evaluation of various IR techniques.
10. Mini Project based on learning of this subject.
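For the performance experiments above (items 6 and 9), one common starting point is precision, recall, and the F1 score. The sketch below is only an illustration under stated assumptions: the function name precision_recall_f1 and the sample document ids are hypothetical, and the syllabus does not prescribe these particular measures.

```python
def precision_recall_f1(retrieved, relevant):
    """Compute precision, recall, and F1 for one query.

    retrieved: ids returned by the system
    relevant:  ids judged relevant (ground truth)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)

    # Guard against division by zero when either set is empty.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0

    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical result: the system returned documents 1-4,
# while documents 2, 4, and 5 are actually relevant.
p, r, f = precision_recall_f1([1, 2, 3, 4], [2, 4, 5])
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

The same function applies to a binary classifier if "retrieved" is read as the set of instances predicted positive and "relevant" as the set of truly positive instances.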
Instructional Method:
a) The course delivery method will depend upon the requirements of the content and the needs of
students. In addition to conventional blackboard teaching, the teacher may
also use tools such as demonstrations, role play, quizzes, brainstorming, MOOCs, etc.
b) The internal evaluation will be based on continuous evaluation of students
in the laboratory and classroom.
d) Students will use supplementary resources such as online videos, NPTEL videos,
e-courses, and Virtual Laboratory, as suggested by the subject faculty.