DW&DM Syllabus
Unit 1
Introduction
Introduction to Data Warehousing: Overview, Difference between Database System and Data Warehouse, The
Compelling Need for data warehousing, Data warehouse – The building Blocks: Defining Features, data warehouses
and data marts, overview of the components, Three tier architecture, Metadata in the data warehouse.
Data pre-processing: Data cleaning, Data transformation, ETL process, ETL tools,
interoperability of data and applications.
[7 hrs]
Unit 2
Dimensional Modelling
Defining the business requirements: Dimensional analysis, information packages – a new concept, requirements
gathering methods, requirements definition: scope and content.
Principles of Dimensional Modelling: Objectives, From Requirements to data design, Multi-Dimensional Data
Model, Schemas: the STAR schema, the Snowflake schema, fact constellation schema.
[8 hrs]
Unit 3
OLAP:
OLAP in the Data Warehouse: Demand for Online Analytical Processing, limitations of other analysis methods,
OLAP is the answer, OLAP definitions and rules, OLAP characteristics, major features and functions, hypercubes.
OLAP Operations: Drill-down and roll-up, slice-and-dice, pivot or rotation, OLAP models, overview of variations,
the MOLAP model, the ROLAP model, the DOLAP model, ROLAP versus MOLAP, OLAP implementation
considerations. Query and Reporting, Executive Information Systems (EIS), Data Warehouse and Business Strategy.
Advanced Data Warehouse Techniques: Document-oriented NoSQL Databases: MongoDB, CouchDB; Graph Databases:
Neo4j, InfiniteGraph.
[8 hrs]
Unit 4
Introduction to Data Mining
Data Mining Basics: What is Data Mining, Data Mining Defined, The knowledge discovery process (KDD
Process), Data Mining Applications- The Business Context of Data Mining, Data Mining for Process Improvement,
Data Mining as a Research Tool, Data Mining for Marketing, Benefits of data mining.
Major Data Mining Techniques: Classification and Prediction: Issues Regarding Classification and Prediction,
Classification by Decision Tree Induction, KNN Algorithm.
[9 hrs]
Unit 5
Data Mining Algorithms
Cluster detection, K-means Algorithm, Outlier Analysis, memory-based reasoning, link analysis, Mining Association
Rules in Large Databases: Association Rule Mining, genetic algorithms, neural networks, Data mining tools.
[8 hrs]
SUGGESTED READINGS:
1. Paulraj Ponniah, “Data Warehousing Fundamentals”, John Wiley & Sons.
2. Han and Kamber, “Data Mining: Concepts and Techniques”, Harcourt India P. Ltd., Elsevier Publications, Second
Edition.
3. W. H. Inmon, “Building the Operational Data Store”, 2nd Ed., John Wiley.
4. “Data Warehousing”, BPB Publications.
5. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, “Introduction to Data Mining”, Pearson.
6. Shmueli, “Data Mining for Business Intelligence: Concepts, Techniques and Applications in Microsoft Excel with
XLMiner”, Wiley Publications.