
Chapter - Data Mining Introduction

Chapter 2 introduces data mining as a process for discovering actionable patterns in large datasets, emphasizing its importance in decision-making for organizations. It outlines the need for data mining due to the growth of corporate data and technological advancements, and highlights various applications such as fraud detection and market analysis. The chapter also distinguishes data mining from machine learning and provides an overview of the data mining process and techniques.

Uploaded by

Nur Al Ahad
Copyright
© All Rights Reserved

Chapter 2

Introduction to Data Mining
Chapter Objectives

1. To learn about the concepts of data mining.
2. To understand the need for, and the applications of, data mining.
3. To differentiate between data mining and machine learning.
4. To understand the process of data mining.

These slides are designed to complement 'Data Mining and Data Warehousing' by Parteek Bhatia, published by Cambridge University Press.
Defining Data Mining
• Data mining is a collection of techniques for efficient automated discovery of previously unknown, valid, novel, useful and understandable patterns in large databases. The patterns must be actionable so that they may be used in an enterprise's decision making.

Introduction to Data Mining
• From this definition, the important takeaways are:
• Data mining is a process of automated discovery of previously unknown patterns in large volumes of data.
• This large volume of data is usually the historical data of an organization, held in what is known as the data warehouse.
• Data mining deals with large volumes of data, typically gigabytes or terabytes and sometimes as much as zettabytes (in the case of big data).
• Data mining allows businesses to determine historical patterns in order to predict future behavior.
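To give a feel for the data volumes mentioned above, here is a minimal sketch that converts between storage units. It assumes decimal (SI) units, where each step is a factor of 1,000; the `UNITS` table and the `convert` helper are illustrative names, not part of the slides or the book.

```python
# Decimal (SI) storage units expressed in bytes (assumption: powers of ten,
# not the binary units GiB/TiB).
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def convert(amount, from_unit, to_unit):
    """Convert an amount of data from one unit to another."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


if __name__ == "__main__":
    # How many terabytes and gigabytes fit in one zettabyte?
    print(convert(1, "zettabyte", "terabyte"))
    print(convert(1, "zettabyte", "gigabyte"))
```

Under these assumptions, one zettabyte is a billion terabytes, which is why big data workloads sit at a very different scale from a typical corporate data warehouse.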

Need of Data Mining
Per Minute Generation of Data over Internet

Need of Data Mining
• Growth in generation and storage of corporate data
• Need for sophisticated decision making
• Evolution of technology
• Availability of much cheaper storage, easier data
collection and better database management for data
analysis and understanding
• Decline in the costs of hard drives
• Growth in worldwide disk capacities

Applications of Data Mining
• Loan/Credit card approvals
• Market segmentation
• Fraud detection
• Better marketing
• Trend analysis
• Market basket analysis
• Customer churn
• Website design
• Corporate analysis and risk management
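As an illustration of one application from the list above, market basket analysis, the following minimal sketch computes the support and confidence of a single rule. The transactions are invented toy data, and the helper names (`count`, `support`, `confidence`) are hypothetical; this is not the book's implementation, which uses Weka and R.

```python
# Toy shopping baskets (invented for illustration only).
TRANSACTIONS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for basket in transactions if items <= basket)


def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the baskets containing the antecedent, the fraction that also
    contain the consequent."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)


if __name__ == "__main__":
    # Rule: customers who buy bread also buy butter.
    print(support({"bread", "butter"}, TRANSACTIONS))       # 0.6
    print(confidence({"bread"}, {"butter"}, TRANSACTIONS))  # 0.75
```

Here the rule "bread implies butter" holds in three of the four baskets that contain bread. A retailer could act on such a pattern, for example in shelf placement, which is the sense in which a mined pattern is actionable.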

Data Mining Process

Data Mining Techniques
• Predictive modelling
• Database segmentation
• Link analysis
• Deviation detection
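Deviation detection, one of the techniques named on this slide, can be sketched in a few lines. The example below flags values that lie far from the mean using a z-score test. The z-score rule, the threshold of 2.0, the sample amounts, and the function name `flag_deviations` are all assumptions made for illustration; the slides do not prescribe a particular method.

```python
from statistics import mean, pstdev


def flag_deviations(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold.

    Uses the population standard deviation of the values themselves.
    Returns an empty list when all values are identical.
    """
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []
    return [v for v in values if abs(v - centre) / spread > threshold]


if __name__ == "__main__":
    # Hypothetical daily transaction amounts with one unusual value.
    amounts = [52, 48, 50, 51, 49, 47, 53, 250]
    print(flag_deviations(amounts))  # [250]
```

A flagged value is only a candidate for review, for instance in fraud detection. With very small samples a single extreme value inflates the standard deviation, so the threshold has to be chosen with the sample size in mind.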

Difference between Data mining and ML

Book Details
Published by Cambridge University Press (UK).
Recommended among the Six Best New Data Warehousing Books to read in 2020 and the 43 Best Data Mining Books of all time by bookauthority.org.
Table of Contents
1. Beginning with machine learning
2. Introduction to data mining
3. Beginning with Weka and R language
4. Data pre-processing
5. Classification
6. Implementing classification in Weka and R
7. Cluster analysis
8. Implementing clustering with Weka and R
9. Association mining
10. Implementing association mining with Weka and R
11. Web mining and search engine
12. Operational data store and data warehouse
13. Data warehouse schema
14. Online analytical processing
15. Big data and NoSQL
Order Your Copy Today
• Amazon.com: Data Mining and Data Warehousing: Principles and Practical Techniques eBook: Bhatia, Parteek: Kindle Store

amazon.com

amazon.in
Other Books from Parteek Bhatia:
Machine Learning with Python
Machine learning python principles and practical techniques | Pattern recognition and machine learning | Cambridge University Press
Other Books from Parteek Bhatia:
Simplified Approach to DBMS
Table of Contents
1. Fundamentals of Database Management System
2. The architecture of Database Management System
3. Data Models
4. Relational Database Management System
5. Relational Algebra and Calculus
6. Entity-Relationship Model
7. Conversion of ER Diagrams to Tables
8. Normalization for refinement of data
9. Physical Database design
10. Transaction Management
11. Concurrency Control
12. Security and Integrity of data
13. Recovery of data
14. Distributed Database
15. Object-Oriented Databases and expert system
16. DBTG Model
17. Data Warehouse and Data mining
18. No-SQL Database
19. Enterprise Database Products
20. Beginning with SQL
21. Invoking SQL* Plus
22. Performing basic SQL operations
23. Basic SELECT statement
24. Inbuilt functions
25. Grouping of data
26. Joining of tables
27. Sub Queries
28. Managing Tables
29. Database objects, DCL and TCL statements
30. Pl/SQL Fundamentals and its statements
31. Error Handling
32. Cursor Management
33. Subprograms and packages
34. Database triggers
35. Advanced Topics
Other Books from Parteek Bhatia

For more information visit: www.parteekbhatia.com


Online Courses on Udemy by Parteek Bhatia
Parteek Bhatia’s YouTube Channel

Premier resource for over 450 in-depth videos covering Machine Learning, Data Mining, DBMS, SQL, PL/SQL, Python, and Big Data. Our
channel simplifies complex topics without compromising on depth,
catering to both beginners and seasoned professionals. Explore tutorials,
step-by-step guides, and practical tips to enhance your tech skills.
Whether you're preparing for exams, advancing your career, or exploring
new technologies, our content is designed to empower your learning
journey. Subscribe and join our community of tech enthusiasts to stay
updated with educational and engaging content!
Thanks
Happy Learning.
