Fundamentals of Data Mining

The document provides an introduction to data mining, explaining its importance due to the explosive growth of data and the need for automated analysis to extract knowledge. It outlines the data mining process, techniques, applications, and the role of data warehouses and OLAP in managing and analyzing data. Additionally, it discusses various data mining tasks, including descriptive and predictive tasks, as well as methods for finding patterns and associations in data.

Uploaded by

noahwilson686

FUNDAMENTALS OF DATA MINING
Lecture 1

CHAPTER 1 - INTRODUCTION
1. Why Data Mining?
2. What is Data Mining?
3. Data Mining Process
4. Data Mining Applications & Benefits
WHY DATA MINING?
Explosive growth of data: from terabytes to petabytes.
Data collection and data availability: automated data collection tools, database systems, the web, email, and a computerized society.
SOURCES OF DATA
Web data
E-Commerce
Bank transactions
Digital media
Online Games
Research
ONLINE DATA
Every 60 seconds:
98,000+ tweets
Millions of Facebook updates
11 million chats
217 new mobile users
SOLUTION
We are drowning in data but lacking in knowledge.
The solution is to mine the knowledge from the data.
This requires automated analysis of massive data sets.
WHAT IS DATA MINING?
Data mining is the process of extracting knowledge from large amounts of data.

[Figure: data mining techniques are applied to data to produce useful data]
WHY WE DO THIS?
1. Companies and organizations collect huge amounts of data from different sources and platforms.
2. As databases grow, it becomes very difficult to search them manually for useful information.
3. They therefore use data mining techniques, which include AI and complex mathematical algorithms, to extract specific and useful data.
CONTINUED...
1. Data mining also reveals trends, patterns, and insights in the collected data.
2. Data mining is also called Knowledge Discovery in Databases (KDD).
DATA MINING TECHNIQUES
Statistics / mathematical: cluster techniques, regression, segmentation
AI / ML: KNN algorithm, Apriori algorithm, K-means algorithm, Naïve Bayes
DATA MINING PROCESS
1. Data selection
2. Data preprocessing
3. Data transformation
4. Data mining
5. Pattern evaluation
6. Knowledge presentation
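The six steps of the process can be sketched on a toy data set. This is only an illustration under stated assumptions: the records, the field names, and the `MIN_TOTAL` threshold are invented for the example, and the "mining" step is a simple per-item total rather than a real mining algorithm.

```python
# Toy walk-through of the six steps; the records and field names are made up.
raw = [
    {"customer": "a", "item": "Bread ", "amount": "2.50"},
    {"customer": "b", "item": "milk", "amount": None},  # missing amount
    {"customer": "a", "item": "bread", "amount": "2.50"},
    {"customer": "c", "item": "Milk", "amount": "1.20"},
]

# 1. Data selection: keep only the fields relevant to the question.
selected = [{"item": r["item"], "amount": r["amount"]} for r in raw]

# 2. Data preprocessing: drop records with missing values.
cleaned = [r for r in selected if r["amount"] is not None]

# 3. Data transformation: normalize text and convert types.
transformed = [
    {"item": r["item"].strip().lower(), "amount": float(r["amount"])}
    for r in cleaned
]

# 4. Data mining: here, simply total the sales per item.
totals = {}
for r in transformed:
    totals[r["item"]] = totals.get(r["item"], 0.0) + r["amount"]

# 5. Pattern evaluation: keep only items above an assumed threshold.
MIN_TOTAL = 2.0
interesting = {item: t for item, t in totals.items() if t >= MIN_TOTAL}

# 6. Knowledge presentation: report the result.
for item, total in sorted(interesting.items()):
    print(f"{item}: {total:.2f}")
```

Each step consumes the output of the previous one, which is why errors left in during preprocessing tend to show up later as misleading patterns.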
ARCHITECTURE
Database or data warehouse
Data mining engine
Knowledge base
Pattern evaluation
User interface
WHAT KIND OF DATA CAN BE MINED?
The following kinds of data can be mined:
Database data
Data warehouse data
Other kinds of data
DATA WAREHOUSE
A data warehouse is a repository of
information collected from multiple
sources, stored under a unified schema,
and usually residing at a single site. Data
warehouses are constructed via a
process of data cleaning, data
integration, data transformation, data
loading, and periodic data refreshing.
KEY FEATURES
Subject-oriented
Integrated
Non-volatile
Time-variant
Data granularity
A data warehouse is
usually modeled by a
multidimensional data
structure, called a data
cube, in which each
dimension corresponds to
an attribute or a set of
attributes in the schema,
and each cell stores the
value of some aggregate
measure such as count or
sum(sales amount). A data
cube provides a
multidimensional view of
data and allows the
precomputation and fast
access of summarized
data.
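A minimal sketch of the data cube idea, assuming a hypothetical fact table with three dimensions (region, quarter, product) and `sum(sales amount)` as the aggregate measure. The `build_cube` helper and the `"*"` marker for an aggregated-away dimension are illustrative choices, not a standard API.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact table: (region, quarter, product, sales_amount).
sales = [
    ("North", "Q1", "Bread", 100),
    ("North", "Q1", "Milk", 50),
    ("North", "Q2", "Bread", 70),
    ("South", "Q1", "Bread", 30),
    ("South", "Q2", "Milk", 60),
]

def build_cube(rows):
    """Precompute sum(sales_amount) for every subset of the dimensions."""
    cube = defaultdict(int)
    for *dims, amount in rows:
        for k in range(len(dims) + 1):
            for kept in combinations(range(len(dims)), k):
                # "*" marks a dimension that has been aggregated away.
                cell = tuple(d if i in kept else "*" for i, d in enumerate(dims))
                cube[cell] += amount
    return cube

cube = build_cube(sales)
print(cube[("North", "*", "*")])   # total sales in the North region
print(cube[("*", "Q1", "Bread")])  # Bread sales in Q1 across all regions
print(cube[("*", "*", "*")])       # grand total
```

Because every cell is computed once up front, a summary query becomes a dictionary lookup, which is the "precomputation and fast access" the paragraph above describes. Real warehouses usually materialize only part of the cube, since the number of cells grows quickly with the number of dimensions.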
OLAP
OLAP (Online Analytical Processing) is a
technology used in data warehouses to
analyze large volumes of data from multiple
perspectives quickly and efficiently. It allows
users to perform complex queries, such as
comparing sales by region, time, or product
category, and interact with the data to
discover insights.
OLAP OPERATIONS
Pivoting
Slice and dice
Roll up and drill down
[Figures: pivoting; slice and dice; roll up and drill down]
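The OLAP operations named above can be illustrated on a small in-memory table. This is a sketch under assumptions: the records, the city names, and the `aggregate` helper are hypothetical, and the operations follow their usual meanings (roll up aggregates to a coarser level, drill down goes to a finer one, slice fixes one dimension, dice restricts several, pivot swaps the axes of the view).

```python
from collections import defaultdict

# Hypothetical sales records: (region, city, quarter, product, amount).
rows = [
    ("North", "A1", "Q1", "Bread", 100),
    ("North", "A1", "Q2", "Milk", 50),
    ("North", "A2", "Q1", "Bread", 70),
    ("South", "B1", "Q1", "Milk", 30),
    ("South", "B1", "Q2", "Bread", 60),
]
REGION, CITY, QUARTER, PRODUCT, AMOUNT = range(5)

def aggregate(records, *dims):
    """Sum the amount grouped by the given dimension columns."""
    totals = defaultdict(int)
    for r in records:
        totals[tuple(r[d] for d in dims)] += r[AMOUNT]
    return dict(totals)

# Drill down: view sales at the finer city level.
by_city = aggregate(rows, REGION, CITY)
# Roll up: climb the hierarchy from city to region.
by_region = aggregate(rows, REGION)

# Slice: fix one dimension to a single value (quarter = Q1).
q1 = [r for r in rows if r[QUARTER] == "Q1"]
# Dice: restrict two or more dimensions at once.
q1_north_bread = [r for r in q1 if r[REGION] == "North" and r[PRODUCT] == "Bread"]

# Pivot: swap which dimension is shown as rows and which as columns.
region_by_quarter = aggregate(rows, REGION, QUARTER)
quarter_by_region = {(q, reg): v for (reg, q), v in region_by_quarter.items()}

print(by_region)
print(aggregate(q1, PRODUCT))
print(quarter_by_region)
```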
OTHER DATA
Transactional data
DATA MINING APPLICATIONS
• Customer segmentation
• Market basket analysis
• Risk management
• Fraud detection
• Demand prediction

Benefits:
• Manufacturing
• Mail order
• Supermarkets
• Airlines
• Department stores
• Insurance
• Banks
DATA MINING TASKS
There are two types of tasks: descriptive and predictive.

Descriptive
Clustering
• Grouping similar customers based on their interests
Association Rule Mining
• Finding relationships between items in data

Predictive
Classification
• Categorizing new data based on previous patterns
Regression
• Predicting continuous values such as sales and stock prices
WHAT KIND OF PATTERNS
CAN BE MINED?
CLASS/CONCEPT
DESCRIPTION:
In data mining, class/concept description helps in
understanding and summarizing data by describing
characteristics and differences of data groups.

Characterization
Discrimination
Mining Frequent Patterns
Association and Correlations
Classification and Regression
CHARACTERIZATION AND
DISCRIMINATION
Characterization: (Describing a group)
• It describes the common characteristics of a group (class or
concept).
• It summarizes general patterns in data.
Discrimination: (Comparing two or more groups)
• It compares two or more groups to find differences between
them.
• It identifies what makes one group different from another.
COMPARISON
Feature | Characterization | Discrimination
What it does | Describes common characteristics of a group | Compares two or more groups to find differences
Example | "Loyal customers shop frequently and spend more" | "High-risk borrowers have low credit scores"
Use case | Customer profiling, business trends | Fraud detection, risk analysis
MINING FREQUENT
PATTERNS
Frequent pattern mining is a technique in data mining that
finds repeating patterns in large datasets. These patterns help
in understanding trends, making predictions, and improving
decision-making.
What is a frequent pattern?
A frequent pattern is something that appears often in a dataset.
Example:
Supermarket Purchases
Many customers buy bread and butter together.
If this happens frequently, it is called a frequent pattern.
TYPES OF FREQUENT
PATTERNS
Frequent Itemsets → Groups of items that appear together
frequently.
Example: Customers often buy milk, bread, and eggs together.
Sequential Patterns → Repeated patterns in a sequence (ordered
events).
Example: A customer first buys a phone, then buys a phone
case after a week.
Association Rules → If one event happens, another is likely to
happen.
Example: If people buy diapers, they often buy baby wipes
too.
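Frequent itemsets can be found by counting how often each group of items occurs together. The sketch below uses brute-force enumeration on made-up baskets; it is not the Apriori algorithm, which prunes candidates instead of counting every combination. The `frequent_itemsets` function and its `min_count` and `max_size` parameters are hypothetical names chosen for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
transactions = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "eggs", "butter"},
    {"diapers", "baby wipes"},
]

def frequent_itemsets(baskets, min_count, max_size=3):
    """Count every itemset up to max_size; keep those seen at least min_count times."""
    counts = Counter()
    for basket in baskets:
        for size in range(1, max_size + 1):
            # Sorting gives each itemset one canonical tuple form.
            for itemset in combinations(sorted(basket), size):
                counts[itemset] += 1
    return {s: c for s, c in counts.items() if c >= min_count}

frequent = frequent_itemsets(transactions, min_count=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: (-kv[1], kv[0])):
    print(itemset, count)
```

Brute force is fine for a handful of baskets, but the number of candidate itemsets grows exponentially with basket size, which is the reason dedicated algorithms exist.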
ASSOCIATION AND
CORRELATIONS
These are techniques used in data mining to find relationships between items
in a dataset.

Association:
It finds connections between items that often appear together.
Example: If customers buy bread, they often buy butter too.

Correlations:
It checks if two things change together and how strong their relationship is.
Example (Weather & Ice Cream Sales):
On hot days, ice cream sales increase.
This means temperature and ice cream sales are correlated.
A high correlation means the two things are strongly related.
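One common way to measure how strongly two quantities change together is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below assumes that measure (the slides do not name one) and uses made-up temperature and sales figures.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up daily readings: temperature in degrees Celsius and ice creams sold.
temperature = [18, 22, 25, 30, 33]
ice_creams = [40, 55, 70, 95, 110]

print(round(pearson(temperature, ice_creams), 3))
```

A value close to 1 means the two series rise together, a value close to -1 means one falls as the other rises, and a value near 0 means no linear relationship.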
KEY DIFFERENCE
Association = Items that appear together frequently.
Correlation = Quantities that tend to change together. A correlation alone does not show that one causes the other.
ASSOCIATION RULE MINING
• Association rule mining (ARM) is also called market basket analysis.
• The set of items in a transaction is called a market basket.
• It is mostly used in the retail industry.
SUPPORT AND CONFIDENCE
In association rule mining, we use support and confidence to measure
the strength of a rule.
Support:
Support tells how often an itemset appears in the dataset. It helps in
finding popular items.
Confidence:
Confidence tells how often an association rule "if A, then B" holds. It is the likelihood that B occurs in a transaction given that A occurs.
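Written as formulas, using the usual definitions and notation that is not in the slides ($N$ for the total number of transactions, $T$ for a single transaction, $X$ for any itemset):

```latex
\[
\mathrm{support}(X) = \frac{\left|\{\, T : X \subseteq T \,\}\right|}{N},
\qquad
\mathrm{confidence}(A \Rightarrow B) = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)}
\]
```

Here $A \cup B$ is the itemset containing every item of $A$ and of $B$, so $\mathrm{support}(A \cup B)$ is the fraction of transactions in which both sides of the rule appear.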
Example:
We want to check the rule:
If a customer buys milk, they also buy bread
ASSOCIATION ANALYSIS
Transaction ID | Items Purchased
1 | Bread, Cheese, Egg, Juice
2 | Bread, Cheese, Juice
3 | Bread, Yogurt, Milk
4 | Bread, Juice, Milk
5 | Cheese, Juice, Milk
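Support and confidence for the rule "if a customer buys milk, they also buy bread" can be checked against the five transactions above. This is a small sketch; the `support` and `confidence` function names are chosen for the example and follow the standard definitions.

```python
# Transactions copied from the table above.
transactions = {
    1: {"Bread", "Cheese", "Egg", "Juice"},
    2: {"Bread", "Cheese", "Juice"},
    3: {"Bread", "Yogurt", "Milk"},
    4: {"Bread", "Juice", "Milk"},
    5: {"Cheese", "Juice", "Milk"},
}

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)

def confidence(antecedent, consequent):
    """Estimate of P(consequent | antecedent): support(A and B) / support(A)."""
    return support(antecedent | consequent) / support(antecedent)

print(support({"Milk", "Bread"}))       # milk and bread together: 2 of 5 transactions
print(confidence({"Milk"}, {"Bread"}))  # 2 of the 3 milk transactions also have bread
```

Milk and bread appear together in transactions 3 and 4, so the support is 2/5. Milk appears in transactions 3, 4, and 5, so the confidence of "milk implies bread" is 2/3.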
