Data Mining Tasks

The document outlines various data mining tasks, which are classified into descriptive and predictive categories. Key tasks include mining frequent patterns, association analysis, correlation analysis, cluster analysis, and classification and regression for predictive analysis. It emphasizes the importance of these tasks in discovering patterns, making predictions, and summarizing data in various applications.

Uploaded by

virat18kohli360

DATA MINING TASKS

• Definition
• Classifications of Data Mining Tasks
• Key Data Mining Tasks
Definition
• A data mining task is characterized by the kind of pattern that can be mined from the data.
• Data mining functionalities specify the kinds of patterns to be found in data mining tasks.
Classifications of Data Mining Tasks
In general, data mining tasks can be classified into two categories:
• Descriptive
• Predictive
Descriptive Data Mining
• Descriptive mining tasks characterize the general properties of the data in a target data set.
• Descriptive data mining highlights the common characteristics found in the data.
• It offers knowledge of the data and gives insight into what is going on inside the data without requiring any prior hypothesis.
Predictive Data Mining
• Predictive mining tasks perform induction (inference) on the current data in order to make predictions.
• Predictive data mining gives users predictions about unknown or future values, derived from the available data.
Key Data Mining Tasks
Class/Concept Description: Characterization and Discrimination
• Data entries can be associated with classes or concepts.
• It can be useful to describe individual classes and concepts in summarized, concise, and yet precise terms. Such descriptions of a class or a concept are called class/concept descriptions.
• These descriptions can be derived in the following two ways:
  • Data characterization is a summarization of the general characteristics or features of a target class of data.
  • Data discrimination is a comparison of the general features of the target class against the general features of one or more contrasting classes.
Mining Frequent Patterns
• Frequent patterns are patterns that occur frequently in transactional data.
• There are many kinds of frequent patterns:
  • Frequent itemsets
  • Frequent subsequences (also known as sequential patterns)
  • Frequent substructures
Frequent Itemset
• A frequent itemset is a set of items that frequently appear together in a transactional data set.
• For example, milk and bread are frequently bought together in grocery stores by many customers.
Frequent Subsequence
• A frequent subsequence is an ordered sequence of events that occurs frequently, such as the purchase of a camera followed by the purchase of a memory card.
Frequent Substructure
• A substructure can refer to various structural forms, such as trees and graphs, which may be combined with itemsets or subsequences.
• If a substructure occurs frequently, it is called a (frequent) structured pattern.
• Mining frequent patterns leads to the discovery of interesting associations and correlations within data.

Association Analysis
• It analyses the sets of items that generally occur together in a transactional dataset. It is also known as Market Basket Analysis because of its wide use in retail sales.
Two parameters are used for determining association rules:
• Support is the fraction of transactions in the database that contain the item set; it identifies the common item sets.
• Confidence is the conditional probability that an item occurs in a transaction given that another item occurs in it.
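The two parameters can be illustrated with a minimal sketch. The baskets below are made-up toy data, and the function names `support` and `confidence` are chosen here for illustration only; they are not taken from the slides or from any particular library.

```python
# Hypothetical toy transactions; each set is one customer's basket.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"milk", "bread", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) as support(both) / support(antecedent)."""
    joint = support(set(antecedent) | set(consequent), transactions)
    base = support(antecedent, transactions)
    return joint / base if base > 0 else 0.0

# Rule "milk => bread": milk and bread co-occur in 3 of 5 baskets,
# and milk appears in 4 of 5 baskets.
print(support({"milk", "bread"}, transactions))
print(confidence({"milk"}, {"bread"}, transactions))
```

In practice a rule is usually kept only if both values exceed user-chosen minimum thresholds.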
Correlation Analysis
• Correlation is a mathematical technique that can show whether, and how strongly, pairs of attributes are related to each other.
• For example, taller people tend to weigh more.
• It is an additional analysis performed to uncover interesting statistical correlations between two item sets, determining whether they have a positive, negative, or no effect on each other.
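One common way to quantify the strength of a relationship between two numeric attributes is the Pearson correlation coefficient. The sketch below is an illustration under that assumption (the slides do not name a specific measure), and the height and weight values are invented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample: heights in cm and weights in kg.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 88]
r = pearson(heights_cm, weights_kg)
print(r)  # close to +1: a strong positive correlation
```

A value near +1 indicates a positive relationship, a value near -1 a negative one, and a value near 0 little or no linear relationship.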
Cluster Analysis
• A cluster is a group of similar objects.
• Cluster analysis forms groups of objects that are very similar to each other but highly different from the objects in other clusters.
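As an illustration of grouping similar objects, here is a minimal one-dimensional k-means sketch. K-means is only one of many clustering methods and is not named in the slides; the ages and the starting centroids are hypothetical.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Group numbers around the nearest centroid, then move each centroid
    to the mean of its group, repeating until the centroids stop changing."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical customer ages with two assumed starting centroids.
ages = [18, 21, 22, 45, 47, 50]
centroids, clusters = kmeans_1d(ages, centroids=[20, 40])
print(clusters)
```

The result depends on the starting centroids, which is why they are passed in explicitly here instead of being chosen at random.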
Summarization
• Summarization is the generalization of data. A set of relevant data is summarized, which results in a smaller set that gives aggregated information about the data.
• For example, the shopping done by a customer can be summarized into total products, total spending, offers used, etc.
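The shopping example can be sketched in a few lines. The purchase records and field names below are hypothetical and serve only to show how detailed rows collapse into aggregated values.

```python
# Hypothetical purchase records for a single customer.
purchases = [
    {"product": "milk", "price": 1.50, "quantity": 2, "offer_used": False},
    {"product": "bread", "price": 2.25, "quantity": 1, "offer_used": True},
    {"product": "eggs", "price": 3.00, "quantity": 1, "offer_used": False},
]

def summarize(purchases):
    """Reduce detailed purchase rows to a small set of aggregated values."""
    return {
        "total_products": sum(p["quantity"] for p in purchases),
        "total_spending": sum(p["price"] * p["quantity"] for p in purchases),
        "offers_used": sum(1 for p in purchases if p["offer_used"]),
    }

summary = summarize(purchases)
print(summary)
```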
Sequence Discovery
• Sequence discovery, or sequential pattern mining, is a data mining technique used to find relevant and important patterns in sequential data.
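A very small sketch of the idea: count how many customer sequences contain item a somewhere before item b, and keep the ordered pairs that meet a minimum support. This is a simplification limited to pairs, not a full sequential pattern mining algorithm, and the purchase histories are invented.

```python
from collections import Counter
from itertools import combinations

def frequent_ordered_pairs(sequences, min_support):
    """Return ordered pairs (a, b), with a occurring before b, that appear
    in at least min_support (a fraction) of the sequences."""
    counts = Counter()
    for seq in sequences:
        # combinations() keeps positional order, so (a, b) means a came first.
        # A set ensures each sequence counts a given pair at most once.
        pairs = {(a, b) for a, b in combinations(seq, 2) if a != b}
        counts.update(pairs)
    threshold = min_support * len(sequences)
    return {pair: n for pair, n in counts.items() if n >= threshold}

# Hypothetical purchase histories, one list per customer, in time order.
sequences = [
    ["camera", "memory card", "tripod"],
    ["camera", "tripod", "memory card"],
    ["laptop", "mouse"],
    ["camera", "memory card"],
]
patterns = frequent_ordered_pairs(sequences, min_support=0.5)
print(patterns)
```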
Classification and Regression for Predictive Analysis
Classification
• Classification is the process of finding a model (or function) that describes and distinguishes data classes or concepts.
• Classification derives a model to determine the class of an object based on its attributes.
• The model is derived from the analysis of a set of training data (i.e., data objects for which the class labels are known).
• The model is used to predict the class label of objects for which the class label is unknown.
• A classification model can be represented in various forms:
  • IF-THEN rules
  • Decision tree
  • Neural network
IF-THEN rules
Decision Tree
• A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute value, each branch represents an outcome of the test, and tree leaves represent classes or class distributions.
• Decision trees can easily be converted to classification rules.
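The conversion from a tree to rules can be sketched as follows: every path from the root to a leaf becomes one IF-THEN rule. The tree, its attribute names, and its class labels are hypothetical and chosen only for illustration.

```python
# Hypothetical tree: an internal node is (attribute, {value: subtree});
# a leaf is a class label string.
tree = ("age", {
    "youth": ("student", {"yes": "buys", "no": "does not buy"}),
    "middle_aged": "buys",
    "senior": ("credit", {"fair": "does not buy", "excellent": "buys"}),
})

def classify(node, record):
    """Follow the branch matching each tested attribute until a leaf is reached."""
    while not isinstance(node, str):
        attribute, branches = node
        node = branches[record[attribute]]
    return node

def extract_rules(node, conditions=()):
    """Turn every root-to-leaf path into a (conditions, class label) pair."""
    if isinstance(node, str):
        return [(conditions, node)]
    attribute, branches = node
    rules = []
    for value, subtree in branches.items():
        rules.extend(extract_rules(subtree, conditions + ((attribute, value),)))
    return rules

for conditions, label in extract_rules(tree):
    antecedent = " AND ".join(f"{a} = {v}" for a, v in conditions)
    print(f"IF {antecedent} THEN {label}")
```

Each leaf yields exactly one rule, so this small tree with five leaves produces five rules.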
Neural Network
• A neural network, when used for classification, is typically a collection of neuron-like processing units with weighted connections between the units.
There are many other methods for constructing classification models, such as naive Bayesian classification, support vector machines, and k-nearest-neighbor classification.
Regression
• Regression learns a function that maps a data item to a real-valued prediction variable.
• Regression is used to predict missing or unavailable numerical data values rather than (discrete) class labels. The term prediction refers to both numeric prediction and class label prediction.
• Regression analysis is a statistical methodology that is most often used for numeric prediction.
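A minimal sketch of numeric prediction, assuming the simplest case of a straight line fitted by ordinary least squares to one input attribute. The experience and salary figures are invented and deliberately lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: salary (in thousands) equals 30 + 2 * years.
years_experience = [1, 2, 3, 4, 5]
salary_k = [32, 34, 36, 38, 40]

slope, intercept = fit_line(years_experience, salary_k)
predicted = slope * 6 + intercept  # numeric prediction for an unseen value
print(slope, intercept, predicted)
```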
Time Series Analysis
• A time series is a sequence of observations recorded over time, in which the next value typically depends on one or more of the preceding values.
• A time series reflects the process being measured, and certain components affect the behaviour of that process.
• Time series analysis includes methods for analyzing time-series data in order to extract useful patterns, trends, rules, and statistics.
• Stock market prediction is an important application of time-series analysis.
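One basic way to expose a trend in time-series data is a simple moving average, which smooths out short-term fluctuations. The sketch below assumes that technique (the slides do not prescribe one), and the price values are invented.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive values to smooth the series."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

# Hypothetical daily closing prices.
prices = [10, 12, 11, 13, 15, 14, 16]
trend = moving_average(prices, window=3)
print(trend)  # the smoothed values rise steadily, revealing an upward trend
```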
Prediction
• A prediction task predicts the possible values of missing or future data. Prediction involves developing a model based on the available data; this model is then used to predict values for a new data set of interest.
• For example, a model can predict the income of an employee based on education, experience, and other demographic factors such as place of residence and gender.
• Prediction analysis is also used in other areas, including medical diagnosis and fraud detection.
