Data Mining Tasks

Uploaded by

woahujessica

Data Mining Tasks

• The data mining tasks can be classified
generally into two types based on what a
specific task tries to achieve. Those two
categories are descriptive tasks and predictive
tasks. The descriptive data mining tasks
characterize the general properties of data
whereas predictive data mining tasks perform
inference on the available data set to predict
how a new data set will behave.
Different Data Mining Tasks

• Predictive data mining tasks build a model from the
available data set that helps predict unknown or future
values of another data set of interest. A medical
practitioner trying to diagnose a disease based on a
patient's medical test results is an example of a
predictive data mining task.
• Descriptive data mining tasks usually find patterns that
describe the data and derive new, significant information
from the available data set. A retailer trying to identify
products that are purchased together is an example of a
descriptive data mining task.
Classification
Classification derives a model that determines the class of an
object based on its attributes. The input is a collection of
records, each with a set of attributes. One of the attributes is
the class attribute, and the goal of the classification task is to
assign a class label to each new record as accurately as possible.
Classification can be used in direct marketing to reduce
marketing costs by targeting the customers who are likely to
buy a new product. From the available data, it is possible to
know which customers purchased similar products in the past and
which did not. Hence, the {purchase, don’t purchase} decision
forms the class attribute in this case. Once the class label is
assigned, demographic and lifestyle information about customers
who purchased similar products can be collected, and promotional
mail can be sent directly to customers with a similar profile.
Classification has two types of variables:
A. Explanatory variables – which describe the
essential properties of the data
B. Target variables – whose values are to be
determined
Classification is used to predict the value of a
discrete target variable.
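As a minimal sketch of the direct-marketing example above, the following assigns a class label to new records with a one-nearest-neighbour rule. The attribute choice (age, income in thousands), the training records, and the function name `classify` are illustrative assumptions, and nearest-neighbour is only one of many possible classifiers.

```python
import math

# Hypothetical training records: (age, income in thousands) -> class label.
training = [
    ((25, 30), "don't purchase"),
    ((32, 45), "don't purchase"),
    ((41, 80), "purchase"),
    ((50, 95), "purchase"),
]

def classify(record, training_set):
    """Return the class label of the training record closest to `record`."""
    nearest = min(training_set, key=lambda item: math.dist(item[0], record))
    return nearest[1]

# Explanatory variables of two new customers; the target variable is predicted.
new_customers = [(45, 85), (28, 35)]
predictions = [classify(customer, training) for customer in new_customers]
print(predictions)
```

The explanatory variables are the tuple of attributes, and the discrete target variable is the label returned by `classify`.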
Prediction

• The prediction task predicts the possible values of
missing or future data. Prediction involves
developing a model based on the available data
and using that model to predict future values
of a new data set of interest. For example, a model
can predict the income of an employee based on
education, experience and other demographic
factors such as place of residence and gender.
Prediction analysis is also used in other areas,
including medical diagnosis and fraud detection.
Time - Series Analysis
• A time series is a sequence of events in which the
next event is influenced by one or more of the
preceding events. A time series reflects the process
being measured, and certain components affect
the behavior of that process.
Time series analysis includes methods for analyzing
time-series data in order to extract useful
patterns, trends, rules and statistics. Stock market
prediction is an important application of
time-series analysis.
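One simple way to extract a trend from time-series data is a moving average. The sketch below is illustrative only; the price values and the window size of 3 are assumptions, not data from the slides.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive values to smooth out short-term swings."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Hypothetical daily closing prices.
prices = [10, 12, 11, 13, 15, 14, 16]
trend = moving_average(prices, 3)
print(trend)
```

The smoothed series rises steadily even though the raw prices move up and down, which is the kind of trend time-series analysis tries to expose.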
Outlier Analysis in Data Mining
What are Outliers?
• Outliers are an integral part of data analysis. An outlier is
an observation that lies far from the other observations.
• An outlier matters because it may indicate an error in the
experiment or a genuinely unusual event. Outlier analysis is used
extensively in areas such as fraud detection and spotting
potential new trends in the market.
• Outliers are often confused with noise. However, the two
differ in the following sense:
• Noise is random error, whereas an outlier is an observation
that is situated far from the other observations.
• Noise should be removed for better outlier detection.
Various causes of outliers in Data Mining

• Fraud in the banking sector, such as credit card
hacking or similar frauds.
• A change in the trend of a customer's buying
patterns.
• Typing errors and reporting errors made by
humans.
• Errors or faults in machines or systems.
What is the need of handling the outliers in Data Mining?

• Outliers can distort the results of an analysis.
• Outliers often lead to useful results and
conclusions, because they can reveal trends
or patterns.
• Outliers can also be valuable in research,
where they can point to new discoveries.
• Outlier analysis is a key branch of data mining.
Applications of Outlier Detection in Data Mining

• Outlier detection is used extensively in data mining to
find unusual patterns or trends in data. Its applications
are given below:
• Fraud Detection
• Telecom Fraud Detection
• Intrusion Detection in Cyber Security
• Medical Analysis
• Environment Monitoring such as Cyclone, Tsunami, Floods,
Drought and so on
• Noticing unforeseen entries in Databases
Different approaches in Outlier Detection
• There are three main approaches to
outlier detection:
• The Statistical Approach
• The Distance Based Approach
• The Deviation Based Approach
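The first two approaches can be sketched in a few lines. This is a minimal illustration under assumed parameters: the z-score threshold of 2.0, the radius of 3.0, the minimum neighbour count, and the sample amounts are all hypothetical choices, and the deviation-based approach is not shown.

```python
import statistics

def z_score_outliers(values, threshold=2.0):
    """Statistical approach: flag values far from the mean in standard-deviation units."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / spread > threshold]

def distance_outliers(values, radius=3.0, min_neighbours=2):
    """Distance-based approach: flag values with too few other values within `radius`."""
    flagged = []
    for i, v in enumerate(values):
        neighbours = sum(
            1 for j, other in enumerate(values) if j != i and abs(v - other) <= radius
        )
        if neighbours < min_neighbours:
            flagged.append(v)
    return flagged

# Hypothetical transaction amounts with one unusually large value.
amounts = [10, 11, 9, 10, 12, 10, 11, 50]
statistical_result = z_score_outliers(amounts)
distance_result = distance_outliers(amounts)
print(statistical_result, distance_result)
```

Both functions flag the same value here, but they can disagree on other data: the statistical test depends on the overall mean and spread, while the distance test only looks at each point's local neighbourhood.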
Regression in data mining
• Regression is a data mining technique used to
predict numeric values in a given data set.
For example, regression might be used to
predict the cost of a product or service, or other
numeric variables. It is also used in various
industries for business and marketing behavior
analysis, trend analysis, and financial forecasting.
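A minimal sketch of regression is an ordinary least-squares fit of a straight line to one predictor variable. The data (units ordered versus service cost) and the function name `fit_line` are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: units ordered (predictor) and service cost (response).
units = [1, 2, 3, 4, 5]
cost = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(units, cost)
predicted_cost = slope * 6 + intercept  # predict the cost for 6 units
print(slope, intercept, predicted_cost)
```

The fitted line is then used to predict a numeric value for an input that was not in the data.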
Application of Regression

• Regression is a very popular technique with
wide applications in business and industry.
The regression procedure involves a predictor
variable and a response variable. The major
applications of regression are given below.
• Environmental modeling
• Analyzing business and marketing behavior
• Financial prediction and forecasting
• Analyzing new trends and patterns
Difference between Regression and
Classification in data mining
• Regression and classification are quite similar to each other.
Both are significant prediction problems in data mining. In
each case you are given a training set of inputs and outputs
and learn a function that relates the two, which hopefully
enables you to predict outputs for inputs in new data. The
main difference is that in classification the outputs are
discrete, whereas in regression they are continuous. The
boundary is blurred by methods such as "logistic regression",
which estimates a continuous probability but is typically
used to assign a discrete class, so it can be interpreted as
either a classification or a regression method. As a result,
it can be difficult for the user to decide when to use
classification and when to use regression.
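The overlap can be seen in a small sketch of a logistic model: the logistic function returns a continuous probability (a regression-style output), and a threshold turns it into a discrete class (a classification-style output). The coefficients, the "store visits" input, and the 0.5 threshold are assumed values for illustration, not fitted from data.

```python
import math

def logistic(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical, hand-picked coefficients (not learned from data).
WEIGHT = 0.8
BIAS = -4.0

def purchase_probability(visits):
    """Continuous output: estimated probability that a customer purchases."""
    return logistic(WEIGHT * visits + BIAS)

def purchase_class(visits, threshold=0.5):
    """Discrete output: class label obtained by thresholding the probability."""
    if purchase_probability(visits) >= threshold:
        return "purchase"
    return "don't purchase"

for visits in (2, 8):
    print(visits, round(purchase_probability(visits), 3), purchase_class(visits))
```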
Association
• Association discovers the connections among a set of
items and identifies the relationships between objects.
Association analysis is used for commodity management,
advertising, catalog design, direct marketing and so on.
A retailer can identify the products that customers
normally purchase together, or find the customers who
respond to promotions of the same kind of products. If
a retailer finds that beer and nappies are mostly bought
together, the retailer can put nappies on sale to
promote the sale of beer.
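How often items are bought together is commonly measured with support and confidence. The sketch below computes both for the rule "beer implies nappy" over a handful of hypothetical baskets; the transactions and the helper name `count_containing` are assumptions made for illustration.

```python
def count_containing(transactions, items):
    """Count the transactions that contain every item in `items`."""
    return sum(1 for basket in transactions if items <= basket)

# Hypothetical shopping baskets.
baskets = [
    {"beer", "nappy", "chips"},
    {"beer", "nappy"},
    {"beer", "bread"},
    {"nappy", "milk"},
    {"beer", "nappy", "milk"},
]

both = count_containing(baskets, {"beer", "nappy"})
beer = count_containing(baskets, {"beer"})

support = both / len(baskets)  # share of all baskets containing beer and nappy
confidence = both / beer       # share of beer baskets that also contain nappy
print(support, confidence)
```

A high confidence for a rule suggests that promoting one item is likely to reach buyers of the other.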
Clustering
• Clustering is used to identify data objects that are
similar to one another. The similarity can be
decided based on a number of factors like
purchase behavior, responsiveness to certain
actions, geographical locations and so on. For
example, an insurance company can cluster its
customers based on age, residence, income etc.
This group information will be helpful to
understand the customers better and hence
provide better customized services.
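As a rough sketch of the insurance example, the following groups customers by a single attribute (age) using a basic k-means loop. The ages, the choice of two clusters, and the starting centroids are hypothetical; a real analysis would use several attributes and a more careful initialization.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group numbers around the nearest centroid, then move each centroid to its group mean."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(group) / len(group) if group else centroids[i]
            for i, group in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customer ages, with assumed starting centroids.
ages = [22, 25, 27, 45, 48, 52]
centroids, clusters = k_means_1d(ages, centroids=[22.0, 52.0])
print(clusters)
```

No class labels are supplied; the groups emerge from similarity alone, which is what distinguishes clustering from classification.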
Summarization

• Summarization is the generalization of data. A set of
relevant data is summarized, resulting in a smaller
set that gives aggregated information about the data. For
example, the shopping done by a customer can be
summarized into total products, total spending, offers
used, etc. Such high-level summarized information
can be useful to sales or customer relationship teams
for detailed customer and purchase behavior analysis.
Data can be summarized at different abstraction
levels and from different angles.
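The shopping example can be sketched as a simple aggregation. The purchase records and field names below are hypothetical.

```python
# Hypothetical purchase records for one customer.
purchases = [
    {"product": "milk", "price": 2.5, "offer_used": False},
    {"product": "bread", "price": 4.0, "offer_used": True},
    {"product": "coffee", "price": 10.5, "offer_used": True},
]

def summarize(records):
    """Reduce detailed purchase records to a few aggregated figures."""
    return {
        "total_products": len(records),
        "total_spending": sum(r["price"] for r in records),
        "offers_used": sum(1 for r in records if r["offer_used"]),
    }

summary = summarize(purchases)
print(summary)
```

The same records could be summarized at another level of abstraction, for example per product category or per month.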
