Introduction to Data Mining
Table of Contents
Data-Mining Application
A Strategy for Data Mining: CRISP-DM
Stages and Tasks in CRISP-DM
Life Cycle of a Data Mining Project
Skills Needed for Data Mining
Objectives
1 Business Understanding
2 Data Understanding
3 Data Preparation
4 Modeling
5 Evaluation
6 Deployment
Stage 1: Business Understanding
Measures of success:
the initial assessment is tied directly to the model's predictive accuracy
in the long run, the success of a data-mining effort is measured by concrete factors
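Predictive accuracy, the initial measure named above, can be sketched as the share of held-out records the model labels correctly. This is a minimal illustration only: the function name `accuracy` and the sample labels are hypothetical and do not come from the slides.

```python
def accuracy(actual, predicted):
    """Return the fraction of records whose predicted label matches the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one record")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical holdout labels, invented for illustration.
actual = ["churn", "stay", "stay", "churn", "stay"]
predicted = ["churn", "stay", "churn", "churn", "stay"]

print(accuracy(actual, predicted))  # 4 of 5 records match
```

Accuracy on a holdout set is only the first check; it says nothing yet about the longer-run factors by which the effort is ultimately judged.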
Data-Mining Success (2 of 4)
Monitoring:
after deployment, collect data to assess the model’s success
Data-Mining Success (3 of 4)
Cost of errors:
there will always be errors, and some of them carry a high cost
Bad data:
no data-mining algorithm can compensate for large amounts of error in the data
never scrimp on the time spent on data preparation and cleaning
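The cost-of-errors point can be made concrete by weighting each kind of mistake rather than counting mistakes equally. The sketch below assumes a two-class problem and per-error costs chosen purely for illustration; `total_error_cost`, the labels, and the cost figures are all hypothetical.

```python
def total_error_cost(actual, predicted, cost_false_positive, cost_false_negative, positive=1):
    """Sum the cost of all false positives and false negatives."""
    false_positives = sum(
        1 for a, p in zip(actual, predicted) if p == positive and a != positive
    )
    false_negatives = sum(
        1 for a, p in zip(actual, predicted) if p != positive and a == positive
    )
    return false_positives * cost_false_positive + false_negatives * cost_false_negative


# Hypothetical labels: both models make exactly one mistake on six records.
actual = [1, 0, 0, 1, 0, 1]
model_a = [1, 1, 0, 1, 0, 1]  # one false positive
model_b = [1, 0, 0, 0, 0, 1]  # one false negative

# Assumed costs: a missed positive is far more expensive than a false alarm.
print(total_error_cost(actual, model_a, cost_false_positive=5, cost_false_negative=100))
print(total_error_cost(actual, model_b, cost_false_positive=5, cost_false_negative=100))
```

Both hypothetical models have the same accuracy, yet their error costs differ widely under these assumed costs, which is why accuracy alone can mislead.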
Data-Mining Failure (2 of 4)
Organizational resistance:
difficulties implementing a solution are still part of the whole data-mining effort
to address resistance, educate and convince others about the potential benefits of the solution
consider implementation in only a portion of the organization
Data-Mining Failure (3 of 4)
Database knowledge:
the database administrator plays an important role:
Which data tables or files are available?
How are they linked?
How are the fields coded?
What are reasonable data values?
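Once the database administrator has said which values are reasonable, that knowledge can be turned into an automated check. The sketch below is one possible approach, not a prescribed method: the field names, ranges, and the helper `find_out_of_range` are hypothetical.

```python
def find_out_of_range(records, valid_ranges):
    """Return (record index, field, value) for every missing or out-of-range value."""
    problems = []
    for index, record in enumerate(records):
        for field, (low, high) in valid_ranges.items():
            value = record.get(field)
            if value is None or not (low <= value <= high):
                problems.append((index, field, value))
    return problems


# Hypothetical ranges agreed with the database administrator.
valid_ranges = {"age": (0, 120), "income": (0, 10_000_000)}

# Hypothetical records, invented for illustration.
records = [
    {"age": 34, "income": 52000},
    {"age": -1, "income": 48000},   # negative age is not a reasonable value
    {"age": 51, "income": None},    # missing income
]

for index, field, value in find_out_of_range(records, valid_ranges):
    print(f"record {index}: field '{field}' has suspicious value {value!r}")
```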
Skills Needed for Data Mining (3 of 4)
fine-tuning techniques
identify anomalies
Skills Needed for Data Mining (4 of 4)
data-mining algorithms
project management