Data Mining
Data Mining
pattern
1. A consistent, trait, feature, or method.
DATA ANALYTICS/PREDICTIVE
ANALYTICS
Data analytics is distinguished from Data mining by the scope, purpose
and focus of the analysis. Data miners sort through huge data sets using
sophisticated software to identify undiscovered patterns and establish
hidden relationships. Data analytics focuses on inference, the process of
deriving a conclusion based solely on what is already known by the
researcher.
Uses lower level of Granularity, meaning it looks at the
individual level. Instead of looking at which candidate will
win the Presidential election in the state of Ohio, which is
forecasting. It looks at the individual level.
Which person is voting for or against.
Predicts which individuals can be persuaded, which ones will
not change, etc. Now with this information we ca change the
outcome of the race.
Obama used this technique very well.
EMERGING TECHNOLOGY
DATA MINING: BUSINESS
• What is it?
Decision making
Marketing
Detecting Fraud
This technology is popular with many businesses because it allows them to learn more
about their customers, prevent frauds and identity theft, and also make smart marketing
decisions
Keys to a Successful Data Mining Project
• Knowledgeable personnel
• Appropriate algorithms
PRIMARY TASKS OF DATA MINING
Classification classify a data item into one of
several predefined classes
With market segmentation, you will be able to find behaviors that are
common among your customers. As a company seeks customer’s trends,
it helps them find necessities in order to help them improve their
business.
DATA MINING APPLICATIONS
CUSTOMER CHURN
Market basket analysis- involves researching customer
characteristics in respect to their purchase patterns
MARKET BASKET
Examples of real life.
• Mathematical taxonomy
• Concept clustering
Class identification, which consists of mathematical taxonomy and concept
clustering. Mathematical taxonomy focuses on what makes the members of
a certain class similar, as opposed to differentiating one class from another.
For example, Ralphs can classify its customers based on their income or past
purchases
Consider the pattern a purchase of toys for age group 3–5 years, is followed by
purchase of kid’s bicycle within 6 months about 90% of the time by high
income customers, which was discovered by data mining. The Company can
identify the prospective customers for kid’s bicycle based
on toy purchase details and adjust the mail catalog accordingly.
• What to produce?
• Equal Success
PRIVACY
•Sensitive information
Employers try to
predict churn
RESPONSE BIAS ISSUES
• Types
o –Coverage or frame error
o –Sampling error
o –Nonresponsive error
o –Measurement error
• Flawed data
The most recent and most promising use of data mining has been the
development of data mining
tools for the medical sector. The use of
data mining to extract patterns from medical data provides near endless
opportunities for symptom trend detection, earlier detection of illness, DNA
trend analysis and improved patient reactions to medicines. These many
advantages allow doctors and hospitals to be more effective and more efficient.
• Symptom trends
• Data analysis
• Privacy
DATA MINING - MEDICAL
How data mining is actually used to analyze individual data can become quite
complex due to the data. The goal of the process is to take the medical data
which contain many attributes and determine which ones are actually relevant to
the diagnosis, symptom or result.
Two methods used in medical data mining are clustering, discussed previously
and biclustering.