
Data Mining

This document provides an overview of data mining and data warehousing. It begins with a brief history of databases from the 1960s to present. It then defines data mining as the process of locating hidden patterns in large data sets. The main models of predictive and descriptive data mining are described. The data mining process involves problem definition, data exploration, model building, and evaluation. Applications of data mining discussed include business analysis, security, customer relationships, and healthcare. Data warehousing is then introduced as a way to integrate data from multiple sources for analysis. Key aspects of data warehousing architectures, building, updating, and mining warehouses are outlined.

Uploaded by

Hoa Ha

Yogesh Benawat

Sameer Deshmukh
Outline
 Data Mining
 Data Warehousing
 Conclusion
 Q ‘n’ A
Historical Perspective
 1960s:
 Data collection, database creation, IMS and network DBMS
 1970s:
 Relational data model, relational DBMS implementation
 1980s:
 RDBMS, advanced data models (extended-relational, OO, deductive, etc.) and application-oriented DBMS (spatial, scientific, engineering, etc.)
 1990s-2000s:
 Data mining and data warehousing, multimedia databases, and Web databases
Data Mining
Definition
 Data mining automates the process of locating and extracting hidden patterns and knowledge from large data sets
 In simple words
 Searching for new knowledge
Why we need data mining
 Data explosion problem
 Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories
 We are drowning in data, but starving for knowledge!
 Solution: Data mining
 Data warehousing and on-line analytical processing
 Extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases
Data Mining Models

 Predictive Model

 Descriptive Model
Predictive Model
 Prediction
 determining how certain attributes will behave in the future
 Regression
 mapping a data item to a real-valued prediction variable
 Classification
 categorization of data based on combinations of attributes
 Time series analysis
 examining values of attributes with respect to time
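As one illustration of the regression idea above (mapping a data item to a real-valued prediction variable), here is a minimal sketch of an ordinary least squares line fit in plain Python. The function names `fit_line` and `predict` and the spend/sales figures are hypothetical, chosen only for this example; they do not come from the slides.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Map a data item x to a real-valued prediction."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: advertising spend -> sales.
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

model = fit_line(spend, sales)
forecast = predict(model, 5.0)  # prediction for an unseen spend value
```

The fitted line is then used to predict a value for an input that was not in the training data, which is what makes the model predictive rather than descriptive.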
Descriptive Model
 Clustering
 grouping the most closely related data items together into clusters
 Data Summarization
 extracting representative information about the database
 Association Rules
 associations identified between data items to form relationships
 Sequence Discovery
 determining sequential patterns in data based on the time sequence of actions
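Association rules are usually scored with two measures, support and confidence. The sketch below computes both for a rule such as "bread implies milk" over a handful of hypothetical shopping baskets. The basket data and the helper names `support` and `confidence` are assumptions made for illustration only.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base if base else 0.0


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
```

A rule is typically kept only if both measures exceed thresholds chosen by the analyst; the thresholds themselves are a design decision, not something the slides specify.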
Data mining process
 Problem Definition
 Creating Database
 Exploring Database
 Preparation for creating a data mining model
 Building Data Mining Model
 Evaluation Phase
 Deploying the Data Mining Model
Fig. General Phases of Data Mining Process
Who needs data mining?
 "Whoever has information fastest and uses it wins"
 Don Keough, former president of Coca-Cola
 Businesses are looking for new ways to let end users find the data they need to:
 Make decisions
 Serve customers
 Gain the competitive edge
Applications
 Business analysis and management
 Computer security
 Customer relationships analysis and management
 Telecommunication analysis and management
 News and entertainment
 Bioinformatics and Healthcare analysis
Summary
 Need for data mining
 Data mining models
 Process of data mining
 Some applications
Data Warehousing
Data Warehousing
 Data Warehouse
 What is a Data Warehouse?
 Database & Data Warehouse
 How to distinguish?
 Purpose
 Database: Transactional
 Data Warehouse: Intended for decision support applications
 Functionality
 Optimized for data retrieval, not routine transaction processing
 Structure
 Performance
Data Warehousing
 A modern organization's needs?
 Companies are spread worldwide.
 They have
 Many data sources
 Different operational systems
 Different schemas
 They need data for
 Complex analysis
 Knowledge discovery
 Decision making
Solution?
Data Warehousing
 Solution: Data Warehouse
 Data Warehouse: Definition?
 No single definition
 Data Warehouse
 Collection of information gathered from multiple sources, stored under a unified schema, at a single site and mainly intended for decision support applications.
 A subject-oriented, integrated, nonvolatile, time-variant collection of data in support of management’s decisions.
~ W.H. Inmon
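To make "multiple sources, stored under a unified schema" concrete, the following sketch loads two hypothetical sources with different field names and units into one list of rows that share a single schema. The source layouts, the mapper functions `from_a` and `from_b`, and the unified field names are all assumptions for illustration; a real warehouse would use a DBMS rather than an in-memory list.

```python
from datetime import date

# Hypothetical operational sources with different schemas.
source_a = [{"cust_id": 1, "amt": "120.50", "dt": "2004-01-15"}]
source_b = [{"customer": 2, "total_cents": 9900, "day": date(2004, 1, 16)}]


def from_a(row):
    """Map a source A row (string amount, ISO date string) to the unified schema."""
    return {
        "customer_id": row["cust_id"],
        "amount": float(row["amt"]),
        "sale_date": date.fromisoformat(row["dt"]),
        "source": "A",
    }


def from_b(row):
    """Map a source B row (amount in cents, date object) to the unified schema."""
    return {
        "customer_id": row["customer"],
        "amount": row["total_cents"] / 100,
        "sale_date": row["day"],
        "source": "B",
    }


def load_warehouse(sources):
    """Apply each source's mapper and collect all rows at a single site."""
    warehouse = []
    for rows, mapper in sources:
        warehouse.extend(mapper(r) for r in rows)
    return warehouse


warehouse = load_warehouse([(source_a, from_a), (source_b, from_b)])
```

Keeping one mapper per source isolates schema differences at load time, so analysis queries only ever see the unified field names.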
Warehouses are Very Large Databases
Fig. Bar chart of the percentage of respondents (0% to 35%) by warehouse size (5GB, 5-9GB, 10-19GB, 20-49GB, 50-99GB, 100-249GB, 250-499GB, 500GB-1TB), comparing initial size with size projected for 2Q96. Source: META Group, Inc.
Data Warehousing
 Data Warehouse - Architecture
Fig. Data Source 1, Data Source 2, ..., Data Source n → Data Loaders → Data Warehouse (DBMS) → Data Mining, OLAP, DSSI, ESI
Data Warehousing
 Data Warehouse building
 When & how to gather data
 Source-driven architecture
 Destination-driven architecture
 What schema to use
 Data Cleansing
 Task of correcting and processing data
 How to propagate updates
 What data to summarize
 And many more……
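As a rough illustration of data cleansing (correcting and processing data before it enters the warehouse), the sketch below normalises whitespace and capitalisation, drops records that cannot be repaired, and removes duplicates. The record layout and the `cleanse` function are hypothetical; real cleansing rules depend on the sources involved.

```python
def cleanse(records):
    """Normalise names and cities, drop rows with no name, remove duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        name = (rec.get("name") or "").strip().title()
        city = (rec.get("city") or "").strip().title() or None
        if not name:
            continue  # a record with no name cannot be repaired here
        key = (name, city)
        if key in seen:
            continue  # same customer already loaded
        seen.add(key)
        cleaned.append({"name": name, "city": city})
    return cleaned


# Hypothetical raw rows gathered from different sources.
raw = [
    {"name": "  alice smith ", "city": "pune"},
    {"name": "Alice Smith", "city": "Pune "},
    {"name": "", "city": "Mumbai"},
    {"name": "bob rao", "city": None},
]

clean = cleanse(raw)
```

In a source-driven architecture each source would send rows like these to the warehouse on its own schedule, while in a destination-driven architecture the warehouse would request them periodically; the cleansing step applies in either case.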
Summary
 What is Data Warehousing?
 Data Warehouse.
 Data Warehouse – Architecture
 Data Warehouse vs. Data Mining
Conclusion
 Your data is full of undiscovered gems; start digging!
Q ‘n’ A
Thank You!
