Data Mining: Chapter 1

Uploaded by

Hasibur Rahman
Copyright
© All Rights Reserved

Chapter 1 (Introduction)

➢ Data Warehousing vs. Data Mining:

A data warehouse is a repository of information collected from multiple data sources.
Data mining is the process of extracting interesting patterns or knowledge from huge
amounts of data. An alternative name for data mining is knowledge discovery from data
(KDD), although data mining is also regarded as one step within the KDD process.

➢ Steps of the KDD (Knowledge Discovery from Data) Process:

The KDD process consists of the following steps, in order:
• Data cleaning
• Data integration
• Data selection
• Data transformation
• Data mining
• Pattern evaluation
• Knowledge presentation
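The steps above can be sketched as a small pipeline. The following is a minimal illustration in Python, not a definitive implementation: the two record sources, the field names, and the use of simple item counts as the "pattern" are all hypothetical and chosen only to make each step visible.

```python
from collections import Counter

# Hypothetical raw data from two sources (assumed for illustration only).
store_a = [
    {"customer": "c1", "item": "Milk ", "amount": "2.50"},
    {"customer": "c2", "item": "bread", "amount": None},  # missing value
    {"customer": "c3", "item": "milk", "amount": "2.40"},
]
store_b = [
    {"customer": "c4", "item": "MILK", "amount": "2.60"},
    {"customer": "c5", "item": "eggs", "amount": "3.10"},
]


def clean(records):
    """Data cleaning: drop records with missing values, normalize item text."""
    return [
        {**r, "item": r["item"].strip().lower()}
        for r in records
        if all(v is not None for v in r.values())
    ]


def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [r for source in sources for r in source]


def select(records, fields):
    """Data selection: keep only the fields relevant to the analysis."""
    return [{f: r[f] for f in fields} for r in records]


def transform(records):
    """Data transformation: convert amounts from text to numbers."""
    return [{**r, "amount": float(r["amount"])} for r in records]


def mine(records):
    """Data mining: count how often each item occurs (a toy 'pattern')."""
    return Counter(r["item"] for r in records)


def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns that occur often enough."""
    return {item: n for item, n in patterns.items() if n >= min_count}


def present(patterns):
    """Final step: present the mined knowledge in a readable form."""
    return [f"{item}: bought {n} times" for item, n in sorted(patterns.items())]


cleaned = integrate(clean(store_a), clean(store_b))
prepared = transform(select(cleaned, ["item", "amount"]))
report = present(evaluate(mine(prepared), min_count=2))
print(report)
```

Each function corresponds to one step in the list, so the order of the calls at the bottom mirrors the order of the process. In practice the steps are often revisited and repeated rather than run once straight through.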

➢ What kinds of data can be mined?

Data mining can be applied to the following kinds of data sets:
• Database-oriented data sets:
- Relational databases
- Data warehouses
- Transactional databases
• Advanced data sets:
- Data streams and sensor data
- Time-series data, sequence data
- Object-relational databases
- Heterogeneous databases
- Structured data, graphs, social networks, multi-linked data
- Multimedia databases
- Text databases
- The World-Wide Web

➢ What kinds of patterns/knowledge can be mined?

The following kinds of patterns or knowledge can be mined:
• Characterization and discrimination
• Frequent patterns, associations, and correlations
• Classification and regression
• Clustering
• Outlier analysis
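To make one of these pattern types concrete, the following minimal Python sketch counts how often pairs of items occur together, which is the basic idea behind frequent pattern mining. The transactions and the minimum support threshold are made up for illustration, and the brute-force counting shown here is only a sketch; practical algorithms such as Apriori or FP-growth are considerably more involved.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions (assumed for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "eggs"},
    {"bread", "milk", "eggs"},
]


def frequent_pairs(transactions, min_support):
    """Return item pairs found together in at least min_support transactions."""
    counts = Counter()
    for basket in transactions:
        # Sort so each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


pairs = frequent_pairs(transactions, min_support=2)
print(pairs)
```

With this data, only bread and milk appear together often enough to pass the threshold.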

➢ Characterization and Discrimination

Data characterization is a summarization of the general characteristics or features of a
target class of data. Data discrimination is a comparison of the general features of
the target class data objects against the general features of objects from one or multiple
contrasting classes.
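The difference between the two can be shown with a small Python sketch. The customer records, the `segment` labels, and the choice of the mean as the summary are assumptions made for illustration only: characterization summarizes one class, and discrimination compares that summary against a contrasting class.

```python
from statistics import mean

# Hypothetical customer records (assumed for illustration only).
customers = [
    {"segment": "frequent", "age": 34, "yearly_spend": 5200},
    {"segment": "frequent", "age": 41, "yearly_spend": 6100},
    {"segment": "occasional", "age": 23, "yearly_spend": 400},
    {"segment": "occasional", "age": 29, "yearly_spend": 650},
]

FEATURES = ["age", "yearly_spend"]


def characterize(records, target):
    """Characterization: summarize the general features of the target class."""
    rows = [r for r in records if r["segment"] == target]
    return {f: mean(r[f] for r in rows) for f in FEATURES}


def discriminate(records, target, contrast):
    """Discrimination: compare the target class with a contrasting class."""
    t = characterize(records, target)
    c = characterize(records, contrast)
    return {f: t[f] - c[f] for f in FEATURES}


summary = characterize(customers, "frequent")
difference = discriminate(customers, "frequent", "occasional")
print(summary)
print(difference)
```

Here `summary` describes the frequent customers on their own, while `difference` shows how far their averages lie from those of the occasional customers.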

➢ Which Technologies are Used?

Data mining draws on technologies from a confluence of multiple disciplines.
These are:
• Machine learning
• Pattern recognition
• Statistics
• High-performance computing
• Visualization
• Applications
• Algorithms
• Database technology

➢ Applications of Data Mining:

It is said that “where there are data, there are data mining applications”. Applications include:
• Retail and telecommunication
• Banking and stock market analysis
• Text and web mining
• Biological and medical data analysis
• Web page analysis
• Data mining and software engineering
• Search engines

➢ Major Issues in Data Mining:

“Life is short but art is long.” The major issues in data mining are grouped into five
categories. These are:
• Mining methodology
• User interaction
• Efficiency and scalability
• Diversity of data types
• Data mining and society
