Chapter 1 (Introduction)
Data Warehousing vs. Data Mining:
A data warehouse is a repository of information collected from multiple data sources. Data mining is the process of extracting interesting patterns or knowledge from huge amounts of data. An alternative name for data mining is knowledge discovery from data (KDD).
Steps of KDD (Knowledge Discovery from Data) Process:
The steps are:
- Data cleaning
- Data integration
- Data selection
- Data transformation
- Data mining
- Pattern evaluation
- Knowledge representation
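The KDD steps listed above can be illustrated with a small sketch. This is only a toy example under stated assumptions: the two store record lists, the function names, and the choice of item-pair counting as the mining step are all hypothetical and are not taken from the notes.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records from two sources: (transaction id, item).
store_a = [(1, "milk"), (1, "bread"), (2, "milk"), (2, None)]
store_b = [(3, "Milk "), (3, "bread"), (4, "bread"), (4, "eggs")]

def clean(records):
    # Data cleaning: drop records with a missing item, normalize the text.
    return [(tid, item.strip().lower()) for tid, item in records if item]

def integrate(*sources):
    # Data integration: combine several sources into one collection.
    return [rec for src in sources for rec in src]

def select(records, wanted):
    # Data selection: keep only the items relevant to the analysis.
    return [(tid, item) for tid, item in records if item in wanted]

def transform(records):
    # Data transformation: group the items into one set per transaction.
    baskets = {}
    for tid, item in records:
        baskets.setdefault(tid, set()).add(item)
    return baskets

def mine(baskets):
    # Data mining: count how often each pair of items occurs together.
    counts = Counter()
    for items in baskets.values():
        counts.update(combinations(sorted(items), 2))
    return counts

def evaluate(counts, min_count):
    # Pattern evaluation: keep only the pairs that meet the threshold.
    return {pair: n for pair, n in counts.items() if n >= min_count}

def present(patterns):
    # Knowledge representation: turn the patterns into readable statements.
    return [f"{a} and {b} bought together {n} times"
            for (a, b), n in sorted(patterns.items())]

records = integrate(clean(store_a), clean(store_b))
records = select(records, {"milk", "bread", "eggs"})
patterns = evaluate(mine(transform(records)), min_count=2)
report = present(patterns)
print(report)
```

Each function corresponds to one step, so the order of the calls mirrors the order of the list.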
What kind of data (area) can be mined?
Data can be mined from the following kinds of data sets:
Database-oriented data sets:
- Relational database
- Data warehouse
- Transactional database
Advanced data sets:
- Data streams and sensor data
- Time-series data, sequence data
- Object-relational database
- Heterogeneous data sets
- Structured data, graphs, social networks, multi-linked data
- Multimedia database
- Text databases
- The World-Wide Web
What kinds of pattern/knowledge can be mined?
The following kinds of patterns or knowledge can be mined:
- Characterization and discrimination
- Frequent patterns, association and correlation
- Classification and regression
- Clustering
- Outliers
Characterization and Discrimination
Data characterization is a summarization of the general characteristics or features of a target class of data. Data discrimination is a comparison of the general features of the target class data objects against the general features of objects from one or more contrasting classes.
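The difference between the two can be shown with a minimal sketch. The customer records, the "frequent" and "rare" groups, and the use of feature averages as the summary are illustrative assumptions, not part of the notes.

```python
from statistics import mean

# Hypothetical customer records; "group" is the class label.
customers = [
    {"group": "frequent", "age": 30, "spend": 200},
    {"group": "frequent", "age": 40, "spend": 300},
    {"group": "rare", "age": 50, "spend": 50},
    {"group": "rare", "age": 60, "spend": 150},
]

def characterize(records, group, features):
    # Characterization: summarize the general features of one target class
    # (here, simply the average of each feature).
    rows = [r for r in records if r["group"] == group]
    return {f: mean(r[f] for r in rows) for f in features}

def discriminate(records, target, contrast, features):
    # Discrimination: compare the target class summary against the
    # summary of a contrasting class, feature by feature.
    t = characterize(records, target, features)
    c = characterize(records, contrast, features)
    return {f: t[f] - c[f] for f in features}

profile = characterize(customers, "frequent", ["age", "spend"])
difference = discriminate(customers, "frequent", "rare", ["age", "spend"])
print(profile)
print(difference)
```

Here `profile` describes the target class on its own, while `difference` describes it relative to the contrasting class.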
Which Technologies are used?
Data mining draws on a confluence of multiple disciplines. These are:
- Machine learning
- Pattern recognition
- Statistics
- High-performance computing
- Visualization
- Applications
- Algorithms
- Database technology
Applications of Data Mining:
It is said that “Where there are data, there are data mining applications”. Applications include:
- Retail and telecommunication
- Banking and stock market analysis
- Text and web mining
- Biological and medical data analysis
- Web page analysis
- Data mining and software engineering
- Search engines
Major Issues in Data Mining:
“Life is short but art is long.” The major issues are grouped into five categories:
- Mining methodology
- User interaction
- Efficiency and scalability
- Diversity of data types
- Data mining and society