Chapter 1 Introduction
Booch, 2013
April 20, 2024 Data Mining: Concepts and Techniques 3
Big Data Applications
Brodsky, 2013
Syllabus
Instructor: Prof. Jianlin Cheng
My Teaching
My Research
Office Hours: EBW 109, MoWe: 4 – 5
TA (Xiaokai Qian)
Objectives
Textbook
Assignments
Projects
Grading
Course web site:
http://calla.rnet.missouri.edu/cheng_courses/datamining2016/
Coverage of Topics
[Figure: process diagram with the stages Databases, Data Cleaning and Data Integration, Data Warehouse, Data Selection, Task-relevant Data]
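The stages named in the figure (cleaning, integration, selection) can be sketched on toy data. This is only a minimal illustration, not the course's implementation: the record layouts, field names, and the helper functions `clean`, `integrate`, and `select` are all hypothetical.

```python
# Hypothetical toy records from two source databases; all field names are made up.
sales_db = [
    {"id": 1, "customer": "Ann", "amount": "120.5"},
    {"id": 2, "customer": "bob ", "amount": None},   # missing amount
    {"id": 3, "customer": "Cara", "amount": "80"},
]
crm_db = [
    {"id": 1, "region": "West"},
    {"id": 3, "region": "East"},
]

def clean(records):
    """Data cleaning: drop rows with a missing amount, normalize names and types."""
    cleaned = []
    for r in records:
        if r["amount"] is None:
            continue
        cleaned.append({
            "id": r["id"],
            "customer": r["customer"].strip().title(),
            "amount": float(r["amount"]),
        })
    return cleaned

def integrate(sales, crm):
    """Data integration: combine the two sources on the shared id key."""
    region_by_id = {r["id"]: r["region"] for r in crm}
    return [dict(r, region=region_by_id.get(r["id"])) for r in sales]

def select(warehouse, region):
    """Data selection: keep only the rows and columns relevant to the task."""
    return [
        {"customer": r["customer"], "amount": r["amount"]}
        for r in warehouse
        if r["region"] == region
    ]

warehouse = integrate(clean(sales_db), crm_db)   # stands in for the data warehouse
task_relevant = select(warehouse, region="West")
print(task_relevant)
```

The list `task_relevant` plays the role of the task-relevant data that a mining step would then consume.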
Data Mining and Business Intelligence
Data Exploration: Statistical Summary, Querying, and Reporting
[Figure: Data Mining at the center, surrounded by Database Technology, Statistics, Machine Learning, Visualization, Pattern Recognition, Algorithm, Other Disciplines]
http://www.datasciencecentral.com/forum/topics/the-3vs-that-define-big-data
Multi-Dimensional View of Data Mining
Data to be mined
Relational, data warehouse, transactional, stream, object-oriented/relational, active, spatial, time-series, text, multi-media, heterogeneous, legacy, WWW
Knowledge to be mined
Characterization, discrimination, association, classification, clustering, trend/deviation, outlier analysis, etc.
Multiple/integrated functions and mining at multiple levels
Techniques utilized
Database-oriented, data warehouse (OLAP), machine learning, statistics, visualization, etc.
Applications adapted
Retail, telecommunication, banking, fraud analysis, bio-data mining, stock market analysis, text mining, Web mining, etc.
General functionality
Descriptive data mining (Democrat <-> Republican)
Predictive data mining
Different views lead to different classifications
Data view: Kinds of data to be mined
Knowledge view: Kinds of knowledge to be discovered
Method view: Kinds of techniques utilized
Application view: Kinds of applications adapted
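The descriptive versus predictive distinction listed above can be made concrete with a small sketch. The data set and the function names `describe` and `predict` are hypothetical, and the nearest-centroid rule is just one simple predictive technique chosen for illustration.

```python
from statistics import mean

# Hypothetical toy data: (hours_studied, passed) pairs; the values are made up.
data = [(1, 0), (2, 0), (3, 0), (6, 1), (7, 1), (8, 1)]

def describe(rows):
    """Descriptive mining: summarize each class by its mean hours studied."""
    by_class = {}
    for hours, label in rows:
        by_class.setdefault(label, []).append(hours)
    return {label: mean(values) for label, values in by_class.items()}

def predict(rows, hours):
    """Predictive mining: assign an unseen value to the class with the nearest mean."""
    centroids = describe(rows)
    return min(centroids, key=lambda label: abs(centroids[label] - hours))

summary = describe(data)
print(summary)
print(predict(data, 5))
```

Here `describe` only characterizes data that already exists, while `predict` uses that characterization to infer a label for a value that was never observed.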
Outlier analysis
Outlier: a data object that does not comply with the general behavior of the data
Noise or exception? Useful in fraud detection and rare-event analysis
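One common way to flag objects that deviate from the general behavior of the data is the interquartile-range (IQR) rule. The sketch below is an assumption-laden illustration, not the method prescribed by the course: the function name `find_outliers`, the multiplier 1.5, and the sample values are all chosen for the example.

```python
from statistics import quantiles

def find_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = quantiles(values, n=4)   # quartile cut points (Python 3.8+)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical transaction amounts; one value is far from the rest.
amounts = [10, 12, 11, 13, 12, 11, 95]
print(find_outliers(amounts))
```

Whether a flagged value is noise to discard or an exception worth investigating, as in fraud detection, remains a domain judgment that the rule itself cannot make.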
Conferences
... practices of Knowledge Discovery and Data Mining (PKDD)
Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD)
Journals
Data Mining and Knowledge Discovery (DAMI or DMKD)
IEEE Trans. on Knowledge and Data Eng. (TKDE)
KDD Explorations
ACM Trans. on KDD
Where to Find References? DBLP, CiteSeer, Google
[Figure: system architecture with the components Pattern Evaluation, Data Mining Engine, Knowledge-Base, Database or Data Warehouse Server]