Topic 1b - History, Evolution and DM Classification

The document provides an overview of data mining, including its history, evolution, and classification. It discusses the relationship between data mining and knowledge discovery, as well as the various techniques and applications of data mining in commercial and scientific contexts. Key motivations for data mining's growth are highlighted, such as advancements in data generation technologies and the increasing volume of data collected.

Topic 1 (Part 2)

HISTORY, EVOLUTION AND CLASSIFICATION OF DATA MINING

Ts. Dr. Tuan Norhafizah Tuan Zakaria
Objectives
1. To introduce Data Mining (DM) and its relationship with data and knowledge
2. To discuss the history, evolution and motivation of DM
3. To discuss DM techniques, tasks, applications and some major issues
History of Data Mining

The term “data mining” appeared around 1990 in the database community.

Gregory Piatetsky-Shapiro coined the term “Knowledge Discovery in Databases” for the first workshop on the topic (KDD-1989), and this term became more popular in the AI and machine learning community.

Currently, Data Mining and KDD are used interchangeably.

Since about 2007 the term “Predictive Analytics”, and since 2011 the term “Data Science”, have also been used to describe this field (Source: Coenen, 2011).
Origin of Data Mining
• Draws ideas from machine learning/AI, pattern recognition, statistics, and database systems
• Traditional techniques may be unsuitable due to data that is large-scale, high dimensional, heterogeneous, complex and distributed
• A key component of the emerging field of data science and data-driven discovery
[Figure: Data Mining shown as drawing on Statistics; AI, Machine Learning and Pattern Recognition; and Database systems]
Evolution of Data Mining
Data Collection (1960s)
• Enabling technologies: computers, tapes, disks
• Business question: “What was my total revenue in the last five years?”
• Characteristics: retrospective, static data delivery
Data Access (1980s)
• Enabling technologies: RDBMS, SQL, ODBC
• Business question: “What were unit sales in New England last March?”
• Characteristics: retrospective, dynamic data delivery at record level
Data Warehousing (1990s)
• Enabling technologies: OLAP, multidimensional databases, data warehouses
• Business question: “What were unit sales in New England last March? Drill down to Boston.”
• Characteristics: retrospective, dynamic data delivery at multiple levels
Data Mining (Emerging Today)
• Enabling technologies: advanced algorithms, multiprocessor computers, massive databases
• Business question: “What’s likely to happen to Boston unit sales next month? Why?”
• Characteristics: prospective, proactive information delivery
Motivation of Data Mining
• Growth of data in commercial and scientific databases due to advances in data generation and collection technologies (e-commerce examples: Amazon, Shopee, Lazada)
Commercial Viewpoint
• Lots of data is being collected and warehoused
• Computers have become cheaper and more powerful
Scientific Viewpoint
• Data collected and stored at enormous speeds
• Helps scientists in automated analysis of massive datasets
Knowledge Discovery (KDD) Process
• KDD is the systematic process of identifying valid, practical, and understandable patterns in massive and complicated data sets.
• The base of the KDD method is data mining, which involves applying algorithms that analyse the data, build the model, and discover previously unknown patterns.
• This is the view held by the typical database systems and data warehousing communities.
• Data mining plays an essential role in the knowledge discovery process.
[Figure: KDD process, from bottom to top: Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation]
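The KDD stages named on this slide can be sketched as a chain of small functions. This is only an illustrative sketch: the records, the field names (customer, region, spend) and every function name are hypothetical, and the "mining" step is a simple average standing in for a real algorithm.

```python
from statistics import mean

# Hypothetical raw records from two source "databases"; None marks a missing value.
store_a = [{"customer": "c1", "region": "north", "spend": 120.0},
           {"customer": "c2", "region": "north", "spend": None}]
store_b = [{"customer": "c3", "region": "south", "spend": 40.0},
           {"customer": "c4", "region": "north", "spend": 200.0}]


def clean(records):
    """Data cleaning: drop records that contain a missing value."""
    return [r for r in records if all(v is not None for v in r.values())]


def integrate(*sources):
    """Data integration: combine the cleaned sources into one warehouse list."""
    return [r for source in sources for r in source]


def select(warehouse, region):
    """Selection: keep only the task-relevant records."""
    return [r for r in warehouse if r["region"] == region]


def mine(records):
    """Data mining (toy stand-in): summarise spend as a candidate pattern."""
    return {"n": len(records), "avg_spend": mean([r["spend"] for r in records])}


def evaluate(pattern, min_records=2):
    """Pattern evaluation: accept the pattern only if enough records support it."""
    return pattern["n"] >= min_records


warehouse = integrate(clean(store_a), clean(store_b))
pattern = mine(select(warehouse, "north"))
print(pattern, "accepted" if evaluate(pattern) else "rejected")
```

Each function corresponds to one box in the process; in practice the mining step would be replaced by a classification, clustering or association algorithm.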
Data Mining: One Step of KDD
• Data mining is an aspect or part of the knowledge discovery in databases (KDD) process.
[Figure: diagram labelled Knowledge Discovery in Databases, Data mining, Task, Techniques]
Classification of Data Mining System
Kinds of database
• Relational
• Data warehouse
• Transactional DB
• Advanced DB system
• Flat files
• WWW
• …
Kind of knowledge
• Categorizing data (Classification)
• Find relationship (Association)
• Subdivide similar data (Clustering)
• Make prediction
Techniques used
• Machine learning
• Pattern recognition
• Neural Network
• Naïve-Bayes
• K-nearest neighbour
• Rough Set
• Statistic
Application adapted
• Finance
• Marketing
• Medical
• Stock
• Telecommunication
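To make one of the listed techniques concrete, the following is a minimal sketch of K-nearest neighbour classification, assuming Euclidean distance and a simple majority vote. The function name `knn_predict`, the two features and the segment labels are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(training, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (monthly income, monthly spend) in thousands -> segment.
training = [
    ((2.0, 0.5), "budget"), ((2.5, 0.7), "budget"), ((3.0, 0.6), "budget"),
    ((8.0, 3.0), "premium"), ((9.0, 3.5), "premium"), ((8.5, 2.8), "premium"),
]

print(knn_predict(training, (2.8, 0.8)))
print(knn_predict(training, (8.7, 3.1)))
```

This covers the "Categorizing data (Classification)" kind of knowledge: a new customer receives the label that is most common among the most similar known customers.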
What is Data Mining?
(Analytics / Outcomes)
The process of analyzing hidden patterns within large datasets
• To discover useful information in data that are collected and assembled in common repositories such as data warehouses, multimedia databases and web databases.
Analytics: the outcome of the data mining process
• Descriptive analytics (summarize, similarity)
• Predictive analytics (classify, estimate)
• Prescriptive analytics (combination techniques)
Why DM? Potential Applications
• Data analysis and decision support
1. Market analysis and management
• Target marketing, customer relationship management (CRM),
market basket analysis, cross selling, market segmentation
2. Risk analysis and management
• Forecasting, customer retention, improved underwriting, quality
control, competitive analysis
3. Fraud detection and detection of unusual patterns (outliers)

• Other Applications
1. Text mining (news group, email, documents) and Web mining
2. Stream data mining
3. DNA and bio-data analysis
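Market basket analysis, listed above under market analysis, is usually described through the support and confidence of item associations. The sketch below computes both on a handful of hypothetical transactions; the item names and helper functions are assumptions made for illustration, not part of the original slides.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= basket)


def support(itemset):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset) / len(transactions)


def confidence(antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction also containing `consequent`."""
    return count(set(antecedent) | set(consequent)) / count(antecedent)


print("support(bread, milk) =", support({"bread", "milk"}))
print("confidence(bread -> milk) =", confidence({"bread"}, {"milk"}))
print("confidence(butter -> bread) =", confidence({"butter"}, {"bread"}))
```

A rule such as butter -> bread with high confidence is the kind of finding that supports cross selling.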
Market Analysis & Management
(Potential DM)
Customer profiling
• Clustering or classifying the customers based on the products they purchase
Customer requirement analysis
• Identifying the best products for different customers
• Predicting what factors will attract new customers
Provision of summary information
• Multidimensional summary reports
• Statistical summary information (data central tendency and variation)
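As a small illustration of statistical summary information, the sketch below reports central tendency (mean, median) and variation (range, population standard deviation) for a hypothetical list of purchase amounts, using Python's standard `statistics` module. The data values are invented for the example.

```python
from statistics import mean, median, pstdev

# Hypothetical purchase amounts for one customer segment.
purchases = [20.0, 25.0, 30.0, 35.0, 90.0]

summary = {
    "mean": mean(purchases),                   # central tendency
    "median": median(purchases),               # central tendency, robust to outliers
    "range": max(purchases) - min(purchases),  # variation
    "std_dev": pstdev(purchases),              # variation (population standard deviation)
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The gap between the mean and the median here comes from the single large purchase, which is why summary reports often give both.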
References
1. Tan, Steinbach, Karpatne, Kumar, Lecture Notes, Chapter 1, Introduction to Data Mining, 2nd Edition, 2018.
2. Pang-Ning Tan, Michael Steinbach & Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2019.
3. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 3rd Edition, Morgan Kaufmann, 2012.
4. Coenen, Frans. Data mining: past, present and future. Knowledge Engineering Review, 26(1), 25-29, 2011.
5. Gregory Piatetsky-Shapiro, Data Science: Past, Present, and Future, KDnuggets, 2016.
