Knowledge Discovery in Databases
Durgesh Yadav
Prince Gautam
Bipin Agrahari
KDD
INTRODUCTION
KDD stands for Knowledge Discovery in Databases.
It is the process of extracting useful, previously unknown, and potentially valuable
information, such as patterns, from large datasets.
KDD uses data mining algorithms to discover these patterns within a large dataset.
KDD PROCESS
Steps:
Data Cleaning
Data Integration
Data Selection
Data Transformation
Data Mining
Pattern Evaluation
Data Presentation
DATA CLEANING
Data cleaning is the process of removing noise and inconsistent data from a
database.
The goal is to improve the quality of the data, making it more suitable for processing
and analysis.
Data cleaning is typically carried out with discrepancy detection tools and data
transformation tools.
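
As an illustration of the cleaning step, the sketch below removes duplicate, missing, and out-of-range values from a small set of records. The records, the field names (id, city, age), and the 0 to 120 age range are assumptions made for this example, not part of any particular tool.

```python
# Hypothetical raw records; field names and values are illustrative only.
raw_records = [
    {"id": 1, "city": " Delhi ", "age": "34"},
    {"id": 2, "city": "delhi", "age": ""},       # missing age
    {"id": 3, "city": "Mumbai", "age": "-5"},    # impossible value (noise)
    {"id": 1, "city": " Delhi ", "age": "34"},   # duplicate of id 1
    {"id": 4, "city": "MUMBAI", "age": "41"},
]


def clean(records):
    """Drop duplicates and rows with missing or implausible ages; tidy city names."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        if rec["id"] in seen_ids:      # discrepancy: repeated key
            continue
        try:
            age = int(rec["age"])      # fails on empty or non-numeric text
        except ValueError:
            continue
        if not 0 <= age <= 120:        # assumed plausible range
            continue
        seen_ids.add(rec["id"])
        cleaned.append(
            {"id": rec["id"], "city": rec["city"].strip().title(), "age": age}
        )
    return cleaned


cleaned_records = clean(raw_records)
print(cleaned_records)
```

Only the records with id 1 and id 4 survive, with city names normalized to a consistent spelling.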
DATA INTEGRATION
Data integration is the process of combining data from different sources to provide a
unified view or dataset.
It's an essential step in many data-related activities, especially in environments where
data is collected across various platforms or systems.
Examples of data integration tools include data migration tools and data synchronization tools.
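
A minimal sketch of integration, assuming two hypothetical sources (a CRM export and a billing export) that identify the same customers under different column names. All names and values here are invented for illustration.

```python
# Two hypothetical sources describing customers with different schemas.
crm_rows = [
    {"customer_id": 101, "name": "Asha"},
    {"customer_id": 102, "name": "Ravi"},
]
billing_rows = [
    {"cust": 101, "total_spent": 2500.0},
    {"cust": 103, "total_spent": 400.0},
]


def integrate(crm, billing):
    """Combine both sources into one record per customer (an outer join on the key)."""
    unified = {}
    for row in crm:
        unified[row["customer_id"]] = {
            "customer_id": row["customer_id"],
            "name": row["name"],
            "total_spent": None,       # unknown until billing data is merged
        }
    for row in billing:
        record = unified.setdefault(
            row["cust"],
            {"customer_id": row["cust"], "name": None, "total_spent": None},
        )
        record["total_spent"] = row["total_spent"]
    return [unified[key] for key in sorted(unified)]


unified_view = integrate(crm_rows, billing_rows)
for record in unified_view:
    print(record)
```

Customers that appear in only one source are kept, with None marking the fields that source could not supply.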
DATA SELECTION
This stage involves selecting the relevant subset of data from the available dataset(s)
for further processing and analysis.
The goal of data selection is to reduce the volume of data to be processed and
analyzed while ensuring that the selected data contains the information needed for
meaningful insights.
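
Selection can be pictured as choosing both rows and columns. The sketch below assumes a hypothetical sales dataset and an analysis that only needs 2023 figures for the North region; the helper name select is made up for this example.

```python
# Hypothetical dataset; only some rows and columns matter for the analysis.
dataset = [
    {"id": 1, "region": "North", "year": 2023, "sales": 120, "notes": "n/a"},
    {"id": 2, "region": "South", "year": 2022, "sales": 80, "notes": "promo"},
    {"id": 3, "region": "North", "year": 2022, "sales": 95, "notes": ""},
    {"id": 4, "region": "North", "year": 2023, "sales": 150, "notes": "promo"},
]


def select(rows, columns, predicate):
    """Keep rows that satisfy the predicate and only the requested columns."""
    return [{col: row[col] for col in columns} for row in rows if predicate(row)]


selected = select(
    dataset,
    columns=["id", "year", "sales"],
    predicate=lambda row: row["region"] == "North" and row["year"] == 2023,
)
print(selected)
```

The result is smaller than the original dataset but still holds every field the analysis needs.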
DATA TRANSFORMATION
Data transformation is the process of converting raw data into a format that will be
more effective and appropriate for the subsequent data mining and analysis tasks.
Proper transformation can improve the accuracy and efficiency of the
discovery algorithms.
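
One common transformation is min-max normalization, which rescales a numeric attribute to the range 0 to 1 so that attributes with large values do not dominate the others. The sketch below is one possible example of a transformation, using made-up income figures.

```python
def min_max_normalize(values):
    """Rescale numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # all values equal: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


incomes = [20000, 35000, 50000, 80000]   # illustrative values
scaled = min_max_normalize(incomes)
print(scaled)
```

Each value is mapped by (v - min) / (max - min), so the ordering of the original data is preserved.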
DATA MINING
Data mining refers to the application of algorithms and techniques to identify and
extract meaningful patterns, associations, clusters, and knowledge from the
transformed and prepared data.
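
As one small example of mining associations, the sketch below counts which pairs of items occur together in at least half of a set of shopping baskets. The baskets and the 0.5 support threshold are assumptions for illustration; real mining algorithms such as Apriori extend this idea to larger itemsets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
]


def frequent_pairs(baskets, min_support):
    """Return item pairs whose share of baskets (support) is at least min_support."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical order, e.g. ("bread", "milk").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {
        pair: count / total
        for pair, count in counts.items()
        if count / total >= min_support
    }


patterns = frequent_pairs(transactions, min_support=0.5)
print(patterns)
```

Here bread with butter and bread with milk each appear in three of the five baskets, so both pairs pass the threshold.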
PATTERN EVALUATION
Pattern evaluation in KDD is the step where discovered patterns from data are
assessed for their importance and relevance, ensuring that only meaningful and
valuable insights are retained.
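
Importance and relevance are often measured with interestingness scores. The sketch below assumes association rules scored by support, confidence, and lift, and keeps only rules above chosen thresholds. The baskets, candidate rules, and threshold values are illustrative assumptions.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)


def evaluate_rule(antecedent, consequent, baskets):
    """Score the rule antecedent -> consequent."""
    supp_a = support(antecedent, baskets)
    supp_c = support(consequent, baskets)
    supp_both = support(antecedent | consequent, baskets)
    confidence = supp_both / supp_a if supp_a else 0.0
    lift = confidence / supp_c if supp_c else 0.0
    return {"support": supp_both, "confidence": confidence, "lift": lift}


def keep_interesting(rules, baskets, min_support=0.4, min_confidence=0.7):
    """Retain only rules that meet both thresholds."""
    kept = []
    for antecedent, consequent in rules:
        scores = evaluate_rule(antecedent, consequent, baskets)
        if scores["support"] >= min_support and scores["confidence"] >= min_confidence:
            kept.append((antecedent, consequent, scores))
    return kept


candidate_rules = [
    (frozenset({"milk"}), frozenset({"bread"})),
    (frozenset({"butter"}), frozenset({"jam"})),
]
interesting = keep_interesting(candidate_rules, transactions)
for antecedent, consequent, scores in interesting:
    print(sorted(antecedent), "->", sorted(consequent), scores)
```

The rule milk -> bread is retained because every basket with milk also contains bread, while butter -> jam is discarded for low support and confidence.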
THANK YOU