Data Knowledge

Uploaded by

Peace No War
Copyright
© © All Rights Reserved

CAGAYAN STATE UNIVERSITY APARRI - GRADUATE SCHOOL

MASTER OF SCIENCE IN INFORMATION TECHNOLOGY

KNOWLEDGE
DATA DISCOVERY
IN INFORMATION
SYSTEM
Presented by:
Franksel P. Tindoc Jr.
Rogiefel G. Torres
Christian A. Elaurza
DATA AND KNOWLEDGE DISCOVERY
Data and knowledge discovery refers to the overall process of
extracting valuable, non-trivial, and actionable knowledge from
large datasets. It combines techniques from data mining, machine
learning, statistics, and database management to analyze data,
discover patterns, and generate insights.

Key characteristics:
Non-trivial extraction of implicit knowledge
Useful information from data
Potentially valuable insights
WHY KDD MATTERS?
Improves decision-making processes
Enhances competitive advantage
Supports data-driven strategies
Facilitates innovation and problem-solving
TYPES OF DATA SOURCES
Relational databases
Data warehouses
Transactional data
Time-series data
Sequence data
Web data
Social media data
Sensor data
ADVANTAGES OF USING
KDD
Improved Decision-Making
Automation and Efficiency
Enhanced Prediction and Forecasting
Cost Savings
Wide Applicability
Customizability
DISADVANTAGES OF
USING KDD
Data Quality Issues
Complexity in Technique Selection
Privacy and Security Concerns
Scalability and Performance
Interpretation of Results
Integration with Business Processes
Cost and Expertise
APPLICATIONS OF KDD

Business and Marketing


Customer Segmentation
Fraud Detection
Market Basket Analysis
Customer Retention

Healthcare
Disease Prediction and Diagnosis
Drug Discovery
Patient Management

Education
Student Performance Analysis
Adaptive Learning Systems
Curriculum Optimization

Retail and E-commerce


Dynamic Pricing
Inventory Management
Recommendation Systems

Finance and Banking


Risk Management
Stock Market Analysis
Insurance Claims Analysis

Transportation and Logistics


Route Optimization
Demand Forecasting
Vehicle Maintenance

Social Media and Online Platforms


Sentiment Analysis
User Behavior Analysis

Government and Public Services


Crime Analysis
Disaster Management
Tax Fraud Detection

Energy and Environment


Energy Demand Prediction
Climate Change Analysis
Renewable Energy Optimization

Manufacturing
Quality Control
Supply Chain Optimization
Predictive Maintenance
CONCLUSION

KDD is a powerful process for extracting
valuable insights from data. By
understanding the KDD lifecycle and
leveraging appropriate tools and
techniques, organizations can gain
significant benefits from their data assets.
KNOWLEDGE
DISCOVERY PROCESS
WHY DO WE NEED KNOWLEDGE DISCOVERY PROCESS?
The knowledge discovery process helps us find accurate information.
It transforms raw data into actionable insights, allowing
organizations to gain a competitive edge.
It helps organizations and individuals make sense of large and
complex datasets, enabling informed decision-making and unlocking
the value hidden in data.
KINDS OF DATA THAT CAN BE PROCESSED
Database data
Data Warehouse
Transactional Data
Other kinds of data (time-related data, sequence data, data streams,
spatial data, hypertext and multimedia data, engineering design data)
STEPS OF KNOWLEDGE
DISCOVERY
1. Problem Understanding
2. Data Collection
3. Data Cleaning
4. Data Integration
5. Data Transformation
6. Data Mining
7. Pattern Evaluation
8. Knowledge Presentation
9. Deployment
10. Iterative Refinement
KNOWLEDGE
DISCOVERY PROCESS
MODELS
Academic Research Models
Industrial Models
Hybrid Models
ACADEMIC RESEARCH
MODELS
Efforts to establish a Knowledge Discovery Process model began
in academia in the mid-1990s.
Two process models were developed, in 1996 and 1998:
Nine-step model by Fayyad et al. (1996)
Eight-step model by Anand and Buchner (1998)
FAYYAD ET AL. NINE STEP
MODEL
1. Developing and understanding the
application domain
2. Creating a target data set
3. Data cleaning and pre-processing
4. Data reduction and projection
5. Choosing the data mining task
6. Choosing the data mining algorithm
7. Data mining
8. Interpreting mined patterns
9. Consolidating discovered knowledge
ANAND & BUCHNER EIGHT STEP
MODEL
1. Defining the Objective
2. Creating a Target Dataset
3. Data cleaning
4. Data Transformation
5. Data Reduction
6. Choosing the Appropriate Data Mining Task
7. Applying the Data Mining Algorithm
8. Interpreting and Evaluating Results
INDUSTRIAL MODELS

Two representative industrial models:
Five-step model by Cabena et al.
CRISP-DM model
FIVE-STEP MODEL BY CABENA ET AL.
1. Business Objectives Determination
2. Data Preparation
3. Data Exploration
4. Modeling
5. Knowledge Deployment
CRISP-DM MODEL
Cross-Industry Standard Process for Data
Mining
First established by 4 companies in the
late 1990s: Integral Solutions Ltd., NCR,
DaimlerChrysler, and OHRA
It consists of 6 steps:
Business understanding
Data understanding
Data preparation
Modeling
Evaluation
Deployment
HYBRID MODEL
The development of academic and industrial
models has led to hybrid models, which
combine aspects of both.
A representative hybrid model was developed by Cios et al.
It consists of six steps:
Understanding the Problem Domain
Data Understanding
Data Preparation
Data Mining
Evaluation
Deployment
CONCLUSION
The Knowledge Discovery Process is a systematic
approach to extracting meaningful insights
and actionable knowledge from raw data.
By combining various steps, ranging from
understanding the problem and collecting
data to mining patterns and presenting
results, it ensures that data is transformed
into valuable information that can drive
decision-making, innovation, and strategic
advantage.
DATA MINING
WHAT IS DATA MINING?
In the field of information technology, data mining is the
process of applying a variety of statistical, machine
learning, and database approaches to extract meaningful
information, including patterns, correlations, and trends,
from large datasets. It is an essential component of
data analysis that reveals hidden insights in data to assist
organizations in making well-informed decisions.
KEY CONCEPTS IN DATA MINING
Data cleaning is the process of eliminating mistakes, noise,
and discrepancies from data.
Integrating data from several sources to create a single
dataset is known as data integration.
Data transformation is the process of transforming data
into an analysis-ready format (e.g., standardization).
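
The three preparation concepts above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (customer_id, amount, region), and the helper functions clean, integrate, and standardize are all hypothetical and do not come from the presentation.

```python
from statistics import mean, pstdev

# Hypothetical records from two sources; None marks a missing value.
store_sales = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": None},   # missing value, removed by cleaning
    {"customer_id": 3, "amount": 80.0},
    {"customer_id": 3, "amount": 80.0},   # exact duplicate, removed by cleaning
    {"customer_id": 4, "amount": 100.0},
]
customer_regions = {1: "north", 3: "south", 4: "north"}


def clean(records):
    """Data cleaning: drop records with missing amounts and exact duplicates."""
    seen, result = set(), []
    for record in records:
        key = (record["customer_id"], record["amount"])
        if record["amount"] is None or key in seen:
            continue
        seen.add(key)
        result.append(record)
    return result


def integrate(records, regions):
    """Data integration: join sales records with region data on customer_id."""
    return [{**r, "region": regions.get(r["customer_id"], "unknown")} for r in records]


def standardize(records, field):
    """Data transformation: rescale a numeric field to zero mean, unit variance."""
    values = [r[field] for r in records]
    mu, sigma = mean(values), pstdev(values)
    return [{**r, field + "_z": (r[field] - mu) / sigma} for r in records]


prepared = standardize(integrate(clean(store_sales), customer_regions), "amount")
for row in prepared:
    print(row)
```

In practice a library such as pandas would usually handle these steps; plain Python is used here only to keep each step visible.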
TECHNIQUES IN DATA
MINING

SORTING DATA INTO PREDETERMINED
CATEGORIES, SUCH AS SPAM AND NON-SPAM
EMAILS, IS KNOWN AS CLASSIFICATION.
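
As a rough illustration of classification, the sketch below trains a tiny naive Bayes classifier with add-one smoothing on a handful of labeled messages. Naive Bayes is just one possible classifier, chosen here for brevity; the training messages and the function names train and classify are made up for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled messages, invented for illustration.
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "non-spam"),
    ("lunch on monday with the team", "non-spam"),
]


def train(examples):
    """Count words per class and messages per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, class_counts


def classify(text, word_counts, class_counts):
    """Return the class with the highest log-probability under naive Bayes."""
    vocabulary = set().union(*word_counts.values())
    total_messages = sum(class_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        # Start from the class prior, then add one smoothed term per word.
        score = math.log(class_counts[label] / total_messages)
        words_in_class = sum(counts.values())
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / (words_in_class + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, class_counts = train(TRAINING)
print(classify("claim your free prize", word_counts, class_counts))
print(classify("agenda for the team meeting", word_counts, class_counts))
```

The categories ("spam", "non-spam") are fixed in advance by the labeled examples, which is what distinguishes classification from clustering.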

CLUSTERING IS THE PROCESS OF ASSEMBLING
RELATED DATA POINTS INTO GROUPS WITHOUT
THE USE OF PRE-ESTABLISHED CATEGORIES (E.G.,
CONSUMER SEGMENTATION).
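
One common clustering method is k-means, sketched below for a consumer segmentation example. The customer data (annual spend, number of visits), the choice of two clusters, and the starting centroids are assumptions made for illustration; real uses typically scale the features first and choose starting centroids at random.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then update centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the cluster; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers: (annual spend, number of visits).
customers = [(100, 2), (120, 3), (110, 2), (900, 20), (950, 22), (880, 19)]
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])

for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No labels are supplied: the two groups (low spenders and high spenders) emerge from the distances between the points alone.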

FINDING ITEMS OR VARIABLES THAT FREQUENTLY OCCUR
TOGETHER IN DATA IS KNOWN AS ASSOCIATION RULE MINING
(E.G., MARKET BASKET ANALYSIS: "CUSTOMERS
WHO BUY CHIPS OFTEN BUY SODA").
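
The chips and soda rule can be checked with the two standard measures used in association rule mining: support (how often the items appear together) and confidence (how often the consequent appears when the antecedent does). The transactions below are invented for illustration, and the function names are hypothetical.

```python
# Hypothetical shopping baskets, invented for illustration.
transactions = [
    {"chips", "soda"},
    {"chips", "soda", "bread"},
    {"chips", "salsa"},
    {"bread", "milk"},
    {"chips", "soda", "milk"},
]


def count(itemset, baskets):
    """Number of baskets that contain every item in the itemset."""
    return sum(itemset <= basket for basket in baskets)


def support(itemset, baskets):
    """Fraction of all baskets that contain the itemset."""
    return count(itemset, baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets with the antecedent, the fraction that also hold the consequent."""
    return count(antecedent | consequent, baskets) / count(antecedent, baskets)


rule_support = support({"chips", "soda"}, transactions)
rule_confidence = confidence({"chips"}, {"soda"}, transactions)
print(f"chips -> soda: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

With this toy data, chips and soda appear together in 3 of 5 baskets (support 0.60), and 3 of the 4 baskets with chips also contain soda (confidence 0.75). Algorithms such as Apriori apply the same measures while searching over many candidate itemsets.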

REGRESSION ANALYSIS IS THE PROCESS OF
FORECASTING A CONTINUOUS VARIABLE, SUCH AS
SALES, FROM INPUT DATA.
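
A minimal sketch of regression is a straight line fitted by ordinary least squares. The advertising and sales figures below are invented, and a single input variable is assumed for simplicity; real forecasting models usually use several inputs.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend (input) and sales (continuous target).
ad_spend = [1, 2, 3, 4, 5]
sales = [11, 14, 16, 19, 20]

slope, intercept = fit_line(ad_spend, sales)
forecast = slope * 6 + intercept  # predicted sales for an ad spend of 6
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast={forecast:.2f}")
```

For this data the fitted line is approximately sales = 2.3 x ad_spend + 9.1, so the forecast for an ad spend of 6 is about 22.9.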

FINDING ODD TRENDS OR ANOMALIES IN THE
DATA IS KNOWN AS ANOMALY DETECTION (E.G.,
FRAUD DETECTION).
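
One simple way to flag anomalies is the z-score: a value is treated as unusual when it lies more than a chosen number of standard deviations from the mean. The transaction amounts and the threshold of 2.0 below are assumptions for illustration; production fraud detection generally relies on more robust methods.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold in absolute value."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical card transaction amounts; one is far larger than the rest.
amounts = [20, 22, 19, 21, 23, 20, 22, 500]
anomalies = find_anomalies(amounts)
print(anomalies)
```

Because a large outlier also inflates the mean and standard deviation, z-scores can miss anomalies in small samples; median-based measures are a common alternative.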
TOOLS AND TECHNOLOGY USED FOR DATA MINING
Software: Weka, RapidMiner, Orange, KNIME.
Programming Languages: Python, R, SQL.
Frameworks: Apache Hadoop, Spark, TensorFlow
(for advanced applications).
APPLICATIONS OF DATA MINING

BUSINESS
FORECASTING SALES, ANALYZING THE MARKET,
AND MANAGING CUSTOMER RELATIONSHIPS.

HEALTHCARE
PERSONALIZED MEDICINE AND DISEASE
OUTBREAK PREDICTION.

FINANCE
IDENTIFYING FRAUD AND EVALUATING CREDIT
RISK.

E-COMMERCE
ANALYSIS OF CONSUMER BEHAVIOR AND
PRODUCT RECOMMENDATIONS.

EDUCATION
FINDING TRENDS IN STUDENT PERFORMANCE TO
INFORM INDIVIDUALIZED INSTRUCTION.
IMPORTANCE OF DATA MINING IN IT
Improves decision-making by offering useful information.
Automates the process of finding information in big datasets.
Encourages the creation of applications for AI and
predictive analytics.
Stimulates innovation by highlighting patterns that
conventional analysis might miss.
CONCLUSION

In conclusion, data mining is a potent tool
in the era of big data because it turns
unprocessed data into insightful
knowledge.
THANK
YOU!
