
DATA MINING

AND
DATA WAREHOUSING
ARYA S V
Lecturer in Computer Science
School of Distance Education
University of Kerala
OVERVIEW
• Introduction
• Data Mining
• Data Pre-processing
• Data Warehousing
• Data Cube
• OLAP
• Market Basket Analysis
• Association Rule
• Apriori Algorithm
• Classification vs Prediction
• Decision Tree
• Bayesian Classifier
• Lazy Classifier
• K-Nearest Neighbor Method
• Rule-based Classification
• Cluster Analysis
• Partition Methods
• K-means and K-medoids
• Outlier Detection
Introduction to Data Mining
• Data is a raw or disconnected fact.
• Information is processed data.
• Knowledge is derived from information by applying rules to it.
• Data mining is the process of extracting hidden, valid, and potentially useful patterns from huge data sets.
• Data mining is all about discovering unsuspected, previously unknown relationships among the data.
Steps in Data Mining
1. Data cleaning (remove noise and inconsistent data)
2. Data integration (where multiple data sources may be combined)
3. Data selection (where data relevant to the analysis task are retrieved from the database)
4. Data transformation (where data are transformed and consolidated into forms appropriate for mining by performing summary or aggregation operations)
5. Data mining (an essential process where intelligent methods are applied to extract data patterns)
6. Pattern evaluation (to identify the truly interesting patterns representing knowledge based on interestingness measures)
7. Knowledge presentation (where visualization and knowledge representation techniques are used to present mined knowledge to users)
Types of Data for Mining
1. Flat files (data for transactions, time-series data, scientific measurements, etc. can be represented in these files)
2. Database data (relational databases are one of the most commonly available and richest information repositories)
3. Data warehouse data (a data warehouse is a repository of information collected from multiple sources, stored under a unified schema, and usually residing at a single site)
4. Transactional data (a transaction typically includes a unique transaction identity number (trans ID) and a list of the items making up the transaction, such as the items purchased in the transaction)
Application Domains
• Financial Data Analysis
• Retail Industry
• Telecommunication Industry
• Biological Data Analysis
• Other Scientific Applications
• Intrusion Detection
Two highly successful and popular application examples of data mining:
• Business intelligence
• Search engines
Data mining tasks can be classified into two categories:
• Descriptive mining tasks
• Predictive mining tasks
Data Pre-Processing
Data pre-processing is the task of converting data from a given form to a more usable and desired form.
Why Preprocess the Data?
Data have quality if they satisfy the requirements of
the intended use.
There are many factors comprising data quality:
• Accuracy
• Completeness
• Consistency
• Timeliness
• Believability
• Interpretability
Stages of Data Processing Cycle

• Collection
• Preparation
• Input
• Processing
• Output
• Storage
Major tasks in Data Preprocessing
• Data cleaning
  - Clean the data by filling in missing values, smoothing noisy data, identifying or removing outliers, and resolving inconsistencies.
• Data integration
  - Integrating multiple databases, data cubes, or files.
• Data reduction
  - Reduces the size of data and makes it suitable and feasible for analysis.
• Data transformation
  - Converting data from one format or structure into another format or structure.
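As a rough illustration of these tasks, the following Python sketch shows one way cleaning, reduction, and transformation might look in practice. It assumes pandas is available; the DataFrame, column names, and values are hypothetical and not from the original slides.

import pandas as pd

# Hypothetical raw data with missing values and an implausible record.
df = pd.DataFrame({
    "age":    [25, None, 47, 31, 200],        # 200 is an obvious outlier/typo
    "income": [30000, 42000, None, 51000, 58000],
})

# Data cleaning: fill in missing values and remove the implausible outlier.
df["age"] = df["age"].fillna(df["age"].median())
df["income"] = df["income"].fillna(df["income"].mean())
df = df[df["age"] <= 120]

# Data reduction: keep only the attributes needed for the analysis.
df = df[["age", "income"]]

# Data transformation: min-max normalization to the [0, 1] range.
df = (df - df.min()) / (df.max() - df.min())
print(df)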
Data Warehouse

• A single, complete and consistent store of data obtained from a variety of different sources, made available to end users in a way they can understand and use in a business context.
• Data warehouses are very large databases.
What is Data Warehousing?
• The process of transforming data into information and
making it available to users in a timely enough manner
to make a difference is known as data warehousing.
• “A data warehouse is a subject-oriented, integrated,
time-variant, and nonvolatile collection of data in
support of management's decision making process."
• Data warehouses provide online analytical
processing (OLAP) tools for the interactive analysis of
multidimensional data of varied granularities, which
facilitates effective data generalization and data mining.
Differences Between Database
Systems And Data Warehouses
• Users and system orientation: An OLAP system is market-oriented and is used for data analysis by knowledge workers, including managers, executives, and analysts. An OLTP system is customer-oriented and is used for transaction and query processing by clerks, clients, and information technology professionals.
• Data contents: An OLAP system manages large amounts of historic data. An OLTP system manages current data.
• Database design: An OLAP system typically adopts either a star or a snowflake model and a subject-oriented database design. An OLTP system usually adopts an entity-relationship (ER) data model and an application-oriented database design.
• View: An OLAP system often spans multiple versions of a database schema, due to the evolutionary process of an organization. An OLTP system focuses mainly on the current data within an enterprise or department, without referring to historic data or data in different organizations.
• Access patterns: Accesses to OLAP systems are mostly read-only operations. The access patterns of an OLTP system consist mainly of short, atomic transactions.
Data Warehouse Architecture
Data Cube
• Data warehouses and OLAP tools are based on a
multidimensional data model. This model views data in
the form of a data cube.
• A data cube allows data to be modeled and viewed in
multiple dimensions.
• It is defined by dimensions and facts.
• In general terms, dimensions are the perspectives or
entities with respect to which an organization wants to keep
records. Each dimension may have a table associated with
it, called a dimension table.
• Facts are numeric measures. The fact table contains the
names of the facts, or measures, as well as keys to each of
the related dimension tables.
Figure: A 3-D data cube representation of the data according to time, item, and location. The measure displayed is dollars_sold (in thousands).
Figure: A 4-D data cube representation of sales data, according to time, item, location, and supplier. The measure displayed is dollars_sold (in thousands). (Only some of the cube values are shown.)
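To make the dimension/fact idea concrete, here is a small, hedged Python sketch that aggregates a dollars_sold measure along time, item, and location dimensions with pandas. The records, column names, and figures below are illustrative assumptions, not data from the slides.

import pandas as pd

# Hypothetical fact records: each row is a sale with three dimensions and one measure.
sales = pd.DataFrame({
    "time":         ["Q1", "Q1", "Q2", "Q2"],
    "item":         ["home_ent", "computer", "home_ent", "computer"],
    "location":     ["Chicago", "Chicago", "Toronto", "Toronto"],
    "dollars_sold": [605, 825, 680, 952],
})

# A 2-D view of the cube: dollars_sold by time and item, summed over location.
cube_2d = sales.pivot_table(index="time", columns="item",
                            values="dollars_sold", aggfunc="sum")
print(cube_2d)

# The apex (0-D) cuboid: total dollars_sold over all dimensions.
print(sales["dollars_sold"].sum())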
Lattice of Cuboids, making up a 4-D
Data Cube For Time, Item, Location,
And Supplier.
• 0-D (apex) cuboid: ()
• 1-D cuboids: (time), (item), (location), (supplier)
• 2-D cuboids: (time, item), (time, location), (time, supplier), (item, location), (item, supplier), (location, supplier)
• 3-D cuboids: (time, item, location), (time, item, supplier), (time, location, supplier), (item, location, supplier)
• 4-D (base) cuboid: (time, item, location, supplier)
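The lattice is simply the set of all subsets of the four dimensions. The short Python sketch below (an illustrative aid, not part of the original slides) enumerates the 16 cuboids level by level.

from itertools import combinations

dimensions = ["time", "item", "location", "supplier"]

# One cuboid per subset of the dimensions: 2^4 = 16 cuboids in total.
for k in range(len(dimensions) + 1):
    for cuboid in combinations(dimensions, k):
        print(f"{k}-D cuboid: {cuboid}")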


OLAP
(Online Analytical Processing)
• OLAP provides a user-friendly environment for interactive
data analysis.
• An OLAP system is market-oriented and is used for data
analysis by knowledge workers, including managers,
executives, and analysts.
• An OLAP system manages large amounts of historic data,
provides facilities for summarization and aggregation, and
stores and manages information at different levels of
granularity.
• OLAP systems can organize and present data in various
formats in order to accommodate the diverse needs of
different users.
Examples of Typical OLAP Operations
on Multidimensional Data.
Market Basket Analysis

• Market basket analysis analyzes customer buying habits by finding associations between the different items that customers place in their "shopping baskets".
• The discovery of these associations can help retailers develop marketing strategies by analyzing which items are frequently purchased together by customers.
Association Rule
• Association rule mining finds interesting associations and relationships among large sets of data items.
• An association rule shows how frequently an itemset occurs in a transaction. A typical example is market basket analysis.
• Market basket analysis is one of the key techniques used by large retailers to show associations between items.
• It allows retailers to identify relationships between the items that people buy together frequently.
The basic definitions:
• Support count – the frequency of occurrence of an itemset.
• Frequent itemset – an itemset whose support is greater than or equal to the minsup threshold.
• Association rule – an implication expression of the form X -> Y, where X and Y are any two itemsets.
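To make the definitions concrete, here is a minimal Python sketch that computes the support and confidence of one candidate rule over a handful of made-up transactions; the transactions and the chosen rule are illustrative assumptions.

# Hypothetical transactions, each a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

def support_count(itemset):
    # Frequency of occurrence of the itemset across all transactions.
    return sum(1 for t in transactions if itemset <= t)

X, Y = {"diapers"}, {"beer"}
support = support_count(X | Y) / len(transactions)    # support of X -> Y
confidence = support_count(X | Y) / support_count(X)  # confidence of X -> Y
print(f"support = {support:.2f}, confidence = {confidence:.2f}")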
Apriori Algorithm
• It uses prior knowledge of frequent itemset properties.
• We apply an iterative, level-wise search in which frequent k-itemsets are used to find candidate (k+1)-itemsets.
• To improve the efficiency of the level-wise generation of frequent itemsets, an important property called the Apriori property is used, which helps by reducing the search space.
• Apriori property
  All non-empty subsets of a frequent itemset must also be frequent. The key concept of the Apriori algorithm is the anti-monotonicity of the support measure.
Apriori Algorithm Steps
1. Scan the transaction database to get the support S of each 1-itemset, compare S with min_sup, and get the set of frequent 1-itemsets.
2. Use a join to generate a set of candidate k-itemsets. Use the Apriori property to prune the infrequent k-itemsets from this set.
3. Scan the transaction database to get the support S of each candidate k-itemset in the set, compare S with min_sup, and get the set of frequent k-itemsets.
4. If the candidate set is empty, then for each frequent itemset l, generate all nonempty subsets of l.
5. For every nonempty subset s of l, output the rule "s => (l − s)" if the confidence of the rule "s => (l − s)" is at least min_conf.
6. If the candidate set is not empty, go to step 2.
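The following Python sketch is a simplified, illustrative rendering of the level-wise search described above (frequent-itemset generation only; rule generation is omitted). The example transactions and the min_sup count are assumptions made for demonstration.

from itertools import combinations

def apriori(transactions, min_sup):
    # Level 1: frequent 1-itemsets.
    items = {item for t in transactions for item in t}
    freq = {frozenset([i]) for i in items
            if sum(i in t for t in transactions) >= min_sup}
    all_frequent, k = set(freq), 2
    while freq:
        # Join step: candidate k-itemsets built from frequent (k-1)-itemsets.
        candidates = {a | b for a in freq for b in freq if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in freq for s in combinations(c, k - 1))}
        # Scan the database and keep only the candidates meeting min_sup.
        freq = {c for c in candidates
                if sum(c <= t for t in transactions) >= min_sup}
        all_frequent |= freq
        k += 1
    return all_frequent

# Example usage with tiny made-up transactions and min_sup = 2.
ts = [{"bread", "milk"}, {"bread", "beer"}, {"bread", "milk", "beer"}, {"milk", "beer"}]
print(apriori(ts, 2))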
Classification vs Prediction
• Classification is the process of finding a model that describes and distinguishes data classes and concepts.
• Classification is the problem of identifying to which category a new observation belongs, on the basis of a training set of data containing observations whose category membership is known.
• It is a two-step process:
  - Learning step (where a classification model is constructed)
  - Classification step (where the model is used to predict class labels for given data)
• Example: a medical researcher wants to analyze breast cancer data to predict which one of three specific treatments a patient should receive.
Classification vs Prediction(cont..)
• Here the data analysis task is classification, where a model or classifier is constructed to predict class (categorical) labels, such as:
  "safe" or "risky" for the loan application data;
  "yes" or "no" for the marketing data;
  "treatment A," "treatment B," or "treatment C" for the medical data.
• Suppose that a marketing manager wants to predict how much a given customer will spend during a sale at a shop.
• This data analysis task is an example of numeric prediction, where the model constructed predicts a continuous-valued function, or ordered value, as opposed to a class label. This model is a predictor.
Decision Trees for Classification
• A decision tree is one of the most popular and widely used tools for classification and prediction.
• A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (terminal node) holds a class label.
• The topmost node in a tree is the root node.
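As a brief, hedged illustration (assuming scikit-learn is installed; the toy attributes, values, and labels below are made up), a decision tree classifier can be trained and its attribute tests inspected like this:

from sklearn.tree import DecisionTreeClassifier, export_text

# Toy training data: [age_in_years, is_student]; labels: buys computer or not.
X = [[22, 1], [25, 0], [47, 0], [52, 1], [35, 1], [60, 0]]
y = ["yes", "no", "no", "yes", "yes", "no"]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Each internal node is an attribute test; each leaf holds a class label.
print(export_text(tree, feature_names=["age", "is_student"]))
print(tree.predict([[30, 1]]))   # predict the class label for a new tuple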
Naive Bayesian Classifier
• Naive Bayes classifiers are a collection of classification algorithms based on Bayes' theorem. It is not a single algorithm but a family of algorithms, all of which share a common principle: every pair of features being classified is independent of each other.
• Bayes' theorem finds the probability of an event occurring given the probability of another event that has already occurred. Bayes' theorem is stated mathematically as the following equation:
  P(A | B) = P(B | A) · P(A) / P(B)
where A and B are events.
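A tiny worked sketch of the equation, with purely illustrative probabilities (the event names and numbers are assumptions, not figures from the slides):

# Probability a customer buys a computer (A) given that the customer is a student (B).
p_b_given_a = 0.6   # P(B | A): fraction of buyers who are students (assumed)
p_a = 0.5           # P(A): prior probability of buying (assumed)
p_b = 0.4           # P(B): probability of being a student (assumed)

p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)   # 0.75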
Lazy Learners
• Lazy learners are among the most intuitive types of learners and are used in many practical scenarios.
• Techniques that delay modeling the training data until it is needed to classify the testing data are known as lazy learners.
• A lazy learner simply stores the training data, and only when it sees a test tuple does it start generalization, classifying the tuple based on its similarity to the stored training tuples.
• Lazy learners do less work when the training data is given and more work when a test tuple has to be classified.
• Lazy learners do not require any model building, but they can be computationally very expensive when doing classification or prediction.
k–Nearest Neighbor Method
• k-Nearest Neighbor finds intense application in pattern recognition, data mining and intrusion detection.
• It is widely applicable in real-life scenarios since it is non-parametric, meaning it does not make any underlying assumptions about the distribution of the data.
Algorithm
Let m be the number of training data samples. Let p be an unknown point.
1. Store the training samples in an array of data points arr[]. This means each element of this array represents a tuple (x, y).
2. For i = 0 to m: calculate the Euclidean distance d(arr[i], p).
3. Make a set S of the K smallest distances obtained. Each of these distances corresponds to an already classified data point.
4. Return the majority label among S.
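A direct, illustrative Python rendering of the steps above (Euclidean distance, majority vote); the training points, labels, and choice of k are assumptions for demonstration.

from collections import Counter
import math

def knn_classify(training, p, k):
    # training: list of ((x, y), label) pairs; p: the unknown point (x, y).
    by_distance = sorted(training,
                         key=lambda t: math.dist(t[0], p))   # Euclidean distance
    k_nearest = by_distance[:k]                              # set S of the k smallest distances
    labels = [label for _, label in k_nearest]
    return Counter(labels).most_common(1)[0][0]              # majority label among S

# Example usage with a tiny made-up training set.
train = [((1, 1), "A"), ((2, 1), "A"), ((8, 8), "B"), ((9, 7), "B")]
print(knn_classify(train, (2, 2), k=3))   # expected: "A"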
Rule-Based Classification
• A rule-based classifier makes use of a set of IF-THEN rules for classification. We can express a rule in the following form.
Let us consider a rule R1:
R1: IF age = youth AND student = yes THEN buy_computer = yes
Points to remember
• The IF part of the rule is called the rule antecedent or precondition.
• The THEN part of the rule is called the rule consequent.
• The antecedent part (the condition) consists of one or more attribute tests, and these tests are logically ANDed.
• The consequent part consists of the class prediction.
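Rule R1 can be read as a simple predicate over a tuple's attributes. The hypothetical Python sketch below illustrates the idea; the attribute names follow R1, and returning None for an uncovered tuple is an assumed convention.

def rule_r1(tuple_):
    # R1: IF age = youth AND student = yes THEN buy_computer = yes
    if tuple_["age"] == "youth" and tuple_["student"] == "yes":
        return "yes"          # the rule consequent fires
    return None               # R1 does not cover this tuple

print(rule_r1({"age": "youth", "student": "yes"}))    # "yes"
print(rule_r1({"age": "senior", "student": "yes"}))   # None (not covered)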
Cluster Analysis
• Clustering is the process of grouping a set of data
objects into multiple groups or clusters so that objects
within a cluster have high similarity, but are very
dissimilar to objects in other clusters.
• Cluster analysis or simply clustering is the process of
partitioning a set of data objects (or observations) into
subsets.
• Each subset is a cluster, such that objects in a cluster
are similar to one another, yet dissimilar to objects in
other clusters. The set of clusters resulting from a
cluster analysis can be referred to as clustering.
• Clustering is also called data segmentation in some
applications because clustering partitions large data sets
into groups according to their similarity.
• The data points in the graph that are clustered together can be classified into one single group. We can distinguish the clusters, and we can identify that there are 3 clusters in the picture.
• Clustering can also be used for outlier detection, where outliers (values that are "far away" from any cluster) may be more interesting than common cases.
• Clustering is known as unsupervised learning because the class label information is not present.
Requirements for Cluster Analysis

• Scalability
• Ability to deal with different types of attributes
• Discovery of clusters with arbitrary shape
• Requirements for domain knowledge to determine input
parameters
• Ability to deal with noisy data
• Incremental clustering and insensitivity to input order
• Capability of clustering high-dimensionality data
• Constraint-based clustering
• Interpretability and usability
Partitioning Methods
• The simplest and most fundamental version of cluster
analysis is partitioning, which organizes the objects of a set
into several exclusive groups or clusters.
• The general criterion of a good partitioning is that objects in
the same cluster are “close” or related to each other,
whereas objects in different clusters are “far apart” or very
different.
• General characteristics of partitioning methods:
  - Find mutually exclusive clusters of spherical shape
  - Distance-based
  - May use mean or medoid (etc.) to represent cluster center
  - Effective for small- to medium-size data sets
K-Means: A Centroid-Based Technique
Input:
• k: the number of clusters
• D: a data set containing n objects
Output: A set of k clusters.
Method:
(1) arbitrarily choose k objects from D as the initial cluster centers;
(2) repeat
(3) (re)assign each object to the cluster to which the object is the most similar, based on the mean value of the objects in the cluster;
(4) update the cluster means, that is, calculate the mean value of the objects for each cluster;
(5) until no change.
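A compact, illustrative Python version of the method above. The 2-D points and the value of k are assumptions; an empty cluster simply keeps its previous center, and no other edge cases are handled.

import math
import random

def k_means(points, k, max_iter=100):
    centers = random.sample(points, k)             # (1) arbitrarily choose k objects
    for _ in range(max_iter):                      # (2) repeat
        clusters = [[] for _ in range(k)]
        for p in points:                           # (3) (re)assign each object
            i = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[i].append(p)
        # (4) update the cluster means: mean value of the objects in each cluster.
        new_centers = [tuple(sum(c) / len(c) for c in zip(*cl)) if cl else centers[i]
                       for i, cl in enumerate(clusters)]
        if new_centers == centers:                 # (5) until no change
            break
        centers = new_centers
    return centers, clusters

# Example usage with a tiny made-up data set.
pts = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
print(k_means(pts, 2))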
k-Medoids
• It is also called the Partitioning Around Medoids (PAM) algorithm.
• A medoid can be defined as the point in the cluster whose dissimilarity with all the other points in the cluster is minimum.
• The dissimilarity of the medoid (Ci) and an object (Pi) is calculated as E = |Pi − Ci|.
Algorithm:
1. Initialize: select k random points out of the n data points as the medoids.
2. Associate each data point to the closest medoid by using any common distance metric.
3. While the cost decreases:
   For each medoid m, and for each data point o which is not a medoid:
   1. Swap m and o, associate each data point to the closest medoid, and recompute the cost.
   2. If the total cost is more than that in the previous step, undo the swap.
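An illustrative sketch of the swap-based procedure above, using an absolute-distance cost and a tiny made-up data set; this is a simplified demonstration, not an optimized PAM implementation.

import math
import random

def total_cost(points, medoids):
    # Sum over all points of the distance to the closest medoid (the cost E).
    return sum(min(math.dist(p, m) for m in medoids) for p in points)

def k_medoids(points, k):
    medoids = random.sample(points, k)                        # 1. initialize
    best = total_cost(points, medoids)
    improved = True
    while improved:                                           # 3. while the cost decreases
        improved = False
        for i in range(k):
            for o in points:
                if o in medoids:
                    continue
                candidate = medoids[:i] + [o] + medoids[i + 1:]   # swap medoid i and o
                cost = total_cost(points, candidate)
                if cost < best:                               # keep the swap only if cost drops
                    medoids, best, improved = candidate, cost, True
    return medoids, best

pts = [(1, 1), (2, 2), (1, 2), (8, 8), (9, 9), (8, 9)]
print(k_medoids(pts, 2))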
Outlier Detection in Clustering
• An outlier is a data object that deviates significantly from the rest of the objects, as if it were generated by a different mechanism.
• Outliers are referred to as "abnormal" data.
• Outliers are different from noisy data. Noise is a random error or variance in a measured variable.
• Outliers are interesting because they are suspected of not being generated by the same mechanisms as the rest of the data.
• Outlier detection is also related to novelty detection in evolving data sets.
(Figure: the objects in region R are outliers.)
Types of outliers:
• Global outliers
• Contextual outliers
• Collective outliers
Outlier detection techniques.
Supervised, semi-supervised, and unsupervised methods
• Supervised methods model data normality and abnormality.
• In some application scenarios, objects labeled as "normal" or "outlier" are not available. Thus, an unsupervised learning method has to be used.
• In some cases, only a small set of the normal and/or outlier objects are labeled, but most of the data are unlabeled; these call for semi-supervised methods.
Statistical, proximity-based, and clustering-based methods
• Statistical methods (also known as model-based methods) make assumptions of data normality.
• The effectiveness of proximity-based methods relies heavily on the proximity (or distance) measure used.
• Clustering-based methods assume that the normal data objects belong to large and dense clusters, whereas outliers belong to small or sparse clusters, or do not belong to any clusters.
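As one concrete, illustrative instance of a statistical method, a simple z-score test flags values that lie far from the mean. The data and the threshold below are assumptions chosen for demonstration; a threshold of about 2 to 3 standard deviations is a common but not universal choice.

import statistics

def z_score_outliers(values, threshold=3.0):
    # Flag values whose distance from the mean exceeds `threshold` standard deviations.
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / stdev > threshold]

data = [12, 13, 12, 11, 14, 13, 12, 95]   # 95 is an obvious global outlier
print(z_score_outliers(data, threshold=2))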
Thank You
