03 Data Mining Functionalities
Cluster Analysis
Outlier Analysis
Concept/Class Description
Data can be associated with classes or concepts.
instances.
Such descriptions can be derived via data characterization, data
discrimination, or both.
Data characterization is a summary of the general characteristics or
features of a target class of data.
The result could be a general profile of these customers, for example that
they are 40–50 years old, employed, and have excellent credit ratings.
The system should allow users to drill down on any dimension, such as
occupation, to view these customers according to their type of
employment.
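The summary and drill-down steps can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the customer records, the field names (age, occupation, credit) and the helper functions characterize and drill_down are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical customer records; every value here is invented for illustration.
customers = [
    {"age": 42, "occupation": "engineer", "credit": "excellent"},
    {"age": 47, "occupation": "teacher", "credit": "excellent"},
    {"age": 45, "occupation": "engineer", "credit": "good"},
    {"age": 49, "occupation": "teacher", "credit": "excellent"},
]

def characterize(records):
    """Summarize a target class: size, age range, most common credit rating."""
    ages = [r["age"] for r in records]
    top_credit, _ = Counter(r["credit"] for r in records).most_common(1)[0]
    return {
        "count": len(records),
        "min_age": min(ages),
        "max_age": max(ages),
        "top_credit": top_credit,
    }

def drill_down(records, dimension):
    """Split the records by one dimension and summarize each group."""
    groups = defaultdict(list)
    for r in records:
        groups[r[dimension]].append(r)
    return {value: characterize(group) for value, group in groups.items()}

profile = characterize(customers)
by_occupation = drill_down(customers, "occupation")
```

Here profile describes the whole target class, while by_occupation holds one summary per type of employment.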
Data discrimination is a comparison of the general
features of target class data objects with the general
features of objects from one or a set of contrasting classes.
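One simple way to make such a comparison concrete is to put the value distribution of one attribute side by side for the target class and a contrasting class. The sketch below assumes hypothetical records with an education field; the function names proportions and discriminate are illustrative only.

```python
from collections import Counter

def proportions(records, attribute):
    """Share of records taking each value of the given attribute."""
    counts = Counter(r[attribute] for r in records)
    total = len(records)
    return {value: n / total for value, n in counts.items()}

def discriminate(target, contrast, attribute):
    """Pair up (target share, contrast share) for every observed value."""
    p_target = proportions(target, attribute)
    p_contrast = proportions(contrast, attribute)
    values = sorted(set(p_target) | set(p_contrast))
    return {v: (p_target.get(v, 0.0), p_contrast.get(v, 0.0)) for v in values}

# Hypothetical target and contrasting classes; the values are invented.
frequent_buyers = [
    {"education": "university"},
    {"education": "university"},
    {"education": "university"},
    {"education": "none"},
]
infrequent_buyers = [
    {"education": "university"},
    {"education": "none"},
    {"education": "none"},
    {"education": "none"},
]

comparison = discriminate(frequent_buyers, infrequent_buyers, "education")
```

Each entry of comparison gives the share of the target class and the share of the contrasting class for one attribute value, so large gaps point to discriminating features.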
Use the model to predict the class of objects whose class label is
unknown.
Decision trees
Derive a model for each of these three classes based on the descriptive
features of the items, such as price, brand, place made, type, and
category.
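As a rough illustration of deriving IF-THEN rules from class-labelled items, the sketch below uses the OneR (one-rule) heuristic, chosen here only for brevity and not taken from the slides. It picks the single attribute whose value-to-class rules make the fewest errors on the training data. The items, the attribute values and the class labels A, B and C are hypothetical.

```python
from collections import Counter, defaultdict

def one_r(records, attributes, label):
    """Return (attribute, rules) for the attribute with the fewest errors."""
    best = None
    for attr in attributes:
        # Count class labels for each value of this attribute.
        by_value = defaultdict(Counter)
        for r in records:
            by_value[r[attr]][r[label]] += 1
        # One rule per value: IF attr == value THEN majority class.
        rules = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(1 for r in records if rules[r[attr]] != r[label])
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best[0], best[1]

def predict(model, item, default=None):
    """Apply the learned rules to an item whose class label is unknown."""
    attr, rules = model
    return rules.get(item[attr], default)

# Hypothetical labelled items; the classes A, B, C are placeholders.
items = [
    {"price": "low", "brand": "X", "response": "A"},
    {"price": "low", "brand": "Y", "response": "A"},
    {"price": "mid", "brand": "X", "response": "B"},
    {"price": "mid", "brand": "Y", "response": "B"},
    {"price": "high", "brand": "X", "response": "C"},
    {"price": "high", "brand": "Y", "response": "C"},
]

model = one_r(items, ["brand", "price"], "response")
predicted = predict(model, {"price": "mid", "brand": "Z"})
```

In this toy data the price attribute separates the classes perfectly, so the learned rules read as IF price = low THEN A, IF price = mid THEN B, IF price = high THEN C.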
IF-THEN rules:
Classification and Prediction
Example:
Decision tree:
Predict the amount of revenue that each item will generate during an
upcoming sale at AllElectronics, based on previous sales data.
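Predicting a numeric value such as revenue is often done with regression. The sketch below fits a straight line by ordinary least squares, assuming a single hypothetical predictor (discount level) and invented past revenue figures. It is one possible technique, not necessarily the one intended in the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict_revenue(model, x):
    """Evaluate the fitted line at a new predictor value."""
    intercept, slope = model
    return intercept + slope * x

# Hypothetical previous sales: discount level and revenue, invented values.
discount = [1, 2, 3, 4]
revenue = [3, 5, 7, 9]

model = fit_line(discount, revenue)
forecast = predict_revenue(model, 5)
```

Unlike classification, the output here is a continuous value rather than a class label.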
Cluster Analysis
Unlike classification and prediction, which analyse class-labelled data
objects, clustering analyses data objects without consulting a known
class label.
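To show what grouping without class labels can look like, here is a minimal k-means sketch. k-means is only one of many clustering methods and is used here as an example. The points and the choice of initial centroids are hypothetical.

```python
import math

def k_means(points, centroids, max_iter=100):
    """Assign points to the nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical unlabelled 2-D data; no class labels are supplied.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]

# Assumption: the first and fourth points are used as initial centroids.
labels, centers = k_means(points, [points[0], points[3]])
```

The returned labels are cluster indices discovered from the data alone; they carry no predefined meaning.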
Outlier Analysis
Data objects that do not comply with the general behaviour or model of
the data are outliers. Most data mining methods discard
outliers as noise or exceptions.
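A simple statistical way to flag such objects is a z-score test: values that lie more than a chosen number of standard deviations from the mean are marked as outliers. The sketch below assumes a threshold of 2.0 and invented purchase amounts. Both are illustrative choices, not values from the slides.

```python
from statistics import mean, pstdev

def z_score_outliers(values, threshold=2.0):
    """Return values whose distance from the mean exceeds threshold * std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical purchase amounts; 95 is far from the rest.
amounts = [10, 12, 11, 13, 12, 11, 95]

outliers = z_score_outliers(amounts)
regular = [v for v in amounts if v not in outliers]
```

Whether the flagged values are discarded as noise or kept for closer study depends on the application.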