Unit 4
Association rules are created by analyzing data for frequent if/then patterns and using the criteria
support and confidence to identify the most important relationships. Support indicates how
frequently the items appear together in the database. Confidence indicates how often the if/then
statement has been found to be true, given that the items in the "if" part appear.
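As a minimal sketch of these two measures, assuming a small in-memory list of transactions (the transaction contents and the rule {bread} -> {butter} below are hypothetical illustrations, not data from the text):

# Hypothetical transactions; each is a set of purchased items.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk"},
]

def support(itemset, transactions):
    # Fraction of transactions that contain every item in the itemset.
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    # Of the transactions containing the antecedent, the fraction
    # that also contain the consequent.
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

print(support({"bread", "butter"}, transactions))        # 0.5
print(confidence({"bread"}, {"butter"}, transactions))   # ~0.67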
In data mining, association rules are useful for analyzing and predicting customer behavior. They
play an important part in shopping basket data analysis, product clustering, catalog design and
store layout.
Programmers use association rules to build programs capable of machine learning. Machine
learning is a type of artificial intelligence (AI) that seeks to build programs with the ability to
become more efficient without being explicitly programmed.
The goal of the techniques described in this section is to detect relationships or associations
between specific values of categorical variables in large data sets. These powerful exploratory
techniques have a wide range of applications in many areas of business practice and also
research - from the analysis of consumer preferences or human resource management, to the
history of language. These techniques enable analysts and researchers to uncover hidden patterns
in large data sets, such as "customers who order product A often also order product B or C" or
"employees who said positive things about initiative X also frequently complain about issue Y
but are happy with issue Z."
Association rules mining has many applications other than market basket analysis, including
applications in marketing, customer segmentation, medicine, electronic commerce,
bioinformatics and finance.
How do Association Rules Work?
The usefulness of this technique to address unique data mining problems is best illustrated in a simple example. Suppose you are collecting data at the
mining problems is best illustrated in a simple example. Suppose you are collecting data at the
checkout cash registers at a large bookstore. Each customer transaction is logged in a database,
and consists of the titles of the books purchased by the respective customer, perhaps additional
magazine titles and other gift items that were purchased, and so on. Hence, each record in the
database will represent one customer (transaction), and may consist of a single book purchased
by that customer, or it may consist of many (perhaps hundreds of) different items that were
purchased, arranged in an arbitrary order depending on the order in which the different items
(books, magazines, and so on) came down the conveyor belt at the cash register. The purpose of
the analysis is to find associations between the items that were purchased, i.e., to derive
association rules that identify the items and co-occurrences of different items that appear with the
greatest (co-) frequencies. For example, you want to learn which books are likely to be
purchased by a customer who you know already purchased (or is about to purchase) a particular
book. This type of information could then quickly be used to suggest to the customer those
additional titles. You may already be "familiar" with the results of these types of analyses, if you
are a customer of various on-line (Web-based) retail businesses; many times when making a
purchase on-line, the vendor will suggest similar items (to the ones purchased by you) at the time
of "check-out", based on some rules such as "customers who buy book title A are also likely to
purchase book title B," and so on.
Many interesting algorithms have been proposed to discover association rules. A key feature
common to these methods is that they assume the underlying database is enormous, and they
therefore require multiple passes over the database.
5.3.2 Classification
Definition: Classification is a Data Mining (machine learning) technique used to predict group
membership for data instances. For example, you may wish to use classification to predict if the
weather on a particular day will be “sunny”, “rainy” or “cloudy”. Popular classification
techniques include decision trees and neural networks.
Regression and classification are two of the more popular predictive data mining techniques.
Classification involves finding rules that partition the data into disjoint groups. The input for the
classification is the training data set, whose class labels are already known. Classification
analyzes the training data set and constructs a model based on the class label, and aims to assign
a class label to future unlabelled records. Since the class field is known, this type of
classification is known as supervised learning. Such a classification process generates a set of
classification rules, which can be used to classify future data and to develop a better
understanding of each class in the database.
Applications include credit card analysis, banking, medical applications and the like.
Classification Example
Problem:
Given a new automobile insurance applicant, should he or she be classified as low risk, medium
risk or high risk?
Classification rules for the above problem could use a variety of data, such as the customer's
educational level, salary, age, etc.
Classification Rules:
Rule 1:
P.credit = “Excellent”
Rule 2:
∀ person P, P.degree = bachelors and (P.income ≥ 25,000 and P.income ≤ 75,000) ⇒ P.credit =
“Good”
A Decision Tree is a predictive model that, as its name implies, can be viewed as a tree.
Specifically, each branch of the tree is a classification question, and the leaves of the tree are
partitions of the dataset (database table/file) with their classification.
In the above classification, customers are classified into four groups: Bad, Good, Average and
Excellent. At any moment in time, a customer falls into exactly one of these groups.
5.3.3 Regression
Regression is the oldest and most well-known statistical technique that the data mining
community utilizes. Basically, regression takes a numerical dataset and develops a mathematical
formula that fits the data (e.g. y = a + bx, where y is the dependent variable and x is the
independent variable). When you're ready to use the results to predict future behavior, you
simply take your new data, plug it into the developed formula, and you've got a prediction. The
major limitation of this technique is that it only works well with continuous quantitative data
(like weight, speed or age).
If the data is categorical, where order is not significant (like color, name or gender), then you are
better off choosing another technique.
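As a minimal sketch of this idea, the formula y = a + bx can be fitted by ordinary least squares; the (x, y) pairs below are hypothetical:

# Hypothetical data points (x = years of experience, y = salary in $1000s).
xs = [1, 2, 3, 4, 5]
ys = [30, 35, 42, 48, 55]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares estimates of the slope b and intercept a in y = a + b*x.
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x

# Plug new data into the fitted formula to get a prediction.
print(a + b * 6)   # predicted y for x = 6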
There are two forms of data analysis that can be used for extracting models describing important
classes or to predict future data trends. These two forms are as follows −
Classification
Prediction
Classification models predict categorical class labels, and prediction models predict continuous-
valued functions. For example, we can build a classification model to categorize bank loan
applications as either safe or risky, or a prediction model to predict the expenditures in dollars of
potential customers on computer equipment, given their income and occupation.
What is classification?
Following are the examples of cases where the data analysis task is Classification −
A bank loan officer wants to analyze the data in order to know which customers (loan
applicants) are risky and which are safe.
A marketing manager at a company needs to analyze a customer with a given profile to
predict whether that customer will buy a new computer.
In both of the above examples, a model or classifier is constructed to predict the categorical
labels. These labels are risky or safe for loan application data and yes or no for marketing data.
What is prediction?
Following are the examples of cases where the data analysis task is Prediction −
Suppose the marketing manager needs to predict how much a given customer will spend during a
sale at his company. In this example we are asked to predict a numeric value. Therefore the
data analysis task is an example of numeric prediction. In this case, a model or predictor is
constructed that predicts a continuous-valued function, or ordered value.
Note − Regression analysis is a statistical methodology that is most often used for numeric
prediction.
With the help of the bank loan application that we have discussed above, let us understand the
working of classification. The Data Classification process includes two steps −
Building the Classifier (Learning Step) − In this step, a classification algorithm builds the
classifier by analyzing a training set made up of database tuples and their associated class labels.
Using the Classifier (Classification Step) − In this step, the classifier is used for classification.
Here the test data is used to estimate the accuracy of the classification rules. The classification
rules can be applied to the new data tuples if the accuracy is considered acceptable.
Preparing the data for classification and prediction involves the following activities −
Data Cleaning − Data cleaning involves removing the noise and treating missing values.
The noise is removed by applying smoothing techniques, and the problem of missing
values is solved by replacing a missing value with the most commonly occurring value
for that attribute.
Relevance Analysis − The database may also have irrelevant attributes. Correlation
analysis is used to know whether any two given attributes are related.
Data Transformation and Reduction − The data can be transformed by any of the
following methods.
o Normalization − The data is transformed using normalization. Normalization
involves scaling all values for a given attribute so that they fall within a small
specified range. Normalization is used when, in the learning step, neural
networks or methods involving distance measurements are used. (A small
normalization sketch follows this list.)
o Generalization − The data can also be transformed by generalizing it to a
higher-level concept. For this purpose we can use concept hierarchies.
Note − Data can also be reduced by some other methods such as wavelet transformation,
binning, histogram analysis, and clustering.
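As a minimal sketch of the normalization step, using min-max scaling into the range [0, 1] (the income values below are hypothetical):

incomes = [12000, 35000, 54000, 98000]   # hypothetical attribute values

lo, hi = min(incomes), max(incomes)

# Min-max normalization: rescale each value into the range [0, 1].
normalized = [(v - lo) / (hi - lo) for v in incomes]

print(normalized)   # [0.0, 0.267..., 0.488..., 1.0]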
Here are the criteria for comparing the methods of Classification and Prediction −
Accuracy − The accuracy of a classifier refers to its ability to predict the class label
correctly; the accuracy of a predictor refers to how well it can guess the value of the
predicted attribute for new data.
Speed − This refers to the computational cost of generating and using the classifier or
predictor.
Robustness − This refers to the ability of the classifier or predictor to make correct
predictions from noisy data.
Scalability − This refers to the ability to construct the classifier or predictor efficiently,
given a large amount of data.
Interpretability − This refers to the extent to which the classifier or predictor can be
understood.
A decision tree is a structure that includes a root node, branches, and leaf nodes. Each internal
node denotes a test on an attribute, each branch denotes the outcome of a test, and each leaf node
holds a class label. The topmost node in the tree is the root node.
Consider, for example, a decision tree for the concept buy_computer, which indicates whether a
customer at a company is likely to buy a computer. Each internal node represents a test on an
attribute, and each leaf node represents a class.
The benefits of having a decision tree are as follows −
It does not require any domain knowledge.
It is easy to comprehend.
The learning and classification steps of a decision tree are simple and fast.
A machine learning researcher named J. Ross Quinlan developed a decision tree algorithm known
as ID3 (Iterative Dichotomiser) in 1980. Later, he presented C4.5, which was the successor of
ID3. ID3 and C4.5 adopt a greedy approach: there is no backtracking, and the trees are
constructed in a top-down recursive divide-and-conquer manner.
Input:
Data partition, D, which is a set of training tuples and their associated class labels.
attribute_list, the set of candidate attributes.
Attribute_selection_method, a procedure to determine the splitting criterion that best
partitions the data tuples into individual classes. This criterion includes a
splitting_attribute and either a splitting point or a splitting subset.
Output:
A decision tree
Method
create a node N;
if the tuples in D are all of the same class C then
    return N as a leaf node labeled with class C;
if attribute_list is empty then
    return N as a leaf node labeled with the majority class in D;
apply Attribute_selection_method(D, attribute_list) to find the best splitting_criterion;
label node N with the splitting_criterion;
for each outcome j of the splitting_criterion
    let Dj be the set of data tuples in D satisfying outcome j;
    if Dj is empty then
        attach a leaf labeled with the majority class in D to node N;
    else
        attach the node returned by Generate_decision_tree(Dj, attribute_list) to node N;
end for
return N;
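For a runnable counterpart to the pseudocode above, the following sketch uses scikit-learn's DecisionTreeClassifier; the library choice and the tiny encoded buy_computer-style training set are assumptions of this example, not part of the algorithm described here:

from sklearn.tree import DecisionTreeClassifier

# Hypothetical training tuples.
# Features: [age_group, income_level, student] encoded as integers.
X = [
    [0, 2, 0],   # youth, high income, not a student
    [0, 2, 1],   # youth, high income, student
    [1, 1, 0],   # middle-aged, medium income, not a student
    [2, 0, 1],   # senior, low income, student
]
y = ["no", "yes", "yes", "yes"]   # class label: buys_computer

# Entropy-based splitting is roughly in the spirit of ID3/C4.5.
clf = DecisionTreeClassifier(criterion="entropy")
clf.fit(X, y)

# Predict the class of a new, unlabeled tuple.
print(clf.predict([[0, 1, 1]]))   # e.g. ['yes']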
Tree Pruning
Tree pruning is performed in order to remove anomalies in the training data due to noise or
outliers. The pruned trees are smaller and less complex.
Cost Complexity
The cost complexity of a tree is measured by the number of leaves in the tree and the error rate of
the tree.
Bayes' Theorem
Bayes' Theorem is named after Thomas Bayes. There are two types of probabilities −
Posterior Probability, P(H|X) − the probability that hypothesis H holds given the observed data tuple X.
Prior Probability, P(H) − the initial probability of hypothesis H, before any data is observed.
Bayes' Theorem relates them as P(H|X) = P(X|H) P(H) / P(X).
Bayesian Belief Networks specify joint conditional probability distributions. They are also
known as Belief Networks, Bayesian Networks, or Probabilistic Networks.
Consider, as an example, a directed acyclic graph over six Boolean variables, including
FamilyHistory, Smoker, LungCancer and PositiveXray. The arcs in such a graph represent causal
knowledge. For example, lung cancer is influenced by a person's family history of lung cancer,
as well as by whether or not the person is a smoker. It is worth noting that the variable
PositiveXray is independent of whether the patient has a family history of lung cancer or
whether the patient is a smoker, given that we know the patient has lung cancer.
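As a minimal sketch of how a node's conditional probability distribution can be represented and combined, with entirely hypothetical probabilities for P(LungCancer | FamilyHistory, Smoker):

# Hypothetical conditional probability table for LungCancer given its parents.
# Keys are (family_history, smoker); values are P(LungCancer = True | parents).
cpt_lung_cancer = {
    (True,  True):  0.8,
    (True,  False): 0.5,
    (False, True):  0.7,
    (False, False): 0.1,
}

# Hypothetical prior probabilities for the parent nodes (assumed to be roots).
p_family_history = 0.3
p_smoker = 0.4

# Joint probability P(FamilyHistory=True, Smoker=True, LungCancer=True):
# product of each node's probability given its parents.
joint = p_family_history * p_smoker * cpt_lung_cancer[(True, True)]
print(joint)   # 0.3 * 0.4 * 0.8 = 0.096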
IF-THEN Rules
A rule-based classifier makes use of a set of IF-THEN rules for classification. We can express a
rule in the following form −
IF condition THEN conclusion
Points to remember −
The IF part of the rule is called the rule antecedent or precondition, and the THEN part is
called the rule consequent.
If the condition holds true for a given tuple, then the antecedent is satisfied.
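As a minimal sketch of a rule-based classifier applying IF-THEN rules to a data tuple (the attribute names, rules and default class below are hypothetical):

# Each rule is (condition, conclusion); the condition tests a tuple (a dict).
rules = [
    (lambda t: t["income"] > 75000, "Excellent"),
    (lambda t: t["degree"] == "bachelors" and 25000 <= t["income"] <= 75000, "Good"),
]

def classify(tuple_, rules, default="Average"):
    # Return the conclusion of the first rule whose antecedent the tuple satisfies.
    for condition, conclusion in rules:
        if condition(tuple_):
            return conclusion
    return default   # no rule antecedent was satisfied

applicant = {"degree": "bachelors", "income": 40000}
print(classify(applicant, rules))   # Good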
Rule Extraction
Here we will learn how to build a rule-based classifier by extracting IF-THEN rules from a
decision tree.
Points to remember −
One rule is created for each path from the root to the leaf node.
To form a rule antecedent, each splitting criterion is logically ANDed.
The leaf node holds the class prediction, forming the rule consequent.
The sequential covering algorithm can be used to extract IF-THEN rules directly from the
training data; we do not need to generate a decision tree first. In this algorithm, each rule for a
given class covers many of the tuples of that class.
Some of the sequential covering algorithms are AQ, CN2, and RIPPER. As per the general
strategy, the rules are learned one at a time. Each time a rule is learned, the tuples covered by
that rule are removed, and the process continues for the rest of the tuples. (By contrast, the path
to each leaf of a decision tree corresponds to a rule.)
Note − Decision tree induction can be considered as learning a set of rules simultaneously.
The following is the sequential learning algorithm, in which rules are learned for one class at a
time. When learning a rule for class Ci, we want the rule to cover all the tuples of class Ci only
and no tuple from any other class.
for each class c do
    repeat
        Rule = Learn_One_Rule(D, Att_vals, c);
        remove the tuples covered by Rule from D;
        add Rule to Rule_set;
    until termination condition;
end for
return Rule_set;
Rule Pruning
The assessment of quality is made on the original set of training data. The rule may
perform well on the training data but less well on subsequent data. That is why rule
pruning is required.
The rule is pruned by removing a conjunct. The rule R is pruned if the pruned version of R
has greater quality, as assessed on an independent set of tuples.
FOIL is one of the simplest and most effective methods for rule pruning. For a given rule R,
FOIL_Prune(R) = (pos − neg) / (pos + neg)
where pos and neg are the number of positive and negative tuples covered by R, respectively.
Note − This value will increase with the accuracy of R on the pruning set. Hence, if the
FOIL_Prune value is higher for the pruned version of R, then we prune R.
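Expressed in code, with hypothetical counts of covered tuples:

def foil_prune(pos, neg):
    # FOIL_Prune(R) = (pos - neg) / (pos + neg) over the tuples covered by R.
    return (pos - neg) / (pos + neg)

print(foil_prune(40, 10))   # 0.6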
Here we will discuss other classification methods such as Genetic Algorithms, Rough Set
Approach, and Fuzzy Set Approach.
Genetic Algorithms
The idea of genetic algorithms is derived from natural evolution. In a genetic algorithm, an initial
population is first created, consisting of randomly generated rules. Each rule can be represented
by a string of bits.
For example, suppose that in a given training set the samples are described by two Boolean
attributes, A1 and A2, and that the training set contains two classes, C1 and C2.
The rule IF A1 AND NOT A2 THEN C2 can be encoded as the bit string 100, where the two
leftmost bits represent the attributes A1 and A2, respectively, and the rightmost bit represents the
class.
Likewise, the rule IF NOT A1 AND NOT A2 THEN C1 can be encoded as 001.
Note − If an attribute has K values where K > 2, then K bits can be used to encode the
attribute's values. Classes are also encoded in the same manner.
Points to remember −
Based on the notion of survival of the fittest, a new population is formed that consists of
the fittest rules in the current population, as well as offspring of these rules.
The fitness of a rule is assessed by its classification accuracy on a set of training samples.
Genetic operators such as crossover and mutation are applied to create offspring.
In crossover, substrings from a pair of rules are swapped to form a new pair of rules.
In mutation, randomly selected bits in a rule's string are inverted (both operators are
sketched in code after this list).
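As a minimal sketch of the crossover and mutation operators on bit-string rules (the example strings and crossover point are arbitrary):

import random

def crossover(rule_a, rule_b, point):
    # Swap the substrings after the given crossover point.
    return rule_a[:point] + rule_b[point:], rule_b[:point] + rule_a[point:]

def mutate(rule, n_bits=1):
    # Invert n randomly selected bits in the rule's bit string.
    bits = list(rule)
    for i in random.sample(range(len(bits)), n_bits):
        bits[i] = "1" if bits[i] == "0" else "0"
    return "".join(bits)

print(crossover("100", "001", point=1))   # ('101', '000')
print(mutate("100"))                      # e.g. '110'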
We can use the rough set approach to discover structural relationships within imprecise and noisy
data.
Note − This approach can only be applied to discrete-valued attributes. Therefore, continuous-
valued attributes must be discretized before use.
Rough Set Theory is based on the establishment of equivalence classes within the given training
data. The tuples that form an equivalence class are indiscernible, which means the samples are
identical with respect to the attributes describing the data.
There are some classes in the given real-world data which cannot be distinguished in terms of the
available attributes. We can use rough sets to roughly define such classes.
For a given class C, the rough set definition is approximated by two sets as follows −
Lower Approximation of C − the set of all data tuples that, based on knowledge of the
attributes, certainly belong to class C.
Upper Approximation of C − the set of all data tuples that, based on knowledge of the
attributes, cannot be described as not belonging to C.
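As a minimal sketch of computing the two approximations from equivalence classes (the tuple IDs and the membership of class C below are hypothetical):

# Equivalence classes: groups of tuple IDs that are indiscernible
# with respect to the available attributes.
equivalence_classes = [{1, 2}, {3, 4}, {5}]

# Hypothetical target class C, given as a set of tuple IDs.
C = {1, 2, 3}

# Lower approximation: union of equivalence classes fully contained in C.
lower = set().union(*(e for e in equivalence_classes if e <= C))

# Upper approximation: union of equivalence classes that overlap C at all.
upper = set().union(*(e for e in equivalence_classes if e & C))

print(lower)   # {1, 2}
print(upper)   # {1, 2, 3, 4}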
Fuzzy Set Theory is also called Possibility Theory. This theory was proposed by Lotfi Zadeh in
1965 as an alternative to two-valued logic and probability theory. It allows us to work at a high
level of abstraction and provides the means for dealing with imprecise measurements of data.
Fuzzy set theory also allows us to deal with vague or inexact facts. For example, membership in
a set of high incomes is inexact (e.g. if $50,000 is high, then what about $49,000 or $48,000?).
Unlike a traditional crisp set, where an element either belongs to the set S or to its complement,
in fuzzy set theory an element can belong to more than one fuzzy set.
For example, the income value $49,000 belongs to both the medium and high fuzzy sets, but to
differing degrees. Fuzzy set notation for this income value is as follows −
m_medium_income($49,000) = 0.10 and m_high_income($49,000) = 0.96, say,
where m is the membership function that operates on the fuzzy sets medium_income and
high_income, respectively; each membership degree lies between 0 and 1.
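As a minimal sketch of such membership functions, using piecewise-linear shapes whose breakpoints are hypothetical but chosen to reproduce the degrees quoted above:

def m_high_income(income):
    # Membership in the "high income" fuzzy set: 0 below $25,000,
    # rising linearly to 1 at $50,000 and above.
    if income <= 25000:
        return 0.0
    if income >= 50000:
        return 1.0
    return (income - 25000) / 25000

def m_medium_income(income):
    # Trapezoidal membership in "medium income": full membership between
    # $30,000 and $40,000, falling to 0 at $20,000 and $50,000.
    if income <= 20000 or income >= 50000:
        return 0.0
    if income < 30000:
        return (income - 20000) / 10000
    if income <= 40000:
        return 1.0
    return (50000 - income) / 10000

# $49,000 belongs to both fuzzy sets, to differing degrees.
print(m_medium_income(49000), m_high_income(49000))   # 0.1 0.96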