Data Mining Question Bank

The document discusses association rule mining and provides answers to 12 questions. The key points are: 1) Association rule mining involves finding frequent itemsets that occur at or above a minimum support threshold and generating strong association rules that satisfy minimum support and confidence. 2) The two main measures in association rule mining are support and confidence.

Uploaded by

ravi3754
Copyright
© Attribution Non-Commercial (BY-NC)

DATA WAREHOUSING AND MINING

UNIT 3: ASSOCIATION RULES


2 MARKS QUESTIONS WITH ANSWERS:

Q1. How can we find association rules from a large amount of data?
ANS: Association rule mining is a two-step process.
1. Find all frequent item sets: By definition, each of these item sets occurs at least as frequently as a predetermined minimum support count.
2. Generate strong association rules from the frequent item sets: By definition, these rules must satisfy minimum support and minimum confidence.

Q2. What language constructs are useful in defining a data mining query language?
ANS:

Q3. What are the two measures present in association mining?
ANS: The two measures are
1. Support (A => B) = P(A U B), the probability that a transaction contains every item in both A and B.
2. Confidence (A => B) = P(B | A), the probability that a transaction containing A also contains B.

Q4. What is the difference between Boolean and quantitative association rules?
ANS: BOOLEAN ASSOCIATION RULE: It involves associations between the presence or absence of items.
buys(x, SQLServer) ^ buys(x, DM Book) => buys(x, DBMS)

QUANTITATIVE ASSOCIATION RULE: It describes associations between quantitative items or attributes.
age(x, 30..39) ^ income(x, 42k..48k) => buys(x, PC)
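To make the support and confidence measures from Q3 concrete for a Boolean rule such as the one in Q4, here is a minimal Python sketch. The five transactions are made-up illustration data, and the helper names (support_count, support, confidence) are assumptions of this example, not part of the question bank.

```python
# Toy transactions (hypothetical data, invented for illustration only).
transactions = [
    {"SQLServer", "DM Book", "DBMS"},
    {"SQLServer", "DM Book"},
    {"SQLServer", "DBMS"},
    {"DM Book", "DBMS"},
    {"SQLServer", "DM Book", "DBMS"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(antecedent, consequent, transactions):
    """Support(A => B): fraction of transactions containing all of A and B."""
    both = antecedent | consequent
    return support_count(both, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence(A => B): support_count(A U B) / support_count(A)."""
    both = antecedent | consequent
    return support_count(both, transactions) / support_count(antecedent, transactions)


# Rule: buys(x, SQLServer) ^ buys(x, DM Book) => buys(x, DBMS)
A = frozenset({"SQLServer", "DM Book"})
B = frozenset({"DBMS"})
print("support    =", support(A, B, transactions))      # 2 of 5 transactions
print("confidence =", confidence(A, B, transactions))   # 2 of the 3 containing A
```

Whether the rule counts as strong depends on the minimum support and minimum confidence thresholds chosen by the user.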

Q5. What is the difference between single-dimensional and multidimensional association rules?
ANS: SINGLE DIMENSIONAL ASSOCIATION RULE: It contains a single distinct predicate (e.g. buys) with multiple occurrences (i.e. the predicate occurs more than once within the rule).
buys(x, digital camera) => buys(x, HP printer)

MULTI DIMENSIONAL ASSOCIATION RULE: An association rule that involves two or more dimensions or predicates is referred to as a multidimensional association rule.
age(x, 20..29) ^ occupation(x, student) => buys(x, laptop)

Q6. How is the Apriori property used in the algorithm?
ANS: To understand this, let us look at how Lk-1 is used to find Lk for k >= 2. A two-step process is followed, consisting of join and prune actions.

Q7. Define anti-monotone?
ANS: If a set cannot pass a test, all of its supersets will fail the same test as well. This property is called anti-monotone because it is monotonic in the context of failing a test. The Apriori property belongs to this special category of properties.

Q8. Define Apriori property?
ANS: APRIORI PROPERTY: All non-empty subsets of a frequent item set must also be frequent.

Q9. How to find strong association rules using confidence?
ANS: Once the frequent item sets from the transactions in a database D have been found, it is straightforward to generate strong association rules from them, using
Confidence (A => B) = P(B | A) = Support_count(A U B) / Support_count(A)
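The join and prune actions from Q6 and the confidence test from Q9 can be tied together in a short Python sketch. This is an illustration under stated assumptions, not a reference implementation: the join step is simplified to "union any two frequent (k-1)-item sets and keep unions of size k" rather than the prefix-based join usually presented, and the transactions, thresholds and function names are all made up for this example.

```python
from itertools import combinations


def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def apriori_gen(prev_frequent, k):
    """Build candidate k-item sets from the frequent (k-1)-item sets."""
    prev = set(prev_frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) != k:  # simplified join: keep unions of size k only
                continue
            # Prune (Apriori property): every (k-1)-subset must be frequent.
            if all(frozenset(s) in prev for s in combinations(union, k - 1)):
                candidates.add(union)
    return candidates


def frequent_itemsets(transactions, min_count):
    """Level-wise search; returns {frequent item set: support count}."""
    items = {i for t in transactions for i in t}
    level = {frozenset([i]) for i in items
             if support_count(frozenset([i]), transactions) >= min_count}
    counts = {}
    k = 1
    while level:
        for s in level:
            counts[s] = support_count(s, transactions)
        k += 1
        level = {c for c in apriori_gen(level, k)
                 if support_count(c, transactions) >= min_count}
    return counts


def strong_rules(counts, min_conf):
    """Rules A => B with Support_count(A U B) / Support_count(A) >= min_conf."""
    rules = []
    for itemset, count in counts.items():
        for r in range(1, len(itemset)):
            for a in combinations(sorted(itemset), r):
                antecedent = frozenset(a)
                conf = count / counts[antecedent]
                if conf >= min_conf:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules


# Hypothetical toy data and thresholds.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]
counts = frequent_itemsets(transactions, min_count=3)
rules = strong_rules(counts, min_conf=0.75)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "=>", sorted(consequent), "confidence", conf)
```

In this toy data the three-item set survives the prune step, because all of its two-item subsets are frequent, but it is then rejected by the support count. The rule step can look up counts[antecedent] safely because, by the Apriori property, every subset of a frequent item set is itself frequent and has already been counted.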

Q10. How might the efficiency of Apriori be improved?
ANS: Many variations of the Apriori algorithm have been proposed that focus on improving the efficiency of the original algorithm.
1. Hash-based techniques (hashing item sets into corresponding buckets).
2. Transaction reduction (reducing the number of transactions scanned in future iterations).
3. Partitioning (partitioning the data to find candidate item sets).
4. Sampling (mining on a subset of the given data).

Q11. Define FP-tree pattern?
ANS: FP TREE PATTERN: After a first scan of database D has found the frequent items and their support counts, create the root of the tree, labeled with NULL. Scan database D a second time. The items in each transaction are processed in L order (i.e. sorted in descending order of support count), and a branch is created for each transaction.

Q12. Define multilevel association rule from a transaction database?
ANS: MULTILEVEL ASSOCIATION RULE: It involves concepts at different levels of abstraction. It can be mined using several strategies, based on how minimum support thresholds are defined at each level of abstraction.
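As a rough illustration of Q12, the Python sketch below counts single items at two levels of abstraction and compares two ways of setting the thresholds: the same minimum support count at every level, and a lower one at the more specific level. The concept hierarchy, the transactions, the threshold values and the function names are all hypothetical, chosen only to show the effect.

```python
# Hypothetical concept hierarchy: specific item -> higher-level category.
hierarchy = {
    "Dell laptop": "computer",
    "HP laptop": "computer",
    "HP printer": "printer",
    "Canon printer": "printer",
}

# Hypothetical transactions recorded at the most specific level.
transactions = [
    {"Dell laptop", "HP printer"},
    {"HP laptop", "Canon printer"},
    {"Dell laptop"},
    {"HP laptop", "HP printer"},
    {"Canon printer"},
]


def generalize(transaction, hierarchy):
    """Replace each item by its category (one level up in the hierarchy)."""
    return {hierarchy[item] for item in transaction}


def frequent_items(transactions, min_count):
    """Items whose support count is at least min_count."""
    counts = {}
    for t in transactions:
        for item in t:
            counts[item] = counts.get(item, 0) + 1
    return {item: c for item, c in counts.items() if c >= min_count}


# Level 1 (categories) with a minimum support count of 3.
level1 = frequent_items([generalize(t, hierarchy) for t in transactions], 3)

# Level 2 (specific items): same threshold versus a reduced threshold.
level2_uniform = frequent_items(transactions, 3)
level2_reduced = frequent_items(transactions, 2)

print("level 1:", level1)
print("level 2, same threshold:", level2_uniform)
print("level 2, reduced threshold:", level2_reduced)
```

With the same threshold at both levels, no specific item is frequent here, even though both categories are. Lowering the threshold at the specific level recovers all four items, which is why the choice of per-level thresholds matters.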
