This document is an examination paper for the B. Tech. (CSE) VI Semester (Supplementary) Examination, December 2022, at Kakatiya Institute of Technology & Science. It covers topics related to Data Warehousing and Data Mining, including definitions, architectures, data cleaning, clustering methods, and classification processes. The exam consists of various questions requiring definitions, explanations, and problem-solving related to data mining techniques and methodologies.


18BT465/3

URR-18 Write your Roll number

KAKATIYA INSTITUTE OF TECHNOLOGY & SCIENCE, WARANGAL


(An Autonomous Institute under Kakatiya University, Warangal)
FACULTY OF ENGINEERING AND TECHNOLOGY
B. Tech. (CSE) VI Semester (Supplementary) Examination, December 2022
U18CS605: Data Warehousing and Data Mining
[Time: 3 Hours] [Max. Marks: 60]
Note: Answer all the questions.
CDLL CO’s
1. a. Define a data warehouse. How is it different from a database? [1] R CO1
b. Define the term data cleaning. [1] R CO1
c. Explain the three-tier architecture of data warehousing. [1] U CO1
d. What factors lead to the mining of data? [1] R CO2
e. Explain how a data mining system can be integrated with a database/data warehouse system. [1] U CO2
f. How do I generate association rules from frequent itemsets? [1] U CO2
g. What are the evaluation measures used for evaluating the performance of a classifier? [1] R CO3
h. What is back propagation? [1] R CO3
i. Explain supervised learning and unsupervised learning. [1] U CO3
j. What is density based clustering? [1] R CO4
k. Explain data clustering. [1] U CO4
l. Briefly discuss the hierarchical clustering method. [1] U CO4

2. a. Discuss the issues to consider during data integration. [6] U CO1


b. Suppose a group of 12 sales price records has been sorted as follows: 5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215. Partition them into three bins by each of the following methods. [6] Ap CO1
(i) equal-frequency (equidepth) partitioning
(ii) equal-width partitioning
(iii) clustering
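As an illustration of parts (i) and (ii) only, the following is a minimal Python sketch, not a model answer. The function names `equal_frequency_bins` and `equal_width_bins` are hypothetical, and the sketch assumes that the number of values is divisible by the number of bins and that the maximum value is placed in the last equal-width interval. Part (iii) is left out because the result depends on the clustering method chosen.

```python
def equal_frequency_bins(values, k):
    """Split the sorted values into k bins that each hold the same count."""
    ordered = sorted(values)
    depth = len(ordered) // k  # assumption: len(values) is divisible by k
    return [ordered[i * depth:(i + 1) * depth] for i in range(k)]


def equal_width_bins(values, k):
    """Split the values into k intervals of equal width over [min, max]."""
    ordered = sorted(values)
    low, high = ordered[0], ordered[-1]
    width = (high - low) / k
    bins = [[] for _ in range(k)]
    for v in ordered:
        # Assumption: the maximum value falls into the last interval.
        index = min(int((v - low) // width), k - 1)
        bins[index].append(v)
    return bins


prices = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
freq_bins = equal_frequency_bins(prices, 3)
width_bins = equal_width_bins(prices, 3)
print(freq_bins)
print(width_bins)
```

With a range of 215 - 5 = 210, each equal-width interval spans 70, so most records land in the first interval while equal-frequency partitioning places four records in every bin.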
(OR)
c. A data warehouse can be modeled by either a star schema or a snowflake schema. Briefly describe the similarities and the differences of the two models, and then analyze their advantages and disadvantages with regard to one another. Give your opinion of which might be more empirically useful and state the reasons behind your answer. [6] U CO1
d. Consider the attribute age: 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70. [6] Ap CO1
(i) Use smoothing by bin means to smooth the above data, using a bin depth of 3. Illustrate your steps. Comment on the effect of this technique for the given data.
(ii) How might you determine outliers in the data?
(iii) What other methods are there for data smoothing?
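As an illustration of part (i) only, the following is a minimal Python sketch of smoothing by bin means with a bin depth of 3, not a model answer. The function name `smooth_by_bin_means` is hypothetical, and the sketch assumes equal-frequency bins taken in sorted order.

```python
def smooth_by_bin_means(values, depth):
    """Replace every value with the mean of its equal-frequency bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), depth):
        bin_values = ordered[start:start + depth]
        mean = sum(bin_values) / len(bin_values)
        # Every member of the bin takes the bin mean.
        smoothed.extend([mean] * len(bin_values))
    return smoothed


ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]
smoothed_ages = smooth_by_bin_means(ages, 3)
print(smoothed_ages)
```

The 27 values form nine bins of three. For example, the bin (20, 21, 22) becomes (21, 21, 21) and the last bin (46, 52, 70) becomes (56, 56, 56).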

3. a. List and describe the five primitives for specifying a data mining task. [6] U CO2
b. The price of each item in a store is nonnegative. The store manager is only interested in rules of certain forms, using the constraints given below. For each of the following cases, identify the kinds of constraint they represent and briefly discuss how to mine such association rules using constraint-based pattern mining. [6] Ap CO2
(i) Containing at least one Blu-ray DVD movie
(ii) Containing items whose sum of the prices is less than $150
(iii) Containing one free item and other items whose sum of the prices is at least $200
(iv) Where the average price of all the items is between $100 and $500
(OR)
c. Describe the steps involved in data mining when viewed as a process of knowledge discovery. [6] U CO2
d. The price of each item in a store is nonnegative. The store manager is only interested in rules of the form: “one free item may trigger $200 total purchases in the same transaction”. State how to mine such rules efficiently using constraint-based pattern mining. [6] Ap CO2

4. a. Briefly describe the classification processes using: [6] U CO3
(i) genetic algorithms
(ii) rough sets
(iii) fuzzy sets.
b. Explain tree pruning in decision tree induction. What is the drawback of using a separate set of tuples to evaluate pruning? [6] An CO3
(OR)
c. Discuss the rough set approach. [6] U CO3
d. Explain the k-nearest neighbor classification algorithm. [6] An CO3

5. a. Briefly describe and give examples of each of the following approaches to clustering: partitioning methods, hierarchical methods, density-based methods and grid-based methods. [6] U CO4
b. Explain the STING algorithm. [6] An CO4
(OR)
c. Describe: [6] U CO4
(i) Mining sequence data
(ii) Web data mining
d. Explain the k-medoids algorithm and its advantages. [6] An CO4

--- Question Paper Ends ---
