
DWM May 2024

Paper / Subject Code: 31924 / Data Warehousing & Mining

Time: 3 hours
Max. Marks: 80
=====================================================================
Note: 1. Question no. 1 is compulsory.
2. Attempt any three out of the remaining five.
3. Assumptions made should be clearly indicated.
4. Figures to the right indicate full marks.
5. Assume suitable data whenever necessary.

Question 1 Write a short note on the following. Solve any four. (5 marks each)
A Write a note on web usage mining. Also state any two of its applications.
B Describe any five issues in data mining.
C Explain how Naive Bayes classification makes predictions and discuss the "naive" assumption in Naive Bayes. Provide an example to illustrate the application of Naive Bayes in a real-world scenario.

D Suppose the data for clustering is {6, 14, 18, 22, 1, 40, 50, 11, 25}. Consider k = 2 and cluster the given data using the k-means algorithm.

E Explain the concept of market basket analysis with an example.

F Differentiate between ER modeling and dimensional modeling.
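For item D of Question 1, the k-means iterations can be traced with a short script. This is only a sketch: the question does not fix the initial centroids, so the first two data values (6 and 14) are assumed here, and the function name kmeans_1d is illustrative.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Lloyd's k-means on 1-D data. Returns (clusters, centroids)."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign the point to its nearest centroid (ties go to the lower index).
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        updated = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:  # no centroid moved, so the assignments are stable
            break
        centroids = updated
    return clusters, centroids


data = [6, 14, 18, 22, 1, 40, 50, 11, 25]
# Assumption: the first two values serve as the initial centroids.
clusters, centroids = kmeans_1d(data, centroids=[6, 14])
for cluster, centre in zip(clusters, centroids):
    print(sorted(cluster), round(centre, 2))
```

With this starting choice the procedure settles on the clusters {1, 6, 11, 14, 18, 22, 25} and {40, 50}, with centroids of about 13.86 and 45. A different initial choice may take a different path or reach a different result.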

Question 2 (10 marks each)

A Describe in detail how to evaluate the accuracy of a classifier.

B Illustrate the major steps in the ETL process.

Question 3 (10 marks each)

A Explain the KDD process with a neat diagram. Also state any five applications of data mining.

B For the table given, perform the Apriori algorithm and show the frequent itemsets and strong association rules. Assume a minimum support of 30% and a minimum confidence of 70%.

TID   Items
1     1, 4, 6, 8
2     2, 5, 3
3     7, 1, 3, 8
4     9, 10
5     1, 5
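For item B of Question 3, the level-wise Apriori search and the rule check can be sketched as follows. The function names apriori and strong_rules are illustrative, and support is taken as the fraction of the five transactions that contain an itemset.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    n = len(transactions)
    supports = {}
    # Level 1 candidates: each distinct item on its own.
    candidates = {frozenset([item]) for t in transactions for item in t}
    k = 1
    while candidates:
        frequent = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                frequent[c] = support
        supports.update(frequent)
        k += 1
        # Join frequent (k-1)-itemsets into k-itemsets, then prune candidates
        # that have any infrequent (k-1)-subset.
        joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
        candidates = {c for c in joined
                      if all(frozenset(s) in frequent for s in combinations(c, k - 1))}
    return supports


def strong_rules(supports, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, support in supports.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                lhs = frozenset(lhs)
                confidence = support / supports[lhs]
                if confidence >= min_confidence:
                    rules.append((set(lhs), set(itemset - lhs), confidence))
    return rules


transactions = [{1, 4, 6, 8}, {2, 5, 3}, {7, 1, 3, 8}, {9, 10}, {1, 5}]
supports = apriori(transactions, min_support=0.30)
rules = strong_rules(supports, min_confidence=0.70)
print(supports)
print(rules)
```

On this table the frequent itemsets are {1}, {3}, {5}, {8} and {1, 8}. The only rule that reaches 70% confidence is 8 -> 1 (confidence 100%), since 1 -> 8 reaches only about 67%.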

Question 4 (10 marks each)
A A social media platform wants to analyze user engagement data to improve content recommendations and user experience. The INTERACTIONS fact table contains information about user interactions, including interaction details, user information, content details, and time periods. The dimension tables provide additional context about users, content, categories, and time periods. Design a star schema and snowflake schema for the same.
B Explain multilevel association rule mining and multidimensional association rule mining with examples.
Question 5 (10 marks each)
A A company wants to predict whether a customer will subscribe to a premium membership based on their demographic and browsing behavior data. The dataset contains information about customers, including age, gender, income, browsing time, and subscription status.

Age     Gender   Income   Browsing Time   Subscription
20-30   Male     High     10am-12pm       Yes
20-30   Female   Medium   2pm-4pm         Yes
30-40   Male     Low      8am-10am        No
30-40   Female   High     4pm-6pm         Yes
>40     Male     Medium   6pm-8pm         Yes
>40     Female   Medium   8am-10am        No
>40     Male     High     12pm-2pm        Yes
20-30   Female   Low      10am-12pm       No
20-30   Male     Medium   2pm-4pm         Yes
30-40   Female   High     8am-10am        Yes

Use ID3 to build the decision tree and predict the following example:

Age     Gender   Income   Browsing Time
20-30   Male     Medium   10am-12pm

B Illustrate the PageRank algorithm with an example.


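For item A of Question 5, the ID3 construction can be sketched in a few functions. This is an illustration, not a model answer. The helper names are hypothetical, and one assumption goes beyond the question: when an example reaches a node with an attribute value that never occurred in training, the sketch falls back to the majority class at that node.

```python
import math
from collections import Counter

COLUMNS = ["Age", "Gender", "Income", "Browsing Time", "Subscription"]
ROWS = [
    ("20-30", "Male", "High", "10am-12pm", "Yes"),
    ("20-30", "Female", "Medium", "2pm-4pm", "Yes"),
    ("30-40", "Male", "Low", "8am-10am", "No"),
    ("30-40", "Female", "High", "4pm-6pm", "Yes"),
    (">40", "Male", "Medium", "6pm-8pm", "Yes"),
    (">40", "Female", "Medium", "8am-10am", "No"),
    (">40", "Male", "High", "12pm-2pm", "Yes"),
    ("20-30", "Female", "Low", "10am-12pm", "No"),
    ("20-30", "Male", "Medium", "2pm-4pm", "Yes"),
    ("30-40", "Female", "High", "8am-10am", "Yes"),
]
data = [dict(zip(COLUMNS, row)) for row in ROWS]


def entropy(rows, target):
    counts = Counter(r[target] for r in rows)
    return -sum(c / len(rows) * math.log2(c / len(rows)) for c in counts.values())


def info_gain(rows, attr, target):
    # Gain = entropy before the split minus the weighted entropy after it.
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset, target)
    return entropy(rows, target) - remainder


def id3(rows, attrs, target):
    majority = Counter(r[target] for r in rows).most_common(1)[0][0]
    if len({r[target] for r in rows}) == 1 or not attrs:
        return majority  # leaf: pure node or no attributes left
    best = max(attrs, key=lambda a: info_gain(rows, a, target))
    node = {"attr": best, "default": majority, "children": {}}
    for value in {r[best] for r in rows}:
        subset = [r for r in rows if r[best] == value]
        node["children"][value] = id3(subset, [a for a in attrs if a != best], target)
    return node


def predict(tree, example):
    while isinstance(tree, dict):
        # Assumption: unseen attribute values fall back to the node's majority class.
        tree = tree["children"].get(example[tree["attr"]], tree["default"])
    return tree


tree = id3(data, COLUMNS[:-1], "Subscription")
query = {"Age": "20-30", "Gender": "Male", "Income": "Medium", "Browsing Time": "10am-12pm"}
prediction = predict(tree, query)
print(tree)
print(prediction)
```

On this table Income has the highest information gain at the root, and the Medium branch is then split on Browsing Time. No Medium-income training row has the value 10am-12pm, so under the stated fallback the sketch predicts Yes. A hand-worked answer should state how it treats that missing branch.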
Question 6 (10 marks each)
A Explain in brief what data discretization and concept hierarchy generation are.
B The following table gives the fat and protein content of food items. Apply single linkage clustering and construct the dendrogram.

Food Item   Protein   Fat
1           1.1       60
2           8.2       20
3           4.2       35
4           1.5       21
5           7.6       15
6           2.0       55
7           3.9       39

_________________________
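For item B of Question 6, the merge order needed for the dendrogram can be checked with a naive agglomerative sketch. Euclidean distance on (protein, fat) is an assumption, since the question does not name a distance measure, and the function name single_linkage is illustrative.

```python
import math


def single_linkage(points):
    """points: {label: (x, y)}. Returns the merges as (cluster_a, cluster_b, distance)."""
    clusters = [frozenset([label]) for label in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster distance is the smallest pairwise point distance.
                d = min(math.dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((set(clusters[i]), set(clusters[j]), d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Food item -> (protein, fat), taken from the table in the question.
food = {1: (1.1, 60), 2: (8.2, 20), 3: (4.2, 35), 4: (1.5, 21),
        5: (7.6, 15), 6: (2.0, 55), 7: (3.9, 39)}
merges = single_linkage(food)
for a, b, d in merges:
    print(sorted(a), sorted(b), round(d, 2))
```

Under the Euclidean assumption the merges occur in the order {3, 7}, {2, 5}, {1, 6}, then item 4 joins {2, 5}, then {3, 7} joins {2, 4, 5}, and finally {1, 6} joins the rest. The merge distances give the heights at which the dendrogram branches meet.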
