DWH&DM Ver2

This document contains instructions and questions for a Data Warehousing and Data Mining terminal examination. It instructs students to attempt all questions, be precise, and submit responses in the proper format. The exam contains 4 long questions involving designing a data warehouse schema, applying hierarchical agglomerative clustering and decision tree algorithms, and running the FP-Growth algorithm. Students are advised to carefully manage their time on the 3-hour exam.

Uploaded by

ramish489868

COMSATS University Islamabad, Wah Campus

Terminal Examinations Fall 2020


Department of Computer Science

Program/Class: BSSE-7
Date: 13th January, 2021
Subject: Data Warehousing and Data Mining
Instructor: Mamuna Fatima
Total Time Allowed: 3 hrs
Maximum Marks: 50 marks
Student Name: _______________________________
Registration#: _______________________________

Instructions:
• Attempt all questions.
• BE PRECISE AND DON'T COPY.
• Attempt the questions in the sequence given in the question paper. Make sure to
write the actual question number on the answer sheet.
• Each question must be attempted on an A4 sheet (HAND-WRITTEN) with the
following personal information at the top of each sheet:
STUDENT NAME, REGISTRATION NUMBER, CNIC NUMBER,
CONTINUATION SHEET NUMBER (E.G., 1/4, 2/4 …), SIGNATURE.
• Submit one PDF file on CUOnline on time. Late submissions will not be
accepted, so manage your time carefully.

Question No 1 [10 marks]


10 MCQs on MS Teams (time allowed: 15 minutes)

Long Questions

Question 02 [CLO 2] [10 marks]


A multinational company has stores in several regions. They would like to track profit
information across different departments (Video Sales and Video Rentals) and regions
(East, West, Central) in different years (e.g. 2011 and 2012). Design an appropriate
data warehouse schema using the star multi-dimensional model and discuss the fact
and dimension tables you would need. Would you need/recommend a snowflake
schema? Explain your views.
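
As a rough illustration of what a star layout for this scenario could look like, the sketch below builds one fact table and three dimension tables in an in-memory SQLite database. All table and column names (`fact_profit`, `dim_department`, `dim_region`, `dim_time`, and the `*_key` columns) are hypothetical and chosen only for illustration; this is one possible layout under those assumptions, not the expected answer.

```python
import sqlite3

# Hypothetical star schema: one fact table keyed by three dimension tables.
SCHEMA = """
CREATE TABLE dim_department (
    department_key INTEGER PRIMARY KEY,
    department_name TEXT NOT NULL
);
CREATE TABLE dim_region (
    region_key INTEGER PRIMARY KEY,
    region_name TEXT NOT NULL
);
CREATE TABLE dim_time (
    time_key INTEGER PRIMARY KEY,
    year INTEGER NOT NULL
);
CREATE TABLE fact_profit (
    department_key INTEGER NOT NULL REFERENCES dim_department(department_key),
    region_key INTEGER NOT NULL REFERENCES dim_region(region_key),
    time_key INTEGER NOT NULL REFERENCES dim_time(time_key),
    profit REAL NOT NULL,
    PRIMARY KEY (department_key, region_key, time_key)
);
"""


def build_warehouse():
    """Create the schema in memory and load the dimension members from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO dim_department VALUES (?, ?)",
        [(1, "Video Sales"), (2, "Video Rentals")],
    )
    conn.executemany(
        "INSERT INTO dim_region VALUES (?, ?)",
        [(1, "East"), (2, "West"), (3, "Central")],
    )
    conn.executemany("INSERT INTO dim_time VALUES (?, ?)", [(1, 2011), (2, 2012)])
    return conn
```

Each dimension here is a single flat table, which is what distinguishes a star layout from a snowflake layout, where a dimension would be normalized into further sub-tables.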
Question 03 [CLO 4] [10 marks]

Given the data in the following table, apply hierarchical agglomerative clustering
using average link. Use Euclidean distance as the distance measure and draw the
dendrogram. Clearly show all steps and calculations.

S# x y
1 0.40 0.53
2 0.A2 0.38
3 0.35 0.32
4 0.26 0.C9
5 0.B8 0.41
6 0.45 0.30
Where,

A = (reg_no_last2digits % 5) + 1
B = (reg_no_last2digits % 5) + 2
C = (reg_no_last2digits % 5) + 3
Instruction: No marks will be given if you use numbers derived from a registration
number other than your own.
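
The following is a minimal sketch for cross-checking hand calculations, not a substitute for the written steps. It assumes that A, B and C each stand for the first decimal digit of the value they appear in (so A = 1 turns 0.A2 into 0.12). The function names `build_points` and `average_link` are hypothetical helpers introduced here.

```python
from itertools import combinations
from math import dist


def build_points(reg_last2):
    """Fill in the A/B/C digits from the last two digits of a registration number."""
    k = reg_last2 % 5
    a, b, c = k + 1, k + 2, k + 3
    return {
        1: (0.40, 0.53),
        2: (float(f"0.{a}2"), 0.38),  # assumption: A is the first decimal digit
        3: (0.35, 0.32),
        4: (0.26, float(f"0.{c}9")),  # assumption: C is the first decimal digit
        5: (float(f"0.{b}8"), 0.41),  # assumption: B is the first decimal digit
        6: (0.45, 0.30),
    }


def average_distance(points, c1, c2):
    """Average-link distance: mean Euclidean distance over all cross-cluster pairs."""
    total = sum(dist(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def average_link(points):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    clusters = [(label,) for label in sorted(points)]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average-link distance.
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(points, *pair),
        )
        merges.append((c1, c2, average_distance(points, c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
    return merges
```

Calling `average_link(build_points(reg_last2))` with your own last two registration digits gives the merge order and merge heights, which are the quantities a dendrogram plots.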

Question 04 [CLO 4] [10 marks]

Apply the Decision Tree (DT) algorithm to the training data in the table below and
clearly show all steps and calculations of Entropy and Information Gain. Construct the
decision tree from these calculations.

S#   Holiday  Weather  Paper      Picnic (Category)
1    Yes      Rainy    Easy       No
2    Yes      Rainy    Difficult  No
3    Yes      Rainy    Difficult  Yes
4    Yes      Sunny    Difficult  Yes
5    Yes      Sunny    Easy       Yes
6    Yes      Sunny    Easy       No
7    Yes      Rainy    Difficult  No
8    Yes      Sunny    Difficult  Yes
9    No       Sunny    Difficult  No
10   No       Rainy    Difficult  No
11   No       Sunny    Easy       No

a) Which attribute would information gain choose as the root of the tree?
b) Draw the decision tree that would be constructed by recursively applying
information gain to select roots of sub-trees, as in the Decision-Tree-Learning
algorithm.
c) Generate decision rules from the decision tree.
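
The entropy and information gain figures for the table above can be cross-checked with a short script. The sketch below uses the standard definitions, H = -Σ p·log2(p) and Gain = H(parent) - Σ (|subset|/|parent|)·H(subset). The names `ROWS`, `ATTRS`, `entropy` and `information_gain` are hypothetical and introduced only for this illustration.

```python
from collections import Counter
from math import log2

# Training rows as (Holiday, Weather, Paper, Picnic), copied from the table.
ROWS = [
    ("Yes", "Rainy", "Easy", "No"),
    ("Yes", "Rainy", "Difficult", "No"),
    ("Yes", "Rainy", "Difficult", "Yes"),
    ("Yes", "Sunny", "Difficult", "Yes"),
    ("Yes", "Sunny", "Easy", "Yes"),
    ("Yes", "Sunny", "Easy", "No"),
    ("Yes", "Rainy", "Difficult", "No"),
    ("Yes", "Sunny", "Difficult", "Yes"),
    ("No", "Sunny", "Difficult", "No"),
    ("No", "Rainy", "Difficult", "No"),
    ("No", "Sunny", "Easy", "No"),
]
ATTRS = ("Holiday", "Weather", "Paper")


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, col):
    """Entropy of the class column minus the weighted entropy after splitting on col."""
    base = entropy([r[-1] for r in rows])
    remainder = 0.0
    for value in {r[col] for r in rows}:
        subset = [r[-1] for r in rows if r[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Gain of each attribute at the root; the largest value marks the root split.
gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRS)}
```

Applying `information_gain` again to the rows under each branch repeats the same selection step for the sub-trees.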

Question 05 [CLO 4] [10 marks]

Run the FP-Growth algorithm on the following transactional database with minimum
support equal to 50%. Mine the FP-tree and extract the set of frequent patterns.
Show the step-by-step execution.

TID ITEMS
T1 {A,B,C,E}
T2 {B,D,E,F}
T3 {A,B,C,D,F,G}
T4 {A,B,C,D,E,G,H}
T5 {A,C,D,E,H}
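
The frequent patterns obtained from the FP-tree can be cross-checked by plain enumeration. The sketch below is a brute-force support count over candidate itemsets, not FP-Growth itself, and it assumes that 50% of 5 transactions rounds up to a minimum count of 3. `TRANSACTIONS` and `frequent_itemsets` are hypothetical names.

```python
from itertools import combinations
from math import ceil

# Transactional database copied from the table above.
TRANSACTIONS = {
    "T1": {"A", "B", "C", "E"},
    "T2": {"B", "D", "E", "F"},
    "T3": {"A", "B", "C", "D", "F", "G"},
    "T4": {"A", "B", "C", "D", "E", "G", "H"},
    "T5": {"A", "C", "D", "E", "H"},
}


def frequent_itemsets(transactions, min_support):
    """Map each frequent itemset to its support count by exhaustive enumeration."""
    # Assumption: a fractional threshold is rounded up to a whole transaction count.
    min_count = ceil(min_support * len(transactions))
    items = sorted(set().union(*transactions.values()))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for combo in combinations(items, size):
            count = sum(1 for t in transactions.values() if set(combo) <= t)
            if count >= min_count:
                result[frozenset(combo)] = count
                found_any = True
        if not found_any:
            # No frequent itemset of this size means none of a larger size exists.
            break
    return result


frequent = frequent_itemsets(TRANSACTIONS, 0.5)
```

The result should contain the same itemsets and support counts as the patterns mined from the conditional FP-trees, so any mismatch points to a step worth rechecking by hand.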

Good Luck
