IS421 Exam

Uploaded by

Shikha Nand


IS421: Knowledge Discovery in Databases

School of Computing, Information and Mathematical Sciences

Final Examination
Semester 1, 2017

F2F Mode

Duration of Exam: 3 hours + 10 minutes

Reading Time: 10 minutes

Writing Time: 3 hours

Instructions:

1. This exam has two sections:
   a. Section A – 7 questions (30 marks)
   b. Section B – 5 questions (70 marks)
2. Answer ALL questions in the two sections.
3. The exam is worth 50% of the overall course mark. Students must score a
   minimum of 40 marks in this exam to pass the course.
4. This exam question booklet has a total of 8 pages (including the cover page).
5. This is a CLOSED book exam.
6. No other materials are allowed into the exam room.
7. A non-programmable calculator may be used during the exam.

Section A – Short Answers (30 marks)

Write your answers in the Answer Book provided.

1. Discuss, in your own words and with examples, the four factors that enhance data
quality. (4 marks)

2. Outline three methods for cleaning data. (3 marks)

3. You have been hired as a data analyst for Tappoos Fiji Ltd. Upon examining the
prices of certain items sold at their Duty Free Department, you realize that the data
needs preprocessing. Illustrate three binning methods that you would use to smooth the
data, using the following prices: 15, 21, 8, 4, 21, 25, 24, 34, 28 (6 marks)

4. Calculate the z-score normalization for the attribute income of $39,500. The mean and
standard deviation values are $30,000 and $8,000, respectively. Elaborate on the
significance of the z-score and when it is most suitable to be used. (4 marks)

5. Explain the four major features of a data warehouse. (4 marks)

6. Discuss four benefits of using information from data warehouses. (4 marks)

7. Compare and contrast online transaction processing (OLTP) systems and online
analytical processing (OLAP) systems. (5 marks)
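
The smoothing in Question 3 and the normalization in Question 4 can be sketched in Python as follows. The sketch assumes equal-frequency (equal-depth) bins of three values each and takes the three methods to be smoothing by bin means, by bin medians, and by bin boundaries; the question does not fix either choice. All function names are illustrative. The z-score uses the usual form z = (value - mean) / standard deviation.

```python
from statistics import mean, median


def equal_frequency_bins(values, bin_size):
    """Sort the values and split them into consecutive bins of bin_size items."""
    ordered = sorted(values)
    return [ordered[i:i + bin_size] for i in range(0, len(ordered), bin_size)]


def smooth_by_means(bins):
    """Replace every value in a bin with the mean of that bin."""
    return [[mean(b)] * len(b) for b in bins]


def smooth_by_medians(bins):
    """Replace every value in a bin with the median of that bin."""
    return [[median(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value with the closer of the bin's minimum and maximum.

    Ties go to the lower boundary (an assumption made for this sketch).
    """
    smoothed = []
    for b in bins:
        low, high = b[0], b[-1]  # bins are already sorted
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed


def z_score(value, mean_value, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean_value) / std_dev


prices = [15, 21, 8, 4, 21, 25, 24, 34, 28]
bins = equal_frequency_bins(prices, 3)

print("bins:      ", bins)
print("means:     ", smooth_by_means(bins))
print("medians:   ", smooth_by_medians(bins))
print("boundaries:", smooth_by_boundaries(bins))
print("z-score:   ", z_score(39_500, 30_000, 8_000))
```

A different bin size, or equal-width bins instead of equal-frequency bins, would give different smoothed values.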

Section B (70 marks)

Question 8 KDD Process [14 marks]

Knowledge discovery in databases is the process of identifying hidden knowledge buried in
the huge volumes of data that have been created and stored.

a) Elaborate on the steps involved in the KDD process. (5 marks)

Figure 1

b) Figure 1 illustrates the application of a data mining technique in the field of medicine.
Identify the data mining technique and justify your choice. (4 marks)

c) Fraudulent use of credit cards can be uncovered using a data mining
functionality. Discuss this functionality and how it is applied. (3 marks)

d) Examine Figure 2 and discuss the data mining technique applied to derive new
knowledge. (2 marks)

Figure 2
Question 9 Association Rule [14 marks]
Study the data provided below and answer the questions that follow.

Customer   Items purchased

1          Orange juice, potato chips
2          Milk, orange juice, window cleaner
3          Orange juice, washing detergent
4          Orange juice, washing detergent, potato chips
5          Window cleaner, potato chips

a) Calculate the confidence score for every pair of items purchased by customers. (5 marks)

b) Using a confidence score above 50% as the threshold, identify which two items
are purchased together. (3 marks)

c) Determine which item is never purchased with potato chips or washing detergent.
(2 marks)

d) Discuss the apriori algorithm and its significance in KDD. Use an example to support
your discussion. (4 marks)
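
The confidence calculation in parts (a) and (b) can be sketched as follows, using the standard definition confidence(A -> B) = support(A and B) / support(A). The transactions are transcribed from the table above; the function names and the strict "greater than 0.5" reading of the threshold are assumptions made for illustration.

```python
from itertools import permutations

# One set of items per customer, transcribed from the table above.
transactions = [
    {"orange juice", "potato chips"},
    {"milk", "orange juice", "window cleaner"},
    {"orange juice", "washing detergent"},
    {"orange juice", "washing detergent", "potato chips"},
    {"window cleaner", "potato chips"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def pairwise_confidence(transactions):
    """Confidence of every ordered rule A -> B over single items A and B."""
    items = sorted(set().union(*transactions))
    confidence = {}
    for a, b in permutations(items, 2):
        both = support_count({a, b}, transactions)
        # Every item occurs in at least one transaction, so this is never zero.
        confidence[(a, b)] = both / support_count({a}, transactions)
    return confidence


confidence = pairwise_confidence(transactions)

# Rules whose confidence is strictly above the 50% threshold.
strong_rules = {rule: c for rule, c in confidence.items() if c > 0.5}

for (a, b), c in sorted(strong_rules.items()):
    print(f"{a} -> {b}: {c:.2f}")
```

Confidence is directional, so A -> B and B -> A are reported separately; a pair with zero confidence in both directions is never purchased together, which is the check part (c) asks for.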

Question 10 Cluster Analysis [16 marks]

Use Figure 3 shown below to answer the questions that follow.

Figure 3

The distance function is Euclidean distance and points A1, B1, and C1 are initially assigned
as the center of each cluster, respectively. Use the k-means algorithm to:

a) show the three cluster centers after the first cycle. (4 marks)

b) determine the final number of clusters in the data set by showing all the calculations
required to arrive at your answer. (12 marks)
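
The k-means cycle described above (assign each point to its nearest center by Euclidean distance, then recompute each center as the mean of its members) can be sketched as follows. Figure 3 is not reproduced in this text, so the coordinates below are hypothetical placeholders and should be replaced with the values from the figure; only the point names A1, B1, and C1 as initial centers come from the question.

```python
from math import dist

# HYPOTHETICAL coordinates: Figure 3 is not available in this text.
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4),
    "B1": (5, 8), "B2": (7, 5), "B3": (6, 4),
    "C1": (1, 2), "C2": (4, 9),
}


def assign(points, centers):
    """Group point names by the index of the nearest center (Euclidean)."""
    clusters = [[] for _ in centers]
    for name, p in points.items():
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[nearest].append(name)
    return clusters


def recompute(points, clusters, old_centers):
    """Mean of each cluster's members; an empty cluster keeps its old center."""
    new_centers = []
    for members, old in zip(clusters, old_centers):
        if not members:
            new_centers.append(old)
            continue
        xs = [points[m][0] for m in members]
        ys = [points[m][1] for m in members]
        new_centers.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return new_centers


def k_means(points, centers, max_cycles=100):
    """Run cycles until the centers stop changing; return every cycle's result."""
    history = []
    for _ in range(max_cycles):
        clusters = assign(points, centers)
        new_centers = recompute(points, clusters, centers)
        history.append((clusters, new_centers))
        if new_centers == centers:
            break
        centers = new_centers
    return history


initial = [points["A1"], points["B1"], points["C1"]]
history = k_means(points, initial)

for cycle, (clusters, centers) in enumerate(history, start=1):
    print(f"cycle {cycle}: clusters={clusters} centers={centers}")
```

The first entry of `history` corresponds to part (a); the last entry, where the centers no longer move, corresponds to the converged clustering asked for in part (b).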

Question 11 Classification [14 marks]

Use Table 1 to answer the questions that follow.

Outlook    Temperature  Humidity  Windy  Play?
sunny      hot          high      false  No
sunny      hot          high      true   No
overcast   hot          high      false  Yes
rain       mild         high      false  Yes
rain       cool         normal    false  Yes
rain       cool         normal    false  No
overcast   cool         normal    true   Yes
sunny      mild         high      false  No
sunny      cool         normal    false  Yes
rain       mild         normal    false  Yes
sunny      mild         normal    false  Yes
overcast   mild         high      true   Yes
overcast   hot          normal    false  Yes
rain       mild         high      true   No

Table 1

a) Develop a decision tree for “Play”. (4 marks)

b) Calculate the information gain for the attributes: outlook, temperature, windy, and
humidity. (8 marks)

c) Identify the best attribute and justify your choice. (2 marks)
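
The information gain in part (b) can be sketched as follows, using the usual entropy-based definition: Gain(A) = Entropy(Play) minus the weighted average entropy of Play within each value of A. The rows are transcribed from Table 1 exactly as printed, with "True" and "true" treated as the same value; the function names are illustrative.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ("outlook", "temperature", "humidity", "windy")

# Rows of Table 1 as printed; the last field is the class label "Play?".
ROWS = [
    ("sunny", "hot", "high", "false", "No"),
    ("sunny", "hot", "high", "true", "No"),
    ("overcast", "hot", "high", "false", "Yes"),
    ("rain", "mild", "high", "false", "Yes"),
    ("rain", "cool", "normal", "false", "Yes"),
    ("rain", "cool", "normal", "false", "No"),
    ("overcast", "cool", "normal", "true", "Yes"),
    ("sunny", "mild", "high", "false", "No"),
    ("sunny", "cool", "normal", "false", "Yes"),
    ("rain", "mild", "normal", "false", "Yes"),
    ("sunny", "mild", "normal", "false", "Yes"),
    ("overcast", "mild", "high", "true", "Yes"),
    ("overcast", "hot", "normal", "false", "Yes"),
    ("rain", "mild", "high", "true", "No"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, column):
    """Entropy of the labels minus the weighted entropy after splitting on column."""
    labels = [r[-1] for r in rows]
    remainder = 0.0
    for value in {r[column] for r in rows}:
        subset = [r[-1] for r in rows if r[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name:12s} {gain:.3f}")
print("best attribute:", best)
```

The attribute with the largest gain is the one a gain-based tree learner such as ID3 would place at the root, which connects parts (a) and (c).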

Question 12 Data Cube [12 marks]

a) Discuss the use of data cube technology in data mining. (2 marks)

b) You have been hired by Rups Big Bear Company Ltd as a data analyst and you
see an opportunity to build data cubes to satisfy the marketing department’s request
to analyze all sales, by product and customer, made in the 2016 calendar year.
List the key steps in building the required data cube named “Sales”.
(6 marks)

c) Use the data cube shown in Figures 3 and 4 to answer the questions that follow:

Figure 3

Figure 4

i) Identify the customer and the store location that sold the highest number of a
single part. (1 mark)

ii) Determine the part number and store location with the highest and the lowest
number of items sold. (3 marks)
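
The idea behind a "Sales" cube over the product and customer dimensions can be sketched as follows: the measure is pre-aggregated for every combination of dimensions, from the most detailed cuboid down to the grand total. The fact records, the `quantity` measure, and the function name are hypothetical; the data in Figures 3 and 4 is not reproduced in this text.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("product", "customer")

# HYPOTHETICAL 2016 sales facts, for illustration only.
sales_2016 = [
    {"product": "P1", "customer": "C1", "quantity": 10},
    {"product": "P1", "customer": "C2", "quantity": 5},
    {"product": "P2", "customer": "C1", "quantity": 7},
    {"product": "P2", "customer": "C1", "quantity": 3},
]


def build_cube(facts, dimensions, measure):
    """Sum the measure for every subset of dimensions (every cuboid).

    Returns a dict that maps a tuple of dimension names to a dict that maps
    a tuple of dimension values to the aggregated measure.
    """
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            cuboid = defaultdict(int)
            for fact in facts:
                key = tuple(fact[d] for d in dims)
                cuboid[key] += fact[measure]
            cube[dims] = dict(cuboid)
    return cube


sales_cube = build_cube(sales_2016, DIMENSIONS, "quantity")

print(sales_cube[("product", "customer")])  # most detailed cuboid
print(sales_cube[("product",)])             # rolled up over customers
print(sales_cube[()])                       # grand total
```

Moving from the `("product", "customer")` cuboid to `("product",)` is a roll-up; reading the cube in the opposite direction is a drill-down.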

THE END.
