Question Bank: Q1) What Is Data Warehouse?

This document contains a question bank for the subjects of Data Mining and Data Warehouses for the years 2016-2017. It includes 75 multiple choice and short answer questions covering topics such as data warehousing, OLAP, OLTP, data marts, ETL processes, data mining techniques like classification, clustering, association rule mining and applications of data mining.

Uploaded by

anagha2982

Question Bank

Year: 2016-2017 Semester: First and Second

Subject Dept: CS Subject Name: Data Mining

………………………………………………………………………………………………………………………………………….

Q1) What is a data warehouse?

Q2) What are the benefits of a data warehouse?

Q3) What is the difference between OLTP and OLAP?

Q4) Briefly state the difference between a data warehouse and a data mart.

Q5) What are the differences among a dependent data mart, an independent data mart, and a hybrid data mart?

Q6) List some properties of data marts.

Q7) List the reasons for creating a data mart.

Q8) What are the characteristics of a data warehouse?

Q9) Differentiate between a data warehouse and an operational DBMS.

Q10) Multiple choice questions

1. Operational database is

Q11) Give some alternative terms for data mining.

Q12) What is KDD?

Q13) What are the steps involved in the KDD process?

Q14) What are the benefits of data mining?

Q15) Define prediction and description models.

Q16) Describe the challenges of maintaining a separate data warehouse with regard to performance and function issues.

Q17) Why do we need data warehouses?

Q18) How is a data warehouse different from an operational DBMS?

Q19) Define an operational system.

Q20) What is a decision tree?

Q21) What is meant by a pattern?

Q22) What are the different applications of data mining?

Q23) Explain the following Major Tasks in Data Processing

Q24) Explain the knowledge discovery phases.

Q25) Define the following terms:

Q26) What are descriptive and predictive data mining?

Q27) Write the preprocessing steps that may be applied to the data for prediction.

Q28) Why do you need a data warehouse life cycle process, and what are the steps in the life cycle approach?

Q29) Explain OLAP and OLTP.

Q30) Write the main steps of the data cleaning task.

Q31) Why do we need data warehouses?

Q32) Name some data mining applications.

Q33) Why is data quality so important in a data warehouse environment?

Q34) How can data visualization help in decision-making?

Q35) Explain the different types of data repositories on which mining can be performed.

Q36) Explain the concept of data mining classification schemes.

Q37) Define a data warehouse. Draw the architecture of a data warehouse and explain the three tiers in detail.

Q38) Draw and explain the block diagram of the online transaction processing cycle.

Q39) Consider the sales market transactions shown in the table below. What multidimensional OLAP cube can be derived from this data set?

Q40) What are the main advantages and disadvantages of a MOLAP cube?

…………………………………………………………………………………………………………………………………….

Q41) Cluster analysis is said to operate on a collection of objects, and it is used in various real-world applications. Enumerate the applications of cluster analysis in detail.
………………………………………………………………………………..
Q42) Explain each one of these steps in detail.

…………………………………………………………………………………………………………………………………………………..

Q43) Consider a database, D, consisting of 9 transactions. Suppose the minimum support count required is 2 and the minimum confidence required is 70%. Use the Apriori algorithm to generate all the candidate itemsets Ci and frequent itemsets Li.

TID    List of Items
T100   I1, I2, I5
T200   I2, I4
T300   I2, I3
T400   I1, I2, I4
T500   I1, I3
T600   I2, I3
T700   I1, I3
T800   I1, I2, I3, I5
T900   I1, I2, I3
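
As a study aid, the level-wise search that Q43 asks for can be sketched in Python. This is a minimal illustration, not a model answer: the names `transactions`, `support_count` and `apriori` are made up for this sketch, and the join step simply unions any two frequent k-itemsets before pruning. That produces the same pruned candidate sets as the textbook prefix join, but less efficiently.

```python
from itertools import combinations

# Transactions copied from the Q43 table (TID -> set of items).
transactions = {
    "T100": {"I1", "I2", "I5"},
    "T200": {"I2", "I4"},
    "T300": {"I2", "I3"},
    "T400": {"I1", "I2", "I4"},
    "T500": {"I1", "I3"},
    "T600": {"I2", "I3"},
    "T700": {"I1", "I3"},
    "T800": {"I1", "I2", "I3", "I5"},
    "T900": {"I1", "I2", "I3"},
}
MIN_SUPPORT_COUNT = 2


def support_count(itemset, db):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for items in db.values() if itemset <= items)


def apriori(db, min_count):
    """Return a list of (C_k, L_k) pairs; each maps frozenset -> support count.

    C_k holds the candidates that survive the prune step.
    """
    items = sorted({item for t in db.values() for item in t})
    candidates = [frozenset([item]) for item in items]
    levels = []
    k = 1
    while candidates:
        c_k = {c: support_count(c, db) for c in candidates}
        l_k = {c: n for c, n in c_k.items() if n >= min_count}
        levels.append((c_k, l_k))
        # Join step: union pairs of frequent k-itemsets into (k+1)-itemsets.
        joined = {a | b for a in l_k for b in l_k if len(a | b) == k + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in l_k for s in combinations(c, k))
        ]
        k += 1
    return levels


levels = apriori(transactions, MIN_SUPPORT_COUNT)
for k, (c_k, l_k) in enumerate(levels, start=1):
    print(f"C{k}: {len(c_k)} candidates, L{k}: {len(l_k)} frequent itemsets")
```

Each entry of `levels` holds the candidate and frequent itemsets for one pass, so the Ci and Li tables can be read off directly and checked against a hand calculation.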

…………………………………………………………………………………………………………………………………………………..

Q44) Draw a flowchart to show how the K-Means clustering algorithm works.

……………………………………………………………………………………………………………………………………

Q45) Clustering techniques are used in various fields of real life. Enumerate five clustering applications.

…………………………………………………………………………..
Q46) Explain each one of the following in detail.
1. Decision support system
2. Market-basket data
3. Association rules
4. Key concepts of the Apriori algorithm

…………………………………………………………………………………………………………………………..

Q47) Consider a database, D, consisting of 5 transactions. Use this table to show the implementation of the k-means algorithm with the Euclidean distance function. Use K=2 and suppose A and C are selected as the initial means.

…………………………………………………………………………………………………………………………….……

Q48) When can we say that association rules are interesting?

Q49) Explain association rules in mathematical notation.

Q50) Define support and confidence in association rule mining.

Q51) Suppose that we have the following table of a database of transactions D. Based on these transactions, determine the support and confidence values for the following items I.

[Tables: items I; database of transactions D]

…………………………………………………………………………………………………

Q52) Consider a database, D, consisting of 4 transactions. Suppose the minimum support count required is 2 and the minimum confidence required is 70%. Use the Apriori algorithm to generate all the candidate itemsets Ci and frequent itemsets Li.

TID   Items
100   1, 3, 4
200   2, 3, 5
300   1, 2, 3, 5
400   2, 5
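
Once the frequent itemsets are known, the 70% confidence threshold selects the strong rules. The Python sketch below is one possible way to check a hand-worked answer. It assumes each digit in the Items column is a separate item (so 134 means items 1, 3 and 4), and the helper names `count`, `frequent_itemsets` and `strong_rules` are invented for illustration. It enumerates itemsets by brute force rather than level by level, which is only reasonable for a table this small.

```python
from itertools import combinations

# Transactions from the Q52 table, assuming each digit is one item.
transactions = {
    100: {1, 3, 4},
    200: {2, 3, 5},
    300: {1, 2, 3, 5},
    400: {2, 5},
}
MIN_SUPPORT_COUNT = 2
MIN_CONFIDENCE = 0.70


def count(itemset):
    """Support count: transactions containing every item of itemset."""
    return sum(1 for items in transactions.values() if set(itemset) <= items)


def frequent_itemsets():
    """Brute-force enumeration of all itemsets meeting the support count."""
    items = sorted(set().union(*transactions.values()))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            n = count(combo)
            if n >= MIN_SUPPORT_COUNT:
                result[frozenset(combo)] = n
    return result


def strong_rules(freq):
    """Rules lhs -> rhs with confidence = count(lhs and rhs) / count(lhs)."""
    rules = []
    for itemset, n in freq.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # lhs is frequent too, because it is a subset of a frequent set.
                confidence = n / freq[lhs]
                if confidence >= MIN_CONFIDENCE:
                    rules.append((set(lhs), set(itemset - lhs), confidence))
    return rules


freq = frequent_itemsets()
rules = strong_rules(freq)
for lhs, rhs, confidence in rules:
    print(f"{sorted(lhs)} -> {sorted(rhs)}  confidence={confidence:.2f}")
```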

…………………………………………………………………………………………..

Q53) How are association rules mined from large databases?

Q54) What is the purpose of the Apriori algorithm?

Q55) How are association rules generated from frequent itemsets?

Q56) Define the concept of classification and explain the main steps.

Q57) What is a decision tree?

Q58) What is a hierarchical method?

Q59) Explain the general steps of hierarchical clustering.

Q60) Explain the methods of hierarchical clustering and give an example of each.

Q61) Differentiate between the agglomerative and divisive hierarchical clustering algorithms.

Q62) Explain, with examples, the various applications of classification algorithms.

Q63) Explain classification by decision tree induction.

…………………………………………………………………..
Q64) Briefly discuss whether or not each of the following activities is a data mining task.

(a) Dividing the customers of a company according to their profitability.

(b) Computing the total sales of a company.

(c) Sorting a student database based on student identification numbers.

(d) Predicting the outcomes of tossing a (fair) pair of dice.

(e) Predicting the future stock price of a company using historical records.

(f) Monitoring the heart rate of a patient for abnormalities.

(g) Monitoring seismic waves for earthquake activities.

(h) Extracting the frequencies of a sound wave.

………………………………………………………………………..

Q65) What are the main advantages and disadvantages of Decision Tree
classification algorithms?

Q66) Explain the partitioning method of clustering.

Q67) State the categories of clustering methods.

Q68) Name some specific applications in which the data analysis task is classification.

Q69) What are the essential steps in the decision-making process?

Q70) How can data visualization help in decision-making?

Q71) Explain classification by rule-based classifier and by decision tree induction, with an example for each case.

Q72) Visual classification: an interactive approach to decision tree construction. Draw a flowchart to explain in detail all the main steps of visual classification.

Q73) What are descriptive and predictive data mining? Name three examples of each.

Q74) Multiple choice questions. Please choose the best answer for each of the following questions:
1. Which of the following is the most important when deciding on the data
structure of a data mart?
(a) XML data exchange standards
(b) Data access tools to be used
(c) Metadata naming conventions
(d) Extract, Transform, and Load (ETL) tool to be used

2. Which of the following is not a data mining functionality?

A) Characterization and Discrimination
B) Classification and regression
C) Selection and interpretation
D) Clustering and Analysis

3. Strategic value of data mining is ......................

A) cost-sensitive
B) work-sensitive
C) time-sensitive
D) technical-sensitive
4. The output of KDD is .............
A) Data
B) Information
C) Query
D) Useful information

5. Which one manages both current and historic transactions?

(a) OLTP
(b) OLAP
(c) Spread sheet
(d) XML

6. Which of the following processes includes data cleaning, data integration, data selection, data transformation, data mining, pattern evaluation and knowledge presentation?
(a) KDD process
(b) ETL process
(c) KTL process
(d) MDX process
7. Data modeling technique used for data marts is
(a) Dimensional modeling
(b) ER – model
(c) Extended ER – model
(d) Physical model

8. Which of the following tools will a business intelligence system have?

(a) OLAP tool
(b) Data mining tool
(c) Reporting tool
(d) Both (a) and (b) above

9. Which of the following is not a kind of data warehouse application?

A) Information processing
B) Analytical processing
C) Data mining
D) Transaction processing

10. The data is stored, retrieved and updated in ....................

A) OLAP
B) OLTP
C) Data Mart
D) FTP

11. The .................. allows the selection of the relevant information necessary for the data warehouse.

A) top-down view
B) data warehouse view
C) data source view
D) business query view

Q75) What is ETL?

Q76) What is the conceptual view of data warehouse architecture? Explain in detail.

Q77) Decision support systems are used for

a. Management decision making
b. Providing tactical information to management
c. Providing strategic information to management
d. Better operation of an organization

Q78) Decision support systems are essential for

A. Day–to-day operation of an organization.
B. Providing statutory information.
C. Top level strategic decision making.
D. Ensuring that organizations are profitable.

Q79) Multiple choice questions. Please choose the best answer for each of the following questions:
1. Data mining is best described as the process of
a. identifying patterns in data.
b. deducing relationships in data.
c. representing data.
d. simulating trends in data.

2. Data used to build a data mining model.

a. validation data
b. training data
c. test data
d. hidden data

3. Classification problems are distinguished from estimation problems in that

a. classification problems require the output attribute to be numeric.
b. classification problems require the output attribute to be categorical.
c. classification problems do not allow an output attribute.
d. classification problems are designed to predict future outcome.

4. This approach is best when we are interested in finding all possible interactions among a set of attributes.
a. decision tree
b. association rules
c. K-Means algorithm
d. genetic learning

5. This step of the KDD process model deals with noisy data.
a. Creating a target dataset
b. data preprocessing
c. data transformation
d. data mining

6. This clustering algorithm initially assumes that each data instance
represents a single cluster.
a. agglomerative clustering
b. conceptual clustering
c. K-Means clustering
d. expectation maximization

Q80) Construct a decision tree with root node Type from the data in the table
below. The first row contains attribute names. Each row after the first
represents the values for one data instance. The output attribute is Class.

Scale   Type   Shade   Texture   Class
One     One    Light   Thin      A
Two     One    Light   Thin      A
Two     Two    Light   Thin      B
Two     Two    Dark    Thin      B
Two     One    Dark    Thin      C
One     One    Dark    Thin      C
One     Two    Light   Thin      C
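
One way to check a hand-built tree for Q80 is the Python sketch below. It fixes Type as the root, as the question requires, and then picks each further split by lowest weighted entropy (an ID3-style choice, which is an assumption here since the question does not name a splitting criterion). The function names are illustrative only.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Scale", "Type", "Shade", "Texture"]
# Rows copied from the Q80 table; the last column is the Class label.
ROWS = [
    ("One", "One", "Light", "Thin", "A"),
    ("Two", "One", "Light", "Thin", "A"),
    ("Two", "Two", "Light", "Thin", "B"),
    ("Two", "Two", "Dark", "Thin", "B"),
    ("Two", "One", "Dark", "Thin", "C"),
    ("One", "One", "Dark", "Thin", "C"),
    ("One", "Two", "Light", "Thin", "C"),
]
DATA = [dict(zip(ATTRIBUTES + ["Class"], row)) for row in ROWS]


def entropy(rows):
    """Shannon entropy of the Class labels in rows."""
    counts = Counter(r["Class"] for r in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())


def split(rows, attr):
    """Group rows by their value of attr."""
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r)
    return groups


def remainder(rows, attr):
    """Weighted entropy of the partitions produced by splitting on attr."""
    return sum(len(g) / len(rows) * entropy(g) for g in split(rows, attr).values())


def build(rows, attrs, forced=None):
    """Return a class label (leaf) or {attribute: {value: subtree}}."""
    classes = {r["Class"] for r in rows}
    if len(classes) == 1:
        return classes.pop()
    if not attrs:
        # No attributes left: fall back to the majority class.
        return Counter(r["Class"] for r in rows).most_common(1)[0][0]
    attr = forced or min(attrs, key=lambda a: remainder(rows, a))
    rest = [a for a in attrs if a != attr]
    return {attr: {v: build(g, rest) for v, g in split(rows, attr).items()}}


tree = build(DATA, ATTRIBUTES, forced="Type")
print(tree)
```

With this criterion the Type = One branch splits on Shade and the Type = Two branch splits on Scale. Other criteria or tie-breaking rules could give a different but equally valid tree.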

Q81) What is outlier analysis?

Q82) Explain each one

1. What is data cleaning?

2. What is data integration?

3. What is data transformation?

Q83) What are the two steps in data classification?

Q84) What is the difference between classification and clustering?

Q85) What are hierarchical methods? Give an example of each.

Q86) List out some clustering methods.

Q87) Define the following terms: Data Mart, MOLAP, OLTP,

Q88) The data classification process includes two steps, one using a training set and one using a test set. Explain these steps with an example.

Q89) List the general steps of hierarchical clustering algorithm.

Q90) Briefly discuss the hierarchical agglomerative algorithm and convert the following scenario to agglomerative.
In this scenario, the second step of the agglomerative algorithm yields the clusters {a} {b c} {d e} {f}. The third step yields the clusters {a} {b c} {d e f}. The fourth step gives fewer but larger clusters, {a} {b c d e f}, and the final step yields the single cluster {a b c d e f}.

Q91) Explain each one with an example.

1. Traditional Hierarchical Clustering
2. Traditional Dendrogram
3. Non-traditional Hierarchical Clustering
4. Non-traditional Dendrogram

Q92) Explain the market basket analysis problem

Q93) Consider a database, D, consisting of 7 transactions. Use this table to show the implementation of the k-means algorithm with the Euclidean distance function. Use K=2 and suppose 1 and 4 are selected as the initial means.

Q94) List the basic steps to develop a clustering task

Q95) Extract a rule-based system from the decision tree given below, using the rule-based ordering technique.

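[Figure: decision tree. Root node: age?, with branches <=30, 31..40 and >40. Branch <=30 leads to the node student? (branches no, yes); branch 31..40 leads to the leaf yes; branch >40 leads to the node credit rating? (branches excellent, fair). Remaining leaf labels in the figure: no, yes, no.]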

Q96) Explain the dendrogram of the hierarchical technique and convert the
numbers of the figure below of nested clusters to a dendrogram.

Q97) State the advantages of the decision tree approach over other approaches
for performing classification.

Q98) Explain in detail the coverage and the accuracy of a rule in rule-based classification, with an example for each one.

Q99) What do you mean by hierarchical cluster analysis?

Q100) What do you mean by the key concepts of the Apriori algorithm?

Q101) Consider a database, D, consisting of 4 transactions. Suppose the minimum support count required is 2 and the minimum confidence required is 70%. Use the Apriori algorithm to generate all the candidate itemsets Ci and frequent itemsets Li.
