Question Bank: Q1) What is a data warehouse?
Q4) Briefly state the difference between a data warehouse and a data mart.
Q5) What are the differences among a dependent data mart, an independent data mart, and a
hybrid data mart?
Q14) What are the benefits of data mining?
Q16) Describe the challenges of a separate data warehouse with regard to performance and function issues.
Q28) Why do you need a data warehouse life cycle process, and what are the steps in the life
cycle approach?
Q31) Why Do We Need Data Warehouses?
Q35) Explain the different types of data repositories on which mining can be performed.
Q37) Define a data warehouse. Draw the architecture of a data warehouse and explain the three
tiers in detail.
Q38) Draw and explain the block diagram of the Online Transaction Processing cycle.
Q39) Consider the sales market transactions shown in the table below. What is the
multidimensional OLAP cube that can be derived from this data set?
Q40) What are the main advantages and disadvantages of a MOLAP cube?
TID List of Items
Q44) Draw a flowchart to show how the K-means clustering algorithm works.
Q45) Clustering techniques are used in many real-life fields. Enumerate five
clustering applications.
Q46) Explain each of the following in detail.
1. Decision support system
2. Market-basket data
3. Association rules
4. Key concepts of the Apriori algorithm
Q47) Consider a database, D, consisting of 5 transactions. Use this table to show the
implementation of the k-means algorithm together with the Euclidean distance function. Use K=2 and
suppose A and C are selected as the initial means.
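The procedure this question asks for can be sketched as below. The question's table is not reproduced here, so the five points A to E and their coordinates are hypothetical placeholders, and the function names `euclidean` and `k_means` are illustrative. The sketch only shows the assignment and update steps with K=2 and A and C as the initial means; it is not the expected answer for the actual data.

```python
import math

# Hypothetical 2-D points: the question's table is not reproduced here,
# so these coordinates are placeholders for illustration only.
points = {
    "A": (1.0, 1.0),
    "B": (1.5, 2.0),
    "C": (5.0, 7.0),
    "D": (3.5, 5.0),
    "E": (4.5, 5.0),
}


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_means(points, initial_means, max_iter=100):
    """Return (assignment, means); assignment maps point name -> cluster index."""
    means = list(initial_means)
    assignment = {}
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest mean.
        new_assignment = {
            name: min(range(len(means)), key=lambda j: euclidean(p, means[j]))
            for name, p in points.items()
        }
        if new_assignment == assignment:
            break  # no point changed cluster, so the algorithm has converged
        assignment = new_assignment
        # Update step: each mean becomes the centroid of its members.
        for j in range(len(means)):
            members = [points[n] for n, c in assignment.items() if c == j]
            if members:
                means[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment, means


# K=2 with A and C as the initial means, as the question specifies.
assignment, means = k_means(points, [points["A"], points["C"]])
print(assignment)
print(means)
```

With the real table, only the `points` dictionary would need to change.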
Q51) Suppose that we have the following database of transactions D. Based on these
transactions, determine the support and confidence values for the following items I.
items I
a database of transactions D
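For reference, the usual definitions of the two measures for a rule A ⇒ B over D can be written as follows. The symbols A, B, and T are notation assumed for this note and do not come from the question's tables.

```latex
\[
\mathrm{support}(A \Rightarrow B)
  = \frac{\lvert \{\, T \in D : A \cup B \subseteq T \,\} \rvert}{\lvert D \rvert},
\qquad
\mathrm{confidence}(A \Rightarrow B)
  = \frac{\lvert \{\, T \in D : A \cup B \subseteq T \,\} \rvert}
         {\lvert \{\, T \in D : A \subseteq T \,\} \rvert}
\]
```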
Q52) Consider a database, D, consisting of 4 transactions. Suppose the
minimum support count required is 2 and the minimum confidence required is 70%. Use
the Apriori algorithm to generate all the candidate itemsets Ci and
frequent itemsets Li.
TID    Items
100    1, 3, 4
200    2, 3, 5
300    1, 2, 3, 5
400    2, 5
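The level-wise generation of Ci and Li can be sketched in Python as below. This is a minimal illustration, not a reference solution: it assumes each digit in the Items column is a separate item (so transaction 100 contains items 1, 3, and 4), and the names `support_count` and `apriori` are illustrative. Rule generation against the 70% confidence threshold is left out.

```python
from itertools import combinations

# Transactions from the table above, assuming each digit is a separate item.
transactions = {
    100: {1, 3, 4},
    200: {2, 3, 5},
    300: {1, 2, 3, 5},
    400: {2, 5},
}
MIN_SUPPORT_COUNT = 2


def support_count(itemset, db):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for items in db.values() if itemset <= items)


def apriori(db, min_count):
    """Return a list of (C_k, L_k) pairs, each a dict itemset -> support count."""
    items = sorted({i for t in db.values() for i in t})
    candidates = [frozenset([i]) for i in items]
    levels = []
    k = 1
    while candidates:
        # Count every candidate, then keep those meeting the minimum count.
        c_k = {c: support_count(c, db) for c in candidates}
        l_k = {c: n for c, n in c_k.items() if n >= min_count}
        levels.append((c_k, l_k))
        # Join step: union pairs of frequent k-itemsets into (k+1)-itemsets.
        joined = {a | b for a, b in combinations(l_k, 2) if len(a | b) == k + 1}
        # Prune step: keep a candidate only if all its k-subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in l_k for s in combinations(c, k))
        ]
        k += 1
    return levels


levels = apriori(transactions, MIN_SUPPORT_COUNT)
for k, (c_k, l_k) in enumerate(levels, start=1):
    print(f"C{k}:", {tuple(sorted(s)): n for s, n in c_k.items()})
    print(f"L{k}:", {tuple(sorted(s)): n for s, n in l_k.items()})
```

Under that reading of the table, the loop stops after the third level, because a single frequent 3-itemset cannot be joined into a 4-itemset candidate.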
Q56) Define the concept of classification and explain the main steps.
Q59) Explain the general steps of hierarchical clustering.
Q60) Explain the methods of hierarchical clustering and give an example for
each one.
Q64) Discuss briefly whether or not each of the following activities is a
data mining task.
(e) Predicting the future stock price of a company using historical records.
Q65) What are the main advantages and disadvantages of Decision Tree
classification algorithms?
Q68) Name some specific applications in which the data analysis task
is classification.
Q69) What are the essential steps in the decision-making
process?
Q73) What are descriptive and predictive data mining? Name three examples of
each.
Q74) Multiple Choice Questions. Please choose the best answer for the
following questions:
1. Which of the following is the most important when deciding on the data
structure of a data mart?
(a) XML data exchange standards
(b) Data access tools to be used
(c) Metadata naming conventions
(d) Extract, Transform, and Load (ETL) tool to be used
(c) Spreadsheet
(d) XML
C) Data Mart
D) FTP
A) top-down view
B) data warehouse view
C) data source view
D) business query view
Q79) Multiple Choice Questions. Please choose the best answer for the
following questions:
1. Data mining is best described as the process of
a. identifying patterns in data.
b. deducing relationships in data.
c. representing data.
d. simulating trends in data.
5. This step of the KDD process model deals with noisy data.
a. creating a target dataset
b. data preprocessing
c. data transformation
d. data mining
6. This clustering algorithm initially assumes that each data instance
represents a single cluster.
a. agglomerative clustering
b. conceptual clustering
c. K-Means clustering
d. expectation maximization
Q80) Construct a decision tree with root node Type from the data in the table
below. The first row contains attribute names. Each row after the first
represents the values for one data instance. The output attribute is Class.
Q84) What is the difference between classification and clustering?
Q85) What are hierarchical methods? Give an example for each one.
Q88) The data classification process includes two steps, one using a training set and
one using a test set. Explain these steps with an example.
Q95) Extract a rule-based system from the decision tree given below,
using the rule-based ordering technique.
[Figure: decision tree with root node "age?"; leaf labels: no, yes, no]
Q96) Explain the dendrogram of the hierarchical technique and convert the
numbers of the figure below of nested clusters to a dendrogram.
Q97) State the advantages of the decision tree approach over other approaches
for performing classification.
Q98) Explain in detail the coverage and the accuracy of a rule in data mining
classification, with an example for each one.
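For reference, the two measures are commonly defined as below for a rule R evaluated on a class-labelled data set D. The symbols n_covers (tuples whose attributes satisfy the rule's condition) and n_correct (covered tuples that the rule also classifies correctly) are notation assumed for this note.

```latex
\[
\mathrm{coverage}(R) = \frac{n_{\mathrm{covers}}}{\lvert D \rvert},
\qquad
\mathrm{accuracy}(R) = \frac{n_{\mathrm{correct}}}{n_{\mathrm{covers}}}
\]
```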
Q101) Consider a database, D, consisting of 4 transactions. Suppose the
minimum support count required is 2 and the minimum confidence required is 70%. Use
the Apriori algorithm to generate all the candidate itemsets Ci and
frequent itemsets Li.