DWDM Unitwise Questions

This document is a unit-wise question bank for a Data Warehousing and Data Mining course at V.V.P. Engineering College. It covers various topics including data warehousing concepts, data mining processes, data preprocessing techniques, mining frequent patterns, classification and prediction methods, clustering, and advanced topics like web mining and text mining. Each unit contains specific questions designed to assess understanding and application of the subject matter.

Semester 6th – I.T. Department – V.V.P. Engg. College
DATA WAREHOUSING AND DATA MINING
Subject Code: 3161610
Unit-Wise Question Bank

Unit-1 Data Warehousing


1) What is Data Warehousing? Explain its features.
2) Differentiate between: a) Data warehouse and Data Mart b) OLTP and OLAP systems
c) Fact table and Dimension table.
3) With the help of a neat diagram explain the 3-tier architecture of a data warehouse.
4) Explain Star, Snowflake, and Fact Constellation Schema for Multidimensional
Database with diagram.
5) What is a data cube? Explain the various OLAP operations on a data cube with an example.

Unit-2 Introduction to data mining (DM)

6) Define the term “Data Mining”. With the help of a suitable diagram explain the
process of knowledge discovery from databases. Why is it called data mining rather
than knowledge mining?
7) List the types of data on which data mining can be performed. Explain different data
mining functionalities.
8) Write a note on Classification of data mining.
9) Discuss possible ways for integration of a Data Mining system with a Database or
Data Warehouse system.
10) List and describe major issues in data mining.

Unit-3 Data Preprocessing


11) Explain the pre-processing required to handle missing data and noisy data during
the process of data mining. Or List and describe the methods for handling the
missing and noisy values in data cleaning.
12) Explain with example how continuous numerical data values can be discretized.
13) Describe methods for data transformation.
14) What are measures? List and explain the types of measures. Or: Write a short note
on distributive and holistic measures.
15) Suppose that the data for analysis includes the attribute age. The age values for the
data tuples are (in increasing order): 13, 15, 16, 16, 19, 20, 23, 29, 35, 41, 44, 53, 62,
69, 72
a. Use min-max normalization to transform the value 45 for age onto the range
[0.0, 1.0].
b. Use z-score normalization to transform the value 45 for age, where the
standard deviation of age is 20.64 years.
Compiled By: Darshana H. Patel
V.V.P. Engineering College, Rajkot
16) Enlist various data reduction strategies and explain any two.
17) What is noise? Explain data smoothing as a noise removal technique. Partition the
given data into equal-frequency bins of size 3, then smooth by bin means, by bin
medians and by bin boundaries. Consider the data: 10, 2, 19, 18, 20, 18, 25, 28, 22
18) Explain the following terms: Concept hierarchy and its types, Histogram, Sampling,
Correlation analysis, Chi-square test.
19) Explain Mean, Median, Mode, Variance, Standard Deviation and five-number summary
with a suitable database example.
20) Explain Feature selection, Feature extraction and CUR decomposition in brief.
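
The arithmetic behind questions 15 and 17 can be sketched in Python as below. This is only an illustrative sketch, not a model answer: the function names are made up for this example, and the tie rule in boundary smoothing (a value equidistant from both boundaries goes to the lower one) is an assumption, since conventions differ between textbooks.

```python
from statistics import mean, median

# Age values from question 15.
ages = [13, 15, 16, 16, 19, 20, 23, 29, 35, 41, 44, 53, 62, 69, 72]

def min_max(value, data, new_min=0.0, new_max=1.0):
    """Map value linearly from [min(data), max(data)] onto [new_min, new_max]."""
    lo, hi = min(data), max(data)
    return (value - lo) / (hi - lo) * (new_max - new_min) + new_min

def z_score(value, data, std_dev):
    """Distance of value from the data mean, in units of std_dev."""
    return (value - mean(data)) / std_dev

print(round(min_max(45, ages), 3))         # (45 - 13) / (72 - 13) = 0.542
print(round(z_score(45, ages, 20.64), 3))  # (45 - 35.133) / 20.64 = 0.478

# Data from question 17; equal-frequency binning works on sorted values.
values = sorted([10, 2, 19, 18, 20, 18, 25, 28, 22])

def equal_frequency_bins(sorted_data, size):
    """Cut the sorted data into consecutive bins holding `size` values each."""
    return [sorted_data[i:i + size] for i in range(0, len(sorted_data), size)]

def smooth_by_means(bins):
    """Replace every value in a bin with the bin mean."""
    return [[mean(b)] * len(b) for b in bins]

def smooth_by_medians(bins):
    """Replace every value in a bin with the bin median."""
    return [[median(b)] * len(b) for b in bins]

def smooth_by_boundaries(bins):
    """Replace every value with the nearer bin boundary.
    Assumption: ties go to the lower boundary."""
    return [[b[0] if v - b[0] <= b[-1] - v else b[-1] for v in b] for b in bins]

bins = equal_frequency_bins(values, 3)
print(bins)                        # [[2, 10, 18], [18, 19, 20], [22, 25, 28]]
print(smooth_by_means(bins))       # [[10, 10, 10], [19, 19, 19], [25, 25, 25]]
print(smooth_by_medians(bins))     # [[10, 10, 10], [19, 19, 19], [25, 25, 25]]
print(smooth_by_boundaries(bins))  # [[2, 2, 18], [18, 18, 20], [22, 22, 28]]
```

In this particular data the middle value of every bin is equidistant from both boundaries, so the boundary-smoothed result depends entirely on the tie rule chosen.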

Unit-4 Mining Frequent Patterns, Associations and Correlations


21) Write and discuss the algorithm which is used to generate frequent itemsets using
an iterative level-wise approach based on candidate generation. State the Apriori
property. Also, list the techniques to improve the efficiency of the Apriori algorithm.
Generate large itemsets and association rules using the Apriori algorithm on the
following data set, with the minimum support and minimum confidence values
set to 50% and 75% respectively.

22) What is Market Basket Analysis? Explain its use. Explain Association Rules
with Confidence and Support, giving an example.
23) Why is a strong association rule not always interesting? Explain with an example. How
can multilevel association rules be mined efficiently using a concept hierarchy?
24) Write a note on sequential pattern mining or advanced association rule mining
techniques.
25) Briefly explain mining frequent patterns without candidate generation giving an
example.
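
The level-wise candidate generation asked about in question 21 can be sketched as follows. The data set for that question is not reproduced in this text, so the four transactions below are purely hypothetical; only the 50% support and 75% confidence thresholds come from the question. The function names are illustrative, and this is a minimal sketch rather than an optimized implementation.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain the whole itemset.
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Count the candidates and keep those meeting the threshold.
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori property): all (k-1)-subsets must be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent[lhs]  # support(itemset) / support(lhs)
                if conf >= min_confidence:
                    rules.append((lhs, itemset - lhs, conf))
    return rules

# Hypothetical transactions, NOT the data set from question 21.
transactions = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
frequent = apriori(transactions, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.75)
for lhs, rhs, conf in rules:
    print(sorted(lhs), "->", sorted(rhs), conf)
```

On these hypothetical transactions the only frequent 3-itemset is {B, C, E}, and the rules that survive the 75% threshold all have confidence 1.0, for example A -> C and B -> E.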

Unit-5 Classification and Prediction


26) Explain classification by decision tree induction, giving the algorithm along with
an illustrative example.
27) What is classification and prediction? List out Issues regarding Classification and
prediction.
28) Discuss Tree Pruning in detail. Or: Why is tree pruning useful in decision tree induction?
29) What is an attribute selection measure? Explain different attribute selection measures
with example. OR Explain the following as attribute selection measure: (i)
Information Gain (ii) Gain Ratio
30) Explain “Linear Regression” using a suitable example. Or: Explain Linear and Non-Linear
Regression methods of prediction. Or: Explain linear regression. What are the
reasons for not using the linear regression model to estimate the output data?
31) Why is naïve Bayesian classification called “naïve”? Briefly outline the major ideas of
naïve Bayesian classification, giving an example. Or: Explain Bayes' theorem and the
statistics-based algorithm used for classification.
32) Explain how the accuracy of a classifier/predictor can be measured (evaluating the
accuracy of a classifier/predictor) & also describe by which methods accuracy can be
increased (Ensemble methods/Combining methods).
33) Write a note on accuracy and error measures for classification and prediction.
34) Explain rule-based classification and case-based reasoning in detail.
35) What are neural networks? Describe the various factors which make them useful for
classification and prediction in data mining. Explain how the topology of a neural
network is designed. List the strengths and weaknesses of a neural network as a classifier.
What are the terminating conditions to stop the training process of a neural network classifier?
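
The two attribute selection measures named in question 29 can be sketched as below. The four-row data set and the attribute names `outlook` and `windy` are hypothetical, chosen only so the numbers are easy to verify by hand; the function names are likewise illustrative.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Expected information (in bits) needed to identify the label of a tuple."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def split_by(rows, labels, attr):
    """Group the class labels by the value each row takes on attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return groups

def information_gain(rows, labels, attr):
    """Entropy before the split minus the weighted entropy after it."""
    n = len(labels)
    groups = split_by(rows, labels, attr)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def gain_ratio(rows, labels, attr):
    """Information gain normalized by the split information of attr."""
    split_info = entropy([row[attr] for row in rows])
    return information_gain(rows, labels, attr) / split_info if split_info else 0.0

# Hypothetical training tuples and class labels.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["yes", "yes", "no", "no"]

for attr in ("outlook", "windy"):
    print(attr, information_gain(rows, labels, attr), gain_ratio(rows, labels, attr))
```

Here `outlook` separates the classes perfectly, so its gain equals the full 1 bit of class entropy, while `windy` leaves each partition as mixed as the original data and gains nothing.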

Unit-6 Clustering
36) What is meant by “clustering”? Explain why clustering is called unsupervised
learning. Mention any two applications of clustering.
37) Explain the k-Means and k-Medoids clustering algorithms in detail. How does the k-Means
clustering method differ from the k-Medoids clustering method?
38) What is outlier analysis? Why is outlier mining important? Briefly describe the
different approaches for outlier detection.
39) Discuss agglomerative and divisive methods, along with the strengths and
weaknesses of hierarchical clustering.
40) Write a note on clustering high dimensional data.
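
The difference asked about in question 37 comes down to how a cluster's representative is chosen: k-means uses the mean of the members, k-medoids uses an actual member. The sketch below illustrates this with a basic k-means loop plus a medoid helper. All names and sample points are hypothetical, and the initial centers are supplied by hand rather than chosen randomly.

```python
from math import dist

def centroid(points):
    """Per-dimension mean of the points (the k-means representative)."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def medoid(points):
    """The member with the smallest total distance to all other members
    (the k-medoids representative)."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))

def k_means(points, centers, max_iter=100):
    """Assign each point to its nearest center, recompute the centers as
    centroids, and repeat until the centers stop changing."""
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [centroid(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# Hypothetical 2-D points forming two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = k_means(points, centers=[(1, 1), (8, 8)])
print(clusters)

# A single cluster with one outlier: the mean is dragged toward the outlier,
# while the medoid stays on a typical member.
cluster = [(1, 0), (2, 0), (3, 0), (4, 0), (100, 0)]
print(centroid(cluster))  # (22.0, 0.0)
print(medoid(cluster))    # (3, 0)
```

The last two lines show why k-medoids is usually described as less sensitive to outliers than k-means.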

Unit-7 Advanced Topics


41) What is a web log? Explain web structure mining and web usage mining in detail.
42) Briefly explain the basic concepts of text mining and spatial mining using an example.
43) Write a note on Temporal Mining and Multimedia Mining.
