Data Mining Question Bank

Uploaded by saniyaa.fatimaa

Question Bank

Name of the Course: DATA MINING Course Code:


Branch: CSE/CSD/CSM Academic Year: 2024-2025

Name & Details of the Course Coordinator:


Dept. of. CSE-

K1-Remembering; K2-Understanding; K3-Applying; K4-Analyzing; K5-Evaluating; K6-Creating
UNIT: 1
Short Answer Questions:
Q.No. Question Course Outcome Bloom’s Taxonomy Level Marks
1 What are the main components of data warehouse architecture? CO1 K1 2

2 Describe the role of the data staging area in data warehouse architecture. CO1 K1 3

3 What is data mining and why is it important? CO1 K2 2

4 What types of data are typically used in data mining? CO1 K1 3

5 What are the primary functionalities of data mining? CO1 K1 3

6 How are data mining systems classified based on data types? CO1 K1 2
Long Answer Questions:
S.No. Question Course Outcome Bloom’s Taxonomy Level Marks
1 Explain the concept of a data warehouse and its significance in business intelligence. CO1 K3 10
2 Describe the multidimensional model and its role in data warehousing. Include an explanation of dimensions, measures, and data cubes. CO1 K4 10
3 What are the advantages of using the multidimensional model in OLAP systems? CO1 K4 10
4 Describe the typical architecture of a data warehouse, including its key components and their functions. CO1 K4 10
5 Define data mining and discuss its importance in extracting valuable insights from large datasets. Structures in the context of document processing. CO1 K4 10
6 Explain the different functionalities of data mining and provide examples of each. CO1 K3 10
UNIT: 2

Short Answer Questions:

S.No. Question Course Outcome Bloom’s Taxonomy Level Marks

1 What is association analysis in data mining? CO2 K1 2

2 What is the Apriori algorithm used for? CO2 K1 2

3 What is an FP-tree and how does it relate to frequent itemset mining? CO2 K1 2

4 What is meant by multilevel association rule mining? CO2 K1 2

5 What are multi-dimensional association rules? CO2 K1 2

Long Answer Questions:

S.No. Question Course Outcome Bloom’s Taxonomy Level Marks

1 Explain the concept of association analysis in data mining. How does it help in discovering relationships among items in large datasets? Provide examples of practical applications where association analysis can be beneficial. CO2 K4 10

2 Elaborate on the Apriori algorithm for frequent itemset mining. How does the algorithm generate candidate itemsets, and what role does the concept of support play in pruning the search space? Discuss its advantages and limitations. CO2 K2 10

3 Describe the FP-tree (Frequent Pattern Tree) structure and explain how it facilitates efficient frequent itemset mining. What are the key features of the FP-tree, and how do they differ from the Apriori algorithm’s approach? CO2 K3 10

4 What is multilevel association rule mining, and how does it address hierarchical data structures? Explain the process of mining rules at different levels of abstraction and the challenges associated with this approach. CO2 K3 10

5 Discuss the concept of multi-dimensional association rule mining. How does it differ from traditional association rule mining, and what are the benefits of incorporating multiple dimensions into the analysis? CO1 K5 10

6 Explain the role of correlation analysis in the context of association rule mining. How does correlation complement other measures like support and confidence in evaluating the strength of associations? CO2 K4 10

UNIT: 3

Short Answer Questions:

S.No. Question Course Outcome Bloom’s Taxonomy Level Marks

1 What is the primary goal of classification in machine learning? CO3 K1 2

2 What is a class label in the context of classification? CO3 K1 2

MID II

3 What are the typical steps involved in the classification process? CO3 K1 2

4 How is the training set used in a classification problem? CO3 K1 3

5 What is the Gini index, and how is it used in decision tree induction? CO3 K1 2

Long Answer Questions:


S.No. Question Course Outcome Bloom’s Taxonomy Level Marks

1 Describe the classification problem in machine learning. How does it differ from other types of predictive modeling, such as regression and clustering? Provide examples to illustrate your explanation. CO3 K3 10

2 Outline the general approach to solving a classification problem. Discuss each step in detail, from data collection and preprocessing to model selection, training, and evaluation. CO3 K4 10

MID II

3 Explain the process of decision tree induction. How do decision trees split data at each node, and what criteria are commonly used for making these splits? Discuss the advantages and disadvantages of using decision trees for classification. CO3 K4 10

4 Describe rule-based classifiers. How are classification rules generated from data? Discuss the methods for evaluating and refining rules to improve the accuracy and interpretability of the classifier. CO3 K4 10

5 Explain the k-nearest neighbor (k-NN) algorithm. How does it determine the class of a new instance, and what factors influence the performance of the classifier? Discuss the impact of the choice of k and distance metrics on the results. CO3 K4 10
UNIT: 4

Short Answer Questions:

S.No. Question Course Outcome Bloom’s Taxonomy Level Marks

1 What is the primary goal of cluster analysis? CO4 K1 2

2 Why are similarity and distance metrics crucial in clustering? CO4 K2 2

3 What are the key characteristics of a good clustering algorithm? CO4 K1 2

4 What is the basic principle of partition-based clustering? CO4 K1 2

5 How does BIRCH handle large datasets efficiently? CO4 K2 2

Long Answer Questions:

S.No. Question Course Outcome Bloom’s Taxonomy Level Marks

6 Explain the main objectives of cluster analysis and discuss how it is utilized in various fields such as marketing, biology, and image processing. How does cluster analysis contribute to the discovery of underlying patterns in large datasets? CO4 K4 10

7 Describe the importance of similarity and distance metrics in the context of clustering. How do these metrics influence the formation of clusters, and what are the common challenges associated with choosing an appropriate metric? CO4 K3 10

8 Identify and discuss the key characteristics that define a clustering algorithm. How do factors such as scalability, interpretability, and cluster shape influence the selection of a clustering algorithm for a particular application? CO4 K3 10

9 Examine the challenges associated with clustering large-scale datasets. What strategies can be employed to ensure that a clustering algorithm remains efficient and effective when dealing with massive amounts of data? CO4 K4 10

10 Compare k-means with other partition-based clustering techniques such as k-medoids and fuzzy c-means. Discuss the advantages and disadvantages of each method, and provide examples of scenarios where one might be preferred over the others. CO4 K3 10

UNIT: 5

Short Answer Questions:

S.No. Question Course Outcome Bloom’s Taxonomy Level Marks

1 What is a data stream, and how does it differ from traditional data processing? CO5 K1 2

2 What is a time series, and how is it used in data mining? CO5 K2 2

3 How do time series forecasting models like ARIMA and Exponential Smoothing work? CO5 K1 2

4 What is the goal of sequential pattern mining in transactional databases? CO5 K1 2

5 Explain the difference between frequent itemset mining and sequential pattern mining. CO5 K2 2
Long Answer Questions:

S.No. Question Course Outcome Bloom’s Taxonomy Level Marks

1 Describe the fundamental challenges and strategies involved in mining data streams. How do concepts such as concept drift and data stream fragmentation impact the effectiveness of data stream mining algorithms? Discuss how incremental learning approaches are used to address these challenges. CO5 K4 10

2 Discuss the key techniques for summarizing and approximating data in data stream mining, such as sketching and sampling. How do these techniques help in handling the scalability and memory limitations inherent in data stream environments? CO5 K3 10

3 Compare and contrast time series forecasting models such as ARIMA, Exponential Smoothing, and machine learning approaches like LSTM (Long Short-Term Memory) networks. How do these models handle seasonality, trend components, and noise in time series data? CO5 K4 10

4 Describe the process of mining sequential patterns in transactional databases, focusing on algorithms such as PrefixSpan and SPADE. How do these algorithms efficiently find frequent sequential patterns and handle the challenges of large-scale sequence data? CO5 K3 10

5 Explain how the concept of constraint-based mining can be applied to sequential pattern mining. What types of constraints are commonly used, and how do they impact the efficiency and relevance of the discovered sequential patterns? CO5 K4 10

6 Analyze the unique challenges associated with mining object data compared to traditional tabular data. How do object-oriented data models and complex data structures affect the mining process? Discuss specific algorithms or techniques used for mining object data. CO5 K4 10