Data Mining MCQ (Multiple Choice Questions)
Data Mining MCQ PDF arranged chapterwise! Start practicing now for exams,
online tests, quizzes, and interviews!
Answer: d
Explanation: Data mining is the process of discovering patterns, correlations, or
trends by analyzing large datasets. It involves various techniques from statistics,
machine learning, and database systems to uncover valuable insights from data.
These insights can be used for decision-making, prediction, and optimization in
various fields such as business, science, healthcare, and finance.
Answer: a
Explanation: Spooling buffers data exchanged between slow peripheral devices and
computer applications, so it is not a data mining task. Classification, which
maps data to predefined groups, is a basic data mining task. So are prediction,
which estimates data values based on past data, and clustering, which maps data
to groups that are not predefined.
Answer: b
Explanation: Data mining evolved as a field because large amounts of data became
available. With the growing role of the internet in everyday life, a huge amount
of data is generated, and data mining puts it to analytical use. Overfitting,
outliers and high dimensionality, on the other hand, are some of the key
problems faced when implementing data mining.
4. Which of the following is not a motivating factor for the development of data
mining tools?
a) Data tombs
b) Data rich but information poor situation
c) Data cleaning
d) Dependency on domain experts in expert systems
Answer: c
Explanation: Having a huge amount of data but being unable to extract
information from it, the so-called data rich but information poor situation,
created the need for data mining tools. Data that is stored in databases but
rarely accessed forms data tombs. Expert systems built to assist data analysis
depend on domain experts for their knowledge, so they are prone to bias and
error. All these situations motivated the development of data mining tools. Data
cleaning, on the other hand, is a preprocessing step within the data mining
process, not a motivation for it.
Answer: a
Explanation: A data mart is a subset of a data warehouse and is oriented to a
specific functional area or subject. A data warehouse, on the other hand, spans
several functional areas and may have a more complex design than a data mart.
Answer: d
Explanation: Knowledge and data discovery management systems (KDDMS) are
the upcoming data mining systems that will include data mining tools, data
management tools, concurrency features, recovery features, and will also ensure
data consistency.
Answer: b
Explanation: Data mining refers to the extraction of hidden patterns from data
warehouse data. Data preprocessing, outlier detection and removal, and
uncertainty reasoning are methods that aim to remove uncertainty, noise, or
incompleteness from data.
8. Which of the following is not an operation in data warehousing?
a) Sticking
b) Dice
c) Drill down
d) Roll up
Answer: a
Explanation: Sticking is not an OLAP operation; it is a distractor that
resembles slicing, which is one. Drill down moves to a finer level of detail,
and roll up aggregates to a coarser level. Dice selects a sub-cube by
restricting two or more dimensions.
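To make the operations in question 8 concrete, here is a minimal Python sketch over a tiny fact table. The sales rows, the column order and the function names are all hypothetical and only illustrate the idea; a real warehouse would run these operations through an OLAP engine.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, quarter, region, product, units)
facts = [
    (2023, "Q1", "East", "Laptop", 10),
    (2023, "Q1", "West", "Phone", 7),
    (2023, "Q2", "East", "Phone", 5),
    (2024, "Q1", "East", "Laptop", 8),
    (2024, "Q1", "West", "Laptop", 4),
]

def slice_region(rows, region):
    """Slice: fix a single dimension (here, region) to one value."""
    return [r for r in rows if r[2] == region]

def dice(rows, regions, products):
    """Dice: restrict two or more dimensions to obtain a sub-cube."""
    return [r for r in rows if r[2] in regions and r[3] in products]

def roll_up(rows):
    """Roll up: total units per year, aggregating away the finer levels."""
    totals = defaultdict(int)
    for year, _quarter, _region, _product, units in rows:
        totals[year] += units
    return dict(totals)

def drill_down(rows):
    """Drill down: total units at the finer (year, quarter) level."""
    totals = defaultdict(int)
    for year, quarter, _region, _product, units in rows:
        totals[(year, quarter)] += units
    return dict(totals)

print(roll_up(facts))     # {2023: 22, 2024: 12}
print(drill_down(facts))  # {(2023, 'Q1'): 17, (2023, 'Q2'): 5, (2024, 'Q1'): 12}
```

Here slice_region(facts, "East") keeps three rows, while dice(facts, {"East"}, {"Laptop"}) keeps only the two rows that match on both dimensions.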
9. Pick the wrong data mining functionality among the given data mining
functionalities.
a) Classification
b) Clustering
c) Class Description
d) Object Description
Answer: d
Explanation: The five data mining functionalities are class/concept description,
mining frequent patterns (associations and correlations), classification and
regression, clustering, and outlier analysis. Object description is not one of
them.
10. Which of the following refers to the set of features that describe a data
object?
a) Attribute vector
b) Instance
c) Sample
d) Data point
Answer: a
Explanation: A data object is described by one or more attributes or features. The
set of attributes or features that represent the characteristics of a data point is
called an attribute vector or a feature vector.
11. Which of the following is the most effective measure of the center of a
symmetric data set?
a) Mode
b) Midrange
c) Mean
d) Median
Answer: c
Explanation: In a symmetric distribution, values are mirrored around the center,
so for unimodal data the mean, median and mode coincide. The arithmetic mean is
the most commonly used measure of central tendency for such data and represents
the center of the data set.
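The point about symmetric data can be checked with Python's standard statistics module. The two small data sets below are made up for illustration; the second adds one extreme value to show how the mean, unlike the median, is pulled away from the center once symmetry is lost.

```python
import statistics

# Hypothetical data, symmetric around 6
symmetric = [2, 4, 4, 6, 6, 6, 8, 8, 10]
# Same data with one extreme value appended, which breaks the symmetry
skewed = symmetric + [100]

def midrange(values):
    """Midrange: the average of the smallest and largest values."""
    return (min(values) + max(values)) / 2

print(statistics.mean(symmetric))    # 6
print(statistics.median(symmetric))  # 6
print(statistics.mode(symmetric))    # 6
print(midrange(symmetric))           # 6.0

print(statistics.mean(skewed))       # 15.4
print(statistics.median(skewed))     # 6.0
```

For the symmetric set all four measures agree; after the outlier is added the mean jumps to 15.4 while the median stays at 6.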
Answer: a
Explanation: Scatter plots display bivariate data, with one point per pair of
values. They are used to identify relationships between the two variables, and
also to spot clusters and outliers in a data set.
Answer: c
Explanation: Proximity measures evaluate how similar or dissimilar two objects
are. Similarity measures, dissimilarity measures and distance measures are the
commonly used proximity measures.
14. Which of the following is true about the supremum distance between the
given objects?
Object 1: (3, 4, 8)
Object 2: (2, 7, 3)
Answer: a
Explanation: The supremum distance is the maximum absolute difference between
corresponding attribute values of two objects.
Supremum distance = max(|x1 - y1|, ..., |xn - yn|)
S = max(|3 - 2|, |4 - 7|, |8 - 3|)
S = max(1, 3, 5) = 5
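The same calculation can be written as a short Python sketch. The function name supremum_distance is just an illustrative choice; the measure is also known as the Chebyshev or L-infinity distance.

```python
def supremum_distance(x, y):
    """Largest absolute difference between corresponding attribute values."""
    if len(x) != len(y):
        raise ValueError("objects must have the same number of attributes")
    return max(abs(a - b) for a, b in zip(x, y))

object_1 = (3, 4, 8)
object_2 = (2, 7, 3)

# Per-attribute differences are 1, 3 and 5, so the maximum is 5
print(supremum_distance(object_1, object_2))  # 5
```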
Answer: c
Explanation: Data auditing tools discover rules and relationships in the data
and find the data that violate these rules. They use statistical techniques to
find correlations in the data.
Answer: b
Explanation: In bottom-up discretization, also known as merging, all the values
are initially treated as potential split points. Neighboring values are then
merged to form intervals, and the process is applied recursively to the
resulting intervals.
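A rough Python sketch of the merging idea follows. The merge criterion used here (always merge the two adjacent intervals separated by the smallest gap) and the stopping rule (a target of k intervals) are assumptions made for illustration; practical methods such as ChiMerge use a statistical test instead. The sketch uses a loop rather than explicit recursion, and the sample ages are made up.

```python
def merge_discretize(values, k):
    """Bottom-up discretization sketch.

    Start with one interval per distinct value, then repeatedly merge the
    two adjacent intervals with the smallest gap until k intervals remain.
    The gap criterion is an illustrative assumption, not a standard method.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Every distinct value starts as its own interval [low, high]
    intervals = [[v, v] for v in sorted(set(values))]
    while len(intervals) > k:
        gaps = [
            intervals[i + 1][0] - intervals[i][1]
            for i in range(len(intervals) - 1)
        ]
        i = gaps.index(min(gaps))  # first pair of closest neighbors
        intervals[i] = [intervals[i][0], intervals[i + 1][1]]
        del intervals[i + 1]
    return [tuple(interval) for interval in intervals]

ages = [1, 2, 3, 20, 22, 25, 60, 61]
print(merge_discretize(ages, 3))  # [(1, 3), (20, 25), (60, 61)]
```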