Data Mining MCQ (Multiple Choice Questions)

Uploaded by

roysayanccp05

Data Mining MCQ PDF arranged chapterwise! Start practicing now for exams,
online tests, quizzes, and interviews!

Here are Data Mining MCQs (Chapterwise).

1. What is data mining?


a) Deleting unnecessary data
b) Sorting data alphabetically
c) Storing data securely
d) Extracting useful patterns or information from large datasets

Answer: d
Explanation: Data mining is the process of discovering patterns, correlations, or
trends by analyzing large datasets. It involves various techniques from statistics,
machine learning, and database systems to uncover valuable insights from data.
These insights can be used for decision-making, prediction, and optimization in
various fields such as business, science, healthcare, and finance.

2. Which of the following is not a basic data mining task?


a) Spooling
b) Prediction
c) Classification
d) Clustering

Answer: a
Explanation: Spooling buffers data exchanged between slow peripheral devices
and computer applications, so it is not a data mining task. Classification,
which maps data to predefined groups, is a basic data mining task. Similarly,
prediction, which estimates data values based on past data, and clustering,
which maps data to groups that are not predefined, are also basic data mining
tasks.

3. Which of the following is not an issue in data mining?


a) High dimensionality
b) Shortage of data
c) Overfitting
d) Outliers

Answer: b
Explanation: Data mining evolved as a field because of the availability of large
amounts of data. With the growing role of the internet in everyday life, huge
volumes of data are generated and put to analytical use through data mining, so
a shortage of data is not a typical issue. Overfitting, outliers, and high
dimensionality, on the other hand, are key problems faced when implementing
data mining.

4. Which of the following is not a motivating factor for the development of data
mining tools?
a) Data tombs
b) Data rich but information poor situation
c) Data cleaning
d) Dependency on domain experts in expert systems

Answer: c
Explanation: Having a huge amount of data but being unable to extract
information from it, described as a data rich but information poor situation,
created the need for data mining tools. Data that is stored in databases but
rarely used forms data tombs. Expert systems built to assist data analysis
depend on knowledge supplied by domain experts, so they are not completely
error-free either. All these situations motivated the development of data mining
tools. Data cleaning, on the other hand, is a step in the data mining process.

5. Which of the following is a subset of a data warehouse focused on a specific
functional area?
a) Data mart
b) Association rules
c) Flat files
d) Database

Answer: a
Explanation: A data mart is a subset of a data warehouse and is oriented to a
specific functional area or subject. A data warehouse, on the other hand, spans
different functional areas and may have a more complex design than a data mart.

6. Which of the following statements about knowledge and data discovery
management systems (KDDMS) is false?
a) It will provide concurrency features
b) It will provide recovery features
c) It will include data mining tools and data management tools
d) It will include data mining tools but not data management tools

Answer: d
Explanation: Knowledge and data discovery management systems (KDDMS) are
the upcoming data mining systems that will include data mining tools, data
management tools, concurrency features, recovery features, and will also ensure
data consistency.

7. Which of the following does not aim at removing uncertainty, noise, etc.?


a) Data preprocessing
b) Data Mining
c) Outlier detection and removal
d) Uncertainty Reasoning

Answer: b
Explanation: Data Mining refers to the process of extraction of hidden patterns
from the Data Warehouse data. Data Preprocessing, Outlier detection and
removal and Uncertainty Reasoning are the methods which aim at removing
uncertainty, noise, or incompleteness of data.

8. Which of the following is not an operation in data warehousing?
a) Sticking
b) Dice
c) Drill down
d) Roll up

Answer: a
Explanation: "Sticking" is not a data warehouse operation; it is a deliberate
misspelling of slicing. Drill down moves from summarized data to more detailed
data, and roll up does the reverse by aggregating detail into summaries. Slice
selects on one dimension of the cube, while dice selects on two or more
dimensions to produce a subcube.
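
The operations named in this question can be sketched on a tiny in-memory cube. This is only an illustration under stated assumptions: the `sales` records and the helper names `slice_cube`, `dice_cube`, and `roll_up` are hypothetical and do not come from any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical sales cube stored as flat records (illustrative data only).
sales = [
    {"city": "Delhi", "quarter": "Q1", "item": "phone", "units": 10},
    {"city": "Delhi", "quarter": "Q2", "item": "phone", "units": 14},
    {"city": "Delhi", "quarter": "Q1", "item": "laptop", "units": 4},
    {"city": "Mumbai", "quarter": "Q1", "item": "phone", "units": 7},
    {"city": "Mumbai", "quarter": "Q2", "item": "laptop", "units": 5},
]


def slice_cube(rows, dim, value):
    """Slice: select on a single dimension."""
    return [r for r in rows if r[dim] == value]


def dice_cube(rows, conditions):
    """Dice: select on two or more dimensions (dimension -> allowed values)."""
    return [
        r for r in rows
        if all(r[dim] in allowed for dim, allowed in conditions.items())
    ]


def roll_up(rows, keep_dims):
    """Roll up: sum units over every dimension that is not in keep_dims."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in keep_dims)] += r["units"]
    return dict(totals)


q1_sales = slice_cube(sales, "quarter", "Q1")
delhi_phones = dice_cube(sales, {"city": {"Delhi"}, "item": {"phone"}})
units_by_city = roll_up(sales, ["city"])
units_by_city_quarter = roll_up(sales, ["city", "quarter"])
```

Going from `units_by_city` to `units_by_city_quarter` corresponds to a drill down, because the second result keeps one more dimension and is therefore more detailed.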

9. Pick the wrong data mining functionality among the given data mining
functionalities.
a) Classification
b) Clustering
c) Class Description
d) Object Description
Answer: d
Explanation: There are five data mining functionalities: class/concept
description; mining frequent patterns, associations, and correlations;
classification and regression; clustering; and outlier analysis. Object
description is not one of them.

10. Which of the following refers to the set of features that describe a data
object?
a) Attribute vector
b) Instance
c) Sample
d) Data point

Answer: a
Explanation: A data object is described by one or more attributes or features. The
set of attributes or features that represent the characteristics of a data point is
called an attribute vector or a feature vector.
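
As a small illustration, a data object and its attribute vector might look like the following. The `Customer` class and its fields are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, astuple


@dataclass
class Customer:
    """A hypothetical data object described by three attributes."""
    age: int
    income: float
    visits: int


customer = Customer(age=34, income=52000.0, visits=7)

# The attribute (feature) vector is the ordered set of attribute values.
attribute_vector = astuple(customer)
```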

11. Which of the following is the most effective measure of the center of a
symmetric data set?
a) Mode
b) Midrange
c) Mean
d) Median

Answer: c
Explanation: In a symmetric distribution, values are spread evenly on either
side of the center, so the mean and the median coincide. The arithmetic mean is
the most commonly used measure of central tendency for symmetric data and
represents the center of the data set.
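
A quick check with made-up numbers shows why the mean works well for symmetric data and less well once the data is skewed. The two lists below are illustrative assumptions, not data from the question set.

```python
from statistics import mean, median

# Hypothetical data sets: the second replaces the largest value with an extreme one.
symmetric = [2, 4, 6, 8, 10]
skewed = [2, 4, 6, 8, 100]

symmetric_mean = mean(symmetric)
symmetric_median = median(symmetric)
symmetric_midrange = (min(symmetric) + max(symmetric)) / 2

skewed_mean = mean(skewed)
skewed_median = median(skewed)
```

For the symmetric list the mean, median, and midrange all agree, while a single extreme value pulls the mean of the skewed list far away from its median.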

12. Which of the following is not true about scatter plots?


a) It is used in the case of univariate distribution
b) It is used to identify relationship between attributes
c) It is used to identify clusters
d) It is used to identify outliers

Answer: a
Explanation: Scatter plots are used for bivariate data, where each point plots
the value of one attribute against another. They are used to identify
relationships between attributes, and also to identify clusters and outliers in
a data set.

13. Which of the following is not a proximity measure?


a) Dissimilarity measures
b) Similarity measures
c) Probability measures
d) Distance measures

Answer: c
Explanation: Proximity measures are used to evaluate the similarity or
dissimilarity between two objects. Similarity measures, dissimilarity
measures and distance measures are the commonly used proximity measures.
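
Two common distance measures, and one way of turning a distance into a similarity score, can be sketched as follows. The conversion `1 / (1 + d)` is only one convention among several and is an assumption of this example.

```python
import math


def manhattan(x, y):
    """Manhattan distance: sum of absolute attribute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    """Euclidean distance: straight-line distance between two points."""
    return math.dist(x, y)


def similarity_from_distance(d):
    """Map a distance in [0, inf) to a similarity in (0, 1] (assumed convention)."""
    return 1 / (1 + d)


p, q = (1, 2), (4, 6)
d_manhattan = manhattan(p, q)
d_euclidean = euclidean(p, q)
s_identical = similarity_from_distance(manhattan(p, p))
```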

14. Which of the following is true about the supremum distance between the
given objects?

Object      Part 1   Part 2   Part 3
Object 1    3        4        8
Object 2    2        7        3

a) The supremum distance between the objects is 5
b) The supremum distance between the objects is 4
c) The supremum distance between the objects is 6
d) The supremum distance between the objects is 2

Answer: a
Explanation: The supremum distance is the maximum difference between the
corresponding attribute values of two objects.
Supremum distance = max(|x1-y1|, ..., |xn-yn|)
S = max(|3-2|, |4-7|, |8-3|)
S = max(1, 3, 5) = 5
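
The same calculation can be checked with a few lines of Python. The function name `supremum_distance` is chosen here for illustration.

```python
def supremum_distance(x, y):
    """Largest absolute difference between corresponding attribute values."""
    if len(x) != len(y):
        raise ValueError("objects must have the same number of attributes")
    return max(abs(a - b) for a, b in zip(x, y))


object_1 = (3, 4, 8)
object_2 = (2, 7, 3)
distance = supremum_distance(object_1, object_2)
```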

15. Which of the following is not true about data reduction?


a) It involves dimensionality reduction
b) It involves numerosity reduction
c) Reduced data strives to give the same analytical results as the original data
d) Reduced data strives to give less accurate analytical results than the
original data

Answer: d
Explanation: Data reduction is a part of data preprocessing. It aims to reduce
the size of the data while giving the same results on analysis of the reduced
data as on the original data. It involves dimensionality reduction and
numerosity reduction.
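
A minimal sketch of both kinds of reduction is given below, under simplifying assumptions: dimensionality reduction is shown only in its most trivial form (dropping attributes that never vary), and numerosity reduction is shown as simple random sampling without replacement. The data and helper names are hypothetical.

```python
import random
from statistics import mean, pvariance

# Hypothetical data: 1000 rows, where the middle attribute is constant.
rows = [(i, 1.0, 2 * i) for i in range(1, 1001)]


def drop_constant_attributes(rows):
    """Dimensionality reduction (trivial form): drop zero-variance attributes."""
    columns = list(zip(*rows))
    keep = [j for j, col in enumerate(columns) if pvariance(col) > 0]
    return [tuple(r[j] for j in keep) for r in rows], keep


def sample_rows(rows, n, seed=0):
    """Numerosity reduction: simple random sample without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, n)


reduced_rows, kept_indices = drop_constant_attributes(rows)
sampled_rows = sample_rows(reduced_rows, 100)

# The sample mean of the first attribute should stay close to the full mean.
full_mean = mean(r[0] for r in reduced_rows)
sample_mean = mean(r[0] for r in sampled_rows)
```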

16. What do data auditing tools not do?


a) Detect data that violate certain rules
b) Discover rules and relationships in the data
c) Use parsing to find rules in the data
d) Use statistical analysis to find rules in the data

Answer: c
Explanation: Data auditing tools discover rules and relationships in the data
and find the data that violate these rules. They make use of statistical
techniques to find the correlations in the data. Parsing is instead
characteristic of data scrubbing tools.
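
The two behaviours described here, detecting rule violations and using statistics to spot discrepancies, can be sketched as follows. The records, the age range rule, and the z-score threshold of 2.0 are all assumptions made for this example.

```python
from statistics import mean, pstdev

# Hypothetical records with two implausible ages.
records = [
    {"id": 1, "age": 34},
    {"id": 2, "age": 29},
    {"id": 3, "age": -5},
    {"id": 4, "age": 41},
    {"id": 5, "age": 230},
]


def violates_range(records, field, low, high):
    """Return ids of records whose field falls outside the rule [low, high]."""
    return [r["id"] for r in records if not (low <= r[field] <= high)]


def statistical_discrepancies(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


bad_ids = violates_range(records, "age", 0, 120)
ages = [30, 31, 29, 32, 30, 28, 31, 30, 29, 95]
suspicious = statistical_discrepancies(ages)
```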

17. Which of the following is true about bottom up discretization?


a) Some of the values are treated as potential split points
b) All the values are treated as potential split points
c) Split points are not considered
d) Only one value is treated as a potential split point

Answer: b
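Explanation: In bottom up discretization, also known as merging, all the values
are initially considered as potential split points. Some of these split points
are then removed by merging neighboring values into intervals, and the process
is applied recursively to the resulting intervals.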
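
The merging idea can be sketched as follows. The merge criterion used here (always merge the two adjacent intervals separated by the smallest gap) is an assumption chosen for simplicity; established bottom up methods such as ChiMerge use a chi-square test instead.

```python
def bottom_up_discretize(values, n_intervals):
    """Merge adjacent intervals until only n_intervals remain.

    Every distinct value starts as its own interval, so every value is a
    potential split point. Assumed criterion: smallest gap merges first.
    """
    if n_intervals < 1:
        raise ValueError("n_intervals must be at least 1")
    intervals = [[v, v] for v in sorted(set(values))]
    while len(intervals) > n_intervals:
        # Gap between the end of each interval and the start of the next one.
        gaps = [
            intervals[i + 1][0] - intervals[i][1]
            for i in range(len(intervals) - 1)
        ]
        i = gaps.index(min(gaps))
        intervals[i][1] = intervals[i + 1][1]  # absorb the right-hand neighbor
        del intervals[i + 1]
    return [tuple(iv) for iv in intervals]


data = [1, 2, 3, 10, 11, 12, 30, 31]
bins = bottom_up_discretize(data, 3)
```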

Chapterwise Multiple Choice Questions on Data Mining
Our MCQs cover all topics of the Data Mining subject. This will help you to
prepare for exams, contests, online tests, quizzes, viva-voce, interviews, and
certifications. You can practice these MCQs chapter by chapter starting from the
1st chapter, or you can jump to any chapter of your choice.

1. Data Mining Basics
2. Data Exploration and Analysis
3. Data Preprocessing

If you would like to learn "Data Mining" thoroughly, you should attempt to work
on the complete set of 1000+ MCQs - multiple choice questions and answers
mentioned above. It will immensely help anyone trying to crack an exam or an
interview.
Wish you the best in your endeavor to learn and master Data Mining!

If you find a mistake in question / option / answer, kindly take a screenshot and
email to [email protected]
