Solved MCQs of Clustering in Data Mining With Answers
1-50
1) Which of the following refers to the problem of finding abstracted patterns (or structures)
in the unlabeled data?
a. Supervised learning
b. Unsupervised learning
c. Hybrid learning
d. Reinforcement learning
Answer: b
2) Which one of the following refers to querying the unstructured textual data?
a. Information access
b. Information update
c. Information retrieval
d. Information manipulation
Answer: c
3) Which of the following can be considered as the correct process of Data Mining?
Answer: a
4) Which of the following is an essential process in which the intelligent methods are
applied to extract data patterns?
a. Warehousing
b. Data Mining
c. Text Mining
d. Data Selection
Answer: b
Answer: a
a. Science of making machines perform tasks that would require intelligence
when performed by humans.
b. A computational procedure that takes some values as input and produces some
values as the output.
c. It uses machine learning techniques, in which programs learn from their past
experience and adapt themselves to new conditions or situations.
d. All of the above.
Answer: c
7) For what purpose do the analysis tools pre-compute the summaries of the huge amount of
data?
Answer: d
Explanation:
Whenever a query is fired, the response should be returned as quickly as possible. So, to
speed up the query response, the analysis tools pre-compute summaries of the huge amount
of data. To understand this in more detail, consider the following example:
Suppose that to get some information about something, you type a keyword into Google
search. Google's analytical tools have already pre-computed summaries of large amounts of
data, so they can provide a quick output related to the keywords you have written.
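The idea of pre-computing summaries can be sketched in a few lines of Python. The sales
records and the function names below are hypothetical and chosen only for illustration;
real analysis tools rely on much more elaborate structures.

```python
from collections import defaultdict

# Hypothetical raw records: (region, amount). Illustrative data only.
SALES = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 30.0),
    ("east", 50.0),
    ("south", 20.0),
]


def build_summary(records):
    """Scan the raw data once and store the total amount per region."""
    totals = defaultdict(float)
    for region, amount in records:
        totals[region] += amount
    return dict(totals)


def total_for(summary, region):
    """Answer a query from the summary without rescanning the raw data."""
    return summary.get(region, 0.0)


# The summary is built ahead of time, before any query arrives.
summary = build_summary(SALES)
print(total_for(summary, "north"))
```

Each query is then a single dictionary lookup, no matter how many raw records were
summarized.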
Answer: d
Explanation: In data mining, several functionalities are used for performing different
types of tasks. The common functionalities used in data mining are cluster analysis,
prediction, characterization, and evolution analysis. Association and correlation
analysis and classification are also important functionalities of data mining.
a. Hierarchical
b. Naive Bayes
c. Partitional
d. None of the above
Answer: a
Explanation: In the above-given diagram, the hierarchical type of clustering is used.
Hierarchical clustering categorizes data at a variety of scales by building a cluster
tree. So the correct answer is A.
10) Which of the following statements is correct about hierarchical clustering?
a. The hierarchical type of clustering is also known as HCA
b. The choice of an appropriate metric can influence the shape of the cluster
c. In general, the splits and merges both are determined in a greedy manner
d. All of the above
Answer: d
Explanation: All the statements given in the above question are correct, so the
correct answer is D.
11) Which one of the following can be considered as the final output of the hierarchical type
of clustering?
a. A tree that displays how close things are to each other
b. Assignment of each point to clusters
c. Finalize estimation of cluster centroids
d. None of the above
Answer: a
12) Which one of the following statements about the K-means clustering is incorrect?
a. The goal of k-means clustering is to partition n observations into k clusters
b. K-means clustering can be defined as the method of quantization
c. The nearest neighbor is the same as the K-means
d. All of the above
Answer: c
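Explanation: K-means and k-nearest neighbors have nothing to do with each other. K-means
is an unsupervised clustering method, while k-nearest neighbors is a supervised method
that classifies a point by the labels of its closest neighbors.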
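The difference between the two methods can be seen in a minimal sketch. The 1-D data,
the starting centroids, and the function names are assumptions made for illustration;
this is not a production implementation of either algorithm.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Plain k-means on 1-D data: no labels are used (unsupervised)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


def knn_predict(labeled, x, k=3):
    """k-nearest neighbors: needs labeled examples (supervised)."""
    neighbors = sorted(labeled, key=lambda item: abs(item[0] - x))[:k]
    votes = [label for _, label in neighbors]
    return max(set(votes), key=votes.count)


# Hypothetical data: two well-separated groups on a line.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids, clusters)

labeled = [(1.0, "low"), (2.0, "low"), (3.0, "low"),
           (10.0, "high"), (11.0, "high"), (12.0, "high")]
print(knn_predict(labeled, 2.5))
```

K-means receives only the points and returns groups, whereas the nearest-neighbor
function cannot run without the labels attached to each training point.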
a. Hierarchical clustering can primarily be used for the aim of exploration
b. Hierarchical clustering should not be primarily used for the aim of exploration
c. Both A and B
d. None of the above
Answer: a
14) Which one of the following clustering techniques needs the merging approach?
a. Partitioned
b. Naïve Bayes
c. Hierarchical
d. Both A and C
Answer: c
Explanation: Hierarchical clustering is one of the most commonly used methods to analyze
social network data. In this type of clustering method, multiple nodes are compared with
each other on the basis of their similarities, and several larger groups are formed by
merging the nodes or groups of nodes that have similar characteristics.
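The merging approach described above can be sketched as follows. The sketch assumes 1-D
points, uses the absolute difference as the similarity measure, and uses single linkage
(distance between the closest members); these are illustrative choices, and the function
name is hypothetical.

```python
def agglomerative_1d(points):
    """Single-linkage agglomerative clustering on 1-D points.

    Returns the merges in the order they happen, which is the
    information a cluster tree (dendrogram) displays.
    """
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dist = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        dist, i, j = best
        merges.append((clusters[i], clusters[j], dist))
        merged = sorted(clusters[i] + clusters[j])
        # Greedy merge: replace the two clusters by their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


merges = agglomerative_1d([1.0, 2.0, 6.0, 7.0, 20.0])
for left, right, dist in merges:
    print(left, "+", right, "at distance", dist)
```

Every step merges the two most similar groups, so the process starts with one cluster per
point and ends with a single cluster that contains all points.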
15) Self-organizing maps can also be considered an instance of the _________ type of
learning.
a. Supervised learning
b. Unsupervised learning
c. Missing data imputation
d. Both A & C
Answer: b
Explanation: The Self Organizing Map (SOM), or the Self Organizing Feature Map is a
kind of Artificial Neural Network which is trained through unsupervised learning.
16) The following statement can be considered an example of _________:
Suppose one wants to predict the number of newborns according to the size of the storks'
population by performing supervised learning.
Answer: c
17) In the example predicting the number of newborns, the final number of total newborns
can be considered as the _________
a. Features
b. Observation
c. Attribute
d. Outcome
Answer: d
Explanation: In the example of predicting the total number of newborns, the quantity
being predicted is the outcome. The final number of total newborns is therefore the
outcome, while the size of the storks' population is the feature used to predict it.
a. It is a measure of accuracy
b. It is a subdivision of a set
c. It is the task of assigning a classification
d. None of the above
Answer: b
Explanation: The term "classification" refers to the division of the given data into
certain sub-classes or groups according to their similarities or on the basis of a
specific given set of rules.
Answer: d
a. 5
b. 4
c. 2
d. 3
Answer: c
Explanation: There are only two categories of functions included in data mining:
Descriptive, and Classification and Prediction. Therefore the correct answer is C.
21) Which of the following can be considered as the classification or mapping of a set or
class with some predefined group or classes?
a. Data set
b. Data Characterization
c. Data Sub Structure
d. Data Discrimination
Answer: d
22) The analysis performed to uncover interesting statistical correlations between
associated attribute-value pairs is known as the _______.
a. Mining of association
b. Mining of correlation
c. Mining of clusters
d. All of the above
Answer: b
a. Evaluation Analysis
b. Outlier Analysis
c. Classification
d. Prediction
Answer: b
Explanation: An outlier may be defined as an object that doesn't comply with the general
behavior or with the model of the available data.
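One simple way to flag such objects is sketched below. It assumes the "general behavior"
is described by the mean and standard deviation of the values, and it treats anything
more than two standard deviations from the mean as an outlier; both the rule and the
threshold are illustrative assumptions, not the only way to perform outlier analysis.

```python
import statistics


def find_outliers(values, threshold=2.0):
    """Return the values that lie far from the mean of the data."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        # All values are identical, so nothing deviates from the rest.
        return []
    return [v for v in values if abs(v - mean) / spread > threshold]


# Hypothetical measurements: one value clearly departs from the others.
readings = [10, 11, 9, 10, 12, 11, 10, 50]
print(find_outliers(readings))
```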
24) Which one of the following statements is not correct about data cleaning?
Answer: d
Explanation: Data cleaning is a process that is applied to a data set to remove noise
(noisy data) and inconsistent data from the given data. It also involves transformation,
in which wrong data is corrected. In other words, data cleaning is a pre-processing step
in which the given set of data is prepared for the data warehouse.
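A minimal sketch of such a cleaning step is given below. The record layout, the field
names, and the validity rule for ages are hypothetical and serve only to illustrate
removing noisy rows and correcting inconsistent values.

```python
def clean_records(records):
    """Drop noisy or incomplete rows and normalize inconsistent values."""
    cleaned = []
    for row in records:
        name, age, city = row.get("name"), row.get("age"), row.get("city")
        # Remove noise: missing fields or impossible ages.
        if not name or age is None or not (0 <= age <= 120):
            continue
        # Transform inconsistent spellings of the same value into one form.
        city = (city or "unknown").strip().lower()
        cleaned.append({"name": name.strip(), "age": age, "city": city})
    return cleaned


RAW = [
    {"name": "Ann", "age": 34, "city": " Delhi "},
    {"name": "Bob", "age": -5, "city": "Pune"},    # impossible age (noise)
    {"name": "", "age": 41, "city": "Pune"},       # missing name
    {"name": "Cid", "age": 29, "city": "DELHI"},   # inconsistent casing
]

cleaned = clean_records(RAW)
print(cleaned)
```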
25) The classification of the data mining system involves:
a. Database technology
b. Information Science
c. Machine learning
d. All of the above
Answer: d
26) In order to integrate heterogeneous databases, how many types of approaches are
there in the data warehousing?
a. 3
b. 4
c. 5
d. 2
Answer: d
27) Issues like the efficiency and scalability of data mining algorithms come under _______
a. Performance issues
b. Diverse data type issues
c. Mining methodology and user interaction
d. All of the above
Answer: a
28) Which of the following is the correct advantage of the Update-Driven Approach?
Answer: c
Explanation: The statements given in both A and B are advantages of the Update-Driven
Approach in Data Warehousing. So the correct answer is C.
29) Which of the following statements about the query tools is correct?
Answer: a
Explanation: The query tools are used to query the database. Or we can also say that
these tools are generally used to get only the necessary information from the entire
database.
30) Which one of the following correctly defines the term cluster?
Answer: a
Explanation: The term "cluster" refers to a set of similar objects or items that differ
significantly from the other available objects. In other words, we can understand
clustering as forming groups of objects with similar characteristics from all available
objects. Therefore the correct answer is A.
a. This takes only two values. In general, these values will be 0 and 1, and they can
be coded as one bit
b. The natural environment of a certain species
c. Systems that can be used without knowledge of internal operations
d. All of the above
Answer: a
Explanation: In general, a binary attribute takes only two values, 0 and 1, and these
values can be coded as one bit. So the correct answer is A.
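The one-bit coding can be illustrated with a short sketch that stores several binary
attributes in a single integer. The function names and the bit order (first attribute in
the lowest bit) are assumptions made for this example.

```python
def pack_bits(flags):
    """Pack a list of 0/1 attribute values into one integer, one bit each."""
    packed = 0
    for position, flag in enumerate(flags):
        if flag not in (0, 1):
            raise ValueError("a binary attribute takes only the values 0 and 1")
        packed |= flag << position
    return packed


def unpack_bits(packed, count):
    """Recover the original list of 0/1 values from the packed integer."""
    return [(packed >> position) & 1 for position in range(count)]


# Hypothetical record with four binary attributes.
attributes = [1, 0, 1, 1]
packed = pack_bits(attributes)
print(packed, unpack_bits(packed, len(attributes)))
```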
Answer: c
Explanation: Data selection can be defined as the stage in which the correct data is
selected for a phase of the knowledge discovery process (or KDD process). Therefore the
correct answer is C.
33) Which one of the following correctly refers to the task of the classification?
Answer: b
Explanation: The task of classification refers to dividing the set into subsets or in the
numbers of the classes. Therefore the correct answer is C.
a. Approach to the design of learning algorithms that is structured along the lines
of the theory of evolution.
b. Decision support systems that contain an information base filled with the
knowledge of an expert formulated in terms of if-then rules.
c. Combining different types of method or information
d. None of these
Answer: c
Explanation: The term "hybrid" refers to merging two objects to form a single object
that contains features of the combined objects.
a. It is hidden within a database and can only be recovered if one is given certain
clues (an example is encrypted information).
b. An extremely complex molecule that occurs in human chromosomes and that
carries genetic information in the form of genes.
c. It is a kind of process of extracting implicit, previously unknown and potentially
useful information from data
d. None of the above
Answer: c
Explanation: The term "discovery" means to find something new that was not previously
known. It can also be interpreted as a process of extracting underlying, previously
unknown and potentially useful information from data.
Answer: c
37) Which one of the following can be considered as the correct application of the data
mining?
a. Fraud detection
b. Corporate Analysis & Risk management
c. Management and market analysis
d. All of the above
Answer: d
38) Which one of the following correctly refers to the Class study in data characterization?
a. Final class
b. Study class
c. Target class
d. Both A and C
Answer: c
Explanation: In data characterization, the study class generally refers to the target
class, that is, the class whose data is being summarized.
39) Which of the following refers to a sequence of patterns that occurs frequently?
a. Frequent sub-sequence
b. Frequent sub-structure
c. Frequent sub-items
d. All of the above
Answer: a
40) Which one of the following refers to modeling the regularities or trends of objects
whose behavior changes over time?
a. Prediction
b. Evolution analysis
c. Classification
d. Both A and B
Answer: b
Explanation: In general, the evolution analysis refers to the model regularities or the
object trends that vary with change in time.
41) Issues like "handling the relational and complex types of data" come under which of
the following categories?
Answer: a
Explanation: A database often contains multiple types of data, complex objects, temporal
data, etc., so it is not possible for a single system to mine all kinds of data.
Therefore this type of issue comes under the category Diverse Data type. So the correct
answer is A.
42) Which of the following is also used as the first step in the knowledge discovery process?
a. Data selection
b. Data cleaning
c. Data transformation
d. Data integration
Answer: b
43) Which of the following refers to the step of the knowledge discovery process in which
several data sources are combined?
a. Data selection
b. Data cleaning
c. Data transformation
d. Data integration
Answer: d
Answer: d
Explanation: All statements given in the above question are drawbacks of the query-
driven approach. Therefore the correct answer is D.
45) Which of the following correctly refers to the term "Data Independence"?
a. It means that the programs are not dependent on the logical attributes
b. It refers to that data that is defined separately, not included in the program
c. It means that the programs are not dependent on the physical attributes of
data
d. Both A and C
Answer: d
Explanation: The term "Data Independence" means that the programs are dependent
neither on the physical attributes of data nor on the logical attributes of
data.
46) Which of the following is generally used by the E-R model to represent the weak
entities?
a. Diamond
b. Doubly outlined rectangle
c. Dotted rectangle
d. Both B & C
Answer: b
a. It can be referred to as a system that can be used without knowledge of the
internal operations
b. It refers to the natural environment of a specific species
c. It takes only two values at most, that is, 0 and 1
d. All of the above
Answer: a
Explanation: A black box is a system that can be used without knowledge of its internal
operations. So the correct answer is A.
48) Which one of the following issues must be considered before investing in data mining?
a. Compatibility
b. Functionality
c. Vendor consideration
d. All of the above
Answer: d
Answer: c
Explanation: The term "DMQL" refers to the Data Mining Query Language. Therefore
the correct answer is C.
50) In certain cases, it is not clear what kind of pattern needs to be found; data mining
should _________:
Answer: c
Explanation: In some data mining operations it is not clear what kind of pattern needs to
be found. In such cases the user can guide the data mining process, because the user has
a good sense of which type of pattern is wanted. By setting up some rules, the user can
eliminate the discovery of all other non-required patterns and focus the process on
finding only the required pattern. Therefore the correct answer is C.