Data_Mining_QA

The document contains a series of questions and answers related to data mining concepts, including decision trees, clustering techniques, and knowledge discovery processes. Key topics include definitions of terms such as numerical attributes, information retrieval, and the KDD process, as well as methods like supervised and unsupervised learning. It also discusses algorithms like CART and K-means clustering, emphasizing their applications and characteristics.

Uploaded by

Ay

Data Mining - Questions and Answers

1. A __________ is a classification scheme which generates a tree and a set of rules
representing the model of different classes from a given data set

Answer: Decision tree

2. An attribute whose domain is numerical is called a __________

Answer: Numerical attribute

3. The __________ split is defined as one that does the best job of separating the records into
groups

Answer: Best

4. The initial step in the process of knowledge discovery is:

Answer: Data Selection

5. Which of the following is generally used in finding hidden structure and patterns in a
given unlabeled data?

Answer: Unsupervised learning

6. Which of the following refers to obtaining information from unstructured textual data?

Answer: Information retrieval

7. KDD stands for?

Answer: Knowledge Discovery in Databases

8. Which of the following statements is true about classification?

Answer: It is the task of assigning a classification


9. Which of the following is defined as the Euclidean distance measure?

Answer: Neither (a) nor (b)
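
Note: the answer options (a) and (b) are not reproduced in this document. For reference, the Euclidean distance between two points is usually defined as the square root of the sum of squared coordinate differences. A minimal sketch, where the function name `euclidean_distance` is an illustrative choice and not taken from the source:

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two numeric points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    # Square each coordinate difference, sum them, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```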

10. _____________ can be considered as the correct application of data mining.

Answer: All of the above

11. The total categories of functions that are involved in Data Mining are:

Answer: 5

12. Which one of the clustering techniques needs the merging approach?

Answer: Hierarchical
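
Note: the merging approach mentioned here is the bottom-up (agglomerative) form of hierarchical clustering: every point starts as its own cluster and the two closest clusters are merged repeatedly. The sketch below is a simplified illustration on one-dimensional data with single linkage; the function name and the stopping rule (a target number of clusters) are assumptions made for the example.

```python
def agglomerative_1d(points, target_clusters):
    """Merge the two closest clusters until target_clusters remain."""
    if not 1 <= target_clusters <= len(points):
        raise ValueError("target_clusters must be between 1 and len(points)")
    # Start with one cluster per point, kept in sorted order.
    clusters = [[p] for p in sorted(points)]
    while len(clusters) > target_clusters:
        # In sorted 1-D data the closest pair of clusters is always adjacent,
        # so only the gaps between neighbouring clusters need checking.
        gaps = [clusters[i + 1][0] - clusters[i][-1]
                for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))
        # Merge cluster i with its right-hand neighbour.
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters

print(agglomerative_1d([1, 2, 9, 10, 25], 3))  # [[1, 2], [9, 10], [25]]
```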

13. Which of these is correct about data mining?

Answer: All of the above

14. Clustering is also called:

Answer: All the above

15. Decision trees are also known as CART. What is CART?

Answer: Classification and Regression Trees

16. What are the advantages of Classification and Regression Trees (CART)?

Answer: All of the above

17. A decision tree is a ______ algorithm

Answer: Supervised learning

18. Decision tree can be used for ______

Answer: Both (classification and regression)

19. What is the maximum depth in a decision tree?

Answer: The length of the longest path from a root to a leaf
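
Note: a minimal sketch of this definition, assuming a tree stored as nested dictionaries with a "children" list (a hypothetical representation chosen for the example). Depth is counted here in edges, so a tree consisting of a single leaf has depth 0; some texts count nodes instead.

```python
def max_depth(node):
    """Length, in edges, of the longest path from this node down to a leaf."""
    children = node.get("children", [])
    if not children:
        return 0  # a leaf ends the path
    return 1 + max(max_depth(child) for child in children)

# Root with two children; the second child has one child of its own.
tree = {"children": [
    {"children": []},
    {"children": [{"children": []}]},
]}
print(max_depth(tree))  # 2
```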

20. Suppose your target variable is the price of a house using Decision Tree. What type of
tree do you need to predict the target variable?

Answer: Regression tree

21. What is splitting in the decision tree?

Answer: Dividing a node into two or more sub-nodes based on if-else conditions

22. What is a leaf or terminal node in the decision tree?

Answer: The end of the decision tree where it cannot be split into further sub-nodes.

23. What is pruning in a decision tree?

Answer: Removing a sub-node from the tree

24. In a decision tree algorithm, entropy helps to determine a feature or attribute that gives
maximum information about a class which is called _____.

Answer: Information gain
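
Note: information gain is commonly computed as the entropy of the parent node minus the size-weighted entropy of the child nodes produced by a split. The sketch below illustrates that calculation; the function names and the toy labels are assumptions made for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(parent, subsets):
    """Parent entropy minus the size-weighted entropy of the split subsets."""
    total = len(parent)
    weighted = sum(len(s) / total * entropy(s) for s in subsets)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
# A split that separates the classes perfectly removes all uncertainty.
print(information_gain(parent, [["yes", "yes"], ["no", "no"]]))  # 1.0
```

A split that leaves each subset as mixed as the parent yields a gain of 0, which is why the attribute with the highest gain is preferred.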

25. In Decision Trees, for predicting a class label, the algorithm starts from which node of
the tree?

Answer: Root
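
Note: the traversal from the root can be sketched as follows. The tree layout (dictionaries with "feature", "threshold", "left", "right" and "label" keys) and the example feature name are hypothetical, chosen only to illustrate the idea.

```python
def predict(node, record):
    """Walk from the root to a leaf, following one branch at each test."""
    while "label" not in node:  # internal nodes hold a test, leaves hold a label
        goes_left = record[node["feature"]] <= node["threshold"]
        node = node["left"] if goes_left else node["right"]
    return node["label"]

# Hypothetical one-split tree on a single numeric feature.
root = {
    "feature": "income",
    "threshold": 50,
    "left": {"label": "low"},
    "right": {"label": "high"},
}
print(predict(root, {"income": 72}))  # high
```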

26. CART uses the ____________ for determining the best split.

Answer: Gini index
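
Note: the Gini index of a node is commonly computed as 1 minus the sum of squared class proportions, and a candidate split is scored by the size-weighted Gini of its two children (lower is better). A minimal sketch, with function names chosen for illustration:

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure node, higher for mixed nodes."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def weighted_gini(left, right):
    """Score of a binary split: child impurities weighted by child size."""
    total = len(left) + len(right)
    return len(left) / total * gini(left) + len(right) / total * gini(right)

print(gini(["a", "a", "b", "b"]))             # 0.5
print(weighted_gini(["a", "a"], ["b", "b"]))  # 0.0
```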

27. Which of the following is an essential process in which intelligent methods are applied
to extract data patterns?

Answer: Data Mining

28. Which one of the following statements is TRUE for a Decision Tree?

Answer: In a decision tree, entropy determines purity.

29. How do you choose the right node while constructing a decision tree?

Answer: An attribute having the highest information gain.

30. A __________ is a classification scheme which generates a tree and a set of rules
representing the model of different classes from a given data set

Answer: Decision tree

31. An attribute whose domain is numerical is called a __________

Answer: Numerical attribute

32. K-means clustering requires prior knowledge about the number of clusters required as
its input

Answer: True
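
Note: the sketch below shows why the number of clusters is an input: k fixes how many centroids the algorithm maintains. It is a simplified one-dimensional illustration, assuming the first k points as initial centroids and a fixed number of iterations; practical implementations typically use randomized initialization and a convergence check.

```python
def kmeans_1d(points, k, iterations=10):
    """Basic k-means on 1-D data; k must be supplied by the caller."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption for this sketch: seed the centroids with the first k points.
    centroids = [float(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

centroids, clusters = kmeans_1d([1, 2, 3, 10, 11, 12], k=2)
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1, 2, 3], [10, 11, 12]]
```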

33. Data mining is defined as?

Answer: The real discovery stage of a knowledge discovery process

34. CLARANS stands for

Answer: Clustering Large Applications based on Randomized Search

35. __________ is a classification scheme which generates a tree and a set of rules.

Answer: Decision tree

36. A ______________ database stores a large amount of space-related data, such as maps,
preprocessed remote sensing or medical imaging data etc.

Answer: Spatial Database

37. The method of arranging data into homogeneous classes according to the common
features present in the data is known as

Answer: Clustering

38. KDD process consists of _______ steps

Answer: 5

39. Which among the following is a Data Mining Algorithm?

Answer: All of the above

40. ___________ is defined as a process used to extract usable data from a larger set of any raw
data.

Answer: Data Mining

41. Which among the following is a data mining tool?

Answer: All of the above

42. _______ are the types of Data mining?

Answer: All the above

43. Web mining is the application of _______.

Answer: Data Mining & Text Mining

44. Web content mining describes the discovery of useful information from the _______
contents.

Answer: Web

45. _______________ describes the discovery of useful information from web contents.

Answer: Web content mining

46. Which one of the following can be defined as the data object which does not comply with
the general behavior (or the model of available data)?

Answer: Outlier (such objects are studied through outlier analysis)

47. In the example predicting the number of newborns, the final number of total newborns
can be considered as the _________

Answer: Outcome

48. The following statement can be considered an example of _________: Suppose one
wants to predict the number of newborns according to the size of storks' population by
performing supervised learning

Answer: Regression

49. Which of the following is an essential process in which intelligent methods are applied
to extract data patterns?

Answer: Data Mining

50. Multiple numbers of data sources get combined in which step of the Knowledge
Discovery?

Answer: Data integration
