Machine Learning Imp Questions

Uploaded by

golik


1. ______ is a method of data analysis that automates analytical model building.

A) Artificial Intelligence

B) Machine Learning
C) Data Science
D) Deep Learning
ANSWER= (B) Machine Learning
2.Machine Learning is a branch of______

A) Artificial Intelligence

B) Machine Learning
C) Data Science
D) Deep Learning
ANSWER= (A) Artificial Intelligence
3.Machine learning algorithms build a model based on sample data,
known as ______

A) Testing Data

B) Dummy Data
C) Training Data
D) None of the Above
ANSWER= (C) Training Data
4. Machine learning approaches are traditionally divided into ____ broad
categories

A) Five
B) Four
C) Three
D) Two
ANSWER= (C) Three
5.The term machine learning was coined by ______
A) Tom M. Mitchell
B) Alan Turing
C) Arthur Samuel
D) None of the Above
ANSWER= (C) Arthur Samuel
6.The term machine learning was coined in ______

A) 1958
B) 1957
C) 1960
D) 1959
ANSWER= (D) 1959
7. A representative book of machine learning research during the 1960s
was ____ on Learning Machines.

A) Arthur Samuel

B) Wilsson's book
C) Nilsson's book
D) Tom M. Mitchell
ANSWER= (C) Nilsson's book
8._____uses many machine learning methods, but with different goals.

A) Data Mining

B) Artificial Intelligence
C) Data Science
D) Deep Learning
ANSWER= (A) Data Mining
9.The difference between optimization and machine learning arises from
the goal of______.

A) Optimization
B) unsupervised learning
C) Generalization
D) supervised learning
ANSWER= (C) Generalization
10. According to _____, the ideas of machine learning, from
methodological principles to theoretical tools, have had a long
pre-history in statistics.

A) Alan Turing

B) Tom M. Mitchell
C) Arthur Samuel
D) Michael I. Jordan
ANSWER= (D) Michael I. Jordan
11. How do you handle missing or corrupted data in a dataset?

A. Drop missing rows or columns

B. Replace missing values with mean/median/mode
C. Assign a unique category to missing values
D. All of the above
Answer : D
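
A minimal sketch of two of the strategies listed in question 11 (dropping incomplete rows and filling gaps with the mean or median), using only the Python standard library. The function names `impute` and `drop_missing` and the sample values are hypothetical, and missing entries are assumed to be encoded as `None`.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]


def drop_missing(rows):
    """Drop every row that contains at least one None."""
    return [row for row in rows if None not in row]


# Hypothetical column with two missing entries.
ages = [20, None, 30, 40, None]
filled = impute(ages)  # gaps are filled with the mean of 20, 30 and 40
complete_rows = drop_missing([(1, 2), (None, 3), (4, 5)])
```

Which strategy is appropriate depends on how much data is missing and why; dropping rows is simplest but discards information.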

12. The most widely used metrics and tools to assess a

classification model are:

A. Confusion matrix
B. Cost-sensitive accuracy
C. Area under the ROC curve
D. All of the above
Answer : D
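
As an illustration of the first metric in question 12, the sketch below counts the four cells of a binary confusion matrix and derives precision and recall from them. The helper names and the label encoding (1 = positive) are assumptions made for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision and recall, defined as 0.0 when the denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels and predictions.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(tp, fp, fn)
```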

13. Which of the following is not one of the categories in a model of
language?

A. Language units
B. Structural units
C. Role structure of units
D. System constraints
Answer : B

14. Suppose we would like to perform clustering on spatial data

such as the geometrical locations of houses. We wish to produce
clusters of many different sizes and shapes. Which of the following
methods is the most appropriate?

A. Decision Trees
B. Model-based clustering
C. K-means clustering
D. Density-based clustering
Answer : D

15. Which of the following is a disadvantage of decision trees?

A. Factor analysis
B. Decision trees are robust to outliers
C. Decision trees are prone to be overfit
D. None of the above
Answer : C

16. Which of the following is true about Naive Bayes?

A. Assumes that all the features in a dataset are equally important

B. Assumes that all the features in a dataset are independent
C. Both A and B
D. None of the above options
Answer : C

17. Among the following which is not a horn clause?

A. p → ¬q
B. p
C. p → q
D. ¬p ∨ q
Answer : A
18. Which of the following techniques cannot be used for
normalization in text mining?

A. Stop Word Removal

B. Stemming
C. Lemmatization
D. None of the above
Answer : A

19. Which of the following is a reasonable way to select the number

of principal components “k”?

A. Choose k to be the smallest value so that at least 99% of the
variance is retained
B. Use the elbow method
C. Choose k to be 99% of m (k = 0.99*m, rounded to the nearest
integer)
D. Choose k to be the largest value so that 99% of the variance is
retained
Answer : A
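
The rule in option A of question 19 can be sketched as follows, assuming the eigenvalues of the covariance matrix (the variance explained by each principal component) have already been computed. The function name `smallest_k` and the eigenvalues are hypothetical.

```python
def smallest_k(eigenvalues, retain=0.99):
    """Smallest k whose top-k eigenvalues keep at least `retain` of the variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= retain:
            return k
    return len(ordered)


# Hypothetical eigenvalues: the first two components keep about 99.5% of the variance.
k = smallest_k([9.0, 0.95, 0.05])  # 2
```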

20. In which of the following cases will K-means clustering fail to

give good results?

1. Data points with outliers

2. Data points with different densities
3. Data points with nonconvex shapes
A. 1 & 2
B. 1, 2, & 3
C. 2 & 3
D. 1 & 3
Answer : B

21) What is machine learning?
A. Machine learning is the science of getting computers to act without being explicitly programmed.
B. Machine learning is a form of AI that enables a system to learn from data.
C. Both A and B
D. None of the above

22) Machine learning is an application of ___________.
A. Blockchain
B. Artificial Intelligence
C. Both A and B
D. None of the above

23) An application of machine learning is __________.
A. Email filtering
B. Sentiment analysis
C. Face recognition
D. All of the above

24) The term machine learning was coined in which year?
A. 1958
B. 1959
C. 1960
D. 1961

25) Machine learning approaches can be traditionally categorized into ______ categories.
A. 3
B. 4
C. 7
D. 9

26) The categories into which machine learning approaches can be traditionally divided are ______.
A. Supervised learning
B. Unsupervised learning
C. Reinforcement learning
D. All of the above

27) __________ are the machine learning algorithms that can be used with labeled data.
A. Regression algorithms
B. Clustering algorithms
C. Association algorithms
D. All of the above

28) __________ are the machine learning algorithms that can be used with unlabeled data.
A. Regression algorithms
B. Clustering algorithms
C. Instance-based algorithms
D. All of the above

29) Real-world machine learning use cases include _______.
A. Digital assistants
B. Chatbots
C. Fraud detection
D. All of the above

30) Which among the following algorithms are used in machine learning?
A. Naive Bayes
B. Support Vector Machines
C. K-Nearest Neighbors
D. All of the above

31) __________ are the techniques of keyword normalization.
A. Lemmatization
B. Stemming
C. Both A and B
D. None of the above

32) Replacing missing values with the mean/median/mode helps to handle missing or corrupted data in a dataset. True/False?
A. True
B. False

33) ________ is a disadvantage of decision trees.
A. Decision trees are robust to outliers
B. Decision trees are prone to overfitting
C. Both A and B
D. None of the above

34) ________ is a part of machine learning that works with neural networks.
A. Artificial intelligence
B. Deep learning
C. Both A and B
D. None of the above

35) Overfitting is a type of modelling error which results in the failure to predict future observations effectively or fit additional data in the existing model. Yes/No?
A. Yes
B. No
C. Maybe
D. Can't say

36) ________ is used as an input to the machine learning model for training and prediction purposes.
A. Feature
B. Feature vector
C. Both A and B
D. None of the above

37) _______ is the scenario in which the model fails to capture the underlying trend in the input data.
A. Overfitting
B. Underfitting
C. Both A and B
D. None of the above

38) Which language is best for machine learning?
A. C
B. Java
C. Python
D. HTML

39) Supervised learning problems can be grouped as _______.
A. Regression problems
B. Classification problems
C. Both A and B
D. None of the above

40) Unsupervised learning problems can be grouped as _______.
A. Clustering
B. Association
C. Both A and B
D. None of the above

41) Automatic Speech Recognition systems find a wide variety of applications in the _________ domains.
A. Medical assistance
B. Industrial robotics
C. Defence & aviation
D. All of the above

42) The term machine learning was coined by __________.
A. James Gosling
B. Arthur Samuel
C. Guido van Rossum
D. None of the above

43) Machine learning can automate many tasks, especially the ones that only humans can perform with their innate intelligence.
A. True
B. False

44) Features of machine learning are ______.
A. Automation
B. Improved customer experience
C. Business intelligence
D. All of the above

45) Which machine learning models are trained to make a series of decisions based on the rewards and feedback they receive for their actions?
A. Supervised learning
B. Unsupervised learning
C. Reinforcement learning
D. All of the above


All Unit MCQ questions of ML

By thecodingshef
MCQ Question of Machine learning


1. What is Machine Learning (ML)?
A. The autonomous acquisition of knowledge through the use of
manual programs
B. The selective acquisition of knowledge through the use of
computer programs
C. The selective acquisition of knowledge through the use of
manual programs
D. The autonomous acquisition of knowledge through the use of
computer programs
Correct option is D

2. Father of Machine Learning (ML)

A. Geoffrey Chaucer
B. Geoffrey Hill
C. Geoffrey Everest Hinton
D. None of the above
Correct option is C

3. Which is FALSE regarding regression?

A. It may be used for interpretation
B. It is used for prediction
C. It discovers causal relationships
D. It relates inputs to outputs
Correct option is C

4. Choose the correct option regarding machine learning (ML)

and artificial intelligence (AI)
A. ML is a set of techniques that turns a dataset into a software
B. AI is a software that can emulate the human mind
C. ML is an alternate way of programming intelligent machines
D. All of the above
Correct option is D

5. Which of the following is not a factor that affects the
performance of a learner system?
A. Good data structures
B. Representation scheme used
C. Training scenario
D. Type of feedback
Correct option is A

6. In general, to have a well-defined learning problem, we must
identify which of the following?
A. The class of tasks
B. The measure of performance to be improved
C. The source of experience
D. All of the above
Correct option is D

7. Successful applications of ML
A. Learning to recognize spoken words
B. Learning to drive an autonomous vehicle
C. Learning to classify new astronomical structures
D. Learning to play world-class backgammon
E. All of the above
Correct option is E

8. Which of the following is not a learning method?
A. Analogy
B. Introduction
C. Memorization
D. Deduction
Correct option is B
9. In language understanding, which of the following is not a
level of knowledge?
A. Empirical
B. Logical
C. Phonological
D. Syntactic
Correct option is A

10. Designing a machine learning approach involves:-

A. Choosing the type of training experience
B. Choosing the target function to be learned
C. Choosing a representation for the target function
D. Choosing a function approximation algorithm
E. All of the above
Correct option is E

11. Concept learning infers a ______-valued function
from training examples of its input and output.
A. Decimal
B. Hexadecimal
C. Boolean
D. All of the above
Correct option is C

12. Which of the following is not a supervised learning?

A. Naive Bayesian
B. PCA
C. Linear Regression
D. Decision Tree
Correct option is B

13. What is Machine Learning?

(i) Artificial Intelligence
(ii) Deep Learning
(iii) Data Statistics
A. Only (i)
B. (i) and (ii)
C. All
D. None
Correct option is B
14. What kind of learning algorithm is used for “facial identities or
facial expressions”?
A. Prediction
B. Recognizing Patterns
C. Generating Patterns
D. Recognizing Anomalies
Correct option is B

15. Which of the following is not a type of learning?

A. Unsupervised Learning
B. Supervised Learning
C. Semi-unsupervised Learning
D. Reinforcement Learning
Correct option is C

16. Real-time decisions, game AI, learning tasks, skill
acquisition, and robot navigation are applications of which of
the following?
A. Supervised Learning: Classification
B. Reinforcement Learning
C. Unsupervised Learning: Clustering
D. Unsupervised Learning: Regression
Correct option is B

17. Targeted marketing, recommender systems, and
customer segmentation are applications of which of the
following?
A. Supervised Learning: Classification
B. Unsupervised Learning: Clustering
C. Unsupervised Learning: Regression
D. Reinforcement Learning
Correct option is B

18. Fraud Detection, Image Classification, Diagnostic, and

Customer Retention are applications in which of the following
A. Unsupervised Learning: Regression
B. Supervised Learning: Classification
C. Unsupervised Learning: Clustering
D. Reinforcement Learning
Correct option is B
19. Which of the following is not a symbolic function
representation in machine learning?
A. Rules in propositional logic
B. Hidden-Markov Models (HMM)
C. Rules in first-order predicate logic
D. Decision Trees
Correct option is B

20. Which of the following is not a numerical function
representation in machine learning?
A. Neural Network
B. Support Vector Machines
C. Case-based
D. Linear Regression
Correct option is C

21. The FIND-S algorithm starts from the most specific
hypothesis and generalizes it by considering only ______ examples.
A. Negative
B. Positive
C. Negative or Positive
D. None of the above
Correct option is B

22. The FIND-S algorithm ignores ______ examples.

A. Negative
B. Positive
C. Both
D. None of the above
Correct option is A
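
The behaviour described in questions 21 and 22 can be sketched as below: a minimal FIND-S for conjunctive hypotheses over discrete attributes, where `"?"` means "any value". The function name, the `None` encoding of the most specific hypothesis, and the weather-style training data are assumptions for illustration.

```python
def find_s(examples):
    """examples: list of (attribute_tuple, is_positive) pairs."""
    hypothesis = None  # stands for the most specific hypothesis (covers nothing)
    for attributes, is_positive in examples:
        if not is_positive:
            continue  # FIND-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # first positive example is copied as-is
        else:
            # Generalize each constraint only as far as needed to cover the example.
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis


# Illustrative training data (sky, temperature, humidity, wind, water, forecast).
training = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
learned = find_s(training)
```

Because negative examples are skipped, the result is only trustworthy if the training set is consistent, which is the drawback raised in question 26.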

23. The Candidate-Elimination Algorithm represents the ______.

A. Solution Space
B. Version Space
C. Elimination Space
D. All of the above
Correct option is B

24. Inductive learning is based on the knowledge that if
something happens a lot it is likely to be generally true.
A. True
B. False
Correct option is A

25. Inductive learning takes examples and generalizes
rather than starting with ______ knowledge.
A. Inductive
B. Existing
C. Deductive
D. None of these
Correct option is B

26. A drawback of the FIND-S is that it assumes the

consistency within the training set
A. True
B. False
Correct option is A

27. What strategies can help reduce overfitting in decision
trees?
(i) Enforce a maximum depth for the tree
(ii) Enforce a minimum number of samples in leaf nodes
(iii) Pruning
(iv) Make sure each leaf node is one pure class
A. All
B. (i), (ii) and (iii)
C. (i), (iii), (iv)
D. None
Correct option is B

28. Which of the following is a widely used and effective

machine learning algorithm based on the idea of bagging?
A. Decision Tree
B. Random Forest
C. Regression
D. Classification
Correct option is B

29. To find the minimum or the maximum of a function, we

set the gradient to zero because which of the following
A. Depends on the type of problem
B. The value of the gradient at extrema of a function is always
zero
C. Both (A) and (B)
D. None of these
Correct option is B

30. Which of the following is a disadvantage of decision

trees?
A. Decision trees are prone to be overfit
B. Decision trees are robust to outliers
C. Factor analysis
D. None of the above
Correct option is A

31. What is perceptron?

A. A single layer feed-forward neural network with pre-
processing
B. A neural network that contains feedback
C. A double layer auto-associative neural network
D. An auto-associative neural network
Correct option is A

32. Which of the following is true for neural networks?

(i) The training time depends on the size of the network
(ii) Neural networks can be simulated on a conventional computer
(iii) Artificial neurons are identical in operation to biological ones
A. All
B. Only (ii)
C. (i) and (ii)
D. None
Correct option is C

34. What is Neuro software?

A. It is software used by Neurosurgeon
B. Designed to aid experts in real world
C. It is powerful and easy neural network
D. A software used to analyze neurons
Correct option is C

35. Which is true for neural networks?

A. Each node computes its weighted input
B. Node could be in excited state or non-excited state
C. It has set of nodes and connections
D. All of the above
Correct option is D

36. What is the objective of backpropagation algorithm?

37. Which of the following is true?

Single layer associative neural networks do not have the ability to:

(i) Perform pattern recognition
(ii) Find the parity of a picture
(iii) Determine whether two or more shapes in a picture are
connected or not
A. (ii) and (iii)
B. Only (ii)
C. All
D. None
Correct option is A
38. The backpropagation law is also known as generalized
delta rule
A. True
B. False
Correct option is A

38. Which of the following is true?

(i) On average, neural networks have higher computational
rates than conventional computers.
(ii) Neural networks learn by example.
(iii) Neural networks mimic the way the human brain works.
A. All
B. (ii) and (iii)
C. (i), (ii) and (iii)
D. None
Correct option is A

39. What is true regarding backpropagation rule?

40. There is feedback in final stage of backpropagation

A. True
B. False
Correct option is B

41. An auto-associative network is

A. A neural network that has only one loop
B. A neural network that contains feedback
C. A single layer feed-forward neural network with pre-
processing
D. A neural network that contains no loops
Correct option is B

42. A 3-input neuron has weights 1, 4 and 3. The transfer

function is linear with the constant of proportionality being
equal to 3. The inputs are 4, 8 and 5 respectively. What will
be the output?
A. 139
B. 153
C. 162
D. 160
Correct option is B
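
The arithmetic behind question 42 can be checked with a short sketch: the neuron forms the weighted sum of its inputs and the linear transfer function multiplies that sum by the constant of proportionality. The function name `linear_neuron` is hypothetical.

```python
def linear_neuron(inputs, weights, k):
    """Output of a neuron with a linear transfer function f(net) = k * net."""
    net = sum(w * x for w, x in zip(weights, inputs))  # weighted sum of the inputs
    return k * net


# Values from the question: net = 1*4 + 4*8 + 3*5 = 51, output = 3 * 51 = 153.
output = linear_neuron(inputs=[4, 8, 5], weights=[1, 4, 3], k=3)
```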

43. What of the following is true regarding backpropagation

rule?
A. Hidden layers output is not all important, they are only meant
for supporting input and output layers
B. Actual output is determined by computing the outputs of units
for each hidden layer
C. It is a feedback neural network
D. None of the above
Correct option is B

44. What is back propagation?

A. It is another name given to the curvy function in the
perceptron
B. It is the transmission of error back through the network to
allow weights to be adjusted so that the network can learn
C. It is another name given to the curvy function in the
perceptron
D. None of the above
Correct option is B

45. The general limitations of back propagation rule is/are

A. Scaling
B. Slow convergence
C. Local minima problem
D. All of the above
Correct option is D

46. What is the meaning of generalized in statement

“backpropagation is a generalized delta rule” ?
A. Because delta is applied to only input and output layers, thus
making it more simple and generalized
B. It has no significance
C. Because delta rule can be extended to hidden layer units
D. None of the above
Correct option is C
47. Neural networks are complex ______ functions with many
parameters.
A. Linear
B. Non-linear
C. Discrete
D. Exponential
Correct option is B

48. The general tasks that are performed with

backpropagation algorithm
A. Pattern mapping
B. Prediction
C. Function approximation
D. All of the above
Correct option is D

49. Backpropagation learning is based on gradient
descent along the error surface.
A. True
B. False
Correct option is A

50. In backpropagation rule, how to stop the learning

process?
A. No heuristic criteria exist
B. On basis of average gradient value
C. There is convergence involved
D. None of these
Correct option is B

51. Applications of NN (Neural Network)

A. Risk management
B. Data validation
C. Sales forecasting
D. All of the above
Correct option is D

52. The network that involves backward links from output

to the input and hidden layers is known as
A. Recurrent neural network
B. Self organizing maps
C. Perceptrons
D. Single layered perceptron
Correct option is A

53. Decision Tree is a display of an Algorithm?

A. True
B. False
Correct option is A

54. Which of the following is/are the decision tree nodes?

A. End Nodes
B. Decision Nodes
C. Chance Nodes
D. All of the above
Correct option is D

55. End Nodes are represented by which of the following

A. Solar street light
B. Triangles
C. Circles
D. Squares
Correct option is B

56. Decision Nodes are represented by which of the

following
A. Solar street light
B. Triangles
C. Circles
D. Squares
Correct option is D

57. Chance Nodes are represented by which of the

following
A. Solar street light
B. Triangles
C. Circles
D. Squares
Correct option is C

58. Advantage of Decision Trees

A. Possible Scenarios can be added
B. Use a white box model, if given result is provided by a model
C. Worst, best and expected values can be determined for
different scenarios
D. All of the above
Correct option is D

59. ______ terms are required for building a Bayes model.

A. 1
B. 2
C. 3
D. 4
Correct option is C

60. Which of the following is the consequence between a

node and its predecessors while creating bayesian network?
A. Conditionally independent
B. Functionally dependent
C. Both conditionally dependent & dependent
D. Dependent
Correct option is A

61. What is needed to make probabilistic systems feasible
in the world?
A. Feasibility
B. Reliability
C. Crucial robustness
D. None of the above
Correct option is C

62. Bayes rule can be used for:-

A. Solving queries
B. Increasing complexity
C. Answering probabilistic query
D. Decreasing complexity
Correct option is C

63. ______ provides ways and means of weighing up the
desirability of goals and the likelihood of achieving them.
A. Utility theory
B. Decision theory
C. Bayesian networks
D. Probability theory
Correct option is A
64. Which of the following is provided by a Bayesian
network?
A. Complete description of the problem
B. Partial description of the domain
C. Complete description of the domain
D. All of the above
Correct option is C

65. Probability provides a way of summarizing the ______ that comes
from our laziness and ignorance.

A. Belief
B. Uncertainty
C. Joint probability distributions
D. Randomness
Correct option is B

66. The entries in the full joint probability distribution can

be calculated as
A. Using variables
B. Both Using variables & information
C. Using information
D. All of the above
Correct option is C

67. Causal chain (For example, Smoking cause cancer)

gives rise to:-
A. Conditional independence
B. Conditional dependence
C. Both
D. None of the above
Correct option is A

68. The bayesian network can be used to answer any

query by using:-
A. Full distribution
B. Joint distribution
C. Partial distribution
D. All of the above
Correct option is B

69. Bayesian networks allow compact specification of:-

A. Joint probability distributions
B. Belief
C. Propositional logic statements
D. All of the above
Correct option is A

70. The compactness of the bayesian network can be

described by
A. Fully structured
B. Locally structured
C. Partially structured
D. All of the above
Correct option is B

71. The Expectation-Maximization Algorithm has been

used to identify conserved domains in unaligned proteins
only. State True or False.
A. True
B. False
Correct option is B

72. Which of the following is correct about the Naive

Bayes?
A. Assumes that all the features in a dataset are independent
B. Assumes that all the features in a dataset are equally
important
C. Both
D. All of the above
Correct option is C

73. Which of the following is false regarding EM Algorithm?

A. The alignment provides an estimate of the base or amino
acid composition of each column in the site
B. The column-by-column composition of the site already
available is used to estimate the probability of finding the site
at any position in each of the sequences
C. The row-by-column composition of the site already available
is used to estimate the probability
D. None of the above
Correct option is C

74. The Naïve Bayes algorithm is a ______ learning algorithm.

A. Supervised
B. Reinforcement
C. Unsupervised
D. None of these
Correct option is A

75. The EM algorithm includes two repeated steps; step 2 is ______.

A. The normalization
B. The maximization step
C. The minimization step
D. None of the above
Correct option is B

76. Examples of Naïve Bayes Algorithm is/are

A. Spam filtration
B. Sentimental analysis
C. Classifying articles
D. All of the above
Correct option is D

77. In the intermediate steps of “EM Algorithm”, the

number of each base in each column is determined and then
converted to
A. True
B. False
Correct option is A

78. The Naïve Bayes algorithm is based on ______ and is used for
solving classification problems.
A. Bayes Theorem
B. Candidate elimination algorithm
C. EM algorithm
D. None of the above
Correct option is A
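
To make the link to Bayes' theorem concrete, here is a small sketch for a two-class spam/ham problem with a single word as evidence. The function name and all probabilities are hypothetical values chosen only for illustration.

```python
def posterior_spam(p_word_given_spam, p_word_given_ham, p_spam):
    """P(spam | word) from Bayes' theorem for a two-class problem."""
    p_ham = 1.0 - p_spam
    # Total probability of seeing the word, summed over both classes.
    evidence = p_word_given_spam * p_spam + p_word_given_ham * p_ham
    return p_word_given_spam * p_spam / evidence


# Assumed numbers: 40% of mail is spam; the word appears in 50% of spam and 10% of ham.
p = posterior_spam(p_word_given_spam=0.5, p_word_given_ham=0.1, p_spam=0.4)
```

A full Naïve Bayes classifier multiplies such likelihoods over all features, relying on the independence assumption mentioned in question 72.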

79. Types of Naïve Bayes Model:

A. Gaussian
B. Multinomial
C. Bernoulli
D. All of the above
Correct option is D
80. Disadvantages of Naïve Bayes Classifier:
A. Naive Bayes assumes that all features are independent or
unrelated, so it cannot learn the relationship between features
B. It performs well in multi-class predictions as compared to the
other algorithms
C. Naïve Bayes is one of the fast and easy ML algorithms to
predict a class of datasets
D. It is the most popular choice for text classification problems.
Correct option is A

81. The benefit of Naïve Bayes:-

A. Naïve Bayes is one of the fast and easy ML algorithms to
predict a class of datasets
B. It is the most popular choice for text classification problems.
C. It can be used for binary as well as multi-class classification
D. All of the above
Correct option is D

82. In which of the following types of sampling the

information is carried out under the opinion of an expert?
A. Convenience sampling
B. Judgement sampling
C. Quota sampling
D. Purposive sampling
Correct option is B

83. Full form of MDL?

A. Minimum Description Length
B. Maximum Description Length
C. Minimum Domain Length
D. None of these
Correct option is A

84. For the analysis of ML algorithms, we need

A. Computational learning theory
B. Statistical learning theory
C. Both A & B
D. None of these
Correct option is C

85. PAC stands for

A. Probably Approximately Correct
B. Probably Approx Correct
C. Probably Approximate Computation
D. Probably Approx Computation
Correct option is A

86. The ______ of hypothesis h with respect to target concept c and
distribution D is the probability that h will misclassify an instance drawn
at random according to D.
A. True Error
B. Type 1 Error
C. Type 2 Error
D. None of these
Correct option is A

87. Statement: True error defined over entire instance

space, not just training data
A. True
B. False
Correct option is A

88. What areas does computational learning theory (CLT) comprise?

A. Sample Complexity
B. Computational Complexity
C. Mistake Bound
D. All of these
Correct option is D

88. What area of CLT tells “How many examples we need

to find a good hypothesis ?”?
A. Sample Complexity
B. Computational Complexity
C. Mistake Bound
D. None of these
Correct option is A

89. What area of CLT tells “How much computational

power we need to find a good hypothesis ?”?
A. Sample Complexity
B. Computational Complexity
C. Mistake Bound
D. None of these
Correct option is B
90. What area of CLT tells “How many mistakes we will
make before finding a good hypothesis ?”?
A. Sample Complexity
B. Computational Complexity
C. Mistake Bound
D. None of these
Correct option is C

91. Can we say that concepts described by conjunctions of
Boolean literals are PAC learnable?
A. Yes
B. No
Correct option is A

92. How large is the hypothesis space when we have n

Boolean attributes?
A. |H| = 3^n
B. |H| = 2^n
C. |H| = 1^n
D. |H| = 4^n
Correct option is A

93. The VC dimension of hypothesis space H1 is larger

than the VC dimension of hypothesis space H2. Which of the
following can be inferred from this?
A. The number of examples required for learning a hypothesis
in H1 is larger than the number of examples required for H2
B. The number of examples required for learning a hypothesis
in H1 is smaller than the number of examples required for H2
C. No relation to number of samples required for PAC learning.
Correct option is A

94. For a particular learning task, if the requirement of error

parameter changes from 0.1 to 0.01. How many more
samples will be required for PAC learning?
A. Same
B. 2 times
C. 1000 times
D. 10 times
Correct option is D
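
Questions 92 and 94 can be tied together with the standard sample-complexity bound for a consistent learner over a finite hypothesis space, m ≥ (1/ε)(ln|H| + ln(1/δ)). The sketch below assumes that bound and |H| = 3^n for conjunctions of n Boolean literals; the function names are hypothetical.

```python
import math


def sample_bound(n_attributes, epsilon, delta):
    """(1/epsilon) * (ln|H| + ln(1/delta)) with |H| = 3**n_attributes."""
    ln_h = n_attributes * math.log(3)  # ln(3^n) computed without forming 3^n
    return (ln_h + math.log(1.0 / delta)) / epsilon


def pac_samples(n_attributes, epsilon, delta):
    """Smallest whole number of examples that satisfies the bound."""
    return math.ceil(sample_bound(n_attributes, epsilon, delta))


# The bound is linear in 1/epsilon, so tightening the error from 0.1 to 0.01
# multiplies the required number of examples by 10.
ratio = sample_bound(10, 0.01, 0.05) / sample_bound(10, 0.1, 0.05)
```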
95. Computational complexity of classes of learning
problems depends on which of the following?
A. The size or complexity of the hypothesis space considered
by learner
B. The accuracy to which the target concept must be
approximated
C. The probability that the learner will output a successful
hypothesis
D. All of these
Correct option is D

96. The instance-based learner is a

A. Lazy-learner
B. Eager learner
C. Can't say
Correct option is A

97. When to consider nearest neighbour algorithms?

A. Instances map to points in R^n
B. Not more than 20 attributes per instance
C. Lots of training data
D. None of these
E. A, B & C
Correct option is E

98. What are the advantages of the nearest neighbour algorithm?

A. Training is very fast
B. Can learn complex target functions
C. Don't lose information
D. All of these
Correct option is D

99. What are the difficulties with k-nearest neighbour algo?

A. Calculate the distance of the test case from all training cases
B. Curse of dimensionality
C. Both A & B
D. None of these
Correct option is C

100. What if the target function is real valued in kNN algo?

A. Calculate the mean of the k nearest neighbours
B. Calculate the SD of the k nearest neighbour
C. None of these
Correct option is A

101. What is/are true about Distance-weighted KNN?

A. The weight of the neighbour is considered
B. The distance of the neighbour is considered
C. Both A & B
D. None of these
Correct option is C
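
Questions 100 and 101 can be illustrated with one sketch: for a real-valued target, plain k-NN averages the targets of the k nearest neighbours, while the distance-weighted variant gives closer neighbours more influence. The weighting 1/d² is one common choice and is an assumption here, as are the function name and the toy data.

```python
import math


def knn_regress(train, query, k, weighted=False):
    """train: list of (point, target) pairs. Predict a real-valued target for query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    if not weighted:
        return sum(target for _, target in nearest) / len(nearest)
    numerator = denominator = 0.0
    for point, target in nearest:
        d = math.dist(point, query)
        if d == 0:
            return target  # the query coincides with a training point
        weight = 1.0 / d ** 2  # closer neighbours get larger weights
        numerator += weight * target
        denominator += weight
    return numerator / denominator


# Toy one-dimensional data set.
train = [((0.0,), 0.0), ((1.0,), 10.0), ((3.0,), 30.0), ((10.0,), 100.0)]
plain = knn_regress(train, (2.5,), k=2)
weighted = knn_regress(train, (2.5,), k=2, weighted=True)
```

Every prediction scans the whole training set, which is the cost noted in question 99.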

102. What is/are advantage(s) of Distance-weighted k-NN

over k-NN?
A. Robust to noisy training data
B. Quite effective when a sufficient large set of training data is
provided
C. Both A & B
D. None of these
Correct option is C

103. What is/are advantage(s) of Locally Weighted

Regression?
A. Pointwise approximation of complex target function
B. Earlier data has no influence on the new ones
C. Both A & B
D. None of these
Correct option is C

104. The quality of the result depends on (LWR)

A. Choice of the function
B. Choice of the kernel function K
C. Choice of the hypothesis space H
D. All of these
Correct option is D

105. How many types of layer in radial basis function neural

networks?
A. 3
B. 2
C. 1
D. 4
Correct option is A, Input layer, Hidden layer, and Output layer
106. The neurons in the hidden layer contain a Gaussian
transfer function whose outputs are ______ proportional to the
distance from the centre of the neuron.
A. Directly
B. Inversely
C. equal
D. None of these
Correct option is B
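
A short sketch of the Gaussian transfer function from question 106. The output is 1 at the centre and decreases as the distance grows, which is the sense in which the quiz calls the relationship "inverse". The function name and the `sigma` width parameter are assumptions.

```python
import math


def gaussian_rbf(x, centre, sigma=1.0):
    """exp(-||x - centre||^2 / (2 * sigma^2)) for points given as sequences."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, centre))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


at_centre = gaussian_rbf((0.0, 0.0), (0.0, 0.0))  # 1.0
near = gaussian_rbf((1.0, 0.0), (0.0, 0.0))
far = gaussian_rbf((2.0, 0.0), (0.0, 0.0))
```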

107. PNN/GRNN networks have one neuron for each point

in the training file, While RBF network have a variable
number of neurons that is usually
A. less than the number of training points
B. greater than the number of training points
C. equal to the number of training points
D. None of these
Correct option is A

108. Which network is more accurate when the training set is
small to medium in size?
A. PNN/GRNN
B. RBF
C. K-means clustering
D. None of these
Correct option is A

109. What is/are true about RBF network?

A. A kind of supervised learning
B. Design of NN as curve fitting problem
C. Use of multidimensional surface to interpolate the test data
D. All of these
Correct option is D

110. Application of CBR

A. Design
B. Planning
C. Diagnosis
D. All of these
Correct option is D

111. What is/are advantages of CBR?

A. A local approx. is found for each test case
B. Knowledge is in a form understandable to human
C. Fast to train
D. All of these
Correct option is D

112. In the k-NN algorithm, given a set of training examples and a value of
k < size of the training set (n), the algorithm predicts the class of a test
example to be the ______.

A. Least frequent class among the classes of the k closest training examples
B. Most frequent class among the classes of the k closest training examples
C. Class of the closest training example
D. Most frequent class among the classes of the k farthest
training examples.
Correct option is B
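
The majority-vote rule in question 112 can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance and a small made-up training set; the function name knn_predict and the data are hypothetical, not taken from the question bank.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Return the most frequent label among the k training points closest to query.

    train is a list of (features, label) pairs with equal-length feature tuples.
    """
    if not 0 < k <= len(train):
        raise ValueError("k must be between 1 and the number of training examples")
    # Sort by Euclidean distance to the query and keep the k nearest examples.
    nearest = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Illustrative data (assumed): two red points near (1, 1), three blue near (5, 5).
train = [((1.0, 1.0), "red"), ((1.2, 0.8), "red"), ((5.0, 5.0), "blue"),
         ((5.2, 4.9), "blue"), ((4.8, 5.1), "blue")]
print(knn_predict(train, (1.1, 0.9), k=3))  # red
```

With k=3 the two nearby red points outvote the single blue neighbour, which is the behaviour option B describes.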

113. Which of the following statements is true about PCA?

(i) We must standardize the data before applying PCA
(ii) We should select the principal components which explain the
highest variance
(iii) We should select the principal components which explain the
lowest variance
(iv) We can use PCA for visualizing the data in lower dimensions
A. (i), (ii) and (iv).
B. (ii) and (iv)
C. (iii) and (iv)
D. (i) and (iii)
Correct option is A

114. Genetic algorithm is a

A. Search technique used in computing to find true or
approximate solution to optimization and search problem
B. Sorting technique used in computing to find true or
approximate solution to optimization and sort problem
C. Both A & B
D. None of these
Correct option is A

115. GA techniques are inspired by

A. Evolutionary
B. Cytology
C. Anatomy
D. Ecology
Correct option is A

116. When would the genetic algorithm terminate?

A. Maximum number of generations has been produced
B. Satisfactory fitness level has been reached for the population
C. Both A & B
D. None of these
Correct option is C

117. The algorithm operates by iteratively updating a pool of

hypotheses, called the
A. Population
B. Fitness
C. None of these
Correct option is A

118. What is the correct representation of GA?

A. GA(Fitness, Fitness_threshold, p)
B. GA(Fitness, Fitness_threshold, p, r )
C. GA(Fitness, Fitness_threshold, p, r, m)
D. GA(Fitness, Fitness_threshold)
Correct option is C
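
As a rough illustration of the GA(Fitness, Fitness_threshold, p, r, m) signature, the sketch below assumes the usual textbook reading of the parameters: p is the population size, r the fraction of the population replaced by crossover each generation, and m the mutation rate. The bit-string length, the generation cap, the seed and the ones_fraction fitness function are all hypothetical choices made for the example.

```python
import random

def ga(fitness, fitness_threshold, p, r, m, length=12, max_generations=200, seed=0):
    """Evolve bit strings (lists of 0/1) until one reaches fitness_threshold."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(p)]
    for _ in range(max_generations):
        if fitness(max(population, key=fitness)) >= fitness_threshold:
            break
        # Small constant keeps every selection weight strictly positive.
        weights = [fitness(h) + 1e-9 for h in population]
        n_pairs = int(r * p / 2)
        # Select: carry the remaining members over, chosen in proportion to fitness.
        next_gen = [h[:] for h in rng.choices(population, weights=weights, k=p - 2 * n_pairs)]
        # Crossover: each selected pair produces two offspring (single-point).
        for _ in range(n_pairs):
            a, b = rng.choices(population, weights=weights, k=2)
            point = rng.randint(1, length - 1)
            next_gen.append(a[:point] + b[point:])
            next_gen.append(b[:point] + a[point:])
        # Mutate: flip one random bit in a fraction m of the new population.
        for h in rng.sample(next_gen, int(m * p)):
            i = rng.randrange(length)
            h[i] = 1 - h[i]
        population = next_gen
    return max(population, key=fitness)

def ones_fraction(h):
    """Toy fitness (assumed): fraction of bits set to 1."""
    return sum(h) / len(h)

best = ga(ones_fraction, fitness_threshold=1.0, p=20, r=0.6, m=0.1)
print(best, ones_fraction(best))
```

The loop stops either when the threshold is met or when the generation cap is reached, matching the two termination conditions in question 116.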

119. Genetic operators includes

A. Crossover
B. Mutation
C. Both A & B
D. None of these
Correct option is C

120. Produces two new offspring from two parent string by

copying selected bits from each parent is called
A. Mutation
B. Inheritance
C. Crossover
D. None of these
Correct option is C

121. A schema represents a set of bit strings and is written as a

string composed of
A. 0s, 1s
B. only 0s
C. only 1s
D. 0s, 1s, *s
Correct option is D

122. The schema 0*10 represents the set of bit strings that includes

exactly
A. 0010, 0110
B. 0010, 0010
C. 0100, 0110
D. 0100, 0010
Correct option is A
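
A schema can be expanded mechanically by treating each * as "either bit". The helper names matches and expand below are hypothetical, introduced only to illustrate the idea.

```python
from itertools import product

def matches(schema, bits):
    """True if the bit string agrees with the schema at every non-* position."""
    return len(schema) == len(bits) and all(s in ("*", b) for s, b in zip(schema, bits))

def expand(schema):
    """List every bit string the schema represents."""
    choices = [("0", "1") if s == "*" else (s,) for s in schema]
    return ["".join(c) for c in product(*choices)]

print(expand("0*10"))           # ['0010', '0110']
print(matches("0*10", "0100"))  # False
```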

123. If correct(h) is the percent of all training examples

correctly classified by hypothesis h, then the fitness function is
equal to
A. Fitness(h) = (correct(h))^2
B. Fitness(h) = (correct(h))^3
C. Fitness(h) = (correct(h))
D. Fitness(h) = (correct(h))^4
Correct option is A

124. Statement: In Genetic Programming, individuals in the

evolving population are computer programs rather than bit strings
A. True
B. False
Correct option is A

125. ______ evolution over many generations was

directly influenced by the experiences of individual
organisms during their lifetime
A. Baldwin
B. Lamarckian
C. Bayes
D. None of these
Correct option is B

126. Search through the hypothesis space cannot be

characterized. Why?
A. Hypotheses are created by crossover and mutation
operators that allow radical changes between successive
generations
B. Hypotheses are not created by crossover and mutation
C. None of these
Correct option is A

127. ILP stand for

A. Inductive Logical programming
B. Inductive Logic Programming
C. Inductive Logical Program
D. Inductive Logic Program
Correct option is B

128. What is/are the requirement for the Learn-One-Rule

method?
A. Input, accepts a set of +ve and -ve training examples.
B. Output, delivers a single rule that covers many +ve examples
and few -ve.
C. Output rule has a high accuracy but not necessarily a high
coverage.
D. A & B
E. A, B & C
Correct option is E

129. ______ is any predicate (or its negation) applied to

any set of terms.
A. Literal
B. Null
C. Clause
D. None of these
Correct option is A

130. Ground literal is a literal that
A. Contains only variables
B. Does not contain any functions
C. Does not contain any variables
D. Contains only functions
Correct option is C

131. ______ emphasizes learning feedback that

evaluates the learner’s performance without providing
standards of correctness in the form of behavioural targets.
A. Reinforcement learning
B. Supervised Learning
C. None of these
Correct option is A

132. Features of Reinforcement learning

A. Set of problem rather than set of techniques
B. RL is training by rewards and punishments
C. RL is learning from trial and error with the world
D. All of these
Correct option is D

133. Which type of feedback used by RL?

A. Purely Instructive feedback
B. Purely Evaluative feedback
C. Both A & B
D. None of these
Correct option is B

134. What is/are the problem solving methods for RL?

A. Dynamic programming
B. Monte Carlo Methods
C. Temporal-difference learning
D. All of these
Correct option is D

135. The FIND-S Algorithm

A. Starts from the most specific hypothesis
B. It considers negative examples
C. It considers both negative and positive examples
D. None of these
Correct option is A
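
The behaviour of FIND-S can be sketched as follows: begin with the most specific hypothesis, skip every negative example, and generalize only as far as each positive example forces. This is a minimal sketch assuming conjunctive hypotheses over discrete attributes with "?" as the wildcard; the weather-style data is illustrative, not part of the question bank.

```python
def find_s(examples):
    """examples: list of (attribute_tuple, is_positive). Returns the learned hypothesis."""
    h = None  # most specific hypothesis: matches no instance yet
    for x, positive in examples:
        if not positive:
            continue  # FIND-S ignores negative examples
        if h is None:
            h = list(x)  # first positive example is adopted as-is
        else:
            # Keep attribute values that agree, generalize the rest to "?".
            h = [hi if hi == xi else "?" for hi, xi in zip(h, x)]
    return h

examples = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
print(find_s(examples))  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```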
136. The hypothesis space has a general-to-specific ordering of
hypotheses, and the search can be efficiently organized by taking
advantage of a naturally occurring structure over the hypothesis space.
A. TRUE
B. FALSE
Correct option is A

137. The Version space is:

A. The subset of all hypotheses consistent with the training
examples D is called the version space with respect to the
hypothesis space H and D, because it contains all plausible
versions of the target concept
B. The version space consists of only specific hypotheses
C. None of these
Correct option is A

138. The Candidate-Elimination Algorithm

A. The key idea in the Candidate-Elimination algorithm
is to output a description of the set of all hypotheses
consistent with the training examples
B. Candidate-Elimination algorithm computes the
description of this set without explicitly enumerating
all of its members
C. This is accomplished by using the more-general-than
partial ordering and maintaining a compact
representation of the set of consistent hypotheses
D. All of these
Correct option is D

139. Concept learning is basically acquiring the definition of

a general category from given sample positive and negative
training examples of the category
A. TRUE
B. FALSE
Correct option is A

140. The hypothesis h1 is more-general-than hypothesis h2

( h1 > h2) if and only if h1≥h2 is true and h2≥h1 is false. We
also say h2 is more-specific-than h1
A. The statement is true
B. The statement is false
C. We cannot say
D. None of these
Correct option is A

141. The List-Then-Eliminate Algorithm

A. The List-Then-Eliminate algorithm initializes the
version space to contain all hypotheses in H, then
eliminates any hypothesis found inconsistent with
any training example
B. The List-Then-Eliminate algorithm does not initialize
the version space
C. None of these
Correct option is A

142. What will take place as the agent observes its

interactions with the world?
A. Learning
B. Hearing
C. Perceiving
D. Speech
Correct option is A

143. Which element modifies the performance element so that it

makes better decisions?
A. Performance element
B. Changing element
C. Learning element
D. None of the mentioned
Correct option is C

144. Any hypothesis found to approximate the target

function well over a sufficiently large set of training examples
will also approximate the target function well over other
unobserved examples. This statement is called the:
A. Inductive Learning Hypothesis
B. Null Hypothesis
C. Actual Hypothesis
D. None of these
Correct option is A

145. Feature of ANN in which ANN creates its own

organization or representation of information it receives
during learning time is
A. Adaptive Learning
B. Self Organization
C. What-If Analysis
D. Supervised Learning
Correct option is B

146. How the decision tree reaches its decision?

A. Single test
B. Two test
C. Sequence of test
D. No test
Correct option is C

147. Which of the following is a disadvantage of decision

trees?
A. Factor analysis
B. Decision trees are robust to outliers
C. Decision trees are prone to be overfit
D. None of the above
Correct option is C

148. Tree/Rule based classification algorithms generate

which rule to perform the classification.
A. if-then.
B. then
C. do
D. Answer
Correct option is A

149. What is Gini Index?

A. It is a type of index structure
B. It is a measure of purity
C. None of the options
Correct option is B
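
For reference, the Gini index used in decision tree learning scores how mixed a set of class labels is: 0 for a pure node, larger values for more mixed nodes. A minimal sketch, with a hypothetical helper name gini:

```python
from collections import Counter

def gini(labels):
    """Gini index of a non-empty list of class labels: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["a", "a", "a", "a"]))  # 0.0, a pure node
print(gini(["a", "a", "b", "b"]))  # 0.5, an evenly mixed two-class node
```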

150. What is not a RNN in machine learning?

A. One output to many inputs
B. Many inputs to a single output
C. RNNs for nonsequential input
D. Many inputs to many outputs
Correct option is A

151. Which of the following sentences are correct in

reference to Information gain?
A. It is biased towards multi-valued attributes
B. ID3 makes use of information gain
C. The approach used by ID3 is greedy
D. All of these
Correct option is D
152. A Neural Network can answer
A. For Loop questions
B. what-if questions
C. IF-Then-Else Analysis Questions
D. None of these
Correct option is B

153. Artificial neural network used for

A. Pattern Recognition
B. Classification
C. Clustering
D. All of these
Correct option is D

154. Which of the following are the advantage/s of Decision

Trees?
A. Possible Scenarios can be added
B. Use a white box model, If given result is provided by a model
C. Worst, best and expected values can be determined for
different scenarios
D. All of the mentioned
Correct option is D

155. What is the mathematical likelihood that something will

occur?
A. Classification
B. Probability
C. Naïve Bayes Classifier
D. None of the other
Correct option is B

156. What does the Bayesian network provide?

A. Complete description of the domain
B. Partial description of the domain
C. Complete description of the problem
D. None of the mentioned
Correct option is A

157. Where can Bayes’ rule be used?

A. Solving queries
B. Increasing complexity
C. Decreasing complexity
D. Answering probabilistic query
Correct option is D

158. How many terms are required for building a Bayes

model?
A. 2
B. 3
C. 4
D. 1
Correct option is B

159. What is needed to make probabilistic systems feasible

in the world?
A. Reliability
B. Crucial robustness
C. Feasibility
D. None of the mentioned
Correct option is B

160. It was shown that the Naive Bayesian method

A. Can be much more accurate than the optimal
Bayesian method
B. Is always worse off than the optimal Bayesian
method
C. Can be almost optimal only when attributes are
independent
D. Can be almost optimal when some attributes are
dependent
Correct option is C

161. What is the consequence between a node and its

predecessors while creating Bayesian network?
A. Functionally dependent
B. Dependant
C. Conditionally independent
D. Both Conditionally dependant & Dependant
Correct option is C

162. How the compactness of the Bayesian network can be

described?
A. Locally structured
B. Fully structured
C. Partial structure
D. All of the mentioned
Correct option is A

163. How the entries in the full joint probability distribution

can be calculated?
A. Using variables
B. Using information
C. Both Using variables & information
D. None of the mentioned
Correct option is B

164. How the Bayesian network can be used to answer any

query?
A. Full distribution
B. Joint distribution
C. Partial distribution
D. All of the mentioned
Correct option is B

165. Sample Complexity is

A. The sample complexity is the number of training-
samples that we need to supply to the algorithm, so
that the function returned by the algorithm is within
an arbitrarily small error of the best possible
function, with probability arbitrarily close to 1
B. How many training examples are needed for learner
to converge to a successful hypothesis.
C. All of these
Correct option is C

166. PAC stands for

A. Probably Approximately Correct
B. Probability Applied Correctly
C. Partition Approximately Correct
Correct option is A

167. Which of the following will be true about k in k-NN in

terms of variance
A. When you increase k, the variance increases
B. When you decrease k, the variance
increases
C. Can't say
D. None of these
Correct option is B

168. Which of the following option is true about k-NN

algorithm?
A. It can be used for classification
B. It can be used for regression
C. It can be used in both classification and regression
Correct option is C

169. In k-NN it is very likely to overfit due to the curse of

dimensionality. Which of the following option would you
consider to handle such problem? 1). Dimensionality
Reduction 2). Feature selection
A. 1
B. 2
C. 1 and 2
D. None of these
Correct option is C

170. When you find noise in data which of the following

option would you consider in k- NN
A. I will increase the value of k
B. I will decrease the value of k
C. Noise can not be dependent on value of k
D. None of these
Correct option is A

171. Which of the following will be true about k in k-NN in

terms of Bias?
A. When you increase k, the bias increases
B. When you decrease k, the bias increases
C. Can't say
D. None of these
Correct option is A

172. What is used to mitigate overfitting in a test set?

A. Overfitting set
B. Training set
C. Validation dataset
D. Evaluation set
Correct option is C

173. A radial basis function is a

A. Activation function
B. Weight
C. Learning rate
D. none
Correct option is A

174. Mistake Bound is

A. How many training examples are needed for learner to
converge to a successful hypothesis.
B. How much computational effort is needed for a learner to
converge to a successful hypothesis
C. How many training examples will the learner misclassify
before converging to a successful hypothesis
D. None of these
Correct option is C

175. All of the following are suitable problems for genetic

algorithms EXCEPT
A. dynamic process control
B. pattern recognition with complex patterns
C. simulation of biological models
D. simple optimization with few variables
Correct option is D

176. Adding more basis functions in a linear model… (Pick

the most probable option)
A. Decreases model bias
B. Decreases estimation bias
C. Decreases variance
D. Doesn't affect bias and variance
Correct option is A

177. Which of these are types of crossover

A. Single point
B. Two point
C. Uniform
D. All of these
Correct option is D
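
The three crossover types named in question 177 differ only in how the crossover mask is chosen. The sketch below works on equal-length bit strings; the function names and the use of a seeded random.Random instance are assumptions made for the illustration.

```python
import random

def single_point(a, b, rng):
    """Swap the tails of a and b after one random cut point."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def two_point(a, b, rng):
    """Swap the middle segment between two random cut points."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def uniform(a, b, rng):
    """Choose each bit position independently from one parent or the other."""
    mask = [rng.random() < 0.5 for _ in a]
    first = "".join(x if keep else y for x, y, keep in zip(a, b, mask))
    second = "".join(y if keep else x for x, y, keep in zip(a, b, mask))
    return first, second

rng = random.Random(1)
for op in (single_point, two_point, uniform):
    print(op.__name__, op("11111111", "00000000", rng))
```

In every case the two offspring are complementary: each bit position of each parent ends up in exactly one child.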
178. A feature F1 can take certain value: A, B, C, D, E, & F
and represents grade of students from a college. Which of
the following statement is true in following case?
A. Feature F1 is an example of a nominal variable
B. Feature F1 is an example of an ordinal variable
C. It doesn't belong to any of the above categories.
Correct option is B

179. You observe the following while fitting a linear

regression to the data: As you increase the amount of
training data, the test error decreases and the training error
increases. The train error is quite low (almost what you
expect it to), while the test error is much higher than the train
error. What do you think is the main reason behind this
behaviour? Choose the most probable option.
A. High variance
B. High model bias
C. High estimation bias
D. None of the above
Correct option is A

180. Genetic algorithms are heuristic methods that do not

guarantee an optimal solution to a problem
A. TRUE
B. FALSE
Correct option is A

181. Which of the following statements about regularization

is not correct?
A. Using too large a value of lambda can cause your
hypothesis to underfit the data
B. Using too large a value of lambda can cause your
hypothesis to overfit the data
C. Using a very large value of lambda cannot hurt the
performance of your hypothesis.
D. None of the above
Correct option is A

182. Consider the following: (a) Evolution (b) Selection (c)

Reproduction (d) Mutation Which of the following are found
in genetic algorithms?
A. All
B. a, b, c
C. a, b
D. b, d
Correct option is A

183. Genetic Algorithm are a part of

A. Evolutionary Computing
B. inspired by Darwin’s theory about evolution –
“survival of the fittest”
C. are adaptive heuristic search algorithm based on the
evolutionary ideas of natural selection and genetics
D. All of the above
Correct option is D

184. Genetic algorithms belong to the family of methods in

the
A. artificial intelligence area
B. optimization
C. complete enumeration family of methods
D. Non-computer based (human) solutions area
Correct option is A

185. For a two player chess game, the environment

encompasses the opponent
A. True
B. False
Correct option is A

186. Which among the following is not a necessary feature

of a reinforcement learning solution to a learning problem?
A. exploration versus exploitation dilemma
B. trial and error approach to learning
C. learning based on rewards
D. representation of the problem as a Markov Decision
Process
Correct option is D

187. Which of the following sentence is FALSE regarding

reinforcement learning
A. It relates inputs to
B. It is used for
C. It may be used for
D. It discovers causal relationships.
Correct option is D

188. The EM algorithm is guaranteed to never decrease the

value of its objective function on any iteration
A. TRUE
B. FALSE
Correct option is A

189. Consider the following modification to the tic-tac-toe

game: at the end of game, a coin is tossed and the agent
wins if a head appears regardless of whatever has happened
in the game. Can reinforcement learning be used to learn an
optimal policy of playing Tic-Tac-Toe in this case?
A. Yes
B. No
Correct option is B

190. Out of the two repeated steps in EM algorithm, the step 2 is _

A. the maximization step

B. the minimization step
C. the optimization step
D. the normalization step
Correct option is A

191. Suppose the reinforcement learning player was greedy,

that is, it always played the move that brought it to the
position that it rated the best. Might it learn to play better, or
worse, than a non greedy player?
A. Worse
B. Better
Correct option is A

192. A chess agent trained by using Reinforcement

Learning can be trained by playing against a copy of the
same agent
A. True
B. False
Correct option is A
193. The EM iteration alternates between performing an
expectation (E) step, which creates a function for the
expectation of the log-likelihood evaluated using the current
estimate for the parameters, and a maximization (M) step,
which computes parameters maximizing the expected
log-likelihood found on the E step
A. TRUE
B. FALSE
Correct option is A
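
The E/M alternation described in question 193 can be made concrete with a deliberately small case. The sketch assumes a mixture of two one-dimensional Gaussians with equal mixing weights and a known, shared standard deviation, so only the two means are estimated; the function name, data and starting values are hypothetical.

```python
import math

def em_two_means(xs, mu, sigma=1.0, iterations=50):
    """Estimate the means of two equal-weight Gaussians with known sigma by EM."""
    mu1, mu2 = mu
    for _ in range(iterations):
        # E step: responsibility of component 1 for each point under the current means.
        resp = []
        for x in xs:
            p1 = math.exp(-((x - mu1) ** 2) / (2 * sigma ** 2))
            p2 = math.exp(-((x - mu2) ** 2) / (2 * sigma ** 2))
            resp.append(p1 / (p1 + p2))
        # M step: responsibility-weighted means maximize the expected log-likelihood.
        w1 = sum(resp)
        w2 = len(xs) - w1
        mu1 = sum(r * x for r, x in zip(resp, xs)) / w1
        mu2 = sum((1 - r) * x for r, x in zip(resp, xs)) / w2
    return mu1, mu2

xs = [-0.2, 0.1, 0.3, -0.1, 4.8, 5.1, 5.2, 4.9]  # illustrative data (assumed)
print(em_two_means(xs, mu=(1.0, 4.0)))
```

On this well-separated data the estimates settle close to the two cluster averages. The sketch does not guard against points so far from both means that both densities underflow to zero.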

194. Expectation–maximization (EM) algorithm is an

A. Iterative
B. Incremental
C. None
Correct option is A

195. Features that need to be identified to define a well-posed

learning problem:
A. Class of tasks
B. Performance measure
C. Training experience
D. All of these
Correct option is D

196. A computer program that learns to play checkers might

improve its performance as:
A. Measured by its ability to win at the class of tasks
involving playing checkers
B. Experience obtained by playing games against itself
C. Both a & b
D. None of these
Correct option is C

197. Learning symbolic representations of concepts known

as:
A. Artificial Intelligence
B. Machine Learning
C. Both a & b
D. None of these
Correct option is A
198. The field of study that gives computers the capability to
learn without being explicitly programmed
A. Machine Learning
B. Artificial Intelligence
C. Deep Learning
D. Both a & b
Correct option is A

199. The autonomous acquisition of knowledge through the

use of computer programs is called
A. Artificial Intelligence
B. Machine Learning
C. Deep learning
D. All of these
Correct option is B

200. Learning that enables massive quantities of data is

known as
A. Artificial Intelligence
B. Machine Learning
C. Deep learning
D. All of these
Correct option is B

201. A different learning method does not include

A. Memorization
B. Analogy
C. Deduction
D. Introduction
Correct option is D

202. Types of learning used in machine learning

A. Supervised
B. Unsupervised
C. Reinforcement
D. All of these
Correct option is D

203. A computer program is said to learn from experience E

with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured by
P, improves with experience
A. Supervised learning problem
B. Un Supervised learning problem
C. Well posed learning problem
D. All of these
Correct option is C

204. Which of the following is a widely used and effective

machine learning algorithm based on the idea of bagging?
A. Decision Tree
B. Regression
C. Classification
D. Random Forest
Correct option is D

205. How many types are available in machine learning?

A. 1
B. 2
C. 3
D. 4
Correct option is C

205. A model can learn based on the rewards it received for

its previous action is known as:
A. Supervised learning
B. Unsupervised learning
C. Reinforcement learning
D. Concept learning
Correct option is C

206. A subset of machine learning that involves systems

that think and learn like humans using artificial neural
networks.
A. Artificial Intelligence
B. Machine Learning
C. Deep Learning
D. All of these
Correct option is C

207. A learning method in which a training data contains a

small amount of labeled data and a large amount of
unlabeled data is known as
A. Supervised Learning
B. Semi Supervised Learning
C. Unsupervised Learning
D. Reinforcement Learning
Correct option is B

208. Methods used for the calibration in Supervised

Learning
A. Platt Calibration
B. Isotonic Regression
C. All of these
D. None of above
Correct option is C

209. The basic design issues for designing a learning system

A. Choosing the Training Experience
B. Choosing the Target Function
C. Choosing a Function Approximation Algorithm
D. Estimating Training Values
E. All of these
Correct option is E

210. In Machine learning the module that must solve the

given performance task is known as:
A. Critic
B. Generalizer
C. Performance system
D. All of these
Correct option is C

211. A learning method that is used to solve a particular

computational program, multiple models such as classifiers
or experts are strategically generated and combined is called
as
A. Supervised Learning
B. Semi Supervised Learning
C. Unsupervised Learning
D. Reinforcement Learning
E. Ensemble learning
Correct option is E
212. In a learning system the component that takes as
input the current hypothesis (currently learned function) and
outputs a new problem for the Performance System to
explore.
A. Critic
B. Generalizer
C. Performance system
D. Experiment generator
E. All of these
Correct option is D

213. Learning method that is used to improve the

classification, prediction, function approximation etc of a
model
A. Supervised Learning
B. Semi Supervised Learning
C. Unsupervised Learning
D. Reinforcement Learning
E. Ensemble learning
Correct option is E

214. In a learning system the component that takes as input

the history or trace of the game and produces as output a set
of training examples of the target function is known as:
A. Critic
B. Generalizer
C. Performance system
D. All of these
Correct option is A

215. The most common issue when using ML is

A. Lack of skilled resources
B. Inadequate Infrastructure
C. Poor Data Quality
D. None of these
Correct option is C

216. How to ensure that your model is not over fitting

A. Cross validation
B. Regularization
C. All of these
D. None of these
Correct option is C

217. A way to ensemble multiple classifications or

regression
A. Stacking
B. Bagging
C. Blending
D. Boosting
Correct option is A

218. How well a model is going to generalize in new

environment is known as
A. Data Quality
B. Transparent
C. Implementation
D. None of these
Correct option is B

219. Common classes of problems in machine learning

is
A. Classification
B. Clustering
C. Regression
D. All of these
Correct option is D

220. Which of the following is a widely used and effective

machine learning algorithm based on the idea of bagging?
A. Decision Tree
B. Regression
C. Classification
D. Random Forest
Correct option is D

221. Cost complexity pruning algorithm is used in?

A. CART
B. C4.5
C. ID3
D. All of these
Correct option is A

222. Which one of these is not a tree based learner?

A. CART
B. C4.5
C. ID3
D. Bayesian Classifier
Correct option is D

223. Which one of these is a tree based learner?

A. Rule based
B. Bayesian Belief Network
C. Bayesian classifier
D. Random Forest
Correct option is D

224. What is the approach of basic algorithm for decision

tree induction?
A. Greedy
B. Top Down
C. Procedural
D. Step by Step
Correct option is A

225. Which of the following classifications would best suit

the student performance classification systems?
A. If-then analysis
B. Market-basket analysis
C. Regression analysis
D. Cluster analysis
Correct option is A

226. What are the two approaches to tree pruning?

A. Pessimistic pruning and Optimistic pruning
B. Post pruning and Pre pruning
C. Cost complexity pruning and time complexity
pruning
D. None of these
Correct option is B

227. How will you counter over-fitting in decision tree?

A. By pruning the longer rules
B. By creating new rules
C. Both 'By pruning the longer rules' and 'By creating
new rules'
D. None of these
Correct option is A

228. Which of the following sentences are true?

A. In pre-pruning a tree is ‘pruned’ by halting its
construction early
B. A pruning set of class labeled tuples is used to
estimate cost complexity
C. The best pruned tree is the one that minimizes the
number of encoding bits
D. All of these
Correct option is D

229. Which of the following is a disadvantage of decision

trees?
A. Factor analysis
B. Decision trees are robust to outliers
C. Decision trees are prone to be over fit
D. None of the above
Correct option is C

230. In which of the following scenario a gain ratio is

preferred over Information Gain?
A. When a categorical variable has very large number
of category
B. When a categorical variable has very small number
of category
C. Number of categories is not the reason
D. None of these
Correct option is A
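
The reason gain ratio is preferred for many-valued attributes can be seen numerically: information gain rewards an attribute that splits the data into many tiny pure groups, while gain ratio divides by the split information and so penalizes it. A minimal sketch with hypothetical helper names and a toy data set:

```python
import math
from collections import Counter, defaultdict

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())

def info_gain_and_ratio(values, labels):
    """Return (information gain, gain ratio) of splitting labels by attribute values."""
    n = len(labels)
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the attribute's own value distribution
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio

labels = ["yes", "yes", "no", "no"]
print(info_gain_and_ratio([1, 2, 3, 4], labels))          # ID-like attribute
print(info_gain_and_ratio(["a", "a", "b", "b"], labels))  # two-valued attribute
```

Both attributes reach the same information gain on this toy data, but the ID-like attribute with four distinct values gets only half the gain ratio of the two-valued one.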

231. Major pruning techniques used in decision tree are

A. Minimum error
B. Smallest tree
C. Both a & b
D. None of these
Correct option is B

232. What does the central limit theorem state?

A. If the sample size increases, the sampling distribution
must approach a normal distribution
B. If the sample size decreases, the sampling
distribution must approach a normal distribution
C. If the sample size increases, the sampling
distribution must approach an exponential distribution
D. If the sample size decreases, the sampling
distribution must approach an exponential distribution
Correct option is A

233. The difference between the expected value of an estimator

and the true value of the parameter is called
A. Bias
B. Error
C. Contradiction
D. Difference
Correct option is A

234. In which of the following types of sampling the

information is carried out under the opinion of an expert?
A. Quota sampling
B. Convenience sampling
C. Purposive sampling
D. Judgment sampling
Correct option is D

235. Which of the following is a subset of population?

A. Distribution
B. Sample
C. Data
D. Set
Correct option is B

236. The sampling error is defined as?

A. Difference between population and parameter
B. Difference between sample and parameter
C. Difference between population and sample
D. Difference between parameter and sample
Correct option is C

237. Machine learning is interested in the best hypothesis h

from some space H, given observed training data D. Here
best hypothesis means
A. Most general hypothesis
B. Most probable hypothesis
C. Most specific hypothesis
D. None of these
Correct option is B

238. Practical difficulties with Bayesian Learning :

A. Initial knowledge of many probabilities is required
B. No consistent hypothesis
C. Hypotheses make probabilistic predictions
D. None of these
Correct option is A

239. Bayes’ theorem states that the relationship between

the probability of the hypothesis before getting the evidence
P(H) and the probability of the hypothesis after getting the
evidence P(H∣E) is
A. [P(E∣H)P(H)] / P(E)
B. [P(E∣H) P(E) ] / P(H)
C. [P(E) P(H) ] / P(E∣H)
D. None of these
Correct option is A

240. A doctor knows that Cold causes fever 50% of the

time. Prior probability of any patient having cold is 1/50,000.
Prior probability of any patient having fever is 1/20. If a
patient has fever, what is the probability he/she has cold?
A. P(C/F)= 0.0003
B. P(C/F)=0.0004
C. P(C/F)= 0.0002
D. P(C/F)=0.0045
Correct option is C
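
The arithmetic behind question 240 follows directly from Bayes' theorem, P(C|F) = P(F|C) P(C) / P(F). A short check using exact fractions (the function name posterior is hypothetical):

```python
from fractions import Fraction

def posterior(likelihood, prior, evidence):
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior / evidence

p_cold_given_fever = posterior(
    likelihood=Fraction(1, 2),   # P(fever | cold) = 50%
    prior=Fraction(1, 50000),    # P(cold)
    evidence=Fraction(1, 20),    # P(fever)
)
print(p_cold_given_fever, float(p_cold_given_fever))  # 1/5000 0.0002
```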

241. Which of the following will be true about k in K-Nearest

Neighbor in terms of Bias?
A. When you increase k, the bias increases
B. When you decrease k, the bias increases
C. Can't say
D. None of these
Correct option is A

242. When you find noise in data which of the following

option would you consider in K- Nearest Neighbor?
A. I will increase the value of k
B. I will decrease the value of k
C. Noise cannot be dependent on value of k
D. None of these
Correct option is A

243. In K-Nearest Neighbor it is very likely to overfit due to

the curse of dimensionality. Which of the following option
would you consider to handle such a problem?
1. Dimensionality Reduction
2. Feature selection
A. 1
B. 2
C. 1 and 2
D. None of these
Correct option is C

244. Radial basis functions are closely related to

distance-weighted regression, but they use
A. lazy learning
B. eager learning
C. concept learning
D. none of these
Correct option is B

245. Radial basis function networks provide a global

approximation to the target function, represented by ______ of
many local kernel functions.
A. a series combination
B. a linear combination
C. a parallel combination
D. a non linear combination
Correct option is B
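
The "linear combination of local kernels" idea can be written out directly. The sketch assumes Gaussian kernels with given centres and widths; in practice these and the weights would be fitted to data, whereas here they are supplied by hand purely for illustration, and the function name rbf_output is hypothetical.

```python
import math

def rbf_output(x, centers, widths, weights, bias=0.0):
    """Output of an RBF network: bias + sum of weighted Gaussian kernel activations."""
    total = bias
    for c, s, w in zip(centers, widths, weights):
        d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))  # squared distance to the centre
        total += w * math.exp(-d2 / (2 * s ** 2))         # activation falls off with distance
    return total

centers = [(0.0, 0.0), (3.0, 3.0)]
widths = [1.0, 1.0]
weights = [2.0, -1.0]
print(rbf_output((0.0, 0.0), centers, widths, weights))
print(rbf_output((3.0, 3.0), centers, widths, weights))
```

Each hidden unit responds most strongly at its own centre and its activation decays as the input moves away, which is the inverse relationship question 106 refers to.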

246. The most significant phase in a genetic algorithm is

A. Crossover
B. Mutation
C. Selection
D. Fitness function
Correct option is A
247. The crossover operator produces two new offspring
from
A. Two parent strings, by copying selected bits from
each parent
B. One parent strings, by copying selected bits from
selected parent
C. Two parent strings, by copying selected bits from
one parent
D. None of these
Correct option is A

248. Mathematically characterize the evolution over time of

the population within a GA based on the concept of
A. Schema
B. Crossover
C. Don't care
D. Fitness function
Correct option is A

249. In genetic algorithm process of selecting parents which

mate and recombine to create off-springs for the next
generation is known as:
A. Tournament selection
B. Rank selection
C. Fitness sharing
D. Parent selection
Correct option is D

250. Crossover operations are performed in genetic

programming by replacing
A. Randomly chosen sub tree of one parent program
by a sub tree from the other parent program.
B. Randomly chosen root node tree of one parent
program by a sub tree from the other parent
program
C. Randomly chosen root node tree of one parent
program by a root node tree from the other parent
program
D. None of these
Correct option is A


Machine Learning based Multiple choice questions

Carvia Tech | September 10, 2019 | 4 min read | 117,792 views

1. Which of the following is a widely used and effective machine learning

algorithm based on the idea of bagging?
a. Decision Tree
b. Regression
c. Classification
d. Random Forest - answer
2. To find the minimum or the maximum of a function, we set the
gradient to zero because:
a. The value of the gradient at extrema of a function is
always zero - answer
b. Depends on the type of problem
c. Both A and B
d. None of the above
3. The most widely used metrics and tools to assess a classification model
are:
a. Confusion matrix
b. Cost-sensitive accuracy
c. Area under the ROC curve
d. All of the above - answer
4. Which of the following is a good test dataset characteristic?
a. Large enough to yield meaningful results
b. Is representative of the dataset as a whole
c. Both A and B - answer
d. None of the above
5. Which of the following is a disadvantage of decision trees?
a. Factor analysis
b. Decision trees are robust to outliers
c. Decision trees are prone to be overfit - answer
d. None of the above
6. How do you handle missing or corrupted data in a dataset?
a. Drop missing rows or columns
b. Replace missing values with mean/median/mode
c. Assign a unique category to missing values
d. All of the above - answer
7. What is the purpose of performing cross-validation?
a. To assess the predictive performance of the models
b. To judge how the trained model performs outside the sample
on test data
c. Both A and B - answer
8. Why is second order differencing in time series needed?
a. To remove non-stationarity (make the series stationary)
b. To find the maxima or minima at the local point
c. Both A and B - answer
d. None of the above
9. When performing regression or classification, which of the following is
the correct way to preprocess the data?
a. Normalize the data → PCA → training - answer
b. PCA → normalize PCA output → training
c. Normalize the data → PCA → normalize PCA output →
training
d. None of the above
10. Which of the following is an example of feature extraction?
a. Constructing bag of words vector from an email
b. Applying PCA to project large high-dimensional data
c. Removing stopwords in a sentence
d. All of the above - answer
11. What is pca.components_ in Sklearn?
a. Set of all eigen vectors for the projection space - answer
b. Matrix of principal components
c. Result of the multiplication matrix
d. None of the above options
12. Which of the following is true about Naive Bayes ?
a. Assumes that all the features in a dataset are equally
important
b. Assumes that all the features in a dataset are independent
c. Both A and B - answer
d. None of the above options
13. Which of the following statements about regularization is not correct?
a. Using too large a value of lambda can cause your hypothesis
to underfit the data.
b. Using too large a value of lambda can cause your hypothesis
to overfit the data.
c. Using a very large value of lambda cannot hurt the
performance of your hypothesis.
d. None of the above - answer
14. How can you prevent a clustering algorithm from getting stuck in bad
local optima?
a. Set the same seed value for each run
b. Use multiple random initializations - answer
c. Both A and B
d. None of the above
15. Which of the following techniques can be used for normalization in text
mining?
a. Stemming
b. Lemmatization
c. Stop Word Removal
d. Both A and B - answer
16. In which of the following cases will K-means clustering fail to give good
results? 1) Data points with outliers 2) Data points with different
densities 3) Data points with nonconvex shapes
a. 1 and 2
b. 2 and 3
c. 1, 2, and 3 - answer
d. 1 and 3
17. Which of the following is a reasonable way to select the number of
principal components "k"?
a. Choose k to be the smallest value so that at least 99% of
the variance is retained. - answer
b. Choose k to be 99% of m (k = 0.99*m, rounded to the nearest
integer).
c. Choose k to be the largest value so that 99% of the variance is
retained.
d. Use the elbow method
18. You run gradient descent for 15 iterations with α=0.3 and compute
J(θ) after each iteration. You find that the value of J(θ) decreases
quickly and then levels off. Based on this, which of the following
conclusions seems most plausible?
a. Rather than using the current value of α, use a larger value of
α (say α=1.0)
b. Rather than using the current value of α, use a smaller value
of α (say α=0.1)
c. α=0.3 is an effective choice of learning rate - answer
d. None of the above
19. What is a sentence parser typically used for?
a. It is used to parse sentences to check if they are utf-8
compliant.
b. It is used to parse sentences to derive their most likely
syntax tree structures. - answer
c. It is used to parse sentences to assign POS tags to all tokens.
d. It is used to check if sentences can be parsed into meaningful
tokens.
20. Suppose you have trained a logistic regression classifier and, for a
new example x, it outputs the prediction h_θ(x) = 0.2. This means
a. Our estimate for P(y=1 | x)
b. Our estimate for P(y=0 | x) - answer
c. Our estimate for P(y=1 | x)
d. Our estimate for P(y=0 | x)
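
As a worked illustration of question 20, the sketch below assumes the usual convention that the logistic regression output h_θ(x) estimates P(y=1 | x), so that P(y=0 | x) is its complement. The variable names are illustrative only.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

h = 0.2          # prediction h_theta(x) from the question
p_y1 = h         # assumed convention: h_theta(x) estimates P(y = 1 | x)
p_y0 = 1.0 - h   # the two class probabilities sum to one, so P(y = 0 | x) is 0.8

# The linear score (logit) that the sigmoid maps back to h = 0.2.
z = math.log(h / (1.0 - h))
```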

250 MCQ Machine Learning

What is Machine learning?

A. The autonomous acquisition of knowledge through the use of computer programs
B. The autonomous acquisition of knowledge through the use of manual programs
C. The selective acquisition of knowledge through the use of computer programs
D. The selective acquisition of knowledge through the use of manual programs
ANSWER: A
What is true about Machine Learning?
A. Machine Learning (ML) is the field of computer science
B. ML is a type of artificial intelligence that extracts patterns out of raw data by using an
algorithm or method
C. The main focus of ML is to allow computer systems to learn from experience without being
explicitly programmed or requiring human intervention
D. All of the above
ANSWER: D

ML is a field of AI consisting of learning algorithms that?

A. Improve their performance
B. At executing some task
C. Over time with experience
D. All of the above
ANSWER: D

Different learning methods do not include?

A. Memorization
B. Analogy
C. Introduction
D. Deduction
ANSWER: C

Which of the following is a widely used and effective machine learning algorithm based
on the idea of bagging?
A. Decision Tree
B. Random Forest
C. Regression
D. Classification
ANSWER: B

High entropy means that the partitions in classification are

A. pure
B. not pure
C. useful
D. useless
ANSWER: B
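
The link between entropy and purity can be checked numerically. The sketch below is a minimal example, assuming Shannon entropy in bits over class labels; the label values are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

pure = entropy(["yes"] * 8)                  # a single class: zero entropy
mixed = entropy(["yes"] * 4 + ["no"] * 4)    # an even two-class split: one bit
```

A partition containing only one class has the lowest possible entropy, while an evenly mixed partition has the highest entropy for two classes.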

Which of the following are ML methods?

A. Based on human supervision
B. Supervised Learning
C. Semi-reinforcement Learning
D. All of the above
ANSWER: A

In language understanding, the levels of knowledge do not include?

A. Phonological
B. Syntactic
C. Empirical
D. Logical
ANSWER: C

A model of language consists of the categories which does not include ________.
A. System Unit
B. Structural Unit.
C. Data units
D. Empirical units
ANSWER: B

Training a model with all of the data in one single batch is known as?
A. Batch learning
B. Offline learning
C. Both A and B
D. None of the above
ANSWER: C
Which of the following is a disadvantage of decision trees?
A. Factor analysis
B. Decision trees are robust to outliers
C. Decision trees are prone to be overfit
D. None of the above
ANSWER: C

Which of the following statements about regularization is not correct?

A. Using too large a value of lambda can cause your hypothesis to underfit the data.
B. Using too large a value of lambda can cause your hypothesis to overfit the data
C. Using a very large value of lambda cannot hurt the performance of your hypothesis.
D. None of the above
ANSWER: D

Which is FALSE regarding regression?

A. It may be used for interpretation
B. It is used for prediction
C. It discovers causal relationships
D. It relates inputs to outputs
ANSWER: C

Choose the correct option regarding machine learning (ML) and artificial intelligence (AI)
A. ML is a set of techniques that turns a dataset into software
B. AI is software that can emulate the human mind
C. ML is an alternate way of programming intelligent machines
D. All of the above
ANSWER: D

Which of the following is not a factor that affects the performance of the learner system?
A. Good data structures
B. Representation scheme used
C. Training scenario
D. Type of feedback
ANSWER: A

In general, to have a well-defined learning problem, we must identify which of the following
A. The class of tasks
B. The measure of performance to be improved
C. The source of experience
D. All of the above
ANSWER: D

Successful applications of ML
A. Learning to recognize spoken words
B. Learning to drive an autonomous vehicle
C. Learning to classify new astronomical structures
D. All of the above
ANSWER: D

Designing a machine learning approach involves:-

A. Choosing the type of training experience
B. Choosing the target function to be learned
C. Choosing a representation for the target function
D. All of the above
ANSWER: D

Which of the following is not a supervised learning?

A. Naive Bayesian
B. PCA
C. Linear Regression
D. Decision Tree
ANSWER: B

What is Machine Learning? (i) Artificial Intelligence (ii) Deep Learning (iii) Data Statistics
A. Only (i)
B. (i) and (ii)
C. All
D. None
ANSWER: B

What kind of learning algorithm is used for “facial identities or facial expressions”?

A. Prediction
B. Recognition Patterns
C. Generating Patterns
D. Recognizing Anomalies
ANSWER: B

Which of the following is not a type of learning?

A. Unsupervised Learning
B. Supervised Learning
C. Semi-unsupervised Learning
D. Reinforcement Learning
ANSWER: C

Real-Time decisions, Game AI, Learning Tasks, Skill Acquisition, and Robot Navigation are
applications of which of the following
A. Supervised Learning: Classification
B. Reinforcement Learning
C. Unsupervised Learning: Clustering
D. Unsupervised Learning: Regression
ANSWER: B

Targeted marketing, Recommender Systems, and Customer Segmentation are applications in
which of the following
A. Supervised Learning: Classification
B. Unsupervised Learning: Clustering
C. Unsupervised Learning: Regression
D. Reinforcement Learning
ANSWER: B

Fraud Detection, Image Classification, Diagnostic, and Customer Retention are applications
in which of the following
A. Unsupervised Learning: Regression
B. Supervised Learning: Classification
C. Unsupervised Learning: Clustering
D. Reinforcement Learning
ANSWER: B

Which of the following is not a symbolic function in the various function representations of
Machine Learning?
A. Rules in propositional logic
B. Hidden-Markov Models (HMM)
C. Rules in first-order predicate logic
D. Decision Trees
ANSWER: B

Which of the following is not a numerical function in the various function representations of
Machine Learning?
A. Neural Network
B. Support Vector Machines
C. Case-based
D. Linear Regression
ANSWER: C

What is perceptron?
A. A single layer feed-forward neural network with pre-processing
B. A neural network that contains feedback
C. A double layer auto-associative neural network
D. An auto-associative neural network
ANSWER: A
Which of the following is true for neural networks? (i) The training time depends on the size
of the network (ii) Neural networks can be simulated on a conventional computer (iii) Artificial
neurons are identical in operation to biological ones
A. All
B. Only (ii)
C. (i) and (ii)
D. None
ANSWER: C

What are the advantages of neural networks over conventional computers? (i) They have the
ability to learn by example (ii) They are more fault tolerant (iii) They are more suited for real
time operation due to their high computational rates
A. (i) and (ii)
B. (i) and (iii)
C. Only (i)
D. All
ANSWER: D

Which is true for neural networks?

A. Each node computes its weighted input
B. Node could be in excited state or non-excited state
C. It has set of nodes and connections
D. All of the above
ANSWER: D

What is the objective of backpropagation algorithm?

A. To develop learning algorithm for multilayer feedforward neural network, so that network
can be trained to capture the mapping implicitly
B. To develop learning algorithm for multilayer feedforward neural network
C. To develop learning algorithm for single layer feedforward neural network
D. All of the above
ANSWER: A
Which of the following is true?
Single layer associative neural networks do not have the ability to:- (i) Perform pattern
recognition (ii) Find the parity of a picture (iii) Determine whether two or more shapes in
a picture are connected or not
A. (ii) and (iii)
B. Only (ii)
C. All
D. None
ANSWER: A
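
The parity limitation can be illustrated with the smallest parity problem, XOR on two inputs. The sketch below searches a coarse grid of weights for a single threshold unit; the grid and the helper name separable are assumptions made for illustration, and a failed grid search is only suggestive, not a proof that no weights exist.

```python
from itertools import product

def separable(truth_table, grid):
    """True if some threshold unit w1*x1 + w2*x2 + b > 0 reproduces the table."""
    for w1, w2, b in product(grid, repeat=3):
        if all((w1 * x1 + w2 * x2 + b > 0) == bool(y)
               for (x1, x2), y in truth_table.items()):
            return True
    return False

grid = [i / 2 for i in range(-8, 9)]   # candidate weights from -4.0 to 4.0 in steps of 0.5
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}   # two-input parity

and_ok = separable(AND, grid)   # a single unit can represent AND
xor_ok = separable(XOR, grid)   # no unit on this grid represents XOR
```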

The backpropagation law is also known as generalized delta rule

A. True
B. False
C. Depends on data
D. Not necessary
ANSWER: A

Which of the following is true? (i) On average, neural networks have higher computational
rates than conventional computers. (ii) Neural networks learn by example. (iii) Neural networks
mimic the way the human brain works.
A. All
B. (ii) and (iii)
C. (i), (ii) and (iii)
D. None
ANSWER: A

What is true regarding backpropagation rule?

A. Error in output is propagated backwards only to determine weight updates
B. There is no feedback of signal at any stage
C. It is also called generalized delta rule
D. All of the above
ANSWER: D
There is feedback in final stage of backpropagation
A. True
B. False
C. Depends on data
D. Not necessary
ANSWER: B

An auto-associative network is
A. A neural network that has only one loop
B. A neural network that contains feedback
C. A single layer feed-forward neural network with pre-processing
D. A neural network that contains no loops
ANSWER: B

Which of the following is true regarding backpropagation rule?

A. The output of hidden layers is not at all important; they are only meant to support the input
and output layers
B. Actual output is determined by computing the outputs of units for each hidden layer
C. It is a feedback neural network
D. None of the above
ANSWER: B

What is back propagation?

A. It is another name given to the curvy function in the perceptron
B. It is the transmission of error back through the network to allow weights to be adjusted so
that the network can learn
C. It is another name given to the curvy function in the perceptron
D. None of the above
ANSWER: B

The general limitations of back propagation rule is/are

A. Scaling
B. Slow convergence
C. Local minima problem
D. All of the above
ANSWER: D

What is the meaning of “generalized” in the statement “backpropagation is a generalized delta rule”?

A. Because delta is applied to only input and output layers, thus making it more simple and
generalized
B. It has no significance
C. Because delta rule can be extended to hidden layer units
D. None of the above
ANSWER: C

Neural networks are complex ______ functions with many parameters

A. Linear
B. Non linear
C. Discrete
D. Exponential
ANSWER: B

The general tasks that are performed with backpropagation algorithm

A. Pattern mapping
B. Prediction
C. Function approximation
D. All of the above
ANSWER: D

Backpropagation learning is based on the gradient descent along error surface.

A. True
B. False
C. Depends on data
D. Not necessary
ANSWER: A
In backpropagation rule, how to stop the learning process?
A. No heuristic criteria exist
B. On basis of average gradient value
C. There is convergence involved
D. None of these
ANSWER: B
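
Stopping on the basis of the average gradient value can be sketched with plain gradient descent on a single linear unit. This is a simplified stand-in for full backpropagation, under assumed values for the learning rate (lr), tolerance (tol) and training data; none of these come from the question bank.

```python
def train(xs, ys, lr=0.05, tol=1e-6, max_epochs=10_000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for epoch in range(max_epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Stop once the average gradient magnitude is close to zero.
        if (abs(grad_w) + abs(grad_b)) / 2 < tol:
            break
        w -= lr * grad_w   # step downhill along the error surface
        b -= lr * grad_b
    return w, b, epoch

# Hypothetical data generated from y = 2x + 1.
w, b, epochs = train([0, 1, 2, 3], [1, 3, 5, 7])
```

The loop ends well before max_epochs on this data because the gradient shrinks as the weights approach the minimum of the error surface.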

Applications of NN (Neural Network)

A. Risk management
B. Data validation
C. Sales forecasting
D. All of the above
ANSWER: D

The network that involves backward links from output to the input and hidden layers is
known as
A. Recurrent neural network
B. Self organizing maps
C. Perceptrons
D. Single layered perceptron
ANSWER: A

Decision Tree is a display of an Algorithm?

A. True
B. False
C. Depends on data
D. Not necessary
ANSWER: A

Which of the following is the consequence between a node and its predecessors while
creating bayesian network?
A. Conditionally independent
B. Functionally dependent
C. Both Conditionally dependent & Dependent
D. Dependent
ANSWER: A

What is needed to make probabilistic systems feasible in the world?

A. Feasibility
B. Reliability
C. Crucial robustness
D. None of the above
ANSWER: C

Bayes rule can be used for:-

A. Solving queries
B. Increasing complexity
C. Answering probabilistic query
D. Decreasing complexity
ANSWER: C
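
A probabilistic query of the kind Bayes' rule answers can be worked through in a few lines. The numbers below (a 1% prior, a 90% likelihood and a 5% false positive rate) are invented purely for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) from P(H), P(E | H) and P(E | not H) via Bayes' rule."""
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical query: probability of the hypothesis given positive evidence.
p = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
```

With these assumed numbers the posterior stays below 16% even though the evidence is fairly reliable, because the prior is so small.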

______ provides ways and means of weighing up the desirability of goals and the likelihood of
achieving them
A. Utility theory
B. Decision theory
C. Bayesian networks
D. Probability theory
ANSWER: A

Which of the following is provided by the Bayesian Network?

A. Complete description of the problem
B. Partial description of the domain
C. Complete description of the domain
D. All of the above
ANSWER: C

The bayesian network can be used to answer any query by using:-

A. Full distribution
B. Joint distribution
C. Partial distribution
D. All of the above
ANSWER: B

Bayesian networks allow compact specification of:-

A. Joint probability distributions
B. Belief
C. Propositional logic statements
D. All of the above
ANSWER: A
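
The compact specification of a joint distribution can be sketched with a three-node network in which Rain and Sprinkler are independent parents of WetGrass. The structure and all probabilities below are assumptions chosen for illustration.

```python
from itertools import product

# Assumed local probability tables: 1 + 1 + 4 = 6 numbers
# instead of the 7 needed for an unstructured joint over three binary variables.
p_rain = 0.2
p_sprinkler = 0.1
p_wet = {(True, True): 0.99, (True, False): 0.9,
         (False, True): 0.8, (False, False): 0.05}

def joint(rain, sprinkler, wet):
    """P(rain, sprinkler, wet) as the product of each node given its parents."""
    pr = p_rain if rain else 1 - p_rain
    ps = p_sprinkler if sprinkler else 1 - p_sprinkler
    pw = p_wet[(rain, sprinkler)] if wet else 1 - p_wet[(rain, sprinkler)]
    return pr * ps * pw

# Any query can be answered by summing entries of the joint distribution.
total = sum(joint(r, s, w) for r, s, w in product([True, False], repeat=3))
p_wet_marginal = sum(joint(r, s, True) for r, s in product([True, False], repeat=2))
```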

The compactness of the bayesian network can be described by

A. Fully structured
B. Locally structured
C. Partially structured
D. All of the above
ANSWER: B

For the analysis of ML algorithms, we need

A. Computational learning theory
B. Statistical learning theory
C. Both A & B
D. None of these
ANSWER: C

PAC stands for

A. Probably Approximately Correct
B. Probably Approx Correct
C. Probably Approximate Computation
D. Probably Approx Computation
ANSWER: A

Genetic algorithm is a
A. Search technique used in computing to find true or approximate solution to optimization
and search problem
B. Sorting technique used in computing to find true or approximate solution to optimization
and sort problem
C. Both A & B
D. None of these
ANSWER: A

______ emphasizes learning feedback that evaluates the learner’s performance without
providing standards of correctness in the form of behavioural targets
A. Reinforcement learning
B. Supervised Learning
C. Unsupervised Learning
D. None of these
ANSWER: A

Features of Reinforcement learning

A. Set of problem rather than set of techniques
B. RL is training by reward and punishment
C. RL is learning from trial and error with the world
D. All of these
ANSWER: D

Feature of ANN in which ANN creates its own organization or representation of information
it receives during learning time is
A. Adaptive Learning
B. Self Organization
C. What-If Analysis
D. Supervised Learning
ANSWER: B

Learning that enables massive quantities of data is known as

A. Artificial Intelligence
B. Machine Learning
C. Deep learning
D. All of these
ANSWER: B

Learning in which a model learns from the rewards it received for its previous actions is known as:
A. Supervised learning
B. Unsupervised learning
C. Reinforcement learning
D. Concept learning
ANSWER: C

What is Machine Learning

A. The autonomous acquisition of knowledge through the use of manual programs
B. The selective acquisition of knowledge through the use of computer programs
C. The selective acquisition of knowledge through the use of manual programs
D. The autonomous acquisition of knowledge through the use of computer programs
ANSWER: D

Father of Machine Learning

A. Geoffrey Chaucer
B. Geoffrey Hill
C. Geoffrey Everest Hinton
D. None of the above
ANSWER: C

Successful applications of ML
A. Learning to recognize spoken words
B. Learning to drive an autonomous vehicle
C. Learning to classify new astronomical structures
D. Learning to play world-class backgammon
E. All of the above
ANSWER: E

Which of the following is not one of the different learning methods?

A. Analogy
B. Introduction
C. Memorization
D. Deduction
ANSWER: B

In language understanding, which of the following is not one of the levels of knowledge?

A. Empirical
B. Logical
C. Phonological
D. Syntactic
ANSWER: A

Concept learning infers a ______-valued function from training examples of its input and output.
A. Decimal
B. Hexadecimal
C. Boolean
D. All of the above
ANSWER: C

The FIND-S algorithm starts from the most specific hypothesis and generalizes it by considering
only ______ examples
A. Negative
B. Positive
C. Negative or Positive
D. None of the above
ANSWER: B

The FIND-S algorithm ignores ______ examples

A. Negative
B. Positive
C. Both
D. None of the above
ANSWER: A
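
The behaviour described in the two FIND-S questions above can be sketched as follows. This is a minimal version, assuming attribute-value examples and "?" as the wildcard meaning any value; the weather-style data set is made up for illustration.

```python
def find_s(examples):
    """examples: list of (attribute_tuple, label) pairs; label True means positive."""
    hypothesis = None                      # most specific hypothesis: matches nothing
    for attributes, label in examples:
        if not label:
            continue                       # negative examples are ignored
        if hypothesis is None:
            hypothesis = list(attributes)  # the first positive example is copied as-is
        else:
            # Generalize only the attributes that disagree with this positive example.
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis

data = [
    (("Sunny", "Warm", "Normal", "Strong"), True),
    (("Sunny", "Warm", "High", "Strong"), True),
    (("Rainy", "Cold", "High", "Strong"), False),
    (("Sunny", "Warm", "High", "Weak"), True),
]
h = find_s(data)   # -> ["Sunny", "Warm", "?", "?"]
```

Because the negative example is skipped, the result is the same whether or not it is present, which is also why FIND-S cannot detect an inconsistent training set.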

The Candidate-Elimination algorithm represents the ______.

A. Solution Space
B. Version Space
C. Elimination Space
D. All of the above
ANSWER: B

Inductive learning is based on the knowledge that if something happens a lot it is likely to be
generally true
A. True
B. False
ANSWER: A

Inductive learning takes examples and generalizes rather than starting with ______ knowledge
A. Inductive
B. Existing
C. Deductive
D. None of these
ANSWER: B

A drawback of FIND-S is that it assumes the training set is consistent
A. True
B. False
ANSWER: A

What strategies can help reduce overfitting in decision trees? (i) Enforce a maximum depth
for the tree (ii) Enforce a minimum number of samples in leaf nodes (iii) Pruning
(iv) Make sure each leaf node is one pure class
A. All
B. (i), (ii) and (iii)
C. (i), (iii), (iv)
D. None
ANSWER: B

To find the minimum or the maximum of a function, we set the gradient to zero because:
A. Depends on the type of problem
B. The value of the gradient at extrema of a function is always zero
C. Both (A) and (B)
D. None of these
ANSWER: B

Which of the following is a disadvantage of decision trees?

A. Decision trees are prone to be overfit
B. Decision trees are robust to outliers
C. Factor analysis
D. None of the above
ANSWER: A

What are the advantages of neural networks over conventional computers? (I) They have the
ability to learn by example (II) They are more fault tolerant (III) They are more suited for real
time operation due to their high computational rates
A. (i) and (ii)
B. (i) and (iii)
C. Only (i)
D. All
E. None
ANSWER: D

What is Neuro software?

A. It is software used by Neurosurgeon
B. Designed to aid experts in real world
C. It is powerful and easy neural network
D. A software used to analyze neurons
ANSWER: C

The backpropagation law is also known as generalized delta rule

A. True
B. False
ANSWER: A

There is feedback in final stage of backpropagation

A. True
B. False
ANSWER: B

A 3-input neuron has weights 1, 4 and 3. The transfer function is linear with the constant of
proportionality being equal to 3. The inputs are 4, 8 and 5 respectively. What will be the
output?
A. 139
B. 153
C. 162
D. 160
ANSWER: B
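
The arithmetic behind this answer can be checked directly. The sketch assumes the neuron has no bias term, as the question mentions none, and that the linear transfer function simply multiplies the weighted sum by the constant of proportionality.

```python
weights = [1, 4, 3]
inputs = [4, 8, 5]
k = 3  # constant of proportionality of the linear transfer function

weighted_sum = sum(w * x for w, x in zip(weights, inputs))  # 1*4 + 4*8 + 3*5 = 51
output = k * weighted_sum                                    # 3 * 51 = 153
```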

Backpropagation learning is based on the gradient descent along error surface.

A. True
B. False
ANSWER: A
