Machine Learning Unit 2 MCQ
c) to develop a learning algorithm for multilayer feedforward neural networks, so that the network can be trained to capture the mapping implicitly
Answer: c
a) yes
b) no
Answer: a
Answer: d
a) yes
b) no
Answer: b
b) actual output is determined by computing the outputs of the units in each hidden layer
c) hidden layers' outputs are not important; they only serve to support the input and output layers
Answer: b
b) because delta is applied to only input and output layers, thus making it simpler and more generalized
c) it has no significance
Answer: a
Explanation: The term "generalized" is used because the delta rule can be extended to hidden layer units.
b) slow convergence
c) scaling
8. What are the general tasks that are performed with the backpropagation algorithm?
a) pattern mapping
b) function approximation
c) prediction
d) all of the mentioned
Answer: d
Explanation: All of these are tasks that can be performed with the backpropagation algorithm in general.
a) yes
b) no
c) cannot be said
Answer: a
Answer: c
Explanation: If the average gradient value falls below a preset threshold value, the process may be stopped.
# Decision Trees
11. A _________ is a decision support tool that uses a tree-like graph or model of
decisions and their possible consequences, including chance event outcomes,
resource costs, and utility.
a) Decision tree
b) Graphs
c) Trees
d) Neural Networks
Answer: a
a) True
b) False
Answer: a
Explanation: None.
a) Flow-Chart
Answer: c
b) False
Answer: a
Explanation: None.
15. Which of the following are Decision Tree nodes?
a) Decision Nodes
b) End Nodes
c) Chance Nodes
d) All of the mentioned
Answer: d
Explanation: None.
16. Decision Nodes are represented by _________
a) Disks
b) Squares
c) Circles
d) Triangles
Answer: b
17. Chance Nodes are represented by _________
a) Disks
b) Squares
c) Circles
d) Triangles
Answer: c
Explanation: None.
18. End Nodes are represented by _________
a) Disks
b) Squares
c) Circles
d) Triangles
Answer: d
Explanation: None.
c) Worst, best and expected values can be determined for different scenarios
Answer: d
Explanation: None.
20. Which of the following statement(s) is/are true for Gradient Descent (GD) and Stochastic Gradient Descent (SGD)?
1. In GD and SGD, you update a set of parameters in an iterative manner to minimize the error function.
2. In SGD, you have to run through all the samples in your training set for a single update of a parameter in each iteration.
3. In GD, you either use the entire data or a subset of training data to update a parameter in each iteration.
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 2 and 3
F) 1,2 and 3
Solution: (A)
In SGD, each iteration uses a batch that generally contains a random sample of the data, whereas in GD each iteration uses all of the training observations.
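A minimal Python sketch of the contrast (illustrative only; it assumes a simple linear model with squared error, and the function names gd_step/sgd_step are hypothetical):

```python
import numpy as np

def grad(w, X, y):
    # Gradient of mean squared error for a linear model y_hat = X @ w
    return 2 * X.T @ (X @ w - y) / len(y)

def gd_step(w, X, y, lr=0.1):
    # GD: one parameter update uses the entire training set
    return w - lr * grad(w, X, y)

def sgd_step(w, X, y, lr=0.1):
    # SGD: one parameter update uses a single randomly chosen sample
    i = np.random.randint(len(y))
    return w - lr * grad(w, X[i:i + 1], y[i:i + 1])
```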
21. Below are 8 actual values of the target variable in the training file.
[0,0,0,1,1,1,1,1]
What is the entropy of the target variable?
A) -(5/8 log(5/8) + 3/8 log(3/8))
B) 5/8 log(5/8) + 3/8 log(3/8)
C) 3/8 log(5/8) + 5/8 log(3/8)
D) 5/8 log(3/8) – 3/8 log(5/8)
Solution: (A). The formula for entropy is H = -Σᵢ pᵢ log(pᵢ). Here p(1) = 5/8 and p(0) = 3/8, so the entropy is -(5/8 log(5/8) + 3/8 log(3/8)), and the answer is A.
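This can be checked numerically with a short Python sketch (illustrative, not part of the original solution):

```python
import math
from collections import Counter

def entropy(values):
    # H = -sum(p * log2(p)) over the class proportions
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

print(entropy([0, 0, 0, 1, 1, 1, 1, 1]))  # ~0.954 bits
```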
# ANN
22. A 3-input neuron is trained to output a zero when the input is 110 and a one
when the input is 111. After generalization, the output will be zero when and only
when the input is?
a) 000 or 110 or 011 or 101
b) 010 or 100 or 110 or 101
c) 000 or 010 or 110 or 100
d) 100 or 111 or 101 or 001
Answer: c
Explanation: The truth table before generalization is:
Inputs  Output
000     ?
001     ?
010     ?
011     ?
100     ?
101     ?
110     0
111     1
(? denotes an output not specified by the training data.)
The truth table after generalization is:
Inputs  Output
000     0
001     1
010     0
011     1
100     0
101     1
110     0
111     1
After generalization the network outputs the value of the third input bit, so the output is zero when and only when the input ends in 0: 000, 010, 110 or 100.
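A quick Python check (illustrative) of the generalized table:

```python
table = {
    "000": 0, "001": 1, "010": 0, "011": 1,
    "100": 0, "101": 1, "110": 0, "111": 1,
}
# The generalized mapping is simply "output = third input bit",
# so the output is 0 exactly for 000, 010, 100 and 110 (answer c).
assert all(out == int(bits[2]) for bits, out in table.items())
```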
Answer: a
Explanation: The perceptron is a single layer feed-forward neural network. It is not
an auto-associative network because it has no feedback and is not a multiple layer
neural network because the pre-processing stage is not made of neurons.
Answer: b
Explanation: An auto-associative network is equivalent to a neural network that
contains feedback. The number of feedback paths (loops) does not have to be one.
25. A 4-input neuron has weights 1, 2, 3 and 4. The transfer function is linear with
the constant of proportionality being equal to 2. The inputs are 4, 10, 5 and 20
respectively. What will be the output?
a) 238
b) 76
c) 119
d) 123
Answer: a
Explanation: The output is found by multiplying each weight by its corresponding input, summing the results, and applying the linear transfer function (multiplying by the constant of proportionality, 2). Therefore:
Output = 2 * (1*4 + 2*10 + 3*5 + 4*20) = 238.
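The arithmetic can be reproduced in Python (a sketch, not part of the original explanation):

```python
weights = [1, 2, 3, 4]
inputs = [4, 10, 5, 20]
k = 2  # constant of proportionality of the linear transfer function
output = k * sum(w * x for w, x in zip(weights, inputs))
print(output)  # 238
```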
Answer: a
Explanation: Neural networks have higher computational rates than conventional computers because much of the operation is done in parallel. That is not the case when the neural network is simulated on a computer. The idea behind neural nets is based on the way the human brain works. Neural nets cannot be programmed; they can only learn from examples.
Answer: c
Explanation: The training time depends on the size of the network: the greater the number of neurons, the greater the number of possible 'states'. Neural networks can be simulated on a conventional computer, but the main advantage of neural networks (parallel execution) is lost. Artificial neurons are not identical in operation to biological ones.
28. What are the advantages of neural networks over conventional computers?
(i) They have the ability to learn by example
(ii) They are more fault tolerant
(iii) They are more suited for real time operation due to their high 'computational' rates
a) (i) and (ii) are true
b) (i) and (iii) are true
c) Only (i)
d) All of the mentioned
Answer: d
Explanation: Neural networks learn by example. They are more fault tolerant
because they are always able to respond and small changes in input do not
normally cause a change in output. Because of their parallel architecture, high
computational rates are achieved.
Answer: a
Explanation: Pattern recognition is what single-layer neural networks are best at, but they do not have the ability to find the parity of a picture or to determine whether two shapes are connected.
Answer: d
Explanation: All mentioned are the characteristics of neural network.
Answer: b
Explanation: None.
Answer: d
Explanation: None.
Answer: c
Explanation: Back propagation is the transmission of error back through the
network to allow weights to be adjusted so that the network can learn.
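As a toy illustration of the idea (a minimal sketch assuming a single sigmoid neuron with squared error, not the full algorithm):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# One weight update on a single sigmoid neuron with squared error.
w, x, target, lr = 0.5, 1.0, 1.0, 0.1
out = sigmoid(w * x)
# The error is sent back through the network:
# dE/dw = (out - target) * out * (1 - out) * x
grad = (out - target) * out * (1 - out) * x
w -= lr * grad  # the weight is adjusted so the network can learn
```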
34. Why are linearly separable problems of interest to neural network researchers?
a) Because they are the only class of problem that network can solve successfully
b) Because they are the only class of problem that Perceptron can solve
successfully
c) Because they are the only mathematical functions that are continuous
d) Because they are the only mathematical functions you can draw
Answer: b
Explanation: Linearly separable problems interest neural network researchers because they are the only class of problem that the Perceptron can solve successfully.
35. Which of the following is not the promise of artificial neural network?
a) It can explain result
b) It can survive the failure of some nodes
c) It has inherent parallelism
d) It can handle noise
Answer: a
Explanation: An artificial neural network (ANN) cannot explain its results.
Answer: a
Explanation: Neural networks are complex linear functions with many parameters.
37. A perceptron adds up all the weighted inputs it receives, and if it exceeds a
certain value, it outputs a 1, otherwise it just outputs a 0.
a) True
b) False
c) Sometimes – it can also output intermediate values as well
d) Can’t say
Answer: a
Explanation: Yes, the perceptron works exactly like that.
38. What is the name of the function in the following statement “A perceptron adds
up all the weighted inputs it receives, and if it exceeds a certain value, it outputs a
1, otherwise it just outputs a 0”?
a) Step function
b) Heaviside function
c) Logistic function
d) Perceptron function
Answer: b
Explanation: Also known as the step function, so option (a) is also acceptable. It is a hard thresholding function, either on or off with no in-between.
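A minimal Python sketch of this behaviour (illustrative; the function names are hypothetical):

```python
def heaviside(z, threshold=0.0):
    # Hard threshold: 1 if the weighted sum exceeds the threshold, else 0
    return 1 if z > threshold else 0

def perceptron(weights, inputs, threshold=0.0):
    # Add up all the weighted inputs, then apply the step function
    return heaviside(sum(w * x for w, x in zip(weights, inputs)), threshold)

print(perceptron([0.5, 0.5], [1, 1], threshold=0.7))  # 1
print(perceptron([0.5, 0.5], [1, 0], threshold=0.7))  # 0
```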
39. Having multiple perceptrons can actually solve the XOR problem satisfactorily:
this is because each perceptron can partition off a linear part of the space itself,
and they can then combine their results.
a) True – this works always, and these multiple perceptrons learn to classify even
complex problems
b) False – perceptrons are mathematically incapable of solving linearly inseparable
functions, no matter what you do
c) True – perceptrons can do this but are unable to learn to do it – they have to be
explicitly hand-coded
d) False – just having a single perceptron is enough
Answer: c
Explanation: Perceptrons arranged in layers can each partition off a linear part of the space and combine the partitions, but the perceptron learning rule cannot train the hidden units, so such a network has to be hand-coded, as sketched below.
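A hand-coded two-layer arrangement of perceptrons that computes XOR (illustrative; the weights are fixed by hand, not learned):

```python
def step(z):
    return 1 if z > 0 else 0

def xor(x1, x2):
    # Hidden layer: each perceptron partitions off a linear part of the space
    or_gate = step(x1 + x2 - 0.5)    # x1 OR x2
    nand_gate = step(1.5 - x1 - x2)  # NOT (x1 AND x2)
    # Output perceptron combines the two partitions: OR AND NAND = XOR
    return step(or_gate + nand_gate - 1.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor(a, b))  # last column prints 0, 1, 1, 0
```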
40. The network that involves backward links from output to the input and hidden
layers is called _________
a) Self organizing maps
b) Perceptrons
c) Recurrent neural network
d) Multi layered perceptron
Answer: c
Explanation: RNN (Recurrent neural network) topology involves backward links
from output to the input and hidden layers.
Answer: d
Explanation: All of the mentioned options are applications of neural networks.
42. Internal nodes of a decision tree correspond to:
A. Attributes
B. Classes
C. Data instances
D. None of the above
Accepted Answers:
A. Attributes
43. Leaf nodes of a decision tree correspond to:
A. Attributes
B. Classes
C. Data instances
D. None of the above
Accepted Answers:
B. Classes
44. Which of the following criteria is not used to decide which attribute to split next in a
decision tree:
A. Gini index
B. Information gain
C. Entropy
D. Scatter
Accepted Answers:
D. Scatter
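For reference, a short Python sketch (illustrative) of how information gain scores a candidate split using entropy:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    # Gain = parent entropy minus the size-weighted entropy of the children
    n = len(parent)
    return entropy(parent) - sum(len(ch) / n * entropy(ch) for ch in children)

# A split that produces pure children earns the maximum gain of 1 bit.
print(information_gain([0, 0, 1, 1], [[0, 0], [1, 1]]))  # 1.0
```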
45. Which of the following is a valid logical rule for the decision tree below?
Accepted Answers:
B. IF Business Appointment = Yes & Temp above 70 = Yes THEN Decision = wear shorts
47. For parts (a)-(e), consider the following small data table for two classes of woods. Using information gain, construct a decision tree to classify the data set, and answer the following questions for the resulting tree.
47.(a) Which attribute would information gain choose as the root of the tree?
A. Density
B. Grain
C. Hardness
Accepted Answers:
C. Hardness
47.(b) What class does the tree infer for the example {Density=Light, Grain=Small, Hardness=Hard}?
A. Oak
B. Pine
C. The example cannot be classified
D. Both classes are equally likely
Accepted Answers:
B. Pine
47.(c) What class does the tree infer for the example {Density=Light, Grain=Small, Hardness=Soft}?
A. Oak
B. Pine
C. The example cannot be classified
D. Both classes are equally likely
Accepted Answers:
A. Oak
47.(d) What class does the tree infer for the example {Density=Heavy, Grain=Small, Hardness=Soft}?
A. Oak
B. Pine
C. The example cannot be classified
D. Both classes are equally likely
Accepted Answers:
B. Pine
47.(e) What class does the tree infer for the example {Density=Heavy, Grain=Small, Hardness=Hard}?
A. Oak
B. Pine
C. The example cannot be classified
D. Both classes are equally likely
Accepted Answers:
A. Oak