ITML MID-2 Bits

Uploaded by

vardhanvalluri5

Introduction to Machine Learning

MULTIPLE CHOICE

1. The range of sigmoid function is


a. 0 to 1 b. -1 to 1 c. 0 to ∞ d. -∞ to ∞
ANS: A PTS: 1

2. The range of tanh activation function is


a. 0 to 1 b. -1 to 1 c. 0 to ∞ d. -∞ to ∞
ANS: B PTS: 1

3. The range of ReLU activation function is


a. 0 to 1 b. -1 to 1 c. 0 to ∞ d. -∞ to ∞
ANS: C PTS: 1

4. The range of leaky ReLU activation function is


a. 0 to 1 b. -1 to 1 c. 0 to ∞ d. -∞ to ∞
ANS: D PTS: 1
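
The ranges asked about in questions 1 to 4 can be checked numerically. The following is a minimal sketch using only the standard library; the slope `alpha = 0.01` for leaky ReLU is an assumed illustrative value, not one stated in the questions.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic sigmoid g(z) = 1 / (1 + e^-z); output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z: float) -> float:
    """Hyperbolic tangent; output lies in (-1, 1) and is zero-centered."""
    return math.tanh(z)

def relu(z: float) -> float:
    """ReLU; negative inputs become 0, so output lies in [0, inf)."""
    return max(0.0, z)

def leaky_relu(z: float, alpha: float = 0.01) -> float:
    """Leaky ReLU; alpha is an assumed small slope applied to negative inputs."""
    return z if z > 0 else alpha * z

samples = [-5.0, -1.0, 0.0, 1.0, 5.0]
for name, fn in [("sigmoid", sigmoid), ("tanh", tanh),
                 ("relu", relu), ("leaky_relu", leaky_relu)]:
    # Print each activation evaluated on the same sample inputs.
    print(name, [round(fn(z), 4) for z in samples])
```

Only leaky ReLU produces both negative and unbounded positive outputs, which is why its range is the whole real line.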

5. What is the purpose of an activation function?


a. generates non-linear output b. generates linear output c. both a and b d. none of the above
ANS: A PTS: 1

6. Which activation function is zero-centered?


a. Sigmoid b. Tanh c. Relu d. Leaky Relu
ANS: B PTS: 1

7. ReLU activation sets ____ values of the input to zero.


a. Positive b. Negative c. both d. Fractional values
ANS: B PTS: 1

8. The ____ activation function does not suffer from vanishing gradients.


a. Sigmoid b. Tanh c. ReLu d. leaky ReLu
ANS: D PTS: 1

9. The representation of sigmoid function is


a. g(z)=1/(1+e^-z) b. g(z)=1/(1-e^-z) c. g(z)=(1+e^-z) d. g(z)=(1-e^-z)

ANS: A PTS: 1
10. Softmax activation function is used for
a. Binary classification b. Multiclass classification c. Regression d. for all

ANS: B PTS: 1

11. Sigmoid activation function is used for


a. Binary classification b. Multiclass classification c. Regression d. for all

ANS: A PTS: 1

12. Softmax is used at ___________


a. Hidden layer b. Second hidden layer c. a and b d. Output layer
ANS: D PTS: 1
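
As a rough illustration of why softmax suits a multiclass output layer, the sketch below turns a list of raw scores into probabilities that sum to 1. The input scores are made-up values, and subtracting the maximum is a common numerical-stability step rather than part of the definition.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Three illustrative class scores from a hypothetical output layer.
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

The largest score receives the largest probability, so the predicted class is the index of the maximum.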

13. The dendrites of a neuron in the human brain are equivalent to ___________


a. Input b. Processor c. Link d. Output

ANS: A PTS: 1

14. The axon of a neuron in the human brain is equivalent to ___________


a. Input b. Processor c. Link d. Output

ANS: D PTS: 1

15. A typical human brain contains ___________ neurons


a. 1 billion b. 10 billion c. 1 million d. 10 million

ANS: B PTS: 1

16. Which of the following layers may be more than one in number?
a. Input b. Output c. Hidden d. Initial

ANS: C PTS: 1

17. Back propagation is used in _________ phase of the model.


a. Testing b. Validation c. Training d. a and b

ANS: C PTS: 1

18. The role of Back propagation is _________


a. reduce error b. updating parameters c. calculating error d. finding hypothesis
ANS: B PTS: 1

19. The role of forward propagation is _________


a. reduce error b. updating parameters c. calculating error d. finding hypothesis
ANS: D PTS: 1

20. The function of Neuron is


a. Weighted sum calculation b. Introducing linearity c. Applying Activation d. a and c

ANS: D PTS: 1
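
The two jobs of a neuron named in question 20 (weighted sum, then activation) can be sketched as follows. The weights, bias, and the choice of sigmoid as the activation are illustrative assumptions.

```python
import math

def neuron(inputs, weights, bias):
    """Single neuron: weighted sum of the inputs, then a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias  # weighted sum
    return 1.0 / (1.0 + math.exp(-z))                       # activation

# Multiple inputs, one output: here z = 0.5*1.0 + (-0.25)*2.0 + 0.0 = 0.0
out = neuron([1.0, 2.0], [0.5, -0.25], 0.0)
print(out)
```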

21. The function of Activation function is


a. Weighted sum calculation b. Introducing non-linearity c. Applying Activation d. Updating weights

ANS: B PTS: 1

22. The function of gradient descent is


a. Weighted sum calculation b. Introducing non-linearity c. Applying Activation d. Updating weights

ANS: D PTS: 1
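
The weight-update role of gradient descent can be sketched on a one-parameter problem. The loss f(w) = (w - 3)^2, the learning rate, and the step count are all assumptions chosen for illustration.

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly move the weight against the gradient of the loss."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)  # the weight-update rule
    return w

# Gradient of the illustrative loss f(w) = (w - 3)^2 is 2 * (w - 3).
w_final = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_final)
```

In a neural network the same update is applied to every weight, with backpropagation supplying the gradients.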

23. A Neuron has


a. Multiple inputs b. Multiple outputs c. Multiple biases d. a and b

ANS: A PTS: 1

24. A shallow neural network has


a. Multiple hidden layers b. Single hidden layer c. Zero hidden layers d. None of the above

ANS: B PTS: 1

25. An MLP has


a. Zero hidden layers b. Single hidden layer c. Multiple hidden layers d. None of the above

ANS: C PTS: 1

26. Normally Vanishing gradient problem occurs at ____________ of input.


a. Larger Positive values b. Smaller Positive values c. Larger negative values d. a and b

ANS: A PTS: 1

27. Normally Vanishing gradient problem occurs at ____________ of input.


a. Smaller Positive values b. Smaller negative values c. Larger negative values d. a and b

ANS: B PTS: 1

28. Overfitting of a model means


a. Bias is low and Variance is low.
b. Bias is low and Variance is low.
c. Bias is low and Variance is high.
d. Bias is high and Variance is low.

ANS: C PTS: 1

29. Underfitting of a model means


a. Bias is low and Variance is low.
b. Bias is low and Variance is low.
c. Bias is low and Variance is high.
d. Bias is high and Variance is high.

ANS: D PTS: 1

30. Overfitting of a model means


a. Training accuracy is low and Testing Accuracy is high.
b. Training accuracy is high and Testing Accuracy is low.
c. Training accuracy is low and Testing Accuracy is low.
d. Training accuracy is high and Testing Accuracy is high.

ANS: B PTS: 1

31. Underfitting of a model means


a. Training accuracy is low and Testing Accuracy is high.
b. Training accuracy is high and Testing Accuracy is low.
c. Training accuracy is low and Testing Accuracy is low.
d. Training accuracy is high and Testing Accuracy is high.

ANS: C PTS: 1

32. Vanishing gradient problem occurs if


a. gradient is too high b. gradient is too small c. gradient is infinity d. gradient is too negative

ANS: B PTS: 1
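
The vanishing-gradient questions above refer to saturation of activations such as the sigmoid. A minimal sketch, assuming a sigmoid activation, shows the derivative shrinking toward zero as the input moves far from zero in either direction.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid: g'(z) = g(z) * (1 - g(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

for z in [0.0, 2.0, 5.0, 10.0, -10.0]:
    # The gradient peaks at z = 0 and becomes tiny for large |z|.
    print(z, sigmoid_grad(z))
```

Because backpropagation multiplies such factors layer by layer, many small derivatives in a row make the overall gradient too small to update early layers.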

33. __________ layer does not perform any computation in Neural Network.
a. Hidden b. Output c. Input d. b and c
ANS: C PTS: 1

34. __________ layers perform computation in Neural Network.


a. Hidden b. Output c. Input d. a and b

ANS: D PTS: 1

35. Which of the following is a goal of clustering algorithms?


a. Classification b. Regression c. Grouping similar data points together d. Dimensionality reduction

ANS: C PTS: 1

36. Which node has only outgoing branches in decision tree?


a. Decision b. Hidden c. Root d. Leaf

ANS: C PTS: 1

37. Which node has only incoming branches in decision tree?


a. Decision b. Hidden c. Root d. Leaf

ANS: D PTS: 1

38. Which node has both incoming and outgoing branches in decision tree?
a. Decision b. hidden c. Root d. Leaf

ANS: A PTS: 1
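
Questions 36 to 38 distinguish node types by their incoming and outgoing branches. The sketch below applies that rule to a small hypothetical tree; the node names are invented for illustration.

```python
def classify_nodes(children):
    """Label each node as root, decision, or leaf from its branches."""
    all_children = {c for kids in children.values() for c in kids}
    nodes = set(children) | all_children
    kinds = {}
    for node in nodes:
        has_incoming = node in all_children
        has_outgoing = bool(children.get(node))
        if not has_incoming:
            kinds[node] = "root"      # only outgoing branches
        elif has_outgoing:
            kinds[node] = "decision"  # both incoming and outgoing
        else:
            kinds[node] = "leaf"      # only incoming branches
    return kinds

# Hypothetical tree: parent -> list of child nodes.
tree = {
    "outlook": ["humidity", "wind"],
    "humidity": ["no_1", "yes_1"],
    "wind": ["yes_2", "no_2"],
}
kinds = classify_nodes(tree)
print(kinds)
```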

39. A good clustering method has


a. High Interclass distance b. High Intraclass distance c. Low Interclass distance d. All of the above

ANS: A PTS: 1

40. A good clustering method has


a. Low Interclass distance b. Low Intraclass distance c. High Intraclass distance d. All of the above

ANS: B PTS: 1

41. A good clustering method has


a. High Intraclass distance b. Low Intraclass similarity c. High Intraclass similarity d. All of the above

ANS: C PTS: 1
42. A good clustering method has
a. High Intraclass distance b. Low Interclass similarity c. High Interclass similarity d. All of the above

ANS: B PTS: 1

43. A good clustering method has


a. High Intraclass distance b. Low Interclass similarity c. High Intraclass similarity d. b and c

ANS: D PTS: 1
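
The criteria in questions 39 to 43 (small distances within a cluster, large distances between clusters) can be measured directly. This is a sketch on two made-up 2-D clusters; mean pairwise distance and centroid distance are assumed here as the intra- and inter-cluster measures, and other definitions exist.

```python
import math
from itertools import combinations

def mean_intra_distance(cluster):
    """Average distance between all pairs of points inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))

# Two illustrative, well-separated clusters.
cluster_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
cluster_b = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

intra_a = mean_intra_distance(cluster_a)
inter_ab = math.dist(centroid(cluster_a), centroid(cluster_b))
print(intra_a, inter_ab)
```

A good clustering shows a small intra-cluster value relative to the inter-cluster value, as in this example.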

44. _________based clustering organizes the data into hierarchical clusters.


a. Distribution b. Centroid c. Hierarchical d. Density

ANS: C PTS: 1

45. What is the main purpose of using regularization techniques in ML models?


a. To increase the complexity of the model
b. To prevent overfitting
c. To increase training speed
d. To reduce error

ANS: B PTS: 1

46. What is the trade-off referred to as the bias-variance trade-off?


a. Balancing model complexity and interpretability
b. Balancing performance on training and testing data
c. Balancing error due to bias and variance
d. Balancing data quality and quantity

ANS: C PTS: 1

47. A neural network is compared with the ______________ in the human body

a. eyes b. organic system c. brain d. heart

ANS: C PTS: 1

48. The axon endings almost touch the ________________of the next neuron
a. dendrites b. dendrites c. cell body d. None

ANS: B PTS: 1

49. In the human brain, output is presented at the____________


a. dendrites b. synapses c. cell body d. axon
ANS: D PTS: 1

50. In the human brain, Multiple inputs from dendrites are processed by ___________
a. dendrites b. synapses c. cell body d. axon

ANS: C PTS: 1

51. Which activation function is commonly used in hidden layers?


a. sigmoid b. Tanh c. RELU d. softmax

ANS: C PTS: 1

52. A decision tree is a _________ algorithm.

a. Unsupervised b. Supervised c. parametric d. All

ANS: B PTS: 1

53. A decision tree is used for _________

a. Classification b. Clustering c. Dimensionality reduction d. All

ANS: A PTS: 1

54. In decision tree, features of the dataset are represented by _________

a. Leaf nodes b. Internal nodes c. root nodes d. None of the above

ANS: B PTS: 1

55. In decision tree, _________ has multiple branches.

a. Leaf nodes b. Internal nodes c. root nodes d. b and c

ANS: D PTS: 1

56. Removing unwanted branches in the decision tree is called _________ .

a. Underfitting b. Splitting c. Overfitting d. Pruning

ANS: D PTS: 1

57. Child nodes in the decision tree are _________ .

a. Root b. Internal c. leaf d. b and c

ANS: D PTS: 1

58. In decision tree, decision nodes are also called as _________ .


a. Root b. Internal c. leaf d. none of the above

ANS: B PTS: 1

59. In decision tree, Final nodes are also called as _________ .

a. Root b. Internal c. leaf d. none of the above

ANS: C PTS: 1

60. The decision tree may have _________ .

a. No fitting b. Overfitting c. Underfitting d. Best fitting

ANS: B PTS: 1

61. The decision tree is _________ .

a. non Parametric unsupervised learning
b. Parametric supervised learning
c. Parametric unsupervised learning
d. non Parametric supervised learning

ANS: D PTS: 1

62. Random forest is based on_________ .

a. unsupervised learning b. ensemble learning c. semisupervised learning d. reinforcement learning

ANS: B PTS: 1

63. The cost function of an MLP is equivalent to _________ .

a. Linear regression b. SVM c. Logistic regression d. none of the above

ANS: C PTS: 1

64. In MLP, the output classes are represented by _________ .

a. number of hidden layers b. nodes in the output layer c. nodes in the input layer d. none of the above

ANS: B PTS: 1

65. Deep learning is the subset of _________ .

a. Machine learning b. Classification c. Regression d. Clustering

ANS: A PTS: 1
66. The intensity of a pixel in gray scale image is represented by________ bits.
a. 24 b. 1 c. 16 d. 8

ANS: D PTS: 1

67. The range of pixel’s intensity in gray scale image is ________ .


a. 1 to 256 b. 1 to 255 c. 0 to 255 d. 0 to 256

ANS: C PTS: 1

68. The intensity of a pixel in color scale image is represented by________ bits.
a. 24 b. 1 c. 16 d. 8

ANS: A PTS: 1
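
The answers to questions 66 to 68 follow from bit depth. A small sketch, assuming a color pixel is stored as three 8-bit channels (for example red, green, blue):

```python
def intensity_range(bits):
    """Smallest and largest unsigned value representable in `bits` bits."""
    return 0, 2 ** bits - 1

gray = intensity_range(8)   # 8-bit grayscale pixel
rgb_bits = 3 * 8            # assumed: three 8-bit color channels
rgb_colors = 2 ** rgb_bits  # number of distinct 24-bit colors

print(gray, rgb_bits, rgb_colors)
```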

69. What is the primary purpose of a Convolutional Neural Network (CNN)?


a. Unsupervised learning b. Reinforcement learning c. Image classification d. Text generation

ANS: C PTS: 1

70. What is the advantage of using convolutional layers in a CNN?


a. They can handle variable-length inputs
b. They can generate synthetic data
c. They can handle sequential data
d. They can capture local spatial patterns in the input data

ANS: D PTS: 1

71. Which layer type is typically used to extract local features in a CNN?
a. Pooling layer b. Convolutional layer c. Activation layer d. Fully connected layer

ANS: B PTS: 1

72. Which activation function is commonly used in the convolutional layers of a CNN?
a. Softmax b. Tanh c. ReLu d. Sigmoid

ANS: C PTS: 1

73. What is the purpose of the pooling layer in a CNN?


a. To introduce non-linearity to the network
b. To reduce the spatial dimensions of the feature maps
c. To adjust the weights and biases of the network
d. To compute the gradients for backpropagation

ANS: B PTS: 1
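
To make the dimension reduction concrete, here is a minimal sketch of 2x2 max pooling with stride 2 on a made-up 4x4 feature map. Max pooling and the 2x2 window are assumptions; the question does not name a specific pooling type.

```python
def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2; halves the height and width."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, cols - 1, 2)]
        for r in range(0, rows - 1, 2)
    ]

# Illustrative 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 2],
    [0, 1, 9, 8],
    [2, 3, 7, 5],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[6, 2], [3, 9]]
```

The 4x4 map becomes 2x2, and no weights are involved, so pooling adds no trainable parameters.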

74. What is the purpose of the stride parameter in a convolutional layer?


a. To determine the size of the receptive field
b. To adjust the learning rate during training
c. To control the step size of the kernel
d. None of the above

ANS: C PTS: 1

75. What is the purpose of the kernel size in a convolutional layer?


a. To determine the size of the receptive field
b. To adjust the learning rate during training
c. To control the step size of the kernel
d. None of the above

ANS: A PTS: 1

76. Which layer type is used to reduce the spatial dimensions in a CNN?
a. Activation b. Convolution c. Pooling d. Fully Connected

ANS: C PTS: 1

77. What is the purpose of the padding parameter in a convolutional layer?


a. To regularize the network and prevent overfitting
b. To regularize the kernel movement
c. To adjust the learning rate during training
d. To prevent the reduction of spatial dimensions

ANS: D PTS: 1

78. Which layer type is responsible for making final predictions in a CNN?
a. Pooling b. Fully connected c. Convolution d. Activation

ANS: B PTS: 1

79. Which layer type is responsible for applying non-linear transformations to the feature maps in a
CNN?
a. Pooling b. Fully connected c. Convolution d. Activation

ANS: D PTS: 1

80. What is the purpose of the fully connected layers in a CNN?


a. To reduce the spatial dimensions of the input data
b. To apply non-linear transformations to the feature maps
c. To capture global patterns and make predictions
d. To initialize the weights and biases of the network

ANS: C PTS: 1

81. What is the purpose of dropout regularization in a CNN?


a. To randomly disable neurons during training to prevent overfitting
b. To adjust the learning rate during training
c. To increase the number of layers in the network
d. None of the above

ANS: A PTS: 1
82. Which layer type is responsible for backpropagating the gradients and updating the network's
parameters in a CNN?
a. Pooling b. Fully connected c. Activation d. Convolution

ANS: B PTS: 1

83. What is the primary advantage of using a CNN over a fully connected neural network for image
processing tasks?
a. CNNs can handle sequential data
b. CNNs have a higher number of neurons
c. CNNs can capture local spatial patterns in the input data
d. CNNs have a higher training speed

ANS: C PTS: 1

84. What is the purpose of the receptive field in a convolutional layer?


a. To determine the number of filters in the layer
b. To determine the size of the feature maps
c. To specify the size of the local region for the convolution operation
d. None of the above

ANS: C PTS: 1

85. Which layer type is responsible for spatial downsampling in a CNN?


a. Convolution b. Fully connected c. Activation d. Pooling

ANS: D PTS: 1

86. What is the purpose of the filter/kernel in a convolutional layer?


a. To extract local features from the input data
b. To specify the size of the feature maps
c. To determine the number of neurons in the layer
d. None of the above

ANS: A PTS: 1

87. Which layer type is commonly used in CNNs to normalize the input data?
a. Convolution b. Pooling c. Activation d. Batch Normalization

ANS: D PTS: 1

88. What is the primary goal of training a CNN?


a. To maximize the number of layers in the network
b. To minimize the prediction error on the training data
c. To achieve 100% accuracy on the test data
d. None of the above

ANS: B PTS: 1

89. Which layer type is responsible for introducing translation invariance in a CNN?
a. fully connected b. Pooling c. Convolution d. None of the above
ANS: C PTS: 1

90. What is the purpose of the output layer in a CNN?


a. To compute the predicted output based on the final feature representation
b. To reduce the spatial dimensions of the input data
c. To apply non-linear transformations to the feature maps
d. To initialize the weights and biases of the network

ANS: A PTS: 1

91. Which parameter is used to control the size of the feature map in a CNN?
a. filter b. Stride c. padding d. b and c

ANS: D PTS: 1

92. What is the purpose of zero-padding in a CNN?


a. To prevent the reduction of spatial dimensions
b. To adjust the learning rate during training
c. To regularize the network and prevent overfitting
d. None of the above

ANS: A PTS: 1
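
The effect of stride and padding on feature-map size (questions 74, 77, 91 and 92) follows the standard output-size formula, floor((n + 2p - k) / s) + 1. The sketch below assumes a square input and kernel; the sizes used are illustrative.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output width/height of a convolution over an n x n input."""
    return (n + 2 * padding - kernel) // stride + 1

no_pad = conv_output_size(32, kernel=5)                       # shrinks to 28
same_pad = conv_output_size(32, kernel=5, padding=2)          # stays at 32
strided = conv_output_size(32, kernel=5, stride=2, padding=2) # drops to 16

print(no_pad, same_pad, strided)
```

Zero-padding keeps the spatial size from shrinking, while a larger stride reduces it.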

93. What is the purpose of the learning rate in CNN training?


a. To adjust the size of the filters in the convolutional layers
b. To increase the number of layers in the network
c. To control the step size of the parameter updates during optimization
d. None of the above

ANS: C PTS: 1

94. What does the convolutional layer do in a CNN?


a. Normalizes the data b. Detects patterns and features c. Computes the loss function d. Determines the model's accuracy

ANS: B PTS: 1

95. What layer connects the output of one neuron to the input of another?
a. Convolution b. Pooling c. Fully connected d. Dropout

ANS: C PTS: 1

96. The number of feature maps produced by the convolution layer depends on _____
a. Number of channels in the input b. Number of kernels c. Kernel size d. Kernel depth

ANS: B PTS: 1

97. The number of channels in the kernel to perform convolution depends on _____
a. Number of channels in the input b. Number of kernels c. Kernel size d. none of the above

ANS: A PTS: 1

98. The number of parameters in a convolution layer depends on _____


a. Number of channels in the input
b. Number of channels in the kernel
c. number of rows and columns in the Kernel
d. b and c

ANS: D PTS: 1
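
Questions 96 to 98 can be tied together with the usual parameter count for a convolution layer: each kernel holds height x width x channels weights plus one bias, and each kernel produces one feature map. The layer sizes below are illustrative assumptions.

```python
def conv_layer_params(in_channels, kernel_h, kernel_w, num_kernels, bias=True):
    """Trainable parameters in a convolution layer."""
    per_kernel = kernel_h * kernel_w * in_channels + (1 if bias else 0)
    return per_kernel * num_kernels

# Illustrative layer: 3 input channels, 64 kernels of size 3x3.
params = conv_layer_params(in_channels=3, kernel_h=3, kernel_w=3, num_kernels=64)
print(params)  # (3*3*3 + 1) * 64 = 1792
```

The kernel's channel count always equals the number of input channels, which is why that quantity appears in the formula.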

99. The global pooling layer calculates the _____ of all values in the feature map.
a. Addition b. Average c. Multiplication d. dot product

ANS: B PTS: 1
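
A minimal sketch of global average pooling, which collapses one whole feature map into a single number; the 2x2 input is a made-up example.

```python
def global_average_pool(fmap):
    """Average of every value in one feature map."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)

gap = global_average_pool([[1.0, 2.0], [3.0, 4.0]])
print(gap)  # 2.5
```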

100. Alexnet has ______ Convolution layers


a. 3 b. 5 c. 16 d. 13

ANS: B PTS: 1

101. VGG-16 has ______ Convolution layers


a. 3 b. 5 c. 16 d. 13

ANS: D PTS: 1

102. Alexnet has ______ pooling layers


a. 3 b. 5 c. 16 d. 13

ANS: A PTS: 1

103. VGG-16 has ______ pooling layers


a. 3 b. 5 c. 16 d. 13

ANS: B PTS: 1

104. AlexNet uses ______ to update the parameters


a. Batch gradient descent b. stochastic gradient descent c. Minibatch gradient descent d. None of the above

ANS: B PTS: 1

105. Each kernel in the VGG-16 has a size of _______.


a. 11x11 b. 3x3 c. 5x5 d. 7x7

ANS: B PTS: 1

106. Output layer in the VGG-16 has _______ channels.


a. 256 b. 9126 c. 1000 d. 4096

ANS: C PTS: 1

107. Total number of parameters in the VGG-16 are _______.


a. 6,32,78,344 b. 13,84,32,208 c. 6,23,78,344 d. 13,84,23,208

ANS: D PTS: 1

108. Total number of parameters in the Alexnet are _______.


a. 6,32,78,344 b. 13,84,32,208 c. 6,23,78,344 d. 13,84,23,208

ANS: C PTS: 1

109. The dropout layer is used after _______.


a. Pooling layer b. FC Layer c. Convolution layer d. Activation Layer

ANS: B PTS: 1

110. In stochastic gradient descent, ____ is/are used at a time to take a single step.


a. Fixed number of samples b. One sample c. All samples d. None of the above

ANS: B PTS: 1

111. In batch gradient descent, ____ is/are used at a time to take a single step.
a. Fixed number of samples b. One sample c. All samples d. None of the above

ANS: C PTS: 1

112. In minibatch gradient descent, ____ is/are used at a time to take a single step.
a. Fixed number of samples b. One sample c. All samples d. None of the above

ANS: A PTS: 1
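
The three variants in questions 110 to 112 differ only in how many samples feed each update step. The sketch below splits a made-up dataset of 10 samples accordingly; the minibatch size of 4 is an arbitrary assumption.

```python
def batches(samples, batch_size):
    """Split the samples into consecutive groups, one group per update step."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]

data = list(range(10))  # ten illustrative training samples

sgd_steps = batches(data, 1)            # stochastic: one sample per step
minibatch_steps = batches(data, 4)      # minibatch: fixed number per step
batch_steps = batches(data, len(data))  # batch: all samples in one step

print(len(sgd_steps), [len(b) for b in minibatch_steps], len(batch_steps))
```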

113. Which of the following is not a Data Augmentation technique.


a. Convolution b. Cropping c. Zooming d. Scaling

ANS: A PTS: 1

114. What is the primary purpose of a Recurrent Neural Network (RNN)?


a. Image classification b. Reinforcement learning c. Text generation d. Object detection

ANS: C PTS: 1

115. The advantage of an RNN is that it


a. handles sequential data b. cannot handle sequential data c. cannot memorize previous data d. None of the above

ANS: A PTS: 1

116. ______ does not memorize previous data.


a. RNN b. CNN c. SVM d. Kmeans clustering

ANS: B PTS: 1

117. ______ memorizes previous data.


a. RNN b. CNN c. SVM d. Kmeans clustering

ANS: A PTS: 1

118. Which layer type is typically used to capture sequential dependencies in an RNN?
a. Pooling b. Hidden c. Input d. Output

ANS: B PTS: 1

119. What is the advantage of using recurrent layers in an RNN?


a. They can handle variable-length inputs
b. They can generate synthetic data
c. They can handle non-linear transformations
d. They can capture temporal dependencies in the input data

ANS: D PTS: 1

120. What is the purpose of the hidden state in an RNN?


a. To adjust the learning rate during training
b. To compute the gradients for backpropagation
c. To store the information from the previous time step
d. None of the above

ANS: C PTS: 1

121. Which activation function is commonly used in the recurrent layers of an RNN?
a. Sigmoid b. ReLu c. Softmax d. Hyperbolic Tangent

ANS: D PTS: 1

122. Which layer type is responsible for making final predictions in an RNN?
a. Output layer b. Input layer c. Activation layer d. Hidden layer

ANS: A PTS: 1

123. What is the purpose of the recurrent connection in an RNN?


a. To adjust the weights and biases of the network
b. To propagate the hidden state across different time steps
c. To reduce the dimensionality of the input data
d. None of the above

ANS: B PTS: 1
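
The hidden state, tanh activation, and recurrent connection from the preceding questions can be sketched with a scalar RNN. The weights `w_x`, `w_h` and bias `b` are illustrative assumptions, and a real RNN would use vectors and matrices.

```python
import math

def rnn_step(x_t, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """One time step: new hidden state from current input and previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(sequence, h0=0.0):
    """Carry the hidden state across time steps via the recurrent connection."""
    h = h0
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h)
        states.append(h)
    return states

# Only the first input is non-zero, yet later states remain non-zero.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

The later hidden states stay positive even though the later inputs are zero, which is the "memory" of previous data that the questions describe.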

124. Which layer type is responsible for introducing non-linearity in an RNN?


a. Input layer b. Hidden layer c. Output layer d. Activation layer

ANS: D PTS: 1

125. What is the purpose of the input gate in an LSTM network?


a. To adjust the learning rate during training
b. To control the flow of information from the current input
c. To introduce non-linearity to the network
d. None of the above

ANS: B PTS: 1

126. What is the purpose of the output gate in an LSTM network?


a. To adjust the learning rate during training
b. To control the flow of information from the current input
c. To control the flow of information to the current output
d. None of the above

ANS: C PTS: 1

127. Which type of RNN is used for Machine Learning problems?


a. One to One b. One to Many c. Many to One d. Many to many

ANS: A PTS: 1

128. Which type of RNN is used for image captioning?


a. One to One b. One to Many c. Many to One d. Many to many

ANS: B PTS: 1

129. Which type of RNN is used for sentiment analysis?


a. One to One b. One to Many c. Many to One d. Many to many

ANS: C PTS: 1

130. Which type of RNN is used for machine translation?


a. One to One b. One to Many c. Many to One d. Many to many

ANS: D PTS: 1

131. LSTM is used to avoid __________


a. Vanishing gradients b. Exploding gradients c. Overfitting d. Underfitting
ANS: A PTS: 1

132. Predict the correct order of gates in LSTM?


a. Forget gate, Output gate & Input gate
b. Forget gate, Input gate & Output gate
c. Input gate, Forget gate & Output gate
d. None of the above

ANS: B PTS: 1

133. Which activation functions are used in the input gate?


a. Sigmoid & Softmax b. ReLu & Softmax c. ReLu & Sigmoid d. Sigmoid and Tanh

ANS: D PTS: 1
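
The gate order and activations from questions 125 to 133 can be seen in one LSTM time step. This is a scalar sketch of the commonly used formulation; the parameter dictionary `params` and its values are hypothetical, and real implementations use weight matrices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One scalar LSTM step; p maps a gate name to (w_x, w_h, b)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x_t + w_h * h_prev + b

    f = sigmoid(pre("forget"))       # forget gate: how much old cell state to keep
    i = sigmoid(pre("input"))        # input gate (sigmoid part)
    g = math.tanh(pre("candidate"))  # input gate (tanh part): candidate values
    c = f * c_prev + i * g           # updated cell state
    o = sigmoid(pre("output"))       # output gate
    h = o * math.tanh(c)             # new hidden state
    return h, c

# Illustrative weights, identical for every gate.
params = {name: (0.5, 0.5, 0.0)
          for name in ("forget", "input", "candidate", "output")}
h, c = lstm_step(1.0, 0.0, 0.0, params)
print(h, c)
```

The additive update of the cell state `c` is what helps gradients survive across many time steps.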

134. In the Confusion matrix, TP indicates _________ .


a. Actual output = 1 & predicted output = 1.
b. Actual output = 0 & predicted output = 0.
c. Actual output = 1 & predicted output = 0.
d. Actual output = 0 & predicted output = 1.

ANS: A PTS: 1

135. In the Confusion matrix, TN indicates _________ .


a. Actual output = 1 & predicted output = 1.
b. Actual output = 0 & predicted output = 0.
c. Actual output = 1 & predicted output = 0.
d. Actual output = 0 & predicted output = 1.

ANS: B PTS: 1

136. In the Confusion matrix, FN indicates _________ .


a. Actual output = 1 & predicted output = 1.
b. Actual output = 0 & predicted output = 0.
c. Actual output = 1 & predicted output = 0.
d. Actual output = 0 & predicted output = 1.

ANS: C PTS: 1

137. In the Confusion matrix, FP indicates _________ .


a. Actual output = 1 & predicted output = 1.
b. Actual output = 0 & predicted output = 0.
c. Actual output = 1 & predicted output = 0.
d. Actual output = 0 & predicted output = 1.

ANS: D PTS: 1
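
The four confusion-matrix cells defined in questions 134 to 137 can be counted directly from labels. The label lists below are made-up binary examples.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1  # actual 1, predicted 1
        elif a == 0 and p == 0:
            counts["TN"] += 1  # actual 0, predicted 0
        elif a == 1 and p == 0:
            counts["FN"] += 1  # actual 1, predicted 0
        else:
            counts["FP"] += 1  # actual 0, predicted 1
    return counts

actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(actual, predicted)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
print(counts, accuracy)
```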

138. Confusion matrix can be calculated for _________ application.


a. Regression b. Classification c. both a and b d. None of the above

ANS: B PTS: 1
