Is The Data Linearly Separable?: A) Yes B) No
A) Yes
B) No
Solution: B
If you can draw a line (or, in higher dimensions, a plane) that separates the classes of data points, the data is said to be linearly separable.
A) Kernel SVM
B) Neural Networks
3) In which of the following applications can we use deep learning to solve the problem?
A) Protein structure prediction
D) All of these
Solution: D
We can use a neural network to approximate any function so it can theoretically be
4) Which of the following statements is true when you use 1×1 convolutions in a CNN?
5) Question Context:
Solution: B
Even if all the biases are zero, there is a chance that the neural network may learn. On
the other hand, if all the weights are zero, the neural network may never learn to perform
the task.
6) The number of nodes in the input layer is 10 and in the hidden layer is 5. The maximum
number of connections from the input layer to the hidden layer is
A) 50
B) Less than 50
C) More than 50
D) It is an arbitrary value
Solution: A
Since MLP is a fully connected directed graph, the number of connections are a multiple of
size 7 X 7 with a stride of 1. What will be the size of the convoluted matrix?
A) 22 X 22
B) 21 X 21
C) 28 X 28
D) 7 X 7
Solution: A
The size of the convoluted matrix is given by C = ((I - F + 2P) / S) + 1, where C is the size of the
convoluted matrix, I is the size of the input matrix, F is the size of the filter matrix, P is the
padding applied to the input matrix and S is the stride. Here P = 0, I = 28, F = 7 and S = 1, so
C = ((28 - 7 + 0) / 1) + 1 = 22. Therefore the answer is 22.
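The formula above can be checked with a few lines of Python. This is a minimal sketch: the function name `conv_output_size` is made up for illustration, and it assumes a square input and a square filter.

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Return C = ((I - F + 2P) / S) + 1 for a square input and filter."""
    span = input_size - filter_size + 2 * padding
    if span < 0:
        raise ValueError("filter is larger than the padded input")
    # Integer division: positions where the filter does not fully fit are dropped.
    return span // stride + 1


print(conv_output_size(28, 7, padding=0, stride=1))  # 22
```

With I = 28, F = 7, P = 0 and S = 1 the function returns 22, matching option A.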
8) In a simple MLP model with 8 neurons in the input layer, 5 neurons in the hidden layer
and 1 neuron in the output layer. What is the size of the weight matrices between hidden
A) [1 X 5] , [5 X 8]
B) [8 X 5] , [ 1 X 5]
C) [8 X 5] , [5 X 1]
D) [5 x 1] , [8 X 5]
Solution: D
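The size of the weight matrix between any layer 1 and layer 2 is given by [nodes in layer 1 X nodes in
layer 2]. Here that gives [8 X 5] between the input and hidden layers and [5 X 1] between the
hidden and output layers.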
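As a rough illustration of the rule [nodes in layer 1 X nodes in layer 2], the sketch below lists the weight-matrix shape for each pair of adjacent layers. The helper name `weight_shapes` is hypothetical and biases are ignored.

```python
def weight_shapes(layer_sizes):
    """Shape (nodes in layer 1, nodes in layer 2) for each pair of adjacent layers."""
    return [(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]


shapes = weight_shapes([8, 5, 1])
print(shapes)  # [(8, 5), (5, 1)]

# In a fully connected layer the number of connections is rows * columns,
# e.g. 10 input nodes and 5 hidden nodes give 50 connections.
rows, cols = weight_shapes([10, 5])[0]
print(rows * cols)  # 50
```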
9) Given below is an input matrix named I, kernel F and Convoluted matrix named C.
Which of the following is the correct option for matrix C with stride =2 ?
A)
B)
C)
D)
Solution: C
1 and 2 are automatically eliminated since they do not conform to the output size for a stride of
10) Given below is an input matrix of shape 7 X 7. What will be the output on applying a
A)
B)
C)
D)
Solution: A
Max pooling takes a 3 X 3 window and outputs the maximum value in that window. Slide it
over the entire input matrix with a stride of 2 and you will get option A as the answer.
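The sliding-window procedure can be sketched in plain Python. The quiz's actual input matrix is not reproduced here, so the 7 X 7 matrix below is a hypothetical stand-in, and `max_pool2d` is an illustrative helper rather than a library function.

```python
def max_pool2d(matrix, pool_size, stride):
    """Slide a pool_size X pool_size window over the matrix and keep each maximum."""
    rows, cols = len(matrix), len(matrix[0])
    pooled = []
    for r in range(0, rows - pool_size + 1, stride):
        pooled_row = []
        for c in range(0, cols - pool_size + 1, stride):
            window = [matrix[r + i][c + j]
                      for i in range(pool_size)
                      for j in range(pool_size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 7 X 7 input: entry (r, c) holds r * 7 + c.
example = [[r * 7 + c for c in range(7)] for r in range(7)]
pooled = max_pool2d(example, pool_size=3, stride=2)
print(pooled)  # [[16, 18, 20], [30, 32, 34], [44, 46, 48]]
```

A 7 X 7 input with a 3 X 3 window and a stride of 2 always yields a 3 X 3 output, which is a quick way to rule out options with the wrong shape.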
11) Which of the following functions can be used as an activation function in the output
layer if we wish to predict the probabilities of n classes (p1, p2, ..., pn) such that the sum of p over
all n equals 1?
A) Softmax
B) ReLU
C) Sigmoid
D) Tanh
Solution: A
The softmax function has the form softmax(z)_i = exp(z_i) / (exp(z_1) + ... + exp(z_n)), so the probabilities over all n classes sum to 1.
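A small sketch makes the sum-to-1 property concrete. The input scores are arbitrary example values, and subtracting the maximum score is a common numerical-stability trick that does not change the result.

```python
import math


def softmax(scores):
    """Exponentiate each score and normalize so the outputs sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(sum(probs))  # 1.0 up to floating-point rounding
```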
12) Assume a simple MLP model with 3 neurons and inputs= 1,2,3. The weights to the
input neurons are 4,5 and 6 respectively. Assume the activation function is a linear constant
A) 32
B) 643
C) 96
D) 48
Solution: C
13) Which of the following activation functions can’t be used at the output layer to classify an image
A) sigmoid
B) Tanh
C) ReLU
D) If(x>5,1,0)
Solution: C
ReLU gives a continuous output in the range 0 to infinity, but in the output layer we want a finite range
14) [True | False] In a neural network, every parameter can have its own learning
rate.
A) TRUE
B) FALSE
Solution: A
Yes, we can define the learning rate for each parameter and it can be different from other
parameters.
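One way to picture this is a plain gradient-descent step in which each parameter is scaled by its own learning rate. The values below are made up for illustration, and `sgd_step` is a hypothetical helper, not a library call.

```python
def sgd_step(params, grads, learning_rates):
    """Update each parameter with its own learning rate."""
    return [p - lr * g for p, g, lr in zip(params, grads, learning_rates)]


params = [1.0, 1.0, 1.0]
grads = [0.5, 0.5, 0.5]            # identical gradients for every parameter
learning_rates = [0.1, 0.01, 1.0]  # but a different step size for each one
updated = sgd_step(params, grads, learning_rates)
print(updated)
```

Even though all three gradients are equal, the parameters move by different amounts because their learning rates differ.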
A) TRUE
B) FALSE
Solution: A
Look at the model architecture below. We have added a new Dropout layer between the input (or
visible) layer and the first hidden layer. The dropout rate is set to 20%, meaning one in 5 inputs
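# Imports assume the Keras API bundled with TensorFlow.
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import SGD

def create_model():
    # create model: dropout on the 60 inputs, then one hidden layer
    model = Sequential()
    model.add(Dropout(0.2, input_shape=(60,)))
    model.add(Dense(60, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    sgd = SGD(learning_rate=0.1)  # this argument was named `lr` in older Keras releases
    model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])
    return model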
16) I am working with the fully connected architecture having one hidden layer with 3
neurons and one output neuron to solve a binary classification challenge. Below is the
To train the model, I have initialized all weights for the hidden and output layers with 1.
Do you think the model will be able to learn the pattern in the data?
A) Yes
B) No
Solution: B
As all the weights of the neural network model are the same, all the neurons will try to do the
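The symmetry can be seen in a small sketch. It assumes a hypothetical two-feature input, sigmoid activations, a binary cross-entropy loss and no biases; none of these details come from the question, they are chosen only to keep the example short.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_weight_gradients(x, y, w_hidden, w_out):
    """Gradient of the loss with respect to each hidden neuron's incoming weights."""
    # Forward pass (biases omitted for brevity).
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    y_hat = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    # Backward pass for binary cross-entropy with a sigmoid output.
    d_out = y_hat - y
    grads = []
    for h, w in zip(hidden, w_out):
        d_hidden = d_out * w * h * (1.0 - h)
        grads.append([d_hidden * xi for xi in x])
    return grads


x = [0.5, -1.5]                            # hypothetical input with two features
w_hidden = [[1.0, 1.0] for _ in range(3)]  # 3 hidden neurons, all weights set to 1
w_out = [1.0, 1.0, 1.0]                    # output weights, all set to 1
grads = hidden_weight_gradients(x, y=1.0, w_hidden=w_hidden, w_out=w_out)
print(grads[0] == grads[1] == grads[2])  # True
```

Because every hidden neuron receives the same gradient, the three neurons stay identical after each update and behave like a single neuron.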
17) Which of the following neural network training challenges can be solved using batch
normalization?
A) Overfitting
Solution: D
Batch normalization restricts the activations and indirectly improves training time.
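The normalization step itself can be sketched as follows. This simplified version omits the learnable scale and shift parameters that a full batch normalization layer also applies, and the activation values are hypothetical.

```python
import math


def batch_norm(activations, eps=1e-5):
    """Normalize a batch of activations to roughly zero mean and unit variance."""
    mean = sum(activations) / len(activations)
    variance = sum((a - mean) ** 2 for a in activations) / len(activations)
    return [(a - mean) / math.sqrt(variance + eps) for a in activations]


batch = [0.1, 250.0, -40.0, 3.0]  # hypothetical activations with a wide range
normalized = batch_norm(batch)
print(normalized)
```

After normalization the values sit in a narrow range around zero, which is the sense in which the activations are restricted.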
18) Which of the following would have a constant input in each epoch of training a Deep
Learning model?
Solution: A
19) True/False: Changing the sigmoid activation to ReLU will help to get over the vanishing
gradient issue.
A) TRUE
B) FALSE
Solution: A
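A rough way to see why is to compare the derivatives of the two activations. The sigmoid derivative is at most 0.25, so multiplying it across many layers shrinks the gradient quickly, while the ReLU derivative is 1 for any positive input. The depth of 10 below is an arbitrary illustrative choice.

```python
import math


def sigmoid_grad(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)


def relu_grad(z):
    return 1.0 if z > 0 else 0.0


depth = 10  # hypothetical number of stacked layers
print(sigmoid_grad(0.0))           # 0.25, the largest value the sigmoid derivative can take
print(sigmoid_grad(0.0) ** depth)  # about 9.5e-07
print(relu_grad(2.0) ** depth)     # 1.0
```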
A) TRUE
B) FALSE
Solution: B
This is not always true. If we have a max pooling layer of pooling size as 1, the parameters
21) [True or False] Backpropagation cannot be applied when using pooling layers
A) TRUE
B) FALSE
Solution: B
A) 3
B) 4
C) 5
D) 6
Solution: B
Option B is correct
23) For a binary classification problem, which of the following architectures would you
choose?
A) 1
B) 2
C) Any one of these
D) None of these
Solution: C
We can either use one neuron as output for binary classification problem or two separate
neurons.
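The two choices are closely related. As a sketch, a single sigmoid output with score z gives the same probability as a two-neuron softmax whose scores are z and 0. The score 0.8 is an arbitrary example value.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


z = 0.8  # hypothetical score for the positive class
one_neuron = sigmoid(z)
two_neurons = softmax([z, 0.0])
print(one_neuron)
print(two_neurons[0])  # same probability for the positive class
```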
24) Suppose there is an issue while training a neural network. The training loss/validation
C) Both of these
Solution: C
https://www.analyticsvidhya.com/blog/2017/07/debugging-neural-network-with-tensorboard/
25)
The red curve above denotes training accuracy with respect to each epoch in a deep
learning algorithm. Both the green and blue curves denote validation accuracy.
A) Green Curve
B) Blue Curve
Solution: B
B) Both 1 and 3
C) Both 2 and 3
D) All 1, 2 and 3
Solution: B
Statements 1 and 3 are correct, statement 2 is not always true. Even after applying dropout and
27) Gated recurrent units can help prevent the vanishing gradient problem in RNNs.
A) True
B) False
Solution: A
Option A is correct. This is because it has implicit memory to remember past behavior.
28) Suppose you are using early stopping mechanism with patience as 2, at which point will
A) 2
B) 3
C) 4
D) 5
Solution: C
As we have set patience as 2, the network will automatically stop training after epoch 4.
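The patience rule can be sketched as a simple loop. The validation losses below are hypothetical, since the plot from the question is not reproduced here, and `early_stopping_epoch` is an illustrative helper that counts epochs from 1.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the epoch after which training stops under a patience rule."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss  # new best validation loss: reset the counter
            wait = 0
        else:
            wait += 1    # no improvement in this epoch
            if wait >= patience:
                return epoch
    return len(val_losses)


# Hypothetical losses: the best value is reached at epoch 2, then they rise.
hypothetical_losses = [0.9, 0.7, 0.75, 0.8, 0.85]
stop_epoch = early_stopping_epoch(hypothetical_losses, patience=2)
print(stop_epoch)  # 4
```

With these made-up losses the last improvement is at epoch 2, so two epochs without improvement end training after epoch 4.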
29) [True or False] Sentiment analysis using Deep Learning is a many-to-one prediction
task
A) TRUE
B) FALSE
Solution: A
Option A is correct. This is because from a sequence of words, you have to predict whether the
B) Weight Sharing
C) Early Stopping
D) Dropout
Solution: E
All of the above-mentioned methods can help prevent overfitting.