Q51. In a neural network, which of the following techniques is used to deal with overfitting?

(A)Dropout
(B)Regularization
(C)Batch Normalization
(D)All of these

Correct Answer: D


Q52. Consider Y = ax^2 + bx + c (a polynomial equation of degree 2). Can this equation be represented by a neural network with a single hidden layer and a linear threshold?

(A)Yes
(B)No

Correct Answer: B


Q53. What is a dead unit in a neural network?

(A)A unit which doesn't get updated during training by any of its neighbors
(B)A unit that does not respond completely to any of the training patterns
(C)The unit which produces the biggest sum-squared error
(D)None of these

Correct Answer: A


Q54. Which of the following statements is the best description of early stopping?

(A)Train the network until a local minimum in the error function is reached
(B)Simulate the network on a test dataset after every epoch of training. Stop training when the generalization error starts to increase
(C)Add a momentum term to the weight update in the Generalized Delta Rule, so that training converges more quickly
(D)A faster version of backpropagation, such as the 'Quickprop' algorithm

Correct Answer: B
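
As an illustration of option (B), here is a minimal sketch of the stopping rule. It assumes the held-out error has already been recorded after each epoch (in practice this is usually a validation set rather than the final test set). The function name `early_stopping_epoch`, the `patience` parameter, and the loss values are made up for the example.

```python
def early_stopping_epoch(held_out_losses, patience=1):
    """Return the index of the best epoch seen before training is stopped.

    Training stops once the held-out loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(held_out_losses):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # generalization error started to increase
    return best_epoch

# Hypothetical per-epoch held-out losses: they fall, then start to rise.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.58]
stop = early_stopping_epoch(losses)
print(stop)
```

The weights saved at the returned epoch are the ones to keep, since later epochs only fit the training data more closely.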


Q55. What if we use a learning rate that's too large?

(A)Network will converge
(B)Network will not converge
(C)Can't say

Correct Answer: B
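
A toy sketch of why this happens, assuming plain gradient descent on the one-dimensional loss f(w) = w^2 (chosen only for illustration; the function name `gradient_descent` is hypothetical). Each step multiplies w by (1 - 2 * lr), so on this loss any learning rate above 1 makes the iterates grow instead of shrink.

```python
def gradient_descent(lr, steps=20, w=1.0):
    """Run gradient descent on f(w) = w**2 and return the final w."""
    for _ in range(steps):
        grad = 2 * w          # derivative of w**2
        w = w - lr * grad     # standard update rule
    return w

small = gradient_descent(lr=0.1)   # shrinks towards the minimum at 0
large = gradient_descent(lr=1.1)   # overshoots further on every step
print(abs(small), abs(large))
```

On real networks the symptom is often an oscillating or exploding loss rather than a clean geometric blow-up, but the mechanism is the same overshoot.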


Q56. Which gradient technique is more advantageous when the data is too big to handle in RAM simultaneously?

(A)Full Batch Gradient Descent
(B)Stochastic Gradient Descent

Correct Answer: B


Q57. Which factors should be considered when selecting the depth of a neural network?

1.Type of neural network (e.g. MLP, CNN, etc.)
2.Input data
3.Computation power, i.e. Hardware capabilities and software capabilities
4.Learning Rate
5.The output function to map

(A) 1, 2, 4, 5
(B) 2, 3, 4, 5
(C) 1, 3, 4, 5
(D) All of these

Correct Answer: D


Q58. Consider the scenario. The problem you are trying to solve has a small amount of data. Fortunately, you have a pre-trained neural network that was trained on a similar problem. Which of the following methodologies would you choose to make use of this pre-trained network?

(A)Re-train the model for the new dataset
(B)Assess on every layer how the model performs and only select a few of them
(C)Fine-tune the last couple of layers only
(D)Freeze all the layers except the last, re-train the last layer

Correct Answer: D


Q59. An increase in the size of a convolutional kernel would necessarily increase the performance of a convolutional network.

(A)True
(B)False

Correct Answer: B


Q60. Which kind of activation function is typical for a convolution layer in a CNN?

(A)Gaussian
(B)Sigmoid
(C)Hyperbolic Tangent
(D)ReLU

Correct Answer: D


Q61. Which of the following methods uses arithmetic operations in the evaluation of word representations?

(A)Semantic relatedness
(B)Synonym detection
(C)Semantic analogy
(D)None of the above

Correct Answer: C
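
A minimal sketch of the arithmetic behind semantic analogy: the answer to "man is to king as woman is to ?" is taken to be the word whose vector is closest to vec(king) - vec(man) + vec(woman). The 2-D vectors below are hand-made toy values, not learned embeddings, and the helper names `cosine` and `analogy` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors chosen by hand for illustration only.
vectors = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [0.0, 2.0],
}

def analogy(a, b, c):
    """Return the word closest to vec(b) - vec(a) + vec(c), excluding the inputs."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

answer = analogy("man", "king", "woman")
print(answer)
```

The vector addition and subtraction is the arithmetic step specific to analogy; the final ranking uses a dot-product-based similarity.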


Q62. Which of the following models directly learns word representations?

(A)Prediction based model
(B)Count based model
(C)Both prediction and count based model
(D)None of these

Correct Answer: A


Q63. Which of the following is correct about word representation models?

I.) In the continuous bag-of-words model, the softmax function is computationally expensive.
II.) In the continuous bag-of-words model, the softmax function is computationally inexpensive.
III.) In the Skip-gram model, the softmax function is computationally inexpensive.
IV.) In the Skip-gram model, the softmax function is computationally expensive.

(A)I only
(B)II & III
(C)III only
(D)I & IV

Correct Answer: D


Q64. Which of the following solutions constructs a binary tree in learning word representations using prediction-based models?

(A)Use negative sampling
(B)Use contrastive estimation
(C)Use hierarchical softmax
(D)None of these

Correct Answer: C


Q65. Which of the following method(s) use the dot product in the evaluation of word representations?

I.Semantic relatedness
II.Synonym detection
III.Semantic analogy

(A)I & II
(B)III
(C)I, II, & III
(D)None of the above

Correct Answer: A


Q66. If you need to design a model for textual entailment, which of the following steps will you choose?

I.) CNN is used to encode the text
II.) CNN is used to decode the text
III.) RNN is used to decode the text from the encoding
IV.) RNN is used to encode the text.
V.) RNN is used to encode the text and decode the text.

(A)IV, II
(B)I, III
(C)V
(D)None of these

Correct Answer: C


Q67. The problem of generating a sentence from a given image can possibly be solved with the encoder-decoder architecture.

(A)Yes
(B)No

Correct Answer: A


Q68. For document classification and summarization, it is important to look at the important sentences and important words. What kind of “attention” mechanism is required for encoding?

(A)Hierarchical
(B)Ungraded
(C)Sequential
(D)Unordered

Correct Answer: A


Q69. 48 filters of size 21 x 21 are applied to an image of size 327 x 327, with no padding (padding of 0) and a stride of 3. The image is an RGB image. The depth of each filter is the same as the depth of the image. What will be the volume of the final output?

(A) 103 x 103 x 3
(B) 103 x 103 x 48
(C) 327 x 327 x 3
(D) 327 x 327 x 48

Correct Answer: B
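
The answer follows from the usual output-size formula for a convolution, floor((n - f + 2p) / s) + 1, with the output depth equal to the number of filters. A minimal sketch, reading "zero padding" as p = 0; the function name `conv_output_shape` is hypothetical.

```python
def conv_output_shape(n, f, num_filters, stride=1, padding=0):
    """Output (height, width, depth) for a square n x n input and f x f filters."""
    side = (n - f + 2 * padding) // stride + 1
    # Each filter spans the full input depth and yields one output channel.
    return (side, side, num_filters)

shape = conv_output_shape(n=327, f=21, num_filters=48, stride=3, padding=0)
print(shape)  # (327 - 21) / 3 + 1 = 103 per side, 48 channels
```

Note that the 3 input channels do not appear in the output shape, because each filter sums over them.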


Q70. Consider the following:

W= [0.2, 0.7, 0.05, 0.75, 0.86, 0.21]
X= [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

What output S is obtained by sliding the filter W over the input X?

(A) [1.306, 1.421, 1.567, 2.002]
(B) [1.031, 1.308, 1.585, 1.862]
(C) [2.345, 2.121, 3.547, 2.409]
(D) [2.127, 2.229, 3.212, 3.421]

Correct Answer: B
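
The result can be checked with a short sketch. It assumes "sliding" means a stride of 1 with no padding and no flipping of the filter (cross-correlation, as in most deep learning libraries), which gives 9 - 6 + 1 = 4 outputs; the function name `slide_filter` is hypothetical.

```python
def slide_filter(w, x):
    """Slide filter w over input x (stride 1, no padding, no flipping)."""
    n_out = len(x) - len(w) + 1
    return [
        round(sum(wi * x[t + i] for i, wi in enumerate(w)), 3)
        for t in range(n_out)
    ]

W = [0.2, 0.7, 0.05, 0.75, 0.86, 0.21]
X = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

S = slide_filter(W, X)
print(S)
```

Because X increases by 0.1 at each position, every output exceeds the previous one by 0.1 times the sum of W (0.1 * 2.77 = 0.277), which matches the spacing in option (B).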
