Neural Networks - Test Questions

The document discusses various concepts related to artificial neurons and neural networks, including input patterns, activation functions, network architectures, training challenges, regularization techniques, hyperparameters, and evaluation methods. It also covers specific neural network types such as CNNs, RNNs, and GANs, as well as techniques for handling imbalanced datasets and improving training efficiency. Additionally, it explains the importance of activation functions, the bias-variance trade-off, and methods like batch normalization and softmax in neural network training.


1. Below is a diagram of a single artificial neuron (unit):

The node has three inputs x = (x1, x2, x3) that receive only binary signals (either 0 or 1). How many
different input patterns can this node receive? What if the node had four inputs? Or five inputs?

2. Suppose that the weights corresponding to the three inputs have the following values:

The activation of the unit is given by the step-function:

Calculate the output value y of the unit for each of
the following input patterns:

3. Find the output value y for each pattern for the following network, which represents the logical AND function
4. Suggest how the above network can be used to implement the logical OR function (true when at least one of
the arguments is true)

5. The following diagram represents a feed-forward neural network with one hidden layer:

The weight on the connection between nodes i and j is denoted by wij; for example, w13 is the weight on the
connection between nodes 1 and 3. The following table lists all the weights in the network.

Here v denotes the weighted sum of inputs to a node. Each of the input nodes (1 and 2) can only receive binary
values (either 0 or 1). Calculate the output of the network (y5 and y6) for each of the input patterns.

6. Which of the following are common techniques for dealing with vanishing or exploding gradients in RNNs?

A. LSTM (Long Short-Term Memory) networks
B. GRU (Gated Recurrent Unit) networks
C. Gradient clipping
D. Weight normalization
E. All of the above
F. None of the above

7. What are activation functions, and why are they important?


o Answer: Activation functions introduce non-linearity into the neural
network. They transform the weighted sum of inputs into an output value.
Common activation functions include:
o Sigmoid: Outputs values between 0 and 1, often used in binary
classification.
o ReLU (Rectified Linear Unit): Outputs the input if positive, 0
otherwise, known for its computational efficiency.
o Tanh (Hyperbolic Tangent): Similar to sigmoid, outputs values
between -1 and 1.
o Softmax: Used for multi-class classification, outputs probabilities
for each class summing to 1.
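
As a minimal illustration (the function definitions and test vector below are a sketch, not taken from the document), these activations can be written in a few lines of NumPy:

```python
import numpy as np

def sigmoid(z):
    # Squashes each value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Passes positive values through, zeros out the rest
    return np.maximum(0.0, z)

def tanh(z):
    # Squashes each value into (-1, 1)
    return np.tanh(z)

def softmax(z):
    # Subtract the max for numerical stability, then normalize
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z), relu(z), tanh(z), softmax(z))  # softmax output sums to 1
```
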

8. What are the different types of neural network architectures?


o Answer: There are various neural network architectures, each
suited for specific tasks:
 Feedforward Neural Networks: Information flows in one
direction, from input to output, without loops. Examples
include Multilayer Perceptrons (MLPs).
 Recurrent Neural Networks (RNNs): Process sequential
data by having feedback loops, enabling them to remember
past information. Examples include LSTMs and GRUs.
 Convolutional Neural Networks (CNNs): Designed for
image processing, they use convolutional layers to extract
features from spatial data.
 Generative Adversarial Networks (GANs): Composed of
two networks, a generator and a discriminator, competing
against each other to learn realistic data distributions.
 Autoencoders: Learn compressed representations of data
by encoding and decoding information, useful for
dimensionality reduction and anomaly detection.
9. What are the common challenges faced during neural network
training?
o Answer:
 Overfitting: The network learns the training data too well
and fails to generalize to unseen data.
 Underfitting: The network is not complex enough to learn
the underlying patterns in the data.
 Vanishing or Exploding Gradients: During
backpropagation, gradients can become extremely small or
large, hindering effective weight updates.
 Local Minima: Gradient descent can get stuck in local
minima, not reaching the global minimum of the loss
function.
 Data Imbalance: If the training data is unevenly distributed
across classes, the network may bias towards the majority
class.
10. Explain the concept of regularization in neural networks.
o Answer: Regularization techniques are used to prevent overfitting
by adding constraints to the network's learning process. Common
regularization methods include:
 L1 Regularization (Lasso): Adds a penalty proportional to
the absolute value of the weights, encouraging sparsity
(setting some weights to zero).
 L2 Regularization (Ridge): Adds a penalty proportional to
the squared value of the weights, reducing the magnitude of
the weights and preventing them from becoming too large.
 Dropout: Randomly drops out neurons during training,
preventing co-adaptation and encouraging the network to
learn more robust features.
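
A minimal NumPy sketch of how these ideas enter training; the task loss, penalty strength, and dropout rate below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))           # example weight matrix
data_loss = 0.25                      # pretend task loss, for illustration only
lam = 0.01                            # regularization strength (a hyperparameter)

l1_penalty = lam * np.sum(np.abs(W))  # L1: encourages sparse weights
l2_penalty = lam * np.sum(W ** 2)     # L2: discourages large weights
total_loss = data_loss + l1_penalty + l2_penalty

# Dropout during training: randomly zero activations and scale the survivors
# ("inverted dropout") so the expected activation is unchanged at test time.
p_keep = 0.8
a = rng.normal(size=(5, 4))           # example activations
mask = (rng.random(a.shape) < p_keep) / p_keep
a_dropped = a * mask
```
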
11. What are hyperparameters in neural networks, and how are they
tuned?
o Answer: Hyperparameters are parameters that are not learned by
the network during training but are set beforehand. Examples
include:
 Learning rate: Controls the step size of weight updates.
 Number of layers and neurons: Determines the network's
complexity.
 Batch size: The number of training examples used in each
update step.
 Regularization parameters: Control the strength of
regularization.

Hyperparameters are tuned using techniques like:

 Grid search: Trying different combinations of
hyperparameters on a predefined grid.
 Random search: Randomly sampling hyperparameters from
a predefined distribution.
 Bayesian optimization: Using a Bayesian model to guide
the search for optimal hyperparameters.
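
A small sketch of grid search and random search over two assumed hyperparameters (learning rate and batch size); train_and_evaluate is a hypothetical placeholder for training a model and returning a validation score:

```python
import itertools
import random

def train_and_evaluate(lr, batch_size):
    # Hypothetical placeholder: train a model with these settings and
    # return a validation score (higher is better).
    return -abs(lr - 0.01) - abs(batch_size - 64) / 1000

# Grid search: try every combination on a predefined grid.
grid = {"lr": [0.1, 0.01, 0.001], "batch_size": [32, 64, 128]}
best_grid = max(itertools.product(grid["lr"], grid["batch_size"]),
                key=lambda cfg: train_and_evaluate(*cfg))

# Random search: sample configurations from predefined ranges.
samples = [(10 ** random.uniform(-4, -1), random.choice([32, 64, 128, 256]))
           for _ in range(20)]
best_random = max(samples, key=lambda cfg: train_and_evaluate(*cfg))
```
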
12. What are some popular frameworks for building and training
neural networks?
o Answer: Popular deep learning frameworks include:
 TensorFlow: Developed by Google, widely used for research
and production.
 PyTorch: Developed by Facebook, known for its flexibility
and ease of use.
 Keras: A high-level API that can run on top of TensorFlow or
Theano, simplifying neural network development.
 Caffe: Designed for image processing, known for its speed
and efficiency.
 MXNet: Developed by Apache, supports both CPU and GPU
computation.
13. What are the differences between batch gradient descent,
stochastic gradient descent, and mini-batch gradient descent?
o Answer:
 Batch Gradient Descent: Updates weights after processing
the entire training dataset. It is slow but often converges to
the global minimum.
 Stochastic Gradient Descent (SGD): Updates weights
after processing a single training example. It is faster but can
be noisy and oscillate around the minimum.
 Mini-batch Gradient Descent: Updates weights after
processing a small batch of training examples (typically 32-
256). It offers a balance between speed and stability, being
the most commonly used approach.
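
A minimal NumPy sketch of mini-batch gradient descent on a toy linear-regression problem (the data and learning rate are made up); setting batch_size to 1 gives SGD and setting it to len(X) gives batch gradient descent. The outer loop below is one epoch and the inner loop performs one update per batch, which also illustrates the epoch/batch distinction discussed in the next question.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

def gradient(w, Xb, yb):
    # Gradient of the mean squared error for a linear model
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

w = np.zeros(3)
lr = 0.1
batch_size = 32   # 1 -> SGD, len(X) -> batch GD, in between -> mini-batch

for epoch in range(20):                      # one epoch = one full pass over X
    idx = rng.permutation(len(X))            # shuffle examples each epoch
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        w -= lr * gradient(w, X[batch], y[batch])
```
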
14. Explain the difference between "epoch" and "batch" in neural
network training.
o Answer:
 Epoch: One complete pass through the entire training
dataset. Each epoch consists of multiple batches.
 Batch: A small subset of training examples used to update
the weights during one iteration of gradient descent. The size
of the batch can influence the speed and stability of training.
15. What are some common techniques for visualizing the internal
representations learned by neural networks?
o Answer: Techniques for visualizing internal representations
include:
 Activation maps: Showing the activation values of neurons
in different layers, providing insights into which features the
network is focusing on.
 t-SNE (t-Distributed Stochastic Neighbor
Embedding): Reducing the dimensionality of the latent
space to visualize the relationships between different data
points.
 Saliency maps: Highlighting the regions of the input image
that contribute most to the network's prediction.
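
For example, a hidden-layer representation can be projected to two dimensions with scikit-learn's t-SNE; the hidden_activations array below is an assumed placeholder for activations extracted from a trained network:

```python
import numpy as np
from sklearn.manifold import TSNE

# Assumed placeholder: activations of some hidden layer for 500 inputs
hidden_activations = np.random.default_rng(0).normal(size=(500, 64))

# Reduce to 2 dimensions for plotting; nearby points in the embedding
# tend to have similar internal representations.
embedding = TSNE(n_components=2, perplexity=30).fit_transform(hidden_activations)
print(embedding.shape)  # (500, 2)
```
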

16. What is the purpose of "padding" in convolutional layers?

o Answer: Padding is a technique used in convolutional layers to add
extra values (usually zeros) to the borders of the input image. This
helps to preserve the spatial resolution of the feature maps by
preventing the shrinking of the output size after convolution. Padding
can also help to capture features near the edges of the image, which
might be missed otherwise due to the limited receptive field of the
filters.
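
A small sketch of the output-size arithmetic, using the standard formula out = floor((n + 2p - k) / s) + 1, where n is the input size, k the filter size, p the padding on each side and s the stride:

```python
def conv_output_size(n, k, p=0, s=1):
    # n: input size, k: filter size, p: padding on each side, s: stride
    return (n + 2 * p - k) // s + 1

print(conv_output_size(32, 3, p=0))  # 30 -> the output shrinks without padding
print(conv_output_size(32, 3, p=1))  # 32 -> "same" padding preserves the size
```
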
17. Describe the different types of pooling layers commonly used in
CNNs.

o Answer: Common pooling layers in CNNs include:


 Max pooling: Takes the maximum value from a small region
(e.g., 2x2) in the feature map. This helps to reduce the number
of parameters and makes the network more robust to small
variations in the input.
 Average pooling: Calculates the average value from a small
region in the feature map. This can provide a smoother
representation of the features compared to max pooling.
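
A minimal NumPy sketch of 2x2 max pooling and average pooling with stride 2 on a toy single-channel feature map:

```python
import numpy as np

fmap = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 feature map

# Split into non-overlapping 2x2 blocks (stride 2), then reduce each block
blocks = fmap.reshape(2, 2, 2, 2)        # (row block, row in block, col block, col in block)
max_pooled = blocks.max(axis=(1, 3))     # strongest response per block
avg_pooled = blocks.mean(axis=(1, 3))    # smoother summary per block

print(max_pooled)   # [[5, 7], [13, 15]]
print(avg_pooled)   # [[2.5, 4.5], [10.5, 12.5]]
```
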
18. What is the difference between "convolution" and "cross-
correlation" in CNNs?

o Answer: Convolution and cross-correlation are similar operations but
with a subtle difference:
 Convolution: The filter is flipped (rotated by 180 degrees)
before being slid over the input image; this is the operation as
defined in signal processing.
 Cross-correlation: The filter is applied directly to the input
image without flipping.

In practice, the term "convolution" is used loosely in deep learning to refer to
both operations: most frameworks actually implement cross-correlation, and
because the filters are learned, the missing flip makes no practical difference.
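
The distinction can be checked directly with SciPy, where the two operations are separate functions; convolution equals cross-correlation with a 180-degree-flipped kernel:

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

image = np.arange(9, dtype=float).reshape(3, 3)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])

conv = convolve2d(image, kernel, mode='valid')       # kernel flipped 180 degrees
xcorr = correlate2d(image, kernel, mode='valid')     # kernel applied as-is
flipped = correlate2d(image, kernel[::-1, ::-1], mode='valid')

print(np.allclose(conv, flipped))   # True: convolution == correlation with a flipped kernel
```
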

19. Explain the concept of "stride" in convolutional layers.

o Answer: The stride is the step size by which the filter is moved
across the input image during convolution. A stride of 1 means the
filter moves one pixel at a time, while a larger stride (e.g., 2) means it
skips pixels. Using a stride larger than 1 reduces the size of the
output feature map, leading to a faster computation and potentially
coarser feature extraction.
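
Using the output-size formula from the padding example above: with a 32-pixel-wide input, a 3x3 filter, padding of 1 and stride of 2, the output width is floor((32 + 2 - 3) / 2) + 1 = 16, roughly halving the spatial size.
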
20. What are some common techniques for evaluating the performance
of neural networks?

o Answer: Common evaluation techniques for neural networks include:


 Accuracy: The proportion of correctly classified instances in a
classification task.
 Precision: The proportion of correctly predicted positive
instances among all instances predicted as positive.
 Recall: The proportion of correctly predicted positive instances
among all actual positive instances.
 F1-score: A harmonic mean of precision and recall, providing a
balanced measure of performance.
 AUC (Area Under the Curve): A measure of the classifier's
ability to distinguish between positive and negative instances.
 Mean Squared Error (MSE): A measure of the average
squared difference between predictions and true values in
regression tasks.
 Loss function value: The value of the loss function, which is
minimized during training, indicating the overall error of the
model.
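
As a sketch, most of these metrics are single calls in scikit-learn; the label and score arrays below are invented for illustration:

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, mean_squared_error)

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
y_score = [0.9, 0.2, 0.4, 0.8, 0.1, 0.7, 0.6, 0.3]   # predicted probabilities

print(accuracy_score(y_true, y_pred))    # fraction of correct predictions
print(precision_score(y_true, y_pred))   # TP / (TP + FP)
print(recall_score(y_true, y_pred))      # TP / (TP + FN)
print(f1_score(y_true, y_pred))          # harmonic mean of precision and recall
print(roc_auc_score(y_true, y_score))    # ranking quality of the scores
print(mean_squared_error([2.5, 0.0], [2.0, 0.5]))    # regression example
```
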
21. Describe the concept of a "confusion matrix" and its use in
evaluating classification models.

o Answer: A confusion matrix is a table that summarizes the
performance of a classification model by showing the number of
correct and incorrect predictions for each class. It helps to visualize
the model's accuracy, precision, recall, and other performance
metrics. The rows of the confusion matrix represent the actual class
labels, and the columns represent the predicted class labels. By
analyzing the different cells in the matrix, we can understand the
model's strengths and weaknesses in terms of classifying different
classes.
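
For instance, with scikit-learn (labels invented for illustration), the rows of the returned matrix are the actual classes and the columns are the predicted classes:

```python
from sklearn.metrics import confusion_matrix

y_true = ["cat", "cat", "dog", "dog", "dog", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "cat", "bird"]

# Rows: actual class, columns: predicted class (in the order given by `labels`)
cm = confusion_matrix(y_true, y_pred, labels=["bird", "cat", "dog"])
print(cm)
```
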
22. Explain the concept of "bias-variance trade-off" in machine
learning, particularly in the context of neural networks.

o Answer: The bias-variance trade-off is a fundamental concept in
machine learning, including neural networks, where there is a trade-
off between bias and variance in model predictions:
 Bias: A model's tendency to underfit the data, making
systematic errors. High bias models are often simple and fail to
capture complex patterns in the data.
 Variance: A model's sensitivity to variations in the training
data. High variance models are often complex and can overfit
the training data, leading to poor generalization on unseen
data.

The goal is to find a balance between bias and variance to achieve
optimal performance. Techniques like regularization, early stopping,
and increasing the size of the training dataset can help to manage
this trade-off.

23. What are some common techniques for dealing with imbalanced
datasets in machine learning, specifically in the context of neural
networks?

o Answer: Techniques for handling imbalanced datasets in neural
networks include:
 Oversampling: Increasing the number of instances of the
minority class by replicating or generating synthetic samples.
 Undersampling: Reducing the number of instances of the
majority class.
 Cost-sensitive learning: Assigning different costs to errors
made on different classes, giving more weight to
misclassifications of the minority class.
 Ensemble methods: Combining multiple models trained on
different subsets of the data or with different weighting
schemes.
 Data augmentation: Applying transformations to the minority
class to generate more diverse samples.
 Class-balanced loss functions: Using loss functions that are
specifically designed to handle imbalanced data.
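
One minimal sketch of cost-sensitive learning with a class-weighted loss: each example is weighted inversely to its class frequency, so mistakes on the minority class cost more (the labels and predictions below are synthetic):

```python
import numpy as np

y = np.array([0] * 90 + [1] * 10)          # 90/10 imbalanced binary labels
p = np.clip(np.random.default_rng(0).random(100), 1e-7, 1 - 1e-7)  # predicted P(class 1)

# Weight each class inversely to its frequency
counts = np.bincount(y)
class_weights = len(y) / (2 * counts)      # the minority class gets the larger weight
w = class_weights[y]

# Weighted binary cross-entropy: minority-class mistakes contribute more to the loss
loss = -np.mean(w * (y * np.log(p) + (1 - y) * np.log(1 - p)))
print(class_weights, loss)
```
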
24. What is the purpose of a "softmax" activation function in neural
networks, and how does it work?

o Answer: The softmax activation function is typically used in the
output layer of a neural network for multi-class classification tasks. It
takes a vector of raw scores as input and converts it into a probability
distribution over the different classes, where the sum of the
probabilities across all classes is equal to 1. The softmax function
applies an exponential transformation to each score and then
normalizes the results by dividing by the sum of all exponentiated
scores. This ensures that the output values represent probabilities
and are interpretable as the likelihood of belonging to each class.
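
As a small worked example, raw scores (2, 1, 0) give exponentials e^2 ≈ 7.389, e^1 ≈ 2.718 and e^0 = 1, which sum to about 11.107; dividing each by this sum yields probabilities of roughly 0.665, 0.245 and 0.090, which add up to 1.
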
25. Explain the concept of "batch normalization" and its benefits in
neural network training.

o Answer: Batch normalization is a technique used to normalize the
activations of neurons in a neural network during training. It involves
standardizing the outputs of each layer by subtracting the mean and
dividing by the standard deviation of the activations within a batch of
training examples. This helps to:
 Reduce internal covariate shift: Stabilize the distribution of
activations across layers, preventing the network from
becoming overly sensitive to small changes in the input data.
 Accelerate training: Enable the use of higher learning rates
without causing instability, leading to faster convergence.
 Improve generalization: Regularize the network, preventing
it from overfitting and improving performance on unseen data.
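
A minimal NumPy sketch of the normalization step for one layer's activations over a batch; gamma and beta are the learned scale and shift parameters (initialized to their usual defaults here), and eps avoids division by zero:

```python
import numpy as np

x = np.random.default_rng(0).normal(loc=3.0, scale=2.0, size=(32, 16))  # batch of 32, 16 features

mean = x.mean(axis=0)                     # per-feature mean over the batch
var = x.var(axis=0)                       # per-feature variance over the batch
eps = 1e-5
x_hat = (x - mean) / np.sqrt(var + eps)   # standardized activations

gamma, beta = np.ones(16), np.zeros(16)   # learned scale and shift
out = gamma * x_hat + beta

print(out.mean(axis=0).round(3), out.std(axis=0).round(3))  # ~0 and ~1 per feature
```
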
