Deep Learning - Unit-III Two Marks


UNIT III Regularization for Deep Learning, Deep Models

Regularization for Deep Learning: Parameter Norm Penalties, Norm Penalties as Constrained Optimization, Regularization and Under-Constrained Problems, Dataset
Augmentation, Noise Robustness, Semi-Supervised Learning, Multi-Task Learning, Early
Stopping, Parameter Tying and Parameter Sharing, Sparse Representations, Bagging and Other
Ensemble Methods, Dropout, Adversarial Training, Tangent Distance, Tangent Prop and
Manifold Tangent Classifier.
Optimization for Training Deep Models: Pure Optimization, Challenges in Neural Network
Optimization, Basic Algorithms, Parameter Initialization Strategies, Algorithms with Adaptive
Learning Rates, Approximate Second-Order Methods, Optimization Strategies and Meta-
Algorithms.

Regularization for Deep Learning


Question: What is parameter norm penalty in regularization?
Answer: Parameter norm penalty is a regularization technique that adds a penalty term to the
loss function based on the magnitude of the model parameters (e.g., L1 or L2 norm),
discouraging overly complex models.
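Example (illustrative): a minimal NumPy sketch of an L2 (weight decay) parameter norm penalty added to a mean-squared-error loss; the data, weights, and the value of lam below are made up for illustration.

import numpy as np

def l2_penalized_loss(w, X, y, lam=0.01):
    # Mean squared error plus an L2 (weight decay) penalty on the parameters.
    mse = np.mean((X @ w - y) ** 2)
    return mse + lam * np.sum(w ** 2)  # lam controls how strongly large weights are penalized

X = np.random.randn(100, 5)   # toy inputs
y = np.random.randn(100)      # toy targets
w = np.random.randn(5)        # toy parameters
print(l2_penalized_loss(w, X, y))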
Question: Explain norm penalties as constrained optimization.
Answer: Norm penalties can be viewed as constraints in optimization, where the objective is
to minimize the loss while keeping the parameters within a specified norm limit, leading to
simpler models.
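Example (illustrative): one common way to realize the constrained view is projected gradient descent, where the parameters are projected back into the norm ball after each update; this NumPy sketch shows only the projection step, with the radius k chosen arbitrarily.

import numpy as np

def project_to_norm_ball(w, k=1.0):
    # If ||w||_2 exceeds the constraint radius k, rescale w back onto the boundary.
    norm = np.linalg.norm(w)
    return w if norm <= k else w * (k / norm)

w = np.array([3.0, 4.0])            # norm 5.0, violates the constraint ||w|| <= 1
w = project_to_norm_ball(w, k=1.0)  # applied after each gradient step
print(w, np.linalg.norm(w))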
Question: How does regularization help in under-constrained problems?
Answer: Regularization helps in under-constrained problems by adding constraints that prevent
the model from fitting noise in the training data, thus improving generalization to unseen data.
Question: What is dataset augmentation?
Answer: Dataset augmentation involves artificially increasing the size of the training dataset
by applying transformations (like rotation, scaling, and flipping) to the existing data, improving
model robustness.
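Example (illustrative): a NumPy sketch that generates flipped and rotated copies of a single image array; real augmentation pipelines typically apply richer, randomized transformations.

import numpy as np

def augment(image):
    # Return simple augmented variants of a 2-D image: flips and a 90-degree rotation.
    return [
        image,
        np.fliplr(image),  # horizontal flip
        np.flipud(image),  # vertical flip
        np.rot90(image),   # 90-degree rotation
    ]

img = np.arange(16).reshape(4, 4)
augmented_batch = augment(img)  # 4 training examples generated from 1
print(len(augmented_batch))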
Question: Describe noise robustness in deep learning.
Answer: Noise robustness refers to the ability of a model to maintain performance despite
variations or corruptions in the input data, often enhanced through techniques like data
augmentation or dropout.
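Example (illustrative): one simple way to encourage noise robustness is to inject Gaussian noise into the inputs during training while keeping the targets clean; the noise level sigma here is arbitrary.

import numpy as np

def add_input_noise(X, sigma=0.1):
    # Zero-mean Gaussian noise added to the inputs; the targets stay unchanged.
    return X + np.random.normal(0.0, sigma, size=X.shape)

X = np.random.randn(32, 10)   # a minibatch of 32 examples
X_noisy = add_input_noise(X)  # train on X_noisy so learned features tolerate small corruptions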
Question: What is semi-supervised learning?
Answer: Semi-supervised learning is a machine learning approach that utilizes both labeled
and unlabeled data for training, often improving learning efficiency when labeled data is scarce.
Question: Explain multi-task learning.
Answer: Multi-task learning is a strategy where a model is trained to perform multiple tasks
simultaneously, sharing representations to improve generalization and learning efficiency
across tasks.
Question: What is early stopping in training deep learning models?
Answer: Early stopping is a regularization technique that monitors model performance on a
validation set during training and halts training when performance stops improving, preventing
overfitting.
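Example (illustrative): a sketch of an early-stopping training loop; train_one_epoch and validation_loss are placeholders for whatever training and evaluation routines the model actually uses.

def train_with_early_stopping(train_one_epoch, validation_loss, patience=5, max_epochs=100):
    # train_one_epoch() updates the model; validation_loss() returns a float on held-out data.
    best_loss, wait = float("inf"), 0
    for epoch in range(max_epochs):
        train_one_epoch()
        val_loss = validation_loss()
        if val_loss < best_loss:
            best_loss, wait = val_loss, 0  # improvement: remember it and reset patience
        else:
            wait += 1
            if wait >= patience:
                break                      # no improvement for `patience` epochs: stop
    return best_loss

In practice the parameter values from the best validation epoch are also saved and restored when training halts.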
Question: Define parameter tying and parameter sharing.
Answer: Parameter tying encourages the parameters of two models (or parts of a model) to stay close to each other, for example by penalizing the difference between them, while parameter sharing forces different parts of a model to use exactly the same parameters (as in convolutional networks), greatly reducing the number of free parameters.
Question: What are sparse representations in deep learning?
Answer: Sparse representations refer to encoding data such that most of the values are zero,
capturing essential features and reducing the amount of information needed for effective
learning.
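Example (illustrative): sparsity in the representation is often encouraged by adding an L1 penalty on a layer's activations to the loss; the penalty weight lam here is arbitrary.

import numpy as np

def sparse_activation_penalty(h, lam=0.001):
    # L1 penalty on the activations h; added to the loss, it pushes most activations toward zero.
    return lam * np.sum(np.abs(h))

h = np.array([0.0, 2.3, 0.0, 0.0, -0.7])  # a sparse hidden representation
print(sparse_activation_penalty(h))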
Question: Explain bagging in ensemble methods.
Answer: Bagging (Bootstrap Aggregating) is an ensemble method that involves training
multiple models on different subsets of the training data and combining their predictions to
improve overall performance and reduce variance.
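Example (illustrative): a NumPy sketch of bagging with simple least-squares regressors; each model is fit on a bootstrap resample and the predictions are averaged.

import numpy as np

def bagged_predictions(X_train, y_train, X_test, n_models=5):
    n = X_train.shape[0]
    preds = []
    for _ in range(n_models):
        idx = np.random.randint(0, n, size=n)  # bootstrap sample (drawn with replacement)
        w, *_ = np.linalg.lstsq(X_train[idx], y_train[idx], rcond=None)
        preds.append(X_test @ w)
    return np.mean(preds, axis=0)              # averaging the ensemble reduces variance

X_tr = np.random.randn(50, 3)
y_tr = X_tr @ np.array([1.0, -2.0, 0.5])
X_te = np.random.randn(5, 3)
print(bagged_predictions(X_tr, y_tr, X_te))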
Question: What is dropout?
Answer: Dropout is a regularization technique where randomly selected neurons are ignored
during training, preventing co-adaptation and helping to reduce overfitting in neural networks.
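Example (illustrative): a NumPy sketch of inverted dropout applied to a layer's activations; the drop probability of 0.5 is a common but arbitrary choice.

import numpy as np

def dropout(h, p_drop=0.5, training=True):
    if not training:
        return h                               # at test time the layer is the identity
    mask = np.random.rand(*h.shape) >= p_drop  # keep each unit with probability 1 - p_drop
    return h * mask / (1.0 - p_drop)           # rescale so the expected activation is unchanged

h = np.ones((2, 4))
print(dropout(h))  # roughly half of the units are zeroed, the rest scaled by 2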
Question: Describe adversarial training.
Answer: Adversarial training involves augmenting the training set with adversarial examples—
inputs designed to deceive the model—improving its robustness against such attacks during
inference.
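Example (illustrative): one common way to construct adversarial examples is the fast gradient sign method (FGSM); this sketch assumes the gradient of the loss with respect to the input has already been computed elsewhere.

import numpy as np

def fgsm_perturb(x, grad_wrt_x, epsilon=0.1):
    # Move each input feature a small step epsilon in the direction that increases the loss.
    return x + epsilon * np.sign(grad_wrt_x)

x = np.array([0.2, -1.0, 0.5])     # an input example
grad = np.array([0.9, -0.1, 0.4])  # assumed gradient of the loss w.r.t. x
x_adv = fgsm_perturb(x, grad)      # mix (x_adv, original label) into the training set
print(x_adv)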
Question: What is tangent distance?
Answer: Tangent distance is a non-parametric distance measure in which the distance between two examples is computed between the tangent planes (local linear approximations) of the manifolds they lie on, making nearest-neighbour classification approximately invariant to small, known transformations.
Question: Explain tangent propagation (Tangent Prop).
Answer: Tangent Prop is a regularization method that adds a penalty on the directional derivative of the network's output along known tangent directions of the data manifold (e.g., directions corresponding to small translations or rotations), encouraging the output to be locally invariant to those transformations.
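Example (illustrative): a finite-difference sketch of a tangent propagation penalty; the model f, the input, and the tangent direction below are toy stand-ins.

import numpy as np

def tangent_prop_penalty(f, x, tangent, eps=1e-4):
    # Squared directional derivative of f along a known manifold tangent direction at x;
    # minimizing it encourages f to be locally invariant to that transformation.
    directional_derivative = (f(x + eps * tangent) - f(x)) / eps
    return np.sum(directional_derivative ** 2)

f = lambda x: np.tanh(x @ np.array([1.0, -0.5]))  # toy model output
x = np.array([0.3, 0.8])
tangent = np.array([1.0, 0.0])                    # e.g., direction of a small translation
print(tangent_prop_penalty(f, x, tangent))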
Question: What is a manifold tangent classifier?
Answer: A manifold tangent classifier is a model that uses the local geometry of data on a
manifold for classification, effectively incorporating the structure of the data distribution.

Optimization for Training Deep Models


Question: What is pure optimization in the context of deep learning?
Answer: Pure optimization refers to minimizing an objective function J(θ) as a goal in itself. Training deep models differs from pure optimization because we minimize a surrogate loss on the training set while the real goal is good performance (generalization) on unseen data.
Question: Describe some challenges in neural network optimization.
Answer: Challenges in neural network optimization include vanishing/exploding gradients,
local minima, saddle points, and the high dimensionality of parameter spaces, which
complicate the training process.
Question: What are basic algorithms used for optimization?
Answer: Basic optimization algorithms include Stochastic Gradient Descent (SGD),
Momentum, and Adagrad, each with unique strategies for updating model parameters based on
gradient information.
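Example (illustrative): one SGD-with-momentum parameter update in NumPy; the learning rate and momentum coefficient here are arbitrary.

import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.01, beta=0.9):
    # The velocity accumulates an exponentially decaying average of past gradients.
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity

w = np.zeros(3)
v = np.zeros(3)
grad = np.array([0.5, -0.2, 0.1])  # gradient from one minibatch
w, v = sgd_momentum_step(w, grad, v)
print(w)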
Question: Explain parameter initialization strategies.
Answer: Parameter initialization strategies involve setting the initial values of model weights,
with common methods including random initialization, Xavier initialization, and He
initialization to improve convergence.
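Example (illustrative): NumPy sketches of Xavier (Glorot) and He initialization for a weight matrix of shape (fan_in, fan_out).

import numpy as np

def xavier_init(fan_in, fan_out):
    # Uniform initialization with variance scaled by both fan-in and fan-out.
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return np.random.uniform(-limit, limit, size=(fan_in, fan_out))

def he_init(fan_in, fan_out):
    # Gaussian initialization with std sqrt(2 / fan_in), suited to ReLU layers.
    return np.random.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

W1 = xavier_init(784, 256)
W2 = he_init(256, 10)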
Question: What are algorithms with adaptive learning rates?
Answer: Algorithms with adaptive learning rates adjust each parameter's learning rate using statistics of its past gradients; examples include Adagrad, RMSProp, and Adam, which often converge faster and are less sensitive to the choice of global learning rate.
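Example (illustrative): one Adam update step in NumPy, showing the bias-corrected first and second moment estimates; the hyperparameter values are the commonly used defaults.

import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad           # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2      # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)                 # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step
    return w, m, v

w, m, v = np.zeros(3), np.zeros(3), np.zeros(3)
grad = np.array([0.5, -0.2, 0.1])
w, m, v = adam_step(w, grad, m, v, t=1)
print(w)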
Question: Describe approximate second-order methods.
Answer: Approximate second-order methods use approximations of the Hessian matrix (curvature information) to guide parameter updates, improving convergence without the full computational cost of exact Newton's method; examples include conjugate gradient and L-BFGS.
Question: What are optimization strategies and meta-algorithms?
Answer: Optimization strategies and meta-algorithms involve higher-level techniques that
combine various optimization methods and heuristics to enhance model training, such as
learning rate schedules and batch normalization.
