
ECE 805-Machine Learning

Assignment_03 - 16/02/2023

Name: Syed Mafooq Ul Hassan


I.D: RA6804071
Exercise 1: a)
Learning Rate: 0.1, Validation Accuracy: 86.1%
Learning Rate: 0.01, Validation Accuracy: 85.4%
Learning Rate: 0.001, Validation Accuracy: 82.5%
Learning Rate: 0.0001, Validation Accuracy: 61.7%
Learning Rate: 0.00001, Validation Accuracy: 24.3%

In this experiment, the learning rate of 0.1 gave the best validation accuracy. This suggests that the optimization algorithm was able to quickly find a good set of weights for the neural network at this learning rate.

The corresponding loss curve started at a relatively high value and decreased quickly, showing fast convergence to a good set of weights, while the accuracy curve increased steadily to a high value, indicating that the network generalized well to unseen data. The learning rate of 0.01 performed almost as well as 0.1.

For the smaller learning rates, the loss curve also started at a high value but decreased very slowly, so the network would have needed many more epochs to reach a low loss. Within the fixed training budget this resulted in poor performance on the validation set.
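
As a rough illustration of how such a learning-rate sweep can be set up, the sketch below trains the same small Keras model with each learning rate and reports the final validation accuracy. The dataset (MNIST as a stand-in), the layer sizes, and the number of epochs are assumptions for illustration, not the exact configuration used in this assignment.

```python
from tensorflow import keras

# MNIST as an illustrative stand-in for the assignment's dataset.
(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

def build_model():
    # Small network with two hidden dense layers (sizes are assumptions).
    return keras.Sequential([
        keras.Input(shape=(784,)),
        keras.layers.Dense(10, activation="relu"),
        keras.layers.Dense(10, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),
    ])

for lr in [0.1, 0.01, 0.001, 0.0001, 0.00001]:
    model = build_model()
    model.compile(optimizer=keras.optimizers.SGD(learning_rate=lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(x_train, y_train, epochs=10,
                        validation_split=0.2, verbose=0)
    print(f"lr={lr}: val_accuracy={history.history['val_accuracy'][-1]:.3f}")
```
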
Exercise 1: b)
Number of Neurons: 2 in both dense layers, Validation Accuracy: 65.3%
Number of Neurons: 4 in both dense layers, Validation Accuracy: 78.7%
Number of Neurons: 10 in both dense layers, Validation Accuracy: 85.6%

The best validation accuracy in this experiment was achieved with 10 neurons in each dense layer. The additional neurons allow the model to better capture the complexity of the data and learn more discriminative features. The loss curve started at a high value and converged to a low loss in fewer than 5 epochs, giving the best validation accuracy.
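
A short sketch of how the width of the two dense layers can be varied; the build function and the surrounding architecture are illustrative assumptions:

```python
from tensorflow import keras

def build_model(n_neurons):
    # Both hidden dense layers use the same number of neurons (2, 4, or 10).
    return keras.Sequential([
        keras.Input(shape=(784,)),
        keras.layers.Dense(n_neurons, activation="relu"),
        keras.layers.Dense(n_neurons, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),
    ])

for n in [2, 4, 10]:
    model = build_model(n)
    # compile and fit as in part (a), then compare the validation accuracies
```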

Exercise 1: c)
Dropout Rate: 0.2, Validation Accuracy: 72.1%
Dropout Rate: 0.5, Validation Accuracy: 59.4%

The loss curve for the model with a dropout rate of 0.2 started at a higher value and then decreased quickly, resulting in fast convergence. Its accuracy increased steadily and then plateaued at a high value, indicating good generalization performance. The model with a dropout rate of 0.5 followed a similar pattern but reached a lower final accuracy, since a larger fraction of neurons was dropped out during training.
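
A sketch of how a configurable dropout rate can be added after each hidden dense layer; the placement of the Dropout layers and the rest of the architecture are assumptions for illustration:

```python
from tensorflow import keras

def build_model(dropout_rate):
    # A Dropout layer after each hidden layer randomly zeroes that fraction
    # of activations during training, acting as a regularizer.
    return keras.Sequential([
        keras.Input(shape=(784,)),
        keras.layers.Dense(10, activation="relu"),
        keras.layers.Dropout(dropout_rate),
        keras.layers.Dense(10, activation="relu"),
        keras.layers.Dropout(dropout_rate),
        keras.layers.Dense(10, activation="softmax"),
    ])

model_low = build_model(0.2)   # milder regularization; higher accuracy in this experiment
model_high = build_model(0.5)  # half the activations dropped; lower final accuracy
```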

Best Performance: Learning Rate: 0.1, Number of Neurons: 10.


Exercise 2: a) One-Pass Learning:
One-pass learning is a machine learning approach that involves training a model on a dataset
in a single pass, without the need for multiple iterations or epochs. In other words, the model
processes the entire training set just once and makes updates to its parameters accordingly.

Advantages:

• One-pass learning can be faster and more memory-efficient than traditional batch or
mini-batch gradient descent methods, as it avoids the need to store and process the
entire dataset multiple times.
• It is well-suited to large-scale datasets, as it can process data in a streaming fashion
and scale to handle very large datasets.
• It can be useful in real-time or online learning scenarios, where data arrives in a
continuous stream and the model needs to be updated in real-time.

Disadvantages:

• Since one-pass learning only sees each example once, it may not converge to the
optimal solution, especially if the data is noisy or there are outliers. Additionally, the
model may not have enough opportunities to adjust its parameters to the underlying
patterns in the data.
• One-pass learning may not be able to learn complex patterns in the data, as it has
limited opportunities to adjust its parameters to fit the data. It may be less effective
for deep learning models with many layers and parameters.
• One-pass learning requires careful selection of hyperparameters such as the learning
rate and regularization to ensure that the model learns effectively and does not
overfit.

Useful Cases:

One-pass learning is most useful in scenarios where the data is large and continuous, and
where the goal is to process the data in real-time or with limited resources. Examples include
anomaly detection in streaming data, fraud detection in financial transactions, and natural
language processing in real-time chat applications. One-pass learning may also be useful as a
pre-training step to initialize the weights of a deep learning model before fine-tuning with
more iterations or epochs.
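
As a minimal sketch of the idea, the snippet below uses scikit-learn's SGDClassifier and partial_fit to update the model as batches stream by, with each example seen exactly once; the synthetic stream and batch size are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Illustrative streaming source; in practice this would be a real data stream.
def stream_batches(n_batches=100, batch_size=32, n_features=20, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(n_batches):
        X = rng.normal(size=(batch_size, n_features))
        y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
        yield X, y

model = SGDClassifier(loss="log_loss", learning_rate="constant", eta0=0.01)
classes = np.array([0, 1])

# Single pass: each batch is seen exactly once and then discarded.
for X_batch, y_batch in stream_batches():
    model.partial_fit(X_batch, y_batch, classes=classes)
```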

Exercise 2: b) Incremental Learning:


Incremental learning is a machine learning approach in which the model is first trained on an
initial subset of the data, and then additional data is sequentially presented to the model,
allowing it to adapt and learn from new examples.

Incremental learning is often used in scenarios where data is constantly changing or being
added, such as in online learning or real-time data processing, and it can also help to improve
efficiency and reduce the amount of time and resources needed for training models.
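
A minimal sketch of this setup with scikit-learn's partial_fit interface: the model is first trained on an initial chunk, and later chunks are then presented sequentially to update it. The synthetic dataset, held-out split, and chunk count are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Illustrative data; in practice new chunks would arrive over time.
X, y = make_classification(n_samples=6000, n_features=20, random_state=0)
X_stream, X_test, y_stream, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = SGDClassifier(loss="log_loss", random_state=0)

# Initial training on the first chunk, then sequential updates on later chunks.
chunks = zip(np.array_split(X_stream, 10), np.array_split(y_stream, 10))
for i, (X_chunk, y_chunk) in enumerate(chunks):
    model.partial_fit(X_chunk, y_chunk, classes=np.unique(y))
    print(f"after chunk {i}: held-out accuracy = {model.score(X_test, y_test):.3f}")
```
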
Advantages:

• Incremental learning can be more resource-efficient than retraining the model from
scratch each time new data is added, as it only needs to update the parameters based
on the new data. This can be especially useful in scenarios where the dataset is very
large or continuously growing.
• It allows the model to adapt to changing data distributions over time, which can be
particularly useful in scenarios where the underlying data generating process may
change over time.
• It can allow the model to adapt more quickly to changes in the data and produce faster
responses than traditional batch learning approaches.

Disadvantages:

• Incremental learning can suffer from the problem of catastrophic forgetting, where
the model forgets previous knowledge as it learns new information. This can be
particularly problematic in scenarios where the model needs to maintain its
performance on previously learned tasks.
• It requires careful selection of representative examples to avoid introducing bias into
the updated model. In addition, the order in which the examples are presented can
also impact the performance of the updated model.
• It can be susceptible to model drift, where the updated model no longer accurately
represents the underlying data distribution due to changes in the data over time.

Alternative Approaches:

An alternative approach to incremental learning is transfer learning, where a model pre-trained on a source task is fine-tuned on a target task with additional data. Transfer learning
can be particularly useful in scenarios where the initial dataset is too small to train an effective
model, or where the target task is related to the source task. Transfer learning can also help
mitigate the issue of catastrophic forgetting, as the pre-trained model is already initialized
with knowledge from the source task and may be able to adapt to the new task more
effectively. However, transfer learning requires access to a pre-trained model, which may not
always be available or feasible, and may not be as flexible as incremental learning in scenarios
where the underlying data generating process changes over time.
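
A brief Keras sketch of that transfer-learning alternative, in which a pre-trained base is frozen and only a new output head is trained on the target task; the choice of MobileNetV2 with ImageNet weights and the single-layer head are illustrative assumptions.

```python
from tensorflow import keras

# Pre-trained base from a source task (ImageNet here, purely as an illustration).
base = keras.applications.MobileNetV2(include_top=False, weights="imagenet",
                                      input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # keep the source-task knowledge fixed

# New head for the target task; only these weights are updated at first.
model = keras.Sequential([
    base,
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(target_x, target_y, ...)  # fine-tune on the target task's data
```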

Exercise 2: c) Justification:
In general, real concept drift is a more important problem than virtual concept drift.

Real concept drift refers to a change over time in the relationship between the input features and the target variable (the conditional distribution P(y|x)). When this happens, the decision boundary the model has learned is no longer valid, and its performance degrades unless the model is adapted to the new relationship. In contrast, virtual concept drift refers to a change in the distribution of the inputs (P(x)) while the relationship between features and target stays the same; the learned mapping remains correct, although the model may now be asked to predict in regions of the input space it saw rarely during training.

Real concept drift is therefore likely to be the more important problem, as it directly invalidates what the model has learned, can have a larger impact on predictive performance, and may be harder to detect and address. However, both types of concept drift should be carefully monitored and addressed to ensure that the model's predictions remain accurate and reliable over time.
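
One simple way to monitor for real concept drift is to compare the model's error rate on recent labeled examples against its long-run error rate and flag drift when the recent rate is clearly higher. A minimal sketch, with the window size and margin chosen purely for illustration:

```python
from collections import deque

class SlidingErrorMonitor:
    """Flags possible real concept drift when the recent error rate
    exceeds the long-run error rate by a fixed margin."""
    def __init__(self, window=200, margin=0.10):
        self.recent = deque(maxlen=window)
        self.errors = 0
        self.total = 0
        self.margin = margin

    def update(self, y_true, y_pred):
        err = int(y_true != y_pred)
        self.recent.append(err)
        self.errors += err
        self.total += 1
        long_run = self.errors / self.total
        recent = sum(self.recent) / len(self.recent)
        # Drift is suspected when recent errors clearly exceed the long-run rate.
        return recent > long_run + self.margin

# Usage: for each labeled streaming example, call monitor.update(y, model.predict(x)).
```
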
Exercise 2: d)
Online active learning is a machine learning approach that combines online learning and active
learning to learn from streaming data while minimizing the labeling effort. In online active
learning, uncertainty-based sampling is a common strategy used to select informative
examples for labeling. Below are two widely used methods to improve an uncertainty-based
sampling strategy.

Diversity sampling: In addition to selecting samples that the model is uncertain about,
diversity sampling selects samples that maximize the diversity of the labeled data. This can
help improve the performance of the model by reducing the redundancy of the labeled data,
and ensuring that the model learns a representative sample of the data distribution.

Active search: Active search is a method that focuses on sampling examples that the model is
likely to misclassify, rather than just focusing on the most uncertain examples. This can help
the model learn more quickly from the labeled data by actively searching for the most
informative examples that will help the model better discriminate between different classes.
Active search can be particularly useful when the data distribution is skewed or when there
are imbalanced classes.
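
A minimal sketch of how a diversity criterion can be layered on top of uncertainty sampling for a binary classifier: shortlist the most uncertain pool examples, then query the one farthest from the already-labeled points. The margin-based uncertainty measure, Euclidean distance, and candidate count are illustrative assumptions.

```python
import numpy as np

def select_query(model, X_pool, X_labeled, n_candidates=20):
    """Pick one pool example to label: uncertain AND far from the labeled data."""
    proba = model.predict_proba(X_pool)
    margin = np.abs(proba[:, 1] - proba[:, 0])       # small margin = high uncertainty
    candidates = np.argsort(margin)[:n_candidates]   # shortlist the most uncertain
    # Diversity term: distance from each candidate to its nearest labeled point.
    dists = np.linalg.norm(
        X_pool[candidates][:, None, :] - X_labeled[None, :, :], axis=2)
    return candidates[np.argmax(dists.min(axis=1))]  # farthest from the labeled set
```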

Overall, both diversity sampling and active search are methods that can be used to improve
the effectiveness of an uncertainty-based sampling strategy in online active learning. By
selecting the most informative examples for labeling, these methods can help improve the
performance of the model while minimizing the labeling effort required.
