PyTorch
REPORT
ID: 22127385
Class: 22TGMT
1. Import Required Libraries
In this section, we import the necessary libraries required for building and training
the neural network using PyTorch, as well as for plotting the results.
Libraries Imported:
- torch: used for creating and manipulating tensors, defining neural network layers, and
performing various operations on tensors.
- torch.nn as nn: provides various classes and functions to build neural networks.
- time: used to measure the training time of the neural network.
- matplotlib.pyplot as plt: to visualize the training loss over iterations.
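A minimal import block corresponding to this list might look as follows:
import time                      # measure training duration
import torch                     # tensor creation and manipulation
import torch.nn as nn            # neural network layers and loss functions
import matplotlib.pyplot as plt  # visualize the training loss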
a. train_network FUNCTION
- Purpose: This function trains the neural network by performing forward and
backward propagation over a specified number of iterations.
- Parameters:
- Returns: A list of loss values recorded at each iteration.
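A minimal sketch of such a function, assuming it receives the model, the training tensors X and y, a learning rate, and an iteration count (the exact parameter list is not reproduced here):
def train_network(model, X, y, learning_rate, iterations):
    criterion = nn.MSELoss()  # mean squared error between predictions and targets
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
    losses = []
    for _ in range(iterations):
        optimizer.zero_grad()          # clear gradients from the previous step
        loss = criterion(model(X), y)  # forward propagation
        loss.backward()                # backward propagation
        optimizer.step()               # update the weights
        losses.append(loss.item())     # record the loss at this iteration
    return losses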
b. evaluate_network FUNCTION
- Purpose: This function evaluates the performance of the trained neural network
on a given dataset.
- Parameters:
- Returns: The final loss value and the predictions made by the network.
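A corresponding sketch, assuming the same model and data tensors:
def evaluate_network(model, X, y):
    with torch.no_grad():       # no gradients are needed for evaluation
        predictions = model(X)  # forward pass on the evaluation data
        final_loss = nn.MSELoss()(predictions, y).item()
    return final_loss, predictions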
c. plot_loss FUNCTION
- Purpose: This function plots the training loss over iterations to visualize the
training process.
- Parameters:
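A possible sketch, assuming it receives the list of loss values returned by train_network:
def plot_loss(losses, title="Training Loss over Iterations"):
    plt.plot(losses)  # one loss value per training iteration
    plt.xlabel("Iteration")
    plt.ylabel("Loss")
    plt.title(title)
    plt.show()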
d. measure_training_time FUNCTION
- Purpose: This function measures the time taken to train the neural network over a
specified number of iterations.
- Parameters:
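A possible sketch, assuming it simply wraps train_network with timing code:
def measure_training_time(model, X, y, learning_rate, iterations):
    start = time.time()  # timestamp before training
    losses = train_network(model, X, y, learning_rate, iterations)
    elapsed = time.time() - start  # elapsed wall-clock seconds
    return losses, elapsed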
4. Experiment: Learning Rate
a. Overview:
The learning rate is a crucial parameter in the training process of neural networks.
It determines the extent to which the weights of the network are updated during each step
of the optimization process. A learning rate that is too high can prevent the model from
converging or cause it to oscillate around the optimum, while a learning rate that is too
low makes training slow and can leave the model stuck in a local optimum.
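Concretely, in plain gradient descent each weight w is updated as w ← w − η · ∂L/∂w, where η is the learning rate and L is the loss, so η directly scales the size of every update step.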
b. Test Set and Parameters:
- Test Set: the test values are chosen to represent a small and simple dataset, making it
easy to observe and analyze the training results.
- Fixed Parameters:
+ input_size = 3
+ hidden_size = 4
+ output_size = 1
+ iterations = 1000
These parameters are chosen to create a simple neural network with one hidden layer,
sufficient to illustrate the impact of the learning rate without being overly complex.
c. Implementation:
- Define the learning rates to test:
learning_rates = [0.001, 0.01, 0.1, 0.5, 1.0]
- Training process:
+ Initialize the neural network with fixed parameters.
+ Train the neural network with each learning rate in the list.
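A sketch of this loop, assuming a single-hidden-layer network built with nn.Sequential, a Sigmoid activation, the helper functions described earlier, and training tensors X and y like those used in the later experiments (the exact architecture in the report may differ):
for lr in learning_rates:
    model = nn.Sequential(nn.Linear(input_size, hidden_size),
                          nn.Sigmoid(),
                          nn.Linear(hidden_size, output_size))
    losses = train_network(model, X, y, lr, iterations)  # train with the current learning rate
    plot_loss(losses, title=f"Learning rate = {lr}")      # compare loss curves across rates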
d. Results:
e. Visualization:
5. Experiment: Layer Sizes
a. Overview:
Layer sizes in a neural network, including the number of neurons in each layer
and the number of layers, play a crucial role in determining the network's capacity to
learn and generalize from data. The input layer size corresponds to the number of features
in the input data, the hidden layer size affects the network's ability to capture complex
patterns, and the output layer size corresponds to the number of prediction targets.
b. Test Set and Parameters:
- Test Set:
+ Input: X = torch.tensor([[2, 9, 0], [1, 5, 1], [3, 6, 2]], dtype=torch.float32)
+ Output: y = torch.tensor([[90], [100], [88]], dtype=torch.float32)
These values are chosen to represent a small and simple dataset, making it easy to
observe and analyze the training results.
- Fixed Parameters:
+ learning_rate = 0.1
+ iterations = 1000
These parameters are chosen to provide a consistent training environment across
different layer configurations, allowing us to isolate the impact of layer sizes on the
network's performance.
c. Implementation:
- Define the layer configurations to test:
layer_configs = [
{"input_size": 2, "hidden_size": 3, "output_size": 1},
{"input_size": 3, "hidden_size": 5, "output_size": 1},
{"input_size": 3, "hidden_size": 8, "output_size": 2}
]
- Training process:
+ Initialize the neural network with the specified layer sizes.
+ Adjust the input and output data sizes according to the current configuration.
+ Train the neural network and record the loss, training time, and final predictions for
each configuration.
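A sketch of this loop under the same assumptions as above; slicing X and repeating y to match the configured sizes is an assumed approach, since the report does not show how the data is adjusted:
for config in layer_configs:
    X_cfg = X[:, :config["input_size"]]         # keep only the configured number of features
    y_cfg = y.repeat(1, config["output_size"])  # duplicate the target for multi-output configs
    model = nn.Sequential(nn.Linear(config["input_size"], config["hidden_size"]),
                          nn.Sigmoid(),
                          nn.Linear(config["hidden_size"], config["output_size"]))
    losses, elapsed = measure_training_time(model, X_cfg, y_cfg, learning_rate, iterations)
    final_loss, predictions = evaluate_network(model, X_cfg, y_cfg)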
d. Results:
e. Visualization:
6. Experiment: Hidden Layer Depth
a. Overview:
The depth of hidden layers in a neural network, which refers to the number of
hidden layers and the number of neurons in each layer, plays a crucial role in determining
the network's capacity to learn and generalize from data. Deeper networks with more
hidden layers can capture more complex patterns but may also require more
computational resources and careful tuning to avoid overfitting.
b. Test Set and Parameters:
- Test Set:
+ Input: X = torch.tensor([[2, 9, 0], [1, 5, 1], [3, 6, 2]], dtype=torch.float32)
+ Output: y = torch.tensor([[90], [100], [88]], dtype=torch.float32)
These values are chosen to represent a small and simple dataset, making it easy to
observe and analyze the training results.
- Fixed Parameters:
+ learning_rate = 0.1
+ iterations = 1000
These parameters are chosen to provide a consistent training environment across
different hidden layer configurations, allowing us to isolate the impact of hidden layer
depth on the network's performance.
c. Implementation:
- Define the hidden layer configurations to test:
hidden_layer_configs = [
[4],
[4, 3],
[4, 3, 2]
]
- Training process.
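A sketch of this step, assuming a hypothetical build_network helper that stacks one Linear layer and a Sigmoid activation per entry in the configuration list:
def build_network(input_size, hidden_sizes, output_size):
    layers = []
    prev = input_size
    for h in hidden_sizes:  # one Linear + activation per hidden layer
        layers += [nn.Linear(prev, h), nn.Sigmoid()]
        prev = h
    layers.append(nn.Linear(prev, output_size))  # final output layer
    return nn.Sequential(*layers)

for hidden_sizes in hidden_layer_configs:
    model = build_network(3, hidden_sizes, 1)
    losses, elapsed = measure_training_time(model, X, y, learning_rate, iterations)
    final_loss, predictions = evaluate_network(model, X, y)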
d. Results:
Observation:
- A single hidden layer with a sufficient number of neurons can perform well on simple
datasets.
- Adding more hidden layers can capture more complex patterns but may also increase
the training time.
- The choice of hidden layer depth should balance the complexity of the problem and the
computational resources available.
e. Visualization:
7. Experiment: Activation Functions
a. Overview:
b. Test Set and Parameters:
- Fixed Parameters:
+ input_size = 3
+ hidden_size = 4
+ output_size = 1
+ learning_rate = 0.1
+ iterations = 1000
c. Implementation:
- Define the activation functions to test:
activation_functions = {
"Sigmoid": nn.Sigmoid(),
"ReLU": nn.ReLU(),
"Tanh": nn.Tanh()
}
- Training process.
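A sketch of this loop, assuming the same single-hidden-layer architecture with only the activation between the two Linear layers swapped out:
for name, activation in activation_functions.items():
    model = nn.Sequential(nn.Linear(input_size, hidden_size),
                          activation,
                          nn.Linear(hidden_size, output_size))
    losses = train_network(model, X, y, learning_rate, iterations)
    final_loss, predictions = evaluate_network(model, X, y)
    plot_loss(losses, title=f"Activation = {name}")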
d. Results:
Observation:
- The Sigmoid and Tanh activation functions provided good results with low final loss
and accurate predictions.
- The ReLU activation function did not perform well in this experiment, likely due to
the small dataset and specific initialization leading to dead neurons.
e. Visualization:
8. Experiment: Iteration Count
a. Overview:
The iteration count in the training process of a neural network refers to the number of
times the entire dataset is passed through the network. It is a crucial parameter that affects the
convergence and performance of the model. A higher iteration count allows the network to
learn more from the data, potentially leading to better performance, but it also increases the
training time.
b. Test Set and Parameters:
- Fixed Parameters:
+ learning_rate = 0.1
+ input_size = 3
+ hidden_size = 4
+ output_size = 1
c. Implementation:
- Define the iteration counts to test:
iteration_counts = [500, 1000, 2000, 5000]
- Training process.
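A sketch of this loop, re-initializing the network for each iteration count so the runs stay independent (assumed approach):
for n_iters in iteration_counts:
    model = nn.Sequential(nn.Linear(input_size, hidden_size),
                          nn.Sigmoid(),
                          nn.Linear(hidden_size, output_size))
    losses, elapsed = measure_training_time(model, X, y, learning_rate, n_iters)
    final_loss, predictions = evaluate_network(model, X, y)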
d. Results:
Observation:
- Increasing the iteration count beyond a certain point yields diminishing returns in
terms of performance improvement.
e. Visualization:
9. Compile and Report Results
In this section, I compile the results from the various experiments conducted on learning rate,
layer sizes, hidden layer depth, activation functions, and iteration count. I analyze the final loss,
training time, and predictions to draw conclusions about the impact of these parameters on the
performance of the neural network.
Table 2. Layer Size Experiment Results
Diagram 1. Learning Rate Experiment Results
Diagram 2. Hidden Layer Depth Experiment Results
Diagram 4. Iteration Count Experiment Results
10. References
1. PyTorch Documentation. https://pytorch.org/docs/stable/index.html
2. Activation functions: Tanh vs. Sigmoid vs. ReLU. GeeksforGeeks.