This report details experiments conducted using PyTorch to analyze the effects of various parameters on neural network performance, including learning rate, layer sizes, hidden layer depth, activation functions, and iteration count. Each section outlines the methodology, results, and observations from the experiments, highlighting how these factors influence training loss, prediction accuracy, and training time. The report concludes with compiled results and visualizations for better understanding of the impact of these parameters.

UNIVERSITY OF SCIENCE – VNUHCM

FACULTY OF INFORMATION TECHNOLOGY

REPORT

LAB 03: PYTORCH

Student’s name: Nguyen Quoc Thang

ID: 22127385

Class: 22TGMT
TABLE OF CONTENTS

1. Import Required Libraries
2. Define Neural Network Class
3. Define Helper Functions
4. Experiment: Learning Rate
5. Experiment: Layer Sizes
6. Experiment: Hidden Layer Depth
7. Experiment: Activation Functions
8. Experiment: Iteration Count
9. Compile and Report Results
10. References
1. Import Required Libraries
In this section, we import the necessary libraries required for building and training
the neural network using PyTorch, as well as for plotting the results.
Libraries Imported:
- torch: used for creating and manipulating tensors, defining neural network layers, and performing various tensor operations.
- torch.nn (imported as nn): provides the classes and functions used to build neural networks.
- time: used to measure the training time of the neural network.
- matplotlib.pyplot (imported as plt): used to visualize the training loss over iterations.
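The corresponding import statements, as a minimal sketch matching the descriptions above:

import time                      # measure training duration

import torch                     # tensors and tensor operations
import torch.nn as nn            # building blocks for neural networks
import matplotlib.pyplot as plt  # plot the training loss over iterations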

2. Define Neural Network Class


In this section, I use the FFNeuralNetwork class (from the guidance document), which represents a feed-forward neural network. The class is built on PyTorch's nn.Module and includes methods for forward propagation, backward propagation, training, and saving/loading weights.

(See the code for details.)
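The actual implementation follows the guidance document; the sketch below only illustrates what such a class can look like in PyTorch. The constructor signature, layer names, and the default Sigmoid activation are assumptions made for illustration.

class FFNeuralNetwork(nn.Module):
    # Minimal feed-forward network: input -> hidden -> output.
    def __init__(self, input_size, hidden_size, output_size, activation=nn.Sigmoid()):
        super().__init__()
        self.hidden = nn.Linear(input_size, hidden_size)
        self.output = nn.Linear(hidden_size, output_size)
        self.activation = activation  # assumed to be configurable (used again in Section 7)

    def forward(self, x):
        # Forward propagation through the hidden and output layers.
        x = self.activation(self.hidden(x))
        return self.activation(self.output(x))

    def save_weights(self, path):
        # Persist the learned parameters to disk.
        torch.save(self.state_dict(), path)

    def load_weights(self, path):
        # Restore previously saved parameters.
        self.load_state_dict(torch.load(path))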

3. Define Helper Functions


In this section, I define several helper functions that are essential for training,
evaluating, and visualizing the performance of the neural network.

a. train_network FUNCTION
- Purpose: This function trains the neural network by performing forward and
backward propagation over a specified number of iterations.
- Parameters:

+ network: The neural network model to be trained.

+ X, y: The input data and the target output data.

+ learning_rate: The learning rate for the optimizer.

+ iterations: The number of training iterations.

- Returns: A list of loss values recorded at each iteration.

b. evaluate_network FUNCTION

- Purpose: This function evaluates the performance of the trained neural network
on a given dataset.
- Parameters:

+ network: The trained neural network model.

+ X, y: The input data and the target output data.

- Returns: The final loss value and the predictions made by the network.

c. plot_loss FUNCTION

- Purpose: This function plots the training loss over iterations to visualize the
training process.
- Parameters:

+ losses: A list of loss values recorded during training.

+ title: The title of the plot.

- Returns: None (displays the plot).

d. measure_training_time FUNCTION

- Purpose: This function measures the time taken to train the neural network over a
specified number of iterations.
- Parameters:

+ network: The neural network model to be trained.

+ X, y: The input data and the target output data.

+ learning_rate: The learning rate for the optimizer.

+ iterations: The number of training iterations.

- Returns: The total training time in seconds.
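A minimal sketch of these helpers, assuming the FFNeuralNetwork interface above, mean-squared-error loss, and plain gradient-descent updates (the exact loss function and update rule follow the guidance document):

def train_network(network, X, y, learning_rate, iterations):
    # Full-batch gradient descent with MSE loss; returns the loss per iteration.
    criterion = nn.MSELoss()
    optimizer = torch.optim.SGD(network.parameters(), lr=learning_rate)
    losses = []
    for _ in range(iterations):
        optimizer.zero_grad()
        loss = criterion(network(X), y)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses

def evaluate_network(network, X, y):
    # Final loss and predictions, computed without tracking gradients.
    with torch.no_grad():
        predictions = network(X)
        loss = nn.MSELoss()(predictions, y).item()
    return loss, predictions

def plot_loss(losses, title):
    # Plot the recorded training loss over iterations.
    plt.plot(losses)
    plt.xlabel("Iteration")
    plt.ylabel("Loss")
    plt.title(title)
    plt.show()

def measure_training_time(network, X, y, learning_rate, iterations):
    # Time a full training run and return the elapsed seconds.
    start = time.time()
    train_network(network, X, y, learning_rate, iterations)
    return time.time() - start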

4. Experiment: Learning Rate
a. Overview:

The learning rate is a crucial parameter in the training process of neural networks.
It determines the extent to which the weights of the network are updated during each step
of the optimization process. A learning rate that is too high can prevent the model from converging or cause it to oscillate around the optimal value, while a learning rate that is too low can make the training process slow and prone to getting stuck in local optima.

b. Test Set and Parameters


- Test Set:

+ Input: X = torch.tensor([[2, 9, 0], [1, 5, 1], [3, 6, 2]], dtype=torch.float32)

+ Output: y = torch.tensor([[90], [100], [88]], dtype=torch.float32)

These values are chosen to represent a small and simple dataset, making it easy to
observe and analyze the training results.

- Fixed Parameters:

+ input_size = 3

+ hidden_size = 4

+ output_size = 1

+ iterations = 1000

These parameters are chosen to create a simple neural network with one hidden layer,
sufficient to illustrate the impact of the learning rate without being overly complex.

c. Implementation:
- Define the learning rates to test:
learning_rates = [0.001, 0.01, 0.1, 0.5, 1.0]
- Training process (a minimal sketch of this loop is given after this list):
+ Initialize the neural network with fixed parameters.
+ Train the neural network with each learning rate in the list.
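A minimal sketch of this experiment loop, assuming the imports, class, and helpers sketched above. Scaling the targets to [0, 1] is an assumption made so that they match the sigmoid output range suggested by the reported predictions.

X = torch.tensor([[2, 9, 0], [1, 5, 1], [3, 6, 2]], dtype=torch.float32)
y = torch.tensor([[90], [100], [88]], dtype=torch.float32) / 100  # assumed normalization

learning_rates = [0.001, 0.01, 0.1, 0.5, 1.0]
for lr in learning_rates:
    network = FFNeuralNetwork(input_size=3, hidden_size=4, output_size=1)
    losses = train_network(network, X, y, learning_rate=lr, iterations=1000)
    final_loss, predictions = evaluate_network(network, X, y)
    plot_loss(losses, f"Learning rate = {lr}")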

d. Results:

Learning rate   Final loss   Training time (s)   Prediction
0.001           0.01146      0.2443              tensor([[0.7924], [0.8204], [0.8723]])
0.01            0.0077       0.2362              tensor([[0.8320], [0.8676], [0.9109]])
0.1             0.0036       0.2533              tensor([[0.9176], [0.9244], [0.9496]])
0.5             0.0028       0.2322              tensor([[0.9159], [0.9312], [0.9380]])
1.0             0.0007       0.2787              tensor([[0.9088], [0.9570], [0.8895]])

Observation: Learning rates of 0.1 and 0.5 provided the best results in this experiment, with low loss and accurate predictions.

e. Visualization:

5. Experiment: Layer Sizes
a. Overview:
Layer sizes in a neural network, including the number of neurons in each layer
and the number of layers, play a crucial role in determining the network's capacity to
learn and generalize from data. The input layer size corresponds to the number of features
in the input data, the hidden layer size affects the network's ability to capture complex
patterns, and the output layer size corresponds to the number of prediction targets.
b. Test Set and Parameters:
- Test Set:
+ Input: X = torch.tensor([[2, 9, 0], [1, 5, 1], [3, 6, 2]], dtype=torch.float32)
+ Output: y = torch.tensor([[90], [100], [88]], dtype=torch.float32)
These values are chosen to represent a small and simple dataset, making it easy to
observe and analyze the training results.
- Fixed Parameters:
+ learning_rate = 0.1
+ iterations = 1000
These parameters are chosen to provide a consistent training environment across
different layer configurations, allowing us to isolate the impact of layer sizes on the
network's performance.
c. Implementation:
- Define the layer configurations to test:

layer_configs = [
{"input_size": 2, "hidden_size": 3, "output_size": 1},
{"input_size": 3, "hidden_size": 5, "output_size": 1},
{"input_size": 3, "hidden_size": 8, "output_size": 2}
]
- Training process (a minimal sketch follows this list):
+ Initialize the neural network with the specified layer sizes.
+ Adjust the input and output data sizes according to the current configuration.
+ Train the neural network and record the loss, training time, and final predictions for
each configuration.
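A minimal sketch of this loop, reusing the helpers sketched above. The slicing used to adjust X and y to each configuration is an assumption (dropping a feature for input_size = 2 and repeating the target column for output_size = 2):

for config in layer_configs:
    # Adjust the data shapes to the current configuration (illustrative choice).
    X_cfg = X[:, :config["input_size"]]
    y_cfg = y.repeat(1, config["output_size"])
    network = FFNeuralNetwork(**config)
    losses = train_network(network, X_cfg, y_cfg, learning_rate=0.1, iterations=1000)
    final_loss, predictions = evaluate_network(network, X_cfg, y_cfg)
    plot_loss(losses, f"Layer sizes: {config}")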
d. Results:

Configuration (input-hidden-output)   Final loss   Training time (s)   Prediction
2-3-1                                 0.0025       0.2362              tensor([[0.9285], [0.9235], [0.9097]])
3-5-1                                 0.0039       0.2360              tensor([[0.9418], [0.9197], [0.9403]])
3-8-2                                 0.0036       0.2369              tensor([[0.9189, 0.9338], [0.9241, 0.9192], [0.9394, 0.9460]])

Observation:
- Increasing the hidden layer size slightly increases the final loss but still provides accurate predictions.
- Adding more neurons to the hidden layer and increasing the output size results in a slightly higher final loss but provides detailed predictions for multiple outputs.

e. Visualization:

6. Experiment: Hidden Layer Depth
a. Overview:
The depth of hidden layers in a neural network, which refers to the number of
hidden layers and the number of neurons in each layer, plays a crucial role in determining
the network's capacity to learn and generalize from data. Deeper networks with more
hidden layers can capture more complex patterns but may also require more
computational resources and careful tuning to avoid overfitting.
b. Test Set and Parameters:
- Test Set:
+ Input: X = torch.tensor([[2, 9, 0], [1, 5, 1], [3, 6, 2]], dtype=torch.float32)
+ Output: y = torch.tensor([[90], [100], [88]], dtype=torch.float32)
These values are chosen to represent a small and simple dataset, making it easy to
observe and analyze the training results.
- Fixed Parameters:
+ learning_rate = 0.1
+ iterations = 1000

These parameters are chosen to provide a consistent training environment across
different hidden layer configurations, allowing us to isolate the impact of hidden layer
depth on the network's performance.

c. Implementation:
- Define the hidden layer configurations to test:
hidden_layer_configs = [
    [4],
    [4, 3],
    [4, 3, 2]
]
- Training process (a minimal sketch follows).
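A minimal sketch, assuming a variant of the network that accepts a list of hidden-layer sizes; this constructor and the Sigmoid activations are assumptions, while the actual class follows the guidance document.

class DeepFFNetwork(nn.Module):
    # Feed-forward network with a configurable number of hidden layers.
    def __init__(self, input_size, hidden_sizes, output_size):
        super().__init__()
        layers, prev = [], input_size
        for h in hidden_sizes:
            layers += [nn.Linear(prev, h), nn.Sigmoid()]
            prev = h
        layers += [nn.Linear(prev, output_size), nn.Sigmoid()]
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)

for hidden_sizes in hidden_layer_configs:
    network = DeepFFNetwork(input_size=3, hidden_sizes=hidden_sizes, output_size=1)
    losses = train_network(network, X, y, learning_rate=0.1, iterations=1000)
    final_loss, predictions = evaluate_network(network, X, y)
    plot_loss(losses, f"Hidden layers: {hidden_sizes}")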
d. Results:

Hidden layers   Final loss   Training time (s)   Prediction
[4]             0.0028       0.4174              tensor([[0.9220], [0.9265], [0.9297]])
[4, 3]          0.0028       0.4996              tensor([[0.9267], [0.9255], [0.9262]])
[4, 3, 2]       0.0028       0.8598              tensor([[0.9255], [0.9253], [0.9253]])

Observation:
- A single hidden layer with a sufficient number of neurons can perform well on simple datasets.
- Adding more hidden layers can capture more complex patterns but may also increase the training time.
- The choice of hidden layer depth should balance the complexity of the problem and the computational resources available.

e. Visualization:

7. Experiment: Activation Functions
a. Overview:

Activation functions play a crucial role in neural networks by introducing non-linearity into the model, allowing it to learn complex patterns in the data. Different activation functions can have a significant impact on the performance and convergence of the network. Common activation functions include Sigmoid, ReLU (Rectified Linear Unit), and Tanh.

b. Test Set and Parameters:


- Test Set:
+ Input: X = torch.tensor([[2, 9, 0], [1, 5, 1], [3, 6, 2]], dtype=torch.float32)
+ Output: y = torch.tensor([[90], [100], [88]], dtype=torch.float32)
These values are chosen to represent a small and simple dataset, making it easy to
observe and analyze the training results.
- Fixed Parameters:

+ input_size = 3

+ hidden_size = 4

+ output_size = 1

+ learning_rate = 0.1

+ iterations = 1000

These parameters are chosen to provide a consistent training environment across different activation functions, allowing us to isolate the impact of activation functions on the network's performance.

c. Implementation:
- Define the activation functions to test:
activation_functions = {
"Sigmoid": nn.Sigmoid(),
"ReLU": nn.ReLU(),
"Tanh": nn.Tanh()
}
- Training process (a minimal sketch follows).
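A minimal sketch, assuming the network constructor accepts the activation module as a parameter (an assumption made here; the actual mechanism follows the guidance document):

for name, activation in activation_functions.items():
    network = FFNeuralNetwork(input_size=3, hidden_size=4, output_size=1, activation=activation)
    losses = train_network(network, X, y, learning_rate=0.1, iterations=1000)
    final_loss, predictions = evaluate_network(network, X, y)
    plot_loss(losses, f"Activation: {name}")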
d. Results:

Activation function   Final loss   Training time (s)   Prediction
Sigmoid               0.0031       0.5766              tensor([[0.9269], [0.9239], [0.9316]])
ReLU                  0.8615       0.5393              tensor([[0.], [0.], [0.]])
Tanh                  0.0034       0.4472              tensor([[0.9279], [0.9219], [0.9385]])

Observation:
- The Sigmoid and Tanh activation functions provided good results with low final loss and accurate predictions.
- The ReLU activation function did not perform well in this experiment, likely due to the small dataset and specific initialization leading to dead neurons.

e. Visualization:

8. Experiment: Iteration Count
a. Overview:

The iteration count in the training process of a neural network refers to the number of
times the entire dataset is passed through the network. It is a crucial parameter that affects the
convergence and performance of the model. A higher iteration count allows the network to
learn more from the data, potentially leading to better performance, but it also increases the
training time.

b. Test Set and Parameters:


- Test Set:
+ Input: X = torch.tensor([[2, 9, 0], [1, 5, 1], [3, 6, 2]], dtype=torch.float32)
+ Output: y = torch.tensor([[90], [100], [88]], dtype=torch.float32)
These values are chosen to represent a small and simple dataset, making it easy to
observe and analyze the training results.
- Fixed Parameters:

+ learning_rate = 0.1

+ input_size = 3

+ hidden_size = 4

+ output_size = 1

These parameters are chosen to provide a consistent training environment across different iteration counts, allowing us to isolate the impact of iteration count on the network's performance.

c. Implementation:
- Define the iteration counts to test:
iteration_counts = [500, 1000, 2000, 5000]
- Training process (a minimal sketch follows).
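A minimal sketch of this loop, reusing the helpers sketched above:

iteration_counts = [500, 1000, 2000, 5000]
for iterations in iteration_counts:
    network = FFNeuralNetwork(input_size=3, hidden_size=4, output_size=1)
    training_time = measure_training_time(network, X, y, learning_rate=0.1, iterations=iterations)
    final_loss, predictions = evaluate_network(network, X, y)
    print(f"{iterations} iterations: loss={final_loss:.4f}, time={training_time:.4f}s")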
d. Results:

Iteration count   Final loss   Training time (s)   Prediction
500               0.0041       0.1642              tensor([[0.9440], [0.9117], [0.9311]])
1000              0.0039       0.2558              tensor([[0.9348], [0.9164], [0.9385]])
2000              0.0044       0.6321              tensor([[0.9482], [0.9133], [0.9383]])
5000              0.0028       1.6150              tensor([[0.8994], [0.9393], [0.9494]])

Observation: Increasing the iteration count beyond a certain point yields diminishing returns in terms of performance improvement.

e. Visualization:

9. Compile and Report Results
In this section, I compile the results from the various experiments conducted on learning rate,
layer sizes, hidden layer depth, activation functions, and iteration count. I analyze the final loss,
training time, and predictions to draw conclusions about the impact of these parameters on the
performance of the neural network.

Table 1. Learning Rate Experiment Results

Table 2. Layer Size Experiment Results

Table 3. Hidden Layer Depth Experiment Results

Table 4. Activation Function Experiment Results

Table 5. Iteration Count Experiment Results

Diagram 1. Learning Rate Experiment Results

Diagram 2. Hidden Layer Depth Experiment Results

Diagram 3. Activation Function Experiment Results

Diagram 4. Iteration Count Experiment Results

10. References
1. PyTorch Documentation. https://pytorch.org/docs/stable/index.html
2. Activation Functions: Tanh vs. Sigmoid vs. ReLU. GeeksforGeeks.

