Exercise 5
Data Preprocessing: Normalizing the series with MinMaxScaler puts all values on a common scale, which the model needs in order to train effectively on the time series data.
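A minimal sketch of this step, assuming a univariate series held in a NumPy array (the variable names are illustrative):

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler

    # Illustrative stand-in for the real series used in the exercise.
    series = np.arange(100, dtype=np.float32).reshape(-1, 1)

    scaler = MinMaxScaler(feature_range=(0, 1))
    scaled = scaler.fit_transform(series)  # all values now lie in [0, 1]

    # After prediction, outputs can be mapped back to the original scale:
    # original = scaler.inverse_transform(predictions)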
Model Architecture: Adding a linear layer (nn.Linear) on top of the LSTM output lets the model map the learned hidden features to the target, which can improve the prediction.
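A sketch of this kind of architecture, with illustrative layer sizes (the exercise's exact dimensions may differ):

    import torch.nn as nn

    class LSTMRegressor(nn.Module):
        def __init__(self, input_size=1, hidden_size=64, num_layers=1):
            super().__init__()
            self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
            self.fc = nn.Linear(hidden_size, 1)  # maps LSTM features to the target

        def forward(self, x):
            out, _ = self.lstm(x)          # out: (batch, seq_len, hidden_size)
            return self.fc(out[:, -1, :])  # predict from the last time step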
Training: The training loop shows a steady decline in loss, with clear drops at each 25-epoch checkpoint, indicating that the model is learning effectively.
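A minimal training loop of the kind described, assuming the LSTMRegressor sketched above and training tensors X_train and y_train (both names are assumptions):

    import torch
    import torch.nn as nn

    # X_train and y_train are assumed tensors of shape (batch, seq_len, 1) and (batch, 1).
    model = LSTMRegressor()
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    for epoch in range(200):
        optimizer.zero_grad()
        loss = criterion(model(X_train), y_train)
        loss.backward()
        optimizer.step()
        if (epoch + 1) % 25 == 0:  # log at each 25-epoch checkpoint
            print(f"Epoch {epoch + 1}, loss {loss.item():.4f}")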
Prediction: The resulting Mean Squared Error (MSE) of 0.2308 is adequate. The plot of predictions against the actual data shows that the model follows the trends and patterns in the series.
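The test MSE can be computed along these lines, assuming held-out tensors X_test and y_test and the fitted scaler from the preprocessing sketch (all assumptions):

    import torch

    # X_test, y_test, and scaler are assumed from the earlier sketches.
    model.eval()
    with torch.no_grad():
        preds = model(X_test)

    mse = torch.mean((preds - y_test) ** 2).item()
    print(f"Test MSE: {mse:.4f}")

    # For plotting against the actual data, map back to the original scale:
    # preds_orig = scaler.inverse_transform(preds.numpy())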
Exercise 2:
1) Multivariate Data Representation and LSTM Prediction: The historical Bitcoin market data comprises the following attributes:
High Price: The highest price reached by Bitcoin during a trading day.
Low Price: The lowest price reached by Bitcoin during a trading day.
The task of the LSTM network is then to analyse these features and predict Bitcoin's closing price for the next 50 days based on the last 100 days, as sketched below.
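One way to frame this as supervised windows, using the 100-day input and 50-day horizon described above (the function and variable names are illustrative, and data is assumed to be a scaled 2-D array of shape (n_days, n_features) with the close price in the last column):

    import numpy as np

    def make_windows(data, lookback=100, horizon=50, close_col=-1):
        """Slice a multivariate series into (input window, future closes) pairs."""
        X, y = [], []
        for i in range(len(data) - lookback - horizon + 1):
            X.append(data[i : i + lookback])  # all features over the last 100 days
            y.append(data[i + lookback : i + lookback + horizon, close_col])  # next 50 closes
        return np.array(X), np.array(y)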
2) Internal Correlation: Yes, there are strong correlations among the features in the dataset. For example, the opening and closing prices are generally close to each other, and the transaction volume depends on price movements. These interdependencies are important for what the LSTM learns and for its future price predictions; they can be checked directly with a correlation matrix, as in the sketch below.
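A sketch of that check (the DataFrame here is a synthetic stand-in for the real Bitcoin data, so the printed correlations are illustrative only):

    import numpy as np
    import pandas as pd

    # Synthetic stand-in for the real Bitcoin DataFrame loaded in the exercise.
    rng = np.random.default_rng(0)
    close = 30000 + rng.normal(0, 300, 365).cumsum()
    df = pd.DataFrame({
        "Open": close + rng.normal(0, 100, 365),
        "High": close + np.abs(rng.normal(0, 200, 365)),
        "Low": close - np.abs(rng.normal(0, 200, 365)),
        "Close": close,
        "Volume": rng.uniform(1e3, 1e4, 365),
    })

    # Open/High/Low correlate strongly with Close; Volume far less so here.
    print(df.corr()["Close"].sort_values(ascending=False))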
Data Normalization: Normalize all input features to a common scale so that training converges more reliably.
Model Architecture: Increase the number of LSTM layers or units to learn more complex patterns; for stacked LSTMs, a greater number of layers has been reported to improve performance (see the sketch after this list).
Hyperparameter Tuning: Try learning rates, batch sizes, and sequence lengths different from the above setting, and search for the optimal configuration.
Bidirectional LSTM: Use a bidirectional LSTM to capture patterns from both past and future context within each input window, as sketched below.
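A sketch combining two of these suggestions, a stacked and bidirectional LSTM (the feature count and forecast horizon are assumptions matching the setup described in part 1):

    import torch.nn as nn

    class BiLSTMForecaster(nn.Module):
        def __init__(self, n_features=5, hidden_size=64, horizon=50):
            super().__init__()
            # num_layers=2 stacks two LSTMs; bidirectional=True reads each window both ways.
            self.lstm = nn.LSTM(n_features, hidden_size, num_layers=2,
                                batch_first=True, bidirectional=True)
            self.fc = nn.Linear(2 * hidden_size, horizon)  # one output per forecast day

        def forward(self, x):
            out, _ = self.lstm(x)
            return self.fc(out[:, -1, :])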
Exercise 3:
This part handles data preprocessing: the data is imported and normalized with MinMaxScaler, reformatted into a supervised learning problem with a look-back period of 1, and split into training and test sets. Three models are then defined: an RNN model using a basic RNN layer, a GRU model with a GRU layer, and an LSTM model containing an LSTM layer.
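The look-back reshaping is commonly written as a helper like the following (a sketch, assuming the scaled series is a 2-D NumPy array with the values in column 0):

    import numpy as np

    def create_dataset(dataset, look_back=1):
        """Turn a scaled series into (X, y) pairs: X holds look_back past values, y the next value."""
        X, y = [], []
        for i in range(len(dataset) - look_back):
            X.append(dataset[i : i + look_back, 0])
            y.append(dataset[i + look_back, 0])
        return np.array(X), np.array(y)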
Training/Evaluation: Each model is trained with Mean Squared Error (MSE) loss using the Adam optimizer. After training, the models are evaluated on the test set: predictions are inverse-transformed back to the original scale and the Root Mean Square Error (RMSE) of the test predictions is computed.
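The RMSE step can be sketched as follows, assuming scaled prediction and target arrays preds and y_test of shape (n_samples, 1) and the fitted MinMaxScaler (all assumptions):

    import numpy as np

    # preds and y_test are assumed scaled arrays of shape (n_samples, 1);
    # scaler is the fitted MinMaxScaler from preprocessing.
    preds_orig = scaler.inverse_transform(preds)
    y_orig = scaler.inverse_transform(y_test)

    rmse = np.sqrt(np.mean((preds_orig - y_orig) ** 2))
    print(f"Test RMSE: {rmse:.4f}")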
Finally, the true vs. predicted values are plotted for each model, along with the total error summed over each test sequence.
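A plotting sketch of that comparison, assuming original-scale targets y_orig and a dict model_preds mapping model names to their predictions (both names are assumptions):

    import matplotlib.pyplot as plt

    # model_preds is assumed, e.g. {"RNN": rnn_preds, "GRU": gru_preds, "LSTM": lstm_preds}.
    plt.plot(y_orig, label="Actual")
    for name, p in model_preds.items():
        plt.plot(p, label=name)
    plt.xlabel("Test sequence index")
    plt.ylabel("Value")
    plt.legend()
    plt.show()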
Exercise 4a:
Intercept: -37.552166316589286
R-squared: 0.596623451962375
This dataset contains a total of 20,640 samples with no missing values, as shown by the zeros in the per-column missing-value counts. The summary statistics indicate that the dataset comprises numerical features such as 'MedInc' (median income) and 'HouseAge' (median house age), along with the target 'MedHouseVal' (median house value), among others.
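These checks follow the usual scikit-learn/pandas pattern; a sketch, assuming the exercise loads the California housing data this way:

    from sklearn.datasets import fetch_california_housing

    data = fetch_california_housing(as_frame=True)
    df = data.frame  # 20,640 rows: eight features plus the target 'MedHouseVal'

    print(df.isnull().sum())  # all zeros: no missing values
    print(df.describe())      # per-column summary statistics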
All feature coefficients are displayed, with the intercept estimated at -37.55. The feature coefficients are:
'MedInc': 0.4418
'HouseAge': 0.0097
'AveRooms': -0.1199
'AveBedrms': 0.7847
'Population': -0.0000003395
'AveOccup': -0.0033
'Latitude': -0.4237
'Longitude': -0.4393
The Mean Squared Error (MSE) is approximately 0.5385 and the R-squared is approximately 0.5966, meaning the model explains about 59.66% of the variance in the target variable 'MedHouseVal'.
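The coefficients and metrics above come from a standard ordinary-least-squares fit; a sketch using the df from the loading sketch above (the 80/20 split and random seed are assumptions, so the printed numbers will only roughly match the values quoted):

    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error, r2_score
    from sklearn.model_selection import train_test_split

    X = df.drop(columns="MedHouseVal")
    y = df["MedHouseVal"]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    reg = LinearRegression().fit(X_train, y_train)
    print("Intercept:", reg.intercept_)
    print(dict(zip(X.columns, reg.coef_)))

    y_pred = reg.predict(X_test)
    print("MSE:", mean_squared_error(y_test, y_pred))
    print("R^2:", r2_score(y_test, y_pred))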
The scatter plot shows how the actual values of 'MedHouseVal' relate to the predicted values; a perfectly fitted model would have its points clustered tightly along the line of perfect prediction.
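That plot can be reproduced along these lines, using the y_test and y_pred from the regression sketch above:

    import matplotlib.pyplot as plt

    plt.scatter(y_test, y_pred, alpha=0.3)
    lims = [y_test.min(), y_test.max()]
    plt.plot(lims, lims, "r--")  # line of perfect prediction
    plt.xlabel("Actual MedHouseVal")
    plt.ylabel("Predicted MedHouseVal")
    plt.show()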
Exercise 4b: