Deep Learning and Machine Learning: Lab Explanation
import pandas as pd
• This just means that whenever we want to refer to code in the package ‘pandas’, we can refer to it by the shorter name pd.
df = pd.read_csv('housepricedata.csv')
• This line reads the CSV file ‘housepricedata.csv’ (which should be in the same directory as your notebook) and stores it in the variable ‘df’.
df
• Now that we’ve seen what our data looks like, we want to convert it into an array for our machine to process:
• dataset = df.values
• To convert our dataframe into an array, we just store the values of df (by accessing df.values) into the variable ‘dataset’.
• To see what is inside this variable ‘dataset’, simply type ‘dataset’ into a grey box on your notebook and run the cell (Alt-Enter):
• dataset
• As you can see, it is all stored in an array now.
• We now split our dataset into input features (X) and the feature we wish to predict (Y). To do that split, we simply
assign the first 10 columns of our array to a variable called X and the last column of our array to a variable called Y.
The code to do the first assignment is this:
• X = dataset[:,0:10]
• This might look a bit weird, but let me explain what’s inside the square brackets. Everything before the comma refers to the rows of the array, and everything after the comma refers to the columns of the array.
• Since we’re not splitting up the rows, we put ‘:’ before the comma. This means to take all the rows in dataset and put them in X.
• We want to extract the first 10 columns, so the ‘0:10’ after the comma means take columns 0 to 9 and put them in X (we don’t include column 10). Our columns start from index 0, so the first 10 columns are really columns 0 to 9. A toy example of this slicing syntax follows below.
• We then assign the last column of our array to Y:
Y = dataset[:,10]
• Ok, now we’ve split our dataset into input features (X) and the label of what we want to predict (Y).
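• As a quick illustration of that slicing syntax, here is a toy example (a made-up array, not the lab data):

import numpy as np

toy = np.arange(12).reshape(3, 4)  # 3 rows, 4 columns
print(toy[:, 0:2])  # all rows, columns 0 and 1 -> shape (3, 2)
print(toy[:, 3])    # all rows, the last column (index 3) -> shape (3,)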
Import the code
• We first have to import the code that we want to use:
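• The import and scaling code isn’t printed on this slide; a minimal sketch using scikit-learn’s MinMaxScaler (which rescales every feature to lie between 0 and 1, producing the ‘X_scale’ variable used below) would be:

from sklearn import preprocessing

min_max_scaler = preprocessing.MinMaxScaler()
X_scale = min_max_scaler.fit_transform(X)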
• Note that we chose the range 0 to 1 intentionally to aid the training of our neural network.
• Now, our scaled dataset is stored in the array ‘X_scale’. If you wish to see what ‘X_scale’ looks like, simply run the cell:
• X_scale
Your Jupyter notebook should now look a bit like this:
TRAIN AND TEST:
• Now, we are down to our last step in processing the data, which is to split our dataset into a training set, a validation set and a
test set.
• We will use the function from scikit-learn called ‘train_test_split’, which, as the name suggests, splits our dataset into a training set and a test set. We first import the code we need:
from sklearn.model_selection import train_test_split
• Then, split your dataset like this:
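• The split code itself isn’t shown on this slide; a sketch that calls train_test_split twice to produce the X_val, Y_val, X_test and Y_test variables used later would be as follows (the 30%/50% proportions and the intermediate ‘val_and_test’ names are assumptions for illustration):

X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)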
MODEL
• We import Sequential and Dense from Keras, then define our architecture:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(32, activation='relu', input_shape=(10,)),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid'),
])
• And just like that, the code snippet above has defined our architecture! The code above can be interpreted like this:
• model = Sequential([ ... ])
• This says that we will store our model in the variable ‘model’, and we’ll describe it sequentially (layer by layer) in between the
square brackets.
• Dense(32, activation='relu', input_shape=(10,)),
• We have our first layer as a dense layer with 32 neurons, ReLU activation and the input shape is 10 since we have 10 input
features. Note that ‘Dense’ refers to a fully-connected layer, which is what we will be using.
• Dense(32, activation='relu'),
• Our second layer is also a dense layer with 32 neurons and ReLU activation. Note that we do not have to specify the input shape since Keras can infer it from the output of our first layer.
• Dense(1, activation='sigmoid'),
• Our third layer is a dense layer with 1 neuron and sigmoid activation.
• And just like that, we have written our model architecture (template) in code!
Second Step: Filling in the best numbers
• Now that we’ve got our architecture specified, we need to find the best numbers for it. Before we start our training, we have to configure the model with the compile function, telling Keras which optimizer and loss function to use.
• metrics=['accuracy']
• Lastly, we want to track accuracy on top of the loss function in that compile call. Once we’ve run that cell, we are ready to train!
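• The compile cell for this first model isn’t shown on the slide; a sketch assuming the same ‘adam’ optimizer and ‘binary_crossentropy’ loss that this lab uses for Model 3 later would be:

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])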
Training on the data is pretty straightforward and requires us to write one
line of code:
hist = model.fit(X_train, Y_train,
          batch_size=32, epochs=100,
          validation_data=(X_val, Y_val))
• The function is called ‘fit’ as we are fitting the parameters to the data. We have to
specify what data we are training on, which is X_train and Y_train. Then, we specify
the size of our mini-batch and how long we want to train it for (epochs). Lastly, we
specify what our validation data is so that the model will tell us how we are doing on
the validation data at each point. This function will output a history, which we save
under the variable hist. We’ll use this variable a little later when we get to
visualization.
Now, run the cell and watch it train! Your Jupyter notebook should look like this:
• Once training is done, we can evaluate the model on the test set. To find the accuracy on our test set, we run this code snippet:
• model.evaluate(X_test, Y_test)[1]
• The evaluate function returns the loss followed by the metrics we asked for, so the [1] picks out the accuracy.
• Due to the randomness in how we have split the dataset, as well as the initialization of the weights, the numbers and graph will differ slightly each time we run our notebook. Nevertheless, you should get a test accuracy anywhere between 80% and 95% if you’ve followed the architecture I specified above!
Visualizing Loss and Accuracy
• We plot the training and validation accuracy that we saved in ‘hist’ (the two plt.plot lines mirror the Model 3 cell shown later):

import matplotlib.pyplot as plt

plt.plot(hist.history['acc'])
plt.plot(hist.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='lower right')
plt.show()
You should get a graph that looks a bit like this:
• Since the improvements in our model on the training set look roughly matched by improvements on the validation set, it doesn’t seem like overfitting is a huge problem in our model.
• Summary: We use matplotlib to visualize the training and
validation loss / accuracy over time to see if there’s overfitting
in our model.
Adding Regularization to our Neural Network
from keras.layers import Dense, Dropout
from keras import regularizers

model_3 = Sequential([
    Dense(1000, activation='relu', kernel_regularizer=regularizers.l2(0.01), input_shape=(10,)),
    Dropout(0.3),
    Dense(1000, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(1000, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(1, activation='sigmoid', kernel_regularizer=regularizers.l2(0.01)),
])
There are two main differences:
• Difference 1: To add L2 regularization, notice that we’ve added a bit of extra code in each of our
dense layers like this:
• kernel_regularizer=regularizers.l2(0.01)
• This tells Keras to include the squared values of those parameters in our overall loss function,
and weight them by 0.01 in the loss function.
• Difference 2: To add Dropout, we added a new layer like this:
• Dropout(0.3),
• This means that each neuron in the previous layer has a probability of 0.3 of dropping out during training (a conceptual sketch of both additions follows below).
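• As a rough illustration of what these two additions do, here is a conceptual sketch in plain NumPy (illustrative only, not Keras’s actual internals):

import numpy as np

def l2_penalty(weights, lam=0.01):
    # L2 regularization: add lam times the sum of squared weights to the loss
    return lam * sum(np.sum(w ** 2) for w in weights)

def dropout(activations, rate=0.3, training=True):
    # Dropout: during training, each unit is zeroed with probability `rate`;
    # survivors are scaled by 1/(1 - rate) so the expected activation stays
    # the same at test time (when nothing is dropped).
    if not training:
        return activations
    mask = (np.random.rand(*activations.shape) >= rate) / (1.0 - rate)
    return activations * mask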
Let’s compile it and run it with the same parameters as our Model 2 (the overfitting one):
model_3.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
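The fit call that produces hist_3 isn’t shown on the slide; assuming the same training parameters as before, it would presumably be:

hist_3 = model_3.fit(X_train, Y_train,
                     batch_size=32, epochs=100,
                     validation_data=(X_val, Y_val))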
plt.plot(hist_3.history['acc'])
plt.plot(hist_3.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='lower right')
plt.show()
We will get a plot like this: