
DEEP LEARNING AND MACHINE LEARNING
LAB EXPLANATION
import pandas as pd
• This just means that if I want to refer to code in the package ‘pandas’, I’ll refer
to it with the name pd

df = pd.read_csv('housepricedata.csv')

• This line of code means that we will read the csv file ‘housepricedata.csv’
(which should be in the same directory as your notebook) and store it in the
variable ‘df’.
df

• If we want to find out what is in df, simply type df into the grey box and press Alt-Enter:
• df
• Your notebook should look something like this:
dataset = df.values

• Now that we’ve seen what our data looks like, we want to
convert it into arrays for our machine to process:
• dataset = df.values
• To convert our dataframe into an array, we just store the
values of df (by accessing df.values) into the variable
‘dataset’.
dataset
• To see what is inside this variable 'dataset', simply type 'dataset' into a grey box on your notebook and run the cell (Alt-Enter):
• dataset
• As you can see, it is all stored in an array now:
X = dataset[:,0:10]
• We now split our dataset into input features (X) and the feature we wish to predict (Y). To do that split, we simply
assign the first 10 columns of our array to a variable called X and the last column of our array to a variable called Y.
The code to do the first assignment is this:
• X = dataset[:,0:10]
• This might look a bit weird, but let me explain what’s inside the square brackets. Everything before the comma
refers to the rows of the array and everything after the comma refers to the columns of the arrays.
• Since we’re not splitting up the rows, we put ‘:’ before the comma. This means to take all the rows in dataset and
put it in X.
• We want to extract out the first 10 columns, and so the ‘0:10’ after the comma means take columns 0 to 9 and put it
in X (we don’t include column 10). Our columns start from index 0, so the first 10 columns are really columns 0 to 9.
• We then assign the last column of our array to Y:

Y = dataset[:,10]
• Ok, now we’ve split our dataset into input features (X) and the label of what we want to predict (Y).
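• (If the slicing notation is new to you, here is a minimal sketch on a small made-up array; the 3×4 toy array below is purely illustrative and not part of housepricedata.csv.)

import numpy as np

toy = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]])

print(toy[:, 0:3])   # all rows, columns 0 to 2 -> [[1 2 3], [5 6 7], [9 10 11]]
print(toy[:, 3])     # all rows, column 3 only  -> [4 8 12]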
Import the code
• We first have to import the code that we want to use:

from sklearn import preprocessing


• This says I want to use the code in ‘preprocessing’ within the sklearn package.
• Then, we use a function called the min-max scaler, which scales the dataset so that all the
input features lie between 0 and 1 inclusive:
• min_max_scaler = preprocessing.MinMaxScaler()
• X_scale = min_max_scaler.fit_transform(X)

• Note that we chose 0 and 1 intentionally to aid the training of our neural network.
• Now, our scaled dataset is stored in the array ‘X_scale’. If you wish to see what ‘X_scale’ looks like, simply run the cell:

• X_scale
Your Jupyter notebook should now look a bit like this:
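• (As an aside, min-max scaling just rescales each column by its own minimum and maximum. The one-liner below is a rough sketch of the equivalent computation, shown only to illustrate the formula; it is not taken from the slides.)

# per column: (value - column minimum) / (column maximum - column minimum)
X_scale_manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))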
TRAIN AND TEST:
• Now, we are down to our last step in processing the data, which is to split our dataset into a training set, a validation set and a
test set.
• We will use the code from scikit-learn called 'train_test_split', which, as the name suggests, splits our dataset into a training set and a test set. We first import the code we need:

from sklearn.model_selection import train_test_split
• Then, split your dataset like this:

X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
• This tells scikit-learn that your val_and_test size will be 30% of the overall dataset. The code will store the split data into the first
four variables on the left of the equal sign as the variable names suggest.
Since we want a separate validation set and test set, we can use the same function to do the split again on val_and_test:

X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
• The code above will split the val_and_test size equally to the validation set and the test set.
SHAPE:
• If you want to see what the shapes of the arrays are for each of them (i.e. what dimensions they are), simply run:

print(X_train.shape, X_val.shape, X_test.shape,
      Y_train.shape, Y_val.shape, Y_test.shape)
This is how your Jupyter notebook should look:
• Summary: In processing the data, we've:
• Read in the CSV (comma separated values) file and converted it to arrays.
• Split our dataset into the input features and the label.
• Scaled the data so that the input features have similar orders of magnitude.
• Split our dataset into the training set, the validation set and the test set.
Building and Training our First Neural Network
First Step: Setting up the Architecture
• The first thing we have to do is to set
up the architecture. Let’s first think
about what kind of neural network
architecture we want. Suppose we
want this neural network:
In words, we want to have these layers:
• Hidden layer 1: 32 neurons, ReLU
activation
• Hidden layer 2: 32 neurons, ReLU
activation
• Output Layer: 1 neuron, Sigmoid
activation
KERAS
• Now, we need to describe this architecture to Keras. We will be using the Sequential model, which means that we
merely need to describe the layers above in sequence.
• First, let’s import the necessary code from Keras:

from keras.models import Sequential

from keras.layers import Dense
• Then, we specify that in our Keras sequential model like this:

• model = Sequential([
      Dense(32, activation='relu', input_shape=(10,)),
      Dense(32, activation='relu'),
      Dense(1, activation='sigmoid'),
  ])
MODEL
• And just like that, the code snippet above has defined our architecture! The code above can be interpreted like this:
• model = Sequential([ ... ])
• This says that we will store our model in the variable ‘model’, and we’ll describe it sequentially (layer by layer) in between the
square brackets.
• Dense(32, activation='relu', input_shape=(10,)),
• We have our first layer as a dense layer with 32 neurons, ReLU activation and the input shape is 10 since we have 10 input
features. Note that ‘Dense’ refers to a fully-connected layer, which is what we will be using.
• Dense(32, activation='relu'),
• Our second layer is also a dense layer with 32 neurons, ReLU activation. Note that we do not have to describe the input shape
since Keras can infer from the output of our first layer.
• Dense(1, activation='sigmoid'),
• Our third layer is a dense layer with 1 neuron, sigmoid activation.
• And just like that, we have written our model architecture (template) in code!
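• (Optional, and not shown on the slides: if you want to double-check what you've built, Keras can print a layer-by-layer summary with parameter counts.)

model.summary()   # prints each Dense layer, its output shape and its number of parameters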
Second Step: Filling in the best numbers

• Now that we've got our architecture specified, we need to find the best numbers for it. Before we start our training, we have to configure the model by:
• Telling it which algorithm we want to use to do the optimization
• Telling it what loss function to use
• Telling it what other metrics we want to track apart from the loss function
• Configuring the model with these settings requires us to call the function model.compile, like this:
• model.compile(optimizer='sgd',
• loss='binary_crossentropy',
• metrics=['accuracy'])
• We put the following settings inside the brackets after model.compile:
• optimizer='sgd'
• ‘sgd’ refers to stochastic gradient descent (over here, it refers to mini-batch gradient descent), which we’ve seen in Intuitive Deep Learning Part 1b.
• loss='binary_crossentropy'
• The loss function for outputs that take the values 1 or 0 is called binary cross entropy.

• metrics=['accuracy']
• Lastly, we want to track accuracy on top of the loss function. Now once we’ve run that cell, we are ready to train!

Training on the data is pretty straightforward and requires us to write one
line of code:

hist = model.fit(X_train, Y_train,
                 batch_size=32, epochs=100,
                 validation_data=(X_val, Y_val))

• The function is called ‘fit’ as we are fitting the parameters to the data. We have to
specify what data we are training on, which is X_train and Y_train. Then, we specify
the size of our mini-batch and how long we want to train it for (epochs). Lastly, we
specify what our validation data is so that the model will tell us how we are doing on
the validation data at each point. This function will output a history, which we save
under the variable hist. We’ll use this variable a little later when we get to
visualization.
Now, run the cell and watch it train! Your Jupyter notebook should look like this:
• We can evaluate it on the test set. To find the accuracy on our test set, we run this code snippet:
• model.evaluate(X_test, Y_test)[1]
• Due to the randomness in how we have split the dataset as well as the initialization of the weights, the numbers and graph will differ slightly each time we run our notebook. Nevertheless, you should get a test accuracy anywhere between 80% to 95% if you've followed the architecture I specified above!
Visualizing Loss and Accuracy

• We have to import the code we wish to use:

import matplotlib.pyplot as plt
• Then, we want to visualize the training loss and the validation loss. To do so, run this snippet of code:

• plt.plot(hist.history['loss'])
• plt.plot(hist.history['val_loss'])
• plt.title('Model loss')
• plt.ylabel('Loss')
• plt.xlabel('Epoch')
• plt.legend(['Train', 'Val'], loc='upper right')
• plt.show()


Your Jupyter notebook should look something like this:
We can do the same to plot our training accuracy and validation
accuracy with the code below:

plt.plot(hist.history['acc'])
plt.plot(hist.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='lower right')
plt.show()
You should get a graph that looks a bit like this:
• Since the improvements to the training set look somewhat matched up with improvements to the validation set, it doesn't seem like overfitting is a huge problem in our model.
• Summary: We use matplotlib to visualize the training and
validation loss / accuracy over time to see if there’s overfitting
in our model.
Adding Regularization to our Neural Network

• For the sake of introducing regularization to our neural network, let's build a neural network that will badly overfit on our training set. We'll call this Model 2 (see the sketch below).
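• The slide image with the Model 2 code is not reproduced here. A plausible sketch of such a deliberately oversized model is shown below; the four hidden layers of 1,000 neurons are an assumption that mirrors Model 3 later on, and the Adam optimizer matches the description on the next slide:

model_2 = Sequential([
    Dense(1000, activation='relu', input_shape=(10,)),
    Dense(1000, activation='relu'),
    Dense(1000, activation='relu'),
    Dense(1000, activation='relu'),
    Dense(1, activation='sigmoid'),
])

model_2.compile(optimizer='adam',
                loss='binary_crossentropy',
                metrics=['accuracy'])

hist_2 = model_2.fit(X_train, Y_train,
                     batch_size=32, epochs=100,
                     validation_data=(X_val, Y_val))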
PLOT
• Here, we've made a much larger model and we've used the Adam optimizer. Adam is one of the most common optimizers we use, which adds some tweaks to stochastic gradient descent such that it reaches a lower loss faster. If we run this code and plot the loss graphs for hist_2 using the code below (note that the code is the same except that we use 'hist_2' instead of 'hist'):
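• (The snippet itself is not shown on the slide; substituting 'hist_2' into the earlier loss-plotting code would look like this:)

plt.plot(hist_2.history['loss'])
plt.plot(hist_2.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='upper right')
plt.show()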
We get a plot like this:
Plotting
This is a clear sign of over-fitting. The training loss is decreasing, but the validation loss is way above the training loss and increasing (past the inflection point of Epoch 20). If we plot accuracy using the code below:
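• (Again, the snippet is not shown on the slide; reusing the earlier accuracy-plotting code with 'hist_2' would look like this:)

plt.plot(hist_2.history['acc'])
plt.plot(hist_2.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='lower right')
plt.show()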
We can see a clearer divergence between train and validation
accuracy as well:
REGULARIZATION
• import the code that we need for L2 regularization and dropout:

from keras.layers import Dropout

from keras import regularizers
• We then specify our third model like this:
• model_3 = Sequential([
      Dense(1000, activation='relu', kernel_regularizer=regularizers.l2(0.01), input_shape=(10,)),
      Dropout(0.3),
      Dense(1000, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
      Dropout(0.3),
      Dense(1000, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
      Dropout(0.3),
      Dense(1000, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
      Dropout(0.3),
      Dense(1, activation='sigmoid', kernel_regularizer=regularizers.l2(0.01)),
  ])
There are two main differences:

• Difference 1: To add L2 regularization, notice that we’ve added a bit of extra code in each of our
dense layers like this:
• kernel_regularizer=regularizers.l2(0.01)
• This tells Keras to include the squared values of those parameters in our overall loss function,
and weight them by 0.01 in the loss function.
• Difference 2: To add Dropout, we added a new layer like this:
• Dropout(0.3),
• This means that each neuron in the previous layer has a probability of 0.3 of dropping out during training.
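• (To make Difference 1 concrete: the L2 term added to the loss for each regularized layer is just a weighted sum of squared weights. The helper below, l2_penalty, is a hypothetical illustration of that idea, not Keras internals.)

import numpy as np

def l2_penalty(W, lam=0.01):
    # hypothetical helper: the extra term contributed to the loss by one layer's weight matrix W
    return lam * np.sum(W ** 2)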
Let’s compile it and run it with the
same parameters as our Model 2
(the overfitting one):

model_3.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])

hist_3 = model_3.fit(X_train, Y_train,
                     batch_size=32, epochs=100,
                     validation_data=(X_val, Y_val))
• And now, let's plot the loss and accuracy graphs. You'll notice that the loss is a lot higher at the start, and that's because we've added the regularization penalty to the loss function:
• plt.plot(hist_3.history['loss'])
• plt.plot(hist_3.history['val_loss'])
• plt.title('Model loss')
• plt.ylabel('Loss')
• plt.xlabel('Epoch')
• plt.legend(['Train', 'Val'], loc='upper right')
• plt.show()
We'll get a loss graph that looks like this:
You can see that the validation loss much more closely matches our training
loss. Let's plot the accuracy with a similar code snippet:

plt.plot(hist_3.history['acc'])
plt.plot(hist_3.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='lower right')
plt.show()
We will get a plot like this:
