
### Level 05 of 08: Deep Learning ###

33. Deep learning

33.1 Introduction

1. Deep learning is a machine learning technique that teaches computers to do what comes
naturally to humans: learn by example

2. Most deep learning methods use neural network architectures, which is why deep learning
models are often referred to as deep neural networks.

3. The term “deep” usually refers to the number of hidden layers in the neural network.
Traditional neural networks only contain 2-3 hidden layers, while deep networks can have as
many as required.

4. Deep learning models are trained by using large sets of labelled data and neural network
architectures that learn features directly from the data without the need for manual feature
extraction.

Examples of Deep Learning at Work

1. Deep learning applications are used in industries from automated driving to medical devices.

2. Automated Driving: Automotive researchers are using deep learning to automatically detect objects such as stop signs and traffic lights. In addition, deep learning is used to detect pedestrians, which helps decrease accidents.

3. Aerospace and Defence: Deep learning is used to identify objects from satellites that
locate areas of interest, and identify safe or unsafe zones for troops.

4. Medical Research: Cancer researchers are using deep learning to automatically detect
cancer cells. Teams at UCLA built an advanced microscope that yields a high-dimensional
data set used to train a deep learning application to accurately identify cancer cells.

5. Industrial Automation: Deep learning is helping to improve worker safety around heavy
machinery by automatically detecting when people or objects are within an unsafe distance
of machines.

6. Electronics: Deep learning is being used in automated hearing and speech translation. For
example, home assistance devices that respond to your voice and know your preferences
are powered by deep learning applications.

33.2 Forward propagation

1. Bank transactions example

2. Make predictions based on:

a. Number of children

b. Number of existing accounts

import numpy as np

input_data = np.array([2, 3])

weights = {'node_0': np.array([1, 1]),

'node_1': np.array([-1, 1]),

'output': np.array([2, -1])}

node_0_value = (input_data * weights['node_0']).sum()

node_1_value = (input_data * weights['node_1']).sum()

hidden_layer_values = np.array([node_0_value, node_1_value])

print(hidden_layer_values) #[5, 1]

output = (hidden_layer_values * weights['output']).sum()

print(output)

33.3 Activation functions

1. There are two types:

a. Linear

b. Non-linear

2. An activation function is applied to each hidden node's input to produce the node's output.
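As a quick illustration before the worked example below (a minimal sketch, not part of the original notes), the most common activation functions can be written directly with NumPy:

import numpy as np

def linear(x):
    return x                         # identity activation: output equals input

def relu(x):
    return np.maximum(x, 0)          # rectified linear unit: negatives become 0

def sigmoid(x):
    return 1 / (1 + np.exp(-x))      # squashes values into (0, 1)

x = np.array([-2.0, 0.0, 3.0])
print(linear(x), relu(x), sigmoid(x), np.tanh(x))   # np.tanh squashes values into (-1, 1)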

import numpy as np

input_data = np.array([2, 3])

weights = {'node_0': np.array([1, 1]), 'node_1': np.array([-1, 1]),'output': np.array([2, -1])}

node_0_input = (input_data * weights['node_0']).sum()

node_0_output = np.tanh(node_0_input)

node_1_input = (input_data * weights['node_1']).sum()

node_1_output = np.tanh(node_1_input)

hidden_layer_output = np.array([node_0_output, node_1_output])

output = (hidden_layer_output * weights['output']).sum()

print(output)

33.4 Deeper networks

1. ReLU (Rectified Linear Activation).

2. Multiple hidden layers

import numpy as np

input_data = np.array([3, 5])

weights = {'node_0': np.array([2, 4]),
           'node_1': np.array([4, -5]),
           'node_2': np.array([-1, 1]),
           'node_3': np.array([2, 2]),
           'output': np.array([-3, 7])}

node_0_output = (input_data * weights['node_0']).sum()

node_0_output_relu = np.maximum(node_0_output,0)

node_1_output = (input_data * weights['node_1']).sum()

node_1_output_relu = np.maximum(node_1_output,0)

hidden_layer1_output = np.array([node_0_output_relu, node_1_output_relu])

hidden_layer1_output_relu = np.maximum(hidden_layer1_output,0)

node_2_output = (hidden_layer1_output_relu * weights['node_2']).sum()

node_2_output_relu = np.maximum(node_2_output,0)

node_3_output = (hidden_layer1_output_relu * weights['node_3']).sum()

node_3_output_relu = np.maximum(node_3_output,0)

hidden_layer2_output = np.array([node_2_output_relu, node_3_output_relu])

hidden_layer2_output_relu = np.maximum(hidden_layer2_output,0)

output = (hidden_layer2_output_relu * weights['output']).sum()

output_relu = np.maximum(output,0)

print(output_relu)

3. Find more ways of implementing the ReLU function:

https://stackoverflow.com/questions/32109319/how-to-implement-the-relu-function-in-numpy
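For reference, a few equivalent ways to write ReLU with NumPy (a small sketch along the lines of the answers at that link):

import numpy as np

x = np.array([-3.0, -1.0, 0.0, 2.0, 5.0])

relu_1 = np.maximum(x, 0)        # clamp negatives to zero
relu_2 = np.where(x > 0, x, 0)   # elementwise selection
relu_3 = x * (x > 0)             # multiply by a boolean mask

print(relu_1, relu_2, relu_3)    # all three give the same result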

33.5 Need for optimization

1. How do changes to the weights affect model accuracy?

2. Hence, to get a good-quality model, choosing the right weights plays the main role.

3. Exercise 1: Coding how weight changes affect accuracy

#Step 1: Create relu function

def relu(my_input):
    return max(0, my_input)

#Step 2: Create predict with network function

def predict_with_network(input_data_point, weights):
    node_0_input = (input_data_point * weights['node_0']).sum()
    node_0_output = relu(node_0_input)

    node_1_input = (input_data_point * weights['node_1']).sum()
    node_1_output = relu(node_1_input)

    hidden_layer_values = np.array([node_0_output, node_1_output])

    input_to_final_layer = (hidden_layer_values * weights['output']).sum()
    model_output = relu(input_to_final_layer)

    return model_output

#Step 3: Use above functions to predict

# The data point you will make a prediction for

input_data = np.array([2, 3])

# Sample weights
weights_0 = {'node_0': [1, 1],
             'node_1': [-1, 1],
             'output': [2, -1]}

# The actual target value, used to calculate the error
target_actual = 3

# Make prediction using original weights
model_output_0 = predict_with_network(input_data, weights_0)

# Calculate error: error_0
error_0 = model_output_0 - target_actual

# Create weights that cause the network to make a perfect prediction (3): weights_1
# (the hidden layer outputs are [5, 1], so output weights [1, -2] give 5*1 + 1*(-2) = 3)
weights_1 = {'node_0': [1, 1],
             'node_1': [-1, 1],
             'output': [1, -2]}

# Make prediction using new weights: model_output_1

model_output_1 = predict_with_network(input_data, weights_1)

# Calculate error: error_1

error_1 = model_output_1 - target_actual

print(error_0);print(error_1)

33.6 Gradient descent

1. How do we find weight values that reduce the error? Follow the slope (gradient) of the loss function downhill.

2. If the slope is positive:

a. Going opposite the slope means moving to lower numbers

b. Subtract the slope from the current value

c. Too big a step might lead us astray

3. Solution: learning rate

a. Update each weight by subtracting learning rate * slope

4. Exercise 1: Calculating Slope and Improving model weights

# Step 1 of 2: Calculate slope/gradient

import numpy as np

# Define weights

weights = np.array([1, 2])

input_data = np.array([3, 4])

target = 6

learning_rate = 0.01

# Calculate the predictions: preds

preds = (weights * input_data).sum()

# Calculate the error: error

error = preds - target

print(error)

# Calculate the slope (gradient): for squared error (preds - target)**2,
# the derivative with respect to each weight is 2 * input_data * error
gradient = 2 * input_data * error

print(gradient)

# Step 2 of 2: Improving Model weights

weights_updated = weights - learning_rate * gradient

preds_updated = (weights_updated * input_data).sum()

error_updated = preds_updated - target

print(error_updated)

5. Exercise 2: Calculating slope, improving model weights, and MSE

# Step 1 of 3: Calculating slopes

import numpy as np

weights = np.array([0,2,1])

input_data = np.array([1,2,3])

target = 0

# Calculate the predictions: preds

preds = (weights * input_data).sum()

# Calculate the error: error

error = preds - target

# Calculate the slope: slope

slope = 2 * input_data * error

# Print the slope

print(slope)

#################################################

# Step 2 of 3: Improving model weights

# Set the learning rate: learning_rate

learning_rate = 0.01

# Update the weights: weights_updated

weights_updated = weights - learning_rate * slope

# Get updated predictions: preds_updated

preds_updated = (weights_updated * input_data).sum()

# Calculate updated error: error_updated

error_updated = preds_updated - target

# Print the original error

print(error)

# Print the updated error

print(error_updated)

#################################################

# Step 3 of 3:Making multiple updates to weights

def get_error(input_data, target, weights):
    preds = (weights * input_data).sum()
    error = preds - target
    return error

def get_slope(input_data, target, weights):
    error = get_error(input_data, target, weights)
    slope = 2 * input_data * error
    return slope

def get_mse(input_data, target, weights):
    errors = get_error(input_data, target, weights)
    mse = np.mean(errors**2)
    return mse

n_updates = 20
mse_hist = []

# Iterate over the number of updates
for i in range(n_updates):
    # Calculate the slope: slope
    slope = get_slope(input_data, target, weights)

    # Update the weights: weights
    weights = weights - 0.01 * slope

    # Calculate mse with new weights: mse
    mse = get_mse(input_data, target, weights)

    # Append the mse to mse_hist
    mse_hist.append(mse)

# Plot the mse history

import matplotlib.pyplot as plt

plt.plot(mse_hist)

plt.xlabel('Iterations')

plt.ylabel('Mean Squared Error')

plt.show()  # Notice that the mean squared error decreases as the number of iterations goes up.

33.7 Backpropagation

1. Update the weights using the error, and iterate until the predictions match the actual target data.

2. Try to understand the process; in practice, however, you will generally use a library that implements forward and backward propagation.

3. It works based on the chain rule of calculus.

4. If a variable z depends on the variable y, and y in turn depends on the variable x, then z also depends on x, and the chain rule can be written as: dz/dx = (dz/dy) * (dy/dx)

5. Find the derivative of the sigmoid function.
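A quick aside (not part of the original notes): for the sigmoid function sigma(x) = 1 / (1 + e^(-x)), the derivative works out to sigma(x) * (1 - sigma(x)). A minimal NumPy sketch to verify this numerically:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = 0.7
h = 1e-6

numerical = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)  # central-difference estimate
analytical = sigmoid(x) * (1 - sigmoid(x))               # closed-form derivative

print(numerical, analytical)  # the two values agree to several decimal places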

6. For a very clear step-by-step walk-through, refer to https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/

7. The relationship between forward and backward propagation

a. If you have gone through 4 iterations of calculating slopes (using backward propagation)
and then updated weights, how many times must you have done forward propagation?

i. 0

ii. 1

iii. 4

iv. 8

b. If your predictions were all exactly right, and your errors were all exactly 0, the slope of
the loss function with respect to your predictions would also be 0. In that circumstance,
which of the following statements would be correct?

i. The updates to all weights in the network would also be 0.

ii. The updates to all weights in the network would be dependent on the activation
functions.

iii. The updates to all weights in the network would be proportional to values from
the input data.

33.8 Creating a Keras Regression Model

1. Keras Model building has mainly 4 steps

1. Specify Architecture

i. Define input nodes/columns

ii. Define Hidden layers

iii. Define hidden nodes

iv. Define activation functions

v. Define output

2. Compile

i. Define optimizer

ii. Define Loss function

3. Fit

i. Applying backpropagation

ii. Updating weights

4. Predict

2. Step 1 of 4: Specify the architecture

a. Predict workers' wages based on characteristics like their industry, education, level of experience, etc.

# Step 1: load data

import pandas as pd

df = pd.read_csv("hourly_wages.csv")

df

# Get required variables as numpy array

predictors = (df[df.columns[[1,2,3,4,5,6,7,8,9]]].values)

target = (df[df.columns[0]].values)

#Step 2: Specifying a model

# Import necessary modules

import keras

from keras.layers import Dense

from keras.models import Sequential

# Save the number of columns in predictors: n_cols

n_cols = predictors.shape[1]

# Set up the model: model

model = Sequential()

# Add the first layer

model.add(Dense(50, activation='relu', input_shape=(n_cols,)))

# Add the second layer

model.add(Dense(32, activation='relu'))

# Add the output layer

model.add(Dense(1))

3. Step 2 of 4: Compiling the model

a. To compile the model, you need to specify the optimizer and loss function

i. Specify the optimizer

1. Many options and mathematically complex

2. “Adam” is usually a good choice

ii. Loss function

1. “mean_squared_error” common for regression

b. Read more on optimizers https://keras.io/optimizers/#adam

c. See the original paper of adam https://arxiv.org/abs/1412.6980v8

# Compile the model

model.compile(optimizer='adam', loss='mean_squared_error')  # accuracy is not a meaningful metric for regression

# Verify that model contains information from compiling

print("Loss function: " + model.loss)

4. Step 3 of 4: Fitting the model

a. What is fitting a model?

i. Applying backpropagation and gradient descent with your data to update the weights

ii. Scaling data before fitting can ease optimization, as in the sketch below
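A minimal sketch of such scaling (illustrative only; predictors comes from the earlier step, and zero-mean/unit-variance standardization is just one common choice):

# Standardize each predictor column to zero mean and unit variance before fitting
predictors_scaled = (predictors - predictors.mean(axis=0)) / predictors.std(axis=0)

# The fit call below could then be given predictors_scaled instead of predictors.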

# Fit the model

model.fit(predictors, target)

5. Step 4 of 4: predict

model.predict(predictors)

33.9 Creating Keras Classification Models

1. ‘categorical_crossentropy’ loss function

2. Similar to log loss: Lower is better

3. Add metrics = [‘accuracy’] to compile step for easy-to-understand diagnostics

4. Output layer has separate node for each possible outcome, and uses ‘softmax’
activation

Process:

1. Modelling with a new dataset "Titanic" for a classification problem

2. You will use predictors such as age, fare and where each passenger embarked from to
predict who will survive.

# understand data

import pandas as pd

from keras.utils import to_categorical

df = pd.read_csv("titanic_all_numeric_train.csv")

predictors = df.drop(['survived'], axis=1).values

target = to_categorical(df.survived)

df = pd.read_csv("titanic_all_numeric_test.csv")

test_data = df.drop(['survived'], axis=1).values

# Import necessary modules

import keras

from keras.layers import Dense

from keras.models import Sequential

# Set up the model

model = Sequential()

# Save the number of columns in predictors: n_cols

n_cols = predictors.shape[1]

# Add the first layer

model.add(Dense(32, activation='relu', input_shape=(n_cols,)))

# Add the output layer

model.add(Dense(2, activation='softmax'))

# Compile the model

model.compile(optimizer='sgd',

loss='categorical_crossentropy',

metrics=['accuracy'])

# Fit the model

model.fit(predictors, target)

33.10 Using models

1. Save

2. Reload

3. Make predictions

from keras.models import load_model

model.save('model_file.h5')

my_model = load_model('model_file.h5')

# Calculate predictions with the reloaded model: predictions

predictions = my_model.predict(test_data)

# Calculate predicted probability of survival: predicted_prob_true

predicted_prob_true = predictions[:,1]

# print predicted_prob_true

print(predicted_prob_true)

33.11 Understanding Model Optimization

1. Why optimization is hard in deep learning

a. Simultaneously optimizing 1000s of parameters with complex relationships

b. Updates may not improve model meaningfully

c. Updates too small (if learning rate is low) or too large (if learning rate is high)

2. Scenario: try to optimize a model at a very low learning rate, a very high learning rate, and a "just right" learning rate. Look at the results after running this exercise, remembering that a low value of the loss function is good.

a. Exercise 1: Let us optimize using Stochastic Gradient Descent

# Step 1 of 3: create a tuple

input_shape=(10,)

type(input_shape)

input_shape

# Step 2 of 3: Create model as a function to loop from starting

def get_new_model(input_shape = input_shape):
    model = Sequential()
    model.add(Dense(100, activation='relu', input_shape = input_shape))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(2, activation='softmax'))
    return model

# Step 3 of 3: Changing optimization parameters

# Import the SGD optimizer

from keras.optimizers import SGD

# Create list of learning rates: lr_to_test

lr_to_test = [.000001, 0.01, 1]

# Loop over learning rates

for lr in lr_to_test:
    print('\n\nTesting model with learning rate: %f\n' % lr)

    # Build new model to test, unaffected by previous models
    model = get_new_model()

    # Create SGD optimizer with specified learning rate: my_optimizer
    my_optimizer = SGD(lr=lr)

    # Compile the model
    model.compile(optimizer=my_optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

    # Fit the model
    model.fit(predictors, target)

3. Which of the following could prevent a model from showing an improved loss in its first
few epochs?

a. Learning rate too low.

b. Learning rate too high.

c. Poor choice of activation function.

d. All of the above.

33.12 Model Validation

1. Validation in deep learning

a. The commonly used validation approach is a single train/validation split rather than cross-validation

b. Deep learning is widely used on large datasets

c. A single validation score is based on a large amount of data, and is reliable

d. Repeated training from cross-validation would take a long time

2. Exercise 1: Evaluating model accuracy on validation dataset

# Save the number of columns in predictors: n_cols

n_cols = predictors.shape[1]

input_shape = (n_cols,)

# Specify the model

model = Sequential()

model.add(Dense(100, activation='relu', input_shape = input_shape))

model.add(Dense(100, activation='relu'))

model.add(Dense(2, activation='softmax'))

# Compile the model

model.compile(optimizer='adam', loss='categorical_crossentropy',
metrics=['accuracy'])

# Fit the model

hist = model.fit(predictors, target, validation_split=0.3)

3. Exercise 2: Early stopping, Optimizing the optimization

# Import EarlyStopping

from keras.callbacks import EarlyStopping

# Save the number of columns in predictors: n_cols

n_cols = predictors.shape[1]

input_shape = (n_cols,)

# Specify the model

model = Sequential()

model.add(Dense(100, activation='relu', input_shape = input_shape))

model.add(Dense(100, activation='relu'))

model.add(Dense(2, activation='softmax'))

# Compile the model

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Define early_stopping_monitor

early_stopping_monitor = EarlyStopping(patience=2)

# Fit the model

model.fit(predictors, target, epochs=30, validation_split=0.3, callbacks=[early_stopping_monitor])

4. Exercise 3: Experimenting with wider networks

# Define early_stopping_monitor

early_stopping_monitor = EarlyStopping(patience=2)

# Create the new model: model_1

model_1 = Sequential()

# Add the first and second layers

model_1.add(Dense(10, activation='relu', input_shape=input_shape))

model_1.add(Dense(10, activation='relu'))

# Add the output layer

model_1.add(Dense(2, activation='softmax'))

# Compile model_1

model_1.compile(optimizer='adam', loss='categorical_crossentropy',
metrics=['accuracy'])

# Create the new model: model_2

model_2 = Sequential()

# Add the first and second layers

model_2.add(Dense(100, activation='relu', input_shape=input_shape))

model_2.add(Dense(100, activation='relu'))

# Add the output layer

model_2.add(Dense(2, activation='softmax'))

# Compile model_2

model_2.compile(optimizer='adam', loss='categorical_crossentropy',
metrics=['accuracy'])

# Fit model_1

model_1_training = model_1.fit(predictors, target, epochs=15, validation_split=0.2, callbacks=[early_stopping_monitor], verbose=False)

# Fit model_2

model_2_training = model_2.fit(predictors, target, epochs=15, validation_split=0.2, callbacks=[early_stopping_monitor], verbose=False)

# Create the plot

import matplotlib.pyplot as plt

plt.plot(model_1_training.history['val_loss'], 'r', model_2_training.history['val_loss'], 'b')

plt.xlabel('Epochs')

plt.ylabel('Validation score')

plt.show()

5. Note: model_2 (the blue line in the graph) has lower validation loss, so it is the better model.

6. Exercise 4: Adding layers to a network (deeper network)

a. In exercise 3 above, you have seen how to experiment with wider networks. In this exercise, you'll try a deeper network (more hidden layers).

# The input shape to use in the first hidden layer

input_shape = (n_cols,)

# Create the new model: model_1

model_1 = Sequential()

# Add one hidden layer

model_1.add(Dense(50, activation='relu', input_shape=input_shape))

# Add the output layer

model_1.add(Dense(2, activation='softmax'))

# Compile model_1

model_1.compile(optimizer='adam', loss='categorical_crossentropy',
metrics=['accuracy'])

# Create the new model: model_2

model_2 = Sequential()

# Add the first, second, and third hidden layers

model_2.add(Dense(50, activation='relu', input_shape=input_shape))

model_2.add(Dense(50, activation='relu'))

model_2.add(Dense(50, activation='relu'))

# Add the output layer

model_2.add(Dense(2, activation='softmax'))

# Compile model_2

model_2.compile(optimizer='adam', loss='categorical_crossentropy',
metrics=['accuracy'])

# Fit model 1

model_1_training = model_1.fit(predictors, target, epochs=20, validation_split=0.4, callbacks=[early_stopping_monitor], verbose=False)

# Fit model 2

model_2_training = model_2.fit(predictors, target, epochs=20, validation_split=0.4, callbacks=[early_stopping_monitor], verbose=False)

# Create the plot

plt.plot(model_1_training.history['val_loss'], 'r', model_2_training.history['val_loss'], 'b')

plt.xlabel('Epochs')

plt.ylabel('Validation score')

plt.show()

33.13 Model Capacity

1. How wide and how deep should networks be?

a. Workflow for optimizing model capacity

i. Start with a small network

ii. Gradually increase capacity

iii. Keep increasing capacity until the validation score is no longer improving

Hidden Layers | Nodes Per Layer | Mean Squared Error | Next Step
1             | 100             | 5.4                | Increase Capacity
1             | 250             | 4.8                | Increase Capacity
2             | 250             | 4.4                | Increase Capacity
3             | 250             | 4.5                | Decrease Capacity
3             | 200             | 4.3                | Done

2. If we do not check capacity in this way, there is a chance of overfitting the model.
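A minimal sketch of this workflow in Keras (illustrative only, reusing predictors, target, and n_cols from the earlier Titanic exercises; the capacity grid here is arbitrary):

from keras.models import Sequential
from keras.layers import Dense

def build_model(n_hidden_layers, n_nodes, input_shape):
    # Stack n_hidden_layers dense ReLU layers of n_nodes each, plus a softmax output
    model = Sequential()
    model.add(Dense(n_nodes, activation='relu', input_shape=input_shape))
    for _ in range(n_hidden_layers - 1):
        model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(2, activation='softmax'))
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# Increase capacity step by step and watch the validation loss
for n_layers, n_nodes in [(1, 50), (1, 100), (2, 100), (3, 100)]:
    model = build_model(n_layers, n_nodes, input_shape=(n_cols,))
    hist = model.fit(predictors, target, epochs=10, validation_split=0.3, verbose=False)
    print(n_layers, 'layers x', n_nodes, 'nodes -> best val_loss:', min(hist.history['val_loss']))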

### Level 06 of 08: Project on Deep Learning ###

34. Project using keras and tensorflow


Business Statement: MNIST Digit Recognition in Keras.

Step 1 of 5: Setting up the Environment

# imports for array-handling and plotting

import numpy as np

import matplotlib

matplotlib.use('agg')

import matplotlib.pyplot as plt

import os

os.environ['TF_CPP_MIN_LOG_LEVEL']='3'

# for testing on CPU

os.environ['CUDA_VISIBLE_DEVICES'] = ''

# keras imports for the dataset and building our neural network

from keras.datasets import mnist

from keras.models import Sequential, load_model

from keras.layers.core import Dense, Dropout, Activation

from keras.utils import np_utils

Step 2 of 5: Understanding and Preparing the Dataset

# Load the dataset

(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Observe some images

fig = plt.figure()

for i in range(9):
    plt.subplot(3, 3, i+1)
    plt.tight_layout()
    plt.imshow(X_train[i], cmap='gray', interpolation='none')
    plt.title("Class {}".format(y_train[i]))
    plt.xticks([])
    plt.yticks([])

fig

# In order to train our neural network to classify images we first have to unroll the
# height x width pixel format into one big vector - the input vector. So its length
# must be 28 * 28 = 784. But let's graph the distribution of our pixel values.

fig = plt.figure()

plt.subplot(2,1,1)

plt.imshow(X_train[0], cmap='gray', interpolation='none')

plt.title("Class {}".format(y_train[0]))

plt.xticks([])

plt.yticks([])

plt.subplot(2,1,2)

plt.hist(X_train[0].reshape(784))

plt.title("Pixel Value Distribution")

fig

# Note that the pixel values range from 0 to 255: the background majority close to 0,
# and those close to 255 representing the digit.

# Normalizing the input data helps to speed up the training. Also, it reduces the chance of
# getting stuck in local optima, since we're using stochastic gradient descent to find the
# optimal weights for the network.

# Let's reshape our inputs to a single vector and normalize the pixel values to lie between
# 0 and 1.

# let's print the shape before we reshape and normalize

print("X_train shape", X_train.shape)

print("y_train shape", y_train.shape)

print("X_test shape", X_test.shape)

print("y_test shape", y_test.shape)

# building the input vector from the 28x28 pixels

X_train = X_train.reshape(60000, 784)

X_test = X_test.reshape(10000, 784)

X_train = X_train.astype('float32')

X_test = X_test.astype('float32')

# normalizing the data to help with the training

X_train /= 255

X_test /= 255

# print the final input shape ready for training

print("Train matrix shape", X_train.shape)

print("Test matrix shape", X_test.shape)

#Let us see the number of expected outcomes

print(np.unique(y_train, return_counts=True))

# Let's encode our categories - digits from 0 to 9 - using one-hot encoding. The result is a vector
# with a length equal to the number of categories. The vector is all zeroes except in the position
# for the respective category. Thus a '5' will be represented by [0,0,0,0,0,1,0,0,0,0].

# one-hot encoding using keras' numpy-related utilities

n_classes = 10

print("Shape before one-hot encoding: ", y_train.shape)

Y_train = np_utils.to_categorical(y_train, n_classes)

Y_test = np_utils.to_categorical(y_test, n_classes)

print("Shape after one-hot encoding: ", Y_train.shape)

Step 3 of 5: Building the Network

1. Our pixel vector serves as the input. Then, two hidden 512-node layers, with enough
model complexity for recognizing digits. For the multi-class classification we add another
densely-connected (or fully-connected) layer for the 10 different output classes. For this
network architecture we can use the Keras Sequential Model. We can stack layers using
the .add() method.

2. When adding the first layer in the Sequential Model we need to specify the input shape
so Keras can create the appropriate matrices. For all remaining layers the shape is
inferred automatically.

3. In order to introduce nonlinearities into the network and elevate it beyond the capabilities
of a simple perceptron we also add activation functions to the hidden layers. The
differentiation for the training via backpropagation is happening behind the scenes
without having to implement the details.

4. We also add dropout as a way to prevent overfitting. During training, dropout randomly sets a fraction of the node activations to zero so that the network doesn't rely too much on very few nodes.

5. The last layer consists of connections for our 10 classes and the softmax activation
which is standard for multi-class targets.

# building a linear stack of layers with the sequential model

model = Sequential()

model.add(Dense(512, input_shape=(784,)))

model.add(Activation('relu'))

model.add(Dropout(0.2))

model.add(Dense(512))

model.add(Activation('relu'))

model.add(Dropout(0.2))

model.add(Dense(10))

model.add(Activation('softmax'))

Step 4 of 5: Compiling and Training the Model

# compiling the sequential model

model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')

# We can now start training the model

# training the model and saving metrics in history

history = model.fit(X_train, Y_train,

batch_size=128, epochs=8,

verbose=2,

validation_data=(X_test, Y_test))

# saving the model

save_dir = "C:/Users/Hi/Google Drive/01 Data Science Lab Copy/02 Lab


Data/Python/Deep Learning/"

model_name = 'keras_mnist.h5'

model_path = os.path.join(save_dir, model_name)

model.save(model_path)

print('Saved trained model at %s ' % model_path)

# plotting the metrics

fig = plt.figure()

plt.subplot(2,1,1)

plt.plot(history.history['acc'])

plt.plot(history.history['val_acc'])

plt.title('model accuracy')

plt.ylabel('accuracy')

plt.xlabel('epoch')

plt.legend(['train', 'test'], loc='lower right')

plt.subplot(2,1,2)

plt.plot(history.history['loss'])

plt.plot(history.history['val_loss'])

plt.title('model loss')

plt.ylabel('loss')

plt.xlabel('epoch')

plt.legend(['train', 'test'], loc='upper right')

plt.tight_layout()

fig

Note: This learning curve looks quite good! We see that the loss on the training set is
decreasing rapidly for the first two epochs. This shows the network is learning to classify
the digits pretty fast. For the test set the loss does not decrease as fast but stays roughly
within the same range as the training loss. This means our model generalizes well to
unseen data.

Step 5 of 5: Evaluate the Model Performance

mnist_model = load_model("C:/Users/Hi/Google Drive/01 Data Science Lab Copy/02 Lab Data/Python/Deep Learning/keras_mnist.h5")

loss_and_metrics = mnist_model.evaluate(X_test, Y_test, verbose=2)

print("Test Loss", loss_and_metrics[0])

print("Test Accuracy", loss_and_metrics[1])

# load the model and create predictions on the test set

mnist_model = load_model("C:/Users/Hi/Google Drive/01 Data Science Lab Copy/02 Lab Data/Python/Deep Learning/keras_mnist.h5")

predicted_classes = mnist_model.predict_classes(X_test)

# see which we predicted correctly and which not

correct_indices = np.nonzero(predicted_classes == y_test)[0]

incorrect_indices = np.nonzero(predicted_classes != y_test)[0]

print()

print(len(correct_indices)," classified correctly")

print(len(incorrect_indices)," classified incorrectly")

# plot 9 correct predictions

for i, correct in enumerate(correct_indices[:9]):
    plt.subplot(6, 3, i+1)
    plt.imshow(X_test[correct].reshape(28,28), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[correct], y_test[correct]))
    plt.xticks([])
    plt.yticks([])

# plot 9 incorrect predictions

for i, incorrect in enumerate(incorrect_indices[:9]):
    plt.subplot(6, 3, i+10)
    plt.imshow(X_test[incorrect].reshape(28,28), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[incorrect], y_test[incorrect]))
    plt.xticks([])
    plt.yticks([])

35. Convolutional Neural Networks (CNN)


1. How does a self-driving car identify road signs?

2. An image is an array of pixel data.

3. Exercise 1: read an image as an array and display the array as an image

# Import matplotlib

import matplotlib.pyplot as plt

# Load the image

data = plt.imread('C:\\Users\\Hi\\Google Drive\\01 Data Science Lab Copy\\02 Lab Data\\Python\\bricks.png')

# Display the image

plt.imshow(data)

plt.show()

4. Exercise 2: Changing the image data, and hence the image

# Set the red channel in this part of the image to 1

data[:40, :40, 0] = 1

# Set the green channel in this part of the image to 0

data[:40, :40, 1] = 0

# Set the blue channel in this part of the image to 0

data[:40, :40, 2] = 0

# Visualize the result

plt.imshow(data)

plt.show()

5. Exercise 3: Understanding one-hot encoding

labels= ['shoe', 'shirt', 'shoe', 'shirt', 'dress', 'dress', 'dress']

# The number of image categories

n_categories = 3

# The unique values of categories in the data

categories = np.array(["shirt", "dress", "shoe"])

# Initialize one_hot_encoding_labels as all zeros

one_hot_encoding_labels = np.zeros((len(labels), n_categories))

# Loop over the labels

for ii in range(len(labels)):
    # Find the location of this label in the categories variable
    jj = np.where(categories == labels[ii])

    # Set the corresponding zero to one
    one_hot_encoding_labels[ii, jj] = 1

one_hot_encoding_labels
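For comparison, a small sketch (not part of the original exercise) of the same encoding done with Keras' own utility, after mapping each string label to its integer category index:

from keras.utils import to_categorical

# Map each string label to the index of its category, then one-hot encode
label_indices = [int(np.where(categories == label)[0][0]) for label in labels]
one_hot_via_keras = to_categorical(label_indices, num_classes=n_categories)

print(one_hot_via_keras)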

6. Business Statement: MNIST Digit Recognition in Keras.
