
CSC 485/585 DA 515

Introduction to Machine Learning

Lecture 7
MLP or ANNs
Chapter 10-2

Fall 2024
ANNs: artificial neural networks

Outline:
 Perceptron
 MLPs: Multilayer Perceptrons == ANNs
 Classification
 Regression

-----------
 Playground
https://playground.tensorflow.org

 Keras or PyTorch
 TensorBoard
Short History
 1958: Perceptron (a linear model)
 1969: Perceptrons shown to have limitations
 1980s: Multilayer perceptrons
   not significantly different from today's DNNs
 1986: Backpropagation
 1989: One hidden layer is "good enough", so why go deep?
   At the time, more than 3 hidden layers was usually not helpful
Toward Deep Learning

After 2000: more data and more computing power turned ANNs into DNNs.

 2009: GPUs adopted for training
 2011: Became popular in speech recognition
 2012: Won the ILSVRC image competition
 2014-2015: AlphaGo
 2022: ChatGPT
 Now: Deep Learning =
   lots of training data + parallel computation + scalable, smart algorithms (transformers)
ANN: Why Deep

 Recent publications show that deep networks are more efficient:
 With the same number of neurons, a deep network gets better performance.
 For the same performance, a deep network uses fewer neurons.
Shallow vs Deep NNs
 Shallow vs deep NNs? Deep is better.

 Deep Learning is built around ANNs, which form the core of its architecture.
 ANNs are versatile, powerful, and scalable.

 Applications: ideal for tackling large and highly complex Machine Learning tasks such as
 image classification (e.g., Google Images),
 speech recognition services (e.g., Apple's Siri),
 recommending the best videos to watch to hundreds of millions of users every day (e.g., YouTube), or
 learning to beat the world champion at the game of Go (DeepMind's AlphaGo),
 ChatGPT (a large-scale language model, or LLM).
Deep Models of Recent Years
1. Sk-learn ANN
 Classifier:
https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html

 For three layers (one hidden layer):

hidden_layer_sizes: array-like of shape (n_layers - 2,), default=(100,)
MLPClassifier(hidden_layer_sizes=(100,))
The ith element represents the number of neurons in the ith hidden layer.

 For four layers (two hidden layers): for example, if you specify

model = MLPClassifier(hidden_layer_sizes=(50, 25))

it means there are two hidden layers with 50 and 25 neurons respectively.

 Similar for regression: MLPRegressor
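To make this concrete, here is a minimal sketch of training an MLPClassifier; the make_moons toy dataset and the chosen layer sizes are illustrative assumptions, not from the slides:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# illustrative toy data (not from the lecture)
X, y = make_moons(n_samples=1000, noise=0.2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# feature scaling matters for MLPs; two hidden layers with 50 and 25 neurons
clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(50, 25), max_iter=1000, random_state=42))
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))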


2. Implementing MLPs with Keras

Keras is a high-level Deep Learning API that allows you to easily build,
train, evaluate, and execute all sorts of neural networks.

APIs:
1. Sequential API: easy but limited
2. Functional API:
 for more complex topologies, or
 for models with multiple inputs or outputs.
2.1 Sequential API:
 Sequential API code:
from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),   # e.g., 28x28 grayscale images
    keras.layers.Dense(300, activation="relu"),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(10, activation="softmax")  # 10 output classes
])
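After defining the model, it still has to be compiled and trained; a minimal sketch, assuming a 10-class image dataset such as Fashion MNIST with X_train, y_train, X_valid, y_valid already prepared:

model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd",
              metrics=["accuracy"])

history = model.fit(X_train, y_train, epochs=30,
                    validation_data=(X_valid, y_valid))
model.evaluate(X_test, y_test)   # X_test, y_test assumed as well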

2.2 Building Complex Models Using the Functional API
 Layers are called like functions
 Wide & Deep model
 the input features are used twice: through the deep path and directly at the output

input_ = keras.layers.Input(shape=X_train.shape[1:])
hidden1 = keras.layers.Dense(30, activation="relu")(input_)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.Concatenate()([input_, hidden2])
output = keras.layers.Dense(1)(concat)
model = keras.Model(inputs=[input_], outputs=[output])

Handling multiple inputs: split features

Handling multiple inputs

input_A = keras.layers.Input(shape=[5], name="wide_input")
input_B = keras.layers.Input(shape=[6], name="deep_input")
hidden1 = keras.layers.Dense(30, activation="relu")(input_B)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.concatenate([input_A, hidden2])
output = keras.layers.Dense(1, name="output")(concat)
model = keras.Model(inputs=[input_A, input_B], outputs=[output])

See the figure on the previous slide.
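With two inputs, fit() expects one array per input; a minimal sketch, assuming X_train has 8 features split so the column counts match the declared input shapes [5] and [6] (the split is illustrative):

# wide path gets the first 5 columns, deep path gets the last 6
X_train_A, X_train_B = X_train[:, :5], X_train[:, 2:]
X_valid_A, X_valid_B = X_valid[:, :5], X_valid[:, 2:]

model.compile(loss="mse", optimizer="sgd")
history = model.fit((X_train_A, X_train_B), y_train, epochs=20,
                    validation_data=((X_valid_A, X_valid_B), y_valid))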

Multi-output

output = keras.layers.Dense(1, name="main_output")(concat)
aux_output = keras.layers.Dense(1, name="aux_output")(hidden2)
model = keras.Model(inputs=[input_A, input_B],
                    outputs=[output, aux_output])
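Each output needs its own loss when compiling; a minimal sketch assuming we care mostly about the main output (the 0.9/0.1 loss weights and the shared labels are illustrative):

model.compile(loss=["mse", "mse"], loss_weights=[0.9, 0.1], optimizer="sgd")

# fit() then takes one set of labels per output; here both outputs predict y
history = model.fit((X_train_A, X_train_B), (y_train, y_train), epochs=20,
                    validation_data=((X_valid_A, X_valid_B), (y_valid, y_valid)))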
5-Step Life-Cycle for Neural Network Models in Keras
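The slide's diagram is not reproduced here; the five steps are usually listed as define, compile, fit, evaluate, and predict. A minimal sketch with illustrative data and model sizes:

from tensorflow import keras
import numpy as np

# illustrative data: 200 samples, 10 features, binary labels
X = np.random.rand(200, 10).astype("float32")
y = (X.sum(axis=1) > 5).astype("int32")

# 1. define the model
model = keras.models.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(10,)),
    keras.layers.Dense(1, activation="sigmoid")])
# 2. compile the model
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
# 3. fit the model
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
# 4. evaluate the model
loss, acc = model.evaluate(X, y, verbose=0)
# 5. make predictions
probs = model.predict(X[:3])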
Saving and Restoring a Model

Saving a trained Keras model:

model.save("my_keras_model.h5")

The HDF5 format saves:

 the model's architecture (including every layer's hyperparameters),
 the values of all the model parameters for every layer (e.g., connection weights and biases),
 the optimizer (including its hyperparameters and any state it may have).
Restoring a saved model

Loading the model is just as easy:

model = keras.models.load_model("my_keras_model.h5")

Reuse the model:

 for prediction
 for deployment
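A minimal usage sketch after restoring (X_new is an assumed batch of new samples):

y_proba = model.predict(X_new)       # class probabilities from the softmax layer
y_pred = y_proba.argmax(axis=-1)     # predicted class indices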
Using Callbacks

The fit() method accepts a callbacks argument: a list of objects that Keras will call
 at the start and end of training,
 at the start and end of each epoch,
 even before and after processing each batch.

For example, the ModelCheckpoint callback saves checkpoints of your model at regular intervals during training, by default at the end of each epoch:

[...]  # build and compile the model
checkpoint_cb = keras.callbacks.ModelCheckpoint("my_keras_model.h5")
history = model.fit(X_train, y_train, epochs=10, callbacks=[checkpoint_cb])
Callback example

checkpoint_cb = keras.callbacks.ModelCheckpoint("my_keras_model.h5",
                                                save_best_only=True)

history = model.fit(X_train, y_train, epochs=10,
                    validation_data=(X_valid, y_valid),
                    callbacks=[checkpoint_cb])

# roll back to the best model:
model = keras.models.load_model("my_keras_model.h5")

To keep only the best model, use a validation set during training and set save_best_only=True when creating the ModelCheckpoint.
Early Stopping: over epochs
 Train the model for many epochs and track the validation error after each epoch:
 if val_error < minimum_val_error: save the model (and the new minimum) and keep training
 otherwise, stop once the validation error has not improved for a while and roll back to the saved model
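A minimal sketch of this loop in Python (train_one_epoch and evaluate are hypothetical helpers; patience and max_epochs are assumed to be set, e.g. 10 and 100):

minimum_val_error = float("inf")
best_weights, epochs_without_progress = None, 0

for epoch in range(max_epochs):
    train_one_epoch(model)                          # hypothetical helper
    val_error = evaluate(model, X_valid, y_valid)   # hypothetical helper
    if val_error < minimum_val_error:
        minimum_val_error = val_error
        best_weights = model.get_weights()
        epochs_without_progress = 0
    else:
        epochs_without_progress += 1
        if epochs_without_progress >= patience:
            break

model.set_weights(best_weights)   # roll back to the best model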
Early stopping

To implement early stopping:

 use the EarlyStopping callback.

early_stopping_cb = keras.callbacks.EarlyStopping(patience=10,
                                                  restore_best_weights=True)

history = model.fit(X_train, y_train, epochs=100,
                    validation_data=(X_valid, y_valid),
                    callbacks=[checkpoint_cb, early_stopping_cb])

Training is interrupted when there is no progress on the validation set for a number of epochs (defined by patience), and restore_best_weights=True rolls back to the best weights found.
Demo
 Install TensorFlow using pip in the Jupyter notebook:
pip install tensorflow

1. Classification
2. Regression
3. Complex model
4. Save and restore

As you can see, you can build almost any architecture quite easily with the Functional API.

Parameter Tuning
 Grid Search (p. 321; might take hours); see the sketch after this list
 AutoML (searches for both the best structure and the best hyperparameters)

https://machinelearningmastery.com/automl-libraries-for-python/

 Wide or Deep:
 An MLP with just one hidden layer can theoretically model even the most
complex functions, provided it has enough neurons.
 But for complex problems, deep networks have a much higher parameter
efficiency than shallow ones: they can model complex functions using
exponentially fewer neurons than shallow nets, allowing them to reach
much better performance with the same amount of training data.

 Number of Hidden Layers / Number of Neurons per Hidden Layer
 the more complex the problem, the more neurons are needed; too many causes overfitting
 Learning Rate, Batch Size, and Other Hyperparameters
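A minimal grid-search sketch over a few MLPClassifier hyperparameters; the parameter grid, data names, and cv setting are illustrative assumptions, and the search can be very slow:

from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

param_grid = {
    "hidden_layer_sizes": [(50,), (100,), (50, 25)],   # illustrative choices
    "learning_rate_init": [0.001, 0.01],
    "alpha": [0.0001, 0.001],                          # L2 regularization strength
}
grid = GridSearchCV(MLPClassifier(max_iter=1000, random_state=42),
                    param_grid, cv=3, n_jobs=-1)
grid.fit(X_train, y_train)            # X_train, y_train assumed from earlier
print(grid.best_params_, grid.best_score_)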
Summary

 Single Perceptron
 MLP: ANN for classification/regression

 Training: backpropagation

 Keras:
 Sequential/Functional API
 saving and restoring models
 callbacks for early stopping and keeping the best model
-- END --

 Next Week:
 Oct. 16: Naïve Bayes

 Homework
 HW4 Part A: due by this Friday
 HW4 Part B: due in two weeks (same day as the Midterm)
Mid-term: Oct. 23

 Mid-term: Oct. 23
 Lecture time
 2.5 hrs (all lectures)
 100 points
 paper-and-pencil, closed-book
 Materials: Lectures 1-7
