
Chapter 10: Introduction to Artificial

Neural Networks with Keras

Tsz-Chiu Au
[email protected]

Ulsan National Institute of Science and Technology (UNIST)


South Korea
Introduction to ANNs
• Artificial neural networks (ANNs) are machine learning
models inspired by the networks of biological neurons found
in our brains.
• ANNs are at the very core of Deep Learning, being used in
» Google Images
» Apple’s Siri
» YouTube
» DeepMind’s AlphaGo
• We will discuss what ANNs are and how to implement them in Keras.
From Biological to Artificial Neurons
• ANNs were first introduced back in 1943 by the neurophysiologist
Warren McCulloch and the mathematician Walter Pitts.
» A simplified computational model of how biological neurons might work together
in animal brains to perform complex computations using propositional logic.
• The early successes of ANNs led to the widespread belief that we
would soon be conversing with truly intelligent machines.
• Sadly, this promise went unfulfilled, triggering the first AI winter in
the 1970s.
• In the mid-1980s, new architectures such as Multilayer Perceptrons
(MLPs) and better training techniques such as the backpropagation
algorithm revived interest in connectionism (i.e., the study of
neural networks).
• However, by the 1990s, other powerful Machine Learning
techniques such as Support Vector Machines and Random Forests
overtook ANNs.
From Biological to Artificial Neurons (cont.)
• Since the early 2010s, thanks to the success of deep learning in computer
vision, there has been a huge wave of interest in ANNs.
• Reasons for this AI spring:
» There is now a huge quantity of data available to train neural networks
§ ANNs frequently outperform other ML techniques on very large and complex problems.
» The tremendous increase in computing power since the 1990s now makes
it possible to train large neural networks in a reasonable amount of time.
§ The availability of powerful GPU cards and cloud computing platforms.
» The training algorithms have been improved.
§ We can train very large networks now.
» Some theoretical limitations of ANNs have turned out to be benign in
practice.
» ANNs seem to have entered a virtuous circle of funding and progress.
A Biological Neuron
Multiple layers in a biological neural
network (human cortex)
Logical Computations with Neurons
• McCulloch and Pitts proposed a very simple model of the
biological neuron, which later became known as an artificial
neuron: it has one or more binary (on/off) inputs and one
binary output.
» Even with such a simplified model, it is possible to build a network of
artificial neurons that computes any logical proposition you want.
• These networks can be combined to compute complex logical expressions.
The Perceptron
• The Perceptron is one of the simplest ANN architectures,
invented in 1957 by Frank Rosenblatt.
» Based on a slightly different artificial neuron called a threshold logic
unit (TLU), or sometimes a linear threshold unit (LTU).
§ It computes a weighted sum of its inputs, then applies a step function.
The Perceptron (cont.)
• A single TLU can be used for simple linear binary
classification.
» It computes a linear combination of the inputs, and if the
result exceeds a threshold, it outputs the positive class.
• A Perceptron is simply composed of a single layer of
TLUs.
» When all the neurons in a layer are connected to every
neuron in the previous layer (i.e., its input neurons), the
layer is called a fully connected layer, or a dense layer.
The Perceptron (cont.)
• Architecture of a Perceptron with two input neurons, one bias
neuron, and three output neurons:
How is a Perceptron trained?
• The Perceptron training algorithm proposed by Rosenblatt was
largely inspired by Hebb’s rule.
» when a biological neuron triggers another neuron often, the connection
between these two neurons grows stronger.
» “Cells that fire together, wire together”
• Perceptron learning rule (sketched below):
» For every output neuron that produced a wrong prediction, it reinforces
the connection weights from the inputs that would have contributed to
the correct prediction.
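As a reference point (the slide's equation is not reproduced), the update rule commonly stated for the Perceptron is, with η the learning rate:

```latex
% Perceptron learning rule (sketch): weight connecting input i to output neuron j
% x_i: i-th input value, \hat{y}_j: neuron output, y_j: target output, \eta: learning rate
w_{i,j}^{(\text{next step})} = w_{i,j} + \eta \,\bigl(y_j - \hat{y}_j\bigr)\, x_i
```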

• Perceptron convergence theorem:
» If the training instances are linearly separable, Rosenblatt demonstrated
that this algorithm would converge to a solution.
Perceptron in Scikit-Learn
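The code on this slide is not reproduced; a minimal sketch using Scikit-Learn's Perceptron class on the iris dataset (the choice of two petal features and the "is it Iris setosa?" target are illustrative) could look like this:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

iris = load_iris()
X = iris.data[:, (2, 3)]                  # petal length, petal width
y = (iris.target == 0).astype(np.int64)   # 1 if Iris setosa, else 0

per_clf = Perceptron()                    # a single-TLU linear classifier
per_clf.fit(X, y)

y_pred = per_clf.predict([[2, 0.5]])      # predict the class of a new flower
```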
Limitations of Perceptrons
• In their 1969 monograph Perceptrons, Marvin Minsky and
Seymour Papert highlighted a number of serious weaknesses
of Perceptrons
» Exclusive OR (XOR) classification problem
• But some of the limitations can be eliminated by Multilayer
Perceptron (MLP)
Architecture of a Multilayer Perceptron
Backpropagation Algorithm
• The field of Deep Learning studies deep neural networks
(DNNs), i.e., ANNs containing a deep stack of hidden layers, and
more generally models containing deep stacks of computations.
• For many years researchers struggled to find a way to train
MLPs, without success.
• In 1986, David Rumelhart, Geoffrey Hinton, and Ronald
Williams introduced the backpropagation training
algorithm.
» Two passes through the network (one forward, one backward)
» Compute the gradient of the network’s error with regard to
every single model parameter.
» Adjust the parameters by using gradient descent.
Backpropagation Algorithm (cont.)
• The algorithm handles one mini-batch at a time (e.g., each containing 32
instances from the training set).
• It goes through the full training set multiple times. Each pass is called an
epoch.
• Forward pass: Each mini-batch is passed from the network’s input layer to
the output layer through the hidden layers. All intermediate results are
preserved.
• Measures the network’s output error by a loss function that compares the
desired output and the actual output of the network.
• Computes how much each output connection contributed to the error by
the chain rule.
• Reverse pass: Measures how much of these error contributions came
from each connection in the layer below, again using the chain rule,
working backward until the algorithm reaches the input layer.
• Performs a Gradient Descent step to tweak all the connection weights in
the network, using the error gradients it just computed.
Activation Functions
• The backpropagation algorithm cannot work with the step function,
which provides no gradient for Gradient Descent. Common alternatives
(sketched after this list):
• Logistic (sigmoid) function: σ(z) = 1 / (1 + exp(–z)).
• Hyperbolic tangent function: tanh(z) = 2σ(2z) – 1
• Rectified Linear Unit function: ReLU(z) = max(0, z)
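As an illustration (not from the slides), the three activation functions above can be written directly with NumPy:

```python
import numpy as np

def sigmoid(z):
    # Logistic (sigmoid) function: squashes z into the (0, 1) range
    return 1 / (1 + np.exp(-z))

def tanh(z):
    # Hyperbolic tangent: equal to 2*sigmoid(2z) - 1, output in (-1, 1)
    return np.tanh(z)

def relu(z):
    # Rectified Linear Unit: 0 for negative inputs, identity otherwise
    return np.maximum(0, z)
```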
Output Neurons for Regression MLPs
• For regression tasks, an ANN predicts a single value only. Thus,
one output neuron is sufficient.
• For multivariate regression, there is one output neuron per
output dimension.
• In general, output neurons use no activation function, so they are free to output any range of values. To constrain the outputs, you can use:
» ReLU function: ReLU(z) = max(0, z)
» Softplus function: softplus(z) = log(1 + exp(z))
» logistic function or hyperbolic tangent with scaling factors.
• Loss functions:
» Mean Squared Error
» Mean Absolute Error
» Huber Loss
Typical regression MLP architecture
Output Neurons for Classification MLPs
• For binary classification tasks, there is one output neuron with
the logistic activation function
» The output can be interpreted as the estimated probability of the
positive class.
• For multilabel binary classification tasks, you need multiple
output neurons.
• For multiclass classification tasks, there is one output neuron
per class, and the softmax activation function should be used
for the whole output layer.
Output Neurons for Classification MLPs (cont.)
• The predicted class is the one with the highest estimated probability.
• Loss function: cross-entropy loss (a.k.a. the log loss), sketched below:
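The equation shown on the slide is not reproduced; the cross-entropy loss is commonly written as follows, for m instances and K classes, where y_k^(i) is 1 if instance i belongs to class k (else 0) and p̂_k^(i) is the predicted probability:

```latex
J(\boldsymbol{\Theta}) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
    y_k^{(i)} \log\!\left(\hat{p}_k^{(i)}\right)
```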
A modern MLP for classification
Typical classification MLP architecture
Implementing MLPs with Keras
• Keras is a high-level Deep Learning API that
allows you to easily build, train, evaluate, and
execute all sorts of neural networks.
» https://keras.io
» Computation backend: TensorFlow, Microsoft
Cognitive Toolkit (CNTK), Theano, Apache MXNet,
Apple’s Core ML, JavaScript or TypeScript, and
PlaidML.
• tf.keras: Extended Keras implementation based
on TensorFlow with TensorFlow-specific features.
• PyTorch is also quite popular.
Multibackend Keras vs. tf.keras
Installing TensorFlow 2
• If you are using Google Colab only, you can skip this step.
• If you plan to run your code on your own computer, please install Jupyter,
Scikit-Learn, etc. according to the instructions in Chapter 2.
• Activate the virtual environment and then use pip to install TensorFlow 2:

• Open a Python shell or a Jupyter notebook and print the versions of
TensorFlow and tf.keras (see the sketch below).
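The commands on the slide are not reproduced; a minimal sketch, assuming TensorFlow 2 was installed in the activated environment with pip, is:

```python
# Install first, inside the activated environment, e.g.:
#   python3 -m pip install --upgrade tensorflow
import tensorflow as tf
from tensorflow import keras

print(tf.__version__)       # e.g. "2.x.y"
print(keras.__version__)    # version of the bundled tf.keras implementation
```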
Fashion MNIST Dataset
• 70,000 grayscale images of 28 × 28 pixels each, with 10 classes

• Drop-in replacement of MNIST in Chapter 2.


» But the images represent fashion items rather than handwritten digits.
» More challenging than MNIST: a simple linear model reaches about 92%
accuracy on MNIST, but only about 83% on Fashion MNIST.
Using Keras to Load the Dataset
• Keras provides some utility functions to fetch and load common datasets.

• Loading data from Keras is different from Scikit-Learn:


» Every image is represented as a 28 × 28 array rather than a 1D array of size 784.

» The pixel intensities are represented as integers (from 0 to 255) rather than floats (from
0.0 to 255.0)
• Since we are going to train the neural network using Gradient Descent, we
must scale the input features down to the 0–1 range by dividing them by
255.0 (see the sketch below):
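The loading code on the slide is not reproduced; a sketch using the standard tf.keras dataset utility (the 5,000-instance validation split is an illustrative choice):

```python
from tensorflow import keras

# Fetch Fashion MNIST via Keras (downloaded and cached on first use)
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()

# X_train_full has shape (60000, 28, 28) with uint8 pixel values in 0-255.
# Hold out a validation set and scale pixel intensities to the 0-1 range.
X_valid, X_train = X_train_full[:5000] / 255.0, X_train_full[5000:] / 255.0
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
X_test = X_test / 255.0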
Naming the Labels
• Unlike MNIST, Fashion MNIST needs a list of class names for the labels
to know what each image represents:

• For example, the first image in the training set represents a coat:
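The code on the slide is not reproduced; a sketch using the standard Fashion MNIST class names:

```python
# Human-readable names for the 10 Fashion MNIST labels (standard ordering)
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

class_names[y_train[0]]   # name of the first training image's class
```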
Creating the model using the Sequential API

• The first way to build a neural network in tf.keras is to use the
Sequential API.
» It works only for neural networks composed of a single stack of layers
connected sequentially.
• The tf.keras code for building a classification MLP with two hidden layers:
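The code itself is not reproduced on this extracted slide; a sketch of such a classification MLP (the layer sizes 300/100 are the ones typically used for this example) is:

```python
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28, 28]))    # 28x28 image -> 1D array of 784 values
model.add(keras.layers.Dense(300, activation="relu"))    # first hidden layer
model.add(keras.layers.Dense(100, activation="relu"))    # second hidden layer
model.add(keras.layers.Dense(10, activation="softmax"))  # one output neuron per class
```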

» The Flatten layer converts each input image into a 1D array.


» Each Dense layer manages its own weight matrix, containing all the connection weights
between the neurons and their inputs, as well as the bias terms.
Creating the model using the Sequential API
(cont.)
• Alternatively, you can add the layers when the Sequential model is created.

• The model’s summary() method displays the information of the model’s layers:
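The slide's code is not reproduced; a sketch of the alternative constructor and of the summary call:

```python
# Alternative: pass the list of layers directly to the Sequential constructor
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, activation="relu"),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])

model.summary()   # prints each layer's name, output shape, and parameter count
```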
Accessing the Information of a Model
• Directly get a model’s list of layers:

• All the parameters of a layer can be accessed using its get_weights() and set_weights() methods:
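A sketch of inspecting the model (the layer index and names are illustrative):

```python
model.layers                  # list of the model's layers
hidden1 = model.layers[1]     # the first Dense layer
hidden1.name                  # the layer's auto-generated name

weights, biases = hidden1.get_weights()
weights.shape                 # (784, 300): one weight per input/neuron connection
biases.shape                  # (300,): one bias per neuron, initialized to zeros
```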
Compiling the Model
• Before training the model, you must compile the model:
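A sketch of the compile call, using the loss, optimizer, and metric discussed below:

```python
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd",
              metrics=["accuracy"])
```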

• Use the "sparse_categorical_cross entropy" loss when we have


sparse labels (i.e., for each instance, there is just a tar- get class
index, from 0 to 9 in this case) and the classes are exclusive.
» Otherwise, the "categorical_crossentropy" loss if one-hot vectors is used
(i.e., [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.] represents class 3).
» Otherwise, use the "binary_crossentropy" loss if the "sigmoid" (i.e.,
logistic) activation function in the output layer is used for binary
classification tasks.
• Use “sgd” for Stochastic Gradient Descent (i.e., reverse-mode
autodiff plus Gradient Descent)
• Use “accuracy” because our model is a classifier.
Training the Model
• After compiling the model, call fit() to train the model with the training and
validation datasets.
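A sketch of the training call (the epoch count is illustrative):

```python
history = model.fit(X_train, y_train, epochs=30,
                    validation_data=(X_valid, y_valid))
```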

• You should check whether overfitting occurs (i.e., accuracy >> val_accuracy)
• Consider passing the class_weight argument if the training set is skewed.
Drawing the Learning Curves
• fit() returns a History object, which contains:
» The training parameters (history.params)
» The list of epochs it went through (history.epoch)
» The loss and extra metrics at the end of each epoch on the training set
and on the validation set (history.history).
• You can draw the learning curves using matplotlib:
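A sketch of plotting the learning curves from the History object (using pandas for convenience):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Plots loss, accuracy, val_loss and val_accuracy versus the epoch number
pd.DataFrame(history.history).plot(figsize=(8, 5))
plt.grid(True)
plt.gca().set_ylim(0, 1)   # keep the vertical axis in the 0-1 range
plt.show()
```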
Drawing the Learning Curves (cont.)
• The learning curve shows the mean training loss and accuracy
measured over each epoch, and the mean validation loss and
accuracy measured at the end of each epoch:

• When reading the learning curves, you should mentally shift the training
curve in the above graph by half an epoch to the left, since the training
metrics are computed during each epoch while the validation metrics are
computed at the end of each epoch.
Continue the Training
• If the model has not converged yet, call fit() again to continue the
training.
• If you are not satisfied with the performance of your model, you
should go back and tune the hyperparameters.
» Tune the learning rate
» Try another optimizer
» Adjust the number of layers, the number of neurons per layer, and the
types of activation functions to use for each hidden layer
» Change the batch size
• Finally, estimate the generalization error using the test set before
you deploy the model to production.

• Don’t tweak the hyperparameters to improve the accuracy on the test set.
Using the Model to Make Predictions
• After training the model, you can use the model’s predict() method to
make predictions on new instances:

• If you want to know the class with the highest estimated probability only,
use the predict_classes() method instead:
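A sketch of both prediction calls (taking a few test images as stand-ins for new instances; predict_classes() was available in the tf.keras versions the slides target and is deprecated in recent releases):

```python
import numpy as np

X_new = X_test[:3]                 # pretend these are new instances
y_proba = model.predict(X_new)
y_proba.round(2)                   # one estimated probability per class, per instance

y_pred = model.predict_classes(X_new)          # class with the highest probability
# Equivalent in newer TensorFlow releases:
y_pred = np.argmax(model.predict(X_new), axis=-1)
np.array(class_names)[y_pred]      # map class indices back to class names
```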

• The predictions should be correct (otherwise, the model needs more training).


California Housing with the Sequential API
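The slide's code is not reproduced; a plausible sketch of a regression MLP on the California housing dataset (layer size, epoch count, and split choices are illustrative):

```python
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow import keras

housing = fetch_california_housing()
X_train_full, X_test, y_train_full, y_test = train_test_split(housing.data, housing.target)
X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full)

# Scale the features (important for Gradient Descent)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_valid = scaler.transform(X_valid)
X_test = scaler.transform(X_test)

model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=X_train.shape[1:]),
    keras.layers.Dense(1)              # single output neuron, no activation
])
model.compile(loss="mean_squared_error", optimizer="sgd")
history = model.fit(X_train, y_train, epochs=20,
                    validation_data=(X_valid, y_valid))
mse_test = model.evaluate(X_test, y_test)
y_pred = model.predict(X_test[:3])     # predictions for a few "new" instances
```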
Building Complex Models Using the
Functional API
• You cannot use the Sequential API to build nonsequential
neural networks.
• For example, consider the Wide & Deep neural network:
» can learn both deep patterns (using the deep path) and simple rules
(through the short path)
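A sketch of a Wide & Deep model built with the Functional API (reusing the California housing training data from the previous sketch):

```python
input_ = keras.layers.Input(shape=X_train.shape[1:])
hidden1 = keras.layers.Dense(30, activation="relu")(input_)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.concatenate([input_, hidden2])   # wide path joins the deep path
output = keras.layers.Dense(1)(concat)
model = keras.models.Model(inputs=[input_], outputs=[output])
```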
Using the Functional API
• How about sending a subset of the features through the wide
path and a different subset (possibly overlapping) through the
deep path:
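A sketch with two inputs (the particular feature subsets, 0-4 for the wide path and 2-7 for the deep path, are illustrative):

```python
input_A = keras.layers.Input(shape=[5], name="wide_input")
input_B = keras.layers.Input(shape=[6], name="deep_input")
hidden1 = keras.layers.Dense(30, activation="relu")(input_B)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.concatenate([input_A, hidden2])
output = keras.layers.Dense(1, name="output")(concat)
model = keras.models.Model(inputs=[input_A, input_B], outputs=[output])
```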
Using the Functional API (cont.)
• Compile, train, and evaluate the model, and then make
predictions:
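A sketch, splitting the features into the two subsets assumed above and passing one array per input:

```python
model.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1e-3))

X_train_A, X_train_B = X_train[:, :5], X_train[:, 2:]
X_valid_A, X_valid_B = X_valid[:, :5], X_valid[:, 2:]
X_test_A, X_test_B = X_test[:, :5], X_test[:, 2:]
X_new_A, X_new_B = X_test_A[:3], X_test_B[:3]

history = model.fit((X_train_A, X_train_B), y_train, epochs=20,
                    validation_data=((X_valid_A, X_valid_B), y_valid))
mse_test = model.evaluate((X_test_A, X_test_B), y_test)
y_pred = model.predict((X_new_A, X_new_B))
```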
Models with Multiple Outputs
• Reasons for having multiple outputs:
» The task may demand it.
» You have multiple independent tasks
based on the same data---multitask
classification.
» Add some auxiliary outputs for
regularization.
Models with Multiple Outputs (cont.)
• Each output will need its own loss function:

• Train the model, providing labels for each output (see the combined sketch after this list):

• Evaluate the outputs separately:

• Likewise, make predictions separately:
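A combined sketch for a model with a main output and an auxiliary output (attached to hidden2 for regularization); loss weights and epoch count are illustrative:

```python
output = keras.layers.Dense(1, name="main_output")(concat)
aux_output = keras.layers.Dense(1, name="aux_output")(hidden2)
model = keras.models.Model(inputs=[input_A, input_B],
                           outputs=[output, aux_output])

# One loss per output; weight the auxiliary loss less than the main one
model.compile(loss=["mse", "mse"], loss_weights=[0.9, 0.1], optimizer="sgd")

# Provide one set of labels per output (here, the same labels for both)
history = model.fit([X_train_A, X_train_B], [y_train, y_train], epochs=20,
                    validation_data=([X_valid_A, X_valid_B], [y_valid, y_valid]))

# evaluate() returns the total loss plus the individual losses
total_loss, main_loss, aux_loss = model.evaluate(
    [X_test_A, X_test_B], [y_test, y_test])

# predict() returns one array of predictions per output
y_pred_main, y_pred_aux = model.predict([X_new_A, X_new_B])
```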


Using the Subclassing API to Build
Dynamic Models
• Both the Sequential API and the Functional API
are declarative
» Advantages:
§ The model can easily be saved, cloned, and shared
§ its structure can be displayed and analyzed
§ the framework can infer shapes and check types, so errors
can be caught early
§ It’s also fairly easy to debug, since the whole model is a
static graph of layers.
» Disadvantage:
§ The models are static---you cannot build models that involve
loops, varying shapes, conditional branching, and other
dynamic behaviors.
Using the Subclassing API to Build
Dynamic Models (cont.)
• The Subclassing API: subclass the Model class, create the layers you need in
the constructor, and use them to perform the computations you want in the
call() method.
» Advantage: Imperative programming style---you can use for loops, if
statements, and low-level TensorFlow operations in call().
» Disadvantage: Keras cannot inspect the model's architecture, and it is harder to debug.
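A sketch of a Wide & Deep model written with the Subclassing API (names and layer sizes are illustrative):

```python
class WideAndDeepModel(keras.models.Model):
    def __init__(self, units=30, activation="relu", **kwargs):
        super().__init__(**kwargs)          # handles standard args such as name
        self.hidden1 = keras.layers.Dense(units, activation=activation)
        self.hidden2 = keras.layers.Dense(units, activation=activation)
        self.main_output = keras.layers.Dense(1)
        self.aux_output = keras.layers.Dense(1)

    def call(self, inputs):
        # Any imperative logic (loops, conditionals, etc.) could go here
        input_A, input_B = inputs
        hidden1 = self.hidden1(input_B)
        hidden2 = self.hidden2(hidden1)
        concat = keras.layers.concatenate([input_A, hidden2])
        return self.main_output(concat), self.aux_output(hidden2)

model = WideAndDeepModel()
```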
Saving and Restoring a Model
• When using the Sequential API or the Functional API, you can save a
trained Keras model:

• Keras will use the HDF5 format to save:
» The model’s architecture (including every layer’s hyperparameters)
» The values of all the model parameters for every layer (e.g., connection
weights and biases)
» The optimizer (including its hyperparameters and any state it may have)
» etc.
• To load the model:
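A sketch of saving and loading (the file name is illustrative):

```python
model.save("my_keras_model.h5")    # saves architecture, weights, and optimizer state

model = keras.models.load_model("my_keras_model.h5")
```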
Using Callbacks to Save Intermediate
Models during Training
• Remember to save models at regular intervals during a long training
session to avoid losing everything if your computer crashes.
• The fit() method accepts a callbacks argument that lets you specify a list of
objects that Keras will call at the start and end of training, at the start and
end of each epoch, and even before and after processing each batch.

• If you use a validation set during training, you can set
save_best_only=True when creating the ModelCheckpoint to implement
early stopping:
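A sketch of both uses of the ModelCheckpoint callback (file name and epoch counts are illustrative):

```python
# Save a checkpoint of the model at the end of every epoch
checkpoint_cb = keras.callbacks.ModelCheckpoint("my_keras_model.h5")
history = model.fit(X_train, y_train, epochs=10, callbacks=[checkpoint_cb])

# Keep only the model that performs best on the validation set
checkpoint_cb = keras.callbacks.ModelCheckpoint("my_keras_model.h5",
                                                save_best_only=True)
history = model.fit(X_train, y_train, epochs=10,
                    validation_data=(X_valid, y_valid),
                    callbacks=[checkpoint_cb])
model = keras.models.load_model("my_keras_model.h5")   # roll back to the best model
```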
Using Callbacks to Implement Early
Stopping and custom callbacks
• Another way to implement early stopping is to simply use the
EarlyStopping callback.

• If you need extra control, you can easily write your own
custom callbacks. For example,
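A sketch of the EarlyStopping callback and of a small custom callback (the patience value is illustrative; checkpoint_cb is the callback from the previous sketch):

```python
# Stop training when the validation loss has not improved for 10 epochs,
# and roll back to the best weights seen so far
early_stopping_cb = keras.callbacks.EarlyStopping(patience=10,
                                                  restore_best_weights=True)
history = model.fit(X_train, y_train, epochs=100,
                    validation_data=(X_valid, y_valid),
                    callbacks=[checkpoint_cb, early_stopping_cb])

# A custom callback: print the ratio of validation loss to training loss
class PrintValTrainRatioCallback(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        print("\nval/train: {:.2f}".format(logs["val_loss"] / logs["loss"]))
```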
Using TensorBoard for Visualization
• TensorBoard is a great interactive visualization
tool that you can use to
» view the learning curves during training
» compare learning curves between multiple runs
» visualize the computation graph
» analyze training statistics
» view images generated by your model
» visualize complex multidimensional data projected
down to 3D and automatically clustered for you
» etc.
Visualizing Learning Curves with TensorBoard
Using TensorBoard
• To use TensorBoard, you must modify your program so that it outputs the
data you want to visualize to special binary log files called event files.
• Each binary data record is called a summary.
• The TensorBoard server will monitor the log directory, and it will
automatically pick up the changes and update the visualizations.
• In general, you want to point the TensorBoard server to a root log
directory and configure your program so that it writes to a different
subdirectory every time it runs.
• Define the root log directory for TensorBoard logs
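A sketch of defining the root log directory and a per-run subdirectory (directory names are illustrative):

```python
import os
import time

root_logdir = os.path.join(os.curdir, "my_logs")

def get_run_logdir():
    # A different subdirectory per run, named with the current date and time
    run_id = time.strftime("run_%Y_%m_%d-%H_%M_%S")
    return os.path.join(root_logdir, run_id)

run_logdir = get_run_logdir()   # e.g. ./my_logs/run_<timestamp>
```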
Using TensorBoard (cont.)
• Keras provides the TensorBoard() callback:

• The callback automatically creates the log directory, generates
event files, and writes summaries to them during training.
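A sketch of using the callback (reusing run_logdir from the previous sketch; the epoch count is illustrative):

```python
tensorboard_cb = keras.callbacks.TensorBoard(run_logdir)
history = model.fit(X_train, y_train, epochs=30,
                    validation_data=(X_valid, y_valid),
                    callbacks=[tensorboard_cb])
```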
• The directory structure contains one subdirectory per run, each with separate training and validation event files.
Using TensorBoard (cont.)
• Start the TensorBoard server by running a command in a
terminal:

• Once the server is up, you can open a web browser and go to
http://localhost:6006
• To use TensorBoard directly within Jupyter:
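The exact commands on the slide are not shown; typical commands, assuming the ./my_logs directory used above, are sketched here as comments:

```python
# In a terminal (with the virtual environment activated), start the server:
#   tensorboard --logdir=./my_logs --port=6006
# Then open http://localhost:6006 in a browser.
#
# Inside a Jupyter notebook, use the TensorBoard magics instead:
#   %load_ext tensorboard
#   %tensorboard --logdir=./my_logs
```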
Using TensorBoard (cont.)
• TensorFlow offers a lower-level API in the tf.summary package.
» E.g., you can create a SummaryWriter using the create_file_writer() function
and use this writer as a context to log scalars, histograms, images, audio,
and text, all of which can then be visualized using TensorBoard.
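A sketch of the lower-level API (reusing the run-directory helper from the earlier sketch; the logged values are arbitrary examples):

```python
import numpy as np
import tensorflow as tf

test_logdir = get_run_logdir()
writer = tf.summary.create_file_writer(test_logdir)
with writer.as_default():
    for step in range(1, 1000 + 1):
        # Log a scalar and a histogram at each step
        tf.summary.scalar("my_scalar", np.sin(step / 10), step=step)
        data = (np.random.randn(100) + 2) * step / 100   # some random data
        tf.summary.histogram("my_hist", data, buckets=50, step=step)
```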
What we’ve learned so far
• We learnt about the history of neural nets research.
• What an MLP is and how you can use it for
classification and regression
• How to use tf.keras’s Sequential API to build MLPs
• How to use the Functional API or the Subclassing API to
build more complex model architectures
• How to save and restore a model
• How to use callbacks for checkpointing, early stopping,
and more.
• How to use TensorBoard for visualization.
Fine-Tuning Neural Network Hyperparameters
• There are many hyperparameters to tweak:
» E.g., the number of layers, the number of neurons per layer, the type
of activation function to use in each layer, the weight initialization
logic
• How do you know what combination of hyperparameters is
the best?
• One option is to try many combinations of hyperparameters
and see which one works best on the validation set (or use K-fold
cross-validation).
» E.g., use GridSearchCV or RandomizedSearchCV to explore the
hyperparameter space.
Fine-Tuning Hyperparameters (cont.)

• Step 1: Wrap our Keras models in objects that mimic regular
Scikit-Learn regressors.
• Step 2: Create a KerasRegressor based on this build_model()
function:
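A sketch of both steps (the hyperparameter defaults and input_shape=[8] for California housing are illustrative; keras.wrappers.scikit_learn ships with the tf.keras versions the slides target, while newer code typically uses the scikeras package instead):

```python
def build_model(n_hidden=1, n_neurons=30, learning_rate=3e-3, input_shape=[8]):
    # Build and compile a simple regression MLP from a few hyperparameters
    model = keras.models.Sequential()
    model.add(keras.layers.InputLayer(input_shape=input_shape))
    for layer in range(n_hidden):
        model.add(keras.layers.Dense(n_neurons, activation="relu"))
    model.add(keras.layers.Dense(1))
    optimizer = keras.optimizers.SGD(learning_rate=learning_rate)
    model.compile(loss="mse", optimizer=optimizer)
    return model

keras_reg = keras.wrappers.scikit_learn.KerasRegressor(build_model)
```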
Fine-Tuning Hyperparameters (cont.)

• Step 3: Use the KerasRegressor object like a regular Scikit-Learn
regressor for training, evaluation, and prediction.
» Any extra parameter you pass to the fit() method will get passed to
the underlying Keras model.
» The score will be the opposite of the MSE because Scikit-Learn wants
scores, not losses (i.e., higher should be better).

• Then use GridSearchCV() to perform a grid search (a sketch follows):
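A sketch of Step 3 and of the grid search (the parameter grid values are illustrative; extra fit() arguments are forwarded to the underlying Keras model):

```python
keras_reg.fit(X_train, y_train, epochs=100,
              validation_data=(X_valid, y_valid),
              callbacks=[keras.callbacks.EarlyStopping(patience=10)])
mse_test = -keras_reg.score(X_test, y_test)   # score is the negative of the MSE
y_pred = keras_reg.predict(X_test[:3])

from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_hidden": [0, 1, 2, 3],
    "n_neurons": [10, 30, 50],
    "learning_rate": [3e-4, 3e-3, 3e-2],
}
grid_search_cv = GridSearchCV(keras_reg, param_grid, cv=3)
grid_search_cv.fit(X_train, y_train, epochs=100,
                   validation_data=(X_valid, y_valid),
                   callbacks=[keras.callbacks.EarlyStopping(patience=10)])
grid_search_cv.best_params_   # the best hyperparameter combination found
```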


Fine-Tuning Hyperparameters (cont.)
• Use a randomized search rather than grid search if there are
too many combinations of hyperparameters.

• After exploring for many hours, the search returns the best combination of
hyperparameters it found on the validation set.
Beyond Randomized search
• When training with randomized search is slow, first run a quick
random search using wide ranges of hyperparameter values, then
run another search using smaller ranges of values centered on the
best ones found during the first run, and so on.
• Some Python libraries you can use to optimize hyperparameters:
» e.g., Hyperopt, Hyperas, kopt, Talos, Keras Tuner, Scikit-Optimize (skopt), Spearmint,
Hyperband, Sklearn-Deap.
• Some companies offer services for hyperparameter optimization:
» e.g., Google Cloud AI Platform’s hyperparameter tuning service, Arimo and SigOpt,
and CallDesk’s Oscar.
• Google's AutoML does not just search for hyperparameters but also looks for the
best neural network architecture for the problem.
Number of Hidden Layers
• Theoretically, you can use a shallow neural network to model even the
most complex functions, provided it has enough neurons.
• But deep networks have a much higher parameter efficiency than shallow
ones for complex problems.
» Real-world data is often structured in such a hierarchical way, and deep neural networks
automatically take advantage of this fact.
• Not only does this hierarchical architecture help DNNs converge faster to a
good solution, but it also improves their ability to generalize to new
datasets (i.e., transfer learning)
• Very complex tasks, such as large image classification or speech
recognition, typically require networks with hundreds of layers and they
need a huge amount of training data.
» It is more common to reuse parts of a pretrained state-of-the-art network that performs
these tasks.
Number of Neurons per Hidden Layer
• The number of neurons in the input and output layers is
determined by the type of input and output your task requires.
» e.g., the MNIST task requires 28 × 28 = 784 input neurons and 10 output
neurons.
• As for the hidden layers, it used to be common to size them to form
a pyramid, with fewer and fewer neurons at each layer.
• You can try increasing the number of neurons gradually until the
network starts overfitting.
• The “stretch pants” approach: pick a model with more layers and
neurons than you actually need, then use early stopping and other
regularization techniques to prevent it from overfitting.
» Avoid bottleneck layers that could ruin your model.
• In general you will get more bang for your buck by increasing the
number of layers instead of the number of neurons per layer.
Tuning the Learning Rate
• Learning rate is arguably the most important hyperparameter.
• One way to find a good learning rate is to train the model for a few
hundred iterations, starting with a very low learning rate (e.g., 10⁻⁵)
and gradually increasing it up to a very large value (e.g., 10).
» This is done by multiplying the learning rate by a constant factor at each
iteration (e.g., by exp(log(10⁶)/500) to go from 10⁻⁵ to 10 in 500 iterations).
• If you plot the loss as a function of the learning rate (using a log
scale for the learning rate), you should see it dropping at first.
» But after a while, the learning rate will be too large, so the loss will shoot
back up
• The optimal learning rate will be a bit lower than the point at which
the loss starts to climb (typically about 10 times lower than the
turning point).
• You can then reinitialize your model and train it normally using this
good learning rate.
Tuning Optimizer, Batch Size, Activation
Functions, and Number of Iterations
• Choosing a better optimizer than plain old Mini-batch
Gradient Descent is quite important.
• The main benefit of using large batch sizes is that hardware
accelerators like GPUs can process them efficiently, so the
training algorithm will see more instances per second.
» But some researchers have reported that large batch sizes often lead
to training instabilities, and the resulting model may not generalize
as well as a model trained with a small batch size.
• There are activation functions better than ReLU
• In most cases, the number of training iterations does not
actually need to be tweaked: just use early stopping
instead.
Conclusions
• This concludes our introduction to artificial neural networks
and their implementation with Keras.
• In the next lecture, we will discuss techniques to train very
deep nets.
» But we will not talk about customizing models using TensorFlow's
lower-level API, nor about loading and preprocessing data efficiently
using the Data API.
• After that we will dive into other popular neural network
architectures: convolutional neural networks for image
processing and recurrent neural networks for sequential
data.
» But we will skip autoencoders for representation learning, and
generative adversarial networks to model and generate data.
