Python Deep Learning: Understand How Deep Neural Networks Work and Apply Them To Real-World Tasks 3rd Edition Vasilev Ebook All Chapters PDF
Ivan Vasilev
BIRMINGHAM—MUMBAI
Python Deep Learning
Copyright © 2023 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted
in any form or by any means, without the prior written permission of the publisher, except in the case
of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information
presented. However, the information contained in this book is sold without warranty, either express
or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable
for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and
products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot
guarantee the accuracy of this information.
Published by
Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-83763-850-5
www.packtpub.com
Contributors
I would like to thank my wife, Anita, and daughter, Ananya, for giving me the time and space to
review this book.
Table of Contents
Preface
2 Neural Networks
Technical requirements
The need for NNs
The math of NNs
Linear algebra
An introduction to probability
Differential calculus
An introduction to NNs
Units – the smallest NN building block
Layers as operations
Multi-layer NNs
Activation functions
The universal approximation theorem
Training NNs
GD
Backpropagation
A code example of an NN for the XOR function
Summary
3 Deep Learning Fundamentals
Technical requirements
Introduction to DL
Improved activation functions
DNN regularization
5 Advanced Computer Vision Applications
Technical requirements
Transfer learning (TL)
Transfer learning with PyTorch
Transfer learning with Keras
Object detection
Approaches to object detection
Object detection with YOLO
Object detection with Faster R-CNN
Introducing image segmentation
Semantic segmentation with U-Net
Instance segmentation with Mask R-CNN
Image generation with diffusion models
Introducing generative models
Denoising Diffusion Probabilistic Models
Summary
7 The Attention Mechanism and Transformers
Technical requirements
Introducing seq2seq models
Understanding the attention mechanism
8 Exploring Large Language Models in Depth
Technical requirements
Introducing LLMs
LLM architecture
LLM attention variants
Prefix decoder
Transformer nuts and bolts
Models
Training LLMs
Training datasets
Pre-training properties
FT with RLHF
Emergent abilities of LLMs
Introducing Hugging Face Transformers
Summary
9 Advanced Applications of Large Language Models
Technical requirements
Classifying images with Vision Transformer
Using ViT with Hugging Face Transformers
Understanding the DEtection TRansformer
Using DetR with Hugging Face Transformers
Generating images with stable diffusion
Autoencoder
Conditioning transformer
Diffusion model
Using stable diffusion with Hugging Face Transformers
Exploring fine-tuning transformers
Harnessing the power of LLMs with LangChain
Using LangChain in practice
Summary
Index
Chapter 5, Advanced Computer Vision Applications, discusses applying convolutional networks for
advanced computer vision tasks – object detection and image segmentation. It will also explore using
NNs to generate new images.
Chapter 6, Natural Language Processing and Recurrent Neural Networks, introduces the main paradigms
and data processing pipeline of NLP. It will also explore recurrent NNs and their two most popular
variants – long short-term memory and gated recurrent units.
Chapter 7, The Attention Mechanism and Transformers, introduces one of the most significant recent
deep learning advances – the attention mechanism and the transformer model based around it.
Chapter 8, Exploring Large Language Models in Depth, introduces transformer-based LLMs. It will
discuss their properties and what makes them different than other NN models. It will also introduce
the Hugging Face Transformers library.
Chapter 9, Advanced Applications of Large Language Models, discusses using LLMs for computer vision
tasks. It will focus on classic tasks such as image classification and object detection, but it will also
explore state-of-the-art applications such as text-to-image generation. It will introduce the LangChain
framework for LLM-driven application development.
Chapter 10, Machine Learning Operations (MLOps), will introduce various libraries and techniques
for easier development and production deployment of NN models.
Some code examples in the book may use additional packages not listed in the table. You can see the
full list (with versions) in the requirements.txt file in the book’s GitHub repo.
If you are using the digital version of this book, we advise you to type the code yourself or access
the code from the book’s GitHub repository (a link is available in the next section). Doing so will
help you avoid any potential errors related to the copying and pasting of code.
Conventions used
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file
extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “Use
opencv-python to read the RGB image located at image_file_path.”
A block of code is set as follows:
def build_fe_model():
    """Create feature extraction model from the pre-trained model
    ResNet50V2"""
    import tensorflow.keras
When we wish to draw your attention to a particular part of a code block, the relevant lines or items
are set in bold:
import io
image = Image.open(io.BytesIO(response.content))
image.show()
Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at
customercare@packtpub.com and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen.
If you have found a mistake in this book, we would be grateful if you would report this to us. Please
visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would
be grateful if you would provide us with the location address or website name. Please contact us at
[email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you
are interested in either writing or contributing to a book, please visit authors.packtpub.com.
We’ll start this part by introducing you to basic machine learning theory and concepts. Then, we’ll
follow with a thorough introduction to neural networks – a special type of machine learning algorithm.
We’ll discuss the mathematical principles behind them and learn how to train them. Finally, we’ll
make the transition from shallow to deep networks.
This part has the following chapters:
• Introduction to ML
• Different ML approaches
• Neural networks
• Introduction to PyTorch
Technical requirements
We’ll implement the example in this chapter using Python and PyTorch. If you don’t have an
environment set up with these tools, fret not – the example is available as a Jupyter notebook on
Google Colab. You can find the code examples in the book’s GitHub repository:
https://github.com/PacktPublishing/Python-Deep-Learning-Third-Edition/tree/main/Chapter01.
Machine Learning – an Introduction
Introduction to ML
ML is often associated with terms such as big data and artificial intelligence (AI). However, both are
quite different from ML. To understand what ML is and why it’s useful, it’s important to understand
what big data is and how ML applies to it.
Big data is a term used to describe huge datasets that are created as the result of large increases in data
that is gathered and stored. For example, this may be through cameras, sensors, or internet social sites.
Humans alone are unable to grasp, let alone analyze, such huge amounts of data, and ML techniques
are used to make sense of these very large datasets. ML is the tool that’s used for large-scale data
processing. It is well suited to complex datasets that have huge numbers of variables and features. One
of the strengths of many ML techniques, and DL in particular, is that they perform best when used
on large datasets, thus improving their analytic and predictive power. In other words, ML techniques,
and DL NNs in particular, learn best when they can access large datasets where they can discover
patterns and regularities hidden in the data.
On the other hand, ML’s predictive ability can be successfully adapted to AI systems. ML can be thought
of as the brain of an AI system. AI can be defined (though this definition may not be unique) as a
system that can interact with its environment. Also, AI machines are endowed with sensors that enable
them to know the environment they are in and tools with which they can relate to the environment.
Therefore, ML is the brain that allows the machine to analyze the data ingested through its sensors
to formulate an appropriate answer. A simple example is Siri on an iPhone. Siri hears the command
through its microphone and outputs an answer through its speakers or its display, but to do so, it
needs to understand what it’s being told. Similarly, driverless cars will be equipped with cameras,
GPS systems, sonars, and LiDAR, but all this information needs to be processed to provide a correct
answer. This may include whether to accelerate, brake, or turn. ML is the information-processing
method that leads to the answer.
We’ve explained what ML is, but what about DL? For now, let’s just say that DL is a subfield of ML.
DL methods share some special common features. The most popular representatives of such methods
are deep NNs.
Different ML approaches
As we have seen, the term ML is used in a very general way and refers to the general techniques that
are used to extrapolate patterns from large sets, or it is the ability to make predictions on new data
based on what is learned by analyzing available known data. ML techniques can roughly be divided
into two core classes, while one more class is often added. Here are the classes:
• Supervised learning
• Unsupervised learning
• Reinforcement learning
Supervised learning
Supervised learning algorithms are a class of ML algorithms that use previously labeled data to learn
its features, so they can classify similar but unlabeled data. Let’s use an example to understand this
concept better.
Let’s assume that a user receives many emails every day, some of which are important business emails
and some of which are unsolicited junk emails, also known as spam. A supervised ML algorithm
will be presented with a large body of emails that have already been labeled by a teacher as spam or
not spam (this is called training data). For each sample, the machine will try to predict whether the
email is spam or not, and it will compare the prediction with the original target label. If the prediction
differs from the target, the machine will adjust its internal parameters in such a way that the next
time it encounters this sample, it will classify it correctly. Conversely, if the prediction is correct, the
parameters will stay the same. The more training data we feed to the algorithm, the better it becomes
(this rule has caveats, as we’ll see next).
In the example we used, the emails had only two classes (spam or not spam), but the same principles
apply to tasks with arbitrary numbers of classes (or categories). For example, Gmail, the free email
service by Google, allows the user to select up to five categories: Primary, Social, Promotions,
Updates, and Forums.
To summarize, the ML task, which maps a set of input values to a finite number of classes, is
called classification.
In some cases, the outcome may not necessarily be discrete, and we may not have a finite number
of classes to classify our data into. For example, we may try to predict the life expectancy of a group
of people based on their predetermined health parameters. In this case, the outcome is a numerical
value, and we don’t talk about classification but rather regression.
One way to think of supervised learning is to imagine we are building a function, f, defined over a
dataset, which comprises information organized by features. In the case of email classification, the
features can be specific words that may appear more frequently than others in spam emails. The use
of explicit sex-related words will most likely identify a spam email rather than a business/work email.
On the contrary, words such as meeting, business, or presentation are more likely to describe a work
email. If we have access to metadata, we may also use the sender’s information as a feature. Each email
will then have an associated set of features, and each feature will have a value (in this case, how many
times the specific word is present in the email’s body). The ML algorithm will then seek to map those
values to a discrete range that represents the set of classes, or a real value in the case of regression.
The definition of the f function is as follows:
f : space of features → classes = (discrete values or real values)
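The word-count features described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's code: the vocabulary, the sample email, and the function name extract_features are all assumptions chosen for the example.

```python
import re
from collections import Counter

# Hypothetical vocabulary: the words we decide to use as features.
VOCABULARY = ["meeting", "business", "presentation", "free", "winner"]


def extract_features(email_body, vocabulary=VOCABULARY):
    """Map an email body to a vector of word counts, one per vocabulary word."""
    words = re.findall(r"[a-z']+", email_body.lower())
    counts = Counter(words)
    # A Counter returns 0 for words that never appear in the email.
    return [counts[word] for word in vocabulary]


# Made-up email text, used only to show the mapping.
email = "Free prize! You are a winner. Claim your free gift now."
features = extract_features(email)
print(features)
```

Each email becomes a point in a feature space with one dimension per vocabulary word, and the function f then maps that point to a class.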
In later chapters, we’ll see several examples of either classification or regression problems. One such
problem we’ll discuss is classifying handwritten digits of the Modified National Institute of Standards
and Technology (MNIST) database (http://yann.lecun.com/exdb/mnist/). When given
a set of images representing 0 to 9, the ML algorithm will try to classify each image in one of the 10
classes, wherein each class corresponds to one of the 10 digits. Each image is 28×28 (= 784) pixels in
size. If we think of each pixel as one feature, then the algorithm will use a 784-dimensional feature
space to classify the digits.
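To make the 784-dimensional feature space concrete, here is a small sketch that flattens a 28×28 grid of pixel intensities into a single feature vector. The image is a synthetic stand-in (a single vertical stroke), not real MNIST data, and the helper name flatten is an assumption made for this example.

```python
HEIGHT, WIDTH = 28, 28

# Synthetic stand-in for one grayscale image (not real MNIST data):
# a blank canvas with one vertical stroke, stored as rows of intensities.
image = [[0] * WIDTH for _ in range(HEIGHT)]
for row in range(4, 24):
    image[row][14] = 255


def flatten(image):
    """Concatenate the rows of a 2D image into one flat feature vector."""
    return [pixel for row in image for pixel in row]


features = flatten(image)
print(len(features))  # number of features: one per pixel
```

Pixel (row, column) ends up at index row * 28 + column of the flat vector, so every pixel is a separate feature.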
The following figure depicts the handwritten digits from the MNIST dataset:
In the next sections, we’ll talk about some of the most popular classical supervised algorithms. The
following is by no means an exhaustive list or a thorough description of each ML method; it’s
a simple review meant to provide you with a flavor of the different ML techniques. For a more
thorough treatment, we recommend referring to the book Python Machine Learning, by Sebastian Raschka
(https://www.packtpub.com/product/python-machine-learning-third-edition/9781789955750).
A regression algorithm is a type of supervised algorithm that uses features of the input data to predict
a numeric value, such as the cost of a house, given certain features, such as size, age, number of
bathrooms, number of floors, and location. Regression analysis tries to find the value of the parameters
for the function that best fits an input dataset.
In a linear regression algorithm, the goal is to minimize a cost function by finding appropriate
parameters for the function over the input data that best approximates the target values. A cost
function is a function of the error – that is, how far we are from getting a correct result. A popular
cost function is the mean squared error (MSE), where we take the square of the difference between
the expected value and the predicted result for each input example. The mean of these squared
differences over all the input examples gives us the error of the algorithm and represents the cost function.
Say we have a 100-square-meter house that was built 25 years ago with three bathrooms and two
floors. Let’s also assume that the city is divided into 10 different neighborhoods, which we’ll denote
with integers from 1 to 10, and say this house is located in the area denoted by 7. We can parameterize
this house with a five-dimensional vector, x = (x1, x2, x3, x4, x5) = (100, 25, 3, 2, 7). Say that we also
know that this house has an estimated value of $100,000 (in today’s world, this would be enough for
just a tiny shack near the North Pole, but let’s pretend). What we want is to create a function, f, such
that f(x) = 100000.
A note of encouragement
Don’t worry if you don’t fully understand some of the terms in this section. We’ll discuss vectors,
cost functions, linear regression, and gradient descent in more detail in Chapter 2. We will also
see that training NNs and linear/logistic regressions have a lot in common. For now, you can
think of a vector as an array. We’ll denote vectors with boldface font – for example, x. We’ll
denote the vector elements with italic font and subscript – for example, xi.
E += (x^(i) ⋅ w − t^(i))^2  # here t^(i) is the real house price
MSE = E / (total number of samples)  # Mean Square Error
use gradient descent to update the weights w based on MSE until MSE falls below threshold
First, we iterate over the training data to compute the cost function, MSE. Once we know the value
of MSE, we’ll use the gradient descent algorithm to update the weights of the vector, w. To do this,
we’ll calculate the derivatives of the cost function with respect to each weight, wi. In this way, we’ll know
how the cost function changes (increases or decreases) with respect to wi. Then we’ll update that weight’s
value accordingly.
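The training procedure just described can be sketched in plain Python. This is a minimal sketch under stated assumptions rather than the book's implementation: the toy dataset, the learning rate, the stopping threshold, and the function names are all chosen for illustration. A constant feature of 1 is appended to each sample so that one of the weights acts as a bias term.

```python
import random


def dot(x, w):
    """Dot product of two equal-length vectors."""
    return sum(xi * wi for xi, wi in zip(x, w))


def mse(samples, targets, w):
    """Mean squared error of the linear model x . w over the training set."""
    return sum((dot(x, w) - t) ** 2 for x, t in zip(samples, targets)) / len(samples)


def train(samples, targets, learning_rate=0.1, threshold=1e-6,
          max_steps=10_000, seed=0):
    """Fit weights w with gradient descent until MSE falls below threshold."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in samples[0]]  # random initial weights
    n = len(samples)
    for _ in range(max_steps):
        if mse(samples, targets, w) < threshold:
            break
        # dMSE/dw_j = (2 / n) * sum_i (x_i . w - t_i) * x_ij
        gradient = [0.0] * len(w)
        for x, t in zip(samples, targets):
            error = dot(x, w) - t
            for j, xj in enumerate(x):
                gradient[j] += 2 * error * xj / n
        # Step against the gradient to reduce the cost.
        w = [wj - learning_rate * gj for wj, gj in zip(w, gradient)]
    return w


# Toy data (an assumption for illustration): targets generated from known
# weights, with the last feature fixed at 1 so its weight acts as a bias.
true_w = [3.0, -2.0, 0.5]
samples = [[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0],
           [1.0, 1.0, 1.0], [0.5, 0.25, 1.0]]
targets = [dot(x, true_w) for x in samples]

w = train(samples, targets)
final_mse = mse(samples, targets, w)
print(w, final_mse)
```

The features here are deliberately kept on a similar scale. With raw values such as 100 square meters next to 3 bathrooms, a single learning rate would need to be much smaller, which is one reason features are usually normalized before training.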
Previously, we demonstrated how to solve a regression problem with linear regression. Now, let’s take
a classification task: trying to determine whether a house is overvalued or undervalued. In this case,
the target data would be categorical [1, 0] – 1 for overvalued and 0 for undervalued. The price of the
house will be an input parameter instead of the target value as before. To solve the task, we’ll use logistic
regression. This is similar to linear regression but with one difference: in linear regression, the output
is x ⋅ w. However, here, the output will be a special logistic function
(https://en.wikipedia.org/wiki/Logistic_function), σ(x ⋅ w). This will squash the value of x ⋅ w in the (0:1)
interval. You can think of the logistic function’s output as a probability: the closer the result is to 1, the
more likely it is that the house is overvalued, and vice versa. Training is the same as with linear
regression, but the output of the function is in the (0:1) interval, and the labels are either 0 or 1.
Logistic regression is not a classification algorithm, but we can turn it into one. We just have to introduce
a rule that determines the class based on the logistic function’s output. For example, we can say that
a house is overvalued if the value of σ(x ⋅ w) > 0.5 and undervalued otherwise.
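A short sketch of the logistic function and a threshold rule follows. It uses the convention stated earlier that outputs closer to 1 mean overvalued (label 1). The weights and feature vectors are hypothetical values picked for illustration, not trained parameters.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify(x, w, threshold=0.5):
    """Return (label, probability): label 1 (overvalued) if sigma(x . w) > threshold."""
    probability = sigmoid(sum(xi * wi for xi, wi in zip(x, w)))
    label = 1 if probability > threshold else 0
    return label, probability


# Hypothetical weights and inputs, used only to show the rule in action.
w = [2.0, -1.0]
label_a, p_a = classify([1.0, 0.5], w)  # x . w = 1.5
label_b, p_b = classify([0.0, 1.0], w)  # x . w = -1.0
print(label_a, p_a, label_b, p_b)
```

The threshold of 0.5 corresponds to x ⋅ w = 0, so the rule simply checks the sign of the linear output. A different threshold can be chosen if one kind of mistake is more costly than the other.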
Multivariate regression
The regression examples in this section have a single numerical output. A regression analysis
can have more than one output. We’ll refer to such analysis as multivariate regression.
A support vector machine (SVM) is a supervised ML algorithm that’s mainly used for classification.
It is the most popular member of the kernel method class of algorithms. An SVM tries to find a
hyperplane, which separates the samples in the dataset.
Hyperplanes
A hyperplane is a plane in a high-dimensional space. For example, a hyperplane in a
one-dimensional space is a point, and in a two-dimensional space, it would just be a line. In
three-dimensional space, the hyperplane would be a plane, and we can’t visualize the hyperplane in
four-dimensional space, but we know that it exists.
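One common way to write a hyperplane is as the set of points x that satisfy w ⋅ x + b = 0, where w is a normal vector and b is an offset. This parameterization is an assumption brought in for illustration (it does not appear in the text above), and the values below are hypothetical. The sketch shows how the sign of w ⋅ x + b tells us on which side of the hyperplane a point lies, which is how a separating hyperplane assigns samples to two classes.

```python
def hyperplane_side(x, w, b):
    """Return +1 or -1 for the side of the hyperplane w . x + b = 0 on which
    the point x lies, or 0 if the point is exactly on the hyperplane."""
    value = sum(wi * xi for wi, xi in zip(w, x)) + b
    return (value > 0) - (value < 0)


# In two dimensions, the hyperplane is a line: x1 + x2 - 1 = 0.
w, b = [1.0, 1.0], -1.0

print(hyperplane_side([2.0, 2.0], w, b))  # point above the line
print(hyperplane_side([0.0, 0.0], w, b))  # point below the line
print(hyperplane_side([0.5, 0.5], w, b))  # point on the line
```

The same function works unchanged in any number of dimensions, since only the lengths of x and w change.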