
PyTorch, Datasets, and Models

Learning Objectives
By the end of this chapter, you should be able to:
1. Discuss the difference between supervised and unsupervised learning
2. Discuss the difference between software development and machine &
deep learning
3. Understand the general idea behind building and training a model
4. Identify commonly used terms in machine and deep learning and their
meanings
Deep Learning Framework
▪ Deep learning frameworks are software libraries (APIs, tools, abstractions) that
provide ML professionals with the tools needed to build and train deep learning models.
▪ The packages in these libraries include activation functions, layers, loss functions, and
optimizers that help create different architectures for deep neural networks.
▪ Examples are TensorFlow, Keras, Microsoft Cognitive Toolkit (CNTK), Scikit-learn, and
Theano.
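In PyTorch (introduced in the next section), these building blocks look like the following. This is a minimal sketch with made-up inputs and targets, not a complete training script:

import torch

layer = torch.nn.Linear(4, 2)          # a layer
activation = torch.nn.ReLU()           # an activation function
loss_fn = torch.nn.MSELoss()           # a loss function
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)  # an optimizer

x = torch.randn(8, 4)                  # a small batch of made-up inputs
y = torch.randn(8, 2)                  # made-up targets
loss = loss_fn(activation(layer(x)), y)
print(loss.item())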
What Is PyTorch?
▪ PyTorch is an open-source deep learning framework developed by Meta AI in
2016. It offers both modularity and flexibility, making it a capable tool for
everything from tinkering with innovative models to maintaining an industry-scale
application.
▪ PyTorch can be used to solve many types of problems, including
optimization problems.
▪ Suppose you want to estimate the fuel efficiency of cars based on their power and weight. You'll
need data for this. PyTorch can help you develop a model that predicts a car's
efficiency based on its power and weight. This is a specific type of
optimization problem called linear regression. It is a simple problem
that could be solved in Excel, but that simplicity makes it a useful thought experiment to
begin wrapping your head around the sorts of problems PyTorch can solve.
What Is PyTorch?
▪ But, what about really complex problems? What if you have hundreds
or thousands of columns in a spreadsheet? What if you have a
humongous amount of data (picture millions of rows in a
spreadsheet)?
▪ PyTorch uses an algorithm called gradient descent that is capable of
looking for solutions regardless of how complex the problem is, or
how massive the amount of data is. It starts from a random point and,
little by little, works on improving the solution, one baby step at a time.
This is performed by PyTorch's autograd module, which does the
heavy lifting for you so you can focus on more interesting matters.
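To make that concrete, here is a minimal sketch of gradient descent with autograd; the data, learning rate, and number of steps are made up for illustration:

import torch

# Made-up data: y is exactly 2 times x, and we want the model to discover that factor
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = torch.tensor([2.0, 4.0, 6.0, 8.0])

# Start from a random point, asking autograd to track gradients for us
w = torch.randn(1, requires_grad=True)

lr = 0.01  # learning rate: the size of each "baby step"
for _ in range(200):
    y_hat = w * x                     # the current guess
    loss = ((y_hat - y) ** 2).mean()  # how far off the guess is
    loss.backward()                   # autograd computes the gradient of the loss
    with torch.no_grad():
        w -= lr * w.grad              # improve the solution, little by little
        w.grad.zero_()                # reset the gradient for the next step

print(w.item())  # approaches 2.0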
The PyTorch Ecosystem
The range of fields and applications that can be powered by PyTorch is extensive:
▪ Computer Vision (Kornia, Medical Open Network for Artificial Intelligence (MONAI), OpenMMLab, PyTorchVideo,
Detectron2, PyTorch3D)
▪ machine and vehicular object detection, tracking, identification, and avoidance
▪ medical image analysis and diagnosis
▪ image recognition, classification, and tagging
▪ video classification and detection
▪ Natural Language Processing (AllenNLP, NeMo, Transformers, flair)
▪ text classification, summarization, generation, and translation virtual assistants
▪ sentiment analysis
▪ question answering and search engines
▪ Graph Neural Networks (torchdrug, PyTorch Geometric, DGL)
▪ molecule fingerprinting
▪ drug discovery and protein interface prediction
▪ social network analysis
▪ Spatio-Temporal Graph Neural Networks (PyTorch Geometric Temporal)
▪ route planning and navigation
▪ traffic control and management
▪ inventory planning
▪ logistics optimization
The PyTorch Ecosystem
The range of fields and applications that can be powered by PyTorch is extensive:
▪ Gaussian Processes (GPyTorch)
▪ time series modeling and anomaly detection
▪ risk management
▪ control engineering and robotics
▪ Reinforcement Learning (PFRL)
▪ industry automation and robotics manipulation
▪ dynamic treatment regimes (DTRs) in healthcare
▪ real-time bidding
▪ strategy games
▪ Recommender Systems (TorchRec)
▪ Interpretability and Explainability (Captum)
▪ Privacy-Preserving Machine Learning (CrypTen, PySyft, Opacus)
▪ Federated Learning - collaboratively training a model without the need to centralize the data (PySyft, Flower)

In a nutshell, there are libraries and pre-trained models available for a wide
range of topics and applications.
Hugging Face
▪ While not a part of PyTorch, Hugging Face is widely known for its
large open-source community and is a central hub for models and
Python libraries, especially in the area of natural language
processing (NLP).
▪ Because it is so often used alongside PyTorch, it belongs in any
discussion of the PyTorch ecosystem and we will make use of it in
this course.
Types of Machine Learning
There are three general categories under the umbrella of machine learning:
supervised learning, unsupervised learning, and reinforcement learning.
Supervised vs. Unsupervised Learning
▪ The majority of models and algorithms in machine and deep learning fall
into one of these classes: supervised learning and unsupervised
learning.
▪ To train a model using supervised learning is like actively teaching a
toddler. You can show them a picture of a zebra and ask them what they
see. They reply "it's a horse", because they've never seen a zebra before.
You tell them it was a good guess, but the right answer is "zebra". You're
supervising their learning by providing the right answer to every question.
Hopefully, with enough examples and their corresponding answers, your
model will also learn.
Supervised Learning
Supervised learning can be used for two major tasks:
• Classification
• Regression
Classification tasks predict labels that are a category or class, like spam or
not spam.
Regression tasks predict labels that are numeric, like a house being valued
at $456,000.
Questions
Match the example with the appropriate type of supervised learning.
▪ An app to identify the plant type from an image.
▪ An e-commerce site predicting prices for their products.
▪ An e-commerce site recommending products based on other customers'
purchase similarities.
▪ A car dealership scoring its customers on long-term value as 1, 2, 3,
or 4.
▪ A medical research company is predicting drug dosages for patients.
▪ A design software that removes noise from visual data to improve picture
quality.
Supervised vs. Unsupervised Learning
▪ Unsupervised learning, on the other hand, is like giving a toddler
a bunch of building blocks and asking them to organize them
without any specific instructions on how to do it.
▪ Maybe the toddler will split the blocks by color. Maybe they will
split the blocks by size. It depends on which feature of the
blocks, the color or the size, is more noticeable to them. Notice
that you're not giving the toddler any "right" answers. It's exactly
the same with these models and algorithms: they will look for
similarities in the data and use them to split it into groups. We won't
be covering those in this course.
Questions
Match the example with the correct type of machine learning.
▪ You ask a child to sort toys, and they can do it based on any
characteristics they like.
▪ You ask a child to sort toys into stuffed toys, trucks and building
materials.
Reinforcement Learning
▪ Reinforcement learning (RL) systems learn from
continuous experience rather than labels or historical
patterns.
▪ Learning happens through interaction.
▪ For every action the system takes, a reward is given, and
the system's goal is to maximize the accumulated
rewards.
Software Development vs Machine and Deep
Learning
A function that converts temperature from Celsius to Fahrenheit is given by:

def celsius2fahrenheit(celsius):
    fahrenheit = celsius * (9 / 5) + 32
    return fahrenheit
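For instance, calling it with the boiling point of water:

print(celsius2fahrenheit(100))  # 212.0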

That's "Software 1.0": given the arguments and the rules, it will produce the desired output.
Software Development vs Machine and Deep
Learning
Things start to fall apart quickly if the rules aren't clear. Let's go back to our toddler and
the zebra. If they're seeing the zebra for the first time, they may ask you: "is this a
horse?" And you'll explain to them that it's not a horse, even though both animals have
four legs. Zebras have stripes. That should do the trick for the toddler. But what about
code? Can you write a function that takes an image and returns if it's a zebra or not?
def contains_zebra(image):
    # write your code here
    is_zebra = ...
    return is_zebra

How do you even start coding that?! We challenge you to try to come up with a set of rules
to determine if an image shows a zebra in it or not. That's impossibly hard using traditional
software development.
Software Development vs Machine and Deep
Learning
Still, the toddler can easily learn that after a couple of tries. How? They're learning by
example, not by rules. You show the toddler the picture of the zebra (the input), and you
tell them the answer (the desired output). Their brain comes up with the rules, even if no
one can write them down or implement them in a function.
That's "Software 2.0". You don't know the rules: you only have inputs and outputs.
Unlike toddlers, however, it won't take just a couple of tries for a model to learn. It will
need thousands, if not millions, of examples. That's just how models are.
"Hello Model"
▪ Let's train a simple model from scratch to give you a better idea of the process. It is the "Hello
World" of training models or, better yet, the "Hello Model".
▪ We have four cars: "Chevrolet Chevelle Malibu", "Buick Skylark 320", "AMC Rebel SST", and "Ford
Torino".
▪ For each car, we know its power in HP: 130, 165, 150, and 140, respectively. We also know that
the more powerful a car is, the less fuel-efficient it tends to be. But how
inefficient? It would be interesting to devise a rule to figure out, given the car's power, how
efficient (or not) we should expect it to be.
▪ So, let's drive them around and collect some fuel consumption data! Once data is collected, we
get the desired output of our model for the cars we drove around. The fuel consumption for each
one of the cars, in miles per gallon, is 18, 15, 16, and 17, respectively.
▪ We have data (power in HP), we have the desired output (fuel consumption in MPG), and we're
missing the rule that maps one into the other. Sounds like a task for "Software 2.0", that is, a
model.
▪ What's the simplest model you can come up with?
"Hello Model"
y = -0.086x + 29.075
MPG = -0.086 × HP + 29.075

What if we wanted to use our new model to estimate the fuel consumption of the Ford
Torino? It has 140 HP. Plugging it in: MPG = -0.086 × 140 + 29.075 = 17.035.

In the model above, we have:


▪ a feature, which is an attribute of the car, power (in HP)
▪ a target, the fuel consumption of the car (in MPG)
▪ two parameters, 29.075 and -0.086
▪ the second parameter, -0.086, may also be called a weight since it's a
multiplying factor for a feature (HP)
▪ the model's estimate for the Ford Torino, 17.035, is called a prediction
▪ the difference between the prediction (17.035) and Torino's actual
consumption (17) is called an error
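As a quick sketch, these two parameter values can be recovered with ordinary least squares in plain Python (numbers rounded):

hp  = [130.0, 165.0, 150.0, 140.0]   # power in HP (feature)
mpg = [18.0, 15.0, 16.0, 17.0]       # fuel consumption in MPG (target)

mean_hp  = sum(hp) / len(hp)
mean_mpg = sum(mpg) / len(mpg)

# Ordinary least squares with a single feature: weight = cov(hp, mpg) / var(hp)
weight = sum((x - mean_hp) * (y - mean_mpg) for x, y in zip(hp, mpg)) \
         / sum((x - mean_hp) ** 2 for x in hp)
bias = mean_mpg - weight * mean_hp

print(round(weight, 3), round(bias, 3))  # about -0.086 and 29.075
print(round(weight * 140 + bias, 3))     # about 17.04; the rounded coefficients above give 17.035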
Datasets - Image
A dataset contains the data you'll use to train a model. In some cases, the elements are easy to identify. For
computer vision problems, a dataset is made of images. Each individual image, however, may be referred
to in a variety of ways:
▪ a data point
▪ a sample
▪ an instance
One image, one data point, one sample, or one instance. Regardless of the name, it's easy to make an
association between an image and its name. Moreover, even though images are made of thousands of
pixels, we don't take their individual values as meaningful.
Datasets – Tabular Data
Tabular data, that is, spreadsheet-like data such as a table, is a completely different matter. In a table, a row usually
represents a data point, and each individual value contained in every column is meaningful, and may be referred to in a
variety of ways as well:
▪ an attribute
▪ a feature
▪ a dimension
The first two, feature and attribute, are used somewhat interchangeably. In most cases, an attribute represents a
characteristic or property that's observed or measured (such as color, size, weight, etc.). The term feature may be used
to represent these things, but its usage is generally broader, and it may refer to transformed attributes or completely
abstract values such as intermediate values produced by models or algorithms. Therefore, you'll see features
everywhere: feature extraction, feature transformation, feature engineering, and so on.
Datasets – Tabular Data
The third term is dimension.
You can think of it as an even more generic term to refer to a value or a column. If a data
point has ten values associated with it, such as ten columns in a table (where each data
point is a row), it's possible to say the data point has 10 dimensions, with each
column/attribute/feature representing a different dimension.
Datasets
❑ Dimension itself, however, is one of those terms that may represent different things:
❑ one attribute/feature/column
❑ the number of levels in a nested array
❑ We already discussed the first meaning dimension has, so let's focus on the second. You can think of it
almost as a "physical" dimension.
❑ For example, a single row of values is like a horizontal line, and therefore it has only one dimension, from
left to right. A single column of values is like a vertical line, and it also has only one dimension, from top to
bottom.
❑ A table containing many rows and columns is like a rectangle, and therefore it has two dimensions.
An image also has two dimensions, height and width.
Datasets
Now, imagine that each table is printed on a square sheet of paper, let's say, one
table per day. If you stack many, many, sheets of paper, it will eventually look like a
cube, and therefore it has three dimensions. Colored images are also 3-D, but not
in a physical sense, obviously. Each image is internally represented by three
channels, one for red, one for green, and another one for blue (thus RGB). These
channels work like stacked sheets of paper, thus making colored images three-
dimensional: channel, height, and width.
Datasets - Image
One single image may be a data point.
What happens if we have many images, say, 32 of them, and we stack them up? We're in the
realm of 4D now. Unfortunately, we cannot easily draw analogies from the physical world
anymore, but we can represent the extra dimension as yet another nested level: image ID,
channel, height, and width. Luckily, this is usually as far as we go when it comes to nesting
arrays. It is quite unlikely that you'll ever see five or more levels when dealing with typical
tasks.
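A quick sketch of these shapes using PyTorch tensors; the sizes are arbitrary examples:

import torch

row   = torch.zeros(10)               # 1 dimension: a single row of ten values
table = torch.zeros(100, 10)          # 2 dimensions: 100 rows (data points) by 10 columns (features)
image = torch.zeros(3, 128, 128)      # 3 dimensions: channel (RGB), height, width
batch = torch.zeros(32, 3, 128, 128)  # 4 dimensions: image ID, channel, height, width

print(batch.shape)  # torch.Size([32, 3, 128, 128])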
Datasets
Features are used as inputs to a model but, as we've seen in the Software 2.0 paradigm, we also need to provide an
"answer" or "result" so the model may try to learn the relationship between the features and this "answer" or "result". As
you probably guessed, this one also goes by different names:
▪ a label
▪ a target
If the "answer" is a name, such as "dog", "cat", "red", or "down", it is usually called a label, even though it may also be
referred to as the target. If the "answer" is numerical, it is more likely to be referred to as a target. The term target,
however, may refer to a more complex answer, such as a combination of numerical values and names.
In tabular data, that is, when we're talking about tables or stacks of tables, the target or label is usually one of the columns
from the table. This particular column should be used exclusively as a target or label, and excluded from the set of
features.
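For example, here is a minimal sketch of this split using pandas; the column names and values are made up for illustration, with "mpg" playing the role of the target:

import pandas as pd

df = pd.DataFrame({
    "hp":     [130, 165, 150, 140],
    "weight": [3504, 3693, 3436, 3449],
    "mpg":    [18, 15, 16, 17],
})

target = df["mpg"]                    # this column is used exclusively as the target/label
features = df.drop(columns=["mpg"])   # every remaining column is used as a feature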
Artificial Neural Network (ANN) Definition
An ANN is an information processing model inspired by our understanding
of biological nervous systems. ANNs were developed as a generalization
of mathematical models of neural biology, based on the following assumptions:
1. Neurons are simple units in a nervous system at which
information processing occurs.
2. Incoming information consists of signals that are passed between
neurons through connection links.
3. Each connection link has a corresponding weight, which multiplies the
transmitted signal.
4. Each neuron applies an activation function to its net input (the
sum of its weighted input signals) to determine the output signal.

Biological Neuron (Source: Fundamental of Neural Networks: Architectures, Algorithms and Application By Laurene Fausett)
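A minimal sketch of assumptions 3 and 4 above in PyTorch; the numbers are arbitrary:

import torch

x = torch.tensor([0.5, -1.0, 2.0])   # incoming signals from three connected neurons
w = torch.tensor([0.8, 0.2, -0.5])   # each connection link has a corresponding weight

net_input = (w * x).sum()            # net input: the sum of the weighted input signals
output = torch.sigmoid(net_input)    # an activation function determines the output signal
print(output.item())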
Similarity/Analogy of Biological Neural Network with ANN

S/N   Biological Neural Network   Artificial Neural Network (ANN)
1     Soma                        Neuron
2     Dendrite                    Input
3     Axon                        Output
4     Synapse                     Weight
[Figures: a biological neural network, and typical Artificial Neural Network (ANN) architectures. A) Shallow Architecture and B) Deep Architecture, each showing input signals flowing through an input layer, hidden layer(s), and an output layer to produce output signals.]


Models
Models may be as simple as linear regression or as complex as a Transformer such as GPT models. Regardless of
their size, models are nothing more than a set of values, from a few values in linear regression to billions of values in GPT
models. These values, which are learned during the training process, also go by different names:
▪ coefficients
▪ parameters
▪ weights
The first term, coefficient, is generally used in simple linear regression tasks only. For each feature, there is a
corresponding coefficient, and multiplying them pairwise will yield a prediction. A coefficient is a parameter or weight of
the model (linear regression). In the figure below, each line connecting a feature to the target is a coefficient of linear
regression. Or you can call it a parameter. Or a weight as well.
Models
▪ The other two terms, parameters and weights, are used interchangeably,
and do not necessarily map to the input features. A relatively simple neural
network may have many more parameters/weights than input features.
▪ However, at each stage of the network, referred to as a layer, many
intermediate values are produced. These values, as we already discussed,
are often referred to as features as well. And for each one of these
intermediate features, there will be a corresponding parameter/weight.
▪ In a way, you can think of a neural network as a "stack of linear
regression models", the output of one regression being the input of the
next. Of course, this is a very simplified picture, and it does not correspond
to the reality of deep learning models, which include many other
transformations in-between.
Models
In the figure below, the original features are inputs to each one of the ten nodes in the center. It is as if we
were computing ten independent linear regressions. The result of each one of these regressions is an
intermediate feature for the linear regression at the end. Each line connecting any two nodes is a parameter or
weight. You can see that adding an extra layer of regressions in the middle considerably increased the total
number of parameters or weights. That's still a simplified version of an actual neural network, but it paints the
general picture nonetheless.
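A sketch of that picture in PyTorch, assuming for illustration three original features, ten nodes in the middle, and a single output:

import torch

model = torch.nn.Sequential(
    torch.nn.Linear(3, 10),   # ten "linear regressions", each taking the three original features
    torch.nn.Linear(10, 1),   # one final regression over the ten intermediate features
)

# Each line connecting two nodes in the figure is one of these parameters/weights
total_params = sum(p.numel() for p in model.parameters())
print(total_params)  # (3 * 10 + 10) + (10 * 1 + 1) = 51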
Setup and Environment
▪ Deep Learning models are computationally expensive and, although it's possible to
run smaller models on the CPU of a regular computer, you'll save a lot of time (and
energy) by leveraging the power of a GPU, a graphics card such as a GTX 1080 or an RTX
3080. However, these cards are expensive, and setting up their drivers to work with
PyTorch may be a cumbersome and difficult process.
▪ For this reason, we're prioritizing the usage of free cloud providers such as Google
Colab, Kaggle Notebooks, or Amazon SageMaker Studio. These platforms offer free
GPUs that can be used to train, fine-tune, and run the models we'll be handling in this
course. Moreover, their environments come with all major libraries already
preinstalled: PyTorch, NumPy, Scikit-learn, Pandas, Matplotlib, etc.
▪ This course was prepared to run in Google Colab, but it should run with little to no
modification in any of the other platforms as well. Moreover, we'll tell you whenever there
are any extra libraries that need to be installed on top of those already available (in
Colab, that is).
Colab
Google Colab "allows you to write and execute Python in your browser, with zero configuration required, free access to
GPUs and easy sharing." You can create Jupyter Notebooks in Google Colab from scratch, upload them from your
computer, or load them directly from GitHub. Every notebook you create is automatically stored in your Google Drive, so
any changes you make to the notebook are preserved.
For a quick overview of Google Colab's capabilities, check the "Welcome To Colaboratory" and "Overview of Colaboratory
Features" notebooks.
One important thing to notice: in order to use a free GPU, you'll need to change the runtime type before running any code,
otherwise you'll have to re-run everything. You can easily change it in the "Runtime" menu, "Change runtime type" entry, as
shown below:
Colab
It will open a menu listing the default hardware accelerator, which is None, meaning it's running on a regular CPU:

To use the GPU, you need to select the corresponding option:

Click on "Save" to close the menu, and once you run any of the cells, it will connect to a GPU-powered runtime (it may
take a few seconds) and you'll be good to take that GPU for a spin!
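Once connected, you can confirm from a notebook cell that the GPU is actually available:

import torch

print(torch.cuda.is_available())  # True on a GPU-powered runtime

# A common pattern: use the GPU when available, fall back to the CPU otherwise
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)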
Your Learning Journey
The road to learning how to effectively use deep learning models isn't without its
difficulties. We'll cover all the steps you need to master in order to prepare your data, train
your model, and use it to make predictions. There will be a lot of back and forth, so we'll
be showing you a map throughout your learning journey, to help keep you oriented at any
given moment:

Most of the time, we'll be following the steps in the right order when tackling
a given task. However, you'll see that in later chapters, we'll skip some steps,
or even move backwards (usually to the "Preprocessing Data" step).
