
Deep Learning Workflow

Introduction
Successfully using deep learning requires more than just knowing how to build
neural networks; we also need to know the steps required to apply them in
real-world settings effectively.

In this article, we cover the workflow for a deep learning project: how we build out deep learning solutions to tackle real-world tasks. The deep learning workflow has seven primary components:

- Acquiring data
- Preprocessing
- Splitting and balancing the dataset
- Building and training the model
- Evaluation
- Hyperparameter tuning
- Deploying our solution (For Industry)

Part 1: Acquiring Data


In a deep learning project, the most pressing concern is almost always: “Can we get enough labeled data?” The more labeled data we have, the better our model can be. Our ability to acquire data can make or break our solution. Not only is getting data usually the most important part of a deep learning project, but it’s also often the hardest.

Luckily, there are many potential data sources:

Publicly Available Datasets


The best source of data is often publicly available datasets. Sites like Kaggle host thousands of large, labeled data sources. Working with these curated datasets helps reduce the overhead of starting a deep learning project.

Existing Databases
In some cases, our organization may have a large dataset on hand. Often, these datasets are stored in a Relational Database Management System (RDBMS). In this case, we can build our specific dataset using SQL queries.
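For example, here is a minimal sketch of pulling a dataset out of a SQLite database with pandas. The database file, table, and column names are hypothetical placeholders.

```python
import sqlite3

import pandas as pd

# Hypothetical example: the database file and the table/column names are assumptions.
conn = sqlite3.connect("company_data.db")

# Pull only the columns we need for our dataset.
query = """
    SELECT review_text, star_rating
    FROM reviews
    WHERE star_rating IS NOT NULL
"""
df = pd.read_sql_query(query, conn)
conn.close()

print(df.shape)
print(df.head())
```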

Web scraping/APIs
Online news, social media posts, and search results represent rich streams of data, which we can leverage for our deep learning projects. We do this via web scraping: the extraction of data from websites. While scraping and collecting data, we should keep in mind ethical considerations, including privacy and consent issues. There are many tools for web scraping in Python, including BeautifulSoup. Many sites, like Reddit and Twitter, have Python Application Programming Interfaces (APIs). We can use APIs to gather data from different applications. While some APIs are free, others are paid services.

Depending on the size of the dataset, we may be able to write our scraped data directly to raw data files (e.g., .txt or .csv). However, for larger datasets, we sometimes need to store the resulting data in our own databases.
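As a rough illustration, the sketch below scrapes headlines from a page with requests and BeautifulSoup and writes them to a .csv file. The URL and the choice of <h2> tags are assumptions; every site's structure (and terms of service) will differ.

```python
import csv

import requests
from bs4 import BeautifulSoup

# Hypothetical example: the URL and the tag selector are assumptions.
# Always check a site's terms of service and robots.txt before scraping.
url = "https://example.com/news"
response = requests.get(url, timeout=10)
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")

# Extract headline text from <h2> tags (the right selector varies by site).
headlines = [h2.get_text(strip=True) for h2 in soup.find_all("h2")]

# For a small dataset, writing straight to a .csv file is enough.
with open("headlines.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["headline"])
    writer.writerows([[h] for h in headlines])
```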

Crowd-sourced Labeling
For many tasks, it’s much easier to acquire data than it is to find labeled data. For example, it is much easier to scrape the raw text from an entire subreddit than to correctly label the contents of each Reddit post. When automated labeling tools aren’t available, we need human labels. One possibility is to go through our own data and annotate each datapoint ourselves. However, sometimes that just isn’t feasible, no matter how much coffee we have on hand. An alternative is crowd-sourcing sites like Amazon Mechanical Turk, which we can use to pay “gig” workers for thousands of human annotations.

Part 2: Preprocessing
Once we have built our dataset, we need to preprocess it into useful features for our deep learning models. At a high level, we have three primary goals when preprocessing data for neural networks: we want to 1) clean our data, 2) handle categorical features and text, and 3) scale our real-valued features using normalization or standardization techniques. Preprocessing is also a fantastic opportunity to become more familiar with our data.

Cleaning data
Often, our datasets contain noisy examples, extra features, missing data, and outliers. It is good practice to test for and remove outliers, drop unnecessary features, fill in missing data, and filter out noisy examples.
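A minimal cleaning sketch with pandas might look like the following; the file and column names are hypothetical, and the 3-standard-deviation outlier rule is just one common heuristic.

```python
import pandas as pd

# Hypothetical example: the file and column names ("age", "income", "notes") are assumptions.
df = pd.read_csv("raw_data.csv")

# Drop a feature we don't need.
df = df.drop(columns=["notes"])

# Fill in missing numeric values with the column median.
df["income"] = df["income"].fillna(df["income"].median())

# Remove outliers, e.g. keep rows within 3 standard deviations of the mean.
mean, std = df["age"].mean(), df["age"].std()
df = df[(df["age"] - mean).abs() <= 3 * std]

# Drop exact duplicate rows (one simple form of noisy-example filtering).
df = df.drop_duplicates()
```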

Scaling features
Because we initialize neural networks with small weights to stabilize training, our models will struggle when faced with input features that have large values. As a result, we often scale real-valued features in one of two ways: we can normalize features so that they fall between 0 and 1, or standardize them so they have a mean of zero and a variance of one.
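Scikit-learn provides ready-made scalers for both approaches. A small sketch (in practice, we would fit the scaler on the training data only and reuse it to transform the validation data):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Toy feature matrix with two real-valued features on very different scales.
X = np.array([[150.0, 3.2],
              [2000.0, 1.1],
              [875.0, 7.6]])

# Normalization: squash each feature into the [0, 1] range.
X_norm = MinMaxScaler().fit_transform(X)

# Standardization: give each feature zero mean and unit variance.
X_std = StandardScaler().fit_transform(X)
```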

Handling categorical data and text


Neural networks expect numbers as their inputs. This means we need to convert all categorical data and text to real-valued numbers.

- We usually handle categorical variables by assigning each option its own unique integer or by converting them to one-hot encodings.
- When working with strings of raw text, we need a few extra processing steps before encoding our words as integers. These steps include tokenizing our data (splitting our text into individual words/tokens) and padding our data (adding padding tokens to make all of our examples the same length).
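As a sketch, pandas can one-hot encode a categorical column, and the Keras preprocessing utilities shipped with TensorFlow 2.x can tokenize and pad raw text. The column name and example sentences below are made up.

```python
import pandas as pd
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Categorical feature -> one-hot encoding (the "color" column is hypothetical).
df = pd.DataFrame({"color": ["red", "blue", "green", "blue"]})
one_hot = pd.get_dummies(df["color"])

# Raw text -> integer sequences of equal length.
texts = ["the model trains quickly", "the model overfits"]
tokenizer = Tokenizer(num_words=1000)            # keep the 1,000 most frequent tokens
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)  # words -> integer ids
padded = pad_sequences(sequences, maxlen=10)     # pad/truncate every example to length 10
```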

Part 3: Splitting and Balancing the Dataset


Once we have processed our data, it’s time to split our dataset. Generally, we split our data into two datasets: training and validation. In certain cases, we also create a third holdout dataset, referred to as the test set. When we don’t, we often use the terms “validation set” and “test set” interchangeably.

We train our model on the training dataset and evaluate it on the validation dataset. If we have defined a third holdout test set, we test our model on it after we have finished selecting our model and tuning our hyperparameters. This final step helps us avoid choosing a set of hyperparameters that only happens to work well on the data we chose for our validation set.

When splitting our dataset, there are two major considerations: the size of our splits, and whether we will stratify our data. After we split our data, we also need to address imbalances in our training set.

Splitting Our Data


We usually save 10-30% of our data for validation and testing. When we have a smaller corpus, it is more important to assign a larger proportion of data to the validation set. This helps ensure that our validation dataset better represents the true distribution of our data.

Scikit-learn provides the train_test_split function, which splits our data into training and validation datasets and lets us specify the size of our validation data.
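A typical call, assuming X holds our features and y our labels, reserves 20% of the data for validation:

```python
from sklearn.model_selection import train_test_split

# Hold out 20% of the data for validation; X and y are assumed to be
# our feature matrix and label array from the preprocessing step.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```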

Stratified Train-Test Splits


We have to be extra careful when splitting a very imbalanced dataset for classification; it’s very possible that a disproportionate share of our minority-class instances ends up in either the training set or the validation set. In the first case, our validation metrics will not accurately capture our model’s ability to classify the minority class. In the second case, the model will overestimate the probability of the majority class.

The solution is to use a stratified split: a split that ensures the training and validation sets have the same proportion of examples from each class.

If we set the train_test_split function’s stratify parameter to our array of labels, the function will compute the proportion of each class and ensure that this ratio is the same in our training and validation data.
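For example, passing the label array to stratify keeps the class proportions identical across the two splits:

```python
from sklearn.model_selection import train_test_split

# Stratify on the labels so each split has the same class proportions.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```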

Handling Imbalanced Data


Imbalanced data, where some classes appear much more often than others, poses a challenge for deep learning models. If we train neural networks on imbalanced data, the resulting model will be heavily biased towards predicting the majority classes. This is especially problematic because we usually care much more about identifying instances of the minority classes (like rare cases of disease or credit fraud).

There are two main approaches to dealing with imbalanced training data: undersampling and oversampling. Both approaches should be taken up with caution, and it’s best to have a domain expert on hand to weigh in.

- In undersampling, we balance our data by throwing out examples from our majority class.
- In oversampling, we duplicate instances of our minority class so that they occur more often. A popular alternative to traditional oversampling is the Synthetic Minority Oversampling TEchnique (SMOTE). The SMOTE algorithm creates synthetic examples that are similar to those in our minority class and adds them to our dataset (see the sketch after this section).

Almost always, we correct the imbalance only in our training data and leave the validation data as is. To augment only the training data, we must correct for imbalance after the train-test split.

We never oversample our data before we split it. If we do, copies of our validation or test examples can sneak into our training data. This is called information leakage.
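Here is a sketch of that ordering using the SMOTE implementation from the third-party imbalanced-learn package (assuming it is installed, and that X and y are our features and labels):

```python
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split

# Split first, then oversample only the training data to avoid leakage.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

smote = SMOTE(random_state=42)
X_train_resampled, y_train_resampled = smote.fit_resample(X_train, y_train)

# The validation set (X_val, y_val) is left untouched.
```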

Part 4: Building and Training the Model


Once we have split our dataset, it’s time to choose our loss function and our layers.

For each layer, we also need to select a reasonable number of hidden units. There is no exact science to choosing the right size for each layer, nor the number of layers; it all depends on your specific data and architecture.

- It’s good practice to start with a few layers (2-6).
- Usually, we create each layer with between 32 and 512 hidden units.
- We also tend to decrease the size of hidden layers as we move up through the model.
- We usually try the SGD and Adam optimizers first.
- When setting an initial learning rate, a common practice is to default to 0.01 (see the sketch after this list).
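Putting those rules of thumb together, a minimal Keras starting point for binary classification might look like this. The layer sizes, the 0.01 learning rate, and the training data names are starting assumptions to adapt to your problem, not prescriptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

num_features = X_train.shape[1]  # assumes X_train/y_train from the earlier split

# A few Dense layers that shrink toward the output.
model = keras.Sequential([
    layers.Dense(128, activation="relu", input_shape=(num_features,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.01),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=20,
    batch_size=32,
)
```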

Part 5: Evaluating Performance


Each time we train the model, we evaluate its performance on our validation set. When we provide a validation set at training time, Keras handles this automatically. Our performance on the validation set gives us a sense of how our model will perform on new, unseen data.

When considering performance, it’s important to choose the correct metric. If our dataset is heavily imbalanced, accuracy (and even AUC) will be less meaningful. In this case, we likely want to consider metrics like precision and recall. The F1-score is another useful metric that combines precision and recall. A confusion matrix can help visualize which datapoints are misclassified and which aren’t.
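Scikit-learn makes these metrics easy to compute from our validation predictions. A sketch, assuming the model and validation arrays from the earlier examples and a 0.5 decision threshold:

```python
from sklearn.metrics import (
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Convert the model's sigmoid outputs into hard 0/1 predictions.
y_prob = model.predict(X_val)
y_pred = (y_prob > 0.5).astype(int).ravel()

print("Precision:", precision_score(y_val, y_pred))
print("Recall:   ", recall_score(y_val, y_pred))
print("F1-score: ", f1_score(y_val, y_pred))
print(confusion_matrix(y_val, y_pred))
```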

Part 6: Tuning Hyperparameters


We will almost always need to iterate on our initial hyperparameters. When training and evaluating our model, we explore different learning rates, batch sizes, architectures, and regularization techniques.

As we tune our hyperparameters, we should watch our loss and metrics and be on the lookout for clues as to why our model is struggling:

- Unstable learning means that we likely need to reduce our learning rate and/or increase our batch size.
- A disparity between performance on the training and validation sets means we are overfitting and should reduce the size of our model or add regularization (like dropout).
- Poor performance on both the training and the validation set means that we are underfitting and may need a larger model or a different learning rate.

A common practice is to start with a smaller model and scale up our hyperparameters until we see training and validation performance diverge, which means we have overfit to our data.

Critically, because neural network weights are randomly initialized, your scores will fluctuate regardless of hyperparameters. One way to make accurate judgments is to run the same hyperparameter configuration multiple times with different random seeds, as sketched below.
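A rough sketch of that practice, assuming a hypothetical build_model() helper that returns a freshly compiled model and a recent TensorFlow version that provides keras.utils.set_random_seed:

```python
import numpy as np
import tensorflow as tf

# build_model() is a hypothetical helper that builds and compiles a fresh
# model (e.g., the Sequential network sketched earlier).
val_accuracies = []
for seed in [0, 1, 2]:
    tf.keras.utils.set_random_seed(seed)  # seeds Python, NumPy, and TensorFlow
    model = build_model()
    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        epochs=20,
        batch_size=32,
        verbose=0,
    )
    val_accuracies.append(history.history["val_accuracy"][-1])

print(f"val accuracy: {np.mean(val_accuracies):.3f} ± {np.std(val_accuracies):.3f}")
```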

Once our results are satisfactory, we are ready to use our model!

If we made a holdout test set, separate from our validation data, now is when we use it to test our model. The holdout test set provides a final estimate of our model’s performance on unseen data.

Part 7: Deployment (For Industry)


Once we have trained a model, we may want to deploy it into the real world. This is especially true in industry settings, where our networks will be used by coworkers and customers or will work behind the scenes in our products and internal tools.

When deploying a neural network, there are three big considerations:

How will we handle the compute requirements for running our models?
It takes a significant amount of computation to evaluate a single input with a neural network, let alone to manage traffic from many different users. As a result, when deploying a neural network model (for example, in a Docker container), it’s important to host it where it can access powerful computing resources. Cloud platforms like AWS, GCP, and Azure are great places to start. These platforms provide flexible hosting services for applications that can scale up to meet changing demand.
How will we pass inputs into the model?
A common approach for interfacing with our model over the web is Flask, a Python-based web framework. Flask can handle requests and pass inputs to our model.
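A minimal Flask sketch might expose a single /predict endpoint; the saved-model path, input format, and port are assumptions for illustration.

```python
import numpy as np
from flask import Flask, jsonify, request
from tensorflow import keras

app = Flask(__name__)
model = keras.models.load_model("model.keras")  # hypothetical path to a saved model

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body like {"features": [0.2, 1.5, ...]}.
    features = np.array(request.get_json()["features"]).reshape(1, -1)
    probability = float(model.predict(features)[0][0])
    return jsonify({"probability": probability})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```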

How will we run the code and manage dependencies, wherever we host our
application? (Optional)
This can depend on where we host our model. However, a popular general-purpose solution to this last question is Docker containers. Docker containers are a way to package up our code and its dependencies (e.g., the correct version of TensorFlow) so that our application can run quickly in any computing environment.

Conclusion
In this article, we covered the general workflow for a deep learning project. We covered a lot of material, so don’t sweat every detail. Our goal is to provide a sense of the overarching flow of a deep learning project, from data acquisition and preprocessing to evaluation and hyperparameter tuning.

These guidelines are not ironclad rules. Rather, in a successful deep learning project, we often pivot back and forth between different steps, continuously tweaking and debugging our scripts, data, and architectures in our quest for the best-performing model.
