M10 - Introduction To TensorFlow, Deep Learning and Application

Uploaded by

Rica Embestro

INTRODUCTION TO TENSORFLOW,

DEEP LEARNING & APPLICATION

(COMPUTER VISION –
OBJECT DETECTION)
MODULE 10
OUTLINE
• WHAT IS OBJECT DETECTION
• DEEP LEARNING
• WHAT IS DEEP LEARNING
• DEEP LEARNING VS MACHINE LEARNING
• DEEP LEARNING APPROACHES
• TENSORFLOW OBJECT DETECTION API
• PREPARING DATA
• TRAINING & EVALUATING
WHAT IS OBJECT DETECTION
OBJECT DETECTION =
OBJECT CLASSIFICATION + OBJECT LOCALIZATION
ONE MODEL FOR TWO TASKS?

Object classification - the output is one number: the index of a class.
Object localization - the output is four numbers: the coordinates of the bounding box.

Output vector of a single model:
Po - whether an object exists
bx1, bx2, by1, by2 - bounding box coordinates
c1, c2, c3, …, cn - object's variables
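The combined output described on this slide (object presence, four box coordinates, then the per-object variables) can be sketched as a flat vector. This is a minimal illustration only: the function name `make_target`, the zero-filling for the "no object" case, and the reading of c1…cn as one-hot class indicators are assumptions, not something the slide specifies.

```python
def make_target(present, box=None, class_index=None, num_classes=3):
    """Build [Po, bx1, bx2, by1, by2, c1..cn] as a flat list of floats.

    Assumption: c1..cn are treated as one-hot class indicators.
    """
    if not present:
        # No object: only Po carries information, the rest is zero-filled.
        return [0.0] * (1 + 4 + num_classes)
    bx1, bx2, by1, by2 = box
    classes = [0.0] * num_classes
    classes[class_index] = 1.0
    return [1.0, bx1, bx2, by1, by2] + classes


# Hypothetical example: an object of class index 1 out of 3 classes.
target = make_target(True, box=(0.2, 0.6, 0.3, 0.9), class_index=1, num_classes=3)
background = make_target(False, num_classes=3)
print(target)
print(background)
```

One vector per image (or per grid cell) lets a single network be trained on classification and localization at the same time.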
DEEP LEARNING
WHAT IS DEEP LEARNING

• DEEP LEARNING IS AN AI FUNCTION THAT MIMICS THE WORKINGS OF THE HUMAN BRAIN IN PROCESSING DATA FOR USE IN DETECTING OBJECTS, RECOGNIZING SPEECH, TRANSLATING LANGUAGES, AND MAKING DECISIONS.
• DEEP LEARNING AI CAN LEARN WITHOUT HUMAN SUPERVISION, DRAWING FROM DATA THAT IS BOTH UNSTRUCTURED AND UNLABELED.
• DEEP LEARNING, A FORM OF MACHINE LEARNING, CAN BE USED TO HELP DETECT FRAUD OR MONEY LAUNDERING, AMONG OTHER FUNCTIONS.
DEEP LEARNING VS MACHINE LEARNING
• CLASSICAL APPROACH (HAAR FEATURES) - FIRST REAL-TIME OBJECT DETECTION FRAMEWORK (VIOLA-JONES)
• DEEP LEARNING APPROACH - NOW STATE OF THE ART IN OBJECT DETECTION
DEEP LEARNING APPROACHES
• OVERFEAT
• R-CNN
• FAST R-CNN
• YOLO
• FASTER R-CNN
• SSD AND R-FCN
DEEP LEARNING APPROACH
OverFeat - published in 2013, multi-scale
sliding window algorithm using Convolutional
Neural Networks (CNNs).
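
The multi-scale sliding window idea can be sketched as a generator of candidate boxes. This only enumerates windows; OverFeat itself scores them with a convolutional network, which is not shown. The function name, window size, stride and scale values are illustrative assumptions.

```python
def sliding_windows(image_w, image_h, window=64, stride=32, scales=(1.0, 0.5)):
    """Yield (scale, x, y, w, h) boxes in original-image coordinates."""
    for scale in scales:
        # Shrinking the image by `scale` is equivalent to enlarging
        # the window and the stride by 1 / scale on the original image.
        win = int(round(window / scale))
        step = int(round(stride / scale))
        for y in range(0, image_h - win + 1, step):
            for x in range(0, image_w - win + 1, step):
                yield (scale, x, y, win, win)


# Hypothetical 128x128 image: nine 64-pixel windows plus one 128-pixel window.
boxes = list(sliding_windows(128, 128))
print(len(boxes))
```

Each yielded box would then be cropped (or, more efficiently, read off a shared feature map) and passed to the classifier.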

R-CNN - Regions with CNN features. Three-stage approach:
- Extract possible objects using a region proposal method (the most popular one being Selective Search).
- Extract features from each region using a CNN.
- Classify each region with SVMs.
DEEP LEARNING APPROACH
Fast R-CNN - Like R-CNN, it uses Selective Search to generate object proposals. Instead of extracting features from each proposal independently and classifying them with SVMs, it applies the CNN once to the complete image, then uses Region of Interest (RoI) Pooling on the feature map, followed by a final feed-forward network for classification and bounding-box regression.
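
RoI Pooling can be illustrated on a tiny single-channel feature map: a region of any size is max-pooled into a fixed grid so that the feed-forward network always receives the same input shape. This is a simplified sketch with integer bin edges; the function name, the 2x2 output size and the `(x0, y0, x1, y1)` region convention are assumptions.

```python
def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool the region roi=(x0, y0, x1, y1), end-exclusive, into out_h x out_w."""
    x0, y0, x1, y1 = roi
    h, w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Integer bin edges; every bin covers at least one row.
        r0 = y0 + (i * h) // out_h
        r1 = max(y0 + ((i + 1) * h) // out_h, r0 + 1)
        row = []
        for j in range(out_w):
            c0 = x0 + (j * w) // out_w
            c1 = max(x0 + ((j + 1) * w) // out_w, c0 + 1)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
whole = roi_max_pool(feature_map, (0, 0, 4, 4))   # 4x4 region -> 2x2
corner = roi_max_pool(feature_map, (0, 0, 2, 2))  # 2x2 region -> 2x2
print(whole, corner)
```

Because the CNN runs once per image and only this cheap pooling runs per proposal, the per-region cost is much lower than in R-CNN.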

YOLO - You Only Look Once: a simple convolutional neural network approach which has both great results and high speed, allowing real-time object detection for the first time.
DEEP LEARNING APPROACH
Faster R-CNN - Faster R-CNN added what
they called a Region Proposal Network
(RPN), in an attempt to get rid of the
Selective Search algorithm and make the
model completely trainable end-to-end.

SSD and R-FCN

Finally, there are two notable papers: Single Shot MultiBox Detector (SSD), which takes on YOLO by using convolutional feature maps of multiple sizes to achieve better results and speed, and Region-based Fully Convolutional Networks (R-FCN), which takes the architecture of Faster R-CNN but uses only convolutional networks.
TENSORFLOW OBJECT DETECTION API
• OPEN SOURCE FROM 2017-07-15
• BUILT ON TOP OF TENSORFLOW
• CONTAINS TRAINABLE DETECTION MODELS
• CONTAINS FROZEN WEIGHTS
• CONTAINS JUPYTER NOTEBOOK
• MAKES IT EASY TO CONSTRUCT, TRAIN AND DEPLOY OBJECT DETECTION MODELS
GETTING STARTED
• DEPENDENCIES:
▪ PROTOBUF 2.6
▪ PYTHON-TK
▪ PILLOW 1.0
▪ LXML
▪ TF SLIM (INCLUDED)
▪ JUPYTER NOTEBOOK
▪ MATPLOTLIB
▪ TENSORFLOW (TENSORFLOW-GPU)
▪ CYTHON
▪ COCOAPI
• INSTALLATION INSTRUCTION

If model will be trained locally - better to install tensorflow-gpu.
Dependencies for tensorflow-gpu:
▪ NVIDIA GPU with CUDA Compute Capability 3.0 (list)
▪ Ubuntu 16.04 at least
▪ CUDA® Toolkit 9.0
▪ NVIDIA drivers associated with CUDA Toolkit 9.0.
▪ cuDNN v7.0
▪ libcupti-dev
Installation instruction
Latest version of CUDA Toolkit - 9.1 not compatible with tensorflow 1.6, need to install 9.0
CREATING A DATASET
DATASET FORMAT
• TENSORFLOW OBJECT DETECTION API USES THE TFRECORD FILE FORMAT
• THERE ARE THIRD-PARTY SCRIPTS AVAILABLE TO CONVERT PASCAL VOC AND OXFORD PET FORMAT
• IN OTHER CASES AN EXPLANATION OF THE FORMAT IS AVAILABLE IN THE GIT REPO.
• INPUT DATA TO CREATE TFRECORD - ANNOTATED IMAGE
GETTING IMAGES
Grab from internet
▪ Scrape images from Google or Pixabay or whatever
▪ For batch downloading - Faktun Bulk Image Downloader
▪ For data mining by multiplying existing images - ImageMagick
• CREATE OWN IMAGES
▪ RECORD VIDEO WITH NEEDED OBJECT/OBJECTS (IN 640X480)
▪ PROCESS VIDEO AND SPLIT ON SCREENSHOTS - FFMPEG
• TIPS
▪ CREATE IMAGES WITH DIFFERENT LIGHTS, BACKGROUND AND SO ON.
▪ IF OBJECT IS ABLE TO HAVE DIFFERENT FORMS - BETTER TO CATCH THEM ALL.
▪ TRY TO MAKE 30%-50% OF IMAGES WITH OVERLAID OBJECT
▪ TOOL FOR IMAGE AUGMENTATION
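
Splitting a recorded video into still frames with ffmpeg can be scripted. The sketch below only builds the command line; the file names, the frame rate and the helper name `ffmpeg_frames_command` are assumptions, and running it requires ffmpeg to be installed.

```python
import shlex


def ffmpeg_frames_command(video_path, out_pattern="frame_%04d.jpg", fps=1):
    """Return the ffmpeg argument list that saves `fps` frames per second as images."""
    # -i selects the input file; the fps video filter sets how many frames
    # per second are written; %04d in the pattern is the frame counter.
    return ["ffmpeg", "-i", video_path, "-vf", f"fps={fps}", out_pattern]


# Hypothetical input file name.
cmd = ffmpeg_frames_command("objects.mp4", fps=2)
print(shlex.join(cmd))
# To actually extract frames: subprocess.run(cmd, check=True)
```

A low frame rate keeps consecutive images from being near-duplicates, which matters more for training than the raw image count.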
LABELING (ANNOTATING) IMAGES

Tools
▪ LabelImg
▪ FIAT (Fast Image Data Annotation Tool)

▪ input: images
▪ output: .xml files with bounding box coordinates
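
The .xml files written by LabelImg use the Pascal VOC layout, so the bounding boxes can be read back with the Python standard library. The sample annotation below is made up for illustration, and the helper name `parse_annotation` is an assumption.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in Pascal VOC layout (one object).
SAMPLE_XML = """
<annotation>
  <filename>cat_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>48</xmin><ymin>60</ymin><xmax>320</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>
"""


def parse_annotation(xml_text):
    """Return (filename, list of box dicts) from one annotation file's text."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        box = {"label": obj.findtext("name")}
        for key in ("xmin", "ymin", "xmax", "ymax"):
            box[key] = int(bndbox.findtext(key))
        boxes.append(box)
    return root.findtext("filename"), boxes


filename, boxes = parse_annotation(SAMPLE_XML)
print(filename, boxes)
```

These parsed boxes are the "annotated image" input that the TFRecord conversion step needs.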
CREATING TFRECORD

▪ The TensorFlow Object Detection API repo contains a dataset_tools folder with scripts that convert common data structures to TFRecord.

▪ If the output data has another structure - here is an explanation of how to convert it

TRAINING
SELECTING A MODEL
TENSORFLOW OD API PROVIDES A COLLECTION OF DETECTION MODELS PRE-TRAINED ON THE COCO DATASET, THE KITTI DATASET, AND THE OPEN IMAGES DATASET.
• MODEL NAME CORRESPONDS TO A CONFIG FILE THAT WAS USED TO TRAIN THIS MODEL.
• SPEED - RUNNING TIME IN MS PER 600X600 IMAGE
• MAP STANDS FOR MEAN AVERAGE PRECISION, WHICH INDICATES HOW WELL THE MODEL PERFORMED ON THE COCO DATASET.
• OUTPUT TYPES (BOXES, AND MASKS IF APPLICABLE)
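
To make the mAP column more concrete, here is a simplified sketch of mean average precision. It assumes each detection has already been marked as a true or false positive and sorted by descending confidence, and it uses plain (non-interpolated) average precision; the official COCO metric is more involved, averaging interpolated precision over several overlap thresholds. All names and numbers are illustrative.

```python
def average_precision(is_true_positive, num_ground_truth):
    """AP for one class; detections must be sorted by descending confidence."""
    hits = 0
    precision_sum = 0.0
    for rank, tp in enumerate(is_true_positive, start=1):
        if tp:
            hits += 1
            # Precision at this rank = correct detections so far / detections so far.
            precision_sum += hits / rank
    return precision_sum / num_ground_truth if num_ground_truth else 0.0


def mean_average_precision(per_class):
    """per_class maps class name -> (true-positive flags, ground-truth count)."""
    aps = [average_precision(flags, n) for flags, n in per_class.values()]
    return sum(aps) / len(aps)


# Hypothetical results for two classes.
results = {
    "cat": ([True, False, True], 2),
    "dog": ([True], 1),
}
map_score = mean_average_precision(results)
print(round(map_score, 4))
```

A higher mAP means the model ranks correct detections ahead of wrong ones more consistently across classes.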
CONFIGURING
● Folders structure
● label.pbtxt
● pipeline.config
train_config: {
  fine_tune_checkpoint: "<path_to_model.ckpt>"
  num_steps: 200000
}
train_input_reader {
  label_map_path: "<path_to_labels.pbtxt>"
  tf_record_input_reader {
    input_path: "<path_to_train.record>"
  }
}
eval_config {
  num_examples: 8000
  max_evals: 10
  use_moving_averages: false
}
eval_input_reader {
  label_map_path: "<path_to_labels.pbtxt>"
  shuffle: false
  num_readers: 1
  tf_record_input_reader {
    input_path: "<path_to_test.record>"
  }
}
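
The slide lists label.pbtxt without showing its contents. As a hedged sketch, a label map in the layout used by the Object Detection API (one `item` block per class, ids starting at 1) can be generated from a list of class names. The class names and the helper name `make_label_map` are assumptions.

```python
def make_label_map(class_names):
    """Return label-map text with one item block per class, ids starting at 1."""
    items = []
    for idx, name in enumerate(class_names, start=1):
        # Id 0 is left unused; it is conventionally reserved for background.
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (idx, name))
    return "\n".join(items)


# Hypothetical classes; write the result to the path given as label_map_path.
label_map_text = make_label_map(["cat", "dog"])
print(label_map_text)
```

The ids written here must match the class ids stored in the TFRecord files, otherwise the reported labels will be shifted.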
TRAINING &
EVALUATING
# From the tensorflow/models/research directory
python object_detection/train.py \
    --logtostderr \
    --pipeline_config_path=/tensorflow/models/object_detection/samples/configs/ssd_mobilenet_v1_pets.config \
    --train_dir=${PATH_TO_ROOT_TRAIN_FOLDER}

# From the tensorflow/models/research directory
python object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
    --checkpoint_dir=${PATH_TO_TRAIN_DIR} \
    --eval_dir=${PATH_TO_EVAL_DIR}
