M10 - Introduction To TensorFlow, Deep Learning and Application

Uploaded by

Rica Embestro
INTRODUCTION TO TENSORFLOW,

DEEP LEARNING & APPLICATION


(COMPUTER VISION –
OBJECT DETECTION)
MODULE 10
OUTLINE
• WHAT IS OBJECT DETECTION
• DEEP LEARNING
• WHAT IS DEEP LEARNING
• DEEP LEARNING VS MACHINE LEARNING
• DEEP LEARNING APPROACHES
• TENSORFLOW OBJECT DETECTION API
• PREPARING DATA
• TRAINING & EVALUATING
WHAT IS OBJECT DETECTION
OBJECT DETECTION = OBJECT CLASSIFICATION + OBJECT LOCALIZATION
ONE MODEL FOR TWO TASKS?

Object classification - output is one number (the index of a class).
Object localization - output is four numbers (the coordinates of the bounding box).

A single model can produce both in one output vector:
Po - whether an object exists
bx1, bx2, by1, by2 - bounding box coordinates
c1, c2, c3, ..., cn - object's class variables
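To make the single output vector concrete, here is a minimal Python sketch. The field order (Po, four box coordinates, then one score per class) follows the slide. The helper name decode_prediction, the 0.5 threshold and the three class names are assumptions made only for illustration.

```python
def decode_prediction(vector, class_names, threshold=0.5):
    """Split one vector [Po, bx1, bx2, by1, by2, c1..cn] into its parts."""
    expected = 5 + len(class_names)
    if len(vector) != expected:
        raise ValueError(f"expected {expected} values, got {len(vector)}")

    p_object = vector[0]
    if p_object < threshold:
        return None  # Po is low: treat the image as containing no object

    bx1, bx2, by1, by2 = vector[1:5]
    class_scores = vector[5:]
    # Classification part: index of the highest class score
    class_index = max(range(len(class_scores)), key=class_scores.__getitem__)
    return {
        "class": class_names[class_index],
        "box": {"x": (bx1, bx2), "y": (by1, by2)},  # localization part
        "score": p_object,
    }


CLASS_NAMES = ["cat", "dog", "bird"]  # hypothetical classes
example = [0.9, 0.10, 0.45, 0.20, 0.80, 0.05, 0.90, 0.05]
result = decode_prediction(example, CLASS_NAMES)
print(result)
```

The first value gates everything else: when Po is below the threshold, the box and class values carry no meaning and are ignored.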
DEEP LEARNING
WHAT IS DEEP LEARNING

• DEEP LEARNING IS AN AI FUNCTION THAT MIMICS THE WORKINGS OF THE
HUMAN BRAIN IN PROCESSING DATA FOR USE IN DETECTING OBJECTS,
RECOGNIZING SPEECH, TRANSLATING LANGUAGES, AND MAKING
DECISIONS.
• DEEP LEARNING AI CAN LEARN WITHOUT HUMAN SUPERVISION, DRAWING
FROM DATA THAT IS BOTH UNSTRUCTURED AND UNLABELED.
• DEEP LEARNING, A FORM OF MACHINE LEARNING, CAN BE USED TO HELP
DETECT FRAUD OR MONEY LAUNDERING, AMONG OTHER FUNCTIONS.
DEEP LEARNING VS MACHINE LEARNING
• CLASSICAL APPROACH (HAAR FEATURES) - FIRST REAL-TIME
OBJECT DETECTION FRAMEWORK (VIOLA-JONES)
• DEEP LEARNING APPROACH - NOW STATE OF THE ART IN
OBJECT DETECTION
APPROACHES
• OVERFEAT
• R-CNN
• FAST R-CNN
• YOLO
• FASTER R-CNN
• SSD AND R-FCN
DEEP LEARNING APPROACH
OverFeat - published in 2013, multi-scale
sliding window algorithm using Convolutional
Neural Networks (CNNs).
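The multi-scale sliding window idea can be sketched in a few lines of Python. This only enumerates the windows that a naive detector would pass to a classifier. OverFeat itself evaluates the network convolutionally rather than cropping each window, so the function name, window size, stride and scales below are illustrative assumptions, not the paper's settings.

```python
def sliding_windows(image_w, image_h, window, stride, scales):
    """Yield (x, y, w, h) square windows for every scale that fits the image."""
    for scale in scales:
        size = int(window * scale)
        if size > image_w or size > image_h:
            continue  # window larger than the image at this scale
        step = max(1, int(stride * scale))
        for y in range(0, image_h - size + 1, step):
            for x in range(0, image_w - size + 1, step):
                yield (x, y, size, size)


# Hypothetical 64x64 image, 32-pixel base window, two scales
windows = list(sliding_windows(64, 64, window=32, stride=16, scales=(1.0, 2.0)))
print(len(windows), windows[0], windows[-1])
```

Each yielded window would be classified independently, which is why the number of windows (and therefore the cost) grows quickly with finer strides and more scales.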

R-CNN - Regions with CNN features. A three-stage
approach:
- Extract possible objects using a region
proposal method (the most popular one being
Selective Search).
- Extract features from each region using a
CNN.
- Classify each region with SVMs.
DEEP LEARNING APPROACH
Fast R-CNN - Like R-CNN, it uses Selective
Search to generate object proposals. Instead of
extracting features for each proposal independently
and classifying them with SVMs, it applies the CNN
once to the complete image, then uses Region of
Interest (RoI) Pooling on the feature map, followed by
a final feed-forward network for classification and regression.
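RoI Pooling turns a region of any size on the feature map into a fixed-size grid, so one feed-forward head can handle every proposal. The sketch below is a simplified pure-Python version on a single-channel feature map; the function names and the toy 4x4 map are assumptions for illustration, and real implementations work on multi-channel tensors.

```python
def _ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, output_size):
    """Max-pool the region roi=(x0, y0, x1, y1), end-exclusive, into out_h x out_w bins."""
    x0, y0, x1, y1 = roi
    out_h, out_w = output_size
    roi_h, roi_w = y1 - y0, x1 - x0
    if roi_h < 1 or roi_w < 1:
        raise ValueError("RoI must cover at least one cell")

    pooled = []
    for i in range(out_h):
        # Row span of bin i: floor for the start, ceil for the end
        r_start = y0 + (i * roi_h) // out_h
        r_end = y0 + _ceil_div((i + 1) * roi_h, out_h)
        row = []
        for j in range(out_w):
            c_start = x0 + (j * roi_w) // out_w
            c_end = x0 + _ceil_div((j + 1) * roi_w, out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r_start, r_end)
                           for c in range(c_start, c_end)))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
full = roi_max_pool(feature_map, (0, 0, 4, 4), (2, 2))
partial = roi_max_pool(feature_map, (1, 1, 4, 4), (2, 2))
print(full, partial)
```

Both regions, 4x4 and 3x3, come out as 2x2 grids, which is the property that lets the CNN run once on the whole image.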

YOLO - You Only Look Once: a
simple convolutional neural
network approach that combines
strong results with high
speed, making real-time object
detection with deep networks
practical.
DEEP LEARNING APPROACH
Faster R-CNN - adds what its authors
call a Region Proposal Network
(RPN), in an attempt to get rid of the
Selective Search algorithm and make the
model completely trainable end-to-end.

SSD and R-FCN

Finally, there are two notable papers: Single Shot
Detector (SSD), which improves on YOLO by using
convolutional feature maps of multiple sizes to achieve
better results and speed, and Region-based Fully
Convolutional Networks (R-FCN), which takes the
architecture of Faster R-CNN but uses only
convolutional layers.
TENSORFLOW OBJECT DETECTION API

TF OBJECT DETECTION API
• OPEN SOURCE FROM 2017-07-15
• BUILT ON TOP OF TENSORFLOW
• CONTAINS TRAINABLE DETECTION MODELS
• CONTAINS FROZEN WEIGHTS
• CONTAINS JUPYTER NOTEBOOK
• MAKES IT EASY TO CONSTRUCT, TRAIN AND DEPLOY
OBJECT DETECTION MODELS
GETTING STARTED
• DEPENDENCIES:
▪ PROTOBUF 2.6
▪ PYTHON-TK
▪ PILLOW 1.0
▪ LXML
▪ TF SLIM (INCLUDED)
▪ JUPYTER NOTEBOOK
▪ MATPLOTLIB
▪ TENSORFLOW (TENSORFLOW-GPU)
▪ CYTHON
▪ COCOAPI
• INSTALLATION INSTRUCTION

If the model will be trained locally, it is better to install tensorflow-gpu.
Dependencies for tensorflow-gpu:
▪ NVIDIA GPU with CUDA Compute Capability 3.0 (list)
▪ Ubuntu 16.04 at least
▪ CUDA® Toolkit 9.0
▪ NVIDIA drivers associated with CUDA Toolkit 9.0
▪ cuDNN v7.0
▪ libcupti-dev
Installation instruction
The latest version of CUDA Toolkit (9.1) is not compatible with tensorflow 1.6; install 9.0 instead.
CREATING A DATASET

DATASET FORMAT
• TENSORFLOW OBJECT DETECTION API USES THE TFRECORD FILE FORMAT
• THIRD-PARTY SCRIPTS ARE AVAILABLE TO CONVERT THE
PASCAL VOC AND OXFORD PET FORMATS
• FOR OTHER FORMATS, AN EXPLANATION OF THE
FORMAT IS AVAILABLE IN THE GIT REPO.
• INPUT DATA TO CREATE A TFRECORD - ANNOTATED IMAGES
GETTING IMAGES

Grab from internet
▪ Scrape images from Google or Pixabay or whatever
▪ For batch downloading - Fatkun bulk image downloader
▪ For data mining by multiplying existing images - ImageMagick

• CREATE OWN IMAGES
▪ RECORD VIDEO WITH NEEDED OBJECT/OBJECTS (IN 640X480)
▪ PROCESS VIDEO AND SPLIT INTO SCREENSHOTS - FFMPEG
• TIPS
▪ CREATE IMAGES WITH DIFFERENT LIGHTS, BACKGROUND AND SO ON.
▪ IF OBJECT IS ABLE TO HAVE DIFFERENT FORMS - BETTER
TO CATCH THEM ALL.
▪ TRY TO MAKE 30%-50% OF IMAGES WITH OVERLAID OBJECT
▪ TOOL FOR IMAGE AUGMENTATION
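When existing images are multiplied by augmentation, the bounding boxes have to be transformed together with the pixels. The sketch below shows this for a horizontal flip on a toy image stored as nested lists. The function name and the (xmin, ymin, xmax, ymax) pixel-edge convention are assumptions for illustration; an augmentation tool would normally do this for you.

```python
def flip_horizontal(image_rows, boxes):
    """Mirror an image left-to-right and mirror its boxes to match.

    image_rows: list of rows, each a list of pixel values.
    boxes: list of (xmin, ymin, xmax, ymax) in pixel-edge coordinates.
    """
    width = len(image_rows[0])
    flipped_image = [row[::-1] for row in image_rows]
    # The left edge becomes the right edge, so xmin and xmax swap roles
    flipped_boxes = [(width - xmax, ymin, width - xmin, ymax)
                     for xmin, ymin, xmax, ymax in boxes]
    return flipped_image, flipped_boxes


image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
boxes = [(0, 0, 1, 2)]  # covers the first column
flipped_image, flipped_boxes = flip_horizontal(image, boxes)
print(flipped_image, flipped_boxes)
```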
LABELING (ANNOTATING) IMAGES

Tools
▪ LabelImg
▪ FIAT (Fast Image Data Annotation Tool)

▪ input: images
▪ output: .xml files with bounding box coordinates
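The .xml files can be read back with Python's standard library. The sketch below assumes the Pascal VOC layout that LabelImg writes by default (annotation / size / object / bndbox); the sample file content and the function name parse_annotation are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC layout
SAMPLE_XML = """<annotation>
  <filename>cat_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>160</xmin><ymin>120</ymin><xmax>480</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""


def parse_annotation(xml_text):
    """Return the file name, image size and labeled boxes from one annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        boxes.append({
            "label": obj.findtext("name"),
            "xmin": int(bndbox.findtext("xmin")),
            "ymin": int(bndbox.findtext("ymin")),
            "xmax": int(bndbox.findtext("xmax")),
            "ymax": int(bndbox.findtext("ymax")),
        })
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "boxes": boxes,
    }


annotation = parse_annotation(SAMPLE_XML)
print(annotation)
```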
CREATING TFRECORD

▪ The TensorFlow Object Detection API repo contains a folder dataset_tools with scripts
that convert common data structures to TFRecord.

▪ If the annotation output has another structure, the repo explains how to convert it.
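For a custom structure, the main step is mapping each annotated image to the feature keys the API reads, with box coordinates normalized to the range 0 to 1. The pure-Python sketch below builds only that mapping. The key names are the ones used by the repo's dataset scripts as far as can be told here and should be checked against the repo. The encoded image bytes and the final wrapping into a tf.train.Example are left out, and the input dictionary layout and function name are assumptions.

```python
def to_feature_dict(annotation, label_ids):
    """Map one annotated image to TFRecord-style feature keys (image bytes omitted)."""
    width, height = annotation["width"], annotation["height"]
    boxes = annotation["boxes"]
    return {
        "image/filename": annotation["filename"],
        "image/width": width,
        "image/height": height,
        # Pixel coordinates divided by image size give values in [0, 1]
        "image/object/bbox/xmin": [b["xmin"] / width for b in boxes],
        "image/object/bbox/xmax": [b["xmax"] / width for b in boxes],
        "image/object/bbox/ymin": [b["ymin"] / height for b in boxes],
        "image/object/bbox/ymax": [b["ymax"] / height for b in boxes],
        "image/object/class/text": [b["label"] for b in boxes],
        "image/object/class/label": [label_ids[b["label"]] for b in boxes],
    }


# Hypothetical annotation and label ids
sample = {
    "filename": "cat_001.jpg",
    "width": 640,
    "height": 480,
    "boxes": [{"label": "cat", "xmin": 160, "ymin": 120, "xmax": 480, "ymax": 360}],
}
features = to_feature_dict(sample, {"cat": 1})
print(features)
```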


TRAINING

SELECTING A MODEL
TENSORFLOW OD API PROVIDES A COLLECTION OF DETECTION
MODELS PRE-TRAINED ON THE COCO DATASET, THE KITTI DATASET, AND
THE OPEN IMAGES DATASET.

• MODEL NAME CORRESPONDS TO A CONFIG FILE THAT WAS USED
TO TRAIN THIS MODEL.
• SPEED - RUNNING TIME IN MS PER 600X600 IMAGE
• MAP STANDS FOR MEAN AVERAGE PRECISION, WHICH
INDICATES HOW WELL THE MODEL PERFORMED ON THE
COCO DATASET.
• OUTPUT TYPES (BOXES, AND MASKS IF APPLICABLE)
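To give a feel for what mean average precision measures, the sketch below computes intersection over union (IoU) between two boxes and a simple, non-interpolated average precision for one class; mAP is the mean of that value over all classes. This is a simplified illustration, not the official COCO evaluation, which uses interpolated precision averaged over several IoU thresholds. The function names and sample values are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(matches, num_ground_truth):
    """AP for one class.

    matches: booleans for detections sorted by descending confidence,
    True when the detection matched a ground-truth box (e.g. IoU >= 0.5).
    """
    if num_ground_truth == 0:
        return 0.0
    true_positives = 0
    precision_sum = 0.0
    for rank, is_true_positive in enumerate(matches, start=1):
        if is_true_positive:
            true_positives += 1
            precision_sum += true_positives / rank  # precision at this recall step
    return precision_sum / num_ground_truth


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
ap_cat = average_precision([True, False, True], num_ground_truth=2)
ap_dog = average_precision([True, True], num_ground_truth=2)
mean_ap = (ap_cat + ap_dog) / 2
print(overlap, ap_cat, ap_dog, mean_ap)
```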
CONFIGURING
● Folders structure
● label.pbtxt
● pipeline.config

train_config: {
  fine_tune_checkpoint: "<path_to_model.ckpt>"
  num_steps: 200000
}
train_input_reader {
  label_map_path: "<path_to_labels.pbtxt>"
  tf_record_input_reader {
    input_path: "<path_to_train.record>"
  }
}
eval_config {
  num_examples: 8000
  max_evals: 10
  use_moving_averages: false
}
eval_input_reader {
  label_map_path: "<path_to_labels.pbtxt>"
  shuffle: false
  num_readers: 1
  tf_record_input_reader {
    input_path: "<path_to_test.record>"
  }
}
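The label map file referenced by label_map_path is a short text file with one item block per class. A small Python helper can generate it from a list of class names; the function name and the class names below are assumptions for illustration. Ids start at 1 because 0 is reserved for the background class.

```python
def build_label_map(class_names):
    """Return label map text with one item block per class, ids starting at 1."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append(f"item {{\n  id: {index}\n  name: '{name}'\n}}\n")
    return "\n".join(items)


label_map_text = build_label_map(["cat", "dog"])  # hypothetical classes
print(label_map_text)
```

The resulting text can be written to the path used for label_map_path in both input readers, so training and evaluation agree on the class ids.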
TRAINING & EVALUATING
# From the tensorflow/models/research directory
python object_detection/train.py \
    --logtostderr \
    --pipeline_config_path=/tensorflow/models/object_detection/samples/configs/ssd_mobilenet_v1_pets.config \
    --train_dir=${PATH_TO_ROOT_TRAIN_FOLDER}

# From the tensorflow/models/research directory
python object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
    --checkpoint_dir=${PATH_TO_TRAIN_DIR} \
    --eval_dir=${PATH_TO_EVAL_DIR}
